<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Verdana
}
</style>
</head>
<body class='hmmessage'>
Hi,<br><br>I am very new to OAI, so I can offer only limited help and am mostly looking to learn myself.<br><br>The burden might be on the harvester. The Implementation Guidelines for harvesters appear to address this in section 3, stating that harvesters should overlap requests. It would also seem that a good implementation should base the "from" argument of each subsequent request on the latest datestamp found in the previously harvested records themselves, not on its own arbitrarily chosen "until" value.<br><br>The latter problem you mention seems to be one that mirrors or load balancing can solve. With a mirror, you essentially have two copies of the site, and you use an HTTP 302 redirect to send requests away from the copy you are updating and on to the other copy. With a load balancer, this switching can be done invisibly to the harvester.<br><br>> Date: Tue, 2 Jun 2009 13:37:01 +0200<br>> From: Rozita.Fridman@FIZ-Karlsruhe.DE<br>> To: oai-implementers@openarchives.org<br>> Subject: [OAI-implementers] issues with OAI-PMH specifications for        OAI-Provider implementations using a cache<br>> <br>> Hello all,<br>> <br>> we developed an OAI-Provider for Escidoc repositories.<br>> Escidoc-OAI-Provider is based on the Fedora-OAI-Provider, which uses a<br>> cache to reduce a response time. Escidoc repositories intend to contain<br>> multiple millions of objects. The Escidoc-Core framework only requires<br>> that objects metadata stored in a Escidoc repository are well formed<br>> xml-structures. Therefore using of a cache in the Escidoc-OAI-Provider<br>> is essential to ensure validness of metadata in OAI-PMH response and an<br>> acceptable response time. 
<br>> <br>> But the current OAI-PMH protocol specification doesn't account for some<br>> issues, caused by the employment of a cache.<br>> <br>> The main problem is a time lag between a harvester request and a last<br>> cache update:<br>> A harvester asks the OAI-Provider for all records that have changed<br>> between T0 and T2 in the underlying repository. The last cache update<br>> was at T1.The harvester gets records that have changed between T0 and<br>> T1, but assumes that it got all changes between T0 and T2. Therefore in<br>> the next request it asks for records that have changed between T2 and T3<br>> and is missing all changes between T1 and T2. If cache update interval<br>> is long and the next cache update takes place after T3, the harvester is<br>> also missing all changes between T2 and T3 and so on.<br>> <br>> One proposal would be to put a date stamp of the last cache update into<br>> the OAI-PMH response, in order to inform a harvester about possibly<br>> missed records. <br>> <br>> Does anybody face the same problem? What do you think about it? Maybe<br>> there are better solutions for this problem?<br>> <br>> The other issue is that depending on the OAI-Provider implementation a<br>> cache may be in an inconsistent state while a cache update process is<br>> running. Are there means in the OAI-PMH protocol to respond to harvester<br>> requests during a cache update? A possible solution would be to respond<br>> with a HTTP-status code 503-Service unavailable (section 3.1.2.2 of the<br>> specification), but the problem is to specify Retry-After period. A<br>> duration of the cache update is not constant, it depends on the changes<br>> in the repository.<br>> <br>> Thanks a lot,<br>> Rozita<br>> <br>> <br>> <br></body>
</html>