[OAI-implementers] ListRecords request w/out an until..

Benjamin Anderson benanderson.us at gmail.com
Tue Feb 1 10:26:37 EST 2011


Thanks Simeon.  I'm looking over the section you linked to...

> Repositories that implement resumptionTokens *must* do so in a manner that
> allows harvesters to resume a sequence of requests for incomplete lists by
> re-issuing a list request with the most recent resumptionToken
>

I'm having a hard time understanding this sentence. What is meant by
"incomplete list"?  What is meant by "re-issuing a list request"?

I was just thinking that my harvester assumption wouldn't work for the
following scenario:

Let's assume a provider that allows for updates during harvests and that
this provider only keeps the most recent updated date (not all update
dates).  If a record was updated before t0 and again after t0 (but before it
was included in the harvest initiated at t0), then the harvester will not
get the record even though it should have.  That's probably a rare case, but
nevertheless bound to happen.  Are there guidelines for the best way to use
an until as a harvester?
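
To make the scenario concrete, here is a minimal sketch of it in Python. Everything in it is an assumption for illustration: the names Record, select and harvest are hypothetical, timestamps are plain integers rather than UTC datestamps, the provider re-evaluates the from/until window on every request, and the "resumption token" is just the last identifier delivered. It only models a provider that keeps a single, most recent datestamp per record.

```python
from dataclasses import dataclass


@dataclass
class Record:
    identifier: str
    datestamp: int  # hypothetical: only the most recent update time is kept


def select(records, from_=None, until=None):
    """Identifiers whose datestamp lies in [from_, until], both bounds inclusive."""
    return sorted(
        r.identifier
        for r in records
        if (from_ is None or r.datestamp >= from_)
        and (until is None or r.datestamp <= until)
    )


def harvest(records, from_, until, page_size=1, after_page=None):
    """Toy multi-request harvest; the window is re-evaluated on each request."""
    harvested, last = [], ""
    while True:
        # Resume after the last identifier already delivered.
        remaining = [i for i in select(records, from_, until) if i > last]
        if not remaining:
            return harvested
        page = remaining[:page_size]
        harvested.extend(page)
        last = page[-1]
        if after_page is not None:
            after_page()  # simulates repository updates between requests


t0 = 10
repo = [Record("rec-A", 5), Record("rec-B", 7)]  # both updated before t0


def update_rec_b():
    # rec-B is updated again after t0, before it has been delivered;
    # its earlier datestamp (7) is overwritten.
    repo[1].datestamp = 12


# Harvest started at t0 with until=t0: rec-B drops out of the window mid-harvest.
first = harvest(repo, from_=1, until=t0, after_page=update_rec_b)

# Next harvest starts with from=t0 and picks rec-B up via its new datestamp.
second = harvest(repo, from_=t0, until=20)

print(first, second)
```

In this toy model the record is absent from the harvest started at t0, but a following harvest that begins with from=t0 still returns it, because its new datestamp falls in the next window. Whether a real provider behaves this way depends on how it implements resumptionTokens.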

Thanks again,
Ben


On Tue, Feb 1, 2011 at 10:05 AM, Simeon Warner <simeon.warner at cornell.edu> wrote:

> Hi Ben,
>
> This is covered in section 3.5.1 of the specification:
>
> http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm#Idempotency
>
> I think your solution for the harvester is the correct one. Provided the
> harvester starts again with from=t0 all changes between t0 and t2 will be
> harvested, irrespective of whether or not they were included in the original
> response (modulo understood problems with items that move between sets for
> set selective requests).
>
> Cheers,
> Simeon
>
>
> On 02/01/2011 09:09 AM, Benjamin Anderson wrote:
>
>> Hi,
>>
>> I'm wondering what others are doing when a ListRecords request w/out an
>> until comes in.  Consider this scenario:
>>
>> t0 - harvest request (with no until) is initiated
>> t1 - record 101 is added to the repo
>> t2 - harvest is finished (it took multiple requests to complete)
>>
>> Should record 101 be included in the harvest data?  If not, should the
>> client issue their next harvest with from=t0?  (A from=t2 would be
>> invalid because they'd miss out on record 101.)
>>
>> We have implemented both oai-pmh harvesters and providers, so I have to
>> consider both ends of this.  Here's what I'm thinking...
>>
>> As a Provider
>> I will simply lock the repo so that the above scenario can't happen.  If
>> someone is already harvesting (there exist unexpired resumptionTokens)
>> then I will not update the repository.
>>
>> As a Harvester
>> I will always use the until parameter with the value of the time the
>> harvest was initially started.
>>
>> I think this keeps me clear of any problems.  Anyone else have thoughts
>> or care to share your solutions?
>>
>> Thanks,
>> Ben Anderson
>>