[OAI-implementers] Harvesting- how to efficiently poll?
Simeon Warner
simeon at cs.cornell.edu
Mon May 1 09:38:05 EDT 2006
I'm not sure I understand your question properly. However, I think it is
reasonable to assume that any repository that exposes only day-granularity
datestamps is not updated more often than daily. I'd poll at most once a
day, each time specifying a 'from' parameter equal to the previous day.
One increment of overlap is necessary to ensure nothing is missed, so
expect to see some records again and discard the repeats by identifier.
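
The 'from' calculation for a daily poll can be sketched as follows. This
is only an illustration under stated assumptions: the base URL, the
function names, and the choice of oai_dc as metadataPrefix are
hypothetical, and the repeat filter assumes each record is reduced to an
(identifier, datestamp) pair.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def overlapping_from(poll_date: date) -> str:
    """Return the 'from' value for a poll made on poll_date.

    With day granularity, step back one day so that records stamped on
    the previous day but added after the last poll are not missed.
    """
    return (poll_date - timedelta(days=1)).isoformat()


def list_records_url(base_url: str, poll_date: date,
                     metadata_prefix: str = "oai_dc") -> str:
    """Build the initial ListRecords request (hypothetical base_url)."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": overlapping_from(poll_date),
    }
    return f"{base_url}?{urlencode(params)}"


def new_or_changed(seen: dict, harvested: list) -> list:
    """Drop repeats caused by the overlap.

    seen maps identifier -> last datestamp stored locally; harvested is
    a list of (identifier, datestamp) pairs from the latest poll.
    """
    fresh = []
    for identifier, datestamp in harvested:
        if seen.get(identifier) != datestamp:
            seen[identifier] = datestamp
            fresh.append((identifier, datestamp))
    return fresh
```

A poll made on 2006-05-01 would then ask for from=2006-04-30, and any
record already stored with the same identifier and datestamp is skipped.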
(As an aside, it is amazing to see how many RSS clients poll arXiv.org
very frequently even though we include the standard headers saying that
we update daily and giving the time. One might have hoped that these
headers would increase efficiency, but that does not seem to be playing
out in practice.)
--
Simeon
On Sat, 29 Apr 2006, steve racker wrote:
> If the granularity of an archive is YYYY-MM-DD and there are
> many records per day, how can one efficiently poll for the
> newest records? I would have expected there to be a way to
> specify the last seen record and get any newer records, but
> it appears the only method is to first make a request with the
> date then keep requesting on any encountered resumptionTokens.
> When a response is received with no resumptionToken, keep
> it until it expires, then the next poll starts with the date
> again. Is this correct? That seems to generate much repeated
> data in responses when polling with the last resumptionToken.