[OAI-implementers] Better resumption mechanism - more important than ever!
'Alan Kent'
ajk@mds.rmit.edu.au
Wed, 6 Mar 2002 17:44:12 +1100
On Tue, Mar 05, 2002 at 09:53:54PM -0500, Xiaoming Liu wrote:
> 1) Could the resumptionToken (in your case restartToken) be re-used?
>
> I agree the retry algorithm is theoretically unsafe in the current protocol,
> thanks. However, the same question also applies to "restartToken" and
> must be addressed before we talk about question 2. If tokens cannot be
> re-used, the harvester has to start from scratch. It looks like OAI
> 1.1 doesn't give a clear answer to this question. Hopefully it
> can be answered in 2.0.
*If* restartToken were introduced, it would be idempotent-ish. Its whole
purpose would be to allow a request to be reissued and have the old
results come back. I would define it as "returning all of the records
not returned so far in the current transfer, possibly including other
records that have already been transferred".
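
A minimal sketch of what that definition would mean on the harvester side, assuming records carry a unique identifier. The function name and the sample identifiers are made up for illustration and are not part of any OAI specification. Because a reissued request may resend records that already arrived, the harvester keeps one copy per identifier:

```python
def merge_batches(batches):
    """Merge record batches from one transfer, tolerating resent records.

    Each batch is a list of (identifier, record) pairs. A reissued
    restartToken may return records that were already transferred, so
    only one copy per identifier is kept (the most recent one seen).
    """
    seen = {}
    for batch in batches:
        for identifier, record in batch:
            seen[identifier] = record
    return seen


# Hypothetical data: a first response, then a reissued request whose
# response overlaps with the first one.
first = [("oai:x:1", "A"), ("oai:x:2", "B")]
retry = [("oai:x:2", "B"), ("oai:x:3", "C")]
harvested = merge_batches([first, retry])
print(sorted(harvested))
```

The overlap costs some bandwidth but no correctness, which is the sense in which the token is only "idempotent-ish".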
> 2) If it is legal to re-use, should we introduce a restartToken concept?
>
> My personal opinion is restartToken will bring too much complexity, and
> it's not necessary.
I agree the consensus is to stick to resumptionTokens. That's fine.
I was just pushing restartToken to see what problems/issues arise.
I would therefore instead propose that there be a standard way in
the Identify response to say 'resumptionTokens are idempotent'
and also 'resumptionTokens can be re-requested' in case of network
failure. DP implementers *should* also try to make them long-lived
(days to weeks) for large repositories.
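
A rough sketch of how a harvester might act on such a declaration. The `rerequestable` flag stands in for whatever the Identify response would say; no such element exists in the protocol today, and the function names and the simulated fetch are hypothetical. The same token is only re-sent after a network failure when the repository has said that is safe:

```python
def fetch_with_retry(fetch, token, rerequestable, max_attempts=3):
    """Call fetch(token), re-sending the same token on network failure
    only if the repository declared that tokens can be re-requested."""
    attempts = max_attempts if rerequestable else 1
    for attempt in range(attempts):
        try:
            return fetch(token)
        except ConnectionError:
            # Give up once the allowed attempts are used up.
            if attempt == attempts - 1:
                raise


def make_flaky_fetch(failures):
    """Stand-in for a real HTTP request: fails `failures` times, then succeeds."""
    state = {"left": failures}

    def fetch(token):
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("simulated network failure")
        return f"records for {token}"

    return fetch


result = fetch_with_retry(make_flaky_fetch(1), "tok-42", rerequestable=True)
print(result)
```

Without the declaration the harvester has no safe option other than letting the failure propagate and starting the harvest again.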
> In your case, I could imagine it being done with the current OAI
> resumptionToken: assume the proposed tokens in your suggestion are called
> alan_restartToken and alan_resumptionToken respectively.
>
> oai_resumptionToken=alan_restartToken + alan_resumptionToken
>
> So the data provider (DP) can always parse the oai_resumptionToken. In most
> cases the session is valid and the DP just uses alan_resumptionToken; if
> anything goes wrong and the DP needs to redo the query, it has the freedom
> to use the alan_restartToken. The harvester should not know what happens
> behind the scenes. In this scenario, the time-to-live could be months or years ;-)
That works. I can use it with my Z39.50 result set/query. If the
result set is still around, reuse it. If it has timed out, redo the
query. I would not be able to return the number of records (when
redoing the query, the number might change), but overall things
would work.
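
To make the combined-token idea concrete, here is a small sketch of a data provider that packs both halves into one opaque token. The JSON encoding, the class and method names, and the in-memory result sets are all assumptions for illustration; a real provider would sit on top of its own search engine and its own session timeouts.

```python
import json


class DataProvider:
    """Toy provider: a token carries the query (restart half) plus a
    result set id and offset (resumption half)."""

    def __init__(self, records, page_size=2):
        self.records = records        # list of (identifier, set_name)
        self.page_size = page_size
        self.result_sets = {}         # result set id -> list of identifiers
        self.next_id = 0

    def _run_query(self, set_name):
        ids = [i for i, s in self.records if s == set_name]
        rs_id = self.next_id
        self.next_id += 1
        self.result_sets[rs_id] = ids
        return rs_id

    def _page(self, set_name, rs_id, offset):
        ids = self.result_sets[rs_id]
        page = ids[offset:offset + self.page_size]
        new_offset = offset + len(page)
        token = None
        if new_offset < len(ids):
            token = json.dumps(
                {"query": set_name, "rs": rs_id, "offset": new_offset})
        return page, token

    def list_identifiers(self, set_name):
        return self._page(set_name, self._run_query(set_name), 0)

    def resume(self, token):
        t = json.loads(token)
        rs_id = t["rs"]
        if rs_id not in self.result_sets:
            # Result set has timed out: redo the query from the token.
            rs_id = self._run_query(t["query"])
        return self._page(t["query"], rs_id, t["offset"])

    def expire_all(self):
        """Simulate every cached result set timing out."""
        self.result_sets.clear()


dp = DataProvider([("oai:x:%d" % n, "physics") for n in range(1, 6)])
page1, tok = dp.list_identifiers("physics")
dp.expire_all()                      # session lost between requests
page2, tok = dp.resume(tok)          # harvester cannot tell the difference
print(page1, page2)
```

Note that the sketch never reports a total record count, since a redone query may match a different number of records. Resuming by offset after a redo also assumes the result order is stable, which a real repository would need to guarantee or work around.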
Alan