[OAI-implementers] OAI identifier resolver

Tue, 21 Oct 2003 09:19:09 -0400

I agree that creating "cool" URLs is the main motivation I have for using a
resolver. I've used such "cool" URLs in association with OAI repositories in
several projects including the OpenURL registry
(http://www.openurl.info/registry) and several thesauri including GSAFD
Genre Terms at http://alcme.oclc.org/gsafd/.

Regarding distributed solutions, I've already offered to make the resolver
code available as open source for communities that want to host an
independent registry of repositories. I could also make a knockoff of it
that would work with a single repository if people are interested. I could
make the stand-alone version available as a Java Servlet and/or a CGI script
of some sort if asked.

I think the concern about algorithmically assigned repositoryIdentifiers is
overstated. If someone cares to give a specific example of where it's a
problem, I'm sure we can work out something reasonable.
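
To put rough numbers on the trade-off discussed in this thread, here is a minimal Python sketch. It is not the resolver's actual code. The function names, the 8-character truncation, and the example baseURL http://example.org/oai are all illustrative assumptions. It contrasts a hashed identifier (short, repeatable, may collide) with a base64-encoded baseURL (longer, reversible, cannot collide), and estimates the collision chance with the standard birthday approximation.

```python
import base64
import hashlib
import math


def hashed_id(base_url: str, hex_chars: int = 8) -> str:
    """Short, repeatable identifier: leading hex digits of the MD5 digest.

    With hex_chars=8 only 32 bits are kept, so distinct baseURLs can collide.
    """
    return hashlib.md5(base_url.encode("utf-8")).hexdigest()[:hex_chars]


def encoded_id(base_url: str) -> str:
    """Reversible identifier: URL-safe base64 of the baseURL itself."""
    return base64.urlsafe_b64encode(base_url.encode("utf-8")).decode("ascii")


def decode_id(identifier: str) -> str:
    """Recover the original baseURL from an encoded_id() value."""
    return base64.urlsafe_b64decode(identifier.encode("ascii")).decode("utf-8")


def collision_probability(n: int, bits: int) -> float:
    """Birthday approximation for n random values of the given bit width:
    p ~= 1 - exp(-n * (n - 1) / 2**(bits + 1))
    """
    return -math.expm1(-n * (n - 1) / 2.0 ** (bits + 1))


if __name__ == "__main__":
    url = "http://example.org/oai"  # hypothetical baseURL
    print(hashed_id(url))
    print(encoded_id(url))
    print(decode_id(encoded_id(url)) == url)
    print(collision_probability(500, 32))   # 32-bit hash, 500 repositories
    print(collision_probability(500, 128))  # full MD5, 500 repositories
```

Under these assumptions, 500 repositories and a 32-bit hash give a collision chance of about 3 in 100,000, and the full 128-bit MD5 makes it negligible. The base64 form avoids the question entirely at the cost of a longer, less readable string.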

Jeff

>  -----Original Message-----
>  From: Xiaoming Liu [mailto:liu_x@cs.odu.edu]
>  Sent: Monday, October 20, 2003 9:41 PM
>  To: Lonnie D. Harvel
>  Cc: OAI-implementers (E-mail)
>  Subject: Re: [OAI-implementers] OAI identifier resolver
>  
>  This comes back to the question of why we need a resolver. If both the
>  baseURL and the record identifier are supplied, it doesn't make a lot of
>  sense to develop a resolver. I think the motivation is to provide a "cool"
>  URL for each record, and to make it easy to exchange information using the
>  REST model.
>  
>  OAI has no centralized mechanism for maintaining unique repository names.
>  It is done either by one centralized registry, like the UIUC registry, or
>  in a distributed way, like hashing the baseURL or some better method. With
>  the distributed approach, I can add a link to the Purl-OAI resolver without
>  prior knowledge of how the repository name is maintained in the Purl-OAI
>  resolver. That's my reason for favoring the distributed method.
>  
>  xiaoming
>  
>  >
>  > Adam Farquhar wrote:
>  >
>  > > Xiaoming,
>  > >
>  > > Selecting an approach that will be certain to fail, but unpredictably,
>  > > is not a good 'engineering' approach, especially when there are other
>  > > approaches that do not fail.  For example, taking a base64 encoding of
>  > > the base URL or just using the base URL itself will both provide a
>  > > unique identifier.
>  > >
>  > > Adam.
>  > >
>  > >>>Hash algorithms such as MD5 or CRC32 cannot be used to generate unique
>  > >>>identifiers.  These algorithms will occasionally produce the same output
>  > >>>for different input strings (this is why hash tables require a mechanism
>  > >>>for dealing with collisions).  Common approaches to generating unique
>  > >>>identifiers use some sort of a registration mechanism to appropriately
>  > >>>partition the space of possible values.  Successful ones will leverage
>  > >>>an existing registration mechanism, such as DNS.
>  > >>>
>  > >>>
>  > >>
>  > >>I agree a hash algorithm is not a "perfect" way to generate a unique
>  > >>identifier for a repository, but it may be acceptable from an engineering
>  > >>perspective: the collision probability will be pretty low at the current
>  > >>scale of OAI data providers (<500?).
>  > >>
>  > >>I think the basic problem is how to render an OAI baseURL as a shorter,
>  > >>readable string in a collision-free way. The algorithm should be
>  > >>repeatable: anyone can use the same algorithm to generate the same output
>  > >>given a baseURL. I will be glad to see other approaches.
>  > >>
>  > >>
>  > >>
>  > > _______________________________________________ OAI-implementers
>  > > mailing list List information, archives, preferences and to
>  > > unsubscribe:
>  > > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
>  >
>  >
>  >