[OAI-implementers] OAI identifier resolver
Young,Jeff
jyoung@oclc.org
Tue, 21 Oct 2003 09:19:09 -0400
I agree that creating "cool" URLs is the main motivation I have for using a
resolver. I've used such "cool" URLs in association with OAI repositories in
several projects including the OpenURL registry
(http://www.openurl.info/registry) and several thesauri including GSAFD
Genre Terms at http://alcme.oclc.org/gsafd/.
Regarding distributed solutions, I've already offered to make the resolver
code available as open source for communities that want to host an
independent registry of repositories. I could also make a knockoff of it
that would work with a single repository if people are interested. I could
make the stand-alone version available as a Java Servlet and/or a CGI script
of some sort if asked.
I think the concern about algorithmically assigned repositoryIdentifiers is
overstated. If someone cares to give a specific example of where it's a
problem, I'm sure we can work out something reasonable.
Jeff
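
For a rough sense of scale on the collision concern raised in the quoted
thread below, here is a minimal sketch of the standard birthday-bound
approximation. It assumes an idealized hash whose outputs are uniformly
distributed, which CRC32 in particular is not designed to guarantee, so the
numbers are indicative only. The function name collision_probability is
made up for this illustration, and the figure of 500 repositories is the
estimate given in the thread.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items hash values coincide.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * 2**bits)),
    which assumes an ideal, uniformly distributed hash (an assumption).
    """
    space = 2.0 ** hash_bits
    exponent = n_items * (n_items - 1) / (2.0 * space)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # 500 is the rough number of OAI data providers mentioned in the thread.
    for name, bits in [("CRC32 (32 bits)", 32), ("MD5 (128 bits)", 128)]:
        print(f"{name}: {collision_probability(500, bits):.3e}")
```

Under these assumptions a 32-bit hash gives a probability on the order of a
few in 100,000 for 500 repositories, and a 128-bit hash gives a vanishingly
small one. Neither is zero, which is the point made in the quoted messages.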
> -----Original Message-----
> From: Xiaoming Liu [mailto:liu_x@cs.odu.edu]
> Sent: Monday, October 20, 2003 9:41 PM
> To: Lonnie D. Harvel
> Cc: OAI-implementers (E-mail)
> Subject: Re: [OAI-implementers] OAI identifier resolver
>
> This comes back to the question of why we need a resolver. If both the
> baseURL and the record identifier are supplied, it doesn't make a lot
> of sense to develop a resolver. I think the motivation is to provide a
> "cool" URL for each record, and to make it easy to exchange information
> using the REST model.
>
> OAI has no centralized mechanism to maintain unique repository names.
> It's either done by one centralized registry, like the UIUC registry,
> or in a distributed way, like hashing the baseURL or other better ways.
> In the distributed way, I can add a link to the Purl-OAI resolver
> without prior knowledge of how repository names are maintained in the
> Purl-OAI resolver. That's my reason for favoring the distributed method.
>
> xiaoming
> >
> > Adam Farquhar wrote:
> >
> > > Xiaoming,
> > >
> > > Selecting an approach that will be certain to fail, but unpredictably,
> > > is not a good 'engineering' approach, especially when there are other
> > > approaches that do not fail. For example, taking a base64 encoding of
> > > the base URL or just using the base URL itself will both provide a
> > > unique identifier.
> > >
> > > Adam.
> > >
> > >>>Hash algorithms such as MD5 or CRC32 cannot be used to generate unique
> > >>>identifiers. These algorithms will occasionally produce the same output for
> > >>>different input strings (this is why hash tables require a mechanism for dealing
> > >>>with collisions). Common approaches to generating unique identifiers use some
> > >>>sort of a registration mechanism to appropriately partition the space of possible
> > >>>values. Successful ones will leverage an existing registration mechanism, such
> > >>>as DNS.
> > >>>
> > >>>
> > >>
> > >>I agree a hash algorithm is not a "perfect" way to generate a unique
> > >>identifier for a repository, but it may be acceptable from an engineering
> > >>perspective: the collision probability will be pretty low at the current
> > >>scale of OAI data providers (<500?).
> > >>
> > >>I think the basic problem is how to render an OAI baseURL as a shorter,
> > >>readable string without collisions. The algorithm should be repeatable:
> > >>anyone can use the same algorithm to generate the same output given a
> > >>baseURL. I will be glad to see other approaches.
> > >>
> > >>
> > >>
> > > _______________________________________________ OAI-implementers
> > > mailing list List information, archives, preferences and to
> > > unsubscribe:
> > > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> >
> >
> >
>
>
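
The two families of approaches debated in this thread, a short hash of the
baseURL versus a reversible encoding of it, can be compared with a small
sketch. This is an illustration only and not the scheme used by any resolver
mentioned above. The function names and the example.org baseURL are
hypothetical. The point it shows is that the hash-based identifiers are short
and repeatable but cannot be proven unique, while the base64 form is longer
but decodes back to the original baseURL, so two different baseURLs can never
share an identifier.

```python
import base64
import hashlib
import zlib


def crc32_id(base_url: str) -> str:
    """Short (8 hex digits), repeatable, but collisions are possible."""
    return format(zlib.crc32(base_url.encode("utf-8")) & 0xFFFFFFFF, "08x")


def md5_id(base_url: str) -> str:
    """Longer (32 hex digits), repeatable; collisions unlikely but not excluded."""
    return hashlib.md5(base_url.encode("utf-8")).hexdigest()


def base64_id(base_url: str) -> str:
    """Reversible URL-safe base64 encoding, so distinct inputs stay distinct."""
    encoded = base64.urlsafe_b64encode(base_url.encode("utf-8"))
    # Strip the '=' padding to keep the identifier a little shorter.
    return encoded.decode("ascii").rstrip("=")


def base64_id_to_url(identifier: str) -> str:
    """Recover the baseURL from an identifier produced by base64_id."""
    padding = "=" * (-len(identifier) % 4)
    return base64.urlsafe_b64decode(identifier + padding).decode("utf-8")


if __name__ == "__main__":
    url = "http://example.org/oai"  # hypothetical baseURL
    print("crc32 :", crc32_id(url))
    print("md5   :", md5_id(url))
    print("base64:", base64_id(url))
    print("decode:", base64_id_to_url(base64_id(url)))
```

All three functions are repeatable in the sense asked for in the thread:
anyone running the same function on the same baseURL gets the same string.
Only the base64 variant trades readability and length for guaranteed
uniqueness.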