[OAI-implementers] Experimental OAI Registry at UIUC
Thomas G. Habing
thabing@uiuc.edu
Thu, 09 Oct 2003 12:52:22 -0500
Hi all,
This is to announce the availability of a new experimental registry of OAI
providers. The registry can be found at:
http://oai.grainger.uiuc.edu/registry/
The registry was constructed by collecting the baseURLs of all the providers
we could from various ListFriends.pl sites, Hussein's repository explorer,
etc., as well as a search tool developed using the Google SOAP API to search
for possible baseURLs (surprisingly, this yielded 30+ new provider sites).
Once this list of baseURLs was compiled, a crawler harvested select data
from each provider, such as Identify, ListSets, ListMetadataFormats, as well
as a collection of sample records, and record counts for each combination of
set and metadata format (if possible). The crawler also traversed to
baseURLs found in friends containers or via the provenance container from
sample records. This resulted in a list of about 340 OAI providers which
are able to respond, plus about 90 providers which seem to be down. Many of
these are versions of the same provider using different versions of the
protocol. Still, this list is about twice as big as any other list I've come
across.
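
To illustrate the kind of requests such a crawler issues, here is a minimal
sketch in Python. It is not the registry's actual crawler: the function names
and the example.org URLs are hypothetical, it assumes OAI-PMH 2.0 namespaces
only (older protocol versions use different ones), and it parses a canned
Identify response instead of fetching one over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces assumed here: OAI-PMH 2.0 and its optional "friends" container.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
FRIENDS_NS = "http://www.openarchives.org/OAI/2.0/friends/"


def request_url(base_url, verb, **args):
    """Build an OAI-PMH request URL for a verb plus optional arguments."""
    params = {"verb": verb}
    params.update(args)
    return base_url + "?" + urlencode(params)


def crawl_requests(base_url):
    """Return the three descriptive requests made against each provider."""
    verbs = ("Identify", "ListSets", "ListMetadataFormats")
    return [request_url(base_url, verb) for verb in verbs]


def parse_identify(xml_text):
    """Extract a few Identify fields and any friends baseURLs to crawl next."""
    root = ET.fromstring(xml_text)
    ident = root.find(f"{{{OAI_NS}}}Identify")
    if ident is None:
        return None  # not an Identify response (or not OAI-PMH 2.0)
    friends = [
        el.text.strip()
        for el in ident.iter(f"{{{FRIENDS_NS}}}baseURL")
        if el.text
    ]
    return {
        "repositoryName": ident.findtext(f"{{{OAI_NS}}}repositoryName"),
        "baseURL": ident.findtext(f"{{{OAI_NS}}}baseURL"),
        "protocolVersion": ident.findtext(f"{{{OAI_NS}}}protocolVersion"),
        "friends": friends,
    }


# Hypothetical Identify response, used instead of a live HTTP request.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <description>
      <friends xmlns="http://www.openarchives.org/OAI/2.0/friends/">
        <baseURL>http://friend.example.org/oai</baseURL>
      </friends>
    </description>
  </Identify>
</OAI-PMH>"""

if __name__ == "__main__":
    for url in crawl_requests("http://example.org/oai"):
        print(url)
    print(parse_identify(SAMPLE_IDENTIFY))
```

Any baseURLs returned in the "friends" list would be queued and crawled in
the same way.
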
Plus, it is searchable; my primary goal was to make it easier to find
relevant OAI providers. Essentially I have constructed a full-text index on
each repository's Identify, ListSets, and ListMetadataFormats responses, and
on a collection of sample records.
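
As a rough illustration of the idea, a full-text index over harvested
responses can be sketched as an inverted index from terms to baseURLs. This
is only an assumption about the general technique, not a description of how
the registry's index is built; the function names and sample data below are
hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of baseURLs whose harvested text contains it.

    `docs` maps a baseURL to the concatenated text of its Identify, ListSets,
    ListMetadataFormats responses and sample records.
    """
    index = defaultdict(set)
    for base_url, text in docs.items():
        for term in tokenize(text):
            index[term].add(base_url)
    return index


def search(index, query):
    """Return the baseURLs that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    # Hypothetical harvested text for two providers.
    docs = {
        "http://a.example.org/oai": "Physics Eprints Archive oai_dc",
        "http://b.example.org/oai": "Library Image Collection oai_dc marc",
    }
    index = build_index(docs)
    print(sorted(search(index, "image collection")))
```
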
I have also done some analysis and made various reports available, such as
repositories which support compression, a count of the most frequently
occurring top-level domains, etc. Check out the "graph of friends" showing
a graphical representation of interconnections between repositories either
via friends or provenance. Did you know that there are 52 distinct XML
metadata schemas in use by OAI repositories?
Based on this database I also have developed an experimental OAI redirector.
If you have an OAI identifier (e.g. oai:PITTAEI.OAI2:558) but don't know
where it came from, submit it to
http://oai.grainger.uiuc.edu/registry/rx?oai:PITTAEI.OAI2:558
and you will be redirected to the oai_dc format record for that id, if an
appropriate baseURL can be found in the registry database. Unfortunately it
appears that many sites, especially the GenericEprints, are using the same
repo identifier (see http://oai.grainger.uiuc.edu/registry/ListRepoIds.asp),
so if there are multiple possible baseURLs for a given id, I have a ranking
algorithm that attempts to guess the best. I may do some more work on this
in the future, maybe looking at an OpenURL type resolver function.
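
The redirect step can be sketched roughly as follows. This is not the
redirector's actual code: the registry structure, the function names, and in
particular the ranking rule (prefer providers that are responding, then those
with more records) are placeholders assumed for illustration, since the real
ranking algorithm is not described here.

```python
from urllib.parse import urlencode


def repository_id(oai_identifier):
    """Return the repository part of an identifier like oai:REPO:LOCAL."""
    parts = oai_identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai":
        return None
    return parts[1]


def rank(candidates):
    """Pick one candidate baseURL entry.

    Placeholder rule (an assumption): responding providers first, then the
    one with the larger record count.
    """
    return max(candidates, key=lambda c: (c["alive"], c["record_count"]))


def resolve(registry, oai_identifier, metadata_prefix="oai_dc"):
    """Return a GetRecord URL for the identifier, or None if unknown."""
    repo_id = repository_id(oai_identifier)
    candidates = registry.get(repo_id) if repo_id else None
    if not candidates:
        return None
    best = rank(candidates)
    query = urlencode({
        "verb": "GetRecord",
        "identifier": oai_identifier,
        "metadataPrefix": metadata_prefix,
    })
    return best["base_url"] + "?" + query


if __name__ == "__main__":
    # Hypothetical registry: one repository identifier shared by two baseURLs.
    registry = {
        "PITTAEI.OAI2": [
            {"base_url": "http://a.example.org/oai", "alive": False, "record_count": 900},
            {"base_url": "http://b.example.org/oai", "alive": True, "record_count": 100},
        ]
    }
    print(resolve(registry, "oai:PITTAEI.OAI2:558"))
```

A real redirector would answer with an HTTP redirect to the resulting URL
rather than printing it.
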
Anyway, feel free to try it out and let me know of any problems or
suggestions you might have. Also, if you know of any more OAI providers I
should add, let me know.
Kind regards,
Tom
--
Thomas Habing
Research Programmer, Digital Library Projects
University of Illinois at Urbana-Champaign
155 Grainger Engineering Library Information Center, MC-274
thabing@uiuc.edu, (217) 244-4425
http://dli.grainger.uiuc.edu