[OAI-implementers] SOAP-PMH
Young,Jeff
jyoung at oclc.org
Tue Dec 7 20:25:11 EST 2004
I tried to come up with a harvesting-based mechanism to work around the
limitations of DP9 for the DSpace community. Rather than working
interactively with a repository as DP9 does, I harvest the repositories and
create a set-based hierarchy of static HTML pages that I then expose to
search engines. You can see the prototype at
http://www.worldcatlibraries.org/DSpace/. This produces a bushier tree than
DP9, making it easier for Google et al. to crawl in its entirety.
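
To make the idea concrete, here is a minimal sketch of the page-generation
step, not the actual prototype code. It assumes the records have already been
harvested (e.g. with ListRecords) and reduced to (identifier, title, setSpec)
tuples, and it relies only on the OAI-PMH convention that a setSpec is a
colon-separated path through the set hierarchy. The function name
build_pages, the index.html layout, and the sample identifiers are all
hypothetical.

```python
from collections import defaultdict
from html import escape


def build_pages(records):
    """Return {relative_path: html} for a set-based hierarchy of static pages.

    records: iterable of (identifier, title, set_spec) tuples, assumed to be
    harvested already; set_spec is a colon-separated OAI-PMH setSpec.
    """
    children = defaultdict(set)  # set path -> child set paths
    items = defaultdict(list)    # set path -> records directly in that set
    for identifier, title, set_spec in records:
        parts = tuple(set_spec.split(":"))
        items[parts].append((identifier, title))
        # Register every ancestor so intermediate sets get an index page too.
        for i in range(len(parts)):
            children[parts[:i]].add(parts[:i + 1])

    pages = {}
    for node in set(children) | set(items):
        path = "/".join(node + ("index.html",))
        links = []
        # One link per child set: many siblings per page keeps the tree shallow.
        for child in sorted(children.get(node, ())):
            name = escape(child[-1])
            links.append(f'<li><a href="{name}/index.html">{name}</a></li>')
        for identifier, title in sorted(items.get(node, ())):
            links.append(f"<li>{escape(title)} ({escape(identifier)})</li>")
        pages[path] = (
            "<html><body><ul>\n" + "\n".join(links) + "\n</ul></body></html>"
        )
    return pages


if __name__ == "__main__":
    sample = [
        ("oai:example:1", "First item", "com_1:col_2"),
        ("oai:example:2", "Second item", "com_1:col_3"),
        ("oai:example:3", "Third item", "com_4"),
    ]
    for path in sorted(build_pages(sample)):
        print(path)
```

Every set becomes one static page that links to its child sets and lists its
own records, so a crawler only has to descend as deep as the set hierarchy
itself. Writing the returned pages to disk and serving them is left out here.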
Jeff
> -----Original Message-----
> From: oai-implementers-bounces at openarchives.org
> [mailto:oai-implementers-bounces at openarchives.org] On Behalf
> Of Michael Nelson
> Sent: Tuesday, December 07, 2004 8:12 PM
> To: Pete Johnston
> Cc: oai-implementers at openarchives.org
> Subject: RE: [OAI-implementers] SOAP-PMH
>
>
> > I'm not sure it is strictly true that Google needs to invest in
> > OAI-PMH in order to "index on OAI resources".
> >
> > The Googlebot can crawl HTML representations of the metadata records
> > which are also exposed via OAI-PMH (assuming they are served at
> > persistent Google-friendly URIs etc). Isn't this exactly what services
> > like the DP9 gateway enable/provide?
> >
> > http://dlib.cs.odu.edu/dp9/
>
> yes, but there are a number of problems with this approach,
> esp. for large sites. many web crawlers are biased to go
> "wide", not "deep", and DP9 produces deep trees that crawlers
> don't always traverse.
>
> DP9 can also be unkind to repositories; the proxy between the
> robot and the repository obscures any throttling the repository does.
>
> DP9 is a neat trick, but it is not a complete solution.
>
> see also:
>
> http://www.cs.odu.edu/~liu_x/dp9/dp9.pdf
>
> regards,
>
> Michael
>
> >
> > As Jeff noted a couple of messages upthread, the issue is not SOAP v
> > OAI-PMH, or search v harvest, but whether an implementation of OAI-PMH
> > semantics over SOAP offers anything that is not available using the
> > current implementation over HTTP GET/POST.
> >
> > Pete
> >
> >
> > _______________________________________________
> > OAI-implementers mailing list
> > List information, archives, preferences and to unsubscribe:
> > http://www.openarchives.org/mailman/listinfo/oai-implementers
> >
>
> ----
> Michael L. Nelson mln at cs.odu.edu http://www.cs.odu.edu/~mln/
> Dept of Computer Science, Old Dominion University, Norfolk VA 23529
> +1 757 683 6393 +1 757 683 4900 (f)
>