[OAI-general] Search Engine Coverage of the OAI-PMH Corpus
Frank McCown
fmccown at cs.odu.edu
Wed Mar 8 12:16:09 EST 2006
We have just published an article that many of you may find interesting:
Frank McCown, Xiaoming Liu, Michael L. Nelson, and Mohammed Zubair.
Search Engine Coverage of the OAI-PMH Corpus. IEEE Internet Computing,
March/April 2006, Vol. 10, No. 2, pp. 66-73.
http://doi.ieeecomputersociety.org/10.1109/MIC.2006.41
You may access the technical report at
http://library.lanl.gov/cgi-bin/getfile?LA-UR-05-9158.pdf
Abstract:
Having indexed much of the "surface" Web, search engines are now using
various approaches to index the "deep" Web. At the same time,
institutional repositories and digital libraries are adopting the Open
Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to expose
their holdings. The authors harvested nearly 10 million records from
OAI-PMH repositories. From these records, they extracted 3.3 million
unique resource URLs and then conducted searches on samples from this
collection to determine how much of the OAI-PMH corpus the three major
search engines have indexed.
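
For anyone curious about the URL-extraction step the abstract mentions, the following is a minimal Python sketch and not the code used in the article. It assumes the harvested records are oai_dc responses and that resource URLs appear in dc:identifier elements beginning with http:// or https://. The sample XML, the example.org addresses, and the function names extract_resource_urls and sample_urls are all hypothetical and exist only for illustration.

```python
import random
import xml.etree.ElementTree as ET

# Namespace of unqualified Dublin Core elements, as used in oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical, hand-written ListRecords fragment (illustration only).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>http://example.org/papers/1.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>http://example.org/papers/1.pdf</dc:identifier>
          <dc:identifier>local-id-42</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:3</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:identifier>https://example.org/papers/3.html</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_resource_urls(xml_text):
    """Return the set of unique http(s) URLs found in dc:identifier elements."""
    root = ET.fromstring(xml_text)
    urls = set()
    for element in root.iter(DC_NS + "identifier"):
        text = (element.text or "").strip()
        # Assumption: only identifiers that look like web URLs count as resources.
        if text.lower().startswith(("http://", "https://")):
            urls.add(text)
    return urls


def sample_urls(urls, k, seed=0):
    """Draw up to k URLs at random; a fixed seed keeps the sample repeatable."""
    rng = random.Random(seed)
    ordered = sorted(urls)  # sort first so the result does not depend on set order
    return rng.sample(ordered, min(k, len(ordered)))


unique_urls = extract_resource_urls(SAMPLE_RESPONSE)
sampled = sample_urls(unique_urls, 1)
```

Each sampled URL could then be submitted to a search engine to check whether it has been indexed. That step is left out here because it depends on the search API in use.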
--
Frank McCown
Old Dominion University
http://www.cs.odu.edu/~fmccown/