[OAI-implementers] DP9 and HTML metadata
Michael L. Nelson
mln@ils.unc.edu
Thu, 24 Jan 2002 14:18:25 -0500 (EST)
Walter,
These are excellent suggestions, and ones that I'm sure Xiaoming Liu can
easily add.
But since you're on the line, I have some questions for you ;-)
1. Do you have an official or personal opinion that you can share about
OAI & spidering?
2. DP9 is great for spiders that don't know any better, but what are the
chances of "OAI-aware" spiders? Or is that such a special case that it's
not worth accounting for?
Specifically, I maintain http://naca.larc.nasa.gov/. Spiders are
frequently churning around in the tens of thousands of possible pages
there. Of course, this is a good substitute:
http://arc.cs.odu.edu:8080/dp9/listidentifiers/NACA
but even better would be a spider that knew to use:
http://naca.larc.nasa.gov/oai/
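
To make "OAI-aware" a little more concrete, here is a rough sketch of what
such a spider might do instead of crawling pages: build a ListIdentifiers
request against the base URL and pull the identifiers out of the XML reply.
The helper names and the sample response are made up for illustration (the
sample is not real output from the NACA server), and the parsing ignores
namespaces so it does not depend on a particular protocol version.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_request(base_url, verb, **params):
    """Return an OAI request URL for the given verb and optional arguments."""
    query = urlencode([("verb", verb)] + sorted(params.items()))
    return f"{base_url}?{query}"


def parse_identifiers(xml_text):
    """Collect the text of every <identifier> element, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]  # drop any "{namespace}" prefix
        if local_name == "identifier" and elem.text:
            found.append(elem.text.strip())
    return found


# Illustrative response only; a real harvester would fetch the URL built
# below and parse whatever the repository actually returns.
SAMPLE_RESPONSE = """<ListIdentifiers>
  <identifier>oai:example:report-1</identifier>
  <identifier>oai:example:report-2</identifier>
</ListIdentifiers>"""

url = build_request("http://naca.larc.nasa.gov/oai/", "ListIdentifiers")
identifiers = parse_identifiers(SAMPLE_RESPONSE)
```

Each identifier could then be fed to a GetRecord request in the same way,
so the spider makes one request per record rather than wandering the site.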
thanks,
Michael
On Thu, 24 Jan 2002, Walter Underwood wrote:
> As a spider engineer, I'd like to suggest an improvement to DP9.
> I'm sending this to the whole OAI list partly to introduce myself,
> and partly because it is an interesting omission in DP9.
>
> DP9 should use HTML metadata standards to present the Dublin Core
> metadata. Right now, it prettyprints the info, but that is not
> useful for a spider.
>
> In addition to the pretty representation, the generated HTML should
> include meta tags for each DC element. I'd recommend also using
> native HTML/HTTP standards for a couple of the elements:
>
> dc.title:Hamlet --> <title>Hamlet</title>
> dc.language:en --> <meta http-equiv="content-language" content="en">
>
> Our engine (Inktomi Enterprise Search) will use that metadata for
> the information presented in the results page. In addition, the
> engine can be configured to use DC.identifier as the URL which is
> presented with the results.
>
> Finally, if there are browsable index pages with links to the
> generated GetRecord pages, those should probably include a
> noindex robots meta tag. Lists of URLs are usually not very
> useful search results. They are excellent roots (start pages)
> for spidering, though.
>
> wunder
> --
> Walter Underwood
> wunder@inktomi.com
> Senior Staff Engineer, Inktomi
> http://www.inktomi.com/
>
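
For reference, Walter's suggestion might look something like the sketch
below: every Dublin Core value becomes a meta tag, and title and language
are additionally emitted in their native HTML/HTTP form. This is only an
illustration, not DP9's code; the function name, the record layout (a dict
of element name to list of values), the "DC." naming prefix, and the sample
creator value are all assumptions.

```python
from html import escape


def render_head(dc_record, index_page=False):
    """Build an HTML <head> carrying Dublin Core metadata for spiders."""
    lines = ["<head>"]
    for element, values in dc_record.items():
        for position, value in enumerate(values):
            text = escape(value, quote=True)
            # Native forms are emitted once, for the first value only.
            if position == 0 and element == "title":
                lines.append(f"<title>{text}</title>")
            if position == 0 and element == "language":
                lines.append(
                    f'<meta http-equiv="content-language" content="{text}">'
                )
            # Every value also gets a generic DC meta tag.
            lines.append(f'<meta name="DC.{element}" content="{text}">')
    if index_page:
        # Browsable lists of links: fine as crawl roots, poor as results.
        lines.append('<meta name="robots" content="noindex">')
    lines.append("</head>")
    return "\n".join(lines)


record = {
    "title": ["Hamlet"],
    "language": ["en"],
    "creator": ["Shakespeare, William"],  # sample value for illustration
}
head = render_head(record)
index_head = render_head({"title": ["Browse"]}, index_page=True)
```

The index-page variant carries only noindex, so links on it can still be
followed to the individual GetRecord pages.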
---
Michael L. Nelson
NASA Langley Research Center m.l.nelson@larc.nasa.gov
MS 158, Hampton, VA 23681 http://www.ils.unc.edu/~mln/
+1 757 864 8511 +1 757 864 8342 (f)