[OAI-implementers] hierarchical documents
Hussein Suleman
hussein@cs.uct.ac.za
Mon, 12 May 2003 11:41:42 +0200
hi
i haven't seen a "nice" solution to the problem of addressing hierarchy,
views and versions all at once. but if we consider the problems
independently, there are a few solutions that you can look at for ideas.
my favorite example of hierarchy-encoding is the IMS Content Packaging
specification, which makes embedded use of the IMS Metadata
Specification (see www.imsglobal.org for details). while the specs show
how to encapsulate content hierarchically, alas i don't know of anyone
actively exposing such packaged content through OAI-PMH - maybe someone
else will comment on this.
for views (such as the slide formats in your example), you could use a
metadata format that allowed for a list of simple objects. the CSTC
project that i used to work on did this with its native metadata format.
go to the following OAI record to see what we did:
http://www.cstc.org/cgi-bin/OAI/CSTC.pl?verb=GetRecord&metadataPrefix=oai_cstc&identifier=oai:CSTC:4
(of course, here the issue is: does a list denote parts or options? if
you want to be explicit, you should look at the IMS model.)
if you look at the same record in DC you will see that we used a "cover
page" to bind together the various files to form a single resource.
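
as a small illustration of looking at the same record in two formats, here is a python sketch that builds the GetRecord request for both the native format and DC. the base URL, identifier and oai_cstc prefix come from the link above; oai_dc is the standard OAI-PMH prefix for unqualified DC; the helper name get_record_url is made up for this example.

```python
from urllib.parse import urlencode

# base URL taken from the CSTC record link above
BASE_URL = "http://www.cstc.org/cgi-bin/OAI/CSTC.pl"


def get_record_url(identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord URL for one item in one metadata format.

    Hypothetical helper, not part of any OAI library.
    """
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        },
        safe=":",  # leave the colons in the oai: identifier unescaped
    )
    return f"{BASE_URL}?{query}"


# same item, two views of its metadata
native_url = get_record_url("oai:CSTC:4", "oai_cstc")
dc_url = get_record_url("oai:CSTC:4", "oai_dc")

print(native_url)
print(dc_url)
```

only the metadataPrefix changes between the two requests; the identifier stays the same because it names the item, not a particular metadata view.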
lastly, for versions, the OAI has a provenance container that allows you
to specify the relationship of a metadata record to an older version,
from a harvesting perspective. while it doesn't directly apply to
versions of the resource, the basic idea is similar ...
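
to give a feel for what the provenance container holds, here is a rough python sketch that builds one with the standard library. the element and attribute names follow my reading of the OAI provenance schema, so check them against the schema before relying on this; the base URL, identifier and dates are invented for the example, and build_provenance is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# namespace of the OAI provenance container (assumed from the OAI guidelines)
PROV_NS = "http://www.openarchives.org/OAI/2.0/provenance"
ET.register_namespace("prov", PROV_NS)


def build_provenance(base_url, identifier, datestamp, metadata_ns,
                     harvest_date, altered):
    """Describe where a harvested metadata record originally came from."""
    root = ET.Element(f"{{{PROV_NS}}}provenance")
    origin = ET.SubElement(
        root,
        f"{{{PROV_NS}}}originDescription",
        {
            "harvestDate": harvest_date,
            "altered": "true" if altered else "false",
        },
    )
    fields = (
        ("baseURL", base_url),
        ("identifier", identifier),
        ("datestamp", datestamp),
        ("metadataNamespace", metadata_ns),
    )
    for tag, value in fields:
        ET.SubElement(origin, f"{{{PROV_NS}}}{tag}").text = value
    return root


# all values below are made up for illustration
prov = build_provenance(
    base_url="http://example.org/oai",
    identifier="oai:example.org:all-about-nothing-paper",
    datestamp="2003-05-01",
    metadata_ns="http://www.openarchives.org/OAI/2.0/oai_dc/",
    harvest_date="2003-05-12T09:00:00Z",
    altered=False,
)
print(ET.tostring(prov, encoding="unicode"))
```

note that this records the origin of a metadata record, not of the resource; using the same shape for resource versions would be an adaptation of the idea rather than what the container was designed for.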
so, my take on your example is:
- the paper and slides could have separate identifiers as the resources
are themselves different (or use cover pages to hold them together)
- you can use "relation" tags (in DC or IMS-MS) to connect them if they
are separate
- instead of using metadata formats (which differentiate among views of
the metadata) or sets (which differentiate among categorisations of the
resource/metadata) to tell the files apart, you should use a single
metadata format with the inherent ability to differentiate among views
of the same item
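
the first two points could be sketched in python roughly as follows: two separate DC records, one for the paper and one for the slides, each pointing at the other with a "relation" element. the oai_dc and dc namespaces are the standard ones; the identifiers, file URLs and the dc_record helper are invented for the example.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def dc_record(title, creator, file_urls, related_ids):
    """Build an oai_dc record listing its files and its related items."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = title
    ET.SubElement(root, f"{{{DC_NS}}}creator").text = creator
    for url in file_urls:
        ET.SubElement(root, f"{{{DC_NS}}}identifier").text = url
    for related in related_ids:
        ET.SubElement(root, f"{{{DC_NS}}}relation").text = related
    return root


# hypothetical OAI identifiers for the two resources
PAPER_ID = "oai:example.org:all-about-nothing-paper"
SLIDES_ID = "oai:example.org:all-about-nothing-slides"

paper = dc_record(
    "All about nothing",
    "Mr. Smith",
    ["http://example.org/files/all-about-nothing.paper.pdf"],
    [SLIDES_ID],  # the paper points at the slides
)
slides = dc_record(
    "All about nothing (slides)",
    "Mr. Smith",
    ["http://example.org/files/all-about-nothing.slides.pdf"],
    [PAPER_ID],  # the slides point back at the paper
)

print(ET.tostring(paper, encoding="unicode"))
print(ET.tostring(slides, encoding="unicode"))
```

plain DC cannot say which file is the source and which is the output, which is why a richer format is the better fit for the views themselves.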
the book example is more complicated and i haven't seen a solution i
like yet :) however, the principle i would apply is:
- expose the most useful granularity of data i.e. if a book is most
often used as a single unit, expose it as such ... and similarly for
separate chapters.
hope at least some of this makes sense :)
ttfn,
----hussein
Jakob Voss wrote:
> Hi!
>
> I am relatively new to OAI and am going to set up a simple Data Provider
> for different kinds of publications. Many publications consist of
> different parts in different file formats and publication types but
> I do not know how to deal with them. For instance:
>
> Title: All about nothing
> Author: Mr. Smith
> Files:
> Slides
> PowerPoint Source: all-about-nothing.slides.ppt
> PDF Output: all-about-nothing.slides.pdf
> Paper
> OpenOffice Source: all-about-nothing.paper.ooo
> RTF Exchange: all-about-nothing.paper.rtf
> PDF Output: all-about-nothing.paper.pdf
>
> I thought about several possibilities:
> - All files are one document with several dc:identifier
> elements, one per file (information about type and format is lost)
> - Each file is one document (much duplication)
> - Create sets for each publication and type (more sets than
> publications)
>
> How do you store information about hierarchical
> document relationships? For instance an article (with several
> versions, types and formats) can be part of a book that
> is part of a series and so on.
>
> Thanks for your comments!
>
> Jakob Voß
--
=====================================================================
hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com
=====================================================================