[OAI-implementers] Re: OAI sets as new instances (Sets Proposal(from DLF))

Jeffrey A. Young jyoung1 at columbus.rr.com
Sat May 7 19:54:54 EDT 2005


I want to make sure Rob gets some credit for already knowing that OAI and
SRW play well together. The article that he co-wrote with Ralph LeVan and me
describes the general principles.

http://www.dlib.org/dlib/february05/sanderson/02sanderson.html

Jeff

> -----Original Message-----
> From: oai-implementers-bounces at openarchives.org
> [mailto:oai-implementers-bounces at openarchives.org] On Behalf Of Simeon Warner
> Sent: Friday, May 06, 2005 12:21 PM
> To: oai-implementers at openarchives.org
> Subject: Re: [OAI-implementers] Re: OAI sets as new instances (Sets
> Proposal(from DLF))
> 
> 
> A somewhat late response to this thread...
> 
> I think that the key to avoiding sets being misused as a poor man's
> search is education. We must keep repeating the mantra that this is not
> what OAI-PMH was designed to do and if search is desired then SRU/SRW or
> similar should be used. Shirley Hyatt and Jeff Young have demonstrated
> OAI-PMH and SRU/SRW playing nicely together at OCLC
> (http://www.dlib.org/dlib/march05/hyatt/03hyatt.html)
> 
> The problem of items moving out of sets is a real one, and one we need to
> address within the protocol.
> 
> Having said this, there may be situations where Robert's suggestion makes
> sense. However, just as with sets, I suspect deployment should be
> judicious -- to meet real selective harvesting needs.  I note that one
> additional requirement for implementation is that items MUST be identified
> using a recognized global URI scheme. Unless that is the case then
> harvesters should assume that ids within OAI-PMH responses are local, and
> then multiple repositories with the same content would be confusing.
> 
> Cheers,
> Simeon
> 
> On Mon, 25 Apr 2005, Dr Robert Sanderson wrote:
> > On Fri, 22 Apr 2005, Thomas G. Habing wrote:
> > > time articulating.  Perhaps the problem is that there are several different
> > > issues with sets, and I'm not sure which of these we are really trying to
> > > address.
> > >
> > > 1) The tendency of people to misunderstand sets as a sort of poor man's
> > > search.
> >
> > I think that by moving the set name into the URL it doesn't get rid of
> > this, but it does lessen the tendency to think this way.  When it's a
> > parameter in the query, it's easy to cram any arbitrary value in there.
> > It's less intuitive to do this when the set name is part of the URL.
> >
> > > 2) Technical issues relating to how to signal that a record has been moved
> > > out of a set, but has not been deleted from the repository.
> >
> > This wasn't something I was thinking of when writing it up, but it does
> > fall out neatly from the proposal -- you simply set them deleted in the
> > set repository.
> >
> > > 3) How best to describe a set: there is a technical description such as how
> > > many items are in the set and what the update frequency is.  There is also
> > > the conceptual description, such as the records in this set are all described
> > > by this subject heading, or they all belong to this "collection," or they all
> > > have this publishing status.
> >
> > The advantage here is that you have all of the best practices and schemas
> > for the Identify verb for the set descriptions. What exactly
> > to put in here is still in need of work, but I think it's a good start to
> > allow the full Identify information.
> >
> > > 4) Issues such as whether it's a good idea to have overlapping sets, flat
> > > sets, hierarchical sets, and in which circumstances.
> >
> > Whether it's a good idea? I'm not going to comment on that, besides the
> > point that there are hierarchical collections and sub-collections, so it's
> > natural to describe these in a hierarchical tree of sets.
> > The main advantage here is that everything falls out neatly -- if you want
> > a tree, then design your URLs to be a tree.  If you want overlapping, flat
> > or any other design, then it's up to the design of the URL paths, not the
> > protocol to try and fit all of the requirements.
> >
> >
> > > 5) Variations in how different implementers have interpreted the OAI
> > > "data model".
> >
> > I don't think that the proposal addresses this.
> >
> > > Briefly some of my misgivings:
> > > Does Rob's model place an excessive burden on data providers, or service
> > > providers?
> >
> > The burden on the data providers can be handled in at least two different
> > ways -- either multiple instances of the script, or one server which
> > handles everything.  Multiple instances is easier than the status quo (no
> > sets, no extra URLs).  One server is as hard as the status quo, but
> > depending on the underlying architecture it may be no more difficult, or
> > it may be quite a bit harder (at which point, there's always multiple
> > instances of the server code).
> >
> > For service providers, it should be easier, as they can simply follow the
> > links in the <friends> section, rather than having to construct parameters
> > from the ListSets response.
> >
> >
> > > Does it fundamentally alter the underlying data model of OAI, for better or
> > > worse?  Previously, I think that items belonged to one or more sets, and
> > > records were disseminations of these items in a specific format.  I think
> > > Rob's model alters this to something like records being disseminations of
> > > items within the context of those items being contained in a particular set.
> >
> > Mmmmm. I have no real comment here.  There's nothing to prevent you from
> > having different representations of the same object disseminated in
> > different sets, but that's no different to today where some providers make
> > sets available per record schema.
> >
> > I think that's a best practice issue which should be addressed, but is
> > mostly orthogonal to the proposal?
> >
> > Rob
> >
> >        ,'/:.          Dr Robert Sanderson (azaroth at liverpool.ac.uk)
> >      ,'-/::::.        http://www.csc.liv.ac.uk/~azaroth/
> >    ,'--/::(@)::.      Dept. of Computer Science, Room 805
> > ,'---/::::::::::.    University of Liverpool
> > ____/:::::::::::::.
> > I L L U M I N A T I  Cheshire3 IR System:  http://www.cheshire3.org/
> >
> > _______________________________________________
> > OAI-implementers mailing list
> > List information, archives, preferences and to unsubscribe:
> > http://www.openarchives.org/mailman/listinfo/oai-implementers
> >
> 
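
For readers following the quoted thread, the difference between a set passed as a query parameter and a set that is part of the URL can be sketched roughly as below. This is only an illustration under stated assumptions: the host name, the function names, and the mapping of the ":" hierarchy separator in a setSpec onto URL path segments are hypothetical and are not taken from the proposal itself. Only the ListRecords verb and the metadataPrefix and set arguments come from OAI-PMH.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL, used only for illustration.
BASE_URL = "http://repository.example.org/oai"


def list_records_with_set_param(base_url, set_spec, metadata_prefix="oai_dc"):
    """Status quo: the set is just another argument in the query string."""
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix, "set": set_spec}
    )
    return f"{base_url}?{query}"


def list_records_with_set_instance(base_url, set_spec, metadata_prefix="oai_dc"):
    """Sketch of the proposal: each set is its own repository instance.

    Assumption: every level of a hierarchical setSpec ("a:b") becomes
    one path segment of the instance's base URL ("/a/b").
    """
    path = "/".join(quote(part, safe="") for part in set_spec.split(":"))
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}/{path}?{query}"


if __name__ == "__main__":
    print(list_records_with_set_param(BASE_URL, "physics:hep"))
    print(list_records_with_set_instance(BASE_URL, "physics:hep"))
```

In the second form a harvester treats each set URL as an ordinary base URL and never has to build a set argument, which seems to be the point being made about service providers following links rather than constructing parameters.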
