[OAI-implementers] Metadata Language Confusion

Gary Simons gary_simons@sil.org
Mon, 16 Jul 2001 14:09:25 -0400


This is in response to a thread from last Friday regarding
multilingual metadata.

You may want to look into the approach we are taking in the Open
Language Archives Community (www.language-archives.org) where
multilingualism is built into the metadata set.  We are a subset of
OAI data providers who are supporting an OLAC metadata set in addition
to the required DC metadata set.  In the OLAC metadata set we are
following the approach Diane Hillmann suggested in her response to this
thread. Since each metadata element is repeatable, we use an optional
lang attribute on each element to encode the language of the element
content (with a default of "en"). We also have that attribute on the
container element for the entire metadata record, but in this case it
takes a space-delimited list of language codes, namely, all languages
in which the record is designed to be presented.  Service providers
may then use that information to customize the display for users.
(N.B.: the container-level list is not just the union of all language
tags within the elements, since a record in English, for instance,
could have a foreign language title, which may or may not also be
translated into English; such a record would specify only English at
the record level, since the whole record is not available in the
original language of the title.)
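
To make the two levels concrete, here is a minimal sketch in Python of how
a service provider might read them.  The element names (olac, title,
description) and the sample record are hypothetical illustrations, not
taken from the OLAC schema; only the lang attribute, its "en" default on
elements, and the space-delimited list on the container come from the
description above.  Applying the same "en" default to the container is an
assumption made for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names are illustrative only, not the real
# OLAC schema. The record is presented in English but carries a Spanish
# title alongside its English translation.
RECORD = """
<olac lang="en">
  <title lang="es">Gramática de la lengua</title>
  <title>Grammar of the language</title>
  <description>A reference grammar.</description>
</olac>
"""

DEFAULT_LANG = "en"  # default for element content, per the description above


def record_languages(record):
    """Languages in which the whole record is designed to be presented.

    The container's lang attribute holds a space-delimited list of codes.
    Falling back to DEFAULT_LANG when it is absent is an assumption.
    """
    return record.get("lang", DEFAULT_LANG).split()


def elements_in(record, lang):
    """Return (tag, text) pairs for child elements whose content is in lang."""
    return [
        (el.tag, el.text)
        for el in record
        if el.get("lang", DEFAULT_LANG) == lang
    ]


record = ET.fromstring(RECORD)
print(record_languages(record))
print(elements_in(record, "es"))
print(elements_in(record, "en"))
```

Note that the Spanish title does not add "es" to the record-level list: a
display customized for Spanish-speaking users would find only one element
in that language, so the record is offered in English alone.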

The lang attribute is explained in section 2 of our OLAC Metadata Set
document:

     http://www.language-archives.org/OLAC/olacms.html

Gary Simons
SIL International (www.sil.org)
7500 W. Camp Wisdom Rd., Dallas, TX
Phone: (972) 708-7487