[OAI-implementers] implementation of non-English characters w/UTF-8?

Tue Sep 13 15:29:04 EDT 2005

How have other people handled non-English (non-ASCII) characters in their 
Data Provider (DP) records?

Specifically, records containing non-English characters are "choking" when 
we test our Data Provider.  [Think "e" with an accent over it, as in the 
surname that follows the first name "Elmo" in this record: 
http://lib-app1.usc.edu:8085/oaidp?verb=GetRecord&identifier=oai:usc:digitalarchive:bhe/bhe-m27&metadataPrefix=oai_dc 
]  Eventually, we will have to support several Asian-language character 
sets in addition to the current non-English characters.

I have looked through the protocol, various tutorials, the 
oai-implementers archives, and the OAI Best Practices site, and have not 
found any guidelines other than this thread:

http://www.openarchives.org/pipermail/oai-implementers/2001-April/000093.html

I'm also looking at OLAC and some of the DP implementations in Japan, 
such as the one below, but have not yet found a solution:

http://mitizane.ll.chiba-u.jp/cgi-bin/oai/oai2.0?verb=ListRecords&metadataPrefix=oai_dc

Will we just have to locate the individual characters that are causing 
the failures and encode each of them in a specific way?
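
To make the question concrete, here is a minimal sketch of the alternative 
to fixing characters one at a time: transcoding each field as a whole. 
OAI-PMH requires responses to be encoded in UTF-8. The sketch assumes the 
stored metadata is in ISO-8859-1, which is only a guess about our data; the 
function names and the sample bytes are hypothetical, not part of any 
existing DP code.

```python
from xml.sax.saxutils import escape


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8_xml_text(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode legacy bytes, escape XML markup characters, re-encode as UTF-8.

    source_encoding is an assumption: it must match how the metadata
    was actually stored, or the output will be wrong.
    """
    text = raw.decode(source_encoding)
    return escape(text).encode("utf-8")


# Hypothetical sample: "e" with an acute accent stored as the single
# ISO-8859-1 byte 0xE9, which is not valid UTF-8 on its own.
legacy_value = b"caf\xe9"

print(is_valid_utf8(legacy_value))                    # False
print(to_utf8_xml_text(legacy_value))                 # b'caf\xc3\xa9'
print(is_valid_utf8(to_utf8_xml_text(legacy_value)))  # True
```

If the stored data turns out to use a different encoding, only the 
source_encoding argument would need to change, and the same approach should 
carry over to the Asian-language records.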

Thanks in advance,

Jewel

-- 
Jewel H. Ward
Program Manager, USC Digital Archive
Leavey Library, Information Services Division
University of Southern California
Tel: (213) 821-2298   Cell: (213) 219-2784