[OAI-implementers] character vs entity references

Todd White tmwhite@merit.edu
Tue, 4 Nov 2003 09:58:55 -0500 (EST)


in Perl 5.6.1, can anyone tell me why the following well-documented
transliteration produces the error that follows?

$string =~ tr/\0-\x{ff}//UC;


ERROR:
Bareword found where operator expected at test.pl line 14, near
"tr/\0-\x{ff}//UC"

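A minimal sketch of a substitution-based alternative that does not rely on
the U and C modifiers. It assumes the text is held one character per Latin-1
byte and that the goal is the pre-processing step discussed in the quoted
message below: numeric character references in place of named entities and
raw non-ASCII characters. The names to_char_refs, named_to_numeric and
%LATIN1 are hypothetical, and %LATIN1 lists only a few of the Latin-1
entities for illustration.

```perl
use strict;
use warnings;

# Illustrative subset of the Latin-1 entity set: name => code point.
my %LATIN1 = (
    ntilde => 241,
    Ntilde => 209,
    eacute => 233,
    uuml   => 252,
);

# Hypothetical helper: rewrite known named entity references as decimal
# character references. Names not in %LATIN1 (including &amp;) are left as-is.
sub named_to_numeric {
    my ($text) = @_;
    $text =~ s{&([A-Za-z][A-Za-z0-9]*);}
              { exists($LATIN1{$1}) ? '&#' . $LATIN1{$1} . ';' : "&$1;" }ge;
    return $text;
}

# Hypothetical helper: replace every character above 0x7F with a decimal
# character reference, so the output is plain ASCII.
sub to_char_refs {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print named_to_numeric('Espa&ntilde;a'), "\n";   # prints Espa&#241;a
print to_char_refs("Espa\xF1a"), "\n";           # prints Espa&#241;a
```

Both helpers leave plain ASCII text untouched, so they can be applied to a
whole record before it is written into the XML response.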


On Tue, 4 Nov 2003, Hussein Suleman wrote:

> hi Todd
> 
> you are correct - a character reference is meant to be the numeric 
> version rather than textual.
> 
> and the validation problem you have experienced is precisely why OAI 
> requires the use of numeric references - so that your XML is 
> self-contained and does not import any external entity definition files.
> 
> if all you have are Latin-1 entities, it isn't too difficult to convert 
> them to numeric equivalents in a pre-processing stage. some of the 
> templates on the OAI website already have support for this (like the 
> VTOAI Perl package)
> 
> ttfn,
> ----hussein
> 
> 
> Todd White wrote:
> 
> > in the OAI-PMH 2.0 document:
> > http://www.openarchives.org/OAI/openarchivesprotocol.html
> > 
> > ...under "3.2. XML Response Format," it reads:
> > 
> > "Character references, rather than entity references, must be used."
> > 
> > i'm assuming that this means, for example, that the n-tilde (ene) should
> > be expressed with  &#241;  instead of  &ntilde;
> > 
> > is this correct?  is the first the character reference and the latter the
> > entity reference?
> > 
> > i've been struggling as of late with the issue of character encodings (the
> > verifier always chokes on records like the one we have with an n-tilde
> > (ene) in the title).  i'm now assuming that i should just serve these
> > records with the numeric entity code (as in &#241; for n-tilde).
> > 
> > can anyone confirm this?
> > 
> > 
> > Todd M. White 
> > Systems Research Programmer
> > 734.647.8649 (direct) ~~~ http://www.merit.edu/~tmwhite/
> > 
> > Merit Network, Inc.
> > 4251 Plymouth Road, Suite 2000, Ann Arbor, MI  48105-2785
> > 734.764.9430 (general) ~~~ 734.647.3745 (fax)
> > http://www.merit.edu/
> > 
> > Avoid people who say they know the answer.
> > Keep the company of people who are trying to understand the question.
> > 
> > _______________________________________________
> > OAI-implementers mailing list
> > List information, archives, preferences and to unsubscribe:
> > http://oaisrv.nsdl.cornell.edu/mailman/listinfo/oai-implementers
> > 
> 
> -- 
> =====================================================================
> hussein suleman ~ hussein@cs.uct.ac.za ~ http://www.husseinsspace.com
> =====================================================================