[OAI-implementers] pointies in abstracts
Simeon Warner
simeon@lanl.gov
Wed, 4 Jul 2001 15:08:49 -0600 (MDT)
On Mon, 2 Jul 2001, Joe Futrelle wrote [excerpt]:
> The internal DTD subset strategy ought to work for any non-validating
> parser. However the XML spec is unclear on whether non-validating
> parsers are expected to process externally-referenced DTD's.
>
> Just a quick example to illustrate the internal DTD subset strategy;
> suppose you need to use a copyright symbol, which in HTML is &copy;
> and in ISO-8859-1 is 169. You could do it like this:
>
> <?xml version='1.0' encoding='ISO-8859-1'?>
> <!DOCTYPE myEntities [
> <!ENTITY copy "&#169;">
> ]>
> <GetRecord ... etc ...
Note that the current OAI spec permits only encoding="UTF-8" (see section
3.1.2.1 Content-Type). The same approach would, of course, work equally
well with UTF-8 encoding.
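As a rough check of that claim, here is a minimal sketch in Python that parses a UTF-8 document declaring the entity in its internal DTD subset. It assumes the standard library's expat-based, non-validating parser; the record and rights element names and the sample text are made up for illustration and are not part of the OAI schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical response fragment: same internal DTD subset as in Joe's
# example, but declared as UTF-8. Element names are illustrative only.
doc = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<!DOCTYPE myEntities [\n"
    "<!ENTITY copy \"&#169;\">\n"
    "]>\n"
    "<record><rights>&copy; 2001 Example Archive</rights></record>"
).encode("utf-8")

# The parser does not validate, so the DOCTYPE name need not match the
# root element; it only picks up the entity declaration.
root = ET.fromstring(doc)
rights_text = root.find("rights").text

print(repr(rights_text))
```

The numeric character reference names a Unicode code point, so it does not depend on the document encoding; only the encoding declaration differs from the ISO-8859-1 version.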
On Mon, 2 Jul 2001, Hussein Suleman wrote [excerpt]:
> - XSV currently supports external entity references as follows:
> --- the external entity file must exist or the program crashes
> unceremoniously
> --- if the entity itself does not resolve it's only a warning, not an
> error
>
> bottom line: unless all users of your OAI interface have the full
> complement of popular entity files they may run into problems ...
>
> suggestion: convert all named entities to Unicode when generating OAI
> responses (that's what I do)
I strongly support this suggestion. We are, after all, trying to promote
interoperability and using Unicode (UTF-8) seems to be a positive step
in that direction.
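For what it is worth, one way such a conversion might look is sketched below in Python. This is not Hussein's actual code: the function name is hypothetical, and using the HTML 4 entity table from the standard library is my own assumption about which names to handle. The entities predefined by XML (amp, lt, gt, quot, apos) are left alone because they must stay escaped in the response.

```python
import re
from html.entities import name2codepoint

# Entities predefined by XML; expanding these would break well-formedness.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_entities_to_unicode(text):
    """Replace HTML 4 named entities with the Unicode characters they name.

    Hypothetical helper: unknown names and XML's predefined entities are
    returned unchanged.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])

    return _NAMED_ENTITY.sub(replace, text)


abstract = "&copy; 2001 M&uuml;ller &amp; Co, where a &lt; b"
converted = named_entities_to_unicode(abstract)

# Encode as UTF-8 when writing the response.
response_bytes = converted.encode("utf-8")
```

Names that are not in the table are passed through untouched, so they would still need separate handling before the response is well-formed.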
Cheers,
Simeon.