[OAI-implementers] XML Schema problem?
Jeffrey A. Young
jyoung1@columbus.rr.com
Fri, 20 Apr 2001 19:10:48 -0400
Someone noticed that my OAIHarvester has not been working correctly lately. It
turns out that the Xerces XML parser is convinced that all the records I
harvest are flagged as status="deleted". Since this clearly isn't the case,
I stripped the program down until I had a small example program that
shows the effect. The Java source code is attached.

Basically, if I call DocumentBuilderFactory.setValidating(true) and then
parse the XML into a DOM Document, it silently "corrects" my records to
status="deleted". If I dump the Document, everything looks fine, but when I
actually query the status attribute, it reports back a value of "deleted".
On the other hand, if I specify setValidating(false), everything works fine.

I suspect the problem is that the XML Schema needs to make the status
attribute optional. Another possibility is that Xerces is processing the
XML Schema incorrectly. I can work around the problem by always using
setValidating(false), but that doesn't seem right. If someone has a better
solution, I would appreciate it. Thanks.

Jeff
---
Jeffrey A. Young
Senior Consulting Systems Analyst
Office of Research, Mail Code 710
OCLC Online Computer Library Center, Inc.
6565 Frantz Road
Dublin, OH 43017-3395
www.oclc.org
Voice: 614-764-4342
Fax: 614-764-2344
Email: jyoung@oclc.org

Attachment: Test.java

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.*;
import org.w3c.dom.*;
import org.xml.sax.*;

public class Test {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

            // The single argument selects whether the parser validates.
            String arg = "";
            if (args.length == 1)
                arg = args[0];
            if (arg.equals("true")) {
                factory.setValidating(true);
            } else if (arg.equals("false")) {
                factory.setValidating(false);
            } else {
                System.err.println("Usage: java Test [true|false]");
                System.exit(-1);
            }

            factory.setNamespaceAware(true);
            DocumentBuilder parser = factory.newDocumentBuilder();

            // A ListRecords response with one record that carries no status
            // attribute. The ampersand in requestURL is escaped as &amp; so
            // that the document is well-formed XML.
            String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<ListRecords"
                + " xmlns=\"http://www.openarchives.org/OAI/1.0/OAI_ListRecords\""
                + " xmlns:xsi=\"http://www.w3.org/2000/10/XMLSchema-instance\""
                + " xsi:schemaLocation=\"http://www.openarchives.org/OAI/1.0/OAI_ListRecords"
                + " http://www.openarchives.org/OAI/1.0/OAI_ListRecords.xsd\">"
                + "<responseDate>2001-04-20T14:48:40-05:00</responseDate>"
                + "<requestURL>http://orc:4342/etdcat/servlet/OAIHandler"
                + "?metadataPrefix=oai_dc&amp;verb=ListRecords</requestURL>"
                + "<record><header>"
                + "<identifier>oai:etdcat:ocm02999966</identifier>"
                + "<datestamp>2001-02-02</datestamp>"
                + "</header><metadata>"
                + "<dc xmlns=\"http://purl.org/dc/elements/1.1/\""
                + " xmlns:xsi=\"http://www.w3.org/2000/10/XMLSchema-instance\""
                + " xsi:schemaLocation=\"http://purl.org/dc/elements/1.1/"
                + " http://www.openarchives.org/OAI/dc.xsd\">"
                + "</dc></metadata></record>"
                + "<resumptionToken>987796143360:100:oai_dc</resumptionToken>"
                + "</ListRecords>";

            // Parse the string into a DOM Document.
            StringReader sr = new StringReader(xml);
            InputSource is = new InputSource(sr);
            Document doc = parser.parse(is);

            // Query the status attribute of the first record element.
            Element docEl = doc.getDocumentElement();
            NodeList list = docEl.getElementsByTagName("record");
            Element recEl = (Element)list.item(0);
            System.out.println("status = " + recEl.getAttribute("status"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
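
The message describes getAttribute("status") returning a value that the harvested record never contained. One possible way to keep validation on and still tell the two cases apart is the DOM Attr.getSpecified() flag, which is false when an attribute value was supplied as a default rather than written in the document. The sketch below is only an illustration under stated assumptions: it reproduces a defaulted attribute with an internal DTD subset instead of the OAI XML Schema, so it runs without network access, and it does not confirm that the Xerces release discussed above sets the flag the same way for schema-supplied values. The class SpecifiedStatus and the helpers explicitStatus and firstRecord are hypothetical names, not part of OAIHarvester.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class SpecifiedStatus {

    // Hypothetical helper: returns the status value only if it was written
    // in the document itself; returns null if the attribute is absent or
    // was supplied by the parser as a default.
    static String explicitStatus(Element record) {
        Attr attr = record.getAttributeNode("status");
        if (attr == null || !attr.getSpecified()) {
            return null;
        }
        return attr.getValue();
    }

    // Hypothetical helper: parses the XML and returns its first record element.
    static Element firstRecord(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder parser = factory.newDocumentBuilder();
        Document doc = parser.parse(new InputSource(new StringReader(xml)));
        return (Element) doc.getDocumentElement()
                            .getElementsByTagName("record").item(0);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: an internal DTD subset stands in for the OAI schema and
        // declares "deleted" as the default value of status.
        String dtd = "<!DOCTYPE ListRecords ["
                   + "<!ATTLIST record status CDATA \"deleted\">]>";
        Element defaulted =
            firstRecord(dtd + "<ListRecords><record/></ListRecords>");
        Element explicit =
            firstRecord(dtd + "<ListRecords><record status=\"deleted\"/></ListRecords>");

        System.out.println("getAttribute, no status written:   "
            + defaulted.getAttribute("status"));
        System.out.println("explicitStatus, no status written: "
            + explicitStatus(defaulted));
        System.out.println("explicitStatus, status written:    "
            + explicitStatus(explicit));
    }
}
```

Checking getSpecified() keeps the decision in the harvester rather than in the parser configuration, so validation can stay enabled. Whether that is preferable to changing the schema is a separate question that this sketch does not settle.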