Ben,

I could not agree more with Mike Nelson's response to you. The Cornell Digital Library Research Group (Southampton partners) would also be interested in extracting references and citation data straight from the text of the papers. Southampton does arXiv and TeX and some PDF; Cornell does HTML and XML and some PDF.

Donna