Open Archives Initiative
Protocol for Metadata Harvesting - Tools
OAI-PMH Tools
The following table contains links to tools implemented by members of the Open Archives Initiative community. These tools are made available without guarantee as to their correctness. Questions about each tool should be directed to the individual implementer. All tools support OAI-PMH v2.0; a few also include legacy support for v1.0 and v1.1, which is noted in the description.
Tool | Implementer | Description |
---|---|---|
Arc source | Old Dominion University | Arc is released under the NCSA Open Source License. Arc is a federated search service based on OAI-PMH. It includes a harvester, a search engine together with a simple search interface, and an OAI-PMH layer over harvested metadata. Arc can be configured for a specific community, and enhancements and customizations by the community are encouraged. Arc is based on Java Servlet technology and requires JDK1.4, Tomcat 4.0x, and an RDBMS server (tested with Oracle and MySQL). |
Archimede | Laval University Library | Archimede is open-source software for institutional repositories. It features full text searching, multiplatform support, a Web user interface, and more. Archimede fully supports OAI-PMH version 2.0 requests. |
DSpace | HP Labs and MIT Libraries | DSpace is an open source digital asset management software platform that enables institutions to capture and describe digital content. It runs on a variety of hardware platforms and supports OAI-PMH version 2.0. |
EnhancedOAIServer | National Documentation Center, Greece | The enhanced OAI server is a Java Servlet web application that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) v2.0. It is based on OAICat and the Biblio Transformation Engine (BTE). Beyond the flexibility that BTE provides (custom filters and modifiers), it lets the administrator define metadata mappings using XSLT stylesheets, which makes it easy to support additional metadata formats. |
eprints.org | University of Southampton | Software to run centralised, discipline-based as well as distributed, institution-based archives of scholarly publications. The software is OAI compliant, i.e. metadata can be harvested from repositories running the software using the OAI metadata harvesting protocol. |
Fedora | Cornell University | An open source digital repository architecture that allows packaging of content and distributed services associated with that content. Fedora supports OAI-PMH requests on content in the repository. |
Goobi | Goobi developer team | The Goobi viewer is an open source solution for the presentation of digitized cultural assets. In addition to the pure object display, it also supports the merging of records into digital collections and a search in the entire database. The user interface of the Goobi viewer is available in various languages and can be completely adapted to individual requirements. The content can be curated and individual pages created using the integrated content management system. The OAI-PMH interface of the Goobi viewer currently supports the formats METS/MODS, MARCXML, DublinCore, LIDO, TEI, CMDI, ESE and xepicur. |
Kuha2 | Finnish Social Science Data Archive FSD / University of Tampere | Kuha2 is a metadata server that provides descriptive social science research metadata for harvesting via multiple protocols and a growing variety of metadata standards. The software is a collection of applications and consists of three server applications, a client application and a database. |
MARCXML framework | Library of Congress | A suite of tools, stylesheets, guidelines and XML documents to support MARC21 records in the XML environment. Includes tools to support transformation/migration from oai_marc to MARCXML, including an XML schema for MARC21 records. |
MyCoRe | MyCoRe community | MyCoRe is an open source software solution that provides functionality for institutional repositories and archives. The software is to a great extent adaptable to meet distinct requirements. MyCoRe supports OAI-PMH requests on content in the repositories. |
Net::OAI::Harvester | Ed Summers | Net::OAI::Harvester provides an object-oriented client interface to the data found in OAI-PMH repositories (similar to what LWP::UserAgent does for HTTP). |
OAIA | University of Southampton | Based on Perl and MySQL, OAIA is a simple mechanism for caching and aggregating OAI repositories. |
OAI-PMH client for R | Scott Chamberlain | An OAI-PMH client for the R programming language. Source code available at https://github.com/ropensci/oai. |
OAI-PMH Spring Boot Starter | IT Department of the City of Munich | The current landscape of tools available for developing OAI-PMH data providers, especially in Java, often includes outdated, unsupported, or inflexible solutions. These limitations pose challenges for organizations requiring custom implementations. Our Spring Boot Starter bridges this gap by offering a modern, robust, and easy-to-integrate solution tailored for developers using the Spring Boot ecosystem. We believe it can significantly streamline the development process while adhering to OAI-PMH standards. |
OAI Java Implementation for Linux | University of Illinois, Urbana-Champaign | This is a simple, illustrative implementation of the OAI metadata protocol, using Java. The code is available on Source Forge (http://sourceforge.net/project/showfiles.php?group_id=47963). |
OAI Implementation for Windows NT/Windows 2000 | University of Illinois, Urbana-Champaign | This is a simple, illustrative implementation of the OAI metadata protocol, using Microsoft Windows NT server technologies. The code is available on Source Forge (http://sourceforge.net/project/showfiles.php?group_id=47963). |
OAICat | OCLC | OAICat is a Java Servlet web application providing an OAI-PMH v2.0 repository framework. The framework can be customized to work with arbitrary data repositories by implementing some Java interfaces. A demonstration implementation is available for download on the OAICat home page. |
OAI-PMH-Harvester-for-ObjC | National Documentation Center, Greece | A wrapper over OAI-PMH written in Objective-C, used mostly in iOS development. |
oai-perl library | University of Southampton | A library of Perl classes that allow the rapid deployment of an OAI compatible interface to an existing web server/database. |
oai-pmh2 | Open Culture Consulting | This is a stand-alone OAI-PMH 2.0 data provider. It serves records in any XML metadata format from a SQL database, supports deleted records, resumption tokens and sets. Written in PHP. |
OaiPmhNet | Roman Niklaus | OaiPmhNet is a .NET library implementing the OAI-PMH 2.0 specification by using C# as programming language. The library can be customized to work with arbitrary data repositories by implementing a few interfaces. A demonstration implementation can be found in the unit test project. |
OAI-PMH Pack | Infrae | Infrae has extended Silva so it allows users to browse and search harvested metadata, further enriching the extensive feature-set of this open source CMS. An organization that uses Silva can thus easily become an OAI-PMH Service Provider. In the process, Infrae also developed a module for accessing OAI-PMH compliant repositories in Python, and developed a sophisticated harvesting and indexing system for using harvested metadata in Zope. These reusable components are designed to be building blocks for other Python or Zope-based applications. |
PEAR::OAI | ZZ/OSS Information Networking | A PHP implementation of the OAI-PMH Data Provider, provided as a class library based on the PEAR classes. |
Perl Harvester | Virginia Tech | Object-oriented harvester class with support for OAI-PMH v1.0, v1.1, and v2.0. Includes sample code to illustrate usage. |
PHP OAI Data Provider | University of Oldenburg | This implementation fully complies with OAI-PMH 2.0, including support for on-the-fly output compression, which may significantly reduce the amount of data being transferred. |
PHP OAI-PMH 2.0 Harvester library | Casey McLaughlin | This library provides an interface to harvest OAI-PMH metadata from any OAI 2.0 compliant endpoint. Features: PSR-0 through PSR-2 compliant; Composer-compatible; unit-tested; prefers Guzzle for the HTTP transport layer, but can fall back to cURL; easy-to-use iterator that hides all the HTTP junk necessary to get paginated records. |
PostgreSQL Foreign Data Wrapper for OAI-PMH (oai_fdw) | Jim Jones | A PostgreSQL Foreign Data Wrapper to access OAI-PMH repositories (Open Archives Initiative Protocol for Metadata Harvesting). This wrapper supports the OAI-PMH 2.0 Protocol. |
Rapid Visual OAI Tool | Old Dominion University | Rapid Visual OAI Tool (RVOT) can be used to graphically construct an OAI-PMH repository from a collection of files. The records in the original collection can be in any one of the acceptable formats. The formats currently supported are RFC1807, a MARC subset, and COSATI. RVOT helps to define the mapping visually from a native format to oai_dc format, and once this is done the tool can respond to OAI-PMH requests. The tool is self-contained; it comes with a lightweight HTTP server and OAI-PMH request handler and is written in Java. RVOT is designed so that it can be easily extended to support other metadata formats. |
Shell Harvester | Wim Muskee | The OAI-PMH Shell Harvester is able to harvest OAI-PMH targets. It supports multiple configurable targets which can be updated individually. Furthermore, it is able to execute a preset command for each record it updates or deletes. |
simple-oai-pmh | Open Culture Consulting | A stand-alone OAI-PMH data provider. It serves records in any metadata format from directories of XML files using the directory name as metadata prefix, the filename as identifier and the filemtime as datestamp. 0-byte files are considered deleted records and handled accordingly. Resumption tokens are managed using files. Written in PHP. |
Static Repository Gateway | LANL | An implementation of a static repository gateway that complies with the specification at http://www.openarchives.org/OAI/2.0/guidelines-static-repository.htm |
utf8conditioner | Cornell University | This is a small C program that will either check or 'fix' a UTF-8 byte stream. It was designed to be used within an OAI harvester to attempt to remove bad codes from supposedly UTF-8 byte streams so that they can then be parsed using a standard XML parser which would otherwise fail. |
VTOAI OAI-PMH Perl Implementation | Virginia Tech | This toolkit implements the skeleton of the OAI-PMH v2.0 in an object-oriented fashion, thus hiding the details of the protocol from code that is derived from the predefined class. |
ZMARCO | University of Illinois, Urbana-Champaign | ZMARCO is an Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) 2.0 compliant data provider. The 'Z' in ZMARCO stands for Z39.50; 'MARC' stands for MAchine-Readable Cataloging; and the 'O' stands for OAI, as in the Open Archives Initiative. ZMARCO allows MARC records that are already available through a Z39.50 server to be made available via OAI-PMH with relatively little effort. |
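
The file-based convention described for simple-oai-pmh in the table (directory name as metadata prefix, filename as identifier, file modification time as datestamp, 0-byte files as deleted records) can be illustrated with a short Python sketch. This is not the tool's actual code, which is written in PHP. The function name `scan_records`, the returned dictionary keys, the `root/<prefix>/<file>` layout, and the choice to strip the file extension from the identifier are all assumptions made for illustration.

```python
import os
import tempfile
from datetime import datetime, timezone


def scan_records(root):
    """Map a directory tree to OAI-PMH-style record headers.

    Assumed (hypothetical) layout: root/<metadataPrefix>/<identifier>.xml
    """
    records = []
    for prefix in sorted(os.listdir(root)):
        prefix_dir = os.path.join(root, prefix)
        if not os.path.isdir(prefix_dir):
            continue
        for name in sorted(os.listdir(prefix_dir)):
            path = os.path.join(prefix_dir, name)
            if not os.path.isfile(path):
                continue
            stat = os.stat(path)
            # File modification time becomes the record datestamp (UTC).
            datestamp = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            records.append({
                "metadataPrefix": prefix,
                # Assumption: the identifier is the filename without extension.
                "identifier": os.path.splitext(name)[0],
                "datestamp": datestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
                # A 0-byte file marks a deleted record.
                "deleted": stat.st_size == 0,
            })
    return records


def demo():
    """Build a tiny temporary tree with one live and one deleted record."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "oai_dc"))
        with open(os.path.join(root, "oai_dc", "rec1.xml"), "w") as f:
            f.write("<record/>")
        # Empty file: treated as a deleted record.
        open(os.path.join(root, "oai_dc", "rec2.xml"), "w").close()
        return scan_records(root)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

A real data provider would additionally have to answer the OAI-PMH verbs and manage resumption tokens; the sketch covers only the mapping from files to record headers.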