MARC4J provides an easy-to-use API for working with MARC, MARCXML, and MARC JSON in Java. MARC stands for MAchine Readable Cataloging and is a widely used exchange format for bibliographic data. MARCXML provides a lossless conversion between MARC (MARC21, but also other formats such as UNIMARC) and XML.
Features
The MARC4J library includes:
- An easy to use interface that can handle large record sets.
- Readers and writers for both MARC and MARCXML.
- A built-in pipeline model to pre- or post-process MARCXML using XSLT stylesheets.
- A MARC record object model (like DOM for XML) for in-memory editing of MARC records.
- Support for data conversions from MARC-8 ANSEL, ISO5426 or ISO6937 to UCS/Unicode and back.
- A forgiving reader which can handle and recover from a number of structural or encoding errors in records.
- Implementation-independent XML support through JAXP and SAX2, a high-performance XML interface.
- Support for conversions between MARC and MARCXML.
- Tight integration with the JAXP, DOM and SAX2 interfaces.
- Easy to integrate with other XML interfaces like DOM, XOM, JDOM or DOM4J.
- Command-line utilities for MARC and MARCXML conversions.
- Javadoc documentation.
MARC4J provides readers and writers for MARC and MARCXML. An org.marc4j.MarcReader implementation parses input data and provides an iterator over a collection of org.marc4j.marc.Record objects. The record object model is also suitable for in-memory editing of MARC records, just as DOM is used for editing XML. Using an org.marc4j.MarcWriter implementation, you can create MARC or MARCXML. Once MARC data has been converted to XML, you can process the result further with XSLT, for example to convert MARC to MODS.
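The reader-to-writer flow described above can be sketched as follows. This is a minimal illustration, assuming MARC4J 2.x is on the classpath; the class name `MarcToXmlSketch`, the helper names, and the sample record data are made up for the example. To keep it self-contained, the binary MARC input is produced in memory with `MarcStreamWriter` rather than read from a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.marc4j.MarcReader;
import org.marc4j.MarcStreamReader;
import org.marc4j.MarcStreamWriter;
import org.marc4j.MarcWriter;
import org.marc4j.MarcXmlWriter;
import org.marc4j.marc.DataField;
import org.marc4j.marc.MarcFactory;
import org.marc4j.marc.Record;

public class MarcToXmlSketch {

    // Copies every record from a binary MARC stream into a MARCXML collection.
    static void marcToXml(InputStream marcIn, OutputStream xmlOut) {
        MarcReader reader = new MarcStreamReader(marcIn);
        MarcWriter writer = new MarcXmlWriter(xmlOut, true); // true = indent the output
        while (reader.hasNext()) {
            writer.write(reader.next());
        }
        writer.close(); // finishes the enclosing collection element
    }

    // Builds one illustrative record and serializes it as binary MARC (sample data only).
    static byte[] sampleMarc() {
        MarcFactory factory = MarcFactory.newInstance();
        Record record = factory.newRecord("00000cam a2200000 a 4500");
        record.addVariableField(factory.newControlField("001", "12883376"));
        DataField title = factory.newDataField("245", '1', '0');
        title.addSubfield(factory.newSubfield('a', "Summerland /"));
        title.addSubfield(factory.newSubfield('c', "Michael Chabon."));
        record.addVariableField(title);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        MarcWriter writer = new MarcStreamWriter(out);
        writer.write(record);
        writer.close();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        ByteArrayOutputStream xml = new ByteArrayOutputStream();
        marcToXml(new ByteArrayInputStream(sampleMarc()), xml);
        System.out.println(new String(xml.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In a real program, `marcToXml` would more likely receive a `FileInputStream` and a `FileOutputStream`; the in-memory streams here only avoid a dependency on a sample file.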
Although MARC4J is primarily designed for Java development, you can use the command-line utilities org.marc4j.util.MarcXmlDriver and org.marc4j.util.XmlMarcDriver to convert between MARC and MARCXML. It is also possible to pre- or post-process the result using XSLT, for example to convert directly from MODS to MARC or from MARC to MODS.
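To illustrate what XSLT post-processing of MARCXML looks like, here is a small sketch that uses only the JDK's JAXP classes. Both the MARCXML record and the stylesheet are hand-written for the example: the stylesheet merely extracts the 245$a title as text and is not the MARC-to-MODS stylesheet mentioned above. The class name `TitleStylesheetSketch` is likewise hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TitleStylesheetSketch {

    // A tiny hand-written MARCXML collection (illustrative data only).
    static final String MARCXML =
        "<collection xmlns=\"http://www.loc.gov/MARC21/slim\">"
      + "<record>"
      + "<leader>00000cam a2200000 a 4500</leader>"
      + "<controlfield tag=\"001\">12883376</controlfield>"
      + "<datafield tag=\"245\" ind1=\"1\" ind2=\"0\">"
      + "<subfield code=\"a\">Summerland /</subfield>"
      + "<subfield code=\"c\">Michael Chabon.</subfield>"
      + "</datafield>"
      + "</record>"
      + "</collection>";

    // Hypothetical stylesheet: emits the 245$a value of each record as plain text.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:marc=\"http://www.loc.gov/MARC21/slim\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:for-each select=\"//marc:datafield[@tag='245']/marc:subfield[@code='a']\">"
      + "<xsl:value-of select=\".\"/>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML document and returns the result as a string.
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(MARCXML, XSLT));
    }
}
```

The same `transform` helper would work with a real stylesheet loaded from a file or URL; only the `StreamSource` for the stylesheet would change.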
Parsing records
To parse records and interact with them, you need two main interfaces:
- MarcReader
- Record
Readers
For reading records, the library provides a simple interface, org.marc4j.MarcReader, with implementations for the different record formats (for example, MarcStreamReader for binary MARC files).
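Because every reader implements the same `MarcReader` interface, the calling code can stay identical regardless of the input format. The sketch below assumes MARC4J 2.x on the classpath; the `openReader` helper and its format labels ("marc", "marc-permissive", "marcxml") are invented for this example and are not part of the library.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.marc4j.MarcPermissiveStreamReader;
import org.marc4j.MarcReader;
import org.marc4j.MarcStreamReader;
import org.marc4j.MarcXmlReader;

public class ReaderSelectionSketch {

    // A tiny hand-written MARCXML collection used as sample input (illustrative data only).
    static final String SAMPLE_MARCXML =
        "<collection xmlns=\"http://www.loc.gov/MARC21/slim\">"
      + "<record>"
      + "<leader>00000cam a2200000 a 4500</leader>"
      + "<controlfield tag=\"001\">12883376</controlfield>"
      + "<datafield tag=\"245\" ind1=\"1\" ind2=\"0\">"
      + "<subfield code=\"a\">Summerland /</subfield>"
      + "</datafield>"
      + "</record>"
      + "</collection>";

    // Hypothetical helper: picks a MarcReader implementation from a format label.
    static MarcReader openReader(InputStream in, String format) {
        switch (format) {
            case "marc":
                return new MarcStreamReader(in);
            case "marc-permissive":
                // First flag: try to recover from errors; second flag: convert data to UTF-8.
                return new MarcPermissiveStreamReader(in, true, true);
            case "marcxml":
                return new MarcXmlReader(in);
            default:
                throw new IllegalArgumentException("Unsupported format: " + format);
        }
    }

    // Format-independent consumer: works with any MarcReader implementation.
    static int countRecords(MarcReader reader) {
        int count = 0;
        while (reader.hasNext()) {
            reader.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(SAMPLE_MARCXML.getBytes(StandardCharsets.UTF_8));
        System.out.println("Records read: " + countRecords(openReader(in, "marcxml")));
    }
}
```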
Records
The main entity that describes all types of records is org.marc4j.marc.Record. It represents a MARC document as a plain Java object. After parsing, each MARC record has the following fields:
- Leader leader
- List<ControlField> controlFields
- List<DataField> dataFields
- List<MarcError> errors
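The record model can also be built and edited directly in memory. The following sketch assumes MARC4J 2.x on the classpath and uses `MarcFactory` to create a record with a leader, one control field, and one data field, then changes a subfield in place. The class name, helper names, and sample data are illustrative only.

```java
import org.marc4j.marc.ControlField;
import org.marc4j.marc.DataField;
import org.marc4j.marc.MarcFactory;
import org.marc4j.marc.Record;
import org.marc4j.marc.Subfield;

public class RecordModelSketch {

    // Builds a small record from scratch (sample data, not from a real catalog).
    static Record buildRecord() {
        MarcFactory factory = MarcFactory.newInstance();
        Record record = factory.newRecord("00000cam a2200000 a 4500");
        record.addVariableField(factory.newControlField("001", "12883376"));
        DataField title = factory.newDataField("245", '1', '0');
        title.addSubfield(factory.newSubfield('a', "Summerland /"));
        title.addSubfield(factory.newSubfield('c', "Michael Chabon."));
        record.addVariableField(title);
        return record;
    }

    // Edits the record in place: replaces the data of 245$a if that subfield exists.
    static void retitle(Record record, String newTitle) {
        DataField title = (DataField) record.getVariableField("245");
        if (title != null && title.getSubfield('a') != null) {
            title.getSubfield('a').setData(newTitle);
        }
    }

    public static void main(String[] args) {
        Record record = buildRecord();
        retitle(record, "Summerland (revised) /");

        System.out.println("LEADER: " + record.getLeader());
        for (ControlField field : record.getControlFields()) {
            System.out.println(field.getTag() + " " + field.getData());
        }
        for (DataField field : record.getDataFields()) {
            for (Subfield subfield : field.getSubfields()) {
                System.out.println(field.getTag() + " $" + subfield.getCode() + " " + subfield.getData());
            }
        }
    }
}
```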
Examples
A Java snippet that parses a MARC file and prints each record's contents to the console:
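import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;

import org.marc4j.MarcReader;
import org.marc4j.MarcStreamReader;
import org.marc4j.marc.ControlField;
import org.marc4j.marc.DataField;
import org.marc4j.marc.Leader;
import org.marc4j.marc.Record;

public class App {
    public static void main(String[] args) throws Exception {
        // Path to the .mrc file (set this before running)
        String path = "";
        // Open the file; try-with-resources closes the stream when done
        try (InputStream in = new FileInputStream(path)) {
            MarcReader reader = new MarcStreamReader(in);
            // Iterate over every record in the file
            while (reader.hasNext()) {
                Record record = reader.next();
                Leader leader = record.getLeader();
                List<ControlField> controlFields = record.getControlFields();
                List<DataField> dataFields = record.getDataFields();
                System.out.println();
                System.out.println("LEADER: " + leader);
                System.out.println("Control fields: ");
                controlFields.forEach(System.out::println);
                System.out.println("Data fields: ");
                dataFields.forEach(System.out::println);
                System.out.println();
            }
        }
    }
}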
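Printing whole records is useful for inspection, but programs usually need individual values. As a hedged variation on the example above, the sketch below prints one summary line per record, made of the control number (field 001) and the title (field 245, subfield a). It assumes MARC4J 2.x on the classpath; the class name, the `summarize` helper, and the placeholder strings are made up for the example.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.marc4j.MarcReader;
import org.marc4j.MarcStreamReader;
import org.marc4j.marc.DataField;
import org.marc4j.marc.Record;
import org.marc4j.marc.Subfield;

public class TitleListingSketch {

    // Returns "<control number>: <245$a>", with placeholders for missing parts.
    static String summarize(Record record) {
        String id = record.getControlNumber();
        DataField titleField = (DataField) record.getVariableField("245");
        Subfield title = (titleField == null) ? null : titleField.getSubfield('a');
        return (id == null ? "(no 001)" : id) + ": "
            + (title == null ? "(no title)" : title.getData());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: TitleListingSketch <file.mrc>");
            return;
        }
        // Read the file given on the command line and print one line per record.
        try (InputStream in = new FileInputStream(args[0])) {
            MarcReader reader = new MarcStreamReader(in);
            while (reader.hasNext()) {
                System.out.println(summarize(reader.next()));
            }
        }
    }
}
```

Tag 245 is repeatable in some MARC flavors and may be absent in malformed data, which is why the helper checks for null rather than assuming the field exists.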