MARC4J provides an easy-to-use API for working with MARC (binary), MARCXML, and MARC JSON in Java. MARC stands for MAchine Readable Cataloging and is a widely used exchange format for bibliographic data. MARCXML provides a lossless conversion between MARC (MARC21, but also other formats such as UNIMARC) and XML.
Features
The MARC4J library includes:
...
- Leader leader
- List<ControlField> controlFields
- List<DataField> dataFields
- List<MarcError> errors
Writers
MARC4J library testing experience
...
JSON representation of a MARC record
{
"leader": "00508cjm a22001813 4500",
"fields": [
{
"001": "10062588"
},
{
"005": "20171013073237.0"
},
{
"007": "sd fsngnnmmneu"
},
{
"008": "170825s2017 xx nn n zxx d"
},
{
"024": {
"subfields": [
{
"a": "00190295755553"
},
{
"2": "gtin-14"
}
],
"ind1": "7",
"ind2": " "
}
},
{
"024": {
"subfields": [
{
"a": "190295755553"
}
],
"ind1": "1",
"ind2": " "
}
},
{
"035": {
"subfields": [
{
"a": "(OCoLC)1002130878"
}
],
"ind1": " ",
"ind2": " "
}
},
{
"035": {
"subfields": [
{
"a": "10062588"
}
],
"ind1": " ",
"ind2": " "
}
},
{
"040": {
"subfields": [
{
"a": "BTCTA"
},
{
"b": "eng"
},
{
"c": "BTCTA"
}
],
"ind1": " ",
"ind2": " "
}
},
{
"100": {
"subfields": [
{
"a": "Rossi, Daniele"
}
],
"ind1": "1",
"ind2": " "
}
},
{
"245": {
"subfields": [
{
"a": "Saint-Saens: Organ Symphony and Carnival of The Animals"
}
],
"ind1": "0",
"ind2": "0"
}
},
{
"260": {
"subfields": [
{
"b": "Wea Corp"
},
{
"c": "2017."
}
],
"ind1": " ",
"ind2": " "
}
},
{
"948": {
"subfields": [
{
"a": "20171013"
},
{
"b": "m"
},
{
"d": "batch"
},
{
"e": "lts"
},
{
"x": "deloclcprefix"
}
],
"ind1": "2",
"ind2": " "
}
}
]
}
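The leader string at the top of the JSON record is positional: each character offset has a fixed meaning in MARC 21. The following is a minimal sketch, not part of MARC4J, that decodes a few of those positions from the sample leader using only the JDK. The class and field names (`LeaderSketch`, `LeaderInfo`) are hypothetical, and the sketch reads only positions 00 to 16, so it does not depend on the rest of the leader.

```java
class LeaderSketch {

    /** Holds the leader positions decoded by this sketch (hypothetical helper type). */
    static final class LeaderInfo {
        final int recordLength;        // positions 00-04
        final char recordStatus;       // position 05
        final char typeOfRecord;       // position 06
        final char bibliographicLevel; // position 07
        final char characterCoding;    // position 09: 'a' = UCS/Unicode, blank = MARC-8
        final int baseAddressOfData;   // positions 12-16

        LeaderInfo(int recordLength, char recordStatus, char typeOfRecord,
                   char bibliographicLevel, char characterCoding, int baseAddressOfData) {
            this.recordLength = recordLength;
            this.recordStatus = recordStatus;
            this.typeOfRecord = typeOfRecord;
            this.bibliographicLevel = bibliographicLevel;
            this.characterCoding = characterCoding;
            this.baseAddressOfData = baseAddressOfData;
        }
    }

    /** Decodes the fixed positions 00-16 of a MARC 21 leader string. */
    static LeaderInfo parse(String leader) {
        if (leader == null || leader.length() < 17) {
            throw new IllegalArgumentException("Leader must have at least 17 characters");
        }
        return new LeaderInfo(
                Integer.parseInt(leader.substring(0, 5)),
                leader.charAt(5),
                leader.charAt(6),
                leader.charAt(7),
                leader.charAt(9),
                Integer.parseInt(leader.substring(12, 17)));
    }

    public static void main(String[] args) {
        // Leader value taken from the JSON record above
        LeaderInfo info = parse("00508cjm a22001813 4500");
        System.out.println("Record length:        " + info.recordLength);
        System.out.println("Record status:        " + info.recordStatus);
        System.out.println("Type of record:       " + info.typeOfRecord);
        System.out.println("Bibliographic level:  " + info.bibliographicLevel);
        System.out.println("Character coding:     " + info.characterCoding);
        System.out.println("Base address of data: " + info.baseAddressOfData);
    }
}
```

Position 09 is the one most relevant to the writer tests further down: a value of `a` declares the record data as Unicode.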
Code snippet
Parse a MARC file and print all of its information to the console:
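import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;

import org.marc4j.MarcReader;
import org.marc4j.MarcStreamReader;
import org.marc4j.marc.ControlField;
import org.marc4j.marc.DataField;
import org.marc4j.marc.Leader;
import org.marc4j.marc.Record;

public class App {
    public static void main(String[] args) throws Exception {
        // Path to the .mrc file (left empty here; set it before running)
        String path = "";
        // try-with-resources closes the input stream once reading is done
        try (InputStream in = new FileInputStream(path)) {
            MarcReader reader = new MarcStreamReader(in);
            // Iterate over every record in the file
            while (reader.hasNext()) {
                Record record = reader.next();
                Leader leader = record.getLeader();
                List<ControlField> controlFields = record.getControlFields();
                List<DataField> dataFields = record.getDataFields();
                System.out.println();
                System.out.println("LEADER: " + leader);
                System.out.println("Control fields: ");
                controlFields.forEach(System.out::println);
                System.out.println("Data fields: ");
                dataFields.forEach(System.out::println);
                System.out.println();
            }
        }
    }
}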
Transform MARC JSON to MARCXML.
...
From JSON to Dublin Core data
String stylesheetUrl = "https://www.loc.gov/standards/marcxml/xslt/MARC21slim2OAIDC.xsl";
Source stylesheet = new StreamSource(stylesheetUrl);
Result result = new StreamResult(System.out);
InputStream input = new FileInputStream(INPUT_JSON_FILE);
MarcReader reader = new MarcJsonReader(input);
MarcXmlWriter writer = new MarcXmlWriter(result, stylesheet);
writer.setConverter(new AnselToUnicode());
while (reader.hasNext()) {
    Record record = reader.next();
    writer.write(record);
}
writer.close();
Dublin Core XML result
<?xml version="1.0" encoding="UTF-8"?>
<oai_dc:dcCollection xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
  <oai_dc:dc>
    <dc:title>Futures, biometrics and neuroscience research</dc:title>
    <dc:creator>Moutinho, LuizHerausgeberInedt(DE-601)509450954(DE-588)131450204</dc:creator>
    <dc:creator>Sokele, MladenHerausgeberInedt</dc:creator>
    <dc:type>text</dc:type>
    <dc:language>eng</dc:language>
    <dc:format>application/pdf</dc:format>
    <dc:description>Enthl̃t 9 Beitrg̃e</dc:description>
    <dc:subject>Betriebswirtschaftslehre</dc:subject>
    <dc:subject>Management</dc:subject>
    <dc:subject>Wissenschaftliche Methode</dc:subject>
    <dc:identifier>http://www.gbv.de/dms/zbw/101073931X.pdf</dc:identifier>
    <dc:identifier>URN:ISBN:3319643991</dc:identifier>
    <dc:identifier>URN:ISBN:9783319643991</dc:identifier>
    <dc:identifier>URN:ISBN:9783319644004 (electronic)</dc:identifier>
  </oai_dc:dc>
</oai_dc:dcCollection>
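To see in isolation what a stylesheet step of this kind does, the following sketch applies a tiny XSLT to a MARCXML fragment using only the JDK's `javax.xml.transform` API. It does not use MARC4J or the Library of Congress stylesheet: the one-rule stylesheet, the class name `StylesheetSketch`, and the sample fragment are all assumptions made for illustration, and the real MARC21slim2OAIDC.xsl maps many more fields.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StylesheetSketch {

    // Hypothetical one-rule stylesheet: copies 245$a into a dc:title element
    static final String XSLT =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xmlns:marc='http://www.loc.gov/MARC21/slim'"
          + " xmlns:dc='http://purl.org/dc/elements/1.1/'"
          + " exclude-result-prefixes='marc'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='/'>"
          + "<dc:title><xsl:value-of select=\"marc:record/"
          + "marc:datafield[@tag='245']/marc:subfield[@code='a']\"/></dc:title>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Sample MARCXML fragment built from the 245 field of the JSON record shown earlier
    static final String MARC_XML =
            "<record xmlns='http://www.loc.gov/MARC21/slim'>"
          + "<datafield tag='245' ind1='0' ind2='0'>"
          + "<subfield code='a'>Saint-Saens: Organ Symphony and Carnival of The Animals</subfield>"
          + "</datafield>"
          + "</record>";

    /** Applies the sketch stylesheet to a MARCXML string and returns the result as text. */
    static String toDublinCore(String marcXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(marcXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Stylesheet could not be applied", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toDublinCore(MARC_XML));
    }
}
```

Testing a mapping rule this way, on a small fragment, can be quicker than running a full file through the writer each time.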
Testing Marc4j writer
- Reading a MARC JSON file and writing it out in MARC format. We observed that when no encoding was specified for the writer, the generated MARC was invalid.
import java.io.FileInputStream;
import java.io.InputStream;
import org.marc4j.*;
import org.marc4j.marc.Record;

public class TestMarc4jWriter {
    public static void main(String[] args) throws Exception {
        InputStream input = new FileInputStream("MarcFileinjsonFormat");
        MarcReader reader = new MarcJsonReader(input);
        MarcWriter writer = new MarcStreamWriter(System.out, "UTF8");
        while (reader.hasNext()) {
            Record record = reader.next();
            writer.write(record);
        }
        writer.close();
    }
}
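One possible contributing factor to the invalid output, offered here as an assumption rather than a confirmed diagnosis: the record length in the leader and the field lengths in a binary MARC directory are counted in bytes, so the same text yields different lengths depending on the charset used to encode it. The JDK-only sketch below shows the difference for a sample string; the class name `EncodingLengthSketch` and the sample value are hypothetical and do not come from the test data.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingLengthSketch {

    /** Returns how many bytes the given text occupies in the given charset. */
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Hypothetical field value containing one non-ASCII character (e with diaeresis)
        String value = "Saint-Sa\u00ebns";

        System.out.println("Characters:        " + value.length());
        System.out.println("Bytes, ISO-8859-1: " + byteLength(value, StandardCharsets.ISO_8859_1));
        System.out.println("Bytes, UTF-8:      " + byteLength(value, StandardCharsets.UTF_8));
    }
}
```

If the declared lengths are computed under one charset and the bytes are written under another, every offset after the first non-ASCII character shifts, which is consistent with the record being rejected as invalid.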
Validating a MARC file locally
MarcEdit, a free tool, can be used to validate MARC files and to convert them between formats.
Reference: https://marcedit.reeset.net/
REFERENCES:
https://github.com/marc4j/marc4j
Formatted tutorial: http://projects.freelibrary.info/freelib-marc4j/tutorial.html