MARC4J provides an easy-to-use API for working with MARC (binary), MARCXML, and MARC JSON in Java. MARC stands for MAchine-Readable Cataloging and is a widely used exchange format for bibliographic data. MARCXML provides a lossless conversion between MARC (MARC 21, but also other formats such as UNIMARC) and XML.
Features
The MARC4J library includes:
...
- Leader leader
- List<ControlField> controlFields
- List<DataField> dataFields
- List<MarcError> errors
Writers
Marc4j lib testing experience
...
JSON representation of a MARC record
{
"leader": "00508cjm a22001813 4500",
"fields": [
{
"001": "10062588"
},
{
"005": "20171013073237.0"
},
{
"007": "sd fsngnnmmneu"
},
{
"008": "170825s2017 xx nn n zxx d"
},
{
"024": {
"subfields": [
{
"a": "00190295755553"
},
{
"2": "gtin-14"
}
],
"ind1": "7",
"ind2": " "
}
},
{
"024": {
"subfields": [
{
"a": "190295755553"
}
],
"ind1": "1",
"ind2": " "
}
},
{
"035": {
"subfields": [
{
"a": "(OCoLC)1002130878"
}
],
"ind1": " ",
"ind2": " "
}
},
{
"035": {
"subfields": [
{
"a": "10062588"
}
],
"ind1": " ",
"ind2": " "
}
},
{
"040": {
"subfields": [
{
"a": "BTCTA"
},
{
"b": "eng"
},
{
"c": "BTCTA"
}
],
"ind1": " ",
"ind2": " "
}
},
{
"100": {
"subfields": [
{
"a": "Rossi, Daniele"
}
],
"ind1": "1",
"ind2": " "
}
},
{
"245": {
"subfields": [
{
"a": "Saint-Saens: Organ Symphony and Carnival of The Animals"
}
],
"ind1": "0",
"ind2": "0"
}
},
{
"260": {
"subfields": [
{
"b": "Wea Corp"
},
{
"c": "2017."
}
],
"ind1": " ",
"ind2": " "
}
},
{
"948": {
"subfields": [
{
"a": "20171013"
},
{
"b": "m"
},
{
"d": "batch"
},
{
"e": "lts"
},
{
"x": "deloclcprefix"
}
],
"ind1": "2",
"ind2": " "
}
}
]
}
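In the JSON above, the tags 001 to 008 map directly to a string, while the remaining tags map to an object with `subfields`, `ind1`, and `ind2`. This mirrors the MARC 21 split between control fields (tags 001 to 009) and data fields. A minimal sketch of that rule follows; `MarcTagKinds` and `isControlTag` are hypothetical names used only for illustration and are not part of the MARC4J API.

```java
final class MarcTagKinds {
    /** True for control-field tags 001-009, which carry a plain string value. */
    static boolean isControlTag(String tag) {
        return tag.matches("00[1-9]");
    }

    public static void main(String[] args) {
        // Tags taken from the sample record above.
        String[] tags = {"001", "008", "024", "245", "948"};
        for (String tag : tags) {
            String kind = isControlTag(tag) ? "control field" : "data field";
            System.out.println(tag + " -> " + kind);
        }
    }
}
```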
Code snippet
to parse a MARC file and print all of its information to the console
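import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;
import org.marc4j.MarcReader;
import org.marc4j.MarcStreamReader;
import org.marc4j.marc.ControlField;
import org.marc4j.marc.DataField;
import org.marc4j.marc.Leader;
import org.marc4j.marc.Record;

public class App {
    public static void main(String[] args) throws Exception {
        // Path to the .mrc file (left empty here; set it before running)
        String path = "";
        // Open the file; try-with-resources closes the stream when done
        try (InputStream in = new FileInputStream(path)) {
            MarcReader reader = new MarcStreamReader(in);
            // Iterate over every record in the file
            while (reader.hasNext()) {
                Record record = reader.next();
                Leader leader = record.getLeader();
                List<ControlField> controlFields = record.getControlFields();
                List<DataField> dataFields = record.getDataFields();
                System.out.println();
                System.out.println("LEADER: " + leader);
                System.out.println("Control fields: ");
                controlFields.forEach(System.out::println);
                System.out.println("Data fields: ");
                dataFields.forEach(System.out::println);
                System.out.println();
            }
        }
    }
}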
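A binary MARC file is a plain concatenation of records, and each record begins with its own length as five ASCII digits. The following is a minimal sketch of how a reader could split such a stream into raw records. It is not MARC4J's implementation; the class name `RecordFramer` and the toy payloads are assumptions made for illustration, and the payloads are not valid MARC records.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class RecordFramer {
    /** Splits a stream into raw records using the 5-digit length at the start of each record. */
    static List<byte[]> split(InputStream in) {
        List<byte[]> records = new ArrayList<>();
        try {
            DataInputStream data = new DataInputStream(in);
            byte[] lengthDigits = new byte[5];
            int first;
            while ((first = data.read()) >= 0) {          // -1 signals a clean end of stream
                lengthDigits[0] = (byte) first;
                data.readFully(lengthDigits, 1, 4);       // remaining four length digits
                int length = Integer.parseInt(new String(lengthDigits, StandardCharsets.US_ASCII));
                byte[] record = new byte[length];
                System.arraycopy(lengthDigits, 0, record, 0, 5);
                data.readFully(record, 5, length - 5);    // rest of the record
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Two toy "records": length prefix, placeholder body, record terminator 0x1D.
        byte[] stream = "00010abcd\u001D00008xy\u001D".getBytes(StandardCharsets.US_ASCII);
        for (byte[] record : split(new ByteArrayInputStream(stream))) {
            System.out.println(record.length + " bytes");
        }
    }
}
```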
...
...
Transform MARCJson to MARCXML.
Code snippet
convert MARCJson to MARCXML and print to console:
InputStream fileInputStream = new FileInputStream(INPUT_JSON_FILE);
MarcReader marcJsonReader = new MarcJsonReader(fileInputStream);
// The second argument turns on indented output
MarcXmlWriter marcXmlWriter = new MarcXmlWriter(System.out, true);
while (marcJsonReader.hasNext()) {
    Record record = marcJsonReader.next();
    marcXmlWriter.write(record);
}
marcXmlWriter.close();
fileInputStream.close();
MARCJSON source
{
  "leader": "01741nam a2200373 cb4500",
  "fields": [
    { "001": "101073931X" },
    { "003": "DE-601" },
    { "005": "20180416162657.0" },
    { "008": "180111s2018\\\\\\\\sz\\\\\\\\\\\\\\\\\\\\\\\\000\\0\\eng\\d" },
    { "020": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "3319643991" }, { "9": "3-319-64399-1" } ] } },
    { "020": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "9783319643991" }, { "9": "978-3-319-64399-1" } ] } },
    { "020": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "9783319644004 (electronic)" }, { "9": "978-3-319-64400-4" } ] } },
    { "035": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "(OCoLC)ocn992783736" } ] } },
    { "035": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "(OCoLC)992783736" } ] } },
    { "035": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "(DE-599)GBV101073931X" } ] } },
    { "040": { "ind1": "\\", "ind2": "\\", "subfields": [ { "b": "ger" }, { "c": "GBVCP" }, { "e": "rda" } ] } },
    { "041": { "ind1": "0", "ind2": "\\", "subfields": [ { "a": "eng" } ] } },
    { "245": { "ind1": "0", "ind2": "0", "subfields": [ { "a": "Futures, biometrics and neuroscience research" }, { "c": "Luiz Moutinho, Mladen Sokele, editors" } ] } },
    { "264": { "ind1": "3", "ind2": "1", "subfields": [ { "a": "Cham" }, { "b": "Palgrave Macmillan" }, { "c": "[2018]" } ] } },
    { "300": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "xxix, 224 Seiten" }, { "b": "Illustrationen" } ] } },
    { "336": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "Text" }, { "b": "txt" }, { "2": "rdacontent" } ] } },
    { "337": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "ohne Hilfsmittel zu benutzen" }, { "b": "n" }, { "2": "rdamedia" } ] } },
    { "338": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "Band" }, { "b": "nc" }, { "2": "rdacarrier" } ] } },
    { "490": { "ind1": "0", "ind2": "\\", "subfields": [ { "a": "Innovative research methodologies in management" }, { "v": " \/ Luiz Moutinho, Mladen Sokele ; Volume 2" } ] } },
    { "500": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "Enthält 9 Beiträge" } ] } },
    { "650": { "ind1": "\\", "ind2": "7", "subfields": [ { "8": "1.1\\x" }, { "a": "Betriebswirtschaftslehre" }, { "0": "(DE-601)091351391" }, { "0": "(DE-STW)12041-5" }, { "2": "stw" } ] } },
    { "650": { "ind1": "\\", "ind2": "7", "subfields": [ { "8": "1.2\\x" }, { "a": "Management" }, { "0": "(DE-601)091376173" }, { "0": "(DE-STW)12085-6" }, { "2": "stw" } ] } },
    { "650": { "ind1": "\\", "ind2": "7", "subfields": [ { "8": "1.3\\x" }, { "a": "Wissenschaftliche Methode" }, { "0": "(DE-601)091401445" }, { "0": "(DE-STW)16727-0" }, { "2": "stw" } ] } },
    { "700": { "ind1": "1", "ind2": "\\", "subfields": [ { "a": "Moutinho, Luiz" }, { "e": "HerausgeberIn" }, { "4": "edt" }, { "0": "(DE-601)509450954" }, { "0": "(DE-588)131450204" } ] } },
    { "700": { "ind1": "1", "ind2": "\\", "subfields": [ { "a": "Sokele, Mladen" }, { "e": "HerausgeberIn" }, { "4": "edt" } ] } },
    { "830": { "ind1": "\\", "ind2": "0", "subfields": [ { "a": "Innovative research methodologies in management" }, { "b": " \/ Luiz Moutinho, Mladen Sokele" }, { "v": "Volume 2" }, { "9": "2.2018" }, { "w": "(DE-601)1011380293" } ] } },
    { "856": { "ind1": "4", "ind2": "2", "subfields": [ { "y": "Inhaltsverzeichnis" }, { "u": "http:\/\/www.gbv.de\/dms\/zbw\/101073931X.pdf" }, { "m": "V:DE-601;B:DE-206" }, { "q": "application\/pdf" }, { "3": "Inhaltsverzeichnis" } ] } },
    { "900": { "ind1": "\\", "ind2": "\\", "subfields": [ { "a": "GBV" }, { "b": "ZBW Kiel <206>" }, { "d": "!H:! A18-1775" }, { "x": "L" }, { "z": "LC" }, { "s": "206\/1" } ] } },
    { "954": { "ind1": "\\", "ind2": "\\", "subfields": [ { "0": "ZBW Kiel <206>" }, { "a": "26" }, { "b": "1740761685" }, { "c": "01" }, { "f": "H:" }, { "d": "A18-1775" }, { "e": "u" }, { "x": "206\/1" } ] } }
  ]
}
MARCXML result
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:record>
    <marc:leader>01741nam a2200373 cb4500</marc:leader>
    <marc:controlfield tag="001">101073931X</marc:controlfield>
    <marc:controlfield tag="003">DE-601</marc:controlfield>
    <marc:controlfield tag="005">20180416162657.0</marc:controlfield>
    <marc:controlfield tag="008">180111s2018\\\\sz\\\\\\\\\\\\000\0\eng\d</marc:controlfield>
    <marc:datafield ind1="\" ind2="\" tag="020">
      <marc:subfield code="a">3319643991</marc:subfield>
      <marc:subfield code="9">3-319-64399-1</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="020">
      <marc:subfield code="a">9783319643991</marc:subfield>
      <marc:subfield code="9">978-3-319-64399-1</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="020">
      <marc:subfield code="a">9783319644004 (electronic)</marc:subfield>
      <marc:subfield code="9">978-3-319-64400-4</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="035">
      <marc:subfield code="a">(OCoLC)ocn992783736</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="035">
      <marc:subfield code="a">(OCoLC)992783736</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="035">
      <marc:subfield code="a">(DE-599)GBV101073931X</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="040">
      <marc:subfield code="b">ger</marc:subfield>
      <marc:subfield code="c">GBVCP</marc:subfield>
      <marc:subfield code="e">rda</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="0" ind2="\" tag="041">
      <marc:subfield code="a">eng</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="0" ind2="0" tag="245">
      <marc:subfield code="a">Futures, biometrics and neuroscience research</marc:subfield>
      <marc:subfield code="c">Luiz Moutinho, Mladen Sokele, editors</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="3" ind2="1" tag="264">
      <marc:subfield code="a">Cham</marc:subfield>
      <marc:subfield code="b">Palgrave Macmillan</marc:subfield>
      <marc:subfield code="c">[2018]</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="300">
      <marc:subfield code="a">xxix, 224 Seiten</marc:subfield>
      <marc:subfield code="b">Illustrationen</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="336">
      <marc:subfield code="a">Text</marc:subfield>
      <marc:subfield code="b">txt</marc:subfield>
      <marc:subfield code="2">rdacontent</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="337">
      <marc:subfield code="a">ohne Hilfsmittel zu benutzen</marc:subfield>
      <marc:subfield code="b">n</marc:subfield>
      <marc:subfield code="2">rdamedia</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="338">
      <marc:subfield code="a">Band</marc:subfield>
      <marc:subfield code="b">nc</marc:subfield>
      <marc:subfield code="2">rdacarrier</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="0" ind2="\" tag="490">
      <marc:subfield code="a">Innovative research methodologies in management</marc:subfield>
      <marc:subfield code="v"> / Luiz Moutinho, Mladen Sokele ; Volume 2</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="500">
      <marc:subfield code="a">Enthält 9 Beiträge</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="7" tag="650">
      <marc:subfield code="8">1.1\x</marc:subfield>
      <marc:subfield code="a">Betriebswirtschaftslehre</marc:subfield>
      <marc:subfield code="0">(DE-601)091351391</marc:subfield>
      <marc:subfield code="0">(DE-STW)12041-5</marc:subfield>
      <marc:subfield code="2">stw</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="7" tag="650">
      <marc:subfield code="8">1.2\x</marc:subfield>
      <marc:subfield code="a">Management</marc:subfield>
      <marc:subfield code="0">(DE-601)091376173</marc:subfield>
      <marc:subfield code="0">(DE-STW)12085-6</marc:subfield>
      <marc:subfield code="2">stw</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="7" tag="650">
      <marc:subfield code="8">1.3\x</marc:subfield>
      <marc:subfield code="a">Wissenschaftliche Methode</marc:subfield>
      <marc:subfield code="0">(DE-601)091401445</marc:subfield>
      <marc:subfield code="0">(DE-STW)16727-0</marc:subfield>
      <marc:subfield code="2">stw</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="1" ind2="\" tag="700">
      <marc:subfield code="a">Moutinho, Luiz</marc:subfield>
      <marc:subfield code="e">HerausgeberIn</marc:subfield>
      <marc:subfield code="4">edt</marc:subfield>
      <marc:subfield code="0">(DE-601)509450954</marc:subfield>
      <marc:subfield code="0">(DE-588)131450204</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="1" ind2="\" tag="700">
      <marc:subfield code="a">Sokele, Mladen</marc:subfield>
      <marc:subfield code="e">HerausgeberIn</marc:subfield>
      <marc:subfield code="4">edt</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="0" tag="830">
      <marc:subfield code="a">Innovative research methodologies in management</marc:subfield>
      <marc:subfield code="b"> / Luiz Moutinho, Mladen Sokele</marc:subfield>
      <marc:subfield code="v">Volume 2</marc:subfield>
      <marc:subfield code="9">2.2018</marc:subfield>
      <marc:subfield code="w">(DE-601)1011380293</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="4" ind2="2" tag="856">
      <marc:subfield code="y">Inhaltsverzeichnis</marc:subfield>
      <marc:subfield code="u">http://www.gbv.de/dms/zbw/101073931X.pdf</marc:subfield>
      <marc:subfield code="m">V:DE-601;B:DE-206</marc:subfield>
      <marc:subfield code="q">application/pdf</marc:subfield>
      <marc:subfield code="3">Inhaltsverzeichnis</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="900">
      <marc:subfield code="a">GBV</marc:subfield>
      <marc:subfield code="b">ZBW Kiel &lt;206&gt;</marc:subfield>
      <marc:subfield code="d">!H:! A18-1775</marc:subfield>
      <marc:subfield code="x">L</marc:subfield>
      <marc:subfield code="z">LC</marc:subfield>
      <marc:subfield code="s">206/1</marc:subfield>
    </marc:datafield>
    <marc:datafield ind1="\" ind2="\" tag="954">
      <marc:subfield code="0">ZBW Kiel &lt;206&gt;</marc:subfield>
      <marc:subfield code="a">26</marc:subfield>
      <marc:subfield code="b">1740761685</marc:subfield>
      <marc:subfield code="c">01</marc:subfield>
      <marc:subfield code="f">H:</marc:subfield>
      <marc:subfield code="d">A18-1775</marc:subfield>
      <marc:subfield code="e">u</marc:subfield>
      <marc:subfield code="x">206/1</marc:subfield>
    </marc:datafield>
  </marc:record>
</marc:collection>
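To make the shape of the MARCXML result easier to follow, here is a minimal sketch that serializes a single data field with the JDK's `XMLStreamWriter` instead of MARC4J. The class `DataFieldXmlSketch` and its `dataField` method are hypothetical names for illustration only. The sketch also shows that a value containing markup characters, such as `ZBW Kiel <206>` in field 900, has to be escaped to keep the XML well-formed.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class DataFieldXmlSketch {
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    /** Serializes one data field with a single subfield as a MARCXML fragment. */
    static String dataField(String tag, char ind1, char ind2, char code, String value) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("marc", "datafield", MARC_NS);
            xml.writeNamespace("marc", MARC_NS);
            xml.writeAttribute("ind1", String.valueOf(ind1));
            xml.writeAttribute("ind2", String.valueOf(ind2));
            xml.writeAttribute("tag", tag);
            xml.writeStartElement("marc", "subfield", MARC_NS);
            xml.writeAttribute("code", String.valueOf(code));
            xml.writeCharacters(value);   // escapes markup characters such as '<'
            xml.writeEndElement();        // </marc:subfield>
            xml.writeEndElement();        // </marc:datafield>
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Field 900 $b from the sample record; backslashes stand in for blank indicators.
        System.out.println(dataField("900", '\\', '\\', 'b', "ZBW Kiel <206>"));
    }
}
```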
...
From JSON to Dublin Core data
String stylesheetUrl = "https://www.loc.gov/standards/marcxml/xslt/MARC21slim2OAIDC.xsl";
Source stylesheet = new StreamSource(stylesheetUrl);
Result result = new StreamResult(System.out);
InputStream input = new FileInputStream(INPUT_JSON_FILE);
MarcReader reader = new MarcJsonReader(input);
MarcXmlWriter writer = new MarcXmlWriter(result, stylesheet);
writer.setConverter(new AnselToUnicode());
while (reader.hasNext()) {
    Record record = (Record) reader.next();
    writer.write(record);
}
writer.close();
Dublin Core XML result
<?xml version="1.0" encoding="UTF-8"?>
<oai_dc:dcCollection xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
  <oai_dc:dc>
    <dc:title>Futures, biometrics and neuroscience research</dc:title>
    <dc:creator>Moutinho, LuizHerausgeberInedt(DE-601)509450954(DE-588)131450204</dc:creator>
    <dc:creator>Sokele, MladenHerausgeberInedt</dc:creator>
    <dc:type>text</dc:type>
    <dc:language>eng</dc:language>
    <dc:format>application/pdf</dc:format>
    <dc:description>Enthl̃t 9 Beitrg̃e</dc:description>
    <dc:subject>Betriebswirtschaftslehre</dc:subject>
    <dc:subject>Management</dc:subject>
    <dc:subject>Wissenschaftliche Methode</dc:subject>
    <dc:identifier>http://www.gbv.de/dms/zbw/101073931X.pdf</dc:identifier>
    <dc:identifier>URN:ISBN:3319643991</dc:identifier>
    <dc:identifier>URN:ISBN:9783319643991</dc:identifier>
    <dc:identifier>URN:ISBN:9783319644004 (electronic)</dc:identifier>
  </oai_dc:dc>
</oai_dc:dcCollection>
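The Dublin Core conversion works by piping MARCXML through an XSLT stylesheet. The sketch below reproduces that idea with only the JDK's `javax.xml.transform` API and a tiny inline stylesheet that maps 245 $a to `dc:title`. The stylesheet is a hypothetical stand-in written for this example, not the Library of Congress MARC21slim2OAIDC.xsl, and `TitleToDublinCore` is an illustrative name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TitleToDublinCore {
    // A one-field MARCXML record taken from the sample above.
    static final String MARCXML =
        "<marc:record xmlns:marc=\"http://www.loc.gov/MARC21/slim\">"
      + "<marc:datafield ind1=\"0\" ind2=\"0\" tag=\"245\">"
      + "<marc:subfield code=\"a\">Futures, biometrics and neuroscience research</marc:subfield>"
      + "</marc:datafield></marc:record>";

    // Hypothetical minimal stylesheet: copies 245 $a into a dc:title element.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:marc=\"http://www.loc.gov/MARC21/slim\""
      + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\""
      + " exclude-result-prefixes=\"marc\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/marc:record\">"
      + "<dc:title><xsl:value-of"
      + " select=\"marc:datafield[@tag='245']/marc:subfield[@code='a']\"/></dc:title>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the given stylesheet to the given XML document and returns the result. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(MARCXML, XSLT));
    }
}
```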
Testing Marc4j writer
1. Reading a MARC JSON file and writing it out in MARC format. We observed that when no encoding was passed to the writer, the generated MARC was invalid.
Code Block |
---|
import java.io.FileInputStream;
import java.io.InputStream;
import org.marc4j.*;
import org.marc4j.marc.Record;

public class TestMarc4jWriter {
    public static void main(String[] args) throws Exception {
        // The JSON file used here is the MARCJSON source shown above
        InputStream input = new FileInputStream("MarcFileinjsonFormat");
        MarcReader reader = new MarcJsonReader(input);
        // Pass the encoding explicitly; without it the output was invalid
        MarcWriter writer = new MarcStreamWriter(System.out, "UTF8");
        while (reader.hasNext()) {
            Record record = reader.next();
            writer.write(record);
        }
        writer.close();
    }
} |
OUTPUT
No Format |
---|
01741nam a2200373 cb4500001001100000003000700011005001700018008004100035020003000076020003700106020005000143035002400193035002100217035002600238040002000264041000800284245008900292264003700381300003700418336002600455337004600481338002500527490009600552500002500648650007700673650006300750650007800813700007700891700003900968830012301007856012101130900005301251954006301304101073931XDE-60120180416162657.0180111s2018\\\\sz\\\\\\\\\\\\000\0\eng\d\\a331964399193-319-64399-1\\a97833196439919978-3-319-64399-1\\a9783319644004 (electronic)9978-3-319-64400-4\\a(OCoLC)ocn992783736\\a(OCoLC)992783736\\a(DE-599)GBV101073931X\\bgercGBVCPerda0\aeng00aFutures, biometrics and neuroscience researchcLuiz Moutinho, Mladen Sokele, editors31aChambPalgrave Macmillanc[2018]\\axxix, 224 SeitenbIllustrationen\\aTextbtxt2rdacontent\\aohne Hilfsmittel zu benutzenbn2rdamedia\\aBandbnc2rdacarrier0\aInnovative research methodologies in managementv / Luiz Moutinho, Mladen Sokele ; Volume 2\\aEnthält 9 Beiträge\781.1\xaBetriebswirtschaftslehre0(DE-601)0913513910(DE-STW)12041-52stw\781.2\xaManagement0(DE-601)0913761730(DE-STW)12085-62stw\781.3\xaWissenschaftliche Methode0(DE-601)0914014450(DE-STW)16727-02stw1\aMoutinho, LuizeHerausgeberIn4edt0(DE-601)5094509540(DE-588)1314502041\aSokele, MladeneHerausgeberIn4edt\0aInnovative research methodologies in managementb / Luiz Moutinho, Mladen SokelevVolume 292.2018w(DE-601)101138029342yInhaltsverzeichnisuhttp://www.gbv.de/dms/zbw/101073931X.pdfmV:DE-601;B:DE-206qapplication/pdf3Inhaltsverzeichnis\\aGBVbZBW Kiel <206>d!H:! A18-1775xLzLCs206/1\\0ZBW Kiel <206>a26b1740761685c01fH:dA18-1775eux206/1 |
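One plausible reason the encoding matters is that the lengths stored in the record are byte counts, so they depend on how non-ASCII text is encoded. In the output above, the directory entry for field 500 is `500002500648`, which gives a field length of 25. The sketch below recomputes that length for the 500 $a value under two encodings. It is an illustration under stated assumptions rather than a confirmed diagnosis of the invalid output, and `FieldLengthByEncoding` is a hypothetical name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class FieldLengthByEncoding {
    /**
     * Length of a data field holding one subfield:
     * 2 indicators + subfield delimiter + subfield code + data bytes + field terminator.
     */
    static int fieldLength(String data, Charset charset) {
        return 2 + 1 + 1 + data.getBytes(charset).length + 1;
    }

    public static void main(String[] args) {
        String note = "Enth\u00e4lt 9 Beitr\u00e4ge"; // field 500 $a from the record above
        System.out.println("UTF-8:      " + fieldLength(note, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1: " + fieldLength(note, StandardCharsets.ISO_8859_1));
    }
}
```

The UTF-8 length is 25, matching the directory entry, while ISO-8859-1 yields 23 because each umlaut takes one byte instead of two.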
2. Defining fields using Record
Code Block |
---|
import org.marc4j.MarcStreamWriter;
import org.marc4j.MarcWriter;
import org.marc4j.marc.MarcFactory;
import org.marc4j.marc.Record;
public class TestMarc4jWriter2 {
public static void main(String args[]){
MarcWriter writer = new MarcStreamWriter(System.out, "UTF8");
MarcFactory factory = MarcFactory.newInstance();
Record record = factory.newRecord("00714cam a2200205 a 4500");
record.addVariableField(factory.newControlField("001", "12883376"));
record.addVariableField(factory.newControlField("005", "20030616111422.0"));
record.addVariableField(factory.newControlField("008", "020805s2002    nyu    j      000 1 eng  "));
record.addVariableField(factory.newDataField("020", ' ', ' ', "a", "0786808772"));
record.addVariableField(factory.newDataField("020", ' ', ' ', "a", "0786816155 (pbk.)"));
record.addVariableField(factory.newDataField("040", ' ', ' ', "a", "DLC", "c", "DLC", "d", "DLC"));
record.addVariableField(factory.newDataField("100", '1', ' ', "a", "Chabon, Michael."));
record.addVariableField(factory.newDataField("245", '1', '0', "a", "Summerland /", "c", "Michael Chabon."));
record.addVariableField(factory.newDataField("250", ' ', ' ', "a", "1st ed."));
record.addVariableField(factory.newDataField("260", ' ', ' ', "a", "New York :", "b", "Miramax Books/Hyperion Books for Children,", "c", "c2002."));
record.addVariableField(factory.newDataField("300", ' ', ' ', "a", "500 p. ;", "c", "22 cm."));
record.addVariableField(factory.newDataField("520", ' ', ' ', "a", "Ethan Feld, the worst baseball player in the history of the game, finds himself recruited by a 100-year-old scout to help a band of fairies triumph over an ancient enemy."));
record.addVariableField(factory.newDataField("650", ' ', '1', "a", "Fantasy."));
record.addVariableField(factory.newDataField("650", ' ', '1', "a", "Baseball", "v", "Fiction."));
record.addVariableField(factory.newDataField("650", ' ', '1', "a", "Magic", "v", "Fiction."));
writer.write(record);
writer.close();
}
}
|
OUTPUT
No Format |
---|
00714cam a2200205 a 45000010009000000050017000090080041000260200015000670200022000820400018001041000021001222450034001432500012001772600067001893000021002565200175002776500013004526500023004656500020004881288337620030616111422.0020805s2002 nyu j 000 1 eng a0786808772 a0786816155 (pbk.) aDLCcDLCdDLC1 aChabon, Michael.10aSummerland /cMichael Chabon. a1st ed. aNew York :bMiramax Books/Hyperion Books for Children,cc2002. a500 p. ;c22 cm. aEthan Feld, the worst baseball player in the history of the game, finds himself recruited by a 100-year-old scout to help a band of fairies triumph over an ancient enemy. 1aFantasy. 1aBaseballvFiction. 1aMagicvFiction. |
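The output above starts with the 24-character leader, followed by the directory: one 12-character entry per field, made up of the tag (3 characters), the field length (4), and the starting offset (5). Leader positions 12 to 16 hold the base address of the data, which marks where the directory ends. The sketch below decodes the leader and directory copied from that output. `DirectorySketch` and its members are hypothetical names for illustration, not MARC4J classes.

```java
import java.util.ArrayList;
import java.util.List;

final class DirectorySketch {
    /** One directory entry: tag, field length in bytes, offset from the base address. */
    static final class Entry {
        final String tag;
        final int length;
        final int offset;

        Entry(String tag, int length, int offset) {
            this.tag = tag;
            this.length = length;
            this.offset = offset;
        }
    }

    // Leader and directory copied from the output above.
    static final String SAMPLE =
        "00714cam a2200205 a 4500"
      + "001000900000" + "005001700009" + "008004100026" + "020001500067" + "020002200082"
      + "040001800104" + "100002100122" + "245003400143" + "250001200177" + "260006700189"
      + "300002100256" + "520017500277" + "650001300452" + "650002300465" + "650002000488";

    /** Parses the directory that follows the 24-character leader. */
    static List<Entry> parse(String leaderAndDirectory) {
        int baseAddress = Integer.parseInt(leaderAndDirectory.substring(12, 17));
        int directoryEnd = baseAddress - 1; // the byte before the base address is a field terminator
        List<Entry> entries = new ArrayList<>();
        for (int pos = 24; pos + 12 <= directoryEnd; pos += 12) {
            entries.add(new Entry(
                leaderAndDirectory.substring(pos, pos + 3),
                Integer.parseInt(leaderAndDirectory.substring(pos + 3, pos + 7)),
                Integer.parseInt(leaderAndDirectory.substring(pos + 7, pos + 12))));
        }
        return entries;
    }

    public static void main(String[] args) {
        for (Entry e : parse(SAMPLE)) {
            System.out.println(e.tag + " length=" + e.length + " offset=" + e.offset);
        }
    }
}
```

Each offset equals the previous offset plus the previous length, which is a quick consistency check when a generated record looks invalid.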
Validating a MARC file locally
MarcEdit, a free tool, can be used to validate MARC files and to convert them between formats.
Reference: https://marcedit.reeset.net/
REFERENCES:
https://github.com/marc4j/marc4j
Formatted tutorial: http://projects.freelibrary.info/freelib-marc4j/tutorial.html