MARC4J provides an easy-to-use API for working with MARC (binary), MARCXML, and MARC JSON in Java. MARC stands for MAchine Readable Cataloging and is a widely used exchange format for bibliographic data. MARCXML provides a lossless conversion between MARC (MARC21, but also other formats such as UNIMARC) and XML.

Features

The MARC4J library includes:

...

Types

For testing this library, we used all types of MARC records. These records can be found here.
The following types of records were tested:

...

Marc4j successfully parsed all types and converted them into a list of internal Record entities. Files with mixed record types are also parsed correctly, but the parsed list cannot be filtered by record type directly; the type can only be determined by analyzing each record's fields. For the first iteration of Batch Loader, different types of MARC records should be in separate files.
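
One possible way to do that per-record analysis is to look at the leader, assuming MARC 21 records, where leader position 06 holds the one-character "type of record" code. The sketch below is plain Java and is not part of marc4j; the class name `LeaderTypeSketch`, its method names, and the coarse "bibliographic / authority / holdings" grouping are assumptions made for illustration. It works on leader strings such as the one shown in the JSON example on this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of marc4j: groups records by the
// MARC 21 "type of record" code found at leader position 06.
class LeaderTypeSketch {

    // Returns the raw type-of-record code (leader position 06, zero-based).
    static char typeOfRecord(String leader) {
        if (leader == null || leader.length() < 7) {
            throw new IllegalArgumentException("leader too short: " + leader);
        }
        return leader.charAt(6);
    }

    // Maps the code to a coarse record kind, using the MARC 21 code lists.
    static String kindOf(String leader) {
        char type = typeOfRecord(leader);
        if ("acdefgijkmoprt".indexOf(type) >= 0) {
            return "bibliographic";
        }
        if (type == 'z') {
            return "authority";
        }
        if ("uvxy".indexOf(type) >= 0) {
            return "holdings";
        }
        return "other";
    }

    // Splits a mixed list of leaders into one list per record kind,
    // keeping the order in which the kinds first appear.
    static Map<String, List<String>> groupByKind(List<String> leaders) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String leader : leaders) {
            groups.computeIfAbsent(kindOf(leader), k -> new ArrayList<>()).add(leader);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Leader taken from the JSON example on this page; position 06 is 'j'.
        System.out.println(kindOf("00508cjm a22001813 4500"));
    }
}
```

In a real loader the same check would be applied to each parsed record's leader, so that a mixed file could be split before further processing.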

Encoding and formats

While testing the marc4j library, we used records in these formats:

  • XML
  • Binary *.mrc files

Binary records were encoded as:

  • UTF-8
  • MARC 8

While parsing records, marc4j decodes all record data into UTF-8. This works well for all tested types and encodings, and non-Latin characters are handled correctly. After parsing, the data in the list of records is stored in UTF-8 encoding.
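
To see which encoding a binary record declares before it is parsed, one option is to inspect the leader. The sketch below assumes MARC 21 records, where leader position 09 is blank for MARC-8 and `a` for UCS/Unicode (treated here as UTF-8). It is plain Java and not part of marc4j; the class name `LeaderEncodingSketch` and its method are hypothetical.

```java
// Hypothetical helper, not part of marc4j: reads the character coding
// scheme that a MARC 21 leader declares at position 09.
class LeaderEncodingSketch {

    // Returns "UTF-8", "MARC-8", or "unknown" for any other value.
    static String declaredEncoding(String leader) {
        if (leader == null || leader.length() < 10) {
            throw new IllegalArgumentException("leader too short: " + leader);
        }
        char scheme = leader.charAt(9);
        if (scheme == 'a') {
            return "UTF-8";   // UCS/Unicode
        }
        if (scheme == ' ') {
            return "MARC-8";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // Leader taken from the JSON example on this page; position 09 is 'a'.
        System.out.println(declaredEncoding("00508cjm a22001813 4500"));
    }
}
```

The declared value is only what the record claims; a record can be mislabeled, so this check is a hint rather than a guarantee.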

...

JSON representation of a MARC record entity

{
  "leader": "00508cjm a22001813 4500",
  "fields": [
    {
      "001": "10062588"
    },
    {
      "005": "20171013073237.0"
    },
    {
      "007": "sd fsngnnmmneu"
    },
    {
      "008": "170825s2017 xx nn n zxx d"
    },
    {
      "024": {
        "subfields": [
          {
            "a": "00190295755553"
          },
          {
            "2": "gtin-14"
          }
        ],
        "ind1": "7",
        "ind2": " "
      }
    },
    {
      "024": {
        "subfields": [
          {
            "a": "190295755553"
          }
        ],
        "ind1": "1",
        "ind2": " "
      }
    },
    {
      "035": {
        "subfields": [
          {
            "a": "(OCoLC)1002130878"
          }
        ],
        "ind1": " ",
        "ind2": " "
      }
    },
    {
      "035": {
        "subfields": [
          {
            "a": "10062588"
          }
        ],
        "ind1": " ",
        "ind2": " "
      }
    },
    {
      "040": {
        "subfields": [
          {
            "a": "BTCTA"
          },
          {
            "b": "eng"
          },
          {
            "c": "BTCTA"
          }
        ],
        "ind1": " ",
        "ind2": " "
      }
    },
    {
      "100": {
        "subfields": [
          {
            "a": "Rossi, Daniele"
          }
        ],
        "ind1": "1",
        "ind2": " "
      }
    },
    {
      "245": {
        "subfields": [
          {
            "a": "Saint-Saens: Organ Symphony and Carnival of The Animals"
          }
        ],
        "ind1": "0",
        "ind2": "0"
      }
    },
    {
      "260": {
        "subfields": [
          {
            "b": "Wea Corp"
          },
          {
            "c": "2017."
          }
        ],
        "ind1": " ",
        "ind2": " "
      }
    },
    {
      "948": {
        "subfields": [
          {
            "a": "20171013"
          },
          {
            "b": "m"
          },
          {
            "d": "batch"
          },
          {
            "e": "lts"
          },
          {
            "x": "deloclcprefix"
          }
        ],
        "ind1": "2",
        "ind2": " "
      }
    }
  ]
}
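
The shape of the JSON above follows a simple pattern: a control field is an object with the tag as its only key and the field data as the value, while a data field maps the tag to an object holding an ordered `subfields` array plus `ind1` and `ind2`. The following sketch builds both shapes with plain string concatenation. It is an illustration only: the class `MarcJsonSketch` and its methods are hypothetical and not part of marc4j, and the escaping covers only quotes and backslashes, so it is not a complete JSON writer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of marc4j: renders single fields in the
// JSON shape shown above.
class MarcJsonSketch {

    // Wraps a value in quotes, escaping backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Control field shape: {"001": "10062588"}
    static String controlField(String tag, String data) {
        return "{" + quote(tag) + ": " + quote(data) + "}";
    }

    // Data field shape: {"024": {"subfields": [...], "ind1": "7", "ind2": " "}}
    // Each subfield is a {code, value} pair; a list is used because
    // subfield codes may repeat and their order matters.
    static String dataField(String tag, char ind1, char ind2, List<String[]> subfields) {
        StringBuilder sb = new StringBuilder();
        sb.append("{").append(quote(tag)).append(": {\"subfields\": [");
        for (int i = 0; i < subfields.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            String[] sf = subfields.get(i);
            sb.append("{").append(quote(sf[0])).append(": ").append(quote(sf[1])).append("}");
        }
        sb.append("], \"ind1\": ").append(quote(String.valueOf(ind1)));
        sb.append(", \"ind2\": ").append(quote(String.valueOf(ind2))).append("}}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Values taken from the first 024 field in the example above.
        List<String[]> subfields = Arrays.asList(
                new String[] {"a", "00190295755553"},
                new String[] {"2", "gtin-14"});
        System.out.println(controlField("001", "10062588"));
        System.out.println(dataField("024", '7', ' ', subfields));
    }
}
```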

Code snippet

Parses a MARC file and prints all information to the console.


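import java.io.FileInputStream;
import java.io.InputStream;
import java.util.List;

import org.marc4j.MarcReader;
import org.marc4j.MarcStreamReader;
import org.marc4j.marc.ControlField;
import org.marc4j.marc.DataField;
import org.marc4j.marc.Leader;
import org.marc4j.marc.Record;

public class App {

    public static void main(String[] args) throws Exception {

        // Path to the .mrc file (left empty here; set it before running)
        String path = "";

        // Open the file as a stream; try-with-resources closes it when done
        try (InputStream in = new FileInputStream(path)) {
            MarcReader reader = new MarcStreamReader(in);

            // Read the records one by one
            while (reader.hasNext()) {
                Record record = reader.next();

                Leader leader = record.getLeader();

                List<ControlField> controlFields = record.getControlFields();
                List<DataField> dataFields = record.getDataFields();

                // Print the leader, then every control field and data field
                System.out.println();
                System.out.println("LEADER: " + leader);
                System.out.println("Control fields: ");
                controlFields.forEach(System.out::println);
                System.out.println("Data fields: ");
                dataFields.forEach(System.out::println);
                System.out.println();
            }
        }
    }
}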