LLM-Assisted MARC Management

MARC Fields Range

Extraction

Generation

Evaluation

00X: Control Fields

 

 

 

01X-09X: Numbers and Code Fields

 

 

 

  • 041 - Language Code

 

Automation via Python script - done

Conversion of the language to a MARC language code - not done

 

1XX: Main Entry Fields

 

 

 

20X-24X: Title and Title-Related Fields

 

 

 

25X-28X: Edition, Imprint, Etc. Fields

 

 

 

3XX: Physical Description, Etc. Fields

 

 

 

4XX: Series Statement Fields

 

 

 

5XX: Note Fields

 

 

 

  • 504 - Bibliography, Etc. Note

 

Bibliography/discography section could be located by an LLM; this could be prototyped

 

  • 520 - Summary, Etc.

 

Automation via Python script - done

Manual GPT test - done (comparison against the reference value from the existing MARC record)

6XX: Subject Access Fields

 

 

 

  • 650 - Subject Added Entry - Topical Term

 

Automation via Python script - done

Verification against the LC vocabulary - not done

 

70X-75X: Added Entry Fields

 

 

 

76X-78X: Linking Entry and Description Fields

 

 

 

80X-83X: Series Added Entry Fields

 

 

 

841-88X: Holdings, Location, Alternate Graphics, Etc. Fields

 

 

 
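For the 041 row above, the pending step of converting a language to a code could look roughly like the sketch below. It assumes the language has already been identified as a plain English name (for example by an LLM). The lookup table is a small hand-picked subset of the MARC Code List for Languages, and the function name `to_marc_language_code` is hypothetical, not part of the existing script.

```python
from typing import Optional

# Illustrative subset of the MARC Code List for Languages (assumption:
# a real script would load the complete list instead of hard-coding it).
MARC_LANGUAGE_CODES = {
    "english": "eng",
    "french": "fre",
    "german": "ger",
    "spanish": "spa",
    "russian": "rus",
}


def to_marc_language_code(language_name: str) -> Optional[str]:
    """Return the three-letter MARC code for a language name, or None if unknown."""
    # Normalise whitespace and case so that " English " and "english" match.
    return MARC_LANGUAGE_CODES.get(language_name.strip().lower())


if __name__ == "__main__":
    print(to_marc_language_code("English"))
    print(to_marc_language_code("Klingon"))
```

Returning `None` for an unknown name, rather than guessing, leaves the decision to a cataloguer or a later step.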

Definitions:

  • Extraction - finding the value inside the catalogued instance and copying it as is.

    • Example: a book may be written in English without ever containing the word “English”, so determining the book’s language is generation rather than extraction.

    • Example: an abstract can be either generated or extracted; extraction is possible for scientific articles, which often contain an abstract.

  • Generation - producing a value that might not appear inside the catalogued instance.

  • Evaluation - measuring the quality of a value, whether it was generated, extracted, already present in an existing record, or obtained with some other tool.
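The definitions above can be made concrete with a minimal sketch. It makes two assumptions: a value counts as extracted if it occurs in the source text once case and punctuation are ignored, and a token-overlap F1 score is a reasonable first evaluation of a candidate value against the reference value in an existing MARC record. Both helper names (`is_extracted`, `token_f1`) are hypothetical, and token F1 is only one possible metric, not necessarily the one used in the manual tests.

```python
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def is_extracted(value: str, source_text: str) -> bool:
    """True if the value's tokens occur as a contiguous run in the source text."""
    needle = " ".join(tokens(value))
    haystack = " ".join(tokens(source_text))
    # Pad with spaces so that only whole-token matches count.
    return bool(needle) and f" {needle} " in f" {haystack} "


def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a candidate value and a reference value."""
    cand, ref = Counter(tokens(candidate)), Counter(tokens(reference))
    overlap = sum((cand & ref).values())  # tokens shared by both, with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    book = "A short novel about a lighthouse keeper."
    print(is_extracted("English", book))            # language must be generated
    print(is_extracted("lighthouse keeper", book))  # phrase can be extracted
    print(token_f1("a history of rome", "history of ancient rome"))
```

A value for which `is_extracted` returns `False` falls under generation by the definitions above; `token_f1` applies to either kind of value.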

Links: