LLM-Assisted MARC Management
| MARC Fields Range | Extraction | Generation | Evaluation |
|---|---|---|---|
| 00X: Control Fields | | | |
| 01X-09X: Numbers and Code Fields | | Automation via Python script: done. Converting to code: not done. | |
| 1XX: Main Entry Fields | | | |
| 20X-24X: Title and Title-Related Fields | | | |
| 25X-28X: Edition, Imprint, Etc. Fields | | | |
| 3XX: Physical Description, Etc. Fields | | | |
| 4XX: Series Statement Fields | | | |
| 5XX: Note Fields | | Bibliography/discography section could be located by an LLM; could be prototyped. | |
| | | Automation via Python script: done. | Manual GPT test: done (comparison against the reference value from the existing MARC record). |
| 6XX: Subject Access Fields | | Automation via Python script: done. Verification against LC vocabulary: not done. | |
| 70X-75X: Added Entry Fields | | | |
| 76X-78X: Linking Entry and Description Fields | | | |
| 80X-83X: Series Added Entry Fields | | | |
| 841-88X: Holdings, Location, Alternate Graphics, Etc. Fields | | | |

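
The field ranges in the table can be sketched as a small lookup that assigns a three-digit MARC tag to its group, which may help when tracking which range a given script covers. This is a minimal sketch, not part of the existing scripts: the name `field_range` and the numeric bounds are assumptions derived from the range labels (for example, `20X-24X` is read as tags 200 through 249).

```python
import re
from typing import Optional

# (low, high, label) with inclusive bounds; labels copied from the table.
# The numeric bounds are an assumed reading of the "NNX" patterns.
MARC_FIELD_RANGES = [
    (0, 9, "00X: Control Fields"),
    (10, 99, "01X-09X: Numbers and Code Fields"),
    (100, 199, "1XX: Main Entry Fields"),
    (200, 249, "20X-24X: Title and Title-Related Fields"),
    (250, 289, "25X-28X: Edition, Imprint, Etc. Fields"),
    (300, 399, "3XX: Physical Description, Etc. Fields"),
    (400, 499, "4XX: Series Statement Fields"),
    (500, 599, "5XX: Note Fields"),
    (600, 699, "6XX: Subject Access Fields"),
    (700, 759, "70X-75X: Added Entry Fields"),
    (760, 789, "76X-78X: Linking Entry and Description Fields"),
    (800, 839, "80X-83X: Series Added Entry Fields"),
    (841, 889, "841-88X: Holdings, Location, Alternate Graphics, Etc. Fields"),
]


def field_range(tag: str) -> Optional[str]:
    """Return the table row label for a three-digit tag, or None if no row covers it."""
    if not re.fullmatch(r"[0-9]{3}", tag):
        return None
    number = int(tag)
    for low, high, label in MARC_FIELD_RANGES:
        if low <= number <= high:
            return label
    return None
```

Tags outside the listed ranges (for example, locally defined 9XX fields) return `None` rather than being forced into a neighbouring group.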
Definitions:
Extraction - finding the value inside the catalogued instance and copying it as is.
Example: a book may be written in English without containing the word “English” anywhere, so determining the book’s language is generation rather than extraction.
Example: an abstract can be either generated or extracted; extraction is possible for scientific articles, which often contain an abstract.
Generation - producing a value that might not be found inside the catalogued instance.
Evaluation - measuring the quality of a value, whether it was generated, extracted, taken from an existing record, or obtained with some other tool.
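
The three definitions above can be illustrated with a rough Python sketch. Everything in it is an assumption made for illustration: the function names, the regular expression that looks for a line reading "Abstract", the `llm` callable (a stand-in for whatever model client is actually used), and the choice of `difflib.SequenceMatcher` as a similarity measure. It is not the existing automation script.

```python
import re
from difflib import SequenceMatcher
from typing import Callable, Optional


def extract_abstract(text: str) -> Optional[str]:
    """Extraction: copy the abstract as is, if the text contains one.

    Assumes the abstract follows a line reading "Abstract" and ends at the
    next blank line or at the end of the text.
    """
    match = re.search(
        r"^Abstract[ \t]*\n(.+?)(?:\n\s*\n|\Z)",
        text,
        flags=re.MULTILINE | re.DOTALL,
    )
    return match.group(1).strip() if match else None


def generate_value(text: str, llm: Callable[[str], str], instruction: str) -> str:
    """Generation: ask a model for a value that may not appear in the text.

    `llm` is a hypothetical callable that takes a prompt and returns a string.
    """
    return llm(f"{instruction}\n\n{text}").strip()


def evaluate(candidate: str, reference: str) -> float:
    """Evaluation: similarity in [0, 1] between a candidate value and a
    reference value, for example one taken from an existing MARC record.

    Case and whitespace differences are ignored before comparing.
    """
    def normalize(value: str) -> str:
        return " ".join(value.lower().split())

    return SequenceMatcher(None, normalize(candidate), normalize(reference)).ratio()
```

A string similarity score is only one possible evaluation; for controlled values such as language codes or subject headings, an exact match against the reference is likely more appropriate.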
Links:
MARC standard https://www.loc.gov/marc/bibliographic/
MARC samples https://catalog.loc.gov/
Crossref https://www.crossref.org/