2025-07-23 Data Import SIG Meeting Notes

Recordings are posted Here (2022+) and Here (pre-2022)

Slack channel for Q&A and discussion between meetings

Requirements details Here

Additional discussion topics in Subgroup parking lot

 

Attendees: @Jennifer Eustis @Alissa Hafele @Charlotte Whitt @Christine Schultz @Emily Semenoff @Greg Edwards @Jeanette Kalchik @Vivian Gould @Lola Estelle @Brooks Travis @Kalli Mathios @Kalvin Van Gaasbeck @Katie Rahman @Khalilah Gambrell @Kim Wiljanen @Lisa Lorenzo @Magda Fathy Gad @Mary Aycock @Mary Campany @Nancy Lorimer @Robert Heaton @Robert Pleshar @Ryan Taylor @Sheila Torres-Blank @Tom Hanstra @Yael Hod
Notetaker: @Autumn Faulkner

Links:

Agenda: 

Topic | Who | Meeting Notes | Related Jira | Decisions/Actions

Announcements

ALL

  • Autumn replicated the duplicate 035 issue, in which a record with two different (OCoLC) numbers in the repeatable $a of the 035 generates a second 035 containing a portion of the 001 after running through Data Import.

  • Will send info to Ryan, who will review existing Jira bugs or create a new one
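For anyone who wants to pull candidate records for the Jira ticket, here is a minimal Python sketch of a check for the condition described above. It assumes each record's 035 fields have already been extracted into lists of (subfield code, value) pairs. The function names and that data shape are hypothetical and not part of Data Import or any FOLIO API. It counts distinct (OCoLC) values across all 035 $a subfields, whether they sit in one 035 or several.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]   # (subfield code, value), a hypothetical representation
Field035 = List[Subfield]    # one 035 field as a list of subfields

OCLC_PREFIX = "(OCoLC)"


def distinct_oclc_numbers(fields_035: List[Field035]) -> List[str]:
    """Collect the distinct (OCoLC) values found in $a of a record's 035 fields."""
    seen: List[str] = []
    for subfields in fields_035:
        for code, value in subfields:
            cleaned = value.strip()
            if code == "a" and cleaned.startswith(OCLC_PREFIX) and cleaned not in seen:
                seen.append(cleaned)
    return seen


def is_duplicate_035_candidate(fields_035: List[Field035]) -> bool:
    """True when the record carries two or more different (OCoLC) numbers."""
    return len(distinct_oclc_numbers(fields_035)) > 1


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    record = [[("a", "(OCoLC)12345"), ("a", "(OCoLC)67890")]]
    print(is_duplicate_035_candidate(record))
```

Records flagged this way are only candidates for reproducing the issue. Whether Data Import actually generates the extra 035 still has to be confirmed by running them through a job.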

 

 

Resources

 

Presentation

Authority Control Development documentation

 

 

Authority Control Work

@Jeanette Kalchik @Kalli Mathios @Alissa Hafele @Sheila Torres-Blank @Mary Campany @Emily Semenoff

What are you currently doing?

  • Stanford’s use case: overlay bibliographic records with updates to authorized headings, but also load authority records

  • No current plans to link authority records to bibs at any real scale

  • Hoped to not use Data Import at all for this process

  • Tested EBSCO migration tools and a Python script, as well as the Source Record Batch Update and Change Manager endpoints, but Data Import tests proved more successful than these other options

  • Currently, all 500k bib records and 1m of 4m authority records have been loaded via Data Import

  • Alissa batching these incrementally during off hours

What are the challenges you face?

  • Encountering frequent errors and timeouts with these larger loads

  • Error handling isn’t great; decoding error messages can be tough

  • How to handle deletes via the UI?

  • No overrides of field protections available for authority records; this is problematic for a protected local field Stanford uses to facilitate workflows

  • Compiled issues from Stanford’s analysis of MARC authorities loads and processes:

Updates from other institutions loading authority records

  • Sheila Torres-Blank for Texas State – difficulty matching on 010 in authority records; can’t get this working even though it is supposed to be operational

    • Michigan State University Libraries can get matches on 010 to work for loading authority records; Khalilah believes there are some underlying issues in Texas State’s data that she wants to investigate

  • Jennifer Eustis for UMass/Five Colleges – not going to use MARC authorities, just sending bibs out for processing and loading them back in

    • Haven’t experienced any timeouts, though a few issues with smaller files throwing errors [Jennifer, make sure I got this right!]

  • Katie Rahman for [institution]

  • Lola Estelle, migration consultant from EBSCO – working with Notre Dame on migration

    • No automated batch linking possible until maybe Umbrellaleaf

    • Temporary workaround will use repeated Backstage processing loads, and then attempt wholesale linking when the functionality is available

    • What are others planning?

      • Stanford still busy with initial phases, no plans yet for automatic batch linking; Nancy Lorimer has thought some about partial matches to subject strings, tracking these in the system, but no action can be taken yet

      • Lisa Lorenzo, MSU: Big roadblock is outstanding question about subdivisions in headings and how that will affect linking logic

  • Question for Khalilah: is there a SIG or user group for MARC authorities?

    • Previously, activity was conducted within the QuickMarc group; development had to slow down in favor of more urgent priorities

    • Khalilah hopes that MARC Authorities work will pick back up again with Umbrellaleaf

      • Clearly batch automatic linking is the number one need expressed in this discussion

      • Perhaps there is a case to make for a separate module that is solely devoted to managing linking, rather than trying to administer this via Data Import (obviously this would require resources which aren’t guaranteed)

      • Shared a slide deck that outlines potential next steps for MARC Authorities batch linking

  • Tom Hanstra from Notre Dame: Aleph does have a tool that runs as a background process to handle authorities linking

    • Would like to see similar functionality when ND migrates next year

  • Jennifer, UMass/Five Colleges

    • Wonders about indexing of authorized headings and how a separate module for batch linking would affect it

    • Will get documentation from Khalilah and make a reference page in DI Confluence area

  • Khalilah

    • There is no current roadmap for Authorities beyond Umbrellaleaf

      • Khalilah and Christine S-R working on this now

    • Batch linking and refinements to authority files are on the docket for Umbrellaleaf

    • Vetch – could be some carryover work from batch development in Umbrellaleaf

Spreadsheet to track top priorities, use cases, and ideas for enhancements.

Upcoming meetings/agenda topics: --

 

Chat: