2025-2-5 Data Import Subgroup meeting


Recordings are posted Here (2022+) and Here (pre-2022)

Slack channel for Q&A, discussion between meetings

Requirements details Here

Additional discussion topics in Subgroup parking lot


Attendees: Ryan Taylor, Christie Thomas, Jennifer Eustis

Notetaker: Jennifer Eustis

Links:

Agenda: 

Topic | Who | Meeting Notes | Related Jira | Decisions and Actions

Announcements:


We are looking for a new group of volunteers to take notes. Please consider helping out.

Save the date for WOLFcon 2025: 9/23-9/25

EBSCO FOLIO Day is coming up May 2, 2025



Documentation Discussion: Identify gaps/needs

Ryan/All

Spreadsheet for tracking documentation needs:



Data Import Release Notes for Ramsons

All

Ramsons Flower Release

Links to documentation are also included there as they become available. This time around, Ramsons issues for Data Export were added because the Data Export working group no longer meets.



Review Topics Tracker


All

Data Import Implementers Topic Tracker

  1. Create a single log entry for batch import jobs that are split, rather than a log entry for each sub-batch of 1000 records (Submitted by Christie Thomas).

This is a request from Univ. of Chicago, which hasn't implemented file splitting because the logs show an entry for each part and no entry for the file as it was uploaded before splitting. Without file splitting, they are limited to 1000 records at a time and the logs are not working as intended. They don't have enough resources to view each log corresponding to the parts. There is also a performance issue with logs. The current approach to file splitting is ok as a permanent solution.

Sara: Could this be a setting at the tenant level to switch to one log entry per uploaded file instead of one log per split file? That's fair. The other issue is that the default log page shows 100 lines and there's a lot of noise.

Sara: I've found it helpful to have a log for each split file so I can focus on the log that has errors. I haven't found the need to review all the logs. As for logs hogging the page, I just go over to find my stuff.

Rob: For Chicago, the log doesn't indicate an error when no action is taken on a record. If there is a 100 field with an e but no e, then it doesn't load. We can't open a log bigger than a thousand records. There seems to be a performance issue, as the log has trouble loading in the UI.

Christie: We have a number of examples where records aren't loading but they don't show as errors. It is necessary to look at all the logs because of this. Data Import has been improved. We don't really know what the jobs really do and we rely on QA tasks to do this. Looking at the logs is a part of that. 

Jennifer: We should really look at the logs themselves and do an overhaul.

Christie: The logs do need to be reviewed. We encounter a lot of issues, especially with MARC orders. Ideally there needs to be a setting. DI needs to be performant. I'm concerned about the thin implementation as it won't save time. We need a statement from the community.

Jennifer: Here, DI has been working well and we can import more than we ever have before.

Christie: We didn't turn on splitting and DI is worse than ever. If splitting is the long-term solution, then we have to turn it on. We haven't turned it on because of the logs. If splitting is on, then we would need to rely on metadb.

Rob: Chicago is on Quesnelia CSP 6. Another question we need to ask is whether other places are importing smaller files because they know larger ones won't work.

Sara: We try not to step on our own toes. We base imports on size and on how often a set is updated or deleted. We also use EDS Custom catalogs for some sets because of this.

Raegan: We do have consortia that have to deal with a large scale. The highest record count is 22 records. One log versus multiple logs isn't a concern. If we could do authority work, this might change.

Robert: Is the reason you don't load Safari Books Online or large ebook collections that you load them into a shared discovery layer?

Sara: All these things are shared. It's because the sets are highly changeable.

Christie: This has been a problem within the community. We need to make a decision about whether DI can support large transactional imports. Chicago can't keep trying to make DI work for them.

Ryan: This has been a good conversation. We need to be explicit about the expectations for large transactional imports. We need to look into a setting that consolidates the logs for a job when file splitting is enabled. On a smaller scale, we need a more general review of where the logs are at for us today, including being able to change the default display and the fact that no action isn't logged as an error.

Jennifer: We should learn from Bulk Edit and how its logs work, such as being able to download a CSV of log entries.

Ryan: How do we get this into the backlog? We could look at a general feature for enhancements for DI logs and a separate feature for consolidated logs when splitting is turned on. 


  1. RE: MODDATAIMP-1058 – Script run to fix the single record overlay errors (Submitted by Raegan Wiechert).
    1. It did fix the records where there were already issues so that they can now be edited. However, I tried a new overlay and it won't allow me to edit the MARC record.

They ran the script, and all the records they couldn't edit before are now editable. When they did a new overlay, the same issue of the record not being editable continued. What does the script do? It cleans up multiple SRS MARC bib records that have the state "Actual"; it doesn't address the underlying issue. They are running on Quesnelia CSP 8. The overall fix is in Ramsons, and the script was for Quesnelia to get people through to Ramsons. Ryan will bring this back to FOLIJET.


  1. Set for deletion phase 2 enhancements (Discussion Continued), https://folio-org.atlassian.net/browse/UXPROD-4944
  2. Blocked issues
    1. https://folio-org.atlassian.net/browse/MODDICORE-386, Partial Matching
    2. https://folio-org.atlassian.net/browse/MODSOURMAN-1094, The number of invoices is displayed when all are errors
    3. https://folio-org.atlassian.net/browse/MODDICORE-386, Match on 035 when qualifier fails


















Notes from previous meetings...





Upcoming meetings/agenda topics: --


Chat: 

