2023-11-15 Data Import Subgroup meeting
Recordings are posted Here (2022+) and Here (pre-2022)
Slack channel for Q&A, discussion between meetings
Requirements details Here
Additional discussion topics in Subgroup parking lot
Attendees: Ryan Taylor, Ann-Marie Breaux, Autumn Faulkner, Christie Thomas, Corrie Hutchinson, Kathleen Moore, Lynne Fors, Raegan Wiechert, Sara Colglazier, Yael Hod, Jennifer Eustis, Kim
Links:
- Poppy import planning dashboard
- Poppy timeline
- Quesnelia import planning dashboard (still being defined)
- Quesnelia timeline
- Folijet Current Development Board
- Folijet (Data import) Bug Dashboard
Agenda:
- Data-slicing: Discuss thoughts & feelings on data-slicing, which is currently enabled in Poppy Bugfest for all to test.
- Duplicate Records in Imports: Discuss current & expected behavior for Duplicate Records in uploaded files.
- What is a duplicate? Multiple copies of the same record in an incoming file that will affect the same Instance or SRS MARC; only relates to updates, not creates
- Jiras: MODSOURCE-530, MODSOURMAN-898
- Current solution: Process first copy, ignore subsequent copies
- Error message? Mark as an error, with message "Duplicate record in incoming file. Not processed"
- Jennifer is also seeing this for files that create records (at least back in Morning Glory); she will test in Poppy Bugfest to see if it's still happening
- What if duplicate records are in a large file and end up split into separate files by FOLIO (e.g. a file of 5,000 records where the duplicates are records #150, #2,465, and #4,876)?
- Identify as duplicates (since they were in the same original file) and show the same error message
- Not available currently; aim for Poppy patch or Quesnelia, depending on level of effort
- Field mappings for statistical codes: we're making some adjustments in the context of https://folio-org.atlassian.net/browse/MODDICORE-304, and I'd like to confirm that the valid/invalid mappings are acceptable.
- A-M: Add link to spreadsheet and use it to update field mapping documentation
- Jira: MODDICORE-304
- Confirm "else" logic works with the dropdown list values (problematic because stat code category, code, and name are grouped together in the dropdown)
- Be able to use the UUIDs in addition?
- Confirm what incoming data FOLIO needs to support
- Else logic works
- When mapping a default, always select from the dropdown list
- Stat code (yes) - case sensitive
- Stat code name (yes) - case sensitive
- Stat code name + code (no): we support this combination for locations and location codes, but stat codes are not presented in the UI dropdown in the same way
- Stat code category (no): extra complexity that we prefer not to have to support
- Suggestions:
- Sara Colglazier: Allow the use of vendor codes when setting up field mapping profiles for orders and invoices
- Lynne Fors: Add an info icon to indicate the ###REMOVE### mapping (and link to other mapping info?) A-M to talk with Ryan and UI devs
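The duplicate-record behavior discussed above (process the first copy, flag later copies, and still recognize duplicates after a large file has been split) can be sketched roughly as follows. This is an illustration only, not FOLIO's implementation: the function name, the Result type, the match key strings, and the chunk size of 1,000 are all assumptions, and real chunks may be processed concurrently rather than in one sequential pass.

```python
from dataclasses import dataclass

# Error message wording taken from the meeting notes.
DUPLICATE_MESSAGE = "Duplicate record in incoming file. Not processed"
CHUNK_SIZE = 1000  # assumed split size; the real value is configurable


@dataclass
class Result:
    position: int   # 1-based position of the record in the original file
    chunk: int      # 1-based number of the chunk the record falls into
    match_key: str  # hypothetical identifier used to match an existing record
    status: str     # "processed" or "error"
    message: str = ""


def process_file(match_keys: list[str], chunk_size: int = CHUNK_SIZE) -> list[Result]:
    """Process the first copy of each record and flag every later copy.

    The set of seen keys is shared across all chunks of the same original
    file, so duplicates are caught even when they land in different chunks.
    """
    seen: set[str] = set()
    results: list[Result] = []
    for index, key in enumerate(match_keys):
        position = index + 1
        chunk = index // chunk_size + 1
        if key in seen:
            results.append(Result(position, chunk, key, "error", DUPLICATE_MESSAGE))
        else:
            seen.add(key)
            results.append(Result(position, chunk, key, "processed"))
    return results


# Example from the notes: 5,000 records, duplicates at #150, #2,465 and #4,876.
keys = [f"rec-{i}" for i in range(1, 5001)]
for pos in (150, 2465, 4876):
    keys[pos - 1] = "dup"

for r in process_file(keys):
    if r.status == "error":
        print(r.position, r.chunk, r.message)
```

In this sketch record #150 is processed, while #2,465 and #4,876 are reported as errors even though they fall into different chunks, because the lookup is keyed on the original file rather than on each split file.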
Upcoming meetings/agenda topics:
Chat
Lynne Fors to Everyone 1:06 PM
I haven't had a chance to do any DI testing as my library is in full swing to prep for our library closure for renovation.
Jennifer Eustis to Everyone 1:06 PM
Sorry … just running from another meeting
Kathleen Moore to Everyone 1:16 PM
For full transparency: queue management isn't working quite as we expected, either. In our smaller testing environments, we were consistently seeing smaller jobs being pulled throughout the running of a larger job. That's not, however, what we're seeing in Bugfest or in some of the other UAT environments. It appears that the queuing portion does work; unfortunately, it's functioning in a limited way. Smaller jobs (including EDIFACT files) will be prioritized if they're submitted in the first few minutes after a larger job is submitted; otherwise they might not end up prioritized as expected. We'll have to follow up with additional work to make queue management more robust, and function more as we thought this initial work would allow.
Christie Thomas (she/her) to Everyone 1:20 PM
Thank you for sharing that, Sara.
Jennifer Eustis to Everyone 1:21 PM
That's right.
For one we have parts 333, then 335, 337, 339, 338, 340
These parts all have 1000 records
Ann-Marie to Everyone 1:23 PM
When the file is split, the first 1000 records should be in file 1, then the next 1000 in file 2, etc. (assuming that 1000 is your split number). The chunks may get processed at different speeds, so chunk 27 may finish before chunk 26. And since the log is in the order of most recent finished at the top, the processed chunks won't necessarily be in numerical order
Jennifer Eustis to Everyone 1:24 PM
Though the records are probably of different lengths
Kathleen Moore to Everyone 1:26 PM
here's a draft story around improving the visibility of the files/logs/ordering: UIDATIMP-1561
I need to drop, but thanks to everyone who has provided feedback!
scolglaz to Everyone 1:35 PM
+1 like that too: Dupe
scolglaz to Everyone 1:36 PM
Can we be clearer in the error message: Duplicate in incoming file
Lynne Fors to Everyone 2:02 PM
I think the statistical code default list came from UChicago and their Ole system (legacy prior to FOLIO).
scolglaz to Everyone 2:04 PM
Vendor Codes
scolglaz to Everyone 2:09 PM
yes