2023-11-15 Data Import Subgroup meeting

Recordings are posted Here (2022+) and Here (pre-2022)

Slack channel for Q&A, discussion between meetings

Requirements details Here

Additional discussion topics in Subgroup parking lot


Attendees: Ryan Taylor, Ann-Marie Breaux, Autumn Faulkner, Christie Thomas, Corrie Hutchinson, Kathleen Moore, Lynne Fors, Raegan Wiechert, Sara Colglazier, Yael Hod, Jennifer Eustis, Kim

Links:

Agenda: 

  • Data-slicing: Discuss thoughts & feelings on Data-slicing, which is currently enabled in Poppy Bugfest for all to test.
  • Duplicate Records in Imports: Discuss current & expected behavior for Duplicate Records in uploaded files.
    • What is a duplicate? Multiple copies of the same record in an incoming file that will affect the same Instance or SRS MARC. This only relates to updates, not creates.
    • Jiras: MODSOURCE-530, MODSOURMAN-898
    • Current solution: Process first copy, ignore subsequent copies
    • Error message? Mark as an error, with message "Duplicate record in incoming file. Not processed"
    • Jennifer is also seeing this for files that create records (at least back in Morning Glory); she will test in Poppy Bugfest to see if it's still happening
    • What if duplicate records are in a large file and end up split into separate files by FOLIO (e.g. a file of 5,000 records where the duplicates are records #150, #2,465, and #4,876)?
      • Identify as duplicates (since they were in the same original file) and show the same error message
      • Not available currently; aim for Poppy patch or Quesnelia, depending on level of effort
  • Field mappings for statistical codes: we're making some adjustments in the context of https://folio-org.atlassian.net/browse/MODDICORE-304, and I'd like to confirm that the valid/invalid mappings are acceptable.
    • A-M: Add link to spreadsheet and use it to update field mapping documentation
    • Jira: MODDICORE-304
    • Confirm "else" logic works with the dropdown list values (problematic because stat code category, code, and name are grouped together in the dropdown)
    • Be able to use the UUIDs in addition?
    • Confirm what incoming data FOLIO needs to support
      • Else logic works
      • When mapping a default, always select from the dropdown list
      • Stat code (yes) - case sensitive
      • Stat code name (yes) - case sensitive
      • Stat code name + code (no): we support this for locations and location codes, but stat codes are not presented in the UI dropdown in the same way
      • Stat code category (no): extra complexity that we prefer not to have to support
  • Suggestions:
    • Sara Colglazier: Allow the use of vendor codes when setting up field mapping profiles for orders and invoices
    • Lynne Fors: Add an info icon to indicate the ###REMOVE### mapping (and link to other mapping info?). A-M to talk with Ryan and UI devs
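
Illustration for the "Duplicate Records in Imports" item above: a minimal sketch of the behavior discussed, not FOLIO's actual implementation. It processes the first copy of a record, marks later copies as errors with the agreed message, and keeps one shared "seen" set across all chunks split from the same original file (the cross-chunk case, which the notes say is not available currently). The names `match_key`, `Result`, `split_into_chunks`, and `flag_duplicates` are hypothetical, and the sketch assumes chunks are checked in original file order, whereas real chunks may finish at different speeds.

```python
from dataclasses import dataclass

# Error message text agreed in the meeting notes.
DUPLICATE_MESSAGE = "Duplicate record in incoming file. Not processed"


@dataclass
class Result:
    position: int   # 1-based position in the original uploaded file
    match_key: str  # hypothetical key for the target Instance / SRS MARC
    status: str     # "PROCESSED" or "ERROR"
    message: str = ""


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def flag_duplicates(chunks):
    """Process the first copy of each key; mark later copies as errors.

    The seen set is shared across every chunk of the same original file,
    so duplicates that land in different chunks are still detected.
    """
    seen = set()
    position = 0
    results = []
    for chunk in chunks:
        for key in chunk:
            position += 1
            if key in seen:
                results.append(Result(position, key, "ERROR", DUPLICATE_MESSAGE))
            else:
                seen.add(key)
                results.append(Result(position, key, "PROCESSED"))
    return results


# Example from the notes: 5,000 records, with records #150, #2,465 and
# #4,876 being copies of the same record, split into chunks of 1,000.
keys = [f"rec-{i}" for i in range(1, 5001)]
keys[2464] = keys[149]
keys[4875] = keys[149]
chunks = split_into_chunks(keys, 1000)
results = flag_duplicates(chunks)
errors = [r.position for r in results if r.status == "ERROR"]
```

In this example record #150 is processed and records #2,465 and #4,876 are reported as errors, even though the three copies fall into chunks 1, 3, and 5.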


Upcoming meetings/agenda topics:


Chat

Lynne Fors  to  Everyone 1:06 PM
I haven't had a chance to do any DI testing as my library is in full swing to prep for our library closure for renovation.

Jennifer Eustis  to  Everyone 1:06 PM
Sorry … just running from another meeting

Kathleen Moore  to  Everyone 1:16 PM
for full transparency: queue management isn't working quite as we expected, either. In our smaller testing environments, we were consistently seeing smaller jobs being pulled throughout the running of a larger job. That's not, however, what we're seeing in Bugfest or in some of the other UAT environments. It appears like the queuing portion does work, however, it's unfortunately functioning in a limited way. Smaller jobs (including EDIFACT files) will be prioritized if they're submitted in the first few minutes after a larger job is submitted, otherwise they might not end up prioritized as expected. we'll have to follow-up with additional work to make queue management more robust, and function more as we thought this initial work would allow.

Christie Thomas (she/her)  to  Everyone 1:20 PM
Thank you for sharing that, Sara.

Jennifer Eustis  to  Everyone 1:21 PM
That's right.
For one we have parts 333, then 335, 337, 339, 338, 340
These parts all have 1000 records

Ann-Marie  to  Everyone 1:23 PM
When the file is split, the first 1000 records should be in file 1, then the next 1000 in file 2, etc. (assuming that 1000 is your split number). The chunks may get processed at different speeds, so chunk 27 may finish before chunk 26. And since the log is in the order of most recent finished at the top, the processed chunks won't necessarily be in numerical order

Jennifer Eustis  to  Everyone 1:24 PM
Though the records are probably of different lengths

Kathleen Moore  to  Everyone 1:26 PM
here's a draft story around improving the visibility of the files/logs/ordering: UIDATIMP-1561
I need to drop, but thanks to everyone who has provided feedback!

scolglaz  to  Everyone 1:35 PM
+1 like that too: Dupe

scolglaz  to  Everyone 1:36 PM
Can we be clearer in the error message: Duplicate in incoming file

Lynne Fors  to  Everyone 2:02 PM
I think the statistical code default list came from UChicago and their Ole system (legacy prior to FOLIO).

scolglaz  to  Everyone 2:04 PM
Vendor Codes

scolglaz  to  Everyone 2:09 PM
yes