2023-04-12 Data Import Subgroup meeting

Recordings are posted Here (2022+) and Here (pre-2022)

Slack channel for Q&A, discussion between meetings

Requirements details Here

Additional discussion topics in Subgroup parking lot


Attendees: Ann-Marie Breaux, Jennifer Eustis, Lynne Fors, Christie Thomas, Jeanette Kalchik, Jenn Colt, Kim, leeda.adkins@duke.edu, Lisa Smith, Lloyd Chittenden, Monica Arnold, Taylor Smith

Current development (Orchid)

Agenda: 

  • Announcements
    • Orchid DI overview at MM SIG tomorrow
    • Nolana
      • Imports with matching and branching taking longer in Nolana
      • Christie: File with 1 record in a profile with match and branch took 9.5 hours in prod. Finished immediately in test.
      • Jennifer E: Single-record imports either do not finish or take about 20 minutes (though some fly through). Test and dry run are working better than production.
      • Is there additional info that can be provided when reporting problems (besides developer tool info) that would help in the analysis?
      • Also consider running this script against Nolana profiles, to clean up debris in them: Script to remove child and parent elements from data import profiles
      • Lynne: testing the upgrade and has a finance blocker; it occurs even when creating new profiles and then trying to select (see screenshot)
  • Multiple Holdings and Items
    • Does anyone have examples of MARC files with:
      • Data being used to create/update multiple holdings
        • Jennifer file
      • Data being used to create/update multiple items for 1 holdings
        • Christie - shelfready file for an ordered thing that has 1 item (for v.1) that needs to be updated, but needs 3 additional items (for vols 2-4) to be created
      • Data being used to create/update multiple items across multiple holdings
        • Lynne - update multiple holdings and items with stat codes; would want to exclude some locations/holdings
          • Maybe export all (be sure to include other identifying info like perm locs, not just HRIDs and UUIDs), then delete the holdings/items you don't want to update before re-importing
      • Data being used to create orders for multiple copies for 1 location
        • Christie:
          • Create instance, holdings, 5 items at point of order
          • Update instance, 1 holdings, 5 items at point of receipt
      • Data being used to create orders for multiple copies for more than 1 location
        • TBD
      • What about P/E mix?
        • Jennifer: maybe for 5C
    • Please send to Ann-Marie Breaux via Slack or e-mail
  • Log subgroup work
    • Finish discussion of the background slides that we looked at briefly last week; finalize requirements
    • A-M discussing with designer and devs
    • Topics from last week:
      • Maybe add the action profile name at the top of the JSON screen? 
        • Will log as an idea, but concerned about slowing down the log, especially because of the coming multi-copy changes
        • A-M to add as future suggestion
      • Multiple matches
        • Per devs, OK to show UUIDs if up to 4 matches, then default message after that
        • Have the error message be a hotlink to a notepad or downloadable list of all matches
          • Per devs, that data is not currently stored in the log; would be a more significant change that we can make in Poppy
          • A-M to add as future suggestion
        • Show the search expression used for determining duplicates
          • Per devs, that data is not currently stored in the log; would be a more significant change that we can make in Poppy
          • A-M to add as future suggestion
          • Is there a way to better understand what is happening with the matching? 
          • If SMEs give us a couple of profiles, could the devs provide documentation of what the resulting SQL looks like, so that we could document matching better?
            • Jenn and Christie will provide a couple example profiles
            • Does not match
            • Begins with
            • Etc
      • Log summary
        • Add the Job ID next to the # records in the subheader
          • Per devs, this is doable. A-M adding story
        • Add the job profile name as a hotlink, under the subheader, above the summary
          • Per devs, this is doable. A-M adding story
      • Hook up into the dashboard somehow
        • A-M to add as future suggestion
      • Add separate column and JSON tab for incoming parsed data vs SRS MARC
        • A-M still to discuss with devs
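
Illustration of the multiple-matches display rule from the log discussion above (show UUIDs when there are up to 4 matches, otherwise a default message). This is only a minimal Python sketch to make the rule concrete, not the FOLIO implementation: the function name, the constant, and the wording of both messages are hypothetical, since the notes do not specify them.

```python
# Threshold taken from the notes: list UUIDs for up to 4 matches.
MAX_UUIDS_SHOWN = 4


def multiple_match_message(matched_uuids):
    """Build log text for an incoming record that matched several existing records.

    The message wording is a placeholder; only the threshold comes from the notes.
    """
    count = len(matched_uuids)
    if count < 2:
        raise ValueError("expected at least two matches")
    if count <= MAX_UUIDS_SHOWN:
        # Few enough matches to list each UUID in the log entry.
        return f"Multiple matches found ({count}): " + ", ".join(matched_uuids)
    # Beyond the threshold, fall back to a generic (hypothetical) default message.
    return f"Multiple matches found ({count}); too many to list"
```

For example, `multiple_match_message(["a", "b"])` would list both identifiers, while a list of five identifiers would produce only the default message.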

Upcoming meetings/agenda topics:

  • Misc
    • Discuss/review mockups for MARC updates refinements
    • POL/VRN matching
      • For invoices, we only consider open POs
      • For Instance, Holdings, Items, we currently (Orchid and before) only consider open POs. Should we change the Inventory matching logic to allow matching on closed POs as well?
      • MODDATAIMP-769
    • Deleting outdated versions of SRS records
      • Can we define a cutoff date? 90 days ago? 1 year ago?
        • Should the cutoff be different for records that are used during import and then not consulted again? (e.g. EDIFACT invoices, MARC bibs that only create/update orders, holdings, items)
      • Effects on the import log
    • Data import and consortia (cross-tenant importing)
    • OCLC number cleanup
      • Confirm 035 structure, aim for it to be consistent across all FOLIO tenants
    • Downloading log info
      • Lots of interest, especially for errors
      • Including identifiers for everything
      • What would UI look like?
      • What would output look like? 
    • Variation between PTF and production library performance results - why?
    • Revisit use cases for Updating individual MARC fields - what are the most common use cases?


Chat

Lynne Fors  to  Everyone 1:03 PM
Yeah. Something is definitely wrong with the HVAC.
Christie Thomas (University of Chicago, she/her)  to  Everyone 1:06 PM
I had a file with one record that took 9.5 hours to complete.
Jenn Colt  to  Everyone 1:07 PM
We only get that when we cancel jobs. Otherwise our complex profiles have gone from taking 20 min to 40ish
Lisa Smith - Mich State  to  Everyone 1:08 PM
We recently were upgraded to Nolana.  Yesterday, our DI was just not working at all.  Now it seems ok.
Jennifer Eustis  to  Everyone 1:08 PM
Our test is working better than production
Jenn Colt  to  Everyone 1:13 PM
Maybe your test Kafka is shared by fewer active tenants
Jennifer Eustis  to  Everyone 1:14 PM
I'm not sure
Jenn Colt 1:15 PM
If it helps, we have seen nothing like that at all. It seems like it has to be environmental?
Jenn Colt  to  Everyone 1:16 PM
It should be otherwise it’s not much of a dry run!
Lynne Fors  to  Everyone 1:17 PM
They tell us that it's a "snapshot" of our production environments. We are also EBSCO
Christie Thomas (University of Chicago, she/her)  to  Everyone 1:43 PM
Right. That makes sense.
Christie Thomas (University of Chicago, she/her)  to  Everyone 2:09 PM
My apologies, but I did not write down where to create the complex matching profiles. Is that on Nolana bugfest or Orchid bugfest? Or both?