2022-04-13 Data Import Subgroup meeting
Recordings are posted Here (2022+) and Here (pre-2022)
Slack channel for Q&A, discussion between meetings
Requirements details Here
Additional discussion topics in Subgroup parking lot
Attendees: Ann-Marie Breaux, Timothy Watters, Jennifer Eustis, Lynne Fors, Jenn Colt, leeda.adkins@duke.edu, Lloyd Chittenden, Monica Arnold, Raegan Wiechert, Taylor Smith
Lotus
- Lotus Folijet planning: dashboard where you can see the current scope and status of Data Import work for Lotus - DONE!!
- Current Data Import feature development dashboard and bugfix support
- Current work:
- Admin note bug just found today: hope to fix this week, either in Data Import or Inventory - TBD; also need to figure out if Holdings and Items have the same problem
- Some TestRails deferred to Morning Glory
- A-M: Check the multiple 856s in the holdings record especially - is it the Bugfest env or an actual bug? In Kiwi, the second 856 had to be put into a different field, but multiple 856s were working properly in Lotus.
- A-M: Check the MARC modification jobs that were getting stuck
- Jennifer E's E2E test worked fine except for the multiple 856s, so may be OK to pass it
- Updating Lotus release notes
Morning Glory
- Morning Glory Folijet and Spitfire planning: dashboard where you can see the current scope and status of Data Import work for Morning Glory
- Current work:
- Adding Admin notes to the 3 Inventory field mapping screens
- Log refinements: UI for deleting logs (no backend yet)
- Starting
- Field protection refinement (distinguish handling for repeatable and non-repeatable fields)
- Per Jenn/Christie, also make sure that the field protections do not prevent modifications on incoming records if the field is protected (e.g. 856 fields)
- Flow control (to help Single Record Imports and MARC Authority UI deletes not get stuck)
- Prep for Importing order data in MARC format (will be Nolana work)
Agenda topics:
Extend MatchValueLoader implementations to allow filtering according to Qualifiers and MatchCriteria:
Identifier matching should allow for qualifiers, compare part, and match criteria
- 856$u/URI match working well
- Using Static value submatches for true or false works well for checkboxes
- For libraries making matches on OCLC numbers, what is the best way that you have found to construct a match?
- Some using whole string
- Raegan Wiechert: using numerics-only for OCLC matches
- numerics only works fine, regardless of whether you put it in Incoming record, Existing record, or Both
- Use the "Exactly matches" match criterion
- Raegan mentioned a bug that prevents old records from being updated via import unless a change is first made in quickMARC. Per Christie, it was related to the preceding/succeeding title and how that changed the Instance schema
- A-M check for the old records bug (different from the tags preceding/succeeding titles)
- MARC-MARC matching
- Lotus: Allows for any field in a MARC record except
- Are these needed in Morning Glory?
- Matching for 100-899 fields? Probably not many use cases for these; should work, but not heavily tested. Maybe a constructed title match, e.g. 3,3 which Innovative has as a secondary/confirmation match
- Repeatable fields (e.g. 024, 035, 9xx)
- Incoming record: Only first version of the field is considered (NOTE: It's the first field that has the indicator(s) and/or subfield specified in the match profile)
- Ind 1, Ind 2, Subfield taken into account (in addition to the data)
- Qualifier also is taken into account
- Does FOLIO need to check all incoming 035s against all 035s in the existing SRS records? Or just the first? ALL
- Wildcards for Ind 1, Ind 2 (repeatable or non-repeatable fields)
- Needed? Still TBD
- Jenn: Tested and was also able to use qualifier as a way to determine which incoming version of the repeatable field was used for matching, regardless of its position in the list of repeated fields
- Christie: 1st 024 is not sufficient for Chicago, and is a blocker, since need MARC/MARC match to update the MARC (protect fields)
- A-M: Add bug: can't add a cat date to the Instance and Update the SRS (protect) in the same job - is it because of the match or having Update Instance and Update SRS MARC actions in the same profile?
- Christie: add a TestRail describing the Chicago shelfready use case
- Jennifer E already has set up a similar one in TestRail:
- OCLC number formats
- (OCoLC)12345
- (OCoLC)ocm12345
- (OCoLC)ocn12345
- ocm12345
- ocn12345
- Can we harmonize/remove duplicates in the 035? Group agreed on the first format.
- Also would need an on-demand script to clean up in SRS
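As a rough illustration of the harmonization discussed above, here is a minimal Python sketch that maps the five listed 035 variants onto the agreed first format, "(OCoLC)12345", and drops the resulting duplicates. The function names `normalize_oclc` and `dedupe_035` are hypothetical, not part of FOLIO, and the sketch assumes that only the prefixes listed in these notes occur; values with any other shape (e.g. vendor numbers) are left untouched.

```python
import re

# Assumption: only the prefix variants listed in the notes need handling.
# A bare number without any prefix is deliberately NOT treated as an OCLC number.
_OCLC = re.compile(r"^(?:\(OCoLC\)(?:ocm|ocn)?|ocm|ocn)(\d+)$")


def normalize_oclc(value: str) -> str:
    """Return the value in "(OCoLC)12345" form, or unchanged if it is not
    one of the recognized OCLC variants."""
    match = _OCLC.match(value.strip())
    if match is None:
        return value
    return "(OCoLC)" + match.group(1)


def dedupe_035(values: list) -> list:
    """Normalize every 035 value and keep the first copy of each, in order."""
    seen = set()
    result = []
    for value in values:
        normalized = normalize_oclc(value)
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


if __name__ == "__main__":
    print(dedupe_035(["(OCoLC)ocm12345", "ocn12345", "ASP12345"]))
```

An on-demand SRS cleanup would need more than this (for example, deciding what to do about leading zeros), so the pattern above is only a starting point.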
- Christie: 035s with other prefixes, e.g. ASP12345 (Alexander Street), but can't guarantee that this is the first 035 in the incoming record
- Use case: two 035s in an incoming record
- (OCoLC)12345
- ASP12345
- Current system: matches all incoming 035s against all existing 035s in a record
- If OCLC 035 matches 3 records and ASP 035 matches 1 record, system stops since it can't narrow to a single record
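The ambiguity described above can be modeled in a few lines of Python. This is only an illustrative model, not FOLIO code: it assumes that a record counts as a hit when any incoming 035 equals any of its 035s, and that the match fails whenever the hits do not narrow to exactly one record. The function name `match_on_035` and the record ids are hypothetical.

```python
from typing import Dict, List, Optional


def match_on_035(incoming_035s: List[str],
                 existing_records: Dict[str, List[str]]) -> Optional[str]:
    """Return the id of the single matching record, or None if zero or
    several records are hit.

    existing_records maps a record id to that record's 035 values.
    """
    wanted = set(incoming_035s)
    hits = [record_id
            for record_id, values in existing_records.items()
            if wanted & set(values)]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    existing = {
        "rec1": ["(OCoLC)12345"],
        "rec2": ["(OCoLC)12345"],
        "rec3": ["(OCoLC)12345", "ASP12345"],
    }
    # The OCLC 035 hits three records, the ASP 035 hits one: no single match.
    print(match_on_035(["(OCoLC)12345", "ASP12345"], existing))
    # Matching on the vendor number alone narrows to one record.
    print(match_on_035(["ASP12345"], existing))
```

Under this model, the vendor 035 would identify the record if it could be selected on its own, which is the situation the indicator and qualifier workarounds discussed in these notes try to reach.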
- Jennifer: uses various indicators for 035 (even though not valid MARC) to distinguish OCLC 035s vs vendor 035s; works well for them, as long as FOLIO does not validate the indicators. A-M confirmed that FOLIO does not validate indicators
- Additional info from A-M/Igor:
Let's pretend that these fields are in an incoming record: (Field Ind1 Ind2 Subfield)
024 _ _ $a 12345
024 1 1 $a 45678
024 1 _ $x 67890
024 2 2 $x 67890
And the fields in the existing SRS record are
024 2 2 $x 67890
024 _ _ $a 12345
024 1 _ $x 13579
024 1 1 $a 45678
024 1 _ $x 67890
I understand that for repeatable fields, FOLIO Lotus only pays attention to the first incoming field, not the rest, but compares to any matching fields in the existing record.
Now - setting up different match profiles, I want to be sure I understand the logic that is in place now:
If the match profile is 024 _ _ $a:
- Matches, because the incoming first 024 looks for an existing 024 with blank indicators and $a and the same value (even though that is the second 024 in the existing record)
If the match profile is 024 1 1 $a:
- Matches, because the first incoming 024 with indicators 11 and $a (which is the second 024 in the incoming file) looks for an existing 024 with indicators 11 and $a and the same value (which is the fourth 024 in the existing record)
If the match profile is 024 1 _ $x:
- Matches, because the first incoming 024 with indicators 1_ and $x (which is the third 024 in the incoming file) looks for an existing 024 with indicators 1_ and $x and the same value (which is the fifth 024 in the existing record)
If the match profile is 024 2 2 $x:
- Matches, because the first incoming 024 with indicators 22 and $x (which is the fourth 024 in the incoming file) looks for an existing 024 with indicators 22 and $x and the same value (which is the first 024 in the existing record)
- However!
Let’s pretend the incoming record looks like this:
024 1 1 $a 12345
024 1 1 $a 45678
And the existing SRS record is
024 1 1 $a 45678
- If the match profile is 024 1 1 $a, SRS does not match, even though “024 1 1 $a 45678” is present in both incoming and existing records.
SRS searches the incoming record from the very beginning for the field specified in the match profile and takes the first occurrence of <024 1 1 $a>. That first occurrence is “024 1 1 $a 12345”, so SRS looks for “024 1 1 $a 12345” in the existing record and can’t find it.
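The two 024 examples above can be replayed with a small Python sketch. It is a simplified model of the behavior described in these notes, not the SRS implementation: `Field`, `matches_first_only`, and `matches_any_occurrence` are hypothetical names, "_" stands for a blank indicator, and qualifiers are ignored. The second function shows, for contrast, what checking all incoming occurrences would give.

```python
from typing import List, NamedTuple, Optional, Tuple


class Field(NamedTuple):
    tag: str
    ind1: str      # "_" stands for a blank indicator
    ind2: str
    subfield: str
    value: str


Profile = Tuple[str, str, str, str]  # (tag, ind1, ind2, subfield)


def first_incoming(profile: Profile, incoming: List[Field]) -> Optional[Field]:
    """Return the first incoming field with the profile's tag, indicators and subfield."""
    for field in incoming:
        if field[:4] == profile:
            return field
    return None


def matches_first_only(profile: Profile, incoming: List[Field],
                       existing: List[Field]) -> bool:
    """Lotus behavior as described in the notes: only the first qualifying
    incoming field is compared, but against every existing field."""
    candidate = first_incoming(profile, incoming)
    return candidate is not None and candidate in existing


def matches_any_occurrence(profile: Profile, incoming: List[Field],
                           existing: List[Field]) -> bool:
    """Contrast case: every qualifying incoming field is compared."""
    return any(f[:4] == profile and f in existing for f in incoming)


if __name__ == "__main__":
    incoming = [Field("024", "1", "1", "a", "12345"),
                Field("024", "1", "1", "a", "45678")]
    existing = [Field("024", "1", "1", "a", "45678")]
    profile = ("024", "1", "1", "a")
    print(matches_first_only(profile, incoming, existing))
    print(matches_any_occurrence(profile, incoming, existing))
```

In the first example all four match profiles succeed under either function; the functions differ only when an earlier incoming occurrence with the same indicators and subfield carries a value that is absent from the existing record.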