Batch Importer (Bib/Acq)
(UXPROD-47)
|
|
| Status: | In Progress |
| Project: | UX Product |
| Components: | None |
| Affects versions: | None |
| Fix versions: | Quesnelia (R1 2024) | Parent: | Batch Importer (Bib/Acq) |
| Type: | New Feature | Priority: | P2 |
| Reporter: | Ann-Marie Breaux (Inactive) | Assignee: | Ryan Taylor |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | LC-priority2, arlef-di, data-import, discuss-with-subgroup, loc, match-details |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original estimate: | Not Specified |
| Attachments: |
|
| Issue links: |
|
| Release: | Quesnelia (R1 2024) |
| Epic Link: | Batch Importer (Bib/Acq) |
| Front End Estimate: | Small < 3 days |
| Front End Estimator: | Olamide Kolawole |
| Front-End Confidence factor: | 60% |
| Back End Estimate: | XXL < 30 days |
| Back End Estimator: | Olamide Kolawole |
| Back-End Confidence factor: | 80% |
| Development Team: | Folijet |
| Kiwi Planning Points (DO NOT CHANGE): | 62 |
| PO Rank: | 124 |
| Rank: Chalmers (Impl Aut 2019): | R3 |
| Rank: Chicago (MVP Sum 2020): | R2 |
| Rank: Cornell (Full Sum 2021): | R1 |
| Rank: Duke (Full Sum 2021): | R1 |
| Rank: 5Colleges (Full Jul 2021): | R1 |
| Rank: FLO (MVP Sum 2020): | R2 |
| Rank: GBV (MVP Sum 2020): | R4 |
| Rank: Grand Valley (Full Sum 2021): | R2 |
| Rank: Lehigh (MVP Summer 2020): | R1 |
| Rank: MO State (MVP June 2020): | R1 |
| Rank: TAMU (MVP Jan 2021): | R2 |
| Rank: U of AL (MVP Oct 2020): | R1 |
| Score: | 15 |
| Showstopper for Summer 2021 Implementers?: | No |
| Showstopper December 11 Meeting Summary: | Not a 'showstopper', so not discussed at meeting. |
| Description |
|
Current situation or problem: We want to ensure that MARC-MARC matching works properly for repeatable and non-repeatable fields, especially 0XX/9XX fields, and that they can pair well with Inventory submatches.
In scope:
Out of scope:
Use case(s):
Proposed solution/stories:
Links to additional info:
Questions:
|
| Comments |
| Comment by Ann-Marie Breaux (Inactive) [ 20/Nov/20 ] |
|
Convo with Mark Veksler, Hkaplanian, Magda Zacharska, VBar, Taras Spashchenko about Cornell SRS queries (
|
| Comment by Ann-Marie Breaux (Inactive) [ 24/Nov/20 ] |
|
Hi Taras Spashchenko Please create the rest of the stories for the endpoints (and maybe paging, to handle large data sets) by the end of this week, so that Folijet can include them in sprint 103. Thank you! |
| Comment by Ann-Marie Breaux (Inactive) [ 02/Dec/20 ] |
|
Hi Taras Spashchenko Just checking in on this. Concorde has follow-on work that happens after this work. Do you think you will be able to finalize the Folijet stories this week? |
| Comment by Taras Spashchenko [ 03/Dec/20 ] |
|
Hello Ann-Marie Breaux, the stories will be ready today. |
| Comment by Ann-Marie Breaux (Inactive) [ 03/Dec/20 ] |
|
Sounds good - thanks, Taras Spashchenko |
| Comment by Taras Spashchenko [ 03/Dec/20 ] |
|
I added tech stories for MARC search functionality https://folio-org.atlassian.net/browse/MODSOURCE-221 |
| Comment by Ann-Marie Breaux (Inactive) [ 08/Dec/20 ] |
|
Discussed with Magda Zacharska, and we moved the prep stories from this feature to
|
| Comment by Lisa McColl [ 04/Feb/21 ] |
|
Because this is not available, the workaround is very time-consuming. We've been doing this for five months at this point. I think that without this feature, an institution larger than ours could not sustain the workflow we've had to do outside of FOLIO in order to update our records. I see this is Blocked awaiting a dependency. Is there any estimate of which version we can expect this in? I would love to see this go up to a P1, to be honest. It would be interesting to get community feedback on it. |
| Comment by Jenn Colt [ 04/Feb/21 ] |
|
Hi Lisa- The UXPROD for the dependency that is blocking this is https://folio-org.atlassian.net/browse/UXPROD-2791 We are trying to get 2791 done for Iris (although it is at risk at this point), but after that it would still take time for Ann-Marie's developers to take advantage of it. It would be good to uprank 2791 if it is important to you; right now it is viewed as just a Cornell issue 🙂 That said, I am currently doing our 035$a matches with MARC to instance and then using a qualifier, and that seems like it will work for many of mine (it doesn't help with the $z, I realize). |
| Comment by Lisa McColl [ 04/Feb/21 ] |
|
Thank you Jenn! I know you mentioned this in Slack to me too, so thank you for your time in both places. I added myself as a watcher to
An 035$a to OCLC identifier match would leave us with a lot of duplicate records in FOLIO. Just being able to query the SRS with the LDP would save a little time, and make the results better when we bring in new records. I'm pretty eager for all the above and attached SRS functionality to get into place. Right now when I get a new file from WorldShare, for example, I query by URL, OCLC number, and ISBN to get any possible match out of FOLIO. I perform the matches between what I find in FOLIO and the new file in OpenRefine. That leads to two files to load: "new to folio" and "merge with existing folio". For the "merge" file I just match on the instance hrid at that point. It's very time-consuming and hard to spread the workload around since it's so weirdly specialized. |
| Comment by Ann-Marie Breaux (Inactive) [ 18/May/21 ] |
|
Discussed with Jenn Colt and reviewed the stories on
|
| Comment by Jenn Colt [ 23/Mar/23 ] |
|
ISBN case - I have a set of records with multiple ISBNs. The matches are not working if there is more than one ISBN on the incoming record that qualifies for matching. I can add a 978 qualifier, which helps, but these records have multiple 978 ISBNs in some cases and those matches do not work. My expectation was that if any incoming ISBN matches an existing ISBN, the match would be positive. |
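The expectation stated in the comment above ("any incoming ISBN matches an existing ISBN") can be sketched in Python as a set intersection. This is only an illustration of the expected behavior, not how Data Import is implemented. The function names, the `begins_with` qualifier parameter, and the normalization rule are all assumptions made for the example.

```python
from typing import Iterable, Optional, Set


def normalize_isbn(value: str) -> str:
    """Take the first token of the value and keep only digits and 'X'.

    Simplification (assumption): trailing qualifiers such as "(pbk.)" are
    dropped by looking only at the first whitespace-delimited token.
    """
    parts = value.split()
    token = parts[0] if parts else ""
    return "".join(ch for ch in token.upper() if ch.isdigit() or ch == "X")


def _normalized_set(values: Iterable[str]) -> Set[str]:
    """Normalize every value and discard the ones that end up empty."""
    normalized = (normalize_isbn(v) for v in values)
    return {n for n in normalized if n}


def any_isbn_matches(
    incoming: Iterable[str],
    existing: Iterable[str],
    begins_with: Optional[str] = None,
) -> bool:
    """Return True if at least one qualifying incoming ISBN equals an existing ISBN."""
    incoming_set = _normalized_set(incoming)
    if begins_with is not None:
        # Hypothetical stand-in for a "begins with 978" qualifier.
        incoming_set = {v for v in incoming_set if v.startswith(begins_with)}
    return bool(incoming_set & _normalized_set(existing))
```

With this rule, an incoming record carrying several 978 ISBNs still matches as long as one of them is present on the existing record.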
| Comment by Jenn Colt [ 12/Apr/23 ] |
|
Ann-Marie Breaux I notice this is not scheduled. Is there any way to consider it for Poppy? We are doing so much more MARC to MARC now because of the field protection change that this is becoming more of a problem. |
| Comment by Jenn Colt [ 24/Apr/23 ] |
|
In Slack it was asked why this was a problem. I answered:
If I am incorrect about needing to match on MARC to update MARC that would be amazing to know, but as far as I know this is the situation that we have created. We have made it so that we have to update the entity that does not have adequate matching capability when we need to override field protections. And we have to do that all the time when loading vendor electronic resource records. |
| Comment by Jennifer Eustis [ 05/Jan/24 ] |
|
Hi Ryan Taylor, here is one of our more frequent use cases for matching. This example can be done from an incoming MARC field to an existing instance system control number field, or to an existing SRS MARC field 035 with first indicator 9 and second indicator blank.
Use case description: 5C has a shared bib environment. We also buy eResources that come in eResource packages, and a given eResource can be acquired by one or more of the schools in the 5C consortium. eResources that we need to track or match on (for example, to identify everything in the Safari package) carry what we call a container code in MARC field 035 9\. This is a unique alphanumeric code that we create from the school's prefix (AC, HC, MH, SC, UM), or FC or 5C if it is for more than one of the schools, followed by the OCLC number, the document identifier, a vendor number, or any other unique value from the record. For example, in the MARC file in the zip folder you'll see ACYBP (Amherst College Yankee Book Peddler), FCDUGT (consortium-loaded De Gruyter package for 4 schools), and UMDEGT (the De Gruyter package for UMass). For the YBP records, this used to be 035 9\$aumypb or 035 9\$aacybpOCLCNumber. From the file:
=035 9\$a(ACYBP)1373347480
We need to make an exact match on the specific container code. In this use case, the incoming file is for UMass. We need to check for an exact match from incoming MARC 035 9\ to EITHER the instance system control number OR the existing SRS 035 9\. If there is a match, no records are created. If there is no match, an instance (plus SRS), holdings, and items are created.
Requirements: The container code must be unique and in MARC field 035 9\, which is mapped to the instance system control number.
Location of container code in MARC file: This can be the 1st or 2nd of the 035 9\ fields. There can be many different container codes. My existing MARC SRS file reflects this: I included several examples where the matching field has a different ordinality.
Job Profile with MARC-to-MARC match: Exact match of incoming 035 9\$a to existing SRS 035 9\$a.
Match: take no further action.
No match: create instance (and SRS), create holdings, create item, and modify the incoming MARC to remove 856, 876, 852, 877. (These fields are used to create holdings and items: 856$u holdings electronic access URI, 856$y link text, 856$z public note, 876$a barcode, 852$l permanent loan type (item), 877 item material type, 852$h holdings call number, 852$t holdings call number type, 852$l holdings permanent location code. Holdings electronic access relationship is set to resource in the mapping profile, and item status is set to available in the mapping profile.)
Please note that 5C creates item records with fake barcodes for eResources. Be wary when reloading these records - will need to change barcode or remove barcode from testing.
In the folder: incoming file is for UM De Gruyter eResources with the container code in the 035 9. This should be the 1st of the 035's but not necessarily. This file also has the ISBN in another 035 9. Existing file is from our 5C prod and has our instance HRIDs and uuids in the file. To load them remember to remove the 001 and 999.
Expected behavior: Both the MARC-to-MARC match and the MARC-to-instance system control number match work with repeating fields, regardless of whether the field being matched is first in ordinality.
|
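The expected behavior in the comment above can be sketched as follows: collect the $a value of every 035 field with first indicator 9 and blank second indicator, regardless of its position, then compare incoming codes with existing ones. The sketch follows the use case description (a match creates nothing, a non-match creates records). It assumes records are given as lines in the mnemonic text layout quoted in the comment (`=035 9\$a...`). The function names and the sample codes other than `(ACYBP)1373347480` are hypothetical, and none of this reflects the actual Data Import code.

```python
import re
from typing import List

# Matches a line such as "=035  9\$a(ACYBP)1373347480":
# tag 035, first indicator 9, second indicator blank (written as a backslash).
FIELD_035_9 = re.compile(r"^=035\s+9\\(?P<subfields>\$.*)$")


def container_codes(record_lines: List[str]) -> List[str]:
    """Collect every $a value from every 035 field with indicators 9 and blank."""
    codes = []
    for line in record_lines:
        found = FIELD_035_9.match(line.strip())
        if not found:
            continue
        # Splitting on "$" yields an empty first piece, then one piece per subfield.
        for subfield in found.group("subfields").split("$")[1:]:
            if subfield[:1] == "a" and subfield[1:].strip():
                codes.append(subfield[1:].strip())
    return codes


def matching_records(incoming: List[str], existing_records: List[List[str]]) -> List[int]:
    """Return the indexes of existing records that share any container code."""
    wanted = set(container_codes(incoming))
    return [
        index
        for index, record in enumerate(existing_records)
        if wanted.intersection(container_codes(record))
    ]


def action_for(incoming: List[str], existing_records: List[List[str]]) -> str:
    """A match means nothing is created; a non-match means records are created."""
    if matching_records(incoming, existing_records):
        return "no action"
    return "create instance, holdings, item"
```

Because `container_codes` looks at every 035 9\ field, an existing record whose container code sits in the second or later 035 still matches.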
| Comment by Jennifer Eustis [ 19/Jan/24 ] |
|
Ryan Taylor I realized that I got a little over rambunctious with repeatable Marc fields. The incoming file should have only 1 match point or 1 035 9\$a field. I remember speaking to Ann-Marie about this. If I remember correctly, there is logic that only the 1st of the repeating fields is considered. I'm not sure if this is the case or if you can verify this. Otherwise, I'm fine with specifying that there be only 1 match point in the incoming file. Though this still means we have to consider that this one match point will match to one of many marc srs fields in the existing srs. |
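
The difference between the two behaviors discussed above (only the first repeating field is considered, versus any occurrence) can be illustrated with a small Python sketch. Whether Data Import actually behaves like the first function is the open question in the comment. Both function names and the sample values are hypothetical.

```python
from typing import List


def matches_first_only(incoming_value: str, existing_values: List[str]) -> bool:
    """Compare the single incoming match point with the first existing field only."""
    return bool(existing_values) and existing_values[0] == incoming_value


def matches_any(incoming_value: str, existing_values: List[str]) -> bool:
    """Compare the single incoming match point with every existing field."""
    return incoming_value in existing_values
```

With one incoming match point and an existing record whose fields are `["(FCDUGT)111", "(UMDEGT)222"]`, an incoming value of `(UMDEGT)222` matches under `matches_any` but not under `matches_first_only`.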