Batch Importer (Bib/Acq) (UXPROD-47)

[MODSOURMAN-675] Data Import handles repeated 020 $a:s in an unexpected manner when creating Instance Identifiers Created: 17/Jan/22  Updated: 04/Apr/23  Resolved: 09/Feb/22

Status: Closed
Project: mod-source-record-manager
Components: None
Affects versions: None
Fix versions: 3.3.0
Parent: Batch Importer (Bib/Acq)

Type: Bug Priority: P2
Reporter: Theodor Tolstoy (One-Group.se) Assignee: Khamidulla Abdulkhakimov
Resolution: Done Votes: 0
Labels: data-import, epam-folijet, folijet-support, has-testrail, sprint-133, support
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: File 024 028 test.mrc     File ISBN default refinement.mrc     File New identifiers.mrc     PNG File image-2022-01-17-15-06-25-364.png     PNG File image-2022-01-17-15-22-26-203.png     PNG File image-2022-01-17-15-24-15-238.png     PNG File image-2022-01-17-15-25-11-626.png    
Issue links:
Defines
defines UXPROD-3463 NFR: Data Import R1 2022 Lotus Suppor... Closed
Relates
relates to MODSOURMAN-847 The not-expected handling of repeated... Closed
Sprint:
Story Points: 5
Development Team: Folijet Support
Release: Lotus R1 2022
Affected Institution:
!!!ALL!!!
Epic Link: Batch Importer (Bib/Acq)
RCA Group: Data related (ex. Can be detected with large dataset only)

 Description   

Overview:
Steps to Reproduce:

  1. Log into Bugfest-Kiwi
  2. Import the attached MARC record using the profile "Create instance and holdings AMB"
    The record has the following 020:

Expected Results:
After bugfix

  • If the number is in $a, it is assigned Identifier type ISBN, along with any qualifier information that directly follows it in the same subfield or in $q
  • If the number is in $z, it is assigned Identifier type Invalid ISBN, along with any qualifier information that directly follows it in the same subfield or in $q
  • With this data in the MARC record:
    • 020 $a9780471622673$q(acid-free paper)$a0471725331$q(electronic bk.)
    • 020 $z9780471725336$q(electronic bk.)$z0471725323$q(electronic bk.)
    • 020 $a9780471725329 (electronic bk.)$z0471622672$q(acid-free paper)
  • The following should be created in the Instance
    • ISBN 9780471622673 (acid-free paper)
    • ISBN 0471725331 (electronic bk.)
    • Invalid ISBN 9780471725336 (electronic bk.)
    • Invalid ISBN 0471725323 (electronic bk.)
    • ISBN 9780471725329 (electronic bk.)
    • Invalid ISBN 0471622672 (acid-free paper)
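
The expected pairing above can be sketched in a few lines of Python. This is only an illustration of the desired outcome, not the code used in mod-source-record-manager: the function name `identifiers_from_020`, the `TYPE_BY_SUBFIELD` table, and the representation of a field as a list of (subfield code, value) tuples are all assumptions made for this example.

```python
from typing import List, Tuple

# Identifier type names as they appear in the expected results of this ticket.
TYPE_BY_SUBFIELD = {"a": "ISBN", "z": "Invalid ISBN"}


def identifiers_from_020(subfields: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return one (identifier type, value) pair per $a or $z of a single 020 field.

    A $q qualifier is attached to the closest preceding $a or $z.
    A qualifier that already sits inside $a or $z stays part of the value.
    """
    result: List[List[str]] = []
    for code, value in subfields:
        value = value.strip()
        if code in TYPE_BY_SUBFIELD:
            # Every $a / $z starts a new identifier, even when it is repeated.
            result.append([TYPE_BY_SUBFIELD[code], value])
        elif code == "q" and result:
            # The qualifier belongs to the identifier directly before it.
            result[-1][1] += " " + value
        # Other subfields, and a $q with nothing before it, are ignored here.
    return [(kind, value) for kind, value in result]


# The three 020 fields from the example data above.
field_1 = [("a", "9780471622673"), ("q", "(acid-free paper)"),
           ("a", "0471725331"), ("q", "(electronic bk.)")]
field_2 = [("z", "9780471725336"), ("q", "(electronic bk.)"),
           ("z", "0471725323"), ("q", "(electronic bk.)")]
field_3 = [("a", "9780471725329 (electronic bk.)"),
           ("z", "0471622672"), ("q", "(acid-free paper)")]

for field in (field_1, field_2, field_3):
    for kind, value in identifiers_from_020(field):
        print(kind, value)
```

Run against the three fields, the sketch prints the six identifiers listed above, in the same order.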

Based on the Default mapping rules, one of the following should have happened:

  • The record fails since 020$a cannot be repeated according to MARC21
  • The repeated (second) 020$a is discarded
  • Data migration follows the default mapping rules in the tenant, and concatenates the identifiers like so: "0870990004 (v. 1) 0870990020 (v. 2)"
    (but note that none of these is the desired behaviour)
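
For contrast, the concatenation described in the third bullet can be imitated as follows. The subfield layout of the original record is not shown in this ticket, so the input below (two $a, each followed by a $q) is an assumption, as is the helper name `concatenated_identifier`.

```python
from typing import List, Tuple


def concatenated_identifier(subfields: List[Tuple[str, str]]) -> str:
    """Join all $a and $q values of one 020 field into a single identifier value."""
    return " ".join(value.strip() for code, value in subfields if code in ("a", "q"))


# Assumed subfield layout; only the resulting string is quoted in the ticket.
field = [("a", "0870990004"), ("q", "(v. 1)"), ("a", "0870990020"), ("q", "(v. 2)")]
print(concatenated_identifier(field))
```

The single resulting value holds two ISBNs, which is why none of the three outcomes is what the reporter wants.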

Actual Results:

Additional Information:
There is likely A VERY SIMPLE solution to this. Just like with the other identifiers in the default mapping rules, add the entityPerRepeatedSubfield flag to the entity mappings for 020s, for both 020 $a and 020 $z.
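
A rough sketch of that change, written as a Python helper over a rules dictionary. Only the flag name entityPerRepeatedSubfield comes from this ticket. The shape of the rules (a tag mapped to a list of mappings, each with an `entity` list holding `target` and `subfield` keys) is a simplified assumption, and `with_entity_per_repeated_subfield` is a hypothetical name.

```python
import copy
from typing import Any, Dict, List

Rules = Dict[str, List[Dict[str, Any]]]


def with_entity_per_repeated_subfield(rules: Rules, tag: str = "020") -> Rules:
    """Return a copy of `rules` with the flag set on every entity mapping of `tag`."""
    updated = copy.deepcopy(rules)
    for mapping in updated.get(tag, []):
        if "entity" in mapping:
            mapping["entityPerRepeatedSubfield"] = True
    return updated


# Simplified, assumed shape: one entity mapping for $a and one for $z.
example_rules: Rules = {
    "020": [
        {"entity": [{"target": "identifiers.value", "subfield": ["a", "q"]}]},
        {"entity": [{"target": "identifiers.value", "subfield": ["z", "q"]}]},
    ]
}

patched = with_entity_per_repeated_subfield(example_rules)
```

The helper copies the rules rather than editing them in place, so a tenant's current rules can be compared with the patched version before anything is uploaded.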

The EBSCO ICs just discovered this behavior in their tools. The current set of Default rules forces us to edit all rules to avoid the above-described behavior.

URL:
These are links to the results of the import in Bugfest-Kiwi. Similar behaviour is in Bugfest-Juniper as well.
https://bugfest-kiwi.folio.ebsco.com/data-import/log/8c150a5a-e5be-4f24-bbb2-d2b00a548f81/f215d190-69f2-4e42-a50d-5c17bad29fb2
https://bugfest-kiwi.folio.ebsco.com/inventory/view/f215d190-69f2-4e42-a50d-5c17bad29fb2?qindex=id&query=f215d190-69f2-4e42-a50d-5c17bad29fb2&sort=title

The Kiwi mapping rules for 022 have the flag set:

while they lack it for 020 $a:



 Comments   
Comment by Anya [ 18/Jan/22 ]

Ann-Marie Breaux - not sure which data import module to re-assign this to. Theodor Tolstoy (One-Group.se) and I are guessing it should be mod-source-record-manager. I have left it in SUP for now.

cc Kateryna Senchenko

Comment by Ann-Marie Breaux (Inactive) [ 18/Jan/22 ]

Hi Anya Thanks for the comment. I've updated it.

Theodor Tolstoy (One-Group.se) Would the desired behavior be to put each 020$a (and any qualifier for it) in its own Identifier field? So for the above example:

ISBN 0870990004 (v. 1)
ISBN 0870990020 (v. 2)

And then also do the same with 020 $z if repeated in the same 020 field?

Comment by Theodor Tolstoy (One-Group.se) [ 18/Jan/22 ]

Ann-Marie Breaux Yes. And I believe the rules allow this given you add the above-mentioned entityPerRepeatedSubfield flag into the rules for 020. Most Identifier mappings already use them.

But it would be good if it was tested, of course. And I do not want to tamper with the bugfest rules....

Comment by Ann-Marie Breaux (Inactive) [ 18/Jan/22 ]

Hi Theodor Tolstoy (One-Group.se) Please don't tamper with Bugfest!

Comment by Ann-Marie Breaux (Inactive) [ 18/Jan/22 ]

Devs - please use the attached file ISBN default refinement.mrc to test

It has

  • an 020 with 2 $as
  • an 020 with 2 $zs
  • an 020 with a $a and a $z

Thank you!

Comment by Khamidulla Abdulkhakimov [ 31/Jan/22 ]

Hello Ann-Marie Breaux. I've tested on folio-snapshot and got these results:

  • Invalid ISBN 0471622672 (acid-free paper)
  • Invalid ISBN 0471725323 (electronic bk.)
  • Invalid ISBN 9780471725336 (electronic bk.)
  • ISBN 0471725331 (electronic bk.)
  • ISBN 9780471622673 (acid-free paper)
  • ISBN 9780471725329 (electronic bk.) (acid-free paper)

Could you confirm that this behavior is appropriate?

Comment by Ann-Marie Breaux (Inactive) [ 31/Jan/22 ]

Hi Khamidulla Abdulkhakimov Close! I added the expected results after Bugfix in the description. All in the example are correct except for the last one. I'll move it back to In progress.
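
The one mismatch can be made explicit by comparing the folio-snapshot results from the previous comment with the expected results in the description. This is just a set comparison over the values quoted in this ticket; the variable names are illustrative.

```python
# Expected results, copied from the description.
expected = {
    ("ISBN", "9780471622673 (acid-free paper)"),
    ("ISBN", "0471725331 (electronic bk.)"),
    ("Invalid ISBN", "9780471725336 (electronic bk.)"),
    ("Invalid ISBN", "0471725323 (electronic bk.)"),
    ("ISBN", "9780471725329 (electronic bk.)"),
    ("Invalid ISBN", "0471622672 (acid-free paper)"),
}

# Results reported from folio-snapshot on 31/Jan/22.
actual = {
    ("Invalid ISBN", "0471622672 (acid-free paper)"),
    ("Invalid ISBN", "0471725323 (electronic bk.)"),
    ("Invalid ISBN", "9780471725336 (electronic bk.)"),
    ("ISBN", "0471725331 (electronic bk.)"),
    ("ISBN", "9780471622673 (acid-free paper)"),
    ("ISBN", "9780471725329 (electronic bk.) (acid-free paper)"),
}

unexpected = sorted(actual - expected)  # created, but not expected
missing = sorted(expected - actual)     # expected, but not created
print("unexpected:", unexpected)
print("missing:", missing)
```

The only difference is the ISBN from the third 020 field: the $q that follows the $z was also appended to the $a value.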

Comment by Khamidulla Abdulkhakimov [ 04/Feb/22 ]

Hello Ann-Marie Breaux. I compared the results with the expected results from the description. Everything looks great.

Can you help pick up some test cases related to fields "020" and "024"? The logic of the parameter concat_subfields_by_name has been changed and I want to make sure that the changes affect nothing. Thank you.

cc: Kateryna Senchenko, Oleksandr Bashtynskyi

 

Comment by Ann-Marie Breaux (Inactive) [ 07/Feb/22 ]

Hi Khamidulla Abdulkhakimov I updated the ISBN default refinement.mrc file to have a couple more 020 examples, and added the 024 028 test.mrc file. Please let me know if you see any issues or if all looks good. Thank you!

Comment by Khamidulla Abdulkhakimov [ 08/Feb/22 ]

Hello Ann-Marie Breaux, No errors were noted. Thank you, moving to in review.

Comment by Ann-Marie Breaux (Inactive) [ 09/Feb/22 ]

Hi Khamidulla Abdulkhakimov Tested on folio-snapshot-load, and all looks great. Thank you!

Assigned RCA of Data-related, since the requirement for this is based on incorrectly fielded/subfielded MARC data (e.g. an 020 field cannot have 2 $a subfields, but many legacy records do, so the mapping rules need to account for it).
