Preventing bulk editing of corrupted MARC instances through error handling
Mikita Siadykh, March 11, 2025 at 11:56 AM
@Magda Zacharska, moving this to Closed (we need to close the sprint) based on @Tatsiana Hryhoryeva's verification. Please do a post-review.
Tatsiana Hryhoryeva, March 10, 2025 at 3:19 PM
Hi @Magda Zacharska, @Oleksandr Bozhko
Checked on the https://folio-etesting-snapshot-diku.ci.folio.org/ environment: the regression issue mentioned in the previous comment is fixed.
Upload testing also continued, and 100K Holdings and Items were successfully uploaded to the Central tenant (including a mix of identifiers from 4 member tenants).
The long upload mentioned in the previous comment happened because all 100K Holdings went to Errors as “No match found”, so the job took ~6h.
Tatsiana Hryhoryeva, March 7, 2025 at 2:53 PM
Hi @Magda Zacharska, @Oleksandr Bozhko
Errors #1 and #2 from the ticket were verified on the https://folio-etesting-snapshot-diku.ci.folio.org/ and https://folio-etesting-sprint-fs09000000.ci.folio.org/ environments and work as expected. Error #3 was not implemented because it does not appear to be a real case (see the details in Sasha’s first comment).
In addition, this ticket introduced a regression with an incorrect job status; it needs to be verified once it is fixed.
This ticket may also affect the upload process, so some upload testing was done on the Eureka sprint testing environment; find the results in the table:
The issue (upload too slow, 20K in 1h) arose when uploading 100K Holdings in the Central tenant. A separate ticket was opened to set appropriate values for some variables on the environment: https://folio-org.atlassian.net/browse/RANCHER-2168
Scenario 1 - Instance has source MARC but the underlying SRS record has status OLD
- Upload via file with identifiers
- Query
Scenario 2 - Instance has source MARC but the underlying SRS record is missing
- Upload via file with identifiers
- Query
Scenario 3 - Instance has source MARC but is associated with more than one active SRS record
- Upload via file with identifiers
- Query
Scenario 4 - Make sure affected records are excluded from the bulk edit job and are not part of the .csv and .mrc files, but are reported as errors and are available in the .csv file with errors
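The Scenario 4 check can be scripted against the downloaded files. The following is a minimal sketch, not part of the ticket: the helper names `ids_in_csv` and `check_exclusion` are hypothetical, and the code deliberately makes no assumption about column names or headers in the matched-records and errors files; it only looks for a UUID as a cell value anywhere in the CSV.

```python
import csv
import io


def ids_in_csv(csv_text: str, ids: set[str]) -> set[str]:
    """Return the subset of `ids` that appears as a cell value anywhere in the CSV."""
    found = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            value = cell.strip()
            if value in ids:
                found.add(value)
    return found


def check_exclusion(matched_csv: str, errors_csv: str, affected: set[str]) -> bool:
    """True when every affected UUID is absent from the matched-records CSV
    and present in the errors CSV."""
    absent_from_matched = not ids_in_csv(matched_csv, affected)
    present_in_errors = ids_in_csv(errors_csv, affected) == affected
    return absent_from_matched and present_in_errors
```

Pass the text of the two downloaded CSV files and the set of instance UUIDs that were deliberately corrupted; a `False` result means at least one affected instance either leaked into the matched records or is missing from the errors file.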
Oleksandr Bozhko, March 6, 2025 at 9:12 AM
Verified on the https://folio-edev-thunderjet-diku.ci.folio.org environment:
Make a second SRS record for one of the MARC instances (for example, 9f9ef0b9-af30-4043-8f7d-16aa75c62b63). This can be done via Postman:
- make a GET request to /source-storage/source-records?instanceId=9f9ef0b9-af30-4043-8f7d-16aa75c62b63&deleted=true and copy the recordId value 508dbbb5-df79-41f9-871f-c57768976cf1 (this is the SRS ID of the 9f9ef0b9-af30-4043-8f7d-16aa75c62b63 instance);
- make a GET request to /source-storage/records/508dbbb5-df79-41f9-871f-c57768976cf1 and copy the response;
- make a POST request to /source-storage/records with the previously copied response as the body (also remove unrecognized symbols in the ‘content’ field).
In the database, update the SRS record of another instance (a791fb25-e284-45ff-b5eb-aa94debb0ea7), changing its state from ACTIVE to OLD.
Go to Bulk edit → Instances → Instance UUIDs → upload a file with instance UUIDs that includes 9f9ef0b9-af30-4043-8f7d-16aa75c62b63 and a791fb25-e284-45ff-b5eb-aa94debb0ea7.
Go to Actions → Download Matched records (CSV).
Go to Actions → Download Errors (CSV).
Both instance IDs (9f9ef0b9-af30-4043-8f7d-16aa75c62b63 and a791fb25-e284-45ff-b5eb-aa94debb0ea7) are absent from the matched records and present only in the errors file.
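For anyone repeating this setup without Postman, the three requests can be sketched in Python. This is only an illustration under stated assumptions: the endpoint paths come from the steps above, but the helper names (`okapi_request`, `send_json`, `duplicate_srs_record`), the `x-okapi-tenant` / `x-okapi-token` header usage, and the `sourceRecords` key in the search response are assumptions that should be checked against the target environment. The copied record may also need manual cleanup of the ‘content’ field before the POST, as noted above.

```python
import json
import urllib.parse
import urllib.request


def okapi_request(base_url, path, tenant, token, params=None, body=None):
    """Build (but do not send) a request; POST when a body is given, GET otherwise."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        "x-okapi-tenant": tenant,  # assumed tenant header
        "x-okapi-token": token,    # assumed auth header
        "Accept": "application/json",
    }
    if data is not None:
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send_json(request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def duplicate_srs_record(base_url, tenant, token, instance_id, send=send_json):
    """Find the SRS record of an instance, fetch it, and post a copy of it."""
    found = send(okapi_request(
        base_url, "/source-storage/source-records", tenant, token,
        params={"instanceId": instance_id, "deleted": "true"},
    ))
    # Assumption: matching records are listed under "sourceRecords".
    record_id = found["sourceRecords"][0]["recordId"]
    record = send(okapi_request(
        base_url, f"/source-storage/records/{record_id}", tenant, token,
    ))
    return send(okapi_request(
        base_url, "/source-storage/records", tenant, token, body=record,
    ))
```

The `send` parameter exists so the request sequence can be dry-run with a stub before pointing it at a real environment.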
Magda Zacharska, March 4, 2025 at 4:33 PM
Hi @Oleksandr Bozhko - of course the error message cannot include the SRS UUID; I think it was just an unfortunate copy-and-paste issue. I corrected the Jira description, and the error message should read: “SRS record associated with the instance is missing.” Thank you for pointing it out.
Description
Bulk edit should not allow editing, or attempting to edit, records that are considered bad data. The user should be informed early about existing problems so that the data can be corrected before the next bulk edit job.
Requirements/Scope
The story covers instances with source MARC and underlying SRS records that store the MARC data.
If an instance has source MARC but the underlying SRS record is missing or has status OLD, then the instance cannot be bulk edited and is reported under Errors with the following data:
- Identifier column contains the Instance UUID
- Reason column contains the following error: SRS record associated with the instance is missing.
If an instance UUID is associated with more than one active SRS record, then the instance cannot be bulk edited and is reported under Errors with the following data:
- Identifier column contains the Instance UUID
- Reason column contains the following error: Multiple SRS records are associated with the instance. The following SRS have been identified: <srs-uuid1>, <srs-uuid2>… <srs-uuidn>.
If an instance UUID is associated with one active SRS record but that SRS record is associated with more than one instance UUID, then the instance cannot be bulk edited and is reported under Errors with the following data:
- Identifier column contains the Instance UUID
- Reason column contains the following error: SRS record <srs-uuid> is associated with multiple instances: <instance-uuid1>, <instance-uuid2>… <instance-uuidn>.
The same behavior applies to bulk edit jobs triggered by a list of identifiers (.csv file) or by a query.
The affected records are excluded from the bulk edit job and are not part of the .csv and .mrc files, but are reported as errors and are available in the .csv file with errors.
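The first two rules above can be sketched as a small classification step. This is an illustration only, not the module's implementation: `SrsRecord`, `srs_error` and `partition` are hypothetical names, and the sketch assumes the SRS records for the requested instances have already been fetched into memory. The third rule (one SRS record linked to several instances) is left out, since the verification comments state it was not implemented.

```python
from dataclasses import dataclass
from typing import Optional

MISSING = "SRS record associated with the instance is missing."
MULTIPLE = (
    "Multiple SRS records are associated with the instance. "
    "The following SRS have been identified: "
)


@dataclass(frozen=True)
class SrsRecord:
    id: str
    instance_id: str
    state: str  # e.g. "ACTIVE" or "OLD"


def srs_error(instance_id: str, records: list[SrsRecord]) -> Optional[str]:
    """Return the Reason text for an instance, or None if it can be bulk edited."""
    active = [
        r.id for r in records
        if r.instance_id == instance_id and r.state == "ACTIVE"
    ]
    if not active:
        # Covers both "no SRS record at all" and "only OLD records".
        return MISSING
    if len(active) > 1:
        return MULTIPLE + ", ".join(active) + "."
    return None


def partition(instance_ids: list[str], records: list[SrsRecord]):
    """Split instances into editable IDs and (Identifier, Reason) error rows."""
    editable, errors = [], []
    for instance_id in instance_ids:
        reason = srs_error(instance_id, records)
        if reason is None:
            editable.append(instance_id)
        else:
            errors.append((instance_id, reason))
    return editable, errors
```

Only the `editable` list would feed the .csv and .mrc outputs; the `errors` rows map directly onto the Identifier and Reason columns of the errors file.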
Out of scope:
Malformed SRS records will be handled in https://folio-org.atlassian.net/browse/MODEXPW-565.
Acceptance criteria:
All requirements are met
Karate and unit tests are added or updated, and they pass.