Batch Importer (Bib/Acq) (UXPROD-47)

[UXPROD-1383] Ability to roll back a load if problems Created: 03/Dec/18  Updated: 28/Dec/23

Status: Draft
Project: UX Product
Components: None
Affects versions: None
Fix versions: Trillium (R1 2025)
Parent: Batch Importer (Bib/Acq)

Type: New Feature Priority: P3
Reporter: Ann-Marie Breaux (Inactive) Assignee: Ryan Taylor
Resolution: Unresolved Votes: 0
Labels: LC2, cataloging, crossrmapps, data-import, delete_record_functionality, delimited_files, loc, marcimport, post-v1, round_iv
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Defines
defines UXPROD-47 Batch Importer (Bib/Acq) Analysis Complete
is defined by MODSOURCE-68 Create endpoint and CLI process to re... Closed
is defined by MODSOURCE-529 Data import: Instances were not delet... Closed
Relates
relates to MODDATAIMP-702 Updating "MARC" record by matching pr... Closed
relates to MODSOURCE-531 Can't update "MARC" record, which was... Closed
relates to MODDATAIMP-428 Cancelled data import jobs sometimes ... Closed
Release: Trillium (R1 2025)
Epic Link: Batch Importer (Bib/Acq)
Analysis Estimate: Large < 10 days
Analysis Estimator: Niels Erik Nielsen
Front End Estimate: Medium < 5 days
Front End Estimator: Niels Erik Nielsen
Front-End Confidence factor: Low
Back End Estimate: XXL < 30 days
Back End Estimator: Niels Erik Nielsen
Estimation Notes and Assumptions: I think these are two distinct requirements; a test load and a rollback of a real load are very different things.

It's not clear if any analysis work has been done already, but a sound estimate would definitely need such an analysis, including thorough considerations about possible technical solutions.

Given the time constraint, the immediate estimates here are not based on any such serious analysis of the two requirements, so confidence is low.
Development Team: Folijet
Kiwi Planning Points (DO NOT CHANGE): 2
PO Rank: 68
PO Ranking Note: preview (UXPROD-669) helps keep this lower priority; aim for Q1 2020; per MM SIG convo, would definitely need batch delete if this is not yet available
Rank: Chalmers (Impl Aut 2019): R4
Rank: Chicago (MVP Sum 2020): R4
Rank: Cornell (Full Sum 2021): R1
Rank: Duke (Full Sum 2021): R1
Rank: 5Colleges (Full Jul 2021): R3
Rank: FLO (MVP Sum 2020): R4
Rank: GBV (MVP Sum 2020): R4
Rank: Grand Valley (Full Sum 2021): R2
Rank: hbz (TBD): R4
Rank: Hungary (MVP End 2020): R1
Rank: Lehigh (MVP Summer 2020): R4
Rank: Leipzig (Full TBD): R4
Rank: Leipzig (ERM Aut 2019): R4
Rank: MO State (MVP June 2020): R1
Rank: TAMU (MVP Jan 2021): R1
Rank: U of AL (MVP Oct 2020): R4

 Description   

Provide the ability to completely roll back a load and its resulting actions in SRS, Inventory, MARCcat, Orders, and Invoices if problems are detected.

See the conversation in the comments below.

Architectural wiki page: https://folio-org.atlassian.net/wiki/display/~vbar/Data+consistency+options



 Comments   
Comment by Ann-Marie Breaux (Inactive) [ 06/Dec/18 ]

Added this feature as a split from UXPROD-669 (Draft), and revised that feature to be only for test loads

Comment by Ann-Marie Breaux (Inactive) [ 02/Aug/19 ]

From Slack conversation 26 July:

Anne L. Highsmith [1:58 PM]
since I had a problem with the load I did yesterday, in that the process didn't create 999 $i as it should have, I'd like to delete all those records and try again. I have already deleted the instance records. Is there a single endpoint I can use to delete everything that was created in source record storage? there would be records in srs.marc_records, srs.records, srs.raw_records, and srs.snapshots, at least.

Ann-Marie Breaux [1:12 PM]
Hi @here, I'm just getting back from a few days' vacation and catching up. Looks like there is some ugly conditioning in 2.2 that will be simpler in 3.1. Anne L. Highsmith did you get a response on your single endpoint question from yesterday? If not, I'll follow up on it.

Anne L. Highsmith [1:18 PM]
Hi ann-marie. No, didn't get an answer on that single endpoint question; thanks for following up.

Ann-Marie Breaux [11:49 AM]
Anne L. Highsmith and @here: The Data Import developers have confirmed that we do NOT currently have a single delete endpoint to wipe out everything in SRS. It seems like this would be useful for libraries that are testing and loading, so I can write up a story for it. Before I do that, I just want to double-check the requirements:
1. Could folks confirm that this would indeed be helpful?
2. And then the requirement would be to have an endpoint that would
a) allow completely wiping out everything in SRS (srs.marc_records, srs.records, srs.raw_records, and srs.snapshots), and
b) would not affect any instances or any other non-SRS records that may have been created during loading (assume they are deleted via some other method).
Does that sound good? Anything that I've missed?

Anne L. Highsmith [11:54 AM]
I would restate point 2a as something like 'allow completely wiping out everything in SRS associated with a specific load job'. As for 2b, I think you might WANT it to also delete the instances associated with that load – it gives you a complete opportunity to delete a load job and start over.

Jason Root [11:56 AM]
Maybe a flag-able option would be useful?
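[Editor's note] The requirement taking shape in this thread — wipe the SRS rows created by one specific load job, with Jason's "flag-able option" for the associated instances — can be sketched as below. This is purely illustrative: no such endpoint exists at this point in the thread, the in-memory `db` dict stands in for the real database, and the assumption that every SRS row carries a `snapshot_id` linking it to its import job is mine, not stated in the conversation.

```python
# Hypothetical sketch of a per-job SRS rollback. Each "table" is modeled
# as a list of row dicts; the four SRS tables are the ones named above.
# Assumption: every row carries a snapshot_id identifying the load job
# that created it.

SRS_TABLES = ["snapshots", "records", "raw_records", "marc_records"]

def rollback_job(db, job_id, delete_instances=False):
    """Remove everything a single import job created.

    db: dict mapping table name -> list of row dicts
    delete_instances: the "flag-able option" from the thread --
        when True, also remove Inventory instances tied to the job.
    """
    for table in SRS_TABLES:
        db[table] = [row for row in db[table]
                     if row.get("snapshot_id") != job_id]
    if delete_instances:
        db["instances"] = [row for row in db["instances"]
                           if row.get("snapshot_id") != job_id]
    return db
```

Modeling it this way makes the two open questions in the thread concrete: scoping by `job_id` answers "a particular job's worth or all of SRS?", and the `delete_instances` flag captures the 2b disagreement.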

Ann-Marie Breaux [11:57 AM]
Thanks, Anne L. Highsmith. We ultimately want to allow for backing out a load, and I could see dealing with the instances and other related records at that point. Is it useful if we just start with the SRS wipeout, not the instances? And with regards to cleaning out SRS, do we need to allow for wiping out just a specific load's worth, or all of SRS? We need to keep this lean and simple for now. This is not creating a complete rollback, just trying to assist with this specific use case that Anne raised.

Ann-Marie Breaux [12:15 PM]
Jason Root would this be useful to you in general? So far, I've only heard a definite yes from Anne L. Highsmith. I want to make sure it would be useful to multiple libraries if we pursue it.

Jason Root [12:19 PM]
Yes very. For now, we can just leave the SRS only stuff in the db, since you can’t get to it in the UI anyways. Anne just deleted the instances from the endpoint in the db. Clearing out either the whole SRS or just the specific load would be useful. For now, I would think clearing out all of SRS would be the most useful.

Jason Root [12:25 PM]
Eventually, having the option to clear out what instances were loaded with what job, as well as the SRS only stuff (all or job-only), would be best.

Ann-Marie Breaux [3:18 PM]
Anne L. Highsmith for this first pass, do we need to allow for clearing out a particular job's worth from SRS, or all of SRS? If a particular job, would you expect to use the job number that shows in the log (starts from 1 and increments), or the job UUID?

Dale Arntson [3:38 PM]
I agree with Jason. There are good use cases both for being able to back out only the previous load (say, because you are working with a problematic data set) and for clearing SRS to begin a new load, to work on load procedures, to do timings, to change a profile, etc.

Jennifer Eustis [3:44 PM]
I agree as well. There are times when batch loads don't go as planned, or perhaps the wrong set of MARC records has been provided, which has happened to me. When this happens it is necessary to remove everything and start over. When I do need to delete and reload, I try to time it around how it might affect users, and to make sure that everything was indeed removed before starting any reload. Another consideration I take into account is how large the original load was. If possible it would be great to have a flexible environment. I can see situations where you just need to remove the SRS-loaded MARC bibs but only update instances/holdings/items, and other cases where you need to remove everything. An example of an update might be: you loaded the wrong set, so you remove the MARC SRS records, reload the correct records, and update the associated FOLIO Inventory records. An example of needing to remove everything is when you no longer subscribe to a resource and all the records need to be removed, as your institution no longer has access to them.

Dale Arntson [3:51 PM]
To your last remark Ann-marie, I'd say the ability to clear SRS and start over would be the most useful. But my needs may not be representative.

Chris Creswell [3:57 PM]
i also think it's useful to be able to clear it and start over, but it's simple enough to do at the database level i think
just delete everything from the tables mentioned above, and every instance created with the appropriate "source" value
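[Editor's note] Chris's database-level approach — empty the four SRS tables and delete every instance created by data import — can be sketched in the same in-memory model. The `source == "MARC"` filter reflects how FOLIO marks instances derived from MARC records, but treat the table and field names here as illustrative assumptions, not the actual schema.

```python
# Hypothetical sketch of the full database-level wipe described above:
# empty every SRS table, then drop each Inventory instance whose
# "source" value marks it as created by data import.

def clear_all_srs(db, instance_source="MARC"):
    """Wipe all of SRS and the instances data import created.

    db: dict mapping table name -> list of row dicts
    instance_source: the "source" value identifying imported instances.
    """
    for table in ("marc_records", "records", "raw_records", "snapshots"):
        db[table] = []
    db["instances"] = [row for row in db["instances"]
                       if row.get("source") != instance_source]
    return db
```

As Jennifer notes next, the catch with doing this directly in the database is that not every library has database-level access, which is the argument for exposing it as an endpoint instead.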

Jennifer Eustis [3:58 PM]
There could be those who won't have access to the database level.

Anne L. Highsmith [5:44 PM]
Ann-marie I can agree with others that it would be useful to have an endpoint that clears out all of srs, although that’s not my immediate use case. Beyond that, I’m willing to wait for a more fully-developed ability to back out an entire load, which was in the planning anyway. For the moment, I’ll probably just ignore the fact that there are orphaned records in srs and re-submit the load. It’s more important to me now to figure out why the 999 $i subfields weren’t created than to divert resources to a partial ability to clear out a specific load result.

Ann-Marie Breaux [2:09 PM]
Hi @here, thank you for all the comments - I'm going to submit a request for an endpoint to allow for completely emptying SRS, and hopefully we can get to it in the next few sprints. We have a separate feature for a more fine-tuned roll back of individual imports (https://folio-org.atlassian.net/browse/UXPROD-1383), which we will get to at some point.
Anne L. Highsmith glad that you can just ignore the orphans in SRS for now. And I'll check on the matchedRecordId and get back to the group.

Comment by Holly Mistlebauer [ 17/Jun/20 ]

TAMU comment from Round IV Outliers spreadsheet: Otherwise, have to manually delete erroneously loaded records. -Lisa Furubotten

Generated at Fri Feb 09 00:15:09 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.