Spike Overview
Jira link: MODDATAIMP-744
Spike Status: IN PROGRESS
Objective: Incomplete or disposable records from an incoming file, which serve only as carriers of data for creating or updating entities other than Instances, should not be saved in SRS.
Background
There is no explicit action to save the SRS MARC record; saving is implicit and happens for each incoming file (with a couple of exceptions implemented as "bug fixes"). According to the original design, a DI record from the incoming file is considered a new and valid record that should be saved prior to any other actions and serve as a single source of truth. There are indeed scenarios where incoming records should be saved in SRS and referenced by the entities derived from them. However, there are also multiple use cases (usually updates or creates on Holdings and/or Items, or creating Orders and Invoices) where the incoming record is considered disposable and might contain only partial data. If such a record is saved, we end up either with lost data (when the original record is overwritten) or with broken links to the corresponding inventory entities (when the record is saved as a new one).
Problem Statement
SRS contains a lot of clutter: records that are not used after the import is completed, as well as broken records that are not linked to any FOLIO entity as a result of past failed imports. The post-processing mechanism is redundant and can be avoided if saving the records up front is not mandatory and the problem of generating identifiers for Instances (Holdings and Authority?) is resolved. Removing the mandatory step of saving the MARC record in SRS prior to any other actions would also significantly simplify the DI flow. Addressing these problems should improve DI performance: create scenarios would be simplified, and update scenarios should benefit from quicker searches if the SRS DB is not piling up clutter.
Scope
In Scope
The main focus is on importing MARC Bibs, but the flows for MARC Holdings and MARC Authority should also be reviewed and either left as is for now or changed in the same way as for MARC Bibs (if applicable).
Out of Scope
Cleaning up clutter from the SRS DB is out of scope. Deleting OLD and broken records should be addressed in another spike. The piling up of OLD records as a result of multiple updates on SRS MARC, and the overall versioning mechanism, are also out of scope of this spike.
Research Questions
- Which DI scenarios, other than the Create Instance action, require saving the SRS MARC Bib?
- What if batches of incoming records are not saved in SRS by default? The incoming MARC JSON would not be accessible in the DI logs, but we could save the parsed content in SRM as part of a journal record.
- If the SRS MARC record is saved in SRS only as an implicit part of the Create Instance action, should that happen before or after the Instance is created? If before, can the Instance UUID and HRID be generated prior to creating the Instance to get rid of the post-processing step? At what point should the manipulations on 001+003 → 035 fields be done: in SRM before saving the record in SRS, or before mapping the Instance in the Create Instance action?
- How should the MARC Modify action fit into the changed flow?
- MARC Holdings and MARC Authority: should they follow the same flow as MARC Bib?
- Linking MARC Bibs and Authority - any adjustments/risks there?
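To make the question about generating identifiers up front more concrete, here is a minimal Python sketch (not FOLIO code). It assumes the Instance UUID can be chosen by the caller before either entity exists, so the Instance payload and the SRS record payload can reference each other without a post-processing step. The field names (`externalIdsHolder`, `instanceId`, `parsedRecord`, `recordType`) and the `PreallocatedIds` and `build_payloads` helpers are illustrative assumptions. HRID generation is deliberately left out, since whether it can be done up front is exactly what the spike has to answer.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PreallocatedIds:
    """Hypothetical holder for identifiers chosen before any entity is saved."""
    instance_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    srs_record_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def build_payloads(parsed_content: dict, ids: PreallocatedIds) -> tuple:
    """Build Instance and SRS record payloads that already point at each other.

    Field names are assumptions made for illustration only.
    """
    instance = {"id": ids.instance_id, "source": "MARC"}
    srs_record = {
        "id": ids.srs_record_id,
        "recordType": "MARC_BIB",
        # The link to the Instance is known up front, so no post-processing
        # event is needed to patch it in later.
        "externalIdsHolder": {"instanceId": ids.instance_id},
        "parsedRecord": {"content": parsed_content},
    }
    return instance, srs_record


ids = PreallocatedIds()
instance, srs_record = build_payloads({"leader": "", "fields": []}, ids)
```

With both payloads built from the same pre-allocated identifiers, the order in which the Instance and the SRS record are saved becomes a question of error handling rather than of identifier availability.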
Deliverables
Updated diagrams of DI flow for main scenarios. New feature and stories for refactoring in Jira.
Simplified* diagram of DI flow for creating SRS MARC Bib and Instance
*Error handling is omitted, DI_ERROR event can be sent at any step in case of errors.
Option 1
Option 2 - SELECTED
Remove the step in which initial records are saved in SRS (in batches).
Save the incoming parsed content in SRM (it is required for the DI log); it should be cleaned up when the JobExecution is deleted.
Move on to the action in the profile. Creating the MARC record would be an implicit step of the Create Instance action.
The 001 + 003 → 035 logic should be done in mod-inventory before creating the Instance.
After the inventory entity is created, make a POST request to SRS instead of sending a message.
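The 001 + 003 → 035 step above can be illustrated with a small Python sketch over a MARC-in-JSON style record. The exact rule is an assumption here, not taken from this page: the incoming 001 is moved to a new 035 $a, prefixed with the 003 value in parentheses when a 003 exists, the 003 is dropped, and the 001 is replaced with the Instance HRID. The function name `move_001_003_to_035` is hypothetical.

```python
from copy import deepcopy


def move_001_003_to_035(parsed_content: dict, instance_hrid: str) -> dict:
    """Return a copy of the record with the incoming 001/003 folded into 035 $a.

    Assumed rule (illustrative): 035 $a = "(003)001", or just "001" if no 003.
    """
    record = deepcopy(parsed_content)
    fields = record.get("fields", [])

    def control_value(tag):
        # Control fields are stored as {"001": "value"} in MARC-in-JSON.
        for f in fields:
            if tag in f and isinstance(f[tag], str):
                return f[tag]
        return None

    old_001 = control_value("001")
    old_003 = control_value("003")

    # Drop the incoming 001 and 003; the 001 is re-created below.
    kept = [f for f in fields if "001" not in f and "003" not in f]

    if old_001 and old_001 != instance_hrid:
        value = f"({old_003}){old_001}" if old_003 else old_001
        already_present = any(
            sub.get("a") == value
            for f in kept if "035" in f
            for sub in f["035"].get("subfields", [])
        )
        if not already_present:
            # Appended at the end; re-sorting fields by tag is left out here.
            kept.append(
                {"035": {"ind1": " ", "ind2": " ", "subfields": [{"a": value}]}}
            )

    # The new 001 carries the Instance HRID.
    kept.insert(0, {"001": instance_hrid})
    record["fields"] = kept
    return record
```

The input record is copied rather than mutated, so the original parsed content kept in SRM for the DI log would stay unchanged in such a design.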
Pros
- Straightforward decision about action in SRM - following the profile (see diagram below)
- Simplified flow for scenarios that do need the SRS entity to be created
- No clutter in SRS (incoming records for jobs that do not require an SRS MARC record to be created and linked with other entities will not be saved)
- Removed post-processing step
- Performance benefits
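The "decision follows the profile" point can be sketched as a simple check over the actions in a job profile. This is an illustrative Python sketch, not the SRM implementation. It assumes actions can be represented as (action, record type) pairs and that only Create Instance implies creating an SRS MARC Bib; the full set of such actions is still an open research question, so it is passed in as a parameter. The names `requires_srs_marc` and `DEFAULT_SRS_ACTIONS` are hypothetical.

```python
from typing import Iterable, Tuple

# (action, record type), e.g. ("CREATE", "INSTANCE"); representation is assumed.
Action = Tuple[str, str]

# Assumption: only Create Instance implies creating an SRS MARC Bib.
DEFAULT_SRS_ACTIONS = frozenset({("CREATE", "INSTANCE")})


def requires_srs_marc(
    profile_actions: Iterable[Action],
    srs_actions=DEFAULT_SRS_ACTIONS,
) -> bool:
    """Return True if any action in the profile implies saving an SRS MARC record."""
    return any(action in srs_actions for action in profile_actions)


# A job that only updates Holdings and creates Orders leaves nothing in SRS.
print(requires_srs_marc([("UPDATE", "HOLDINGS"), ("CREATE", "ORDER")]))
```

Keeping the set of SRS-producing actions configurable makes it easy to extend once the scenarios listed under Research Questions are settled.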
Cons
- Error handling: if the MARC Bib was not created, we end up with an Instance record (source=MARC) but no underlying SRS MARC. We need to make sure such an Instance is editable
- All incoming records must be stored in SRM so they can be referenced from the DI logs
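The Option 2 sequence and the error-handling concern above can be sketched together. The Python sketch below is a simplified illustration under stated assumptions, not the actual module code. The three callables (`save_to_journal`, `create_instance`, `post_to_srs`) and the `SrsSaveError` exception are hypothetical stand-ins for the SRM journal, the Instance creation in mod-inventory, and the POST request to SRS.

```python
from typing import Callable


class SrsSaveError(Exception):
    """Raised by the hypothetical post_to_srs callable when the POST fails."""


def create_instance_then_marc(
    parsed_content: dict,
    save_to_journal: Callable[[dict], None],
    create_instance: Callable[[dict], dict],
    post_to_srs: Callable[[dict, dict], None],
) -> dict:
    """Run the simplified Option 2 sequence for one incoming record."""
    # Keep the incoming parsed content in SRM so the DI log can show it.
    save_to_journal(parsed_content)

    # Create Instance; creating the SRS MARC is an implicit follow-up step.
    instance = create_instance(parsed_content)

    # POST to SRS instead of sending a message.
    try:
        post_to_srs(parsed_content, instance)
    except SrsSaveError as err:
        # The Instance (source=MARC) now exists without an SRS MARC record.
        # Report it so the caller can log it and keep the Instance editable.
        return {"instance": instance, "srs_saved": False, "error": str(err)}

    return {"instance": instance, "srs_saved": True, "error": None}
```

Returning the partial outcome instead of raising keeps the already created Instance visible to the caller, which is the case the first Cons item asks to handle.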
Option 3
Risks & Assumptions
- Risk 1
- Risk 2 ...
- Assumption 1
- Assumption 2 ...
Conclusion
Summarize the results of the spike, key findings, and any recommendations or next steps