- Spike Overview
- Background
- Problem Statement
- Scope
- Research Questions
- Deliverables
- Risks & Assumptions
- Conclusion
Spike Overview
Jira link: MODDATAIMP-744
Spike Status: IN PROGRESS
Objective: Incomplete or disposable records from an incoming file that serve only as a carrier of data for creating or updating entities other than Instances should not be saved in SRS.
Background
There is no explicit action to save the SRS MARC record; saving is implicit and happens for each incoming file (with a couple of exceptions implemented as "bug fixes"). According to the original design, a DI record from the incoming file is considered a new and valid record that should be saved prior to any other actions and serve as a single source of truth. There are indeed scenarios where incoming records should be saved in SRS and referenced by the entities derived from them. However, there are also multiple use cases (usually updates or creates on Holdings and/or Items, or creating Orders and Invoices) where the incoming record is disposable and might contain only partial data. If such a record is saved, we end up either with lost data (when the original record is overridden) or with broken links to the corresponding inventory entities (when the record is saved as a new one).
Problem Statement
SRS contains a lot of clutter: records that are not used after the import is completed, as well as broken records that are not linked to any FOLIO entity as a result of failed past imports. The post-processing mechanism is redundant and can be avoided if saving the records up front is not mandatory and the problem of generating identifiers for Instances (Holdings and Authority?) is resolved. Removing the mandatory step of saving the MARC in SRS prior to any other actions would also significantly simplify the DI flow. Addressing these problems should improve DI performance: create scenarios would be simplified, and update scenarios should benefit from quicker searches if the SRS DB is not piling up clutter.
Scope
In Scope
The main focus is on importing MARC Bibs, but the flows for MARC Holdings and MARC Authority should also be reviewed and either left as is for now or changed in the same way as for MARC Bibs (if applicable).
Out of Scope
Cleaning up clutter in the SRS DB is out of scope. Deleting OLD and broken records should be addressed in another spike. The piling up of OLD records as a result of multiple updates on SRS MARC, and the overall versioning mechanism, are out of scope of this spike.
Research Questions
- Which DI scenarios, other than the Create Instance action, require saving the SRS MARC Bib?
- What if batches of incoming records are not saved in SRS by default? The incoming MARC JSON will not be accessible in the DI logs, but we could save the parsed content in SRM as part of a journal record.
- If the SRS MARC record is saved in SRS only as an implicit part of the Create Instance action, should it happen before or after the Instance is created? If before, can the Instance UUID and HRID be generated prior to creating the Instance, to get rid of the Post-Processing step? At what point should the 001 + 003 → 035 manipulations be done: in SRM before the record is saved in SRS, or before the Instance is mapped in the Create Instance action?
- How should the MARC Modify action fit in the changed flow?
- MARC Holdings and MARC Authority - should they follow the same flow as MARC Bib?
- Linking MARC Bibs and Authority - any adjustments/risks there?
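To make the 001 + 003 → 035 question concrete, here is a minimal sketch of one possible form of that manipulation. It assumes the rule is "move the incoming 001 into a new 035 $a, prefixed with the 003 value in parentheses, drop 003, and put the HRID into 001"; the actual rules in SRM may differ. The record representation (a list of tag/value tuples) and the function name are hypothetical and chosen only for illustration.

```python
def move_control_number_to_035(fields, hrid):
    """Sketch of a 001 + 003 -> 035 manipulation on a simplified record.

    fields: list of (tag, value) tuples; control fields carry a string,
            data fields carry a dict of subfields (hypothetical format).
    hrid:   the HRID that should end up in 001.
    """
    old_001 = next((v for t, v in fields if t == "001"), None)
    old_003 = next((v for t, v in fields if t == "003"), None)

    # Keep everything except the control fields that are being replaced.
    result = [(t, v) for t, v in fields if t not in ("001", "003")]

    # Preserve the incoming control number in 035 unless it already is the HRID.
    if old_001 is not None and old_001 != hrid:
        prefix = f"({old_003})" if old_003 else ""
        candidate = ("035", {"a": prefix + old_001})
        if candidate not in result:  # avoid adding a duplicate 035
            result.append(candidate)

    result.insert(0, ("001", hrid))
    return result
```

Because the function needs the HRID as input, where it runs depends on when the HRID becomes available, which is exactly the ordering question raised above.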
Deliverables
Updated diagrams of the DI flow for the main scenarios. New feature and stories for refactoring in Jira.
Diagram of regular DI flow for creating SRS MARC Bib and Instance
Option 1
Remove the step where initial records are saved in SRS (in batches).
Save incoming parsed content in SRM (it will be required for the DI log) - it should be cleaned up when the JobExecution is deleted
Provide an endpoint in mod-inventory-storage(?) to generate ids for the Instance (Holdings/Authority?) prior to creating those records in inventory (in batches?)
Revisit 001 + 003 → 035 logic
Save MARC in SRS only as an implicit part of the Create Instance (Holdings/Authority?) action (Edifact records are also not needed in SRS). Save one by one?
Move on to inventory - basically finish the action there. Post-processing won't be required, as the ids are already generated and the underlying MARC contains them.
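The ordering in Option 1 can be sketched as follows. This is an illustration under stated assumptions, not the actual module code: `IdGenerator` stands in for the proposed (not yet existing) id-generation endpoint, the HRID format is made up, and `srs` and `inventory` are plain in-memory lists standing in for the two storages.

```python
import uuid
from itertools import count


class IdGenerator:
    """Hypothetical stand-in for the proposed id-generation endpoint."""

    def __init__(self, prefix="in", start=1):
        self._prefix = prefix
        self._seq = count(start)

    def reserve(self, n):
        # Reserve n UUID/HRID pairs in one call (the "in batches?" question).
        return [
            {"id": str(uuid.uuid4()), "hrid": f"{self._prefix}{next(self._seq):08d}"}
            for _ in range(n)
        ]


def create_instances_option1(parsed_records, id_gen, srs, inventory):
    """Reserve ids first, save MARC with ids, then create the Instance."""
    reserved = id_gen.reserve(len(parsed_records))
    for record, ids in zip(parsed_records, reserved):
        # The MARC record is saved once, already carrying the inventory ids,
        # so no post-processing update of the SRS record is needed.
        srs.append(dict(record, instanceId=ids["id"], instanceHrid=ids["hrid"]))
        # Inventory must accept an entity with pre-assigned uuid and hrid.
        inventory.append({"id": ids["id"], "hrid": ids["hrid"], "title": record.get("title")})
    return reserved
```

The sketch makes the two prerequisites of this option visible: something must hand out ids before the entity exists, and inventory must accept entities that arrive with those ids already set.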
Pros
- Simplified flow, removed Post-Processing step for Create Instance action (see updated diagram below)
- Declutter SRS (incoming records for jobs that do not require SRS MARC to be created and linked with other entities will not be saved)
Cons
- Need to generate inventory identifiers (reserve the hrid sequence) before creating inventory entities (step 16 on the diagram)
- Need to allow saving Inventory entities with already assigned identifiers (uuid and hrid), step 25
- For other flows (that do not require saving SRS MARC records) we'll need to add storage for initial records to be referenced in DI logs
Option 2
Remove the step where initial records are saved in SRS (in batches).
Save incoming parsed content in SRM (it will be required for the DI log) - it should be cleaned up when the JobExecution is deleted
Move on to the action in the profile. Create MARC would be an implicit step for the Create Instance action.
Revisit 001 + 003 → 035 logic
After the inventory entity is created, reuse the existing post-processing step, but update it to save MARC with inventory identifiers (or make a POST request to SRS instead of sending the message)
Pros
Straightforward decision about action in SRM - following the profile
Cons
How should errors in Instance creation be handled? If the underlying MARC is not saved, there will be no way to update the Instance
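A minimal sketch of the Option 2 ordering shows where the error-handling question comes from. All names are hypothetical: `create_instance` is any callable standing in for the inventory call, and `srs` is an in-memory list standing in for SRS.

```python
def create_instance_option2(parsed_record, create_instance, srs):
    """Create the Instance first, then save MARC with its identifiers."""
    try:
        instance = create_instance(parsed_record)
    except Exception as error:
        # Nothing has been written to SRS at this point; only the parsed
        # content kept in SRM for the DI log remains.
        return {"status": "ERROR", "error": str(error)}

    # Post-processing step: save MARC once inventory identifiers are known.
    srs.append(dict(parsed_record, instanceId=instance["id"], instanceHrid=instance["hrid"]))
    return {"status": "COMPLETED", "instanceId": instance["id"]}
```

In the error branch the function returns without touching `srs`, so any recovery path would have to start from the parsed content stored in SRM rather than from a saved SRS record.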
Option 3
Make the Create SRS MARC action explicit - either as an additional action in the profile, or as a checkbox - Save (or not) SRS MARC Bib
Pros
Shifts decision making to the user side
Straightforward flow
Cons
Error prone - we'll have to validate profiles thoroughly; this shifts responsibility to the profile creation step and basically hardcodes the same principles, only earlier
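As an illustration of what "validating profiles" could mean here, the sketch below checks a single example rule: a profile that creates an Instance must also save the SRS MARC Bib. The action name, the `save_srs_marc` flag (standing in for the proposed checkbox), and the rule itself are assumptions made for the example; the real rule set would need to be defined as part of this option.

```python
def validate_profile(actions, save_srs_marc):
    """Return validation errors for a hypothetical job profile.

    actions:       list of action names used in the profile (made-up naming).
    save_srs_marc: state of the proposed "Save SRS MARC Bib" checkbox.
    """
    errors = []
    # Example rule only: an Instance created from MARC needs its underlying
    # SRS MARC record, so the checkbox cannot be off for such a profile.
    if "CREATE_INSTANCE" in actions and not save_srs_marc:
        errors.append("Create Instance requires saving the SRS MARC Bib")
    return errors
```

A rule like this encodes the same principle the implicit flow applies today, only at profile creation time.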
Risks & Assumptions
- Risk 1
- Risk 2 ...
- Assumption 1
- Assumption 2 ...
Conclusion
Summarize the results of the spike, key findings, and any recommendations or next steps
Attachments
Include any relevant attachments, such as documents, diagrams, or presentations that support the spike