Spike Overview
Jira link: MODDATAIMP-744
Spike Status: COMPLETED
Objective: Incomplete or disposable records from an incoming file that serve only as a carrier of data for creating or updating entities other than Instances should not be saved in SRS. Investigate whether the DI flow can be changed to remove the first step of saving incoming records in SRS, define how and when SRS records should be created and persisted, reflect the changes in sequence diagrams, and create stories for the refactoring.
Background and Problem Statement
There is no explicit action to save the SRS MARC record; saving is implicit and happens for each incoming file (with a couple of exceptions implemented as "bug fixes", e.g. when the Job Profile contains an action for Update Instance or Update Holdings). According to the original design, a record from the incoming file is considered a new and valid record that should be saved prior to any other actions and serve as a single source of truth. There are indeed scenarios where incoming records should be saved in SRS and referenced by the entities derived from them. However, there are also multiple use cases (usually some kind of update or create on Holdings and/or Items, or creating Orders and Invoices) where the incoming record is considered disposable and might contain only partial data. If such a record is saved, we end up either with lost data (when the original record is overridden) or with broken links to the corresponding inventory entities (when the record is saved as a new one).
Problem Statement
In addition, SRS contains a lot of clutter: records that are not used after the import is completed, as well as broken records that are not linked to any FOLIO entity as a result of past failed imports. The post-processing mechanism is redundant and can be avoided if saving the records up front is not mandatory and the problem of generating identifiers for Instances (Holdings and Authority?) is resolved. Removing the mandatory step of saving the MARC in SRS prior to any other actions would also significantly simplify the DI flow. Addressing these problems should improve DI performance: create scenarios would be simplified, and update scenarios should benefit from quicker searches if the SRS DB is not piling up clutter.
Scope
In Scope
Main focus is on importing MARC Bibs, but flows for MARC Holdings and MARC Authority should also be reviewed and either left as is for now, or same changes applied as for MARC Bibs (if applicable).
Out of Scope
Cleaning up clutter in the SRS DB is out of scope. Deleting OLD and broken records should be addressed in a separate spike. The piling up of OLD records as a result of multiple updates on SRS MARC, and the overall versioning mechanism, are also out of scope of this spike.
Research Questions
- Which DI scenarios, other than the Create Instance action, require saving an SRS MARC Bib?
- What if batches of incoming records are not saved in SRS by default? The incoming MARC json will not be accessible in the DI logs, but we could save the parsed content in SRM as part of a journal record.
- If the SRS MARC record is saved in SRS only as an implicit part of the Create Instance action, should that happen before or after the Instance is created? If before, can the Instance UUID and HRID be generated prior to creating the Instance to get rid of the Post-Processing step? At what point should the manipulations on 001+003 → 035 fields be done: in SRM before saving the record in SRS, or before mapping the Instance in the Create Instance action?
- How should the MARC Modify action fit into the changed flow?
- MARC Holdings and MARC Authority: should they follow the same flow as MARC Bib?
- Linking MARC Bibs and Authority: are any adjustments needed, and are there risks?
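To make the 001+003 → 035 question concrete, here is a minimal sketch of that manipulation on a simplified MARC-in-JSON structure. It assumes the convention that the incoming 001 is copied to 035$a prefixed with the 003 value in parentheses, the 003 is dropped, and the 001 is replaced with the HRID. The function name `move_001_003_to_035` and the sample HRID are hypothetical, and the sketch does not re-sort fields or check for an existing identical 035.

```python
from copy import deepcopy


def move_001_003_to_035(record, hrid):
    """Return a copy of `record` with the incoming 001/003 moved into 035$a.

    `record` is a simplified stand-in for parsed MARC content, e.g.
    {"fields": [{"001": "12345"}, {"003": "OCoLC"}]}.
    """
    result = deepcopy(record)
    fields = result["fields"]

    def control_value(tag):
        # Control fields are single-key dicts: {"001": "value"}.
        for field in fields:
            if tag in field:
                return field[tag]
        return None

    old_001 = control_value("001")
    old_003 = control_value("003")

    # Drop the incoming control fields; 001 is re-added below with the HRID.
    fields[:] = [f for f in fields if "001" not in f and "003" not in f]

    if old_001 is not None and old_001 != hrid:
        # Assumed convention: "(003)001" when 003 is present, plain 001 otherwise.
        value = f"({old_003}){old_001}" if old_003 else old_001
        fields.append(
            {"035": {"ind1": " ", "ind2": " ", "subfields": [{"a": value}]}}
        )

    fields.insert(0, {"001": hrid})
    return result
```

Whichever module ends up owning this step, the function needs the HRID as input, which is why the ordering question above (before or after the Instance is created) matters.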
Deliverables
Updated diagrams of DI flow for main scenarios. New feature and stories for refactoring in Jira.
...
Simplified* diagram of DI flow for creating SRS MARC Bib and Instance
*Error handling is omitted; a DI_ERROR event can be sent at any step in case of errors.
Option 1
Remove the step in which initial records are saved in SRS (in batches).
Save incoming parsed content in SRM (it will be required for the DI log). It should be cleaned up when the JobExecution is deleted.
Provide an endpoint in mod-inventory-storage(?) to generate ids for the Instance (Holdings/Authority?) prior to creating those records in inventory (in batches?).
Revisit 001 + 003 → 035 logic.
Save MARC in SRS only as an implicit part of the Create Instance (Holdings/Authority?) action (Edifact records are also not needed in SRS). Save one by one?
Move on to inventory, basically finishing the action there. Post-processing won't be required, as the ids are already generated and the underlying MARC contains them.
Pros
Cons
Option 2 - SELECTED
Remove the step in which initial records are saved in SRS (in batches).
...
Move on to the action in the profile. Creating the MARC record would be an implicit step of the Create Instance action.
Revisit 001 + 003 → 035 logic: it should be done in mod-inventory before creating the Instance.
After the inventory entity is created, reuse the existing post-processing step, but update it to save the MARC with inventory identifiers (or make a POST request to SRS instead of sending the message).
Pros
- Straightforward decision about action in SRM - following the profile (see diagram below)
...
- Simplified flow for scenarios that do need the SRS entity to be created
- No clutter in SRS (incoming records for the jobs that do not require SRS MARC to be created and linked with other entities, will not be saved)
- Removed post-processing step
- Performance benefits
Cons
- Error handling: if the MARC Bib was not created, we end up with an Instance record (source=MARC) but no underlying SRS MARC. We need to make sure such an Instance is editable
- All incoming records have to be stored in SRM so that they can be referenced from DI logs
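The Option 2 flow can be sketched roughly as below, using in-memory stand-ins for SRM journal storage, inventory, and SRS. Everything here is illustrative: the function and class names, the status strings, and the rule that only Create Instance triggers an SRS MARC save are assumptions made for the sketch, not the actual module APIs. The `externalIdsHolder` key is used only as an example of where inventory identifiers could be attached.

```python
import uuid

CREATE_INSTANCE = "CREATE_INSTANCE"
UPDATE_HOLDINGS = "UPDATE_HOLDINGS"

# Assumption for this sketch: only Create Instance needs an SRS MARC record.
ACTIONS_REQUIRING_SRS_MARC = {CREATE_INSTANCE}


class InMemoryStore:
    """Hypothetical stand-in for inventory or SRS storage."""

    def __init__(self, fail=False):
        self.records = {}
        self.fail = fail

    def save(self, record_id, record):
        if self.fail:
            raise RuntimeError("storage unavailable")
        self.records[record_id] = record


def process_incoming_record(parsed_content, profile_actions, journal,
                            inventory, srs, next_hrid):
    # Incoming content is kept in SRM only, so DI logs can still reference it.
    journal.append({"parsedContent": parsed_content})

    if not ACTIONS_REQUIRING_SRS_MARC & set(profile_actions):
        # Disposable record: the remaining profile actions would run here,
        # but nothing is written to SRS.
        return {"status": "NO_SRS_RECORD", "instanceId": None}

    # Create the inventory entity first.
    instance_id = str(uuid.uuid4())
    hrid = next_hrid()
    inventory.save(instance_id,
                   {"id": instance_id, "hrid": hrid, "source": "MARC"})

    # Then save the MARC together with the inventory identifiers.
    try:
        srs.save(str(uuid.uuid4()), {
            "parsedContent": parsed_content,
            "externalIdsHolder": {"instanceId": instance_id,
                                  "instanceHrid": hrid},
        })
    except RuntimeError:
        # The case listed under Cons: an Instance with source=MARC exists,
        # but there is no underlying SRS MARC.
        return {"status": "INSTANCE_WITHOUT_SRS_MARC",
                "instanceId": instance_id}

    return {"status": "COMPLETED", "instanceId": instance_id}
```

The `INSTANCE_WITHOUT_SRS_MARC` branch is the state that error handling has to account for, for example by keeping such an Instance editable.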
Option 3
Make the Create SRS MARC action explicit: either an additional action in the profile, or a checkbox to save (or not save) the SRS MARC Bib.
Pros
- Shifts decision making to the user side
- Straightforward flow
Cons
- Error prone: we'll have to validate profiles thoroughly, shifting responsibility to the step of profile creation and basically hardcoding the same principles, but earlier
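To illustrate the kind of profile validation Option 3 would require, here is a small sketch with two hypothetical rules derived from the problem statement. The function name, the action constant, and the rules themselves are assumptions for illustration only. Real validation would need to cover many more combinations.

```python
def validate_profile(actions, save_srs_marc):
    """Return a list of problems for a profile with an explicit save flag.

    `actions` is a list of action names and `save_srs_marc` is the value of
    the hypothetical "Save SRS MARC Bib" checkbox.
    """
    problems = []
    creates_instance = "CREATE_INSTANCE" in actions

    if creates_instance and not save_srs_marc:
        # A source=MARC Instance would have no underlying SRS MARC.
        problems.append("Create Instance requires saving the SRS MARC Bib")

    if save_srs_marc and not creates_instance:
        # The saved record would not be referenced by any Instance (clutter).
        problems.append("SRS MARC Bib would be saved but not linked")

    return problems
```

Both rules encode the same principle that Option 2 applies automatically, which is the point made under Cons.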
Conclusion
Option 2 should simplify the DI flow significantly, prevent clutter from accumulating in SRS, allow the post-processing step to be removed from the Create Instance (Holdings, Authority) action, and improve overall DI performance.
Implementation stories
MODSOURMAN-1019
MODINV-849
MODINV-850
MODSOURCE-672
MODSOURMAN-1020
MODSOURMAN-1021
MODSOURMAN-1023
MODSOURMAN-1022