Spike Overview

Jira link: MODDATAIMP-744

Spike Status: COMPLETED

Objective: Incomplete or disposable records from an incoming file, which are meant only as a carrier of data for creating or updating entities other than Instances, should not be saved in SRS. Investigate whether the DI flow can be changed to remove the first step of saving incoming records in SRS, define how and when SRS records should be created and persisted, reflect the changes in sequence diagrams, and create stories for refactoring.

Background and Problem Statement

There is no explicit action to save the SRS MARC record; it is implicit and happens for each incoming file (with a couple of exceptions implemented as "bug fixes", e.g. when the Job Profile contains an action for Update Instance or Update Holdings). According to the original design, a DI record from the incoming file is considered a new and valid record that should be saved prior to any other actions and serve as a single source of truth. There are indeed scenarios where incoming records should be saved in SRS and referenced by other entities that are derived from them. However, there are also multiple use cases (usually some kind of update or create on Holdings and/or Items, or creating Orders and Invoices) where the incoming record is considered disposable and might contain only partial data. If it is saved, we end up either with lost data (when the original record is overwritten) or with broken links to the corresponding inventory entities (when we save the record as a new one).

In addition, SRS contains a lot of clutter: records that are not used after the import is completed, as well as broken records that are not linked to any FOLIO entity as a result of failed imports from the past. The post-processing mechanism is redundant and can be avoided if saving the records up front is not mandatory and the problem of generating identifiers for Instances (Holdings and Authority?) is resolved. Removing the mandatory step of saving the MARC in SRS prior to any other actions would also significantly simplify the DI flow. Addressing these problems should improve DI performance: create scenarios would be simplified, and update scenarios should benefit from quicker searches if the SRS DB is not piling up clutter.

Scope

In Scope

The main focus is on importing MARC Bibs, but flows for MARC Holdings and MARC Authority should also be reviewed and either left as is for now or changed in the same way as for MARC Bibs (if applicable).

Out of Scope

Cleaning up clutter from the SRS DB is out of scope. Deleting OLD and broken records should be addressed in another spike. The piling up of OLD records as a result of multiple updates on SRS MARC, and the overall versioning mechanism, are out of scope of this spike.

Research Questions

Which DI scenarios, other than the Create Instance action, require saving the SRS MARC Bib?
What if batches of incoming records are not saved in SRS by default? The incoming MARC JSON will not be accessible in the DI logs, but we could save the parsed content in SRM as part of a journal record.
If the SRS MARC record is saved in SRS only as an implicit part of the Create Instance action, should it happen before or after the Instance is created? If before, can the Instance UUID and HRID be generated prior to creating the Instance, to get rid of the Post-Processing step? At what point should the manipulations on 001 + 003 → 035 fields be done: in SRM before saving the record in SRS, or before mapping the Instance in the Create Instance action?
How should the MARC Modify action fit into the changed flow?
MARC Holdings and MARC Authority: should they follow the same flow as MARC Bib?
Linking MARC Bibs and Authority: are there any adjustments or risks?

Deliverables

Updated diagrams of DI flow for main scenarios. New feature and stories for refactoring in Jira.

Simplified* diagram of DI flow for creating SRS MARC Bib and Instance

*Error handling is omitted, DI_ERROR event can be sent at any step in case of errors.

Option 1

Remove step when initial records are saved in SRS (in batches).

Save incoming parsed content in SRM (it will be required for DI log) - it should be cleaned up when JobExecution is deleted

Provide endpoint in mod-inventory-storage(?) to generate ids for the Instance (Holdings/Authority?) prior to creating those records in inventory (in batches?)

Revisit 001 + 003 → 035 logic

Save MARC in SRS only as an implicit part of the Create Instance (Holdings/Authority?) action (EDIFACT records are also not needed in SRS). Save one by one?

Move on to inventory - basically finish the action there. Post-processing won't be required as ids are generated already, and underlying MARC contains them.
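The identifier pre-generation step in Option 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual mod-inventory-storage API: `HridSequence`, `assign_instance_ids`, the `in` + 11-digit HRID format, the simplified dict representation of a MARC record, and the choice of 001 for the HRID and 999 ff $i for the Instance UUID are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass


@dataclass
class HridSequence:
    """Hypothetical stand-in for an id-generation endpoint (not a real FOLIO API)."""
    prefix: str = "in"      # assumed HRID prefix
    next_number: int = 1

    def reserve(self, count):
        """Reserve `count` consecutive HRIDs in one call (batch reservation)."""
        start = self.next_number
        self.next_number += count
        return [f"{self.prefix}{n:011d}" for n in range(start, start + count)]


def assign_instance_ids(parsed_records, sequence):
    """Stamp pre-generated UUID and HRID into each record before anything is created.

    Records are simplified to plain dicts keyed by MARC tag (an assumption).
    """
    hrids = sequence.reserve(len(parsed_records))
    prepared = []
    for record, hrid in zip(parsed_records, hrids):
        instance_id = str(uuid.uuid4())
        stamped = dict(record)          # copy, so the incoming record is untouched
        stamped["001"] = hrid           # assumed: HRID is carried in 001
        stamped["999"] = {"ind1": "f", "ind2": "f", "subfields": {"i": instance_id}}
        prepared.append({"instanceId": instance_id, "hrid": hrid, "marc": stamped})
    return prepared
```

Because the MARC record already carries both identifiers when it is saved, the Instance can be created afterwards with the same values and no post-processing round trip is needed in this sketch.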

Pros

Simplified flow, removed Post-Processing step for Create Instance action (see updated diagram below)
Declutter SRS (incoming records for the jobs that do not require SRS MARC to be created and linked with other entities, will not be saved)

Cons

Need to generate inventory identifiers (reserve the HRID sequence) before creating inventory entities (step 16 on the diagram)
Need to allow saving Inventory entities with already assigned identifiers (UUID and HRID), step 25
For other flows (that do not require saving SRS MARC records), we'll need to add storage for the initial records to be referenced in DI logs
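The storage for initial records mentioned in the last point could look roughly like the sketch below: parsed content is kept per JobExecution so DI logs can reference it, and is dropped when the JobExecution is deleted. `IncomingRecordStore` and its method names are hypothetical, and an in-memory dict stands in for whatever SRM table would actually hold the data.

```python
from collections import defaultdict


class IncomingRecordStore:
    """Hypothetical in-memory stand-in for SRM storage of incoming parsed content."""

    def __init__(self):
        # jobExecutionId -> {recordId: parsed content}
        self._by_job = defaultdict(dict)

    def save(self, job_execution_id, record_id, parsed_content):
        """Keep the parsed content so the DI log can show it later."""
        self._by_job[job_execution_id][record_id] = parsed_content

    def get(self, job_execution_id, record_id):
        """Return the stored content, or None if it was never saved or already cleaned up."""
        return self._by_job.get(job_execution_id, {}).get(record_id)

    def delete_job(self, job_execution_id):
        """Clean up everything stored for a deleted JobExecution; returns the number removed."""
        return len(self._by_job.pop(job_execution_id, {}))
```

Keying the data by JobExecution keeps the cleanup a single operation, which matches the requirement that the content lives only as long as the job it belongs to.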

Option 2 - SELECTED

Remove step when initial records are saved in SRS (in batches).

...

Move on to the action in the profile. Create MARC would be an implicit step of the Create Instance action.

Revisit 001 + 003 → 035 logic: it should be done in mod-inventory before creating the Instance

After the inventory entity is created, reuse the existing post-processing step, but update it to save the MARC with inventory identifiers (or make a POST request to SRS instead of sending the message)
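As a rough illustration of the 001 + 003 → 035 step, the sketch below assumes the rule is: the incoming control number (001), qualified by its control number identifier (003) in parentheses, is moved to a new 035 $a, 003 is dropped, and 001 is then set to the HRID. That reading of the rule, the function name, and the simplified record shape are assumptions, not the actual mod-inventory implementation.

```python
def move_control_number_to_035(record, hrid):
    """Return a copy of `record` with the old 001/003 preserved in 035 $a and 001 = hrid.

    `record` is a simplified structure (an assumption for this sketch):
      {"control": {tag: value}, "data": [(tag, {subfield_code: value}), ...]}
    """
    control = dict(record["control"])
    data = list(record["data"])

    old_001 = control.pop("001", None)
    old_003 = control.pop("003", None)

    # Skip when there is nothing to preserve or 001 already holds the HRID.
    if old_001 and old_001 != hrid:
        value = f"({old_003}){old_001}" if old_003 else old_001
        already_present = any(
            tag == "035" and subfields.get("a") == value for tag, subfields in data
        )
        if not already_present:
            data.append(("035", {"a": value}))

    control["001"] = hrid
    return {"control": control, "data": data}
```

For example, a record with 001 = `12345` and 003 = `OCoLC` ends up with 035 $a = `(OCoLC)12345`. Running the function again on its own output adds nothing, which matters if the step can be retried.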

Pros

Straightforward decision about action in SRM - following the profile (see diagram below)

Simplified flow for scenarios that do need the SRS entity to be created
No clutter in SRS (incoming records for the jobs that do not require SRS MARC to be created and linked with other entities, will not be saved)
Removed post-processing step
Performance benefits

Cons

Error handling: if the MARC Bib is not created, we end up with an Instance record (source=MARC) but no underlying SRS MARC. We need to make sure such an Instance is still editable
All incoming records need to be stored in SRM to be referenced from DI logs
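The ordering in Option 2, and the error case listed in the cons, can be sketched as below. `create_instance_with_marc` and the two callables passed to it are hypothetical placeholders for the inventory and SRS calls; catching a broad `Exception` is a simplification for the example.

```python
def create_instance_with_marc(parsed_record, create_instance, save_marc):
    """Create the Instance first, then save the MARC carrying the inventory identifiers.

    `create_instance` and `save_marc` are hypothetical callables standing in for
    the inventory and SRS interactions.
    """
    # Step 1: the inventory entity is created and receives its identifiers.
    instance = create_instance(parsed_record)

    # Step 2: the MARC is saved only now, already carrying those identifiers.
    marc = dict(parsed_record, instanceId=instance["id"], hrid=instance["hrid"])
    try:
        save_marc(marc)
    except Exception as error:
        # The problematic case: an Instance with source=MARC but no SRS MARC.
        return {"instance": instance, "marc": None, "error": str(error)}
    return {"instance": instance, "marc": marc, "error": None}
```

The failure branch returns the orphaned Instance explicitly, so that a caller could report it (for example through a DI_ERROR event) rather than lose track of it.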

Option 3

Make Create SRS MARC action explicit - either additional action in the profile, or a checkbox - Save (or not) SRS MARC Bib

Pros

Shifts decision making to the user

Straightforward flow

Cons

Error prone: we'll have to validate profiles thoroughly. This shifts responsibility to the profile creation step, basically hardcoding the same principles, only earlier

Conclusion

Option 2 should simplify the DI flow significantly, prevent clutter from accumulating in SRS, allow removing the post-processing step from the Create Instance (Holdings, Authority) action, and improve the overall performance of DI.

Implementation stories

MODSOURMAN-1019
MODINV-849
MODINV-850
MODSOURCE-672
MODSOURMAN-1020
MODSOURMAN-1021
MODSOURMAN-1023
MODSOURMAN-1022

Version: Current (compared with Version 7)
Changes made by: Kateryna Senchenko
Saved on: Aug 02, 2023 (Version 7 saved on Jul 21, 2023)
