Data Import notes
Flow description
- A JobDefinition (uuid + profile, which defines the job type: insert/update) is created for the import in mod-srm
- The MARC file and the JobDefinition ID are uploaded from the web client to mod-data-import (the file is held in memory and can be persisted; possible OOM)
- MARC records are packed into batches and published to the Kafka topic DI_RAW_RECORDS_CHUNK_READ (see the producer sketch after this list)
- mod-srm reads the batches from the topic, validates them, and passes them to mod-srs via the Kafka topic DI_RAW_RECORDS_CHUNK_PARSED. The job starts when the first chunk is received. If the JobProfile contains an action for a MARC_BIB update, an event is published to a different topic, DI_MARC_BIB_FOR_UPDATE_RECEIVED, and the next step (storing in mod-srs) is skipped.
- mod-srs stores the records in its PostgreSQL database (unless a MARC_BIB update action is present in the JobProfile) and returns the result via the Kafka topic DI_PARSED_RECORDS_CHUNK_SAVED; broken records are also stored, as 'error record' entries
- mod-srm reads the profile and creates a JSON payload (containing the parsed MARC, the profile, and mapping parameters) for processing, then exports it to the appropriate Kafka topic (one message per MARC entry): DI_SRS_MARC_BIB_RECORD_CREATED, DI_SRS_MARC_AUTHORITY_RECORD_CREATED, DI_SRS_MARC_HOLDING_RECORD_CREATED, or DI_EDIFACT_RECORD_CREATED. The payload sketch below shows its rough shape.
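The chunking step is easy to picture as a plain Kafka producer. The sketch below is a minimal approximation, assuming the MARC records are already split into raw strings; the chunk envelope here is invented (the real one comes from FOLIO's data-import libraries and carries ids, counters, and a last-chunk flag):

```java
import java.util.List;
import java.util.Properties;
import java.util.UUID;
import java.util.stream.Collectors;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RawChunkProducer {

    static final int CHUNK_SIZE = 50; // assumption: the real chunk size is configurable

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");

        String jobExecutionId = UUID.randomUUID().toString(); // from the JobDefinition
        List<String> rawRecords = List.of("...raw marc 1...", "...raw marc 2..."); // already split

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < rawRecords.size(); i += CHUNK_SIZE) {
                List<String> chunk =
                        rawRecords.subList(i, Math.min(i + CHUNK_SIZE, rawRecords.size()));
                // Invented envelope; naive JSON without escaping, for illustration only
                String payload = "{\"jobExecutionId\":\"" + jobExecutionId + "\",\"records\":["
                        + chunk.stream().map(r -> "\"" + r + "\"").collect(Collectors.joining(","))
                        + "]}";
                // Keying by job id keeps all chunks of one job on a single partition (ordering)
                producer.send(new ProducerRecord<>("DI_RAW_RECORDS_CHUNK_READ",
                        jobExecutionId, payload));
            }
        } // close() flushes pending sends
    }
}
```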
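The per-record payload that leaves mod-srm roughly mirrors the shape below. This is a hand-written approximation (the authoritative model is the DataImportEventPayload class in FOLIO's data-import-processing-core; field names here may differ in detail):

```java
import java.util.Map;

// Illustrative mirror of the per-record event payload; not the real class.
public class EventPayloadSketch {
    public String eventType;              // e.g. "DI_SRS_MARC_BIB_RECORD_CREATED"
    public String jobExecutionId;         // ties the record back to its JobDefinition
    public String tenant;                 // tenant the import runs for
    public String okapiUrl;               // used for the OKAPI HTTP calls later in the flow
    public String token;                  // OKAPI token for those calls
    public Object profileSnapshot;        // the JobProfile tree that drives match/action steps
    public Map<String, String> context;   // parsed MARC JSON, mapping parameters, created entities
}
```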
FOR CREATE
Save MARC_BIB + create Instance, Holdings, Item
- The first steps repeat the common flow above (upload, chunking, validation/parsing, storage in mod-srs); mod-srm then publishes one payload per MARC entry to DI_SRS_MARC_BIB_RECORD_CREATED
- mod-inventory reads the message, creates an Instance, and stores it (via OKAPI HTTP) in mod-inventory-storage. Exports the message DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING
- mod-srs reads the message and updates the corresponding record: the Instance HRID is set as the '001' MARC_BIB field, the old '001' value is moved to '035', and the Instance ID is set in the '999 ff $i' field (see the marc4j sketch after this list). It then creates new messages with the updated payload: DI_SRS_MARC_BIB_INSTANCE_HRID_SET plus DI_INVENTORY_INSTANCE_CREATED, or DI_COMPLETED if the JobProfile contains only an Instance-create action
- mod-inventory reads the message from DI_SRS_MARC_BIB_INSTANCE_HRID_SET and updates the Instance with the updated identifier fields; no new messages are sent at this point
- mod-inventory reads the message from DI_INVENTORY_INSTANCE_CREATED, creates Holdings, and stores them (via OKAPI HTTP) in mod-inventory-storage. Exports a message to DI_INVENTORY_HOLDING_CREATED
- mod-inventory reads the message from DI_INVENTORY_HOLDING_CREATED, creates Items, and stores them (via OKAPI HTTP) in mod-inventory-storage. Exports a message to DI_INVENTORY_ITEM_CREATED
- mod-inventory reads the message from DI_INVENTORY_ITEM_CREATED and exports to DI_COMPLETED
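The '001'/'035'/'999' bookkeeping in the post-processing step can be sketched with marc4j. This is a simplified illustration of what mod-srs does; the real code handles more edge cases (e.g. prefixing the '035' value with the '003' control number identifier):

```java
import org.marc4j.marc.ControlField;
import org.marc4j.marc.DataField;
import org.marc4j.marc.MarcFactory;
import org.marc4j.marc.Record;

public class HridPostProcessor {

    private static final MarcFactory FACTORY = MarcFactory.newInstance();

    public static void setInstanceIdentifiers(Record marc, String instanceHrid, String instanceId) {
        // 1. Move the current '001' value into a new '035' field
        ControlField f001 = (ControlField) marc.getVariableField("001");
        if (f001 != null) {
            DataField f035 = FACTORY.newDataField("035", ' ', ' ');
            f035.addSubfield(FACTORY.newSubfield('a', f001.getData()));
            marc.addVariableField(f035);
            // 2. Overwrite '001' with the Instance HRID assigned by inventory
            f001.setData(instanceHrid);
        } else {
            marc.addVariableField(FACTORY.newControlField("001", instanceHrid));
        }
        // 3. Record the Instance UUID in '999 ff $i'
        DataField f999 = FACTORY.newDataField("999", 'f', 'f');
        f999.addSubfield(FACTORY.newSubfield('i', instanceId));
        marc.addVariableField(f999);
    }
}
```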
FOR UPDATE
Match by MARC
- mod-srs reads the message from DI_SRS_MARC_BIB_RECORD_CREATED (or from DI_MARC_BIB_FOR_UPDATE_RECEIVED if the JobProfile contains a MARC_BIB update action), searches the DB for a corresponding entity, and publishes DI_SRS_MARC_BIB_RECORD_MATCHED, or DI_SRS_MARC_BIB_RECORD_MATCHED_READY_FOR_POST_PROCESSING if the JobProfile contains an update action for an entity that is not stored in mod-srs (see the lookup sketch below)
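The lookup itself boils down to a database query. A hypothetical sketch (table and column names are invented; mod-srs's actual schema and query builder differ), matching on the record id that the match profile points at:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

public class MarcMatchLookup {

    // Returns the id of an existing, current SRS record matching the given value, if any.
    public static Optional<String> findMatchedRecordId(Connection db, String matchedValue)
            throws SQLException {
        // Invented schema: a records table keyed by the matched identifier, with a
        // state column distinguishing the current generation of a record from old ones
        String sql = "SELECT id FROM records WHERE matched_id = ? AND state = 'ACTUAL'";
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setString(1, matchedValue);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? Optional.of(rs.getString("id")) : Optional.empty();
            }
        }
    }
}
```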
Match by Instance/Holdings/Item
- The first steps repeat the common flow above, ending with one payload per MARC entry on DI_SRS_MARC_BIB_RECORD_CREATED
- mod-inventory reads the message from DI_SRS_MARC_BIB_RECORD_CREATED and tries to find a matching entity according to the profile's match criteria (in mod-inventory-storage, via OKAPI HTTP; see the query sketch after this list)
- If found: exports the result to DI_INVENTORY_INSTANCE_MATCHED / DI_INVENTORY_HOLDING_MATCHED / DI_INVENTORY_ITEM_MATCHED
- mod-inventory receives the match result and updates the Instance/Holdings/Item in mod-inventory-storage according to the action in the profile, then publishes the result to the applicable topic (one per entity type): DI_INVENTORY_INSTANCE_UPDATED / DI_INVENTORY_HOLDING_UPDATED / DI_INVENTORY_ITEM_UPDATED
- mod-inventory reads the message from the previous step (one of the three topics) and checks the profile for further actions (returning to the match step above); if there are no more actions, it exports to DI_COMPLETED
- If not found: DI_INVENTORY_INSTANCE_NOT_MATCHED / DI_INVENTORY_HOLDING_NOT_MATCHED / DI_INVENTORY_ITEM_NOT_MATCHED events are published and the 'NON-MATCHED' branch of the profile is followed (go to 'CREATE', another 'UPDATE', or export to DI_COMPLETED if the branch is empty)
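The "tries to find a matching entity" step comes down to a CQL query against mod-inventory-storage through OKAPI. A rough sketch, assuming the match profile points at an Instance identifier value (the real query is assembled dynamically from the profile's match criteria):

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class InventoryMatchQuery {

    // Queries mod-inventory-storage for Instances whose identifier matches the given value.
    public static String findInstances(String okapiUrl, String tenant, String token,
                                       String identifierValue) throws Exception {
        // Assumed CQL shape for an identifier match; real criteria come from the profile
        String cql = URLEncoder.encode(
                "identifiers =/@value \"" + identifierValue + "\"", StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(okapiUrl + "/instance-storage/instances?query=" + cql))
                .header("X-Okapi-Tenant", tenant)
                .header("X-Okapi-Token", token)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body(); // JSON like { "instances": [...], "totalRecords": n }
    }
}
```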
AFTER PROCESSING
- mod-srm reads DI_COMPLETED/DI_ERROR and updates the job progress (see the consumer sketch below)
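A simplified sketch of that progress tracking: consume both final topics and count outcomes per job. The in-memory counters are for illustration only; mod-srm persists progress in its database:

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class JobProgressConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "job-progress");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        Map<String, Integer> completed = new ConcurrentHashMap<>();
        Map<String, Integer> errors = new ConcurrentHashMap<>();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("DI_COMPLETED", "DI_ERROR"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> rec : records) {
                    // Assumption: messages are keyed by jobExecutionId, as in the producer sketch
                    Map<String, Integer> counter =
                            "DI_ERROR".equals(rec.topic()) ? errors : completed;
                    counter.merge(rec.key(), 1, Integer::sum);
                }
                // mod-srm compares these counters against the job's total record
                // count and marks the JobExecution finished once they add up.
            }
        }
    }
}
```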