Data Import notes

Flow description

  1. A JobDefinition (UUID + profile that defines the job type: insert/update) for the import is created in mod-srm.
  2. The MARC file + JobDefinition ID are uploaded from the web client to mod-data-import (the file is held in memory and can be persisted; possible OOM).
  3. MARC records are packed into batches and put on the Kafka topic DI_RAW_RECORDS_CHUNK_READ (see the sketch after this list).
  4. mod-srm reads the batches from the topic, validates them and passes them to mod-srs via the Kafka topic DI_RAW_RECORDS_CHUNK_PARSED. The job starts when the first chunk is received. If the JobProfile contains an action for a MARC_BIB update, an event is published to a different topic, DI_MARC_BIB_FOR_UPDATE_RECEIVED, and step 5 is skipped.
  5. mod-srs stores the records (unless a MARC_BIB update action is present in the JobProfile) in its PostgreSQL database and returns the result via the Kafka topic DI_PARSED_RECORDS_CHUNK_SAVED (broken records are also stored, as 'error record').
  6. mod-srm reads the profile, creates a JSON payload (containing the parsed MARC, the profile and mapping parameters) for processing, and exports it to the appropriate Kafka topic (one message per MARC record): DI_SRS_MARC_BIB_RECORD_CREATED, DI_SRS_MARC_AUTHORITY_RECORD_CREATED, DI_SRS_MARC_HOLDING_RECORD_CREATED or DI_EDIFACT_RECORD_CREATED.
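
A minimal sketch of the batching/publishing in step 3, assuming a plain Apache Kafka producer: the chunk size, payload shape, class name and bare topic name are illustrative assumptions (real FOLIO topics carry an environment/tenant prefix and the event body is a richer chunk envelope).

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.List;
    import java.util.Properties;

    // Illustrative only: packs raw MARC records into fixed-size chunks and publishes
    // one message per chunk to DI_RAW_RECORDS_CHUNK_READ, keyed by the job execution id.
    public class RawChunkPublisher {

        private static final String TOPIC = "DI_RAW_RECORDS_CHUNK_READ"; // assumed bare topic name
        private static final int CHUNK_SIZE = 50;                        // assumed batch size

        public static void publish(List<String> rawMarcRecords, String jobExecutionId) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                for (int i = 0; i < rawMarcRecords.size(); i += CHUNK_SIZE) {
                    List<String> chunk =
                        rawMarcRecords.subList(i, Math.min(i + CHUNK_SIZE, rawMarcRecords.size()));
                    // Hypothetical envelope; the real event is a chunk DTO with the raw records and counters.
                    String payload = "{\"jobExecutionId\":\"" + jobExecutionId
                        + "\",\"recordCount\":" + chunk.size() + "}";
                    producer.send(new ProducerRecord<>(TOPIC, jobExecutionId, payload));
                }
            }
        }
    }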

FOR CREATE

Save MARC_BIB + create Instance, Holdings, Item

  1. A JobDefinition (UUID + profile that defines the job type: insert/update) for the import is created in mod-srm.
  2. The MARC file + JobDefinition ID are uploaded from the web client to mod-data-import (the file is held in memory and can be persisted; possible OOM).
  3. MARC records are packed into batches and put on the Kafka topic DI_RAW_RECORDS_CHUNK_READ.
  4. mod-srm reads the batches from the topic, validates them and passes them to mod-srs via the Kafka topic DI_RAW_RECORDS_CHUNK_PARSED. The job starts when the first chunk is received.
  5. mod-srs stores the records in its PostgreSQL database and returns the result via the Kafka topic DI_PARSED_RECORDS_CHUNK_SAVED (broken records are also stored, as 'error record').
  6. mod-srm reads the profile, creates a JSON payload (containing the parsed MARC, the profile and mapping parameters) for processing, and exports it to the appropriate Kafka topic (one message per MARC record): DI_SRS_MARC_BIB_RECORD_CREATED.
  7. mod-inventory reads the message, creates an Instance and stores it (via Okapi HTTP) in mod-inventory-storage, then exports a message to DI_INVENTORY_INSTANCE_CREATED_READY_FOR_POST_PROCESSING.
  8. mod-srs reads the message and updates the corresponding record (the Instance HRID is set into the '001' MARC_BIB field, the old '001' value is moved to '035', the Instance ID is set into the '999 ff i' field - see the sketch after this list). It then creates new messages with the updated payload: DI_SRS_MARC_BIB_INSTANCE_HRID_SET and DI_INVENTORY_INSTANCE_CREATED, or DI_COMPLETED (in case the JobProfile contains an action only for Instance create).
  9. mod-inventory reads the message from DI_SRS_MARC_BIB_INSTANCE_HRID_SET and updates the Instance with the updated identifier fields - no new messages are sent at this point.
  10. mod-inventory reads the message from DI_INVENTORY_INSTANCE_CREATED, creates Holdings and stores them (via Okapi HTTP) in mod-inventory-storage, then exports a message to DI_INVENTORY_HOLDING_CREATED.
  11. mod-inventory reads the message from DI_INVENTORY_HOLDING_CREATED, creates Items and stores them (via Okapi HTTP) in mod-inventory-storage, then exports a message to DI_INVENTORY_ITEM_CREATED.
  12. mod-inventory reads the message and exports to DI_COMPLETED.
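
A sketch of the identifier back-population in step 8, assuming marc4j for the field manipulation: the class is hypothetical, and the real mod-srs logic (handling of an existing '999 ff' field, '035' prefixing, error cases) is omitted.

    import org.marc4j.marc.ControlField;
    import org.marc4j.marc.DataField;
    import org.marc4j.marc.MarcFactory;
    import org.marc4j.marc.Record;

    // Sketch: set the Instance HRID into '001', move the old '001' value to '035',
    // and write the Instance ID into '999 ff $i'.
    public class InstanceIdentifierSetter {

        private static final MarcFactory FACTORY = MarcFactory.newInstance();

        public static void setInstanceIdentifiers(Record marcRecord, String instanceHrid, String instanceId) {
            ControlField f001 = (ControlField) marcRecord.getVariableField("001");
            if (f001 != null && f001.getData() != null) {
                // Preserve the previous control number in a new '035 $a' before overwriting '001'.
                DataField f035 = FACTORY.newDataField("035", ' ', ' ');
                f035.addSubfield(FACTORY.newSubfield('a', f001.getData()));
                marcRecord.addVariableField(f035);
                f001.setData(instanceHrid);
            } else {
                marcRecord.addVariableField(FACTORY.newControlField("001", instanceHrid));
            }

            // Simplification: always appends a new '999 ff' field instead of updating an existing one.
            DataField f999 = FACTORY.newDataField("999", 'f', 'f');
            f999.addSubfield(FACTORY.newSubfield('i', instanceId));
            marcRecord.addVariableField(f999);
        }
    }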

FOR UPDATE

Match by MARC

  1. mod-srs reads the message from DI_SRS_MARC_BIB_RECORD_CREATED (or DI_MARC_BIB_FOR_UPDATE_RECEIVED if the JobProfile contains an action for MARC_BIB update), searches for the corresponding entity in the DB and publishes DI_SRS_MARC_BIB_RECORD_MATCHED, or DI_SRS_MARC_BIB_RECORD_MATCHED_READY_FOR_POST_PROCESSING in case the JobProfile contains an action for updating an entity that is not stored in mod-srs (a sketch of this branching is shown below).
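
A sketch of that branching, assuming a plain Kafka producer: the topic names come from the notes above, while the class, the payload and the profileUpdatesNonSrsEntity flag are hypothetical inputs (the match itself, i.e. the DB lookup in mod-srs, is omitted).

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // After the incoming record has been matched in the mod-srs database, publish the
    // result to the topic the rest of the pipeline expects for this JobProfile.
    public class MarcBibMatchPublisher {

        public static void publishMatchResult(KafkaProducer<String, String> producer,
                                              String jobExecutionId,
                                              String payload,
                                              boolean profileUpdatesNonSrsEntity) {
            // The post-processing topic is used when the profile updates an entity not stored in mod-srs.
            String topic = profileUpdatesNonSrsEntity
                ? "DI_SRS_MARC_BIB_RECORD_MATCHED_READY_FOR_POST_PROCESSING"
                : "DI_SRS_MARC_BIB_RECORD_MATCHED";
            producer.send(new ProducerRecord<>(topic, jobExecutionId, payload));
        }
    }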

Match by Instance/Holdings/Item

  1. A JobDefinition (UUID + profile that defines the job type: insert/update) for the import is created in mod-srm.
  2. The MARC file + JobDefinition ID are uploaded from the web client to mod-data-import (the file is held in memory and can be persisted; possible OOM).
  3. MARC records are packed into batches and put on the Kafka topic DI_RAW_RECORDS_CHUNK_READ.
  4. mod-srm reads the batches from the topic, validates them and passes them to mod-srs via the Kafka topic DI_RAW_RECORDS_CHUNK_PARSED. The job starts when the first chunk is received.
  5. mod-srs stores the records in its PostgreSQL database and returns the result via the Kafka topic DI_PARSED_RECORDS_CHUNK_SAVED (broken records are also stored, as 'error record').
  6. mod-srm reads the profile, creates a JSON payload (containing the parsed MARC, the profile and mapping parameters) for processing, and exports it to the appropriate Kafka topic (one message per MARC record): DI_SRS_MARC_BIB_RECORD_CREATED.
  7. mod-inventory reads the message from DI_SRS_MARC_BIB_RECORD_CREATED and tries to find an entity according to the match criteria (in mod-inventory-storage, via Okapi HTTP) - see the sketch after this list.
  8. If found: the result is exported to DI_INVENTORY_INSTANCE_MATCHED / DI_INVENTORY_HOLDING_MATCHED / DI_INVENTORY_ITEM_MATCHED.
    1. mod-inventory receives the match result, updates the Instance/Holdings/Item in mod-inventory-storage according to the action in the profile, and publishes the result to the applicable topic (one per entity type): DI_INVENTORY_INSTANCE_UPDATED / DI_INVENTORY_HOLDING_UPDATED / DI_INVENTORY_ITEM_UPDATED.
    2. mod-inventory reads the message from the previous step (one of the three topics) and looks for more actions in the profile (returning to sub-step 1). If there are no more actions in the profile, it exports to DI_COMPLETED.
  9. If not found: DI_INVENTORY_INSTANCE_NOT_MATCHED / DI_INVENTORY_HOLDING_NOT_MATCHED / DI_INVENTORY_ITEM_NOT_MATCHED events are published and the 'NON-MATCHED' branch of the profile is followed (go to 'CREATE' or 'UPDATE', or export to DI_COMPLETED for empty actions).
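
A sketch of steps 7-8 for an Instance match, assuming the instance-storage endpoint of mod-inventory-storage is queried through Okapi with a CQL query built from the match criteria: the HRID-based query, the payload and the crude totalRecords check are illustrative assumptions, not the actual mod-inventory matcher.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.net.URI;
    import java.net.URLEncoder;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;

    // Look the Instance up by the match value and publish the matched / not-matched event.
    public class InstanceMatcher {

        public static void matchInstance(HttpClient http,
                                         KafkaProducer<String, String> producer,
                                         String okapiUrl, String tenant, String token,
                                         String jobExecutionId, String matchValue,
                                         String eventPayload) throws Exception {
            // Assumed match criterion: Instance HRID equals the value mapped from the MARC record.
            String query = URLEncoder.encode("hrid==\"" + matchValue + "\"", StandardCharsets.UTF_8);
            HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(okapiUrl + "/instance-storage/instances?query=" + query))
                .header("X-Okapi-Tenant", tenant)
                .header("X-Okapi-Token", token)
                .GET()
                .build();

            HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
            // Crude check; a real implementation would parse totalRecords from the JSON body.
            boolean found = response.statusCode() == 200 && !response.body().contains("\"totalRecords\": 0");

            String topic = found ? "DI_INVENTORY_INSTANCE_MATCHED" : "DI_INVENTORY_INSTANCE_NOT_MATCHED";
            producer.send(new ProducerRecord<>(topic, jobExecutionId, eventPayload));
        }
    }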


AFTER PROCESSING

  1. mod-srm reads DI_COMPLETED/DI_ERROR and updates the job progress (a sketch is shown below).
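
A sketch of this last step, assuming a plain Kafka consumer subscribed to both topics: the in-memory counters stand in for mod-srm's persisted job progress, and keying messages by job execution id is an assumption carried over from the producer sketches above.

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.time.Duration;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicInteger;

    // Consume DI_COMPLETED / DI_ERROR and bump per-job success/error counters.
    public class JobProgressTracker {

        private final Map<String, AtomicInteger> completed = new ConcurrentHashMap<>();
        private final Map<String, AtomicInteger> failed = new ConcurrentHashMap<>();

        public void run(KafkaConsumer<String, String> consumer) {
            consumer.subscribe(List.of("DI_COMPLETED", "DI_ERROR"));
            while (true) { // poll loop; a real service would handle shutdown and offset commits
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    String jobExecutionId = record.key();
                    Map<String, AtomicInteger> target =
                        "DI_ERROR".equals(record.topic()) ? failed : completed;
                    target.computeIfAbsent(jobExecutionId, id -> new AtomicInteger()).incrementAndGet();
                }
            }
        }
    }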