Batch Importer (Bib/Acq) (UXPROD-47)

[UXPROD-3210] NFR: R1 2022 Lotus Data import Stability/Reliability work
Created: 12/Aug/21  Updated: 28/Feb/22  Resolved: 28/Feb/22

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: Lotus (R1 2022)
Parent: Batch Importer (Bib/Acq)

Type: New Feature
Priority: P2
Reporter: Taisiya Trunova
Assignee: Ann-Marie Breaux (Inactive)
Resolution: Done
Votes: 0
Labels: NFR
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Continues
continues UXPROD-3193 NFR: R3 2021 Kiwi Data import Stabili... Closed
is continued by UXPROD-3429 NFR: R2 2022 Morning Glory Data impor... Closed
Defines
is defined by KAFKAWRAP-7 SPIKE: Prevent losing Kafka messages ... Closed
is defined by MODDATAIMP-473 SPIKE: Review PTF reports and create ... Closed
is defined by MODDATAIMP-495 SPIKE: Analysis of the possibilities ... Closed
is defined by MODDATAIMP-500 SPIKE: Design approach for assigning ... Closed
is defined by MODINV-408 Implement ProcessRecordErrorHandler f... Closed
is defined by MODSOURCE-402 Properly handle DB failures during ev... Closed
is defined by MODSOURMAN-598 Properly handle DB failures during ev... Closed
is defined by KAFKAWRAP-3 Implement error handler contract for ... Closed
is defined by MODDATAIMP-491 Improve logging to be able to trace t... Closed
is defined by MODDATAIMP-566 SPIKE: Investigate unreachable (garba... Closed
is defined by MODINV-460 SPIKE: Analyze possibilities to imple... Closed
is defined by MODINV-547 Idempotence for createInstanceEventHa... Closed
is defined by MODINV-548 SPIKE: Investigate approach for event... Closed
is defined by MODINV-584 Improve logging to be able to trace t... Closed
is defined by MODINV-588 Implement deduplication for Instances Closed
is defined by MODINV-589 Implement deduplication for Holdings Closed
is defined by MODINV-590 Implement deduplication for Items Closed
is defined by MODINV-591 Implement deduplication for Authorities Closed
is defined by MODINVOICE-252 Implement ProcessRecordErrorHandler f... Closed
is defined by MODSOURCE-290 Implement ProcessRecordErrorHandler f... Closed
is defined by MODSOURMAN-474 Implement ProcessRecordErrorHandler f... Closed
Relates
relates to UXPROD-3471 NFR: R2 2022 Morning Glory: Implement... Closed
Release: Lotus R1 2022
Epic Link: Batch Importer (Bib/Acq)
Front End Estimate: Out of scope
Front-End Confidence factor: Low
Back End Estimate: Jumbo: > 45 days
Development Team: Folijet
PO Rank: 125
Rank: Cornell (Full Sum 2021): R1
Rank: U of AL (MVP Oct 2020): R1

Description

Back-end estimate is Jumbo; the rough estimate is 120 days.

Current situation or problem:

  1. High CPU and memory consumption in the modules
  2. Duplicates may be created on import for holdings and items (instances were already fixed)
  3. Confirm that SRS does fail while processing records during import
    1. When there is an infrastructure issue (such as the DB being unavailable, a module being restarted, or a network failure), we send DI_ERROR instead of retrying

Investigation required for:

  • Race condition on start (Kafka consumers start working before the DB is configured) or periodic DB shutdown after an SRS restart. Jobs get stuck when their status cannot be updated in the DB (messages are ACKed even if we could not process them)
  • Kafka consumers eventually stop reading messages, which halts job progress until the module is restarted
  • mod-data-import stores the input file in memory, which limits the size of uploaded files and risks out-of-memory (OOM) errors
  • A consumer gets disconnected from the Kafka cluster

Proposed solution/stories

  1. Make consumers idempotent. Add a pass-through identifier to de-duplicate messages.
  2. Generate "INSTANCE CREATED" from mod-inventory. Consume it in SRS to update the HRID in the BIB, and in INVENTORY to continue processing.
  3. Do not ACK a Kafka message when the failure is an infrastructure error/exception rather than a business-logic one. Split failed processing results into 2 categories:
    • IO errors - do not ACK; retry until fixed
    • Business logic errors - send DI_ERROR and ACK the current message
  4. Remove unnecessary topics (the "* ready for post processing" and "HRID set" topics)
  5. De-duplicate status messages per-record while tracking progress
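
Items 1 and 3 above can be sketched together as a single handler that decides whether a message should be ACKed. This is a minimal illustration under stated assumptions, not the Folijet implementation: Event, event_id, IdempotentConsumer and BusinessLogicError are hypothetical names, the in-memory set stands in for durable storage, and the errors list stands in for publishing DI_ERROR.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


class BusinessLogicError(Exception):
    """Hypothetical: the record itself is invalid, so retrying cannot help."""


@dataclass
class Event:
    event_id: str  # hypothetical pass-through identifier carried by every message
    payload: str


@dataclass
class IdempotentConsumer:
    process: Callable[[Event], None]
    # Assumption: a real module would keep processed ids in durable storage.
    seen_ids: Set[str] = field(default_factory=set)
    # Stand-in for sending DI_ERROR.
    errors: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, event: Event) -> bool:
        """Return True to ACK the message, False to leave it for a retry."""
        if event.event_id in self.seen_ids:
            return True  # duplicate delivery: ACK without processing again
        try:
            self.process(event)
        except BusinessLogicError as exc:
            # Business logic failure: report the error and ACK the message.
            self.errors.append((event.event_id, str(exc)))
            self.seen_ids.add(event.event_id)
            return True
        except OSError:
            # IO/infrastructure failure (ConnectionError and TimeoutError are
            # OSError subclasses): do not ACK, so the message is retried.
            return False
        self.seen_ids.add(event.event_id)
        return True
```

Because an id is recorded only after processing succeeds or fails for a business reason, a message that hit an infrastructure error is processed normally on redelivery, while a redelivered message that already succeeded is skipped.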

One possible solution: split the input into chunks, store them in the database, and work from the database/temp storage. This is partially done (to be investigated).
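
As a rough illustration of the chunking idea, the sketch below streams an input source into fixed-size chunk files so that only one record is held in memory at a time. It assumes one record per line, which is a simplification (real import files such as binary MARC would need their own record splitting), and split_to_chunk_files, the file naming scheme and records_per_chunk are all hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def split_to_chunk_files(
    source: Iterable[str],      # any line iterator, e.g. an open text file
    target_dir: Path,           # temp storage location (assumed to exist)
    records_per_chunk: int = 1000,
) -> List[Path]:
    """Write the source to numbered chunk files and return their paths."""
    paths: List[Path] = []
    out = None
    count = 0
    try:
        for line in source:
            if out is None:
                # Open the next chunk file lazily, so no empty chunk is created.
                path = target_dir / f"chunk-{len(paths):05d}.txt"
                out = path.open("w", encoding="utf-8")
                paths.append(path)
            out.write(line if line.endswith("\n") else line + "\n")
            count += 1
            if count == records_per_chunk:
                out.close()
                out = None
                count = 0
    finally:
        if out is not None:
            out.close()
    return paths
```

Each returned path could then be processed and tracked independently, which is also where a database row per chunk would fit if the chunks were stored there instead of on disk.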

Links to additional info:
Update to wherever the plan is now stored
Data Import Stabilization plan - Vladimir Shalaev - FOLIO Wiki

Questions

