Batch Importer (Bib/Acq) (UXPROD-47)

[UXPROD-3023] NFR: R2 2021 Juniper Data Import Stabilization and Reliability work Created: 31/Mar/21  Updated: 31/Aug/21  Resolved: 28/Jul/21

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: R2 2021
Parent: Batch Importer (Bib/Acq)

Type: New Feature
Priority: P2
Reporter: Ann-Marie Breaux (Inactive)
Assignee: Ann-Marie Breaux (Inactive)
Resolution: Done
Votes: 0
Labels: NFR, data-import, epam-folijet, performance, r2-2021-at-risk, split, testing
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Defines
defines UXPROD-47 Batch Importer (Bib/Acq) Analysis Complete
is defined by MODINV-400 Errors in Subsequent Updates Closed
is defined by MODSOURMAN-456 Error causing UPDATE import to stop Closed
is defined by MODDATAIMP-412 SPIKE: investigate and create stories... Closed
is defined by MODDATAIMP-424 Import of 5 concurrent jobs become stuck Closed
is defined by MODDATAIMP-436 SPIKE Find out reasons for low covera... Closed
is defined by MODDATAIMP-453 Kafka consumers reconnect Closed
is defined by MODDATAIMP-456 Setup coverage report for Sonar in D... Closed
is defined by MODDICORE-137 Item update failed Closed
is defined by MODINV-397 Investigate Memory Leaks on Long Dura... Closed
is defined by MODINV-404 Item update failed Closed
is defined by MODINV-423 Excessive CPU usage in a system with ... Closed
is defined by MODSOURCE-278 Record from chunk is not saved occasi... Closed
is defined by MODSOURMAN-447 SPIKE prepare a mechanism for monitor... Closed
is defined by MODSOURMAN-453 Add index for the " job_execution_so... Closed
is defined by MODSOURMAN-454 Excessive CPU usage in a system with ... Closed
is defined by MODPUBSUB-98 Kafka requirements for mod-pubsub and... Closed
Relates
relates to UXPROD-3191 NFR: R3 2021 Kiwi Data import perform... Closed
Epic Link: Batch Importer (Bib/Acq)
Front End Estimate: Very Small (VS) < 1day
Front End Estimator: Ivan Kryzhanovskyi
Front-End Confidence factor: Medium
Back End Estimate: Jumbo: > 45 days
Back End Estimator: Ann-Marie Breaux (Inactive)
Development Team: Folijet
PO Rank: 115
Cap Plan Fix Version (DO NOT CHANGE): R2 2021

 Description   

Architectural and Infrastructure work to make Data Import more stable and performant

Based on feedback from the PTF and architects

As of 14 April 2021, PTF has tested importing files of 50K MARC records to create Bibs, Instances, Holdings, and Items.

Additional testing still needed:

  • Multiple imports triggered by different users at the same time
  • Imports that update SRS and Inventory records (these job profiles include a matching portion, which create profiles do not have)
  • EDIFACT - Ann-Marie Breaux to get a large renewal invoice from Anne C, so that we can see if that causes any issues


 Comments   
Comment by Jacquie Samples [ 28/Apr/21 ]

Ann-Marie Breaux I am curious to know what kind of testing has happened on the data that was loaded in the 50k import.  Has anyone tested updating or overlaying those bib and/or order data?  In other words, was the load successful in that those data are now actionable and effective? 

I am also interested to know how long it took to load the 50k bib records.

Thanks!

Comment by Ann-Marie Breaux (Inactive) [ 29/Apr/21 ]

Hi Jacquie Samples, we'll be putting together details on the wiki in the next week or two. We started with Creates for SRS Bibs and all 3 Inventory record types. Now we are also working with multiple updates to the same records. We have 5 job profiles that should result in successes for all updates, 1 job profile that will fail for holdings and item updates, and 1 job profile that will fail on item updates. We are not yet making them as varied as you'll encounter in a real use case. Another large part of the testing is related to processing time, so we want predictable results before we start introducing more unpredictability. That whole walk-before-you-run thing.

With a few adjustments to environment setup details, a load of 50K records that creates or updates 50K SRS records, Instances, Holdings, and Items completes in ca. 2.5 hours. We think there are still some steps that we can take to improve that performance, but the devs are still analyzing the various runs, adjusting settings, and trying again. More soon, I promise!

Comment by Ann-Marie Breaux (Inactive) [ 18/May/21 ]

Based on the number of stories and total points, we will likely need to split this feature and continue the work in R3.

Comment by Ann-Marie Breaux (Inactive) [ 26/May/21 ]

After internal conversation, we decided to deprioritize some of these architectural changes for Juniper in favor of bugfixes and automated tests.

Comment by Ann-Marie Breaux (Inactive) [ 17/Jun/21 ]

Architectural changes moved to Kiwi feature UXPROD-3135 Closed

Generated at Fri Feb 09 00:28:43 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.