Batch Importer (Bib/Acq) (UXPROD-47)

[UXPROD-3191] NFR: R3 2021 Kiwi Data import performance work Created: 22/Jul/21  Updated: 15/Nov/21  Resolved: 05/Nov/21

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: Kiwi (R3 2021)
Parent: Batch Importer (Bib/Acq)

Type: New Feature Priority: P2
Reporter: Taisiya Trunova Assignee: Ann-Marie Breaux (Inactive)
Resolution: Done Votes: 0
Labels: NFR, data-import, epam-folijet, performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Continues
is continued by UXPROD-3261 NFR: R1 2022 Lotus Data import perfor... Closed
Defines
defines UXPROD-47 Batch Importer (Bib/Acq) Analysis Complete
is defined by MODDATAIMP-499 SPIKE: Use active Kafka producer from... Closed
is defined by MODINV-427 Testing of loading 500K MARC records ... Closed
is defined by MODINVSTOR-792 GET item-storage/items?query=barcode=... Closed
is defined by MODSOURCE-307 Updates becoming increasingly slow Closed
is defined by MODSOURCE-388 SPIKE: Slow query from mod-oai-pmh Closed
is defined by MODSOURMAN-537 SPIKE: Slow Queries observed from job... Closed
is defined by MODSOURMAN-548 Sorting causes module to crash Closed
is defined by MODSOURMAN-550 Reduce BE response payload for DI Lan... Closed
is defined by MODINV-508 Block sending a requests with an empt... Closed
is defined by MODINV-572 Block sending a requests with an empt... Closed
is defined by MODSOURCE-340 Lower log level for messages when no ... Closed
Relates
relates to UXPROD-3023 NFR: R2 2021 Juniper Data Import Stab... Closed
relates to UXPROD-3135 NFR: R3 2021 Kiwi Data Import Stabili... Closed
Epic Link: Batch Importer (Bib/Acq)
Front-End Confidence factor: Medium
Back End Estimate: XXXL: 30-45 days
Development Team: Folijet
PO Rank: 116

 Description   

Team estimation - 45 days

UXPROD-3135 (Closed) was split into UXPROD-3193 (Closed) for stability and reliability and UXPROD-3191 (Closed) for performance. Ann-Marie Breaux to close UXPROD-3135 once all issues have been moved from it to the new features.

Current situation or problem:

1. Kafka producer closed after sending
2. WARN message when no handler found
3. Kafka cache resource consumption
4. Data import impacts other processes
5. High resource consumption to get job(s) status/progress

 

In scope

Out of scope

Use case(s)

Proposed solution/stories
1. Create a pool of active producers. Start the pool on module launch and close it on shutdown. Reuse connections. Add max/min pool sizes.
2. Do not subscribe to messages you're not going to process, OR lower the log level for this type of message.
3. Remove the Kafka cache. Modules that do not make persistent changes will sometimes (on duplicate reads) make unnecessary calls. This can be optimized further by adding a distributed in-memory cache (e.g. Hazelcast) (blocked by 6 PUT LINK TO FEATURE in p.6)
4. SPIKE REQ.: Needs investigation (possible solution: configure a rate limiter). Relates to high CPU/memory consumption on modules.
5. Add some kind of caching for progress tracking (database or in-memory).
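
Illustration for item 1: a minimal sketch of a producer pool with min/max sizes that is created on module launch and closed on shutdown. It is not the Folijet implementation. `ProducerPool`, `borrow`, `release`, and `size` are hypothetical names, and the type parameter `P` stands in for whatever Kafka producer type a module uses; no Kafka API is called here.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical pool of long-lived producers; P stands in for a Kafka producer type. */
final class ProducerPool<P extends AutoCloseable> implements AutoCloseable {
    private final BlockingQueue<P> idle;          // producers currently not in use
    private final Supplier<P> factory;            // creates a new producer on demand
    private final int maxSize;
    private final AtomicInteger created = new AtomicInteger();

    /** Pre-creates minSize producers; intended to be called once on module launch. */
    ProducerPool(int minSize, int maxSize, Supplier<P> factory) {
        if (minSize < 0 || maxSize < 1 || minSize > maxSize) {
            throw new IllegalArgumentException("require 0 <= minSize <= maxSize and maxSize >= 1");
        }
        this.idle = new ArrayBlockingQueue<>(maxSize);
        this.factory = factory;
        this.maxSize = maxSize;
        for (int i = 0; i < minSize; i++) {
            idle.add(factory.get());
            created.incrementAndGet();
        }
    }

    /** Returns an idle producer, grows the pool up to maxSize, or waits for a release. */
    P borrow(long timeout, TimeUnit unit) throws InterruptedException {
        P producer = idle.poll();
        if (producer != null) {
            return producer;
        }
        while (true) {
            int n = created.get();
            if (n >= maxSize) {
                break;                            // pool is full, fall through and wait
            }
            if (created.compareAndSet(n, n + 1)) {
                try {
                    return factory.get();
                } catch (RuntimeException e) {
                    created.decrementAndGet();    // creation failed, give the slot back
                    throw e;
                }
            }
        }
        producer = idle.poll(timeout, unit);
        if (producer == null) {
            throw new IllegalStateException("no producer available within timeout");
        }
        return producer;
    }

    /** Hands a producer back for reuse instead of closing it after each send. */
    void release(P producer) {
        idle.offer(producer);
    }

    /** Number of producers created so far (idle plus borrowed). */
    int size() {
        return created.get();
    }

    /** Closes all idle producers; intended to be called once on module shutdown. */
    @Override
    public void close() throws Exception {
        P producer;
        while ((producer = idle.poll()) != null) {
            producer.close();
        }
    }
}
```

In this sketch, callers are expected to release every borrowed producer before shutdown, since `close()` only closes producers that are idle at that moment.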

Links to additional information:

Data Import Stabilization plan - Vladimir Shalaev - FOLIO Wiki

Questions



 Comments   
Comment by Ann-Marie Breaux (Inactive) [ 25/Aug/21 ]

Grooming today - have most of the spikes, but likely will have more stories/tasks once the spikes are completed

Generated at Fri Feb 09 00:30:04 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.