SPIKE: Investigate Timeout exceptions during big imports (Poppy)

RCA Group

None

Description

While testing Data Import (DI) performance for MODSOURCE-601, we received many TimeoutException errors in mod-source-record-manager after importing large files (50k+ MARC BIB records). The errors went away after db_connectionsize was increased from 15 to 400. Investigate this case and try to find the bottleneck.

Includes some Kafka changes, which require a release of KAFKAWRAP
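
To illustrate why a pool of 15 connections can surface as TimeoutException under load, here is a minimal Java sketch of a bounded pool. It is a generic model, not the module's actual pool: the class BoundedPoolSketch, the 10 ms wait limit, and the figure of 20 simultaneous holders are all assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical model of a fixed-size connection pool; not the module's real pool. */
final class BoundedPoolSketch {
    private final Semaphore permits;
    private final long waitMillis;

    BoundedPoolSketch(int size, long waitMillis) {
        this.permits = new Semaphore(size);
        this.waitMillis = waitMillis;
    }

    /** Takes one connection, or fails if none frees up within the wait limit. */
    void acquire() throws TimeoutException {
        try {
            if (!permits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                throw new TimeoutException("no free connection after " + waitMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Returns one connection to the pool. */
    void release() {
        permits.release();
    }

    /** Counts how many of `holders` simultaneous long-running users fail to get a connection. */
    static int countTimeouts(int poolSize, int holders) {
        BoundedPoolSketch pool = new BoundedPoolSketch(poolSize, 10); // 10 ms wait: assumed value
        int timeouts = 0;
        for (int i = 0; i < holders; i++) {
            try {
                pool.acquire(); // held and never released: models slow, long-running work
            } catch (TimeoutException e) {
                timeouts++;
            }
        }
        return timeouts;
    }

    public static void main(String[] args) {
        System.out.println("pool=15:  " + countTimeouts(15, 20) + " timeouts");
        System.out.println("pool=400: " + countTimeouts(400, 20) + " timeouts");
    }
}
```

With 20 holders, a pool of 15 leaves 5 requests to time out, while a pool of 400 leaves none. A larger pool hides the symptom, but the demand on the pool stays the same, which is why the bottleneck still needs to be found.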

ORCHID Critical service patch details

  1. Describe issue impact on business: Data Import runs hit timeouts. Requests for additional DB connections meet pushback because so many (400) are needed. This patch fixes the timeout errors and reduces the number of connections required.

  2. What institutions are affected? (field “Affected Institutions” in Jira to be populated): Any that use Data Import

  3. What is the workaround, if one exists? None; Data Import continues to be less efficient.

  4. What areas will be impacted by fix (i.e. what areas need to be retested): Confirm Data import Smoke and Critical path work as expected

  5. Brief explanation of technical implementation and the level of effort (in workdays) and technical risk (low/medium/high):

    • Purpose: The previous Flow Control implementation put a heavy load on the DB to synchronize modules.

    • Approach: Use the fetch(long) operator to manage how many messages are received from the DI_RAW_RECORDS_CHUNK_READ topic.

    • Technical risk: Low

  6. Brief explanation of testing required and level of effort (in workdays). Provide test plan agreed with by QA Manager and PO: After the MODSOURCE and MODSOURMAN patches are applied, we need to retest the Smoke and Critical Path Data Import tests (most of which are automated), and perhaps selected Extended Manual tests. Manual testing of these MODSOURCE and MODSOURMAN changes is likely 3-5 days of work for manual QA, plus some input from the PO.

  7. What is the roll back plan in case the fix does not work? Revert to previous version

NOLANA Critical service patch details

  1. Describe issue impact on business: Data Import runs hit timeouts. Requests for additional DB connections meet pushback because so many (400) are needed. This patch fixes the timeout errors and reduces the number of connections required.

  2. What institutions are affected? (field “Affected Institutions” in Jira to be populated): Any that use Data Import

  3. What is the workaround, if one exists? None; Data Import continues to be less efficient.

  4. What areas will be impacted by fix (i.e. what areas need to be retested): Confirm Data import Smoke and Critical path work as expected

  5. Brief explanation of technical implementation and the level of effort (in workdays) and technical risk (low/medium/high):

    • Purpose: The previous Flow Control implementation put a heavy load on the DB to synchronize modules.

    • Approach: Use the fetch(long) operator to manage how many messages are received from the DI_RAW_RECORDS_CHUNK_READ topic.

    • Technical risk: Low

  6. Brief explanation of testing required and level of effort (in workdays). Provide test plan agreed with by QA Manager and PO: After the MODSOURCE and MODSOURMAN patches are applied, we need to retest the Smoke and Critical Path Data Import tests (most of which are automated), and perhaps selected Extended Manual tests. Manual testing of these MODSOURCE and MODSOURMAN changes is likely 3-5 days of work for manual QA, plus some input from the PO.

  7. What is the roll back plan in case the fix does not work? Revert to previous version
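
The approach named in item 5 relies on demand-driven consumption: the consumer stays paused and receives records only when it asks for a specific number with fetch(long). The sketch below models that idea in plain Java, without a Kafka dependency. DemandDrivenStream and its methods are hypothetical stand-ins chosen for illustration; they are not the module's code and not a Kafka client API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical stand-in for a paused record stream with fetch(long) semantics. */
final class DemandDrivenStream<T> {
    private final Queue<T> buffered = new ArrayDeque<>();
    private final Consumer<T> handler;
    private long demand = 0;      // starts paused: nothing is delivered until fetch() is called
    private boolean draining;     // guards against re-entrant draining when the handler calls fetch()

    DemandDrivenStream(Consumer<T> handler) {
        this.handler = handler;
    }

    /** Producer side: a record arrives and waits until the consumer has asked for it. */
    void offer(T record) {
        buffered.add(record);
        drain();
    }

    /** Consumer side: request delivery of up to `amount` more records. */
    void fetch(long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        demand += amount;
        drain();
    }

    /** Number of records still waiting for demand. */
    int pending() {
        return buffered.size();
    }

    private void drain() {
        if (draining) {
            return; // the outer loop picks up any demand added by the handler
        }
        draining = true;
        try {
            while (demand > 0 && !buffered.isEmpty()) {
                demand--;
                handler.accept(buffered.poll());
            }
        } finally {
            draining = false;
        }
    }

    public static void main(String[] args) {
        List<String> processed = new ArrayList<>();
        DemandDrivenStream<String> stream = new DemandDrivenStream<>(processed::add);
        for (int i = 0; i < 10; i++) {
            stream.offer("chunk-" + i); // buffered, not delivered: no demand yet
        }
        stream.fetch(3);                // the consumer signals it can take three chunks
        System.out.println(processed.size() + " processed, " + stream.pending() + " pending");
    }
}
```

Because the consumer decides how much it receives, the amount of work in flight is bounded locally, without a shared store to coordinate modules.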

Environment

None

Potential Workaround

None

Activity

Ann-Marie Breaux, June 20, 2023 at 2:13 PM

Includes some Kafka changes, which requires a release of KAFKAWRAP

Ivan Kryzhanovskyi, June 6, 2023 at 5:12 PM

Testing will be done in scope of MODSOURCE-629

Done

Details

Development Team

Folijet

Release

Poppy (R2 2023)

Created May 10, 2023 at 12:59 PM
Updated October 14, 2023 at 10:57 AM
Resolved June 6, 2023 at 5:12 PM