Done
Details
Assignee
Aliaksandr Fedasiuk
Reporter
Olamide Kolawole
Priority
P2
Story Points
0
Development Team
Folijet
Fix versions
Release
Nolana (R3 2022) Service Patch #2
RCA Group
Third party component integration
CSP Approved
Yes
Affected releases
Orchid (R1 2023)
Nolana (R3 2022)
Morning Glory (R2 2022)
Lotus (R1 2022)
Kiwi (R3 2021)
Created June 27, 2023 at 2:37 AM
Updated July 18, 2023 at 2:25 PM
Resolved June 27, 2023 at 2:38 AM
Transactions can linger when an exception occurs inside a lambda on some code paths. The cause is a bug in the vertx-jooq library used in SRS, tracked as https://github.com/jklingsporn/vertx-jooq/issues/197.
A symptom of this issue is that database connections remain in an "idle in transaction" state for some time after an exception has occurred. The transaction is open but no SQL statements are sent, so the database waits for further interaction until the connection is closed. The leading suspicion is that the connection is closed only when the query executor, which holds a reference to the transaction object, is garbage collected. While a connection is held this way, other processes within SRS cannot use it. In a Data Import job with many errors, the total duration of the job could increase considerably.
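To illustrate how an exception inside a lambda can leave a transaction open, here is a minimal sketch using only the Java standard library. It is not vertx-jooq's actual code: `FakeTransaction`, `runUnsafe`, and `runSafe` are hypothetical names invented for this example, and the sketch assumes the failure mode is a lambda that throws synchronously before any completion handler is attached.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class LingeringTransactionSketch {

    /** Hypothetical stand-in for a database transaction; not a vertx-jooq type. */
    static final class FakeTransaction {
        enum State { OPEN, COMMITTED, ROLLED_BACK }

        private State state = State.OPEN;

        void commit() { state = State.COMMITTED; }
        void rollback() { state = State.ROLLED_BACK; }
        State state() { return state; }
    }

    /**
     * Problematic shape (assumed for illustration): if the lambda throws
     * synchronously, the completion handler is never attached, so neither
     * commit nor rollback runs and the transaction stays open.
     */
    static <T> CompletableFuture<T> runUnsafe(
            FakeTransaction tx,
            Function<FakeTransaction, CompletableFuture<T>> work) {
        return work.apply(tx).whenComplete((result, error) -> {
            if (error == null) tx.commit(); else tx.rollback();
        });
    }

    /**
     * Safer shape: a synchronous exception is converted into a failed future,
     * so the same completion handler rolls the transaction back.
     */
    static <T> CompletableFuture<T> runSafe(
            FakeTransaction tx,
            Function<FakeTransaction, CompletableFuture<T>> work) {
        CompletableFuture<T> future;
        try {
            future = work.apply(tx);
        } catch (RuntimeException e) {
            future = CompletableFuture.failedFuture(e);
        }
        return future.whenComplete((result, error) -> {
            if (error == null) tx.commit(); else tx.rollback();
        });
    }

    public static void main(String[] args) {
        FakeTransaction leaked = new FakeTransaction();
        try {
            runUnsafe(leaked, tx -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            // The exception escapes before any rollback could be scheduled.
        }
        System.out.println("unsafe: " + leaked.state()); // prints "unsafe: OPEN"

        FakeTransaction closed = new FakeTransaction();
        runSafe(closed, tx -> { throw new IllegalStateException("boom"); });
        System.out.println("safe: " + closed.state()); // prints "safe: ROLLED_BACK"
    }
}
```

In the unsafe variant the transaction is still `OPEN` after the failure, which corresponds to the "idle in transaction" symptom described above. The safe variant ends every path in a commit or a rollback.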
Acceptance Criteria
Upgrade vertx-jooq-classic-reactive to at least version 6.4.1
ORCHID Critical service patch details
Describe issue impact on business: With this fix, Data import needs less time connected to the SRS database. During investigation, a bug was found in a library used in SRS; this change updates the library to a newer version in which the bug has been fixed.
What institutions are affected? (field “Affected Institutions” in Jira to be populated): All who use SRS or Data import
What is the workaround if exists? None, jobs just continue to be slow
What areas will be impacted by fix (i.e. what areas need to be retested): Confirm Data import Smoke and Critical path work as expected
Brief explanation of technical implementation and the level of effort (in workdays) and technical risk (low/medium/high):
Purpose
Transactions can linger when an exception occurs inside a lambda on some code paths. The cause is a bug in the vertx-jooq library used in SRS, tracked as jklingsporn/vertx-jooq#197.
A symptom of this issue is that database connections remain in an "idle in transaction" state for some time after an exception has occurred. The transaction is open but no SQL statements are sent, so the database waits for further interaction until the connection is closed. The leading suspicion is that the connection is closed only when the query executor, which holds a reference to the transaction object, is garbage collected. While a connection is held this way, other processes within SRS cannot use it. In a Data Import job with many errors, the total duration of the job could increase considerably.
Approach
Upgrade vertx-jooq-classic-reactive to at least version 6.5.5
Technical risk: Low
Brief explanation of testing required and level of effort (in workdays). Provide test plan agreed with by QA Manager and PO: After the MODSOURCE and MODSOURMAN patches are applied, we need to retest the Smoke and Critical Path Data Import tests (most of which are automated), and perhaps selected Extended Manual tests. Manual testing across these MODSOURCE and MODSOURMAN changes is likely to take 3-5 days of work for manual QA, plus some input from the PO.
What is the roll back plan in case the fix does not work? Revert to previous version
NOLANA Critical service patch details
Describe issue impact on business: With this fix, Data import needs less time connected to the SRS database. During investigation, a bug was found in a library used in SRS; this change updates the library to a newer version in which the bug has been fixed.
What institutions are affected? (field “Affected Institutions” in Jira to be populated): All who use SRS or Data import
What is the workaround if exists? None, jobs just continue to be slow
What areas will be impacted by fix (i.e. what areas need to be retested): Confirm Data import Smoke and Critical path work as expected
Brief explanation of technical implementation and the level of effort (in workdays) and technical risk (low/medium/high):
Purpose
Transactions can linger when an exception occurs inside a lambda on some code paths. The cause is a bug in the vertx-jooq library used in SRS, tracked as jklingsporn/vertx-jooq#197.
A symptom of this issue is that database connections remain in an "idle in transaction" state for some time after an exception has occurred. The transaction is open but no SQL statements are sent, so the database waits for further interaction until the connection is closed. The leading suspicion is that the connection is closed only when the query executor, which holds a reference to the transaction object, is garbage collected. While a connection is held this way, other processes within SRS cannot use it. In a Data Import job with many errors, the total duration of the job could increase considerably.
Approach
Upgrade vertx-jooq-classic-reactive to at least version 6.5.5
Technical risk: Low
Brief explanation of testing required and level of effort (in workdays). Provide test plan agreed with by QA Manager and PO: After the MODSOURCE and MODSOURMAN patches are applied, we need to retest the Smoke and Critical Path Data Import tests (most of which are automated), and perhaps selected Extended Manual tests. Manual testing across these MODSOURCE and MODSOURMAN changes is likely to take 3-5 days of work for manual QA, plus some input from the PO.
What is the roll back plan in case the fix does not work? Revert to previous version