Done
Details
Assignee: Khalilah Gambrell
Reporter: Volodymyr Rohach
Priority: P1
Story Points: 2
Sprint: None
Development Team: Spitfire
Fix versions: None
Release: R1 2021 Hot Fix #3
CSP Approved: Yes
Created July 5, 2021 at 12:26 PM
Updated July 21, 2021 at 11:25 AM
Resolved July 8, 2021 at 1:15 PM
From MODDATAIMP-475:
Review logs for Iris Bugfest job that finished prematurely and did not create all the expected records
https://bugfest-iris.folio.ebsco.com
Job 6436 run by @Ann-Marie Breaux
5,000-record file, TAMU_sample_bibs_5k_2.mrc, attached
Using job profile: PTF Create Instance Holdings Item
Job started at 8:47 am Iris Bugfest time and finished at 8:50 am, so it ran for only 3 minutes!
UI log summary shows Completed with errors
UI log detail shows 1,950 SRS MARC, Instances, Holdings, and Items created
No indication of why it stopped after processing 1,950 incoming records, instead of all 5,000
After investigation in the scope of MODDATAIMP-475, the root cause of this issue appears to be a deadlock on the "job_monitoring" table.
Here are some logs from the SRM module on Iris Bugfest (see the attached file for the full log):
2021-06-24T06:48:59.474Z io.vertx.pgclient.PgException: { "message": "deadlock detected", "severity": "ERROR", "code": "40P01", "detail": "Process 12999 waits for ShareLock on transaction 88425453; blocked by process 13000.\nProcess 13000 waits for ShareLock on transaction 88425452; blocked by process 12999.", "hint": "See server log for query details.", "where": "while rechecking updated tuple (2,38) in relation \"job_monitoring\"", "file": "deadlock.c", "line": "1146", "routine": "DeadLockReport" }
It seems that fixing this deadlock should resolve the unexpected behaviour.
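For illustration only, below is a minimal Vert.x sketch (not the actual mod-source-record-manager code) of the classic pattern that produces a 40P01 deadlock like the one in the log above: two concurrent transactions update the same two job_monitoring rows in opposite order, so each ends up waiting for a row lock the other already holds. The connection settings, column names (job_execution_id, last_event_timestamp) and ids are assumptions made up for the example.

{code:java}
import io.vertx.core.CompositeFuture;
import io.vertx.core.Future;
import io.vertx.core.Vertx;
import io.vertx.pgclient.PgConnectOptions;
import io.vertx.pgclient.PgPool;
import io.vertx.sqlclient.PoolOptions;
import io.vertx.sqlclient.Tuple;

public class JobMonitoringDeadlockSketch {

  // Hypothetical statement; the real column names in job_monitoring may differ.
  private static final String UPDATE_SQL =
      "UPDATE job_monitoring SET last_event_timestamp = now() WHERE job_execution_id = $1";

  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();
    PgPool pool = PgPool.pool(vertx,
        new PgConnectOptions()
            .setHost("localhost").setPort(5432)
            .setDatabase("okapi_modules").setUser("folio").setPassword("folio"),
        new PoolOptions().setMaxSize(4));

    // Two job_monitoring rows, e.g. tracked for two job executions (made-up ids).
    String jobA = "11111111-1111-1111-1111-111111111111";
    String jobB = "22222222-2222-2222-2222-222222222222";

    // Transaction 1: update row A, then row B.
    Future<Void> tx1 = pool.withTransaction(conn ->
        conn.preparedQuery(UPDATE_SQL).execute(Tuple.of(jobA))
            .compose(r -> conn.preparedQuery(UPDATE_SQL).execute(Tuple.of(jobB)))
            .mapEmpty());

    // Transaction 2: the same rows in the opposite order - row B, then row A.
    Future<Void> tx2 = pool.withTransaction(conn ->
        conn.preparedQuery(UPDATE_SQL).execute(Tuple.of(jobB))
            .compose(r -> conn.preparedQuery(UPDATE_SQL).execute(Tuple.of(jobA)))
            .mapEmpty());

    // If both transactions reach their second UPDATE before either commits, each
    // waits for the row lock held by the other; Postgres detects the cycle and
    // aborts one transaction with SQLSTATE 40P01 ("deadlock detected"), which
    // surfaces as the PgException shown in the log above.
    CompositeFuture.join(tx1, tx2).onComplete(ar -> vertx.close());
  }
}
{code}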
Note: there are also a lot of "Couldn't update JobExecution status, JobExecution already marked as ERROR" errors, but they appear to be a consequence of the deadlock (this is simply how the error-handling mechanism reacts), so they can be ignored.
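For reference, one generic mitigation for transient 40P01 aborts is to re-run the whole transaction after a short delay, since the competing transaction has finished by then; the sketch below shows only that pattern and is not necessarily the fix applied for this ticket (another option is to make all writers touch job_monitoring rows in a consistent order). The DeadlockRetry helper and its signature are hypothetical.

{code:java}
import io.vertx.core.Future;
import io.vertx.core.Promise;
import io.vertx.core.Vertx;
import io.vertx.pgclient.PgException;
import io.vertx.sqlclient.Pool;
import io.vertx.sqlclient.SqlConnection;

import java.util.function.Function;

/** Hypothetical helper: re-runs a transaction a few times if Postgres aborts it with a deadlock. */
public final class DeadlockRetry {

  // SQLSTATE reported in the log above ("deadlock detected").
  private static final String DEADLOCK_DETECTED = "40P01";

  private DeadlockRetry() {
  }

  public static <T> Future<T> withRetry(Vertx vertx, Pool pool,
                                        Function<SqlConnection, Future<T>> work,
                                        int attemptsLeft) {
    return pool.withTransaction(work).recover(err -> {
      boolean deadlock = err instanceof PgException
          && DEADLOCK_DETECTED.equals(((PgException) err).getCode());
      if (deadlock && attemptsLeft > 0) {
        // Back off briefly, then re-run the whole transaction; by then the
        // competing transaction has committed or rolled back.
        Promise<T> retried = Promise.promise();
        vertx.setTimer(50, timerId ->
            withRetry(vertx, pool, work, attemptsLeft - 1).onComplete(retried));
        return retried.future();
      }
      return Future.failedFuture(err);
    });
  }
}
{code}

A call site that currently invokes pool.withTransaction(work) directly could then be wrapped as DeadlockRetry.withRetry(vertx, pool, work, 3).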