DB Deadlock in fs09000000_mod_source_record_manager.job_monitoring

Description

Large (Create BIBs) data import jobs get stuck right out of the gate due to deadlocking in the database.

This problem was found in the scope of the ticket "Performance dependencies between bulk edit and data import", but further investigation shows that it also occurs in standalone DI jobs.

BE side reports the issue:

2022-11-23 10:23:50 UTC:10.23.10.99(60214):fs09000000_mod_source_record_manager@folio:[8742]:ERROR:  deadlock detected
2022-11-23 10:23:50 UTC:10.23.10.99(60214):fs09000000_mod_source_record_manager@folio:[8742]:DETAIL:  Process 8742 waits for ShareLock on transaction 518512889; blocked by process 3640.
    Process 3640 waits for ShareLock on transaction 518512821; blocked by process 8742.
    Process 8742: UPDATE fs09000000_mod_source_record_manager.job_monitoring SET last_event_timestamp = $1, notification_sent = $2 WHERE job_execution_id = $3
    Process 3640: UPDATE fs09000000_mod_source_record_manager.job_monitoring SET last_event_timestamp = $1, notification_sent = $2 WHERE job_execution_id = $3
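
For triage, the wait-for relationships in a log excerpt like the one above can be extracted with a short script. This is only a minimal sketch: the helper names (`parse_wait_edges`, `find_cycle`) are hypothetical and not part of any FOLIO module, and the regular expression assumes the DETAIL lines have exactly the wording shown in this ticket.

```python
import re

# Matches DETAIL lines worded like the ones in this ticket, e.g.
#   "Process 8742 waits for ShareLock on transaction 518512889; blocked by process 3640."
WAIT_RE = re.compile(
    r"Process (?P<waiter>\d+) waits for (?P<lock>\w+) on transaction (?P<xid>\d+); "
    r"blocked by process (?P<blocker>\d+)\."
)


def parse_wait_edges(log_text):
    """Return {waiting_pid: blocking_pid} for every 'waits for' line in the log."""
    return {int(m["waiter"]): int(m["blocker"]) for m in WAIT_RE.finditer(log_text)}


def find_cycle(edges):
    """Follow blocker links from each pid; return the first cycle found, else None."""
    for start in edges:
        path = [start]
        current = edges.get(start)
        while current is not None and current not in path:
            path.append(current)
            current = edges.get(current)
        if current is not None:
            # 'current' was seen before, so the cycle starts at its first occurrence.
            return path[path.index(current):]
    return None


# The two wait lines copied from the log in this ticket.
SAMPLE = """\
DETAIL:  Process 8742 waits for ShareLock on transaction 518512889; blocked by process 3640.
    Process 3640 waits for ShareLock on transaction 518512821; blocked by process 8742.
"""

edges = parse_wait_edges(SAMPLE)
cycle = find_cycle(edges)
print(edges)
print(cycle)
```

For the two lines from this ticket the script reports the cycle between processes 8742 and 3640, which matches the mutual ShareLock wait in the log. It only describes who waits on whom; it does not explain why the two UPDATE statements acquired their locks in conflicting order.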

Steps to Reproduce:

  1. Navigate to the Data Import app.

  2. Import 50k MARC BIB records with "PTF Create 2".

Expected Results:

  1. The data import job completes successfully.

Actual Results:

  1. The import process gets stuck. No data import is available for hours, and the progress of all further data import jobs stays at 0%.

Additional Information:

Environment: ncp3 https://ptf-nolana-2.int.aws.folio.org/ 

After a deadlock is resolved, data import becomes available again only after the main modules involved in the process are restarted.

CSP Request Details

None

CSP Rejection Details

None

Potential Workaround

None

Attachments

6

Activity

Ann-Marie Breaux, December 8, 2022 at 9:27 AM

Hi, all is looking better. Thank you for making these adjustments.

If we want to pursue the difference in import time for Nolana BF versus the PTF env, let's create a separate ticket. Is that something that PTF could look into in the next few weeks?

Roman_Fedynyshyn, December 8, 2022 at 9:05 AM

This screenshot represents multiple tests (from 1K up to 50K, run several times).
As you can see, there are no deadlocks anymore.

Thanks.

Kateryna Senchenko, December 7, 2022 at 12:46 PM

Hi,

Any performance issues on Nolana bugfest are definitely a separate topic. In the scope of this task we removed the Job Monitoring task, including the job_monitoring table in the DB, so we no longer observe any deadlock errors on it. I believe this ticket can be closed.

If we need to look into Nolana configs to determine why it performs slower, then let's create another ticket. Thank you!

Ann-Marie Breaux, December 7, 2022 at 8:39 AM

Hi, 100K (create instance, holdings, item) made it through Nolana BF without any errors, but took 2.75 hours (job 8015). It was Tuesday evening US time, so there was probably not much else happening on Nolana BF at the same time.

Ann-Marie Breaux, December 6, 2022 at 9:37 PM

Hi, 50K (create instance, holdings, item) took 90 mins.

Do you think it's worth creating a ticket to compare the Nolana BF env to the PTF env to see if we can find differences or ways to account for the variation? There were no other import jobs happening at the same time. I don't know how to gauge whether there was circ activity or other inventory interactions happening at the same time.

I'm kicking off a 100K file now.

Done

Details

Development Team: Folijet

Release: Nolana (R3 2022) Bug Fix

RCA Group: Implementation coding issue

Affected releases: Nolana (R3 2022)

Created November 23, 2022 at 12:26 PM
Updated October 9, 2023 at 10:53 AM
Resolved December 4, 2022 at 7:32 PM