Import of 5 concurrent jobs becomes stuck

RCA Group

None

Description

This issue is observed in a Honeysuckle HF3 FOLIO environment.

This ticket was created to provide context for checking that the same scenario does not occur in the Iris environment.

Specifically, 5 parallel Data Import job runs were submitted at that time. The jobs progressed for about 30 minutes and have been stuck ever since.

Errors in the mod-pubsub audit_message table indicate that some events were not delivered successfully (most likely due to issues with inventory at the time):

Error delivering DI_SRS_MARC_BIB_INSTANCE_HRID_SET event with id 'ad1a0390-7926-4533-b9a7-044f80d170c9' to /inventory/handlers/instances, response status code is 503, Service Unavailable

Similar messages appear in the log:

Error delivering DI_SRS_MARC_BIB_INSTANCE_HRID_SET event with id '22d302a2-36c0-4ac0-9618-037ee6a33806' to /inventory/handlers/instances, response status code is 502, Bad Gateway
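
For triage, delivery failures like the two quoted above can be tallied by event type, endpoint, and status code. The following is a minimal sketch only: the regular expression is derived solely from the two messages in this ticket, and the function names `parse_delivery_errors` and `count_by_status` are hypothetical helpers, not part of mod-pubsub. It assumes the messages have already been exported as plain text lines.

```python
import re
from collections import Counter

# Pattern inferred from the two delivery-failure messages quoted in this ticket.
# Assumption: other failure messages follow the same wording.
DELIVERY_ERROR = re.compile(
    r"Error delivering (?P<event_type>\w+) event with id '(?P<event_id>[0-9a-f-]+)' "
    r"to (?P<endpoint>[^\s,]+), response status code is (?P<status>\d{3}), (?P<reason>.+)"
)


def parse_delivery_errors(lines):
    """Return one dict per line that matches the delivery-failure format."""
    errors = []
    for line in lines:
        match = DELIVERY_ERROR.search(line)
        if match:
            record = match.groupdict()
            record["status"] = int(record["status"])
            record["reason"] = record["reason"].strip()
            errors.append(record)
    return errors


def count_by_status(errors):
    """Tally failures per (event type, endpoint, status code)."""
    return Counter((e["event_type"], e["endpoint"], e["status"]) for e in errors)


# The two messages from this ticket, used as sample input.
sample = [
    "Error delivering DI_SRS_MARC_BIB_INSTANCE_HRID_SET event with id "
    "'ad1a0390-7926-4533-b9a7-044f80d170c9' to /inventory/handlers/instances, "
    "response status code is 503, Service Unavailable",
    "Error delivering DI_SRS_MARC_BIB_INSTANCE_HRID_SET event with id "
    "'22d302a2-36c0-4ac0-9618-037ee6a33806' to /inventory/handlers/instances, "
    "response status code is 502, Bad Gateway",
]

errors = parse_delivery_errors(sample)
for key, count in count_by_status(errors).items():
    print(key, count)
```

Grouping by endpoint and status code may help show whether the failures are concentrated on /inventory/handlers/instances, which would be consistent with the suspicion that inventory was unavailable at the time.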

And a message from mod-source-record-storage:
20:15:00.081 [vert.x-eventloop-thread-0] ERROR tpClientResponseImpl [10600064eqId] io.vertx.core.VertxException: Connection was closed

Spikes in CPU and memory usage were observed for both mod-inventory and mod-pubsub.

After the modules were restarted, attempts to run concurrent imports continued to get stuck.

Environment

None

Potential Workaround

None

Attachments

7

Activity

Ann-Marie Breaux, May 26, 2021 at 11:54 AM

Will focus on MODDATAIMP-419 instead

Nick Cappadona, May 13, 2021 at 2:55 PM

Just in case my previous comment was missed: please consider closing this issue and focusing efforts on MODDATAIMP-419 instead, which is about handling a single import run of 10k+ records that includes a match profile and updates the MARC Bib.

I added results yesterday based on failed tests against the Iris reference environment.

Ann-Marie Breaux, May 13, 2021 at 2:34 PM

Grooming: needs review and checking on Iris; then decide if any updates are needed

Ann-Marie Breaux, May 11, 2021 at 5:25 PM

In this week's grooming, let's review this, plus the various bugfixes/changes related to PTF work, and link up the Jiras that we think would make concurrent imports testable in Iris bugfest.

Nick Cappadona, April 29, 2021 at 5:51 PM

Are you asking me to continue the steps as outlined in MODDATAIMP-419 on Bugfest? It appears to be down (or "kaput") right now.

I just want to reiterate that these two tickets are related. We prefer to focus everyone's efforts on optimizing Data Import's ability to handle job executions of increasing size (10-15k records in this example, MODDATAIMP-419) instead of having to split the file into multiples of <3k records and submit parallel job executions (MODDATAIMP-424).

Won't Do

Details

Assignee

Reporter

Priority

Story Points

Development Team

Folijet

Release

R2 2021

Affected Institution

Cornell

Created April 23, 2021 at 7:10 PM
Updated October 29, 2021 at 7:24 AM
Resolved May 26, 2021 at 11:54 AM