[RRT] Jobs completing with errors

RCA Group

Not a bug

Description

Jobs completing with various errors:

org.folio.processing.exceptions.EventProcessingException: Failed to handle event payload, cause event payload context does not contain MARC_BIBLIOGRAPHIC or INSTANCE data - fixed
io.vertx.core.impl.NoStackTraceThrowable: Failed to retrieve MARC record by instance id: '2475a3a8-49c7-401d-ac9e-cd30a763a964', status code: 404 - fixed by MODINV-847
in00000001356 io.vertx.core.impl.NoStackTraceThrowable: Current retry number 1 exceeded or equal given number 1 for the Instance update for jobExecutionId 'c03084fc-025c-42d3-b2c4-a38313c87c80'
Job execution id = d7138f09-c66c-46a8-98e5-6dfb4278dacc - looks like an Optimistic Locking error, but might be an issue with misconfiguration - investigating further in scope of

Environment

None

Potential Workaround

None

Linked issues

defines

UXPROD-3840

NFR: Data Import and Inventory Technical, NFR, & Misc work (Poppy R2 2023)

relates to

MODINV-841

Instance records are discarded

MODINV-847

Allow to overlay source 'MARC' instances without related MARC record

MODSOURMAN-1018

Data Import logs indicate false discarded instances

MODINV-919

SPIKE: Investigate causes for frequent OL errors during large imports

MODSOURMAN-1003

The Instance, holdings, item are not updated after importing .mrc file for update

MODSOURCE-662

Cannot use instance or holdings submatch with a marc to marc match (Orchid CSP 3 Clone)

Activity

Ann-Marie Breaux, November 6, 2023 at 5:04 PM

Thank you for all the details. Closing this issue.

Kateryna Senchenko, November 6, 2023 at 1:34 PM

Results of the analysis in the current thread:

org.folio.processing.exceptions.EventProcessingException: Failed to handle event payload, cause event payload context does not contain MARC_BIBLIOGRAPHIC or INSTANCE data - appears to be fixed. I couldn't reproduce the issue with the same scenario (Instance matched, Holdings not matched, therefore Update Instance and Create Holdings); it works as expected. There were issues with Holdings sub-matches and create actions in non-match branches that were resolved either in Orchid CSP #3 or in Poppy (could be a side effect of the fixes for MODSOURCE-662 and )
io.vertx.core.impl.NoStackTraceThrowable: Failed to retrieve MARC record by instance id: '2475a3a8-49c7-401d-ac9e-cd30a763a964', status code: 404 - fixed in Poppy
OL (Optimistic Locking) exceptions - currently working with PTF to understand them better

There are a few possible reasons:

Different jobs run in parallel and try to update the same record at the same time, or a user tries to update a record at the moment it is being updated by the DI job. However, such clashes should be extremely rare and covered by the retry mechanism, which gets the updated version of the record and tries the operation again.
Misconfiguration - modules are running in different Kafka consumer groups - we observed a precedent, but it is very unlikely (Carole confirmed it is not the case)
Kafka can duplicate events - the de-duplication mechanism exists only for create actions, so a duplicated update event will be processed. Such cases should also be covered by the retry mechanism on OL, and as a result the record should be processed successfully. If the retry number is exceeded, something out of the ordinary is happening and needs more investigation
mod-inventory/mod-source-record-storage were restarted and consumed messages that had already been processed. This should also be handled by the retry.
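
The retry-on-OL behaviour described in the reasons above can be illustrated with a minimal sketch. It is not the mod-inventory implementation: the in-memory store, the class and function names, and the `max_retries` parameter are all hypothetical and only model the general pattern (version check on write, re-read and retry on conflict, fail once the retry limit is reached).

```python
from dataclasses import dataclass


class OptimisticLockingError(Exception):
    """Raised when the stored version differs from the version the caller read."""


@dataclass
class Record:
    title: str
    version: int = 1


class InMemoryStorage:
    """Hypothetical stand-in for a record store that checks versions on write."""

    def __init__(self):
        self._records = {}

    def put_new(self, record_id, title):
        self._records[record_id] = Record(title)

    def get(self, record_id):
        stored = self._records[record_id]
        return Record(stored.title, stored.version)  # hand out a copy

    def update(self, record_id, title, expected_version):
        stored = self._records[record_id]
        if stored.version != expected_version:
            raise OptimisticLockingError(
                f"expected version {expected_version}, found {stored.version}"
            )
        self._records[record_id] = Record(title, stored.version + 1)


def update_with_retry(storage, record_id, title, stale=None, max_retries=1):
    """Try the update; on a version conflict, re-read the record and retry.

    Returns the number of retries used. Re-raises the conflict once
    max_retries (an assumed, illustrative limit) has been reached.
    """
    current = stale or storage.get(record_id)
    retries = 0
    while True:
        try:
            storage.update(record_id, title, current.version)
            return retries
        except OptimisticLockingError:
            if retries >= max_retries:
                raise
            retries += 1
            current = storage.get(record_id)  # fetch the latest version


# A parallel writer changes the record after the import job read it.
storage = InMemoryStorage()
storage.put_new("inst-1", "Original title")
snapshot = storage.get("inst-1")
storage.update("inst-1", "Parallel edit", snapshot.version)

# The import job still holds the stale snapshot; one retry resolves the clash.
retries_used = update_with_retry(storage, "inst-1", "Import title", stale=snapshot)
```

In this sketch a single clash is absorbed by one retry. A job would fail with a "retry number exceeded" style error only if a fresh conflict occurred on every attempt, which is consistent with the comment that exceeding the limit points to something out of the ordinary.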

So far these are all our thoughts on the topic. More information will be gathered and analyzed in scope of

Carole Godfrey, September 6, 2023 at 9:11 PM

Noting – issue number 3 is not due to a misconfiguration

Done

Details
Assignee
Kateryna Senchenko
Reporter
Kateryna Senchenko
Labels
RRT, data-import, epam-folijet, no-release-required, support
Priority
P2
Story Points
0
Sprint
None
Development Team
Folijet
Parent
UXPROD-47 Batch Importer (Bib/Acq)
Release
Poppy (R2 2023) Bug Fix

Created September 1, 2023 at 10:06 AM

Updated January 4, 2024 at 2:32 PM

Resolved November 6, 2023 at 5:05 PM
