[RRT] Jobs completing with errors
Activity

Ann-Marie Breaux, November 6, 2023 at 5:04 PM
Thank you for all the details. Closing this issue.

Kateryna Senchenko, November 6, 2023 at 1:34 PM
Results of the analysis in the current thread:
1. org.folio.processing.exceptions.EventProcessingException: Failed to handle event payload, cause event payload context does not contain MARC_BIBLIOGRAPHIC or INSTANCE data - appears to be fixed. I couldn’t reproduce the issue with the same scenario (Instance matched but Holdings not matched, therefore Update Instance and Create Holdings); it now works as expected. There were issues with Holdings sub-matches and create actions in non-match branches that were resolved either in Orchid CSP #3 or in Poppy (this could be a side effect of the fixes for MODSOURCE-662 and )
2. io.vertx.core.impl.NoStackTraceThrowable: Failed to retrieve MARC record by instance id: '2475a3a8-49c7-401d-ac9e-cd30a763a964', status code: 404 - fixed in Poppy.
3. Optimistic Locking (OL) exceptions - currently working with PTF to understand them better. There are a couple of possible reasons:
   - Different jobs running in parallel try to update the same record at the same time, or a user tries to update a record at the same moment the DI job is updating it. Such clashes should be extremely rare and are covered by the retry mechanism, which fetches the updated version of the record and tries the operation again.
   - Misconfiguration: modules are running in different Kafka consumer groups. We have observed a precedent, but it is very unlikely (Carole confirmed it is not the case).
   - Kafka can duplicate events. The de-duplication mechanism exists only for create actions, so a duplicated update event will be processed. Such cases should also be covered by the OL retry mechanism, and the record should be processed successfully as a result. If the retry limit is exceeded, something out of the ordinary is happening and needs more investigation.
   - mod-inventory/mod-source-record-storage were restarted and consumed messages that had already been processed. This should also be handled by the retry.
So far these are all our thoughts on the topic. More information will be gathered and analyzed in scope of
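
The retry behaviour described in the comment above (re-read the record on a version conflict, try again, and fail once the retry limit is reached) can be illustrated with a minimal sketch. This is not the mod-inventory implementation: `Record`, `InMemoryStore`, `VersionConflict`, `RetryLimitExceeded`, and `update_with_retry` are hypothetical names invented for illustration, and the in-memory store only stands in for a storage module that enforces optimistic locking through a version field.

```python
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when the writer's version differs from the stored version."""


class RetryLimitExceeded(Exception):
    """Raised when an update still conflicts after all allowed retries."""


@dataclass
class Record:
    id: str
    version: int
    title: str


class InMemoryStore:
    """Hypothetical stand-in for a storage module with optimistic locking."""

    def __init__(self):
        self._records = {}

    def put_new(self, record):
        self._records[record.id] = record

    def get(self, record_id):
        current = self._records[record_id]
        # Return a copy so callers cannot mutate the stored record directly.
        return Record(current.id, current.version, current.title)

    def update(self, record):
        current = self._records[record.id]
        if current.version != record.version:
            raise VersionConflict(
                f"stored version {current.version}, received {record.version}"
            )
        # A successful write bumps the version.
        self._records[record.id] = Record(record.id, record.version + 1, record.title)


def update_with_retry(store, stale, new_title, max_retries=1):
    """Apply the update; on a conflict, re-read the record and retry."""
    candidate = stale
    retries = 0
    while True:
        try:
            store.update(Record(candidate.id, candidate.version, new_title))
            return store.get(candidate.id)
        except VersionConflict:
            if retries >= max_retries:
                raise RetryLimitExceeded(
                    f"retry number {retries} reached the limit {max_retries} "
                    f"for record {candidate.id}"
                )
            retries += 1
            # Fetch the latest version before the next attempt.
            candidate = store.get(candidate.id)
```

In this sketch a single concurrent write is absorbed by one retry, which matches the expectation that rare clashes and duplicated update events are handled transparently. The update fails with `RetryLimitExceeded` only when the limit is reached while the record still conflicts, the situation the comment flags as needing further investigation.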

Carole Godfrey, September 6, 2023 at 9:11 PM
Noting – issue number 3 is not due to a misconfiguration
Details
Assignee: Kateryna Senchenko
Reporter: Kateryna Senchenko
Priority: P2
Story Points: 0
Sprint: None
Development Team: Folijet
Release: Poppy (R2 2023) Bug Fix

Jobs completing with various errors:
1. org.folio.processing.exceptions.EventProcessingException: Failed to handle event payload, cause event payload context does not contain MARC_BIBLIOGRAPHIC or INSTANCE data - fixed
2. io.vertx.core.impl.NoStackTraceThrowable: Failed to retrieve MARC record by instance id: '2475a3a8-49c7-401d-ac9e-cd30a763a964', status code: 404 - fixed by MODINV-847
3. in00000001356 io.vertx.core.impl.NoStackTraceThrowable: Current retry number 1 exceeded or equal given number 1 for the Instance update for jobExecutionId 'c03084fc-025c-42d3-b2c4-a38313c87c80'
   Job execution id = d7138f09-c66c-46a8-98e5-6dfb4278dacc - looks like an Optimistic Locking error, but might be caused by a misconfiguration - investigating further in scope of