SPIKE: Investigate causes for frequent OL errors during large imports

RCA Group

None

Description

Reported by PTF while testing the Poppy main release.

Large imports (10K-100K records) of MARC bibs with different job profiles, including the basic Create Instance, Holdings, Item profile, complete with errors. The errors indicate optimistic locking (OL) exceptions. In the Create Instance case they occur during the post-processing step, when the Instance is updated to contain the new 035 identifier. The issue reproduces consistently on large imports, although the number of such errors varies from run to run. There was no background activity, only data import (DI).

 

Configuration:
Environment: OCP3

Job Profile: PTF-2 Create

Number of records: 250K

Version: mod-data-import:3.0.3

DI module revision: 36

During the data import we see more than 8000 errors in the logs. About 130K records have finished without errors so far; the test is still in progress.

Query to extract logs from Logs Insights:

fields @timestamp, @message, @logStream, @log
| filter @message like "(optimistic locking)"
| sort @timestamp asc
| limit 1000
 

Environment

None

Potential Workaround

None

Attachments

2


Activity


Kateryna Senchenko November 6, 2023 at 2:47 PM

Thank you. In this case the issue is not that critical: the retry mechanism works as expected. We still want to find out why OL errors occur so often.

Mykhailo Petryshyn November 6, 2023 at 1:00 PM

  

I checked a 250K DI run (PTF-Create). Searching the logs for the keywords "Current retry number" and "exceeded" returned no results.

All data imports completed without errors, but "optimistic locking" errors still appear in the logs.
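The check described above can be repeated offline against exported log lines. The following is only a minimal sketch: the function name and the sample lines in the usage note are hypothetical, and only the searched keywords come from this ticket.

```python
def summarize_log_lines(lines):
    """Count OL exceptions and exhausted retries in exported log lines."""
    counts = {"optimistic_locking": 0, "retry_exceeded": 0}
    for line in lines:
        # Same substring the Logs Insights filter in the description looks for.
        if "(optimistic locking)" in line:
            counts["optimistic_locking"] += 1
        # Keywords used in this check; a hit would mean the retry limit was reached.
        if "Current retry number" in line and "exceeded" in line:
            counts["retry_exceeded"] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "ERROR update failed (optimistic locking)",
    "INFO record processed",
    "ERROR update failed (optimistic locking)",
]
print(summarize_log_lines(sample))
```

A non-zero `retry_exceeded` count would point to records that ended in an error rather than being recovered by the retry.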

 

 

Kateryna Senchenko November 6, 2023 at 11:37 AM

Please also search the logs for "exceeded". We are interested in whether there are "Current retry number {} exceeded" errors, which would result in a DI_ERROR event for the record being processed.

Kateryna Senchenko November 6, 2023 at 11:34 AM
Edited

Theoretically, we could get duplicated events on update (the de-duplication mechanism exists only for create actions), but the retry on OL should cover such cases.

Alternatively, mod-inventory or mod-source-record-storage may have been restarted and re-consumed messages that had already been processed.
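To illustrate why duplicated update events can get through, here is a rough sketch of de-duplication that applies to create actions only. The class, method, and action names are hypothetical and do not reflect the actual module code; the ticket does not describe how the real mechanism is implemented.

```python
class CreateEventDeduplicator:
    """Remembers processed event ids (hypothetical, not the module's implementation)."""

    def __init__(self):
        self._seen = set()

    def should_process(self, event_id, action):
        # Only create actions are de-duplicated, as described in the comment;
        # update events always pass through, so a redelivered update is
        # applied again and may hit an OL conflict.
        if action != "CREATE":
            return True
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


dedup = CreateEventDeduplicator()
print(dedup.should_process("event-1", "CREATE"))  # first delivery
print(dedup.should_process("event-1", "CREATE"))  # redelivery is skipped
print(dedup.should_process("event-2", "UPDATE"))  # updates are never skipped
print(dedup.should_process("event-2", "UPDATE"))
```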

Kateryna Senchenko November 6, 2023 at 11:33 AM

Hi, are those imports completing successfully or with errors? The presence of OL exceptions in the log doesn't mean the record fails to process: we have a retry mechanism that fetches the updated version of the Instance and tries to update it one more time. In that case we will see the exception in the log, but the record will be processed successfully in the end. Please clarify.
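The retry behaviour described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's code: `InMemoryStorage`, `Instance`, `update_with_retry`, and the `MAX_RETRIES` value are all hypothetical, and only the general flow (OL conflict logged, latest version fetched, update retried, DI_ERROR once the limit is exceeded) follows the comments in this ticket.

```python
from dataclasses import dataclass


class OptimisticLockingError(Exception):
    """Raised when the stored version differs from the version in the request."""


@dataclass
class Instance:
    id: str
    version: int
    identifiers: list


class InMemoryStorage:
    """Hypothetical stand-in for inventory storage; not the real API."""

    def __init__(self):
        self._rows = {}

    def put_new(self, instance):
        self._rows[instance.id] = instance

    def get(self, instance_id):
        row = self._rows[instance_id]
        return Instance(row.id, row.version, list(row.identifiers))

    def update(self, instance):
        stored = self._rows[instance.id]
        if stored.version != instance.version:
            raise OptimisticLockingError(
                f"stored version is {stored.version}, "
                f"request version is {instance.version}"
            )
        # A successful update bumps the version.
        self._rows[instance.id] = Instance(
            instance.id, instance.version + 1, list(instance.identifiers)
        )


MAX_RETRIES = 1  # assumption: the real limit is defined by the module


def update_with_retry(storage, instance, identifier, max_retries=MAX_RETRIES):
    """Add an identifier, refetching and retrying on an OL conflict.

    Returns "OK" on success or "DI_ERROR" once the retry limit is reached.
    """
    retry = 0
    while True:
        instance.identifiers.append(identifier)
        try:
            storage.update(instance)
            return "OK"
        except OptimisticLockingError as err:
            # The exception is logged even if a later retry succeeds.
            print(f"OL conflict: {err}")
            if retry >= max_retries:
                print(f"Current retry number {retry} exceeded")
                return "DI_ERROR"
            retry += 1
            instance = storage.get(instance.id)  # fetch the latest version


storage = InMemoryStorage()
storage.put_new(Instance("instance-1", 2, []))
stale_copy = Instance("instance-1", 1, [])  # version read before another update
print(update_with_retry(storage, stale_copy, "035-identifier"))
```

In this run the first attempt logs a conflict and the second succeeds, which matches the observation that OL exceptions appear in the logs while the imports still complete without errors.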

Details

Assignee

Reporter

Priority

Story Points

Development Team

Folijet

Release

Trillium (R2 2025)


Created November 6, 2023 at 10:09 AM
Updated March 4, 2025 at 8:43 PM