Implement flow control logic in SRM for Kafka message processing

RCA Group

None

Description

 

Acceptance criteria:

Adjust the existing DataImportKafkaHandler, which accepts DI_COMPLETE and DI_ERROR events, to invoke the new flow control logic.

Implement a Java service that controls the flow of execution by pausing the SRM consumer for the DI_RAW_RECORDS_CHUNK_READ topic.

This new service should:

  •  Store an in-memory counter of how many DI_COMPLETE and DI_ERROR events have been read

  •  Based on the value of the di.flow.max_simultaneous_records property, call pause() on the Kafka consumer: first read 100 records (the default value of the max_simultaneous_records parameter), then invoke pause() on the Kafka consumer. This gives OCLC messages a chance to be published to the DI_RAW_RECORDS_CHUNK_PARSED topic (consumed by SRS) before the next chunk from data import arrives.

  • Based on the value of the di.flow.records_threshold property, call resume() on the Kafka consumer: once the count of DI_COMPLETE and DI_ERROR events reaches 50 (the default value of di.flow.records_threshold), call resume() on the Kafka consumer and read the next chunks of messages from the data-import DI_RAW_RECORDS_CHUNK_READ topic.
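The criteria above can be sketched as a small Java service. This is an illustrative sketch only: the class, method, and interface names (DiFlowControlService, ChunkConsumer, trackChunkReceived, trackRecordCompleted) are hypothetical, not the actual mod-source-record-manager code, and the Kafka consumer is abstracted behind an interface so the pause/resume decision logic can be shown without a broker.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the flow-control logic described in the
// acceptance criteria; names do not reflect the real SRM implementation.
class DiFlowControlService {

    // Abstraction over the Kafka consumer for DI_RAW_RECORDS_CHUNK_READ.
    interface ChunkConsumer {
        void pause();
        void resume();
    }

    private final ChunkConsumer consumer;
    private final int maxSimultaneousRecords; // di.flow.max_simultaneous_records, default 100
    private final int recordsThreshold;       // di.flow.records_threshold, default 50

    private final AtomicInteger recordsRead = new AtomicInteger();      // records read from the chunk topic
    private final AtomicInteger recordsProcessed = new AtomicInteger(); // DI_COMPLETE + DI_ERROR events seen
    private volatile boolean paused = false;

    DiFlowControlService(ChunkConsumer consumer, int maxSimultaneousRecords, int recordsThreshold) {
        this.consumer = consumer;
        this.maxSimultaneousRecords = maxSimultaneousRecords;
        this.recordsThreshold = recordsThreshold;
    }

    // Called when records arrive from DI_RAW_RECORDS_CHUNK_READ.
    // Once maxSimultaneousRecords have been read, pause the consumer so
    // already-read chunks (and OCLC single-record imports) can be processed.
    synchronized void trackChunkReceived(int recordCount) {
        if (recordsRead.addAndGet(recordCount) >= maxSimultaneousRecords && !paused) {
            paused = true;
            consumer.pause();
        }
    }

    // Called by DataImportKafkaHandler for each DI_COMPLETE / DI_ERROR event.
    // Once recordsThreshold records have finished, resume the paused consumer
    // and reset both counters for the next batch.
    synchronized void trackRecordCompleted() {
        if (recordsProcessed.incrementAndGet() >= recordsThreshold && paused) {
            recordsRead.set(0);
            recordsProcessed.set(0);
            paused = false;
            consumer.resume();
        }
    }
}
```

Resetting both counters on resume is one possible policy; the real implementation may instead decrement the counters, which changes behavior when chunks overlap.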

Environment

None

Potential Workaround

None

Attachments

4

Checklist


TestRail: Results

Activity


Serhii_Nosko May 16, 2022 at 3:40 PM

Hi, our Rancher environment is restored to a working state.
You can import OCLC records on it to see real speed close to the bugfest one. When running OCLC imports, re-check the results on the View All page (the Running jobs section updates with some delay). I also attached a 5k file.
Link to our rancher: https://folijet.ci.folio.org/data-import

Ann-Marie Breaux May 16, 2022 at 7:35 AM

Hi all, sounds good - thank you! Closing this issue.

Serhii_Nosko May 11, 2022 at 7:50 AM
Edited

Hi, currently the default value of max_simultaneous_records is 100 and of records_threshold is 50. In other words, when 50 records of a big DI job finish, the OCLC import starts. The snapshot env has limited capacity, so processing these 50 records takes more time than on, for example, perf-rancher, bugfest, or production. So you could trigger a couple of OCLC imports and go to the View All page to check that the OCLC imports finish before the main 500+ record file, to make sure that flow control is working. I am also attaching a 1k file to this story - you could try that file as well.

The previous situation, as described by Jenn Colt, was:
The OCLC record import took several minutes. The single record import does complete before the large job completely finishes, but not until it is maybe 75% done. Link

Expected situation now for the snapshot env:
The OCLC import completes earlier than at 75% of the big file import.

Alternatively, I can decrease the max_simultaneous_records and records_threshold params, but for our main envs such as bugfest, perf-rancher, or prod I think we should increase them again.
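For reference, the tuning knobs discussed in this thread map to the two properties below (the values shown are the defaults named above; where exactly they are set depends on the module's deployment configuration, so this fragment is illustrative):

```properties
# Pause the DI_RAW_RECORDS_CHUNK_READ consumer after this many records are read
di.flow.max_simultaneous_records=100
# Resume the consumer after this many DI_COMPLETE/DI_ERROR events
di.flow.records_threshold=50
```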

Ann-Marie Breaux May 11, 2022 at 7:29 AM

Hi, I imported a 500+ record file, and while that job was running, I tried several Inventory Single Record Imports. For each of those Inventory imports, I got the red toast you'll see in the attached screenshot. They eventually completed, but I couldn't really tell whether they were faster than before. Any other ideas on how to test?

Serhii_Nosko May 6, 2022 at 10:51 AM

I think that this related story, https://folio-org.atlassian.net/browse/MODDATAIMP-613,
can be closed after your review of the current story.
To start the performance testing story https://folio-org.atlassian.net/browse/MODSOURMAN-663, which compares results with the flow control feature enabled against our previous perf-rancher results described here, we need perf-rancher running.

Done

Details

Assignee

Reporter

Priority

Story Points

Sprint

Development Team

Folijet

Fix versions

Release

Morning Glory (R2 2022)

TestRail: Cases


TestRail: Runs

Created December 28, 2021 at 1:12 PM
Updated July 5, 2022 at 1:25 PM
Resolved May 16, 2022 at 7:35 AM