Implement flow control logic in SRM for Kafka message processing

RCA Group

None

Description

 

Acceptance criteria:

Adjust the existing DataImportKafkaHandler, which accepts DI_COMPLETE and DI_ERROR events, to invoke the new flow control logic.

Implement a Java service that controls the flow of execution by pausing the SRM consumer for the DI_RAW_RECORDS_CHUNK_READ topic.

This new service should:

  •  Store an in-memory counter of how many DI_COMPLETE and DI_ERROR events have been read

  •  Based on the value of the di.flow.max_simultaneous_records property, call pause() on the Kafka consumer: first read 100 records (the default value of the max_simultaneous_records parameter), then invoke pause() on the Kafka consumer. This gives OCLC messages a chance to be published to the DI_RAW_RECORDS_CHUNK_PARSED topic (consumed by SRS) before the next chunk from data import arrives.

  • Based on the value of the di.flow.records_threshold property, call resume() on the Kafka consumer: once the count of DI_COMPLETE and DI_ERROR events reaches 50 (the default value of di.flow.records_threshold), call resume() on the Kafka consumer and read the next chunks of messages from the data-import DI_RAW_RECORDS_CHUNK_READ topic.
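The criteria above can be sketched as a small Java service. This is an illustrative sketch only: the class, method, and interface names (DiFlowControlService, ChunkConsumer, trackChunkReceived, trackRecordCompleted) are hypothetical, not the actual mod-source-record-manager code, and the Kafka consumer is abstracted behind an interface so the pause/resume decision logic can be shown without a broker.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the flow-control logic described in the
// acceptance criteria; names do not reflect the real SRM implementation.
class DiFlowControlService {

    // Abstraction over the Kafka consumer for DI_RAW_RECORDS_CHUNK_READ.
    interface ChunkConsumer {
        void pause();
        void resume();
    }

    private final ChunkConsumer consumer;
    private final int maxSimultaneousRecords; // di.flow.max_simultaneous_records, default 100
    private final int recordsThreshold;       // di.flow.records_threshold, default 50

    private final AtomicInteger recordsRead = new AtomicInteger();      // records read from the chunk topic
    private final AtomicInteger recordsProcessed = new AtomicInteger(); // DI_COMPLETE + DI_ERROR events seen
    private volatile boolean paused = false;

    DiFlowControlService(ChunkConsumer consumer, int maxSimultaneousRecords, int recordsThreshold) {
        this.consumer = consumer;
        this.maxSimultaneousRecords = maxSimultaneousRecords;
        this.recordsThreshold = recordsThreshold;
    }

    // Called when records arrive from DI_RAW_RECORDS_CHUNK_READ.
    // Once maxSimultaneousRecords have been read, pause the consumer so
    // already-read chunks (and OCLC single-record imports) can be processed.
    synchronized void trackChunkReceived(int recordCount) {
        if (recordsRead.addAndGet(recordCount) >= maxSimultaneousRecords && !paused) {
            paused = true;
            consumer.pause();
        }
    }

    // Called by DataImportKafkaHandler for each DI_COMPLETE / DI_ERROR event.
    // Once recordsThreshold records have finished, resume the paused consumer
    // and reset both counters for the next batch.
    synchronized void trackRecordCompleted() {
        if (recordsProcessed.incrementAndGet() >= recordsThreshold && paused) {
            recordsRead.set(0);
            recordsProcessed.set(0);
            paused = false;
            consumer.resume();
        }
    }
}
```

Resetting both counters on resume is one possible policy; the real implementation may instead decrement the counters, which changes behavior when chunks overlap.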

Environment

None

Potential Workaround

None

Attachments

4

Checklist


TestRail: Results

Activity


Serhii_Nosko May 16, 2022 at 3:40 PM

Hi, our Rancher environment is restored to a working state.
You can import OCLC records on it to see real speed close to the bugfest one. When running OCLC imports, re-check the results on the View All page (the Running jobs section updates with some delay). I also attached a 5k file.
Link to our rancher: https://folijet.ci.folio.org/data-import

Ann-Marie Breaux May 16, 2022 at 7:35 AM

Hi all, sounds good - thank you! Closing this issue.

Serhii_Nosko May 11, 2022 at 7:50 AM
Edited

Hi, currently the default value of max_simultaneous_records is 100 and of records_threshold is 50. In other words, when 50 records of a big DI job finish, the OCLC import starts. The snapshot env has limited capacity, so processing these 50 records takes more time than on, for example, perf-rancher, bugfest, or production. So you could trigger a couple of OCLC imports and go to the View All page to check that the OCLC imports finish before the main 500+ record file, to make sure that flow control is working. I am also attaching a 1k file to this story - you could try that file as well.

The previous situation, as described by Jenn Colt, was:
The OCLC record import took several minutes. The single record import does complete before the large job completely finishes, but not until it is maybe 75% done. Link

Expected situation now for the snapshot env:
The OCLC import completes earlier than at 75% of the big file import.

Alternatively, I can decrease the max_simultaneous_records and records_threshold params, but for our main envs such as bugfest, perf-rancher, or prod I think we should increase them again.
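For reference, the tuning knobs discussed in this thread map to the two properties below (the values shown are the defaults named above; where exactly they are set depends on the module's deployment configuration, so this fragment is illustrative):

```properties
# Pause the DI_RAW_RECORDS_CHUNK_READ consumer after this many records are read
di.flow.max_simultaneous_records=100
# Resume the consumer after this many DI_COMPLETE/DI_ERROR events
di.flow.records_threshold=50
```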

Ann-Marie Breaux May 11, 2022 at 7:29 AM

Hi, I imported a 500+ record file, and while that job was running, I tried several Inventory Single Record Imports. For each of those Inventory imports, I got the red toast you'll see in the attached screenshot. They eventually completed, but I couldn't really tell whether they were faster than before. Any other ideas on how to test?

Serhii_Nosko May 6, 2022 at 10:51 AM

I think that this related story, https://folio-org.atlassian.net/browse/MODDATAIMP-613,
can be closed after your review of the current story.
To start the performance testing story https://folio-org.atlassian.net/browse/MODSOURMAN-663, which compares results with the flow control feature enabled against our previous perf-rancher results described here, we need perf-rancher running.

Done

Details

Assignee

Reporter

Priority

Story Points

Sprint

Development Team

Folijet

Fix versions

Release

Morning Glory (R2 2022)

TestRail: Cases


TestRail: Runs

Created December 28, 2021 at 1:12 PM
Updated July 5, 2022 at 1:25 PM
Resolved May 16, 2022 at 7:35 AM