MODINREACH-347 - Record Contribution Errors Should not Cause Contribution Jobs to Halt
- MODINREACH-347, repo mod-inn-reach
Solution Proposal and Recommendation
The concept is to proceed with the following solution:
- Pause and resume contribution jobs (initial and ongoing)
- Introduce a component or mechanism that manages retries to the central server (with intervals, maxAmount, and other parameters) while jobs are paused
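
To make the retry parameters concrete, here is a minimal sketch of how they could be held in one place. The class name `ContributionRetryPolicy`, its method `nextDelay`, and the fixed (non-growing) interval are assumptions made for illustration; the actual retry mechanism is still to be defined.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical retry settings; the fields mirror the "intervals, maxAmount" parameters above. */
final class ContributionRetryPolicy {
    private final Duration interval; // wait between two retries (assumed to be fixed)
    private final int maxAmount;     // maximum number of retries before giving up

    ContributionRetryPolicy(Duration interval, int maxAmount) {
        if (interval.isNegative() || maxAmount < 0) {
            throw new IllegalArgumentException("interval and maxAmount must be non-negative");
        }
        this.interval = interval;
        this.maxAmount = maxAmount;
    }

    /**
     * Returns the delay before the next retry, or empty when the retries are
     * exhausted and the job should remain paused.
     *
     * @param retriesSoFar number of retries already made (0 before the first retry)
     */
    Optional<Duration> nextDelay(int retriesSoFar) {
        return retriesSoFar < maxAmount ? Optional.of(interval) : Optional.empty();
    }

    public static void main(String[] args) {
        ContributionRetryPolicy policy = new ContributionRetryPolicy(Duration.ofSeconds(30), 3);
        for (int retries = 0; retries <= 3; retries++) {
            System.out.println("retries so far: " + retries + ", next delay: " + policy.nextDelay(retries));
        }
    }
}
```

Keeping the parameters in a small immutable object like this would let the initial and ongoing contribution jobs share one policy; a growing interval could be added later without changing callers.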
Option Space
| | Option A | Option B | Option C |
---|---|---|---|
Title | Extend KafkaConsumer using its pause and resume method | Extend springframework.kafka ConcurrentMessageListenerContainer | SpringBatch |
Description | Extending KafkaConsumer | Extending springframework.kafka. We already use org.springframework.kafka, but we take advantage of neither its concurrency features nor the features that allow consumers to be paused and restarted. Our implementation of InitialContributionJobMessageConsumer could hold a single reference to the ConcurrentMessageListenerContainer, and our message processing code could easily be injected into an implementation of AcknowledgingMessageListener. When a message cannot be committed because the central server is unavailable, we could pause the ConcurrentMessageListenerContainer and schedule a retry using a ConcurrentTaskScheduler. References: Javadoc for ConcurrentMessageListenerContainer; example code using ConcurrentMessageListenerContainer to read messages from Kafka with pause and restart; Medium post describing that example repo. | Using Spring Batch. Spring Batch does much more than read messages from Kafka, but it can easily be adapted to that task; see the notes for an example of how to read from Kafka. The reason to strongly consider Spring Batch is that, unlike extending springframework.kafka, it gives us job persistence for free in the form of a JobRepository: when the @EnableBatchProcessing annotation is used, a JobRepository is provided for you. Spring Batch also has features that support our other requirements, such as parallel processing and pausing and restarting job execution based on business requirements. The idea of restarting after other failures is also baked into Spring Batch. |
Pros (Benefit/Effort reduction) | | | |
Cons (Costs/Risks) | | | |
Impact (processes, data, system, timeline) | | | |
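
The pause-and-retry flow described for Option B can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the mod-inn-reach implementation: `PausableContainer` is a hypothetical stand-in for the `pause()` and `resume()` operations of the Spring Kafka listener container, a JDK `ScheduledExecutorService` stands in for `ConcurrentTaskScheduler`, and the availability probe, interval, and retry limit are invented parameters.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the pause()/resume() operations of a listener container. */
interface PausableContainer {
    void pause();
    void resume();
}

/** Pauses consumption when the central server is unavailable and retries on a schedule. */
final class PauseAndRetryCoordinator {
    private final PausableContainer container;
    private final ScheduledExecutorService scheduler;     // stand-in for ConcurrentTaskScheduler
    private final BooleanSupplier centralServerAvailable; // hypothetical availability probe
    private final long retryIntervalMillis;
    private final int maxRetries;

    PauseAndRetryCoordinator(PausableContainer container,
                             ScheduledExecutorService scheduler,
                             BooleanSupplier centralServerAvailable,
                             long retryIntervalMillis,
                             int maxRetries) {
        this.container = container;
        this.scheduler = scheduler;
        this.centralServerAvailable = centralServerAvailable;
        this.retryIntervalMillis = retryIntervalMillis;
        this.maxRetries = maxRetries;
    }

    /** Called by the message listener when a record cannot be sent to the central server. */
    void onCentralServerUnavailable() {
        container.pause(); // stop fetching new records; the job itself is not halted
        scheduleRetry(1);
    }

    private void scheduleRetry(int attempt) {
        scheduler.schedule(() -> {
            if (centralServerAvailable.getAsBoolean()) {
                container.resume();         // server is back: continue consuming
            } else if (attempt < maxRetries) {
                scheduleRetry(attempt + 1); // still down: check again after the interval
            }
            // otherwise the retries are exhausted and the container stays paused
        }, retryIntervalMillis, TimeUnit.MILLISECONDS);
    }
}
```

In this sketch the listener would call `onCentralServerUnavailable()` instead of failing the job, so a central server outage leaves the job paused rather than halted. What should happen once the retries are exhausted is deliberately left open here.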
Recommendation
Of the listed options, Option B appears to be the most appropriate: it addresses the requirements for job resilience and opens the way to performance improvements as part of further quality-improvement work on contribution jobs.
Decision taken
As a result of today's meetings with Steve Ellis and Gurleen Kaur1, it has been decided to proceed with Option B (extend springframework.kafka ConcurrentMessageListenerContainer) as the first part of the solution. The retry mechanism and the impact are still to be defined.