Comment: Update existing solution with POC comments, add other options

Background

Authority records remapping was implemented for mod-marc-migrations according to the design MARC records migration (authority)

...

Support chunks processing distribution between app instances.

...

Option 1. Spring Batch Remote Partitioning

Overview

Remote partitioning using Spring Batch Integration https://docs.spring.io/spring-batch/reference/spring-batch-integration/sub-elements.html#remote-partitioning with Kafka.

With this approach there is a batch job “manager“ that constructs chunks when a job is submitted and sends the chunk metadata to Kafka. Consumers (batch job “workers“) read and process the chunks, write/upload the file, and return processing result metadata to Kafka, where the “manager“ later consumes it to complete the job.

Provides horizontal scaling within the scope of one job.
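The manager/worker exchange described above can be sketched in plain Java. This is only an illustration of the message flow, not Spring Batch code: the two in-memory queues stand in for the Kafka request and reply topics, and `ChunkRequest`, `ChunkReply` and `RemotePartitioningSketch` are hypothetical names that do not exist in mod-marc-migrations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical message payloads; the real ones are not defined in this document.
record ChunkRequest(long jobId, int chunkId, int fromRecord, int toRecord) {}
record ChunkReply(long jobId, int chunkId, int processedRecords) {}

final class RemotePartitioningSketch {

  // Manager side: split the record range into chunk metadata.
  static List<ChunkRequest> partition(long jobId, int totalRecords, int chunkSize) {
    List<ChunkRequest> chunks = new ArrayList<>();
    int chunkId = 0;
    for (int from = 0; from < totalRecords; from += chunkSize) {
      int to = Math.min(from + chunkSize, totalRecords);
      chunks.add(new ChunkRequest(jobId, chunkId++, from, to));
    }
    return chunks;
  }

  // Worker side: take a request, "process" the chunk, send back result metadata.
  static Runnable worker(BlockingQueue<ChunkRequest> requests, BlockingQueue<ChunkReply> replies) {
    return () -> {
      try {
        while (true) {
          ChunkRequest request = requests.take();
          int processed = request.toRecord() - request.fromRecord(); // placeholder for read/process/write
          replies.put(new ChunkReply(request.jobId(), request.chunkId(), processed));
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // stop when the pool is shut down
      }
    };
  }

  // Manager side: publish all chunks, then wait for one reply per chunk to complete the job.
  static int runJob(long jobId, int totalRecords, int chunkSize, int workerCount) {
    BlockingQueue<ChunkRequest> requests = new LinkedBlockingQueue<>(); // stands in for the request topic
    BlockingQueue<ChunkReply> replies = new LinkedBlockingQueue<>();    // stands in for the reply topic
    ExecutorService workers = Executors.newFixedThreadPool(workerCount);
    try {
      for (int i = 0; i < workerCount; i++) {
        workers.submit(worker(requests, replies));
      }
      List<ChunkRequest> chunks = partition(jobId, totalRecords, chunkSize);
      requests.addAll(chunks);
      int processed = 0;
      for (int i = 0; i < chunks.size(); i++) {
        ChunkReply reply = replies.poll(5, TimeUnit.SECONDS);
        if (reply == null) {
          throw new IllegalStateException("Timed out waiting for a worker reply");
        }
        processed += reply.processedRecords();
      }
      return processed;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for replies", e);
    } finally {
      workers.shutdownNow();
    }
  }
}
```

The job completes only when the manager has received a reply for every chunk it published, which is why replies consumed elsewhere would leave the job running.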

Notes/pitfalls

  1. There’s currently a GitHub issue https://github.com/spring-projects/spring-batch/issues/4133 connected to running multiple jobs simultaneously. The issue reproduces, as confirmed with the POC. It may be avoided by limiting job execution to one at a time.

  2. If chunks are submitted to Kafka, processing chunks in parallel within one instance would require concurrent consuming.

  3. “manager“/“worker“ are supposed to be separate app instances, e.g. using profiles for configuration. Having one manager plus one worker in each app instance is NOT ok: it looks like the Spring Batch job execution keeps running on the manager that started it and never completes if some worker responses are consumed by another manager.

  4. With a single app instance, remote partitioning will most likely be slower than the current solution, so some profile/env variable should probably be present to enable remote partitioning only when there are multiple app instances; otherwise the currently implemented approach is used.

Alternative solution ideas that require moving away from Spring Batch

  1. A scheduler that polls for chunks that need processing; this will require some locking on chunks.

  2. Fire a single Kafka event about the migration start and have it consumed by all app instances; this will require some locking on chunks.

  3. Fire chunk events in Kafka. The app instance responsible for firing these events creates a scheduled job for each migration to monitor processing status. Alternatively, a db trigger could update the operation status when chunks are processed.

  4. It will probably be possible to send chunk processing requests directly to other app instances on the Eureka platform. This may be a maintenance problem, though it is probably just a configuration question.

  5. Problem with properly handling FOLIO execution contexts: the context is easy to put in Kafka headers, but no way has been found so far to execute a step in a FOLIO context on the worker side. As a workaround, start the context early in the worker and end it after the response is sent; this should work fine if only one tenant job is launched at a time.
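The workaround from the last item (start the context when the worker receives a request, end it after the response is sent) might look roughly like the sketch below. `TenantContext` is a hypothetical thread-bound stand-in, not the FOLIO execution context API, and the `x-okapi-tenant` header name is an assumption about what the manager would put into the Kafka headers.

```java
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical stand-in for a FOLIO execution context bound to the current thread.
final class TenantContext implements AutoCloseable {
  private static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();

  private TenantContext() {}

  // Begin the context from message headers (header name is an assumption).
  static TenantContext begin(Map<String, String> headers) {
    String tenant = headers.get("x-okapi-tenant");
    if (tenant == null) {
      throw new IllegalArgumentException("Missing tenant header");
    }
    CURRENT_TENANT.set(tenant);
    return new TenantContext();
  }

  static String currentTenant() {
    return CURRENT_TENANT.get();
  }

  // End the context so the thread can be reused for another tenant.
  @Override
  public void close() {
    CURRENT_TENANT.remove();
  }
}

final class WorkerListenerSketch {

  // The context spans the whole request: step execution AND sending the reply.
  static void onRequest(Map<String, String> headers, Supplier<String> step, Consumer<String> replySender) {
    try (TenantContext ignored = TenantContext.begin(headers)) {
      String result = step.get();   // placeholder for executing the worker step
      replySender.accept(result);   // reply is sent while the context is still active
    }
  }
}
```

Because the context is bound to the consuming thread for the whole request, this only holds while one tenant job is handled at a time per consumer thread, which matches the limitation noted above.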

Spring Batch also brings additional database operations that duplicate logic already present in the feature design, such as migration/job statuses, chunks, etc., which brings us to other solution ideas.

Required effort

  1. We will probably need two separate app deployments: one deployment with one instance for the manager and one deployment with multiple instances for the workers.

  2. On the development side: mostly Spring Batch configuration, some changes to existing code, and the addition of Kafka.

Option 2. Async spring batch job start

Overview

After the migration operation is created in the db, do either of the following:

  1. Send a Kafka message about the operation creation so that some other app instance can start/perform the Spring Batch job. Multiple jobs may end up in the same Kafka partition, which would leave jobs stuck in a queue while other app instances may not be busy.

  2. Have a scheduled job that checks for created operations, changes the operation status to “in progress“ and runs a Spring Batch job. This requires synchronization on the operation and a scheduler that checks the db for each tenant.

Provides horizontal scaling in a scope of multiple jobs.
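The synchronization needed for the scheduler variant can be reduced to an atomic "claim": only the instance that manages to switch the operation from its initial status to “in progress“ starts the job. The sketch below shows the idea with an in-memory map standing in for the operations table; `OperationClaimSketch` and the `NEW`/`IN_PROGRESS` status names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical status names; the real operation statuses may differ.
enum OperationStatus { NEW, IN_PROGRESS }

final class OperationClaimSketch {
  // In-memory stand-in for the migration operations table.
  private final Map<String, OperationStatus> operations = new ConcurrentHashMap<>();

  void create(String operationId) {
    operations.put(operationId, OperationStatus.NEW);
  }

  // Atomic compare-and-set: succeeds for exactly one caller per operation.
  // Returns false if the operation is missing or was already claimed.
  boolean tryClaim(String operationId) {
    return operations.replace(operationId, OperationStatus.NEW, OperationStatus.IN_PROGRESS);
  }
}
```

In a database the same effect could presumably be achieved with a conditional update that only matches rows still in the initial status and by checking the affected row count, so that a scheduler on another instance skips the operation.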

Required effort

  1. In case of Kafka: add producing/consuming, with minimal changes to the existing codebase.

  2. In case of a scheduler: add a client for tenant retrieval and a scheduler that walks through all tenant schemas, with minimal changes to existing code.

Option 3. Async processing without spring batch

Overview

After chunk objects are constructed, send them (or a lightweight version with only the required info) to Kafka. Each chunk can then be processed by a different app instance.

Requires some mechanism to finish the migration: either check the db for the total/processed number of records after processing each chunk, or have a service create a scheduled job that checks the database periodically and cancels itself once the migration is finished, e.g. as demonstrated in Scheduled self-cancelling task example.

Provides horizontal scaling within the scope of one job and could provide horizontal scaling across multiple jobs. Clearer, more isolated FOLIO context interactions. No additional db calls related to Spring Batch.
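A minimal sketch of the second mechanism, a scheduled job that cancels itself once the migration is finished, using only `java.util.concurrent`. `MigrationMonitorSketch`, `isFinished` and `onFinished` are hypothetical names; in a real service `isFinished` would be the periodic database check.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class MigrationMonitorSketch {
  private final ScheduledExecutorService scheduler;

  MigrationMonitorSketch(ScheduledExecutorService scheduler) {
    this.scheduler = scheduler;
  }

  // Polls isFinished at a fixed delay; once it returns true, runs onFinished
  // and cancels its own schedule so no further polls happen.
  ScheduledFuture<?> monitor(BooleanSupplier isFinished, Runnable onFinished, long periodMillis) {
    // Holder for the task's own handle; completed right after scheduling.
    CompletableFuture<ScheduledFuture<?>> self = new CompletableFuture<>();
    ScheduledFuture<?> handle = scheduler.scheduleWithFixedDelay(() -> {
      if (isFinished.getAsBoolean()) {
        onFinished.run();            // e.g. mark the migration as completed
        self.join().cancel(false);   // cancel future runs without interrupting this one
      }
    }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    self.complete(handle);
    return handle;
  }
}
```

Runs scheduled with a fixed delay never overlap, so `onFinished` is invoked once before the schedule is cancelled. The same service could also be the place to limit how many migrations are monitored or processed in parallel.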

Required effort

  1. Implement db checks on chunk processing completion OR a service that creates scheduled jobs for migrations, which could also hold the logic for limiting the number of jobs processed in parallel.

  2. Remove Spring Batch logic.

  3. Add Kafka setup.

  4. Rearrange the processing logic that was tied together by Spring Batch; it now needs to run in some service.

Option 4. Separate spring batch job for each chunk

Overview

Similar approach to Option 3, but preserving most of the Spring Batch logic.

After chunk preparation, send the chunk/chunkId to Kafka. Consumers start a Spring Batch job for each chunk, which keeps most of the existing Spring Batch read/process/write logic.

Requires same mechanism to finish migration as Option 3.
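The first of those mechanisms, comparing total and processed record counts after each chunk, can be sketched as below. An atomic counter stands in for the processed-records column in the db, and `MigrationProgressSketch` is a hypothetical name; the point is that exactly one chunk completion observes the transition to "all records processed" and can finish the migration.

```java
import java.util.concurrent.atomic.AtomicLong;

final class MigrationProgressSketch {
  private final long totalRecords;
  // Stand-in for a processed-records counter kept in the database.
  private final AtomicLong processedRecords = new AtomicLong();

  MigrationProgressSketch(long totalRecords) {
    this.totalRecords = totalRecords;
  }

  // Called after each chunk. Returns true only for the call whose records
  // moved the counter from below the total to the total (or beyond),
  // so a single caller gets to mark the migration as finished.
  boolean recordsProcessed(long chunkRecordCount) {
    long after = processedRecords.addAndGet(chunkRecordCount);
    long before = after - chunkRecordCount;
    return before < totalRecords && after >= totalRecords;
  }
}
```

With several consumers finishing chunks concurrently, the atomic increment keeps the check race-free; a db-based version would need the increment and the comparison to happen in one atomic statement or transaction.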

Provides horizontal scaling within the scope of one job and could provide horizontal scaling across multiple jobs. Clearer, more isolated FOLIO context interactions. Still has Spring Batch db interactions.

Required effort

  1. Same as Option 3, but instead of removing the Spring Batch logic and rearranging the processing, change the Spring Batch configuration to have fewer elements and a simpler structure.