...

Background

Authority record remapping was implemented for mod-marc-migrations according to the design "MARC records migration (authority)".

Currently, vertical scaling is supported by increasing the chunk size, the chunk processing parallelism, and the resources allocated to the module. In the current implementation, chunk data is prepared and read sequentially; only remapping, saving to file, and the related db queries are done in parallel. All files are uploaded to external storage when the job ends.
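The current flow described above can be illustrated with a minimal sketch. It is not the module's actual code: the class name, `remapChunk`, and the use of plain strings as records are assumptions made for illustration. It only shows chunks being prepared sequentially on one thread while remapping runs on a fixed-size pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SequentialReadParallelRemap {

  // Hypothetical stand-in for remapping one chunk of records.
  public static List<String> remapChunk(List<String> chunk) {
    List<String> result = new ArrayList<>();
    for (String rec : chunk) {
      result.add("remapped:" + rec);
    }
    return result;
  }

  public static List<List<String>> process(List<String> records, int chunkSize, int parallelism) {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      List<Future<List<String>>> futures = new ArrayList<>();
      // Chunks are prepared sequentially on the calling thread...
      for (int from = 0; from < records.size(); from += chunkSize) {
        int to = Math.min(from + chunkSize, records.size());
        List<String> chunk = new ArrayList<>(records.subList(from, to));
        // ...while the remapping of each chunk runs in parallel on the pool.
        Callable<List<String>> task = () -> remapChunk(chunk);
        futures.add(pool.submit(task));
      }
      // Collect results in submission order.
      List<List<String>> out = new ArrayList<>();
      for (Future<List<String>> f : futures) {
        out.add(f.get());
      }
      return out;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(process(List.of("a", "b", "c", "d", "e"), 2, 2));
  }
}
```

In this shape, raising `chunkSize` or `parallelism` only helps on a single instance, which is the vertical-scaling limit noted above.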

...

Theoretically, if two app instances exist, they could process two jobs simultaneously, but only if the load balancer routes the second request to the second app instance.

Purpose

Support distributing chunk processing across app instances.

Solution Options

Option 1. Spring Batch Remote Partitioning

Running manager and worker on the same instance doesn’t work properly. Running manager and worker on different instances is not possible with the current deployment approach.

Overview

Remote partitioning using Spring Batch Integration (https://docs.spring.io/spring-batch/reference/spring-batch-integration/sub-elements.html#remote-partitioning) with Kafka.

...

  1. There’s currently a GitHub issue (https://github.com/spring-projects/spring-batch/issues/4133) related to running multiple jobs simultaneously. The issue reproduces; this was confirmed with a POC. It may be avoided by limiting job execution to one job at a time.

  2. If chunks are submitted to kafka, processing chunks in parallel within one instance would require concurrent consuming.

  3. “Manager” and “worker” are supposed to be separate app instances, e.g. configured using profiles. It would probably be OK to have one manager plus one worker in each app instance. NOT OK: it looks like the spring batch job execution keeps running on the manager that started it and never completes if some worker responses are consumed by another manager.

  4. With a single app instance, remote chunking will most likely be slower than the current solution, so some profile or env variable should probably enable remote chunking only when there are multiple app instances and otherwise fall back to the currently implemented approach. This may be a maintenance problem, but is probably just a configuration question.

  5. Proper handling of folio execution contexts is a problem: the context is easy to put in kafka headers, but no way has been found so far to execute a step in the folio context on the worker side. As a workaround, start the context early in the worker and end it after the response is sent; this should work fine if only one tenant job is launched at a time.
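The workaround from the last item (start the context early in the worker, end it after the response is sent) can be sketched roughly as follows. This is an illustration only and does not use the real folio execution context API: the `ThreadLocal` holder, the `handle` method, and reading the tenant from an `x-okapi-tenant` message header are assumptions.

```java
import java.util.Map;
import java.util.function.Function;

public class WorkerContextSketch {

  // Hypothetical stand-in for a folio execution context bound to the worker thread.
  private static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();

  public static String currentTenant() {
    return CURRENT_TENANT.get();
  }

  // Begins the context from the message headers before the step runs and
  // clears it only after the step has produced its response.
  public static String handle(Map<String, String> headers, String payload,
                              Function<String, String> step) {
    CURRENT_TENANT.set(headers.get("x-okapi-tenant"));
    try {
      return step.apply(payload);
    } finally {
      CURRENT_TENANT.remove();
    }
  }

  public static void main(String[] args) {
    String response = handle(Map.of("x-okapi-tenant", "diku"), "chunk-1",
        payload -> currentTenant() + ":" + payload);
    System.out.println(response);
  }
}
```

Because the context wraps the whole request handling instead of the individual step, this sketch shares the limitation stated above: it is only safe when one tenant job is handled at a time per worker thread.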

...

After the migration operation is created in the db, do either of the following:

  1. Send a kafka message about the operation creation so that some other app instance can start and perform the spring batch job. Multiple jobs may end up in the same kafka partition, which would leave jobs stuck in a queue while other app instances are idle.

  2. Have a scheduled job that checks for created operations, changes the operation status to “in progress”, and runs a spring batch job. This requires synchronization on the operation and a scheduler that checks the db for each tenant.
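The synchronization needed by the scheduled-job variant can be sketched as an atomic status transition, so that only one app instance picks up a given operation. The sketch below is an in-memory stand-in under stated assumptions: the class, the `Status` values, and the map replacing the operations table are hypothetical; in a database the same idea would typically be a conditional update that only succeeds while the status is still the initial one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OperationClaim {

  public enum Status { NEW, IN_PROGRESS }

  // In-memory stand-in for the operations table: operation id -> status.
  private final Map<String, Status> operations = new ConcurrentHashMap<>();

  public void create(String operationId) {
    operations.put(operationId, Status.NEW);
  }

  // Atomically moves NEW -> IN_PROGRESS; returns true only for the caller that won.
  public boolean tryClaim(String operationId) {
    return operations.replace(operationId, Status.NEW, Status.IN_PROGRESS);
  }

  public static void main(String[] args) {
    OperationClaim claims = new OperationClaim();
    claims.create("op-1");
    System.out.println(claims.tryClaim("op-1")); // first scheduler run wins
    System.out.println(claims.tryClaim("op-1")); // a second instance is rejected
  }
}
```

An instance would start the spring batch job only when `tryClaim` returns true, which keeps two schedulers from running the same operation.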

Provides horizontal scaling in a scope of multiple jobs.

...

  1. Implement db checks for chunk processing completion.

  2. Remove the spring batch logic.

  3. Rearrange the processing logic that was tied together by spring batch; it now needs to be run in some service.
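The completion check from step 1 can be sketched as follows. This is a hedged, in-memory illustration only: the class name and `markCompleted` are hypothetical, and the set and counter stand in for chunk status rows in the db. The idea shown is that every instance reports its finished chunk, and exactly one report observes that all chunks are done and can trigger finalization.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ChunkCompletionTracker {

  private final int totalChunks;
  // Stand-ins for chunk status rows in the db.
  private final Set<Integer> completed = ConcurrentHashMap.newKeySet();
  private final AtomicInteger done = new AtomicInteger();

  public ChunkCompletionTracker(int totalChunks) {
    this.totalChunks = totalChunks;
  }

  // Returns true only for the single call that completes the last outstanding chunk.
  public boolean markCompleted(int chunkId) {
    if (!completed.add(chunkId)) {
      return false; // duplicate report, e.g. a redelivered message
    }
    return done.incrementAndGet() == totalChunks;
  }

  public static void main(String[] args) {
    ChunkCompletionTracker tracker = new ChunkCompletionTracker(2);
    System.out.println(tracker.markCompleted(0));
    System.out.println(tracker.markCompleted(1));
  }
}
```

The caller that receives true would run the end-of-job work, such as the file upload, in place of a spring batch job-completion step.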

Option 4. Separate spring batch job for each chunk

It makes no sense to keep spring batch logic just for processing individual chunks.

Overview

Similar approach to Option 3, but preserving most of the spring batch logic.

...

  1. Same as previous, but instead of removing the spring batch logic and rearranging the processing, change the spring batch configuration to have fewer elements and a simpler structure.

Summary

Solution option | Required effort | Benefits | Drawbacks
--------------- | --------------- | -------- | ---------
2 | TODO | Horizontal scaling for multiple jobs | No horizontal scaling in a scope of one job
3 | ~5-6sp | Horizontal scaling for multiple jobs + in a scope of one job. No spring batch db calls/overhead | Amount of effort required