[Spike] mod-data-export-worker horizontal scaling
| Role | Person | Comments |
|---|---|---|
| Solution Architect | @Taras Spashchenko | |
| Java Lead | @Viachaslau Khandramai (Deactivated) | |
| UI Lead | @Uladislau Samets | |
| Product Owner | @Magda Zacharska | |
Notes:
Risks: the known issue https://folio-org.atlassian.net/browse/MODEXPW-150 can be investigated in parallel with the two tasks below.
https://folio-org.atlassian.net/browse/MODEXPW-178 - Spike: mod-data-export-worker horizontal scaling
1. Local FS → MinIO storage (6-8 SPs)
All operations on the local FS must be migrated to S3-compatible storage. The MinIO client can be used as a provider.
1.1 Adapter layer (3-5 SPs)
The S3 adapter should support the following operations:
| # | FS-Client | S3-Client | Note |
|---|---|---|---|
| 1 | | | Files.createFile(filename) together with new FileWriter(filename) can be replaced by |
| 2 | | | |
| 3 | | | |
| 4 | | | |
| 5 | | | |
| 6 | | | |
| 7 | | | |
| 8 | | | |
| 9 | | | |
| 10 | | | |
| 11 | | | |
| 12 | | | |
| 13 | | | |
| 14 | | | |
| 15 | | | |
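To make the adapter layer more concrete, here is a minimal sketch of what such a contract could look like. Every name in it (`RemoteStorage`, `InMemoryStorage`, `AdapterDemo`, and the method names) is hypothetical and not taken from the module. The in-memory class only stands in for the place where a MinIO-backed implementation would go, so the sketch runs without an S3 endpoint.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.stream.Collectors;

/** Hypothetical adapter contract: the operations callers use instead of java.nio.file.Files. */
interface RemoteStorage {
  void write(String objectName, byte[] content);
  byte[] read(String objectName);
  boolean exists(String objectName);
  List<String> list(String prefix);
  void delete(String objectName);
}

/** In-memory stand-in (assumption); a real implementation would delegate to an S3 client. */
class InMemoryStorage implements RemoteStorage {
  // Sorted map so that list() returns object names in a stable order.
  private final ConcurrentSkipListMap<String, byte[]> objects = new ConcurrentSkipListMap<>();

  @Override
  public void write(String objectName, byte[] content) {
    objects.put(objectName, content.clone());
  }

  @Override
  public byte[] read(String objectName) {
    byte[] content = objects.get(objectName);
    if (content == null) {
      throw new NoSuchElementException("No such object: " + objectName);
    }
    return content.clone();
  }

  @Override
  public boolean exists(String objectName) {
    return objects.containsKey(objectName);
  }

  @Override
  public List<String> list(String prefix) {
    // Object stores have no real folders; a "folder" is just a shared name prefix.
    return objects.keySet().stream()
        .filter(name -> name.startsWith(prefix))
        .collect(Collectors.toList());
  }

  @Override
  public void delete(String objectName) {
    objects.remove(objectName);
  }
}

class AdapterDemo {
  public static void main(String[] args) {
    RemoteStorage storage = new InMemoryStorage();
    storage.write("exports/users.csv", "id,name\n".getBytes(StandardCharsets.UTF_8));
    System.out.println(storage.exists("exports/users.csv"));
    System.out.println(storage.list("exports/"));
    storage.delete("exports/users.csv");
    System.out.println(storage.exists("exports/users.csv"));
  }
}
```

Keeping callers dependent on the interface only would allow the storage provider to be swapped, or replaced by a stand-in in unit tests, without touching the job code.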
1.2 Refactoring (3 SPs)
All FS operations should be moved to the newly implemented S3 adapter.
a) All paths should contain the tenantID as the parent folder name. For this purpose, a corresponding wrapper should be created and used for generating paths.
b) The folder name can be specified as a prefix of the objectName parameter of the putObject method. https://docs.minio.io/docs/java-client-api-reference#putObject https://github.com/minio/minio/issues/2423#issuecomment-239408168
var objectName = String.format("%s/%s/%s", tenantId, folderName, fileName);
minioClient.putObject(bucketName, objectName, inputStream, contentType);
2. Schedulers refactoring (8-13 SPs)
The current scheduling implementation needs to be changed to prevent multiple schedulers from running when several instances are active. In addition, schedulers should be safely restored after a module restart without calls to TenantAPI. This is not in scope of mod-data-export-worker; it will be handled separately in scope of mod-data-export-spring.
TBD: consider a PubSub implementation; connect with Taras on the use of Quartz. Implementation and testing will require coordination with the Firebird, Thunderjet, and Spitfire teams.
3. Shared resources refactoring (3-5 SPs)
3.1 Move the local map that stores JobCommands by ID to the database (by @Oleksandr Bozhko)
When a file is uploaded as part of the Bulk edit functionality, a new Job is created. Based on the created Job, a new JobCommand instance is created and passed to the mod-data-export-worker module through Kafka. The Kafka listener accepts this JobCommand and adds it to a local map. When the Job starts, the JobCommand is retrieved from that map by job id. However, when there is more than one instance of the mod-data-export-worker module, the POST request to start a job can be accepted by a different instance, whose local map will not contain that job id.
Given the above, the local map should be moved to the database so that every instance can look up any JobCommand.
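The following sketch shows why shared storage removes the problem. All names in it are hypothetical: `JobCommand` is reduced to two assumed fields, `JobCommandRepository` is an illustrative contract, and `SharedStoreRepository` uses a map passed in from outside as a stand-in for a database table, so the example runs without a database.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified stand-in for the module's JobCommand; the fields are assumptions. */
final class JobCommand {
  private final UUID id;
  private final String name;

  JobCommand(UUID id, String name) {
    this.id = id;
    this.name = name;
  }

  UUID getId() { return id; }
  String getName() { return name; }
}

/** Hypothetical repository contract that replaces the per-instance local map. */
interface JobCommandRepository {
  void save(JobCommand command);
  Optional<JobCommand> findById(UUID jobId);
}

/** Stand-in for a database-backed repository: the state lives outside any single worker. */
class SharedStoreRepository implements JobCommandRepository {
  private final Map<UUID, JobCommand> table; // plays the role of a database table

  SharedStoreRepository(Map<UUID, JobCommand> table) {
    this.table = table;
  }

  @Override
  public void save(JobCommand command) {
    table.put(command.getId(), command);
  }

  @Override
  public Optional<JobCommand> findById(UUID jobId) {
    return Optional.ofNullable(table.get(jobId));
  }
}

/** One running module instance; it keeps no JobCommand state of its own. */
class WorkerInstance {
  private final JobCommandRepository repository;

  WorkerInstance(JobCommandRepository repository) {
    this.repository = repository;
  }

  /** Called by the Kafka listener when a JobCommand arrives. */
  void onJobCommand(JobCommand command) {
    repository.save(command);
  }

  /** Called when the POST request to start a job reaches this instance. */
  String startJob(UUID jobId) {
    return repository.findById(jobId)
        .map(command -> "started " + command.getName())
        .orElseThrow(() -> new IllegalStateException("Unknown job id: " + jobId));
  }
}

class SharedStoreDemo {
  public static void main(String[] args) {
    Map<UUID, JobCommand> table = new ConcurrentHashMap<>();
    WorkerInstance first = new WorkerInstance(new SharedStoreRepository(table));
    WorkerInstance second = new WorkerInstance(new SharedStoreRepository(table));

    UUID jobId = UUID.randomUUID();
    first.onJobCommand(new JobCommand(jobId, "bulk-edit"));
    // The start request lands on the other instance and still finds the command.
    System.out.println(second.startJob(jobId));
  }
}
```

With a per-instance map, the call on the second instance would fail because only the first instance saw the Kafka message; with the shared store, either instance can serve the start request.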