SPIKE - Investigate performance improvements for bulk edit

@Viachaslau Khandramai

@Mikita Siadykh

@Taras Spashchenko

mod-data-export-worker module

One bottleneck was found when launching a job in the mod-data-export-worker module: the BulkEditInstanceProcessor#process method is invoked for every identifier in the CSV file, each instance is fetched from Inventory one by one via the 'inventory/instances' endpoint with a query, and the main thread is blocked until the record is retrieved.

Possible workaround: parallelize fetching instances from Inventory (see the sketch below), or use database views instead of the HTTP endpoint to fetch instances.
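A minimal sketch of the parallel-fetch option, assuming a hypothetical InventoryClient wrapper around the GET 'inventory/instances' call (the interface, record, and pool size below are illustrative, not the module's actual API):

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelInstanceFetcher {

    // Illustrative wrapper around GET inventory/instances?query=id==<identifier>;
    // not the module's actual client interface.
    interface InventoryClient {
        Instance getByIdentifier(String identifier);
    }

    // Illustrative record; the real module maps the full Inventory instance JSON.
    record Instance(String id, String title) {}

    private final InventoryClient client;
    // Bounded pool so the number of concurrent requests to Inventory stays under control.
    private final ExecutorService pool = Executors.newFixedThreadPool(10);

    ParallelInstanceFetcher(InventoryClient client) {
        this.client = client;
    }

    // Fetch all identifiers concurrently instead of blocking on each record in turn.
    List<Instance> fetchAll(List<String> identifiers) {
        List<CompletableFuture<Instance>> futures = identifiers.stream()
            .map(id -> CompletableFuture.supplyAsync(() -> client.getByIdentifier(id), pool))
            .toList();
        return futures.stream().map(CompletableFuture::join).toList();
    }
}

A bounded pool caps the number of concurrent requests against Inventory; the right pool size would need to be measured.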

 

When the file of identifiers is processed, every error is appended to storage as soon as it is found (BulkEditProcessingErrorsService#saveErrorInCSV); this only becomes a problem when there are many errors.

Possible workaround: either accumulate errors in memory and write them to storage once at the end of the job (see the sketch below), or save them to storage from a separate thread so the main thread is not blocked.
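A minimal sketch of the in-memory option (class and method names are illustrative; only BulkEditProcessingErrorsService#saveErrorInCSV comes from the module):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative buffered variant of the error handling done by
// BulkEditProcessingErrorsService#saveErrorInCSV: errors accumulate in memory
// and are written to storage once, when the job finishes.
public class BufferedErrorService {

    private final Queue<String> errors = new ConcurrentLinkedQueue<>();

    // Called wherever saveErrorInCSV is called today; no I/O on the hot path.
    public void recordError(String identifier, String message) {
        errors.add(identifier + "," + message);
    }

    // Called once from an after-job hook: a single write replaces N per-error writes.
    public void flushToStorage(Path errorsCsv) throws IOException {
        Files.write(errorsCsv, errors);
    }
}

The trade-off: buffered errors are lost if the job dies before the flush, while the separate-thread option keeps errors durable at the cost of extra coordination.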

 

A CsvPartitioner is not implemented for bulk edit, which prevents using a task executor to parallelize reading the file.

Possible workaround: implement a CsvPartitioner for bulk edit and register it when the StepBuilder is initialized, then use an async task executor to read the partitions in parallel (see the sketch below).
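A minimal sketch of such a partitioner, assuming the line count of the CSV file is known up front (all names are illustrative):

import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

// Illustrative CsvPartitioner: splits the identifier file into line ranges so
// that each partition can be read by its own worker step in parallel.
public class CsvPartitioner implements Partitioner {

    private final long totalLines; // assumed to be counted once, up front

    public CsvPartitioner(long totalLines) {
        this.totalLines = totalLines;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        long linesPerPartition = (totalLines + gridSize - 1) / gridSize;
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext context = new ExecutionContext();
            context.putLong("startLine", i * linesPerPartition);
            context.putLong("endLine", Math.min((i + 1) * linesPerPartition, totalLines));
            partitions.put("partition" + i, context);
        }
        return partitions;
    }
}

In Spring Batch 5 style this could be wired roughly as new StepBuilder("bulkEditPartitionedStep", jobRepository).partitioner("workerStep", csvPartitioner).step(workerStep).taskExecutor(new SimpleAsyncTaskExecutor()).build(); each worker's reader would then use startLine/endLine from its ExecutionContext to read only its slice of the file.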

mod-bulk-operations module

When instances are updated through the application and the changes are committed, records are updated one by one via PUT to the 'inventory/instances' endpoint, and the main thread is blocked until the previous record has been updated.

Possible workaround: parallelize the commit stage; it may make sense to use Spring Batch here and update records in chunks in parallel (see the sketch below).
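A minimal sketch of parallel chunked updates, assuming a hypothetical InventoryClient wrapper around the PUT call (a plain executor is shown; a Spring Batch multi-threaded or partitioned step would give the same effect plus restartability):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelCommitter {

    // Illustrative wrapper around the PUT to 'inventory/instances'; not the real client.
    interface InventoryClient {
        void updateInstance(Instance instance);
    }

    record Instance(String id, String title) {}

    private static final int CHUNK_SIZE = 100;

    private final InventoryClient client;
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    ParallelCommitter(InventoryClient client) {
        this.client = client;
    }

    // Commit chunks in parallel; within a chunk, records are still updated
    // sequentially, so concurrency is bounded by the pool size.
    void commit(List<Instance> updated) {
        List<CompletableFuture<Void>> inFlight = new ArrayList<>();
        for (int from = 0; from < updated.size(); from += CHUNK_SIZE) {
            List<Instance> chunk = updated.subList(from, Math.min(from + CHUNK_SIZE, updated.size()));
            inFlight.add(CompletableFuture.runAsync(() -> chunk.forEach(client::updateInstance), pool));
        }
        inFlight.forEach(CompletableFuture::join);
    }
}

Since each PUT is independent, chunks can complete in any order; per-record failures would still need to be collected for the error log.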