SPIKE: MDEXP-201 - Progress bar is stuck on 0%

The purpose of this spike is to investigate and propose a technical solution for notifying the user that an export has started but will take a long time to complete when performance is poor (for whatever reason).

Solution

All of the investigated solutions touch on performance improvements made to mod-data-export earlier, so they may affect overall performance.

1) Decrease the batch size defined in InputDataManager for reading UUIDs from the source .csv file, and increase the worker pool size in ExportDataManager. This is the simplest solution in terms of the number of changes. The main idea is to parallelize the mapping process further so that the job execution progress is updated more often. For example, the batch size can be set to 5 instead of 50, and the pool size can be increased from 2 to 20.
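
The effect of the smaller batch size on the number of progress updates can be sketched as follows. This is a minimal illustration, not the module's code: the class BatchedExportSketch, its methods, and the mapAndWrite callback are hypothetical names, and a plain java.util.concurrent thread pool stands in for whatever worker mechanism the module actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical illustration only; not the mod-data-export implementation.
final class BatchedExportSketch {

  // Splits the UUID list into consecutive batches of at most batchSize elements.
  static List<List<String>> toBatches(List<String> uuids, int batchSize) {
    List<List<String>> batches = new ArrayList<>();
    for (int from = 0; from < uuids.size(); from += batchSize) {
      int to = Math.min(from + batchSize, uuids.size());
      batches.add(new ArrayList<>(uuids.subList(from, to)));
    }
    return batches;
  }

  // Runs every batch on a fixed pool of workers and returns the number of
  // progress updates, assuming progress is reported once per finished batch.
  static int export(List<String> uuids, int batchSize, int poolSize,
                    Consumer<List<String>> mapAndWrite) {
    AtomicInteger progressUpdates = new AtomicInteger();
    ExecutorService workers = Executors.newFixedThreadPool(poolSize);
    for (List<String> batch : toBatches(uuids, batchSize)) {
      workers.execute(() -> {
        mapAndWrite.accept(batch);          // placeholder for mapping and writing
        progressUpdates.incrementAndGet();  // one progress update per batch
      });
    }
    workers.shutdown();
    try {
      workers.awaitTermination(1, TimeUnit.MINUTES);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return progressUpdates.get();
  }
}
```

Under these assumptions, 100 UUIDs with a batch size of 50 give 2 progress updates, while a batch size of 5 gives 20, which is why the progress bar would move sooner and more often.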

Pros of the solution:

  • simple implementation
  • doesn't affect the existing mechanism of data export

Cons of the solution:

  • touches the changes made in the scope of user story MDEXP-166, where the worker pool size was increased from 1 to 2
  • a significant increase of the worker pool size may not compensate for the smaller UUID batch size, because performance measurements did not show a gain from increasing the pool size

2) Decrease the SRS load partition size in ExportDataManager and start mapping and writing records as soon as the first partition of SRS and instance (holdings, items) records is received. Since the most likely cause of the performance issues is slow or unreliable connections to source-record-storage and mod-inventory-storage, it makes sense to decrease the number of records retrieved from those modules per request and to handle records as soon as they are received. The job execution progress can then be updated twice: first after writing the underlying SRS records, and again after writing the mapped records. For example, the SRS load partition size and the inventory load partition size can be decreased from 50 to 5.
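
A rough sketch of this flow is shown below. All names are hypothetical (PartitionedExportSketch, loadSrs, loadInventory, onProgress), the loader functions stand in for the HTTP calls to the storage modules, and the sketch assumes that UUIDs without an SRS record are the ones loaded from inventory and mapped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.IntConsumer;

// Hypothetical illustration only; not the mod-data-export implementation.
final class PartitionedExportSketch {

  // Handles the UUIDs partition by partition and reports progress twice per
  // partition. Each loader returns the UUIDs for which a record was found.
  static void export(List<String> uuids, int partitionSize,
                     Function<List<String>, List<String>> loadSrs,
                     Function<List<String>, List<String>> loadInventory,
                     IntConsumer onProgress) {
    for (int from = 0; from < uuids.size(); from += partitionSize) {
      int to = Math.min(from + partitionSize, uuids.size());
      List<String> partition = uuids.subList(from, to);

      // First update: underlying SRS records are written as soon as they arrive.
      List<String> srsFound = loadSrs.apply(partition);
      onProgress.accept(srsFound.size());

      // Second update: the remaining UUIDs are loaded from inventory,
      // mapped, and written (assumption, see the paragraph above).
      List<String> remaining = new ArrayList<>(partition);
      remaining.removeAll(srsFound);
      List<String> mapped = loadInventory.apply(remaining);
      onProgress.accept(mapped.size());
    }
  }
}
```

With a partition size of 5, the first progress update arrives after a request for 5 records rather than 50, at the cost of ten times as many requests for the same input.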

Pros of the solution:

  • doesn't affect the worker pool size or the UUID batch size
  • solves the issue exactly where we have a bottleneck
  • splits the progress update into two steps, after writing the underlying records and after writing the mapped instance records, so progress is still updated when the performance issues affect only mod-inventory-storage or only source-record-storage

Cons of the solution:

  • increases the data export execution time when the connection to other modules is stable and there are no performance issues
  • touches the changes made in the scope of user story MDEXP-207, where partition sizes were increased from 20 to 50


It also makes sense to move the partition sizes to mod-configuration so that they can be changed at runtime.

3) Add a new "in-progress" field that is updated before records are retrieved from SRS. Since the root cause of the stuck progress is the requests to SRS, this approach sets the number of records currently being processed in a separate "in-progress" field at the beginning of the export, before the requests to SRS are sent. On the UI side, a new formula should then be used to calculate the progress percentage:

in-progress / totalRecords * 100

For example, if the total number of records is 100 and the number of records processed per iteration is 50, then 50 records are set as "in-progress". If the export is stuck and the "exported" and "failed" fields are not updated for a long time, the progress percentage is: 50/100*100 = 50%

This does not work as well when the total is less than the batch size for one iteration. For example, if the total number of records is 10 and the number of records processed per iteration is still 50, all 10 records are set as "in-progress" and the progress percentage is: 10/10*100 = 100%. Nevertheless, it shows the user some progress, which is better than progress stuck at 0%.
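
The two cases above can be checked with a small sketch of the formula. The class and method names are hypothetical, and Java is used only for illustration, since the actual calculation would live on the UI side.

```java
// Hypothetical illustration of the "in-progress" formula; not actual module or UI code.
final class ProgressSketch {

  // Number of records marked "in-progress" for the first iteration:
  // the batch size, capped by the total number of records.
  static int inProgress(int totalRecords, int batchSize) {
    return Math.min(totalRecords, batchSize);
  }

  // in-progress / totalRecords * 100, rounded down; 0 when there are no records.
  static int percent(int inProgress, int totalRecords) {
    if (totalRecords <= 0) {
      return 0;
    }
    return (int) (inProgress * 100L / totalRecords);
  }
}
```

Here percent(inProgress(100, 50), 100) evaluates to 50 and percent(inProgress(10, 50), 10) evaluates to 100, matching the two examples.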

We should still update the "failed" and "exported" fields to show general information at the end of the export.

Conclusions

After internal discussion, it was decided to go with the third solution and its brand-new "in-progress" field. The first and second solutions were declined because of the high risk of a significant performance decrease and because they would touch previously made performance improvements.