EHoldings export report [Morning Glory]

Overview

Per PERF-267, test eHoldings exports (PERF-273) of 10K records to understand the workflow behavior before and when the mod-data-export-worker task crashes, if it crashes at all.

  • How long does it take to export 10K records?

  • What happens to the job that is running: will it be able to resume and complete successfully when the new task is spun up?

  • Look for a memory trend and use it to decide on the number of concurrent jobs needed to reach the tipping point.  



Infrastructure

  • 10 m6i.2xlarge EC2 instances (changed from Lotus, which used m5.xlarge)

  • 2 db.r6.xlarge database instances: one reader and one writer

  • MSK

    • 4 m5.2xlarge brokers in 2 zones

    • auto.create.topics.enable = true

    • log.retention.minutes=120

    • 2 partitions per DI topic



Software Versions

  • mod-data-export-worker v1.4.1

  • mod-data-export-spring v1.4.1

  • mod-agreements v5.2.0

  • mod-notes v3.1.0



Results

Summary 

  • This is the initial test report for the eHoldings export functionality.

  • Approximately 10K records can be exported in 30 minutes (we tested with 9,631 titles from package eholdings/packages/53-1094073, and the export took 27 minutes).

  • A 2K export completed in roughly 4 minutes.

  • The system is unstable and often fails during the export with the symptoms of MODEXPW-170:

    • The job is marked completed and the file download link is active (as for job #000034 in the screenshot below); the exported file can be downloaded and contains the full data set.

    • Start time and End time become equal.

    • In the DB the job status remains "in progress".

    • This status never changes and blocks all subsequent jobs (they get the status "scheduled") until you restart mod-data-export-worker and mod-data-export-spring and explicitly change the job status in the DB to "FAILED".

  • Memory trend: memory is growing. When the mod-data-export-worker container starts, memory usage is at about 15%; after finishing a 10K export it is at about 29%. It is hard to determine a longer-term trend because jobs usually get stuck and the container has to be restarted to proceed.

  • Failover: the mod-data-export-worker container was restarted to simulate a crash. The ongoing job got stuck with the result described above (the download link is active, but the job stays in progress forever).
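
The stuck-job symptom (status still "in progress" in the DB while the start and end times are equal) can be detected mechanically. Below is a minimal sketch of that check and the manual "mark as FAILED" workaround, operating on in-memory job records. The field names (`status`, `started_at`, `ended_at`) and status strings are assumptions for illustration; the real mod-data-export-spring schema may differ.

```python
STUCK_STATUS = "IN_PROGRESS"  # assumed status string; real value may differ

def find_stuck_jobs(jobs):
    """Return jobs showing the MODEXPW-170 symptom:
    status still IN_PROGRESS while start and end time are equal."""
    return [
        job for job in jobs
        if job["status"] == STUCK_STATUS
        and job["started_at"] is not None
        and job["started_at"] == job["ended_at"]
    ]

def fail_stuck_jobs(jobs):
    """Mark stuck jobs FAILED so queued ('scheduled') jobs can proceed,
    mirroring the manual DB workaround described above."""
    for job in find_stuck_jobs(jobs):
        job["status"] = "FAILED"
    return jobs
```

In practice this would be an UPDATE against the jobs table, run after restarting mod-data-export-worker and mod-data-export-spring, as described above.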









This screenshot shows three export attempts.





Notable observations

  • There is no way to track export progress, e.g. how many records have been transferred so far.

  • There is no way to check how big the exported file is; only the job ID and job time contain useful information.

  • The UI does not update by itself, only after a page reload, which means additional calls to the back end and more resource usage.
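
Because the UI only refreshes on page reload, any progress tracking currently has to be done by polling the job status manually. A minimal polling sketch is below; `fetch_status` is a hypothetical callable that returns the job's current status string (the real endpoint and status names are assumptions, not the module's documented API).

```python
import time

TERMINAL_STATUSES = {"SUCCESSFUL", "FAILED"}  # assumed terminal states

def wait_for_job(fetch_status, timeout_s=1800, poll_interval_s=10):
    """Poll fetch_status() until the job reaches a terminal status
    or the timeout expires; return the last observed status."""
    deadline = time.monotonic() + timeout_s
    status = fetch_status()
    while status not in TERMINAL_STATUSES and time.monotonic() < deadline:
        time.sleep(poll_interval_s)
        status = fetch_status()
    return status
```

Note that with the stuck-job symptom described above, such a poller would simply run until its timeout, since the DB status never leaves "in progress".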