Overview
- These tests investigate the performance difference of mod-data-export with logLevel=info vs. logLevel=warn for the Juniper release. The testing was part of MDEXP-394, where we observed that mod-data-export continuously writes a large amount of data to the log. For a 100K data export (DE) job, mod-data-export writes about 42 million log records. This could also cause a crash if not enough CPU and memory are allocated to mod-data-export.
- The mod-data-export log4j2.properties file was modified to set rootLogger.level = warn, logger.netty.level = warn, and status = warn.
Backend:
- mod-data-export-4.1.1 (snapshot version)
- mod-source-record-storage-5.1.4
- mod-source-record-manager-3.1.3
- okapi-4.8.2
- mod-authtoken-2.8.0
Frontend:
- folio_data-export-4.1.0
Environment:
- 8 million inventory records
- 74 FOLIO back-end modules deployed in 144 ECS services
- 3 okapi ECS services
- 12 m5.large EC2 instances
- AWS RDS: 1 writer instance and 1 reader instance, both db.r6g.xlarge
- INFO logging level / WARN logging level
High-Level Summary
- With WARN-level logging, memory utilization improved by 9%.
- With WARN-level logging, no bumps in memory were observed. Memory utilization stays stable across multiple data export job runs.
- With WARN-level logging, about 42 million fewer log records were written.
- How many data-export jobs can run in parallel? Multiple jobs can run in parallel, and up to 2.75 million instance records can be exported at any given time. Exporting more than that, for example 3 million instance records, crashes mod-data-export with an out-of-memory (OOM) error.
Test Runs
1 user - INFO level logging vs WARN level logging
Test | Total instances | Duration | mod-data-export log level | Total time to export all instances, holdings, and items | Total records logged in CloudWatch |
---|---|---|---|---|---|
1. | 100,000 | 1 hour 6 minutes | INFO | 2.97 minutes | 42.5 Million |
2. | 100,000 | 57 minutes | WARN | 18 minutes | 120K |
Total records logged for 100K, INFO vs WARN
INFO - 42.5 Million records
WARN - ~120K records
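To put the two record counts in proportion, the following minimal sketch works out the reduction using only the figures reported above (100,000 instances, 42.5 million records at INFO, roughly 120K at WARN). The function and variable names are illustrative and not part of any FOLIO tooling, and the WARN figure is approximate, so the results are estimates.

```python
# Figures taken from the test-run table above; WARN is reported as "~120K".
INSTANCES = 100_000
INFO_RECORDS = 42_500_000
WARN_RECORDS = 120_000


def log_reduction(info_records: int, warn_records: int) -> dict:
    """Summarize how much log volume drops when moving from INFO to WARN."""
    fewer = info_records - warn_records
    return {
        "fewer_records": fewer,
        "reduction_pct": round(100 * fewer / info_records, 2),
        "ratio": round(info_records / warn_records, 1),
    }


summary = log_reduction(INFO_RECORDS, WARN_RECORDS)

# Average number of log records written per exported instance.
per_instance_info = INFO_RECORDS / INSTANCES
per_instance_warn = WARN_RECORDS / INSTANCES

print(summary)
print(per_instance_info, per_instance_warn)
```

With these inputs, INFO logging writes about 425 log records per exported instance, against about 1.2 at WARN, which is a reduction of roughly 99.7%.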
Memory Utilization INFO vs WARN
CPU Utilization
CPU utilization increases as the number of instances increases from 100 to 500k. CPU reaches its maximum for 100k and 500k instances, but there is not much difference between 1 user and 2 users.
Service Memory Utilization
Memory increases gradually as we start running tests, especially for mod-data-export, and then stabilizes. For all other modules, such as mod-source-record-storage and okapi, memory remains constant between 80% and 100%.
Check how many jobs can run in parallel
Multiple jobs can run in parallel, but data-export fails when trying to export 3 million instance records with the configuration below.
Current memory allocation to mod-data-export service in ECS task definition container:
Soft memory limit - 360 MB
Hard memory limit - 512 MB
Memory utilization gradually increases from 101% to 141% as the number of instance records increases, at which point the service crashes.
Number of inventory instance records (Millions) | Average Memory Utilization (%) |
---|---|
1 | 101.66 |
2 | 102.77 |
2.5 | 124.4 |
2.75 | 136.11 |
3 | 141 (service fails with OOM) |
When trying to export 3M records, POST data-export/file-definitions/d63d8a83-e339-44b2-8a2f-41caaf080221/upload fails with a 503 error.
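The utilization percentages in the table can be translated into approximate memory figures. The sketch below assumes that the reported utilization is measured relative to the 360 MB soft limit. The report does not state this, but values above 100% suggest it. Under that assumption, 141% corresponds to roughly 508 MB, just under the 512 MB hard limit, which would be consistent with the OOM failure at 3 million records. The helper names are illustrative only.

```python
# Limits from the ECS task definition quoted in this report.
SOFT_LIMIT_MB = 360
HARD_LIMIT_MB = 512

# Millions of instance records -> average memory utilization (%), from the table.
UTILIZATION_PCT = {1: 101.66, 2: 102.77, 2.5: 124.4, 2.75: 136.11, 3: 141.0}


def estimated_usage_mb(utilization_pct: float, soft_limit_mb: int = SOFT_LIMIT_MB) -> float:
    """Convert a utilization percentage to MB.

    ASSUMPTION: utilization is reported relative to the soft limit.
    """
    return round(utilization_pct / 100 * soft_limit_mb, 1)


def headroom_mb(utilization_pct: float) -> float:
    """Estimated memory left before the hard limit is reached."""
    return round(HARD_LIMIT_MB - estimated_usage_mb(utilization_pct), 1)


for millions, pct in UTILIZATION_PCT.items():
    used = estimated_usage_mb(pct)
    print(f"{millions}M records: ~{used} MB used, ~{headroom_mb(pct)} MB headroom")
```

If the assumption holds, the estimated headroom shrinks from about 146 MB at 1 million records to under 5 MB at 3 million.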
Appendix
For more raw data from the test run, please see the attached test-report-honeysuckle.xlsx for Honeysuckle.