Bulk Edit Items App report [Orchid] 08/03/2023
Overview
This report establishes a performance baseline for Items bulk updates (PERF-406) in the Orchid release, which includes the architectural changes implemented in UXPROD-3842. The goal is to make sure that performance has not deteriorated in comparison with the Nolana release. The following questions and scenarios help determine the performance and stability of the new Bulk Edits implementation:
- How long does it take to export 100, 1000, 10k, and 100k records?
- Can it be used with up to 5 concurrent users?
- Run four jobs consecutively, each editing 10k item records
- Run four jobs simultaneously, each editing 10k item records
- Look for memory trends and CPU usage
Summary
Test report for Bulk Edits items-app functionality 2023-03-08.
The Orchid release is about 30% faster than Nolana for bulk editing of 10k items.
Its stability is approximately the same as Nolana's.
- With 1 concurrent job, 100 records can be edited in 1 min 9 s, which is 3 times slower than in Nolana; 1000 records take approximately the same time as in Nolana (2 min 40 s); and bulk editing of 10k records is about 30% faster.
- With 5 concurrent users and 10k records per user (50k records in total), records can be uploaded and edited in about 20 minutes, which is about 8 minutes faster than in Nolana (about 28 minutes).
- Memory usage of mod-inventory-storage was high at 109% but stable (it was already at 109% before the test). No memory leaks were found.
- CPU usage for mod-users reached up to 125% (5 concurrent jobs updating 10k records each), an increase over Nolana (about 40% with the same configuration). This needs to be investigated in further testing. For all other modules, CPU usage did not exceed 65% in any of the tests.
- For all record counts (100, 1k, 10k) and up to 5 concurrent jobs, RDS CPU utilization did not exceed about 60%.
Recommendations & Jiras
More than 50% of jobs FAILED in about 28-33 min with the error "Connection reset (SocketException)" - PERF-334
From time to time a job FAILED with an error from the S3 bucket - MODBULKOPS-76
The high CPU usage of mod-users (up to 125%) needs to be investigated.
Results
Test Runs
1# One (concurrent) Job
Number of records | Duration | Comments |
---|---|---|
100 | 1 min 9 s | |
1000 | 2 min 36 s | |
10k | 17 min 50 s | |
50k | 1 hour 58 min | |
100k | always FAILED | |
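To compare the single-job durations across record counts, they can be converted into a per-record rate. The following is a minimal Python sketch and not part of the test tooling; the helper names `to_seconds` and `records_per_second` are hypothetical, and the 100k row is left out because those jobs always failed.

```python
import re

# Seconds per unit for the duration strings used in this report.
UNIT_SECONDS = {"hour": 3600, "min": 60, "s": 1}


def to_seconds(text: str) -> int:
    """Convert a duration such as '1 hour 58 min' or '2 min 36 s' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hour|min|s)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def records_per_second(records: int, duration: str) -> float:
    """Average number of records processed per second for one job."""
    return records / to_seconds(duration)


# Durations copied from the single-job table above.
SINGLE_JOB = {
    100: "1 min 9 s",
    1000: "2 min 36 s",
    10_000: "17 min 50 s",
    50_000: "1 hour 58 min",
}

for records, duration in SINGLE_JOB.items():
    rate = records_per_second(records, duration)
    print(f"{records:>6} records: {rate:.1f} records/s")
```

On the figures in the table this gives roughly 1.4 records/s for 100 records, 6.4 for 1000, 9.3 for 10k, and 7.1 for 50k.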
2# Items App 10k records 3, 4, and 5 concurrent jobs
10k records for each job
Number of concurrent jobs | Duration |
---|---|
1 | 17 min 50 s |
3 | 18 min 50 s |
4 | 19 min 10 s |
5 | 20 min 20 s |
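The cost of adding concurrent jobs can be expressed as a slowdown relative to a single job and as aggregate throughput. The sketch below is illustrative only; `scaling_summary` and the constant names are hypothetical, and the durations are copied from the table above.

```python
RECORDS_PER_JOB = 10_000

# Concurrent jobs -> (minutes, seconds), copied from the table above.
DURATIONS = {1: (17, 50), 3: (18, 50), 4: (19, 10), 5: (20, 20)}


def seconds(minutes: int, secs: int) -> int:
    """Convert a (minutes, seconds) pair to seconds."""
    return minutes * 60 + secs


def scaling_summary(durations: dict) -> list:
    """Return slowdown vs. one job and aggregate records/s per concurrency level."""
    baseline = seconds(*durations[1])
    rows = []
    for jobs, (minutes, secs) in sorted(durations.items()):
        elapsed = seconds(minutes, secs)
        rows.append({
            "jobs": jobs,
            "slowdown_pct": (elapsed / baseline - 1) * 100,
            "records_per_s": jobs * RECORDS_PER_JOB / elapsed,
        })
    return rows


for row in scaling_summary(DURATIONS):
    print(f"{row['jobs']} job(s): +{row['slowdown_pct']:.1f}% duration, "
          f"{row['records_per_s']:.1f} records/s in total")
```

With these numbers, five concurrent jobs take about 14% longer than one job while processing about 41 records/s in total, compared with about 9.3 records/s for a single job.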
3# Editing four jobs consecutively with 10k item records each
Job # | Job duration (run 2) | Job duration (run 1) |
---|---|---|
1 | 17 min 47 s | 18 min 49 s |
2 | 17 min 53 s | 18 min 26 s |
3 | 17 min 45 s | 20 min 44 s |
4 | 18 min 5 s | ERROR occurs: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: 5W7F75FMHHH3KDWT, Extended Request ID: 36K8tkhFQHS1Mjt7sZc4jYrBduBWO/psei+33ZIIOnhrytq7Eie3mjDALtBplhZxSJv4CfrZpnw8Z6nqmz03ZB7b3yiRdecyXfZ/ZtEmN4g=) (S3Excepti |
4# 5 Concurrent Item Apps jobs
"BARCODE". Records number per 1 user | Orchid (Total Time) | Nolana (Total Time) | Morning Glory (Total Time) |
---|---|---|---|
100 | 1 min 10 s | 18 s | 25-27 s |
1000 | 2 min 57 s | 3 min | 4 min |
10k | 20 min 20 s | 28 min | 30 min |
25k | 1 hour 3 min | 50 min | |
50k | about 2 hours for successful jobs | - | |
* "-" test was not performed due to errors that occurred
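The Orchid and Nolana columns can be compared as a percent change in total time, where a negative value means Orchid is faster. This is a hedged sketch using only the rows that have exact values for both releases; `percent_change` and `FIVE_USERS` are hypothetical names, not part of any FOLIO tooling.

```python
from datetime import timedelta

# Records per user -> (Orchid total time, Nolana total time), from the table above.
FIVE_USERS = {
    "100": (timedelta(minutes=1, seconds=10), timedelta(seconds=18)),
    "1000": (timedelta(minutes=2, seconds=57), timedelta(minutes=3)),
    "10k": (timedelta(minutes=20, seconds=20), timedelta(minutes=28)),
    "25k": (timedelta(hours=1, minutes=3), timedelta(minutes=50)),
}


def percent_change(new: timedelta, old: timedelta) -> float:
    """Relative change of `new` against `old` in percent (negative = faster)."""
    return (new - old) / old * 100


for records, (orchid, nolana) in FIVE_USERS.items():
    change = percent_change(orchid, nolana)
    print(f"{records:>4} records per user: {change:+.1f}% vs. Nolana")
```

On the table values this yields about -27% for 10k records per user, in line with the "about 30% faster" figure in the summary, while the 100 and 25k rows come out slower in Orchid (roughly +289% and +26%).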
Memory usage
For all test runs
The memory of mod-inventory-storage was high at 109% but stable (It was 109% even before the test). No memory leaks were found.
Instance CPU utilization
Run #1
Run #2 & #3
Service CPU utilization
Run #1
Run #2 & #3
RDS CPU utilization
Run #1
Run #2 & #3
Maximum RDS CPU utilization is 61% for 5 concurrent jobs with 10k item records.
RDS CPU usage increases with the number of concurrent jobs. Based on these results, it appears that the database should handle up to 7 concurrent jobs without issues. The maximum number of jobs will be investigated.
Appendix
Infrastructure
PTF environment ncp5
- 8 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6.xlarge database instances: writer and reader
- MSK ptf-kakfa-3
- 4 kafka.m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Modules memory and CPU parameters:
Module | SoftLimit | XMX | Revision | Version | desiredCount | CPUUnits | RWSplitEnabled | HardLimit | Metaspace | MaxMetaspaceSize |
---|---|---|---|---|---|---|---|---|---|---|
mod-inventory-storage-b | 1952 | 1440 | 3 | mod-inventory-storage:26.1.0-SNAPSHOT.644 | 2 | 1024 | False | 2208 | 384 | 512 |
mod-inventory-b | 2592 | 1814 | 7 | mod-inventory:20.0.0-SNAPSHOT.392 | 2 | 1024 | False | 2880 | 384 | 512 |
okapi-b | 1440 | 922 | 1 | okapi:5.1.0-SNAPSHOT.1352 | 3 | 1024 | False | 1684 | 384 | 512 |
mod-users-b | 896 | 768 | 4 | mod-users:19.2.0-SNAPSHOT.584 | 2 | 128 | False | 1024 | 88 | 128 |
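The memory parameters in the table can be sanity-checked with a short script. The sketch below rests on assumptions that the report does not state: that all values are in the same unit (presumably MiB), that SoftLimit should not exceed HardLimit, and that XMX plus MaxMetaspaceSize should fit within HardLimit. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleLimits:
    """Memory settings for one module, copied from the table above."""
    name: str
    soft_limit: int
    xmx: int
    hard_limit: int
    max_metaspace: int

    def jvm_ceiling(self) -> int:
        # Assumption: heap plus metaspace approximates the JVM's upper bound.
        return self.xmx + self.max_metaspace

    def headroom(self) -> int:
        # Container memory left above the assumed JVM ceiling.
        return self.hard_limit - self.jvm_ceiling()

    def is_consistent(self) -> bool:
        return self.soft_limit <= self.hard_limit and self.headroom() >= 0


MODULES = [
    ModuleLimits("mod-inventory-storage-b", 1952, 1440, 2208, 512),
    ModuleLimits("mod-inventory-b", 2592, 1814, 2880, 512),
    ModuleLimits("okapi-b", 1440, 922, 1684, 512),
    ModuleLimits("mod-users-b", 896, 768, 1024, 128),
]

for module in MODULES:
    status = "ok" if module.is_consistent() else "CHECK"
    print(f"{module.name}: headroom {module.headroom()} ({status})")
```

Under these assumptions all four modules pass the check, with mod-users-b having the smallest headroom (128).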