Overview
Bulk Edits: establish a performance baseline for bulk updates (PERF-408) in the Orchid release, which includes the architectural changes implemented in UXPROD-3842. The goal is to make sure that performance did not deteriorate in comparison to the Nolana release. The following questions help determine the performance and stability of the new Bulk Edits implementation:
- How long does it take to export 100, 1000, 10k, and 100k records?
- Can it be used with up to 5 concurrent users?
- Run four consecutive jobs editing 10k holdings records each
- Run four simultaneous jobs editing 10k holdings records each
- Look for memory trends and CPU usage (a scripted sketch of the concurrency scenarios follows this list)
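The concurrency scenarios above can be scripted. Below is a minimal sketch, assuming a FOLIO/Okapi gateway: POST /authn/login is the standard Okapi authentication endpoint, while the /bulk-operations/upload path, tenant id, credentials, and file names are illustrative placeholders rather than the confirmed mod-bulk-operations API.

```python
# A minimal sketch of the concurrency scenarios above, assuming a FOLIO/Okapi
# gateway. POST /authn/login is the standard Okapi authentication endpoint;
# the /bulk-operations/upload path, tenant id, credentials, and file names
# are illustrative placeholders, not the confirmed mod-bulk-operations API.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

OKAPI_URL = "https://okapi.example.org"  # placeholder gateway URL
TENANT = "perf_tenant"                   # placeholder tenant id


def login(username: str, password: str) -> str:
    resp = requests.post(
        f"{OKAPI_URL}/authn/login",
        json={"username": username, "password": password},
        headers={"x-okapi-tenant": TENANT},
    )
    resp.raise_for_status()
    return resp.headers["x-okapi-token"]


def run_bulk_edit_job(token: str, csv_path: str) -> float:
    # Hypothetical upload call; substitute the real mod-bulk-operations
    # endpoints (and the subsequent preview/commit steps) for your release.
    headers = {"x-okapi-tenant": TENANT, "x-okapi-token": token}
    started = time.monotonic()
    with open(csv_path, "rb") as f:
        resp = requests.post(
            f"{OKAPI_URL}/bulk-operations/upload",
            files={"file": f},
            headers=headers,
        )
    resp.raise_for_status()
    return time.monotonic() - started


token = login("perf_user", "perf_password")  # placeholder credentials
# Five simultaneous jobs, one file per simulated user, as in test run #4.
with ThreadPoolExecutor(max_workers=5) as pool:
    durations = pool.map(
        lambda path: run_bulk_edit_job(token, path),
        [f"holdings_10k_{i}.csv" for i in range(5)],
    )
print([f"{d:.1f} s" for d in durations])
```

The same harness covers the consecutive-job scenario by calling run_bulk_edit_job in a plain loop instead of the thread pool.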
Summary
Test report for Bulk Edits holdings-app functionality, 2023-03-20.
The Orchid release is about 40% slower for holdings bulk editing than Nolana. One possible root cause of the degradation is the long time needed to get a preview of changes (MODBULKOPS-86).
Stability is approximately the same as in Nolana.
- For 1 concurrent job, 100 records can be edited in 1 min 9 s, which is 19 s slower than Nolana (50 s); 1000 records can be edited in 2 min 54 s, which is 44 s slower than Nolana (2 min 10 s); and 10k records bulk editing is about 36% slower.
- 10k records per user with 5 concurrent users (50k records total) can be uploaded and edited in about 22 minutes, which is about 9 min 30 s slower than Nolana (about 12 min 25 s). The slowness could be a result of the changes in UXPROD-3842 and MODBULKOPS-86.
- The memory utilization of mod-bulk-operations increased from 20% to 23% (the service was updated before the test and is probably still reaching a steady state; the memory trend will be investigated in further testing). No memory leaks are suspected for the other modules.
- CPU utilization did not exceed 56% for any module in any test. Compared to Nolana, mod-data-export-worker no longer shows spikes, and the average CPU utilization of the other modules is approximately the same, except nginx-okapi, which is about 15% higher.
- For all record counts (100, 1k, 10k) and up to 5 concurrent jobs, RDS CPU utilization did not exceed 41%, which is better than Nolana (up to 50%).
Recommendations & Jiras
More than 50% of jobs with 10k+ records failed in about 30 min to 1 hour with the error "Connection reset (SocketException)" (PERF-334).
Results
Test Runs
1# One (concurrent) Job
Number of records | Orchid (Total Time) | Nolana (Total Time) |
---|---|---|
100 | 1 min 9 s | 50 s |
1000 | 2 min 54 s | 2 min 10 s |
10k | 19 min 33 s | 12 min 25 s |
100k | 2 hours 17 min | - |
2# Holdings App, 10k records, 1, 4, and 5 concurrent jobs
10k records per job:
Number of concurrent jobs | Duration |
---|---|
1 | 19 min 33 s |
4 | 20 min 40 s |
5 | 22 min 2 s |
3# Four consecutive jobs editing 10k holdings records each
Job # | Job duration |
---|---|
1 | 19 min 23 s |
2 | 19 min 53 s |
3 | 19 min 47 s |
4 | 20 min 5 s |
4# 5 Concurrent Holdings App jobs
Records per user (matched by "BARCODE") | Orchid (Total Time) | Nolana (Total Time) |
---|---|---|
100 | 1 min 9 s | 49 s |
1000 | 3 min 4 s | 2 min 25 s |
10k | 22 min 2 s | 12 min 25 s |
100k | - | - |
* "-": the test was not performed due to errors that occurred
Memory usage
For all test runs
The memory utilization of mod-bulk-operations increased from 20% to 23% (the service was updated before the test and is probably still reaching a steady state; the memory trend will be investigated in further testing).
The memory utilization of mod-data-export-worker was 58% at the beginning of the tests, grew after the tests finished, and then returned to normal, ending at 55% at the end of testing (this looks like garbage-collector activity).
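As a cross-check on these observations, the per-service memory trend can be pulled directly from CloudWatch. Below is a minimal sketch, assuming the modules run as ECS services reporting the AWS/ECS MemoryUtilization metric; the cluster and service names are placeholders for this environment:

```python
# Sketch: fetch the memory-utilization trend of one module from CloudWatch.
# Cluster and service names are placeholders.
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/ECS",
    MetricName="MemoryUtilization",
    Dimensions=[
        {"Name": "ClusterName", "Value": "ncp5"},                 # placeholder
        {"Name": "ServiceName", "Value": "mod-bulk-operations"},  # module under test
    ],
    StartTime=datetime.utcnow() - timedelta(hours=6),
    EndTime=datetime.utcnow(),
    Period=300,  # 5-minute buckets
    Statistics=["Average", "Maximum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], f"avg {point['Average']:.1f}%", f"max {point['Maximum']:.1f}%")
```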
Instance CPU utilization
Did not exceed 17%.
Service CPU utilization
CPU utilization did not exceed 56% for any module in any test.
RDS CPU utilization
Maximum RDS CPU utilization was 41%, observed with 5 concurrent jobs of 10k holdings records each.
The more concurrent jobs are running, the higher the RDS CPU usage. The maximum number of concurrent jobs will be investigated.
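The same CloudWatch approach applies to the database tier: the AWS/RDS namespace exposes CPUUtilization per DB instance. A minimal sketch for extracting the peak value over a test window; the instance identifier is a placeholder:

```python
# Sketch: peak RDS CPU utilization over the last test window.
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "ncp5-writer"}],  # placeholder
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=60,  # 1-minute resolution to catch short spikes
    Statistics=["Maximum"],
)
peak = max((p["Maximum"] for p in resp["Datapoints"]), default=0.0)
print(f"Peak RDS CPU over the window: {peak:.1f}%")
```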
Appendix
Infrastructure
PTF environment: ncp5
- 8 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6.xlarge database instances: writer and reader
- MSK cluster ptf-kafka-3
  - 4 kafka.m5.2xlarge brokers in 2 availability zones
  - Apache Kafka version 2.8.0
  - EBS storage volume per broker: 300 GiB
  - auto.create.topics.enable=true
  - log.retention.minutes=480
  - default.replication.factor=3
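The broker settings above can be verified against the running cluster. A minimal sketch using the confluent-kafka Python client; the bootstrap address and broker id are placeholders, and an MSK cluster will typically also need TLS settings in the client config:

```python
# Sketch: read selected broker configs back from the cluster.
from confluent_kafka.admin import AdminClient, ConfigResource

admin = AdminClient({"bootstrap.servers": "BROKER_HOST:9092"})  # placeholder address
resource = ConfigResource(ConfigResource.Type.BROKER, "1")      # broker id 1

futures = admin.describe_configs([resource])
config = futures[resource].result()  # dict of config name -> ConfigEntry
for key in ("auto.create.topics.enable",
            "log.retention.minutes",
            "default.replication.factor"):
    entry = config.get(key)
    print(key, "=", entry.value if entry else "<not set>")
```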
Modules memory and CPU parameters:
Module | SoftLimit | XMX | Revision | Version | desiredCount | CPUUnits | RWSplitEnabled | HardLimit | Metaspace | MaxMetaspaceSize |
---|---|---|---|---|---|---|---|---|---|---|
mod-inventory-storage-b | 1952 | 1440 | 3 | mod-inventory-storage:26.1.0-SNAPSHOT.644 | 2 | 1024 | False | 2208 | 384 | 512 |
mod-inventory-b | 2592 | 1814 | 7 | mod-inventory:20.0.0-SNAPSHOT.392 | 2 | 1024 | False | 2880 | 384 | 512 |
okapi-b | 1440 | 922 | 1 | okapi:5.1.0-SNAPSHOT.1352 | 3 | 1024 | False | 1684 | 384 | 512 |
mod-users-b | 896 | 768 | 4 | mod-users:19.2.0-SNAPSHOT.584 | 2 | 128 | False | 1024 | 88 | 128 |
mod-data-export-worker | 2600 | 2048 | 3 | mod-data-export-worker:3.0.0-SNAPSHOT.104 | 2 | 1024 | False | 3072 | 384 | 512 |
mod-data-export-spring | 1844 | 1292 | 3 | mod-data-export-spring:2.0.0-SNAPSHOT.67 | 1 | 256 | False | 2048 | 200 | 256 |
mod-bulk-operations | 3864 | 0 | 10 | mod-bulk-operations:1.0.2 | 2 | 400 | False | 4096 | 384 | 512 |
mod-notes | 896 | 322 | 3 | mod-notes:5.1.0-SNAPSHOT.245 | 2 | 128 | False | 1024 | 128 | 128 |
mod-agreements | 2580 | 2048 | 3 | mod-agreements:5.6.0-SNAPSHOT.117 | 2 | 128 | False | 3096 | 384 | 512 |
nginx-okapi | 896 | 0 | 3 | nginx-okapi:2022.03.02 | 2 | 128 | False | 1024 | 0 | 0 |