[Ramsons] [ECS] [Data import] Create MARC authority Records
- 1 Overview
- 2 Summary
- 2.1 Recommendations & Jiras
- 2.2 Test Runs
- 3 Test Results and Comparison
- 4 Cluster resource utilization for Test 1
- 5 Cluster resource utilization for Test 2
- 6 Cluster resource utilization for Test 3
- 7 MSK Cluster
- 8 OpenSearch Service
- 8.1 Maximum CPU utilization percentage for all data nodes, Test 1.
- 8.2 CPU utilization percentage for the master node Test 1.
- 8.3 Maximum CPU utilization percentage for all data nodes Test 2.
- 8.4 CPU utilization percentage for the master node Test 2.
- 8.5 Maximum CPU utilization percentage for all data nodes Test 3.
- 8.6 CPU utilization percentage for the master node Test 3.
- 9 Appendix
- 10 Methodology/Approach
Overview
This document presents performance testing results for Data Import of MARC Authority records using a Create job profile in the Ramsons release on Okapi-based ECS environments (RCON). The tests were conducted with Kafka consolidated topics and file-splitting features enabled.
The performance evaluation covered a range of record counts for a single tenant: 1K, 5K, 10K, 25K, and 50K records. Additionally, we conducted a Data Import test with parallel Check-In/Check-Out simulating 5 virtual users to assess system behavior under concurrent operations, and a parallel data import on 3 tenants.
Current ticket: PERF-979: [Ramsons] [ECS] [Data import] Create MARC authority Records (Closed)
Previous report: [Quesnelia] [ECS] [Data import] Create MARC authority Records
Summary
Data Import tests finished successfully during Tests 1-3.
The Data Import process in Test 1, creating MARC authority records with a Create job profile in the Ramsons release, demonstrates a slight but noteworthy performance improvement compared to the Quesnelia release (Table 1).
The Data Import with parallel Check-In/Check-Out testing, simulating five virtual users, showed that the Ramsons release performed better than Quesnelia.
The test results indicate that five virtual users (5 VU) performing Check-In/Check-Out (CICO) operations do not degrade the performance of the data import process; on the contrary, DI duration slightly decreased.
Response times of CI and CO transactions increased proportionally with the number of imported records (Table 2).
The parallel 50K data import on 3 tenants was successful, but its duration increased by 1.5-3 times compared to a single DI on one tenant (Table 3).
Mod-source-record-manager uses a new approach for inserting data into the records journal: a function on the DB side. Compared to previous results, this version produced about 50 more average active sessions (AAS), but according to the testing results this did not degrade the DI process.
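The DB-side journal insert mentioned above replaces row-by-row INSERTs with one set-based function call per batch. A minimal sketch of the call pattern, based on the query text in MODSOURMAN-1294; the jsonb payload fields and the unqualified function name are illustrative assumptions (in practice the function lives in the tenant's mod_source_record_manager schema and the payload shape is defined by the module):

```sql
-- One round trip inserts a whole batch of journal records.
-- Payload fields shown here are hypothetical examples.
SELECT insert_journal_records(ARRAY[
  '{"action_type": "CREATE", "action_status": "COMPLETED"}'::jsonb,
  '{"action_type": "CREATE", "action_status": "COMPLETED"}'::jsonb
]::jsonb[]);
```

Batching like this reduces network round trips at the cost of longer-running statements, which is consistent with the higher AAS observed during the tests.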
Recommendations & Jiras
During Data Import multi-tenant testing, we observed CPU and memory spikes in mod-permissions on the fs07-2 tenant; the other tenants showed no such problems. This needs investigation (PERF-1074: Investigate CPU and memory spikes on mod-permissions, Open), because the same problem also appeared in other DI multi-tenant testing reports.
Ticket for performance improvement in mod-source-record-manager ("inserting data in the records journal"): MODSOURMAN-1294: Data-import. Slow DB query [SELECT insert_journal_records($1::jsonb[])] (Closed).
Test Runs
Test | Test conditions and short description | Status |
---|---|---|
Test 1 | Tenant: cs00000int. Job profile "KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q"; 1k - 5k - 10k - 25k - 50k with 5-minute pauses between each DI | Completed |
Test 2 | Tenant: cs00000int_001. Job profile "KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q"; 1k - 5k - 10k - 25k - 50k with 5-minute pauses between each DI; Check-In/Check-Out with 5 virtual users | Completed |
Test 3 | Parallel, multi-tenant Data Import | Completed |
Test Results and Comparison
Test №1
Table 1. - Tests with 1k, 5k, 10k, 25k, and 50k record files; DI started on one tenant (cs00000int), with comparative results between Quesnelia and Ramsons.
Number of records | % creates | DI duration | DI duration | DI duration, Orchid | DI duration, Poppy | DI duration, Quesnelia | DI duration, Ramsons | Time diff and % improvement (Quesnelia vs Ramsons) |
---|---|---|---|---|---|---|---|---|
1,000 | 100 | 24 sec | 27 sec | 41 sec | 29 sec | 25 sec | 27 sec | 2 sec, 8% |
5,000 | 100 | 1 min 21 sec | 1 min 15 sec | 1 min 21 sec | 1 min 38 sec | 1 min 23 sec | 1 min 24 sec | 1 sec, 1.2% |
10,000 | 100 | 2 min 32 sec | 2 min 31 sec | 2 min 53 sec | 2 min 53 sec | 2 min 43 sec | 2 min 38 sec | 5 sec, 3.1% |
25,000 | 100 | 11 min 14 sec | 7 min 7 sec | 5 min 42 sec | 6 min 24 sec | 6 min 27 sec | 5 min 24 sec | 1 min 3 sec, 16.3% |
50,000 | 100 | 22 min | 11 min 24 sec | 11 min 11 sec | 13 min 48 sec | 11 min 45 sec | 9 min 42 sec | 2 min 3 sec, 17.4% |
Test 2. DI Central tenant 1k-5K-10K-25K-50K + CI/CO 5 VU.
Table 2. - Comparative Check-In/Check-Out results with and without Data Import between Quesnelia and Ramsons.
Number of records | DI duration with CICO, Poppy | DI duration with CICO, Quesnelia | DI duration with CICO, Ramsons | CI Avg time, Quesnelia | CI Avg time, Ramsons | CI Avg time without DI | CO Avg time, Quesnelia | CO Avg time, Ramsons | CO Avg time without DI |
---|---|---|---|---|---|---|---|---|---|
1,000 | 35 sec | 21 sec | 17 sec | 0.870 sec | 0.642 sec | 0.616 sec | 1.361 sec | 1.231 sec | 1.187 sec |
5,000 | 1 min 41 sec | 1 min 09 sec | 57 sec | 0.878 sec | 0.655 sec | 0.616 sec | 1.772 sec | 1.243 sec | 1.187 sec |
10,000 | 3 min 4 sec | 2 min 17 sec | 1 min 47 sec | 0.955 sec | 0.671 sec | 0.616 sec | 1.905 sec | 1.261 sec | 1.187 sec |
25,000 | 6 min 32 sec | 6 min 20 sec | 4 min 01 sec | 0.970 sec | 0.691 sec | 0.616 sec | 1.920 sec | 1.339 sec | 1.187 sec |
50,000 | 13 min 48 sec | 13 min 49 sec | 9 min 13 sec | 1.040 sec | 0.796 sec | 0.616 sec | 1.907 sec | 1.585 sec | 1.187 sec |
Test №3
Table 3. - Duration of parallel multi-tenant data import on tenants cs00000int, cs00000int_0001, and cs00000int_0002.
Tenant | 50K DI |
---|---|
Central - cs00000int | 27 min 03 sec |
College - cs00000int_0001 | 27 min 18 sec |
Professional - cs00000int_0002 | 15 min 02 sec |
Cluster resource utilization for Test 1
Service CPU Utilization
The image shows CPU consumption during Test 1.
Service memory utilization
Service memory utilization remains consistent across all modules.
DB CPU Utilization
Here are the conclusions drawn from the database CPU usage graph:
For 1k records, the maximum CPU usage was approximately 35%.
For 5k records, the maximum CPU usage reached around 72%.
For 10k records, the maximum CPU usage climbed to about 92%.
For both 25k and 50k records, the maximum CPU usage was around 93%.
DB Connections
Database load
Sliced by SQL
Top SQL queries during test 1
Cluster resource utilization for Test 2
The Check-In/Check-Out test started at about 15:30 and finished at about 16:25.
CICO Response time graph
Response time and throughput were stable during the 1-hour CICO test with 5 VU. The error rate was ~0.02%.
Service CPU Utilization
The image shows CPU consumption during Test 2.
Service memory utilization
Service memory utilization remains consistent across all modules.
DB CPU Utilization
Here are the conclusions drawn from the database CPU usage graph:
For 1k records, the maximum CPU usage was approximately 28%.
For 5k records, the maximum CPU usage reached around 76%.
For 10k records, the maximum CPU usage climbed to about 86%.
For both 25k and 50k records, the maximum CPU usage was around 86%.
DB Connections
In the idle state, the number of connections was ~1100; during CICO 5 VU + 50K DI it was ~1520.
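Connection counts like these can be cross-checked on the PostgreSQL side with the standard pg_stat_activity catalog view; a minimal sketch (the view and its state column are standard PostgreSQL, but the exact counts will depend on monitoring timing):

```sql
-- Total connections per state (active, idle, idle in transaction, ...)
SELECT state, count(*) AS connections
FROM pg_stat_activity
GROUP BY state
ORDER BY connections DESC;
```

Counting only rows with state = 'active' approximates the average active sessions (AAS) metric referenced in the Summary.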
Database load
Sliced by SQL
Top SQL queries during test 2
Cluster resource utilization for Test 3
Service CPU Utilization
The image shows CPU consumption during Test 3.
Service memory utilization
Service memory utilization remains consistent across all modules.
DB CPU Utilization
The maximum CPU usage was approximately 83%.
DB Connections
In the idle state, the number of connections was ~1200; during Test 3 it was ~1587.
Database load
Sliced by SQL
Top SQL queries during test 3
MSK Cluster
MSK Cluster resource utilization for Test 1
CPU (User) usage by the broker reached a maximum of 64% during the 50k DI.
Disk usage by broker
MSK Cluster resource utilization for Test 2
CPU (User) usage by the broker reached a maximum of 63% during the 50k and 25k DI with CICO.
Disk usage by broker
MSK Cluster resource utilization for Test 3
CPU (User) usage by the broker reached a maximum of 65% during the 50k DI.
Disk usage by broker
OpenSearch Service
Maximum CPU utilization percentage for all data nodes, Test 1.
CPU utilization percentage for the master node Test 1.
Maximum CPU utilization percentage for all data nodes Test 2.
CPU utilization percentage for the master node Test 2.
Maximum CPU utilization percentage for all data nodes Test 3.
CPU utilization percentage for the master node Test 3.
Appendix
Infrastructure
PTF environment: RCON
11 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
db.r6.xlarge database instance (writer)
MSK fse-test
4 kafka.m7g.xlarge brokers in 2 zones
Apache Kafka version 3.7.x (KRaft mode)
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
OpenSearch 2.13 ptf-test cluster
r6g.2xlarge.search 4 data nodes
r6g.large.search 3 dedicated master nodes
Cluster Resources - rcon-pvt
Inventory size
Methodology/Approach
DI test scenario: a data import job profile that creates new MARC authority records for non-matches (Job Profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q) was started from the UI on the Ramsons (RCON) ECS environment with the file-splitting feature enabled.
Action for non-matches: Create MARC authority record
Test set
Test 1: Manually tested 1k, 5k, 10k, 25k, and 50k record files; DI started on one tenant (cs00000int) only.
Test 2: Manually tested 1k, 5k, 10k, 25k, and 50k record files; DI started on one tenant (cs00000int_0001) only, plus Check-In/Check-Out (CICO) for 5 concurrent users.
Test 3: Manually tested a 50k record file; DI started on 3 tenants concurrently.
To get data import durations, the following SQL query was used:

```sql
SELECT (completed_date - started_date) AS duration, *
FROM {tenant}_mod_source_record_manager.job_execution
WHERE subordination_type = 'COMPOSITE_PARENT'
  AND job_profile_name = 'KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q'
ORDER BY started_date DESC
LIMIT 15;
```
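For the multi-tenant run (Test 3), the same query can be extended across tenant schemas. A sketch, assuming each tenant has an identically structured job_execution table in its own mod_source_record_manager schema (as the per-tenant query above implies):

```sql
-- Compare the most recent COMPOSITE_PARENT import duration per tenant
SELECT 'cs00000int' AS tenant, completed_date - started_date AS duration
FROM cs00000int_mod_source_record_manager.job_execution
WHERE subordination_type = 'COMPOSITE_PARENT'
UNION ALL
SELECT 'cs00000int_0001', completed_date - started_date
FROM cs00000int_0001_mod_source_record_manager.job_execution
WHERE subordination_type = 'COMPOSITE_PARENT'
UNION ALL
SELECT 'cs00000int_0002', completed_date - started_date
FROM cs00000int_0002_mod_source_record_manager.job_execution
WHERE subordination_type = 'COMPOSITE_PARENT'
ORDER BY duration DESC;
```

A filter on started_date (or the job_profile_name predicate from the query above) can be added to restrict the output to the Test 3 window.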