[Ramsons] [non-ECS] [Data import] Create MARC authority Records
- 1 Overview
- 2 Summary
- 2.1 Recommendations & Jiras
- 2.2 Test Runs
- 3 Test Results and Comparison
- 4 Cluster resource utilization for Test 1
- 5 Cluster resource utilization for Test 2
- 6 Cluster resource utilization for Test 3
- 7 MSK Cluster
- 8 OpenSearch Service
- 8.1 Maximum CPU utilization percentage for all data nodes, Test 1
- 8.2 CPU utilization percentage for the master node, Test 1
- 8.3 Maximum CPU utilization percentage for all data nodes, Test 2
- 8.4 CPU utilization percentage for the master node, Test 2
- 8.5 Maximum CPU utilization percentage for all data nodes, Test 3
- 8.6 CPU utilization percentage for the master node, Test 3
- 9 Appendix
- 10 Methodology/Approach
Overview
This document presents performance testing results for Data Import of MARC Authority records using a create job profile in the Ramsons release on an Okapi-based non-ECS environment (rcp1). The tests were conducted with Kafka consolidated topics and the file-splitting feature enabled.
The performance evaluation covered a range of record counts on a single tenant: 1K, 5K, 10K, 25K, and 50K records. Additionally, we ran a data import with parallel Check-In/Check-Out simulating 5 virtual users to assess system behavior under concurrent operations, as well as parallel data imports on 3 tenants simultaneously.
Current ticket: https://folio-org.atlassian.net/browse/PERF-971
Summary
Data import tests completed successfully in Tests 1 through 3.
In Test 1, data import of MARC Authority records using a create job profile in the Ramsons release showed a slight but noteworthy performance improvement over the Quesnelia release: our analysis indicates an average DI MARC Authority speed increase of approximately 5%.
The data import and parallel Check-In/Check-Out test, simulating five virtual users, showed that the Ramsons release was slower than the Quesnelia release on the 25,000- and 50,000-record imports. This suggests that a background autovacuum was running during Test 2 on the Ramsons environment, which affected both the average CICO response time and the last two imports (25K and 50K).
The test results indicate that five virtual users (5 VU) performing Check-In/Check-Out (CICO) operations do not affect the performance of the data import process.
Recommendations & Jiras
During data-import multi-tenant testing, we observed spikes in mod-permissions on the fs07-2 tenant; on the other tenant there were no such problems. This needs investigation (https://folio-org.atlassian.net/browse/PERF-1074), as the same problem appeared in other DI multi-tenant testing reports.
SRM released a new approach for inserting data into the records journal that uses a function on the DB side. Compared to previous reports, it consumed about 50 more average active sessions (AAS); a ticket for performance improvement has been filed: https://folio-org.atlassian.net/browse/MODDATAIMP-1177
Test Runs
| Test | Test conditions and short description | Status |
|---|---|---|
| Test 1 | Tenant: fs0900000. Job profile "KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q". 1K, 5K, 10K, 25K, 50K with 5-minute pauses between each DI | Completed |
| Test 2 | Tenant: fs0900000. Job profile "KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q". 1K, 5K, 10K, 25K, 50K with 5-minute pauses between each DI, plus Check-In/Check-Out with 5 virtual users | Completed |
| Test 3 | Parallel, multi-tenant Data Import | Completed |
Test Results and Comparison
Test №1
Table 1. - Test with 1K, 5K, 10K, 25K, and 50K record files; DI started on one tenant, fs09000000 (rcp1-00); comparative results between Quesnelia and Ramsons.
| Number of records | DI duration | DI duration | DI duration (Quesnelia) | DI duration (Ramsons) | Time diff and % improvement |
|---|---|---|---|---|---|
| 1,000 | 41 sec | 29 sec | 22 sec | 21 sec | 1 sec, 4.55% |
| 5,000 | 1 min 21 sec | 1 min 38 sec | 1 min 19 sec | 1 min 04 sec | 15 sec, 18.99% |
| 10,000 | 2 min 53 sec | 2 min 53 sec | 2 min 36 sec | 2 min 36 sec | 0 sec, 0% |
| 25,000 | 100 | 5 min 42 sec | 6 min 24 sec | 5 min 55 sec | 29 sec, 7.55% |
| 50,000 | 11 min 11 sec | 13 min 48 sec | 11 min 59 sec | 10 min 31 sec | 88 sec, 12.23% |
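As a sanity check, the last column can be recomputed from the two durations being compared (the assumption here, consistent with the figures, is that the comparison is between the Quesnelia and Ramsons durations); a minimal sketch with the 50K row hard-coded:

```python
def improvement(quesnelia_sec: int, ramsons_sec: int) -> tuple[int, float]:
    """Return (time difference in seconds, percentage improvement
    relative to the Quesnelia duration)."""
    diff = quesnelia_sec - ramsons_sec
    pct = diff / quesnelia_sec * 100
    return diff, round(pct, 2)

# 50K row: Quesnelia 11 min 59 sec, Ramsons 10 min 31 sec
diff, pct = improvement(11 * 60 + 59, 10 * 60 + 31)
print(diff, pct)  # 88 12.24 (the table truncates this to 12.23%)
```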
Test №2
Test with 5 concurrent CICO users and DI of 1K, 5K, 10K, 25K, and 50K records, started on one tenant, fs09000000 (rcp1-00).
Table 2. - Comparative baseline Check-In/Check-Out results without Data Import, Quesnelia vs. Ramsons.
| Operation | Avg time without DI (Quesnelia) | 95th pct without DI (Quesnelia) | Avg time without DI (Ramsons) | 95th pct without DI (Ramsons) |
|---|---|---|---|---|
| Check-In | 511 ms | 593 ms | 835 ms | 934 ms |
| Check-Out | 876 ms | 1117 ms | 1115 ms | 1323 ms |
Table 3. - Comparative Check-In/Check-Out results between the baseline (Ramsons, Table 2) and Check-In/Check-Out plus Data Import (Ramsons).
| Number of records | DI duration with CICO | DI duration without CICO | CI time Avg, sec | CO time Avg, sec | CI vs baseline | CO vs baseline |
|---|---|---|---|---|---|---|
| 1,000 | 20 sec | 28 sec | 1.064 | 1.238 | +27% | +11% |
| 5,000 | 1 min 19 sec | 1 min 09 sec | 0.906 | 1.242 | +8.5% | +11% |
| 10,000 | 2 min 35 sec | 2 min 27 sec | 0.974 | 1.239 | +16.6% | +11% |
| 25,000 | 6 min 26 sec | 6 min 54 sec | 1.105 | 1.368 | +32.3% | +23% |
| 50,000 | 12 min 47 sec | 15 min 47 sec | 1.331 | 1.548 | +59% | +39% |
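The "vs baseline" columns can be reproduced from the Ramsons baseline averages in Table 2 (CI 835 ms, CO 1115 ms); a minimal sketch:

```python
CI_BASELINE_MS = 835   # Ramsons Check-In avg without DI, Table 2
CO_BASELINE_MS = 1115  # Ramsons Check-Out avg without DI, Table 2

def degradation_pct(observed_ms: float, baseline_ms: float) -> float:
    """Percentage increase of the observed response time over baseline."""
    return (observed_ms / baseline_ms - 1) * 100

# 1K row: CI avg 1.064 sec = 1064 ms
print(round(degradation_pct(1064, CI_BASELINE_MS)))  # 27
# 50K row: CI avg 1.331 sec = 1331 ms
print(round(degradation_pct(1331, CI_BASELINE_MS)))  # 59
```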
Test №3
Table 4. - Duration of parallel multi-tenant data import on tenants fs09000000 (rcp1-00) and fs07000001 (rcp1-01).
| Tenant | 50K DI | 25K DI |
|---|---|---|
| fs09000000 | 36 min 28 sec | 18 min 32 sec |
| fs07000001 | 36 min 38 sec | 25 min 26 sec |
Cluster resource utilization for Test 1
Service CPU Utilization
The image shows CPU consumption during Test 1.
Service memory utilization
Service memory utilization remains consistent across all modules.
DB CPU Utilization
Here are the conclusions drawn from the database CPU usage graph:
For 1k records, the maximum CPU usage was approximately 30%.
For 5k records, the maximum CPU usage reached around 50%.
For 10k records, the maximum CPU usage climbed to about 80%.
For both 25k and 50k records, the maximum CPU usage was around 90%.
DB Connections
Database load
Sliced by SQL
Top SQL queries during test 1
Cluster resource utilization for Test 2
The Check-In/Check-Out test started at approximately 13:20 and finished at 14:20.
CICO Response time graph
Service CPU Utilization
The image shows CPU consumption during Test 2.
Service memory utilization
Service memory utilization remains consistent across all modules.
DB CPU Utilization
Here are the conclusions drawn from the database CPU usage graph:
For 1k records, the maximum CPU usage was approximately 38%.
For 5k records, the maximum CPU usage reached around 76%.
For 10k records, the maximum CPU usage climbed to about 82%.
For both 25k and 50k records, the maximum CPU usage was around 90%.
DB Connections
In the idle state the number of connections was ~1100; during CICO with 5 VU plus the 50K DI it rose to ~1390.
Database load
Sliced by SQL
Top SQL queries during test 2
Cluster resource utilization for Test 3
Service CPU Utilization
The image shows CPU consumption during Test 3.
Service memory utilization
Service memory utilization remains consistent across all modules.
DB CPU Utilization
The maximum CPU usage was approximately 83%.
DB Connections
In the idle state the number of connections was ~1200; during Test 3 it rose to ~1587.