Overview
This document contains the results of testing Data Import (DI) for MARC bibliographic records in the Orchid release, to establish a baseline for the ocp3 environment (PERF-662).
Summary
- DI duration correlates with the number of records imported (100k records: 32 min; 250k: 1 hour 33 min; 500k: 3 hours 33 min).
- Multitenant DI completed successfully with up to 9 jobs in parallel. Large jobs start one by one, in order, within each tenant, but are processed in parallel across the 3 tenants. A small DI job (1 record) can finish sooner, out of order.
- Check-In/Check-Out response time roughly doubles during DI (Check-In from 0.517 s to 1.138 s, Check-Out from 0.796 s to 1.552 s).
- The increase in memory utilization was due to the scheduled cluster shutdown. No memory leak is suspected for DI modules.
- Average CPU usage of mod-di-converter-storage during the 500k-record Create test was about 462%; for all other modules it did not exceed 150%. CPU usage of mod-data-import spikes up to 400% at the beginning of Data Import jobs.
- DB CPU usage reaches approximately 95%.
Recommendations and Jiras
It is recommended to increase CPU units for mod-di-converter-storage to 512.
Results
Test # | Scenario | Job profile | Duration ocp3 | Results |
---|---|---|---|---|
1 | 100K MARC Create | PTF - Create 2 | 32-33 minutes | Completed |
1 | 250K MARC Create | PTF - Create 2 | 1 hour 33 min - 1 hour 57 min | Completed |
1 | 500K MARC Create | PTF - Create 2 | 3 hours 33 min | Completed |
2 | Multitenant MARC Create (100k, 50k, and 1 record) | PTF - Create 2 | 3 hours 1 min | Completed |
3 | CI/CO + DI MARC Create (20 users CI/CO, 25k records DI on 3 tenants) | PTF - Create 2 | 24 min | Completed * |
* - One record on one tenant could be discarded with error: io.netty.channel.StacklessClosedChannelException
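The Test 1 durations above can be turned into an average throughput figure, which makes the scaling behaviour easier to compare across file sizes. The sketch below is only illustrative arithmetic on the reported numbers. The names `TEST1_RUNS`, `records_per_minute`, and `throughput_table` are hypothetical, and for the 250k run the lower bound of the reported range (1 hour 33 min) is assumed.

```python
# Durations in minutes, taken from the Test 1 rows of the results table.
# Assumption: for the 250k run the lower bound of the reported range is used.
TEST1_RUNS = {
    100_000: 32,   # 32 min
    250_000: 93,   # 1 hour 33 min
    500_000: 213,  # 3 hours 33 min
}


def records_per_minute(records: int, minutes: int) -> float:
    """Average import throughput for a single DI job."""
    return records / minutes


def throughput_table(runs: dict[int, int]) -> dict[int, float]:
    """Map each file size to its throughput, rounded to one decimal."""
    return {n: round(records_per_minute(n, m), 1) for n, m in runs.items()}


if __name__ == "__main__":
    for size, rate in throughput_table(TEST1_RUNS).items():
        print(f"{size:>7} records: {rate} records/min")
```

With these inputs the throughput falls from about 3,125 records/min for 100k records to about 2,347 records/min for 500k, so duration grows somewhat faster than the record count.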
Test #3 With CI/CO 20 users and DI 25k records on each of the 3 tenants
Test#3 | Response time with DI (s) | Response time without DI (s) |
---|---|---|
Check-In | 1.138 | 0.517 |
Check-Out | 1.552 | 0.796 |
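As a quick check on how much DI slows CI/CO, the ratio of the two columns can be computed directly. This is a minimal sketch using only the values in the table above. `RESPONSE_TIMES`, `slowdown`, and `slowdown_report` are hypothetical names introduced for illustration.

```python
# Response times in seconds, copied from the Check-In/Check-Out table.
RESPONSE_TIMES = {
    "Check-In": {"with_di": 1.138, "without_di": 0.517},
    "Check-Out": {"with_di": 1.552, "without_di": 0.796},
}


def slowdown(with_di: float, without_di: float) -> float:
    """Ratio of response time under DI load to the baseline response time."""
    return with_di / without_di


def slowdown_report(times: dict[str, dict[str, float]]) -> dict[str, float]:
    """Slowdown factor per operation, rounded to two decimals."""
    return {
        op: round(slowdown(t["with_di"], t["without_di"]), 2)
        for op, t in times.items()
    }


if __name__ == "__main__":
    for op, factor in slowdown_report(RESPONSE_TIMES).items():
        print(f"{op}: {factor}x slower during DI")
```

Both factors come out close to 2 (about 2.2 for Check-In and 1.95 for Check-Out).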
Test#3 | DI Duration with CI/CO | DI Duration without CI/CO* |
---|---|---|
Tenant _1 | 20 min | 14 min (18 min for run 2) |
Tenant _2 | 19 min | 16 min (18 min for run 2) |
Tenant _3 | 16 min | 16 min (15 min for run 2) |
* - DI was tested with the same approach but without CI/CO: 3 DI jobs in total, one on each of the 3 tenants. The second job starts after the first one reaches 30% completion, and the third job starts on the third tenant after the first job reaches 60% completion. DI file size: 25k records.
Memory Utilization
The increase in memory utilization was due to the scheduled cluster shutdown. No memory leak is suspected for DI modules.
MARC BIB CREATE
Test#1 100k, 250k, 500k records DI
Test#2 Multitenant DI (9 concurrent jobs)
Test#3 With CI/CO
Service CPU Utilization
MARC BIB CREATE
Average CPU usage of mod-di-converter-storage during the 500k-record Create test was about 462%; for all other modules it did not exceed 150%. CPU usage of mod-data-import spikes up to 400% at the beginning of Data Import jobs.
Test#1 250k, 500k records DI
Test#2 Multitenant
Test#3 With CI/CO
Instance CPU Utilization
Test#1 250k, 500k records DI
Test#2 Multitenant DI (9 concurrent jobs)
RDS CPU Utilization
MARC BIB CREATE
DB CPU usage reaches approximately 95%.
Test#1 250k, 500k records DI
Test#2 Multitenant DI (9 concurrent jobs)
Test#3 With CI/CO
RDS Database Connections
MARC BIB CREATE
For the DI Create job, the connection count was 520.
Test#1 250k, 500k records DI
Test#2 Multitenant
Test#3 With CI/CO
Appendix
Infrastructure ocp3
Records count:
- tenant0_mod_source_record_storage.marc_records_lb = 9674629
- tenant2_mod_source_record_storage.marc_records_lb = 0
- tenant3_mod_source_record_storage.marc_records_lb = 0
- tenant0_mod_source_record_storage.raw_records_lb = 9604805
- tenant2_mod_source_record_storage.raw_records_lb = 0
- tenant3_mod_source_record_storage.raw_records_lb = 0
- tenant0_mod_source_record_storage.records_lb = 9674677
- tenant2_mod_source_record_storage.records_lb = 0
- tenant3_mod_source_record_storage.records_lb = 0
- tenant0_mod_source_record_storage.marc_indexers = 620042011
- tenant2_mod_source_record_storage.marc_indexers = 0
- tenant3_mod_source_record_storage.marc_indexers = 0
- tenant0_mod_source_record_storage.marc_indexers with field_no 010 = 3285833
- tenant2_mod_source_record_storage.marc_indexers with field_no 010 = 0
- tenant3_mod_source_record_storage.marc_indexers with field_no 010 = 0
- tenant0_mod_source_record_storage.marc_indexers with field_no 035 = 19241844
- tenant2_mod_source_record_storage.marc_indexers with field_no 035 = 0
- tenant3_mod_source_record_storage.marc_indexers with field_no 035 = 0
- tenant0_mod_inventory_storage.authority = 4
- tenant2_mod_inventory_storage.authority = 0
- tenant3_mod_inventory_storage.authority = 0
- tenant0_mod_inventory_storage.holdings_record = 9592559
- tenant2_mod_inventory_storage.holdings_record = 16
- tenant3_mod_inventory_storage.holdings_record = 16
- tenant0_mod_inventory_storage.instance = 9976519
- tenant2_mod_inventory_storage.instance = 32
- tenant3_mod_inventory_storage.instance = 32
- tenant0_mod_inventory_storage.item = 10787893
- tenant2_mod_inventory_storage.item = 19
- tenant3_mod_inventory_storage.item = 19
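The report does not say how these counts were collected. One plausible way to reproduce them is to run a `COUNT(*)` query per schema and table. The sketch below only builds the SQL strings. It assumes the `<tenant>_<module>.<table>` naming shown in the list above and assumes that `field_no` is a text column of `marc_indexers`. `TENANTS`, `TABLES`, `INDEXER_FIELDS`, and `count_queries` are hypothetical names.

```python
# Tenants, modules, and tables as they appear in the record count list.
TENANTS = ["tenant0", "tenant2", "tenant3"]
TABLES = {
    "mod_source_record_storage": [
        "marc_records_lb",
        "raw_records_lb",
        "records_lb",
        "marc_indexers",
    ],
    "mod_inventory_storage": ["authority", "holdings_record", "instance", "item"],
}
# MARC fields counted separately in marc_indexers.
INDEXER_FIELDS = ["010", "035"]


def count_queries() -> list[str]:
    """Build one COUNT(*) statement per line of the record count list."""
    queries = []
    for module, tables in TABLES.items():
        for table in tables:
            for tenant in TENANTS:
                queries.append(f"SELECT COUNT(*) FROM {tenant}_{module}.{table};")
    for field in INDEXER_FIELDS:
        for tenant in TENANTS:
            # Assumption: field_no is stored as text, hence the quoted literal.
            queries.append(
                f"SELECT COUNT(*) FROM {tenant}_mod_source_record_storage"
                f".marc_indexers WHERE field_no = '{field}';"
            )
    return queries


if __name__ == "__main__":
    for query in count_queries():
        print(query)
```

On a table the size of `marc_indexers` (over 620 million rows on tenant0) an exact count can be slow, so these queries are better run outside a test window.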
PTF - environment ocp3
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances, one reader and one writer

Name | API Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|---|
R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |

- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- Kafka topics partitioning: 2 partitions for DI topics
Methodology/Approach
Test 1: Manually tested DI of 100k, 250k, and 500k record files, started one by one on a single tenant.
Test 2: Manually tested DI of 100k, 50k, and 1 record files, started simultaneously on each of the 3 tenants (9 jobs total).
Test 3: Run CI/CO on one tenant and DI jobs on 3 tenants, including the one that runs CI/CO. Start the second job after the first one reaches 30% completion, and start the third job on the third tenant after the first job reaches 60% completion. CI/CO: 20 users; DI file size: 25k records.
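The staggered start rule in Test 3 can be expressed as a small decision function. The sketch below is not the tooling used for the test. It only models the rule "start job 2 at 30% and job 3 at 60% of the first job's progress". The tenant labels, `START_THRESHOLDS`, `jobs_due`, and `simulate` are hypothetical, and the progress samples are made-up inputs for illustration.

```python
# Progress (in percent) of the first DI job at which each tenant's job starts.
# Tenant labels are placeholders, not the real tenant IDs.
START_THRESHOLDS = [
    ("tenant_1", 0),   # first job starts immediately
    ("tenant_2", 30),  # second job starts when the first reaches 30%
    ("tenant_3", 60),  # third job starts when the first reaches 60%
]


def jobs_due(first_job_progress: float, started: list[str]) -> list[str]:
    """Return tenants whose DI job should start at the given progress."""
    return [
        tenant
        for tenant, threshold in START_THRESHOLDS
        if first_job_progress >= threshold and tenant not in started
    ]


def simulate(progress_samples: list[float]) -> list[tuple[str, float]]:
    """Replay progress readings and record when each job would be started."""
    started: list[str] = []
    log: list[tuple[str, float]] = []
    for progress in progress_samples:
        for tenant in jobs_due(progress, started):
            started.append(tenant)
            log.append((tenant, progress))
    return log


if __name__ == "__main__":
    # Hypothetical progress readings of the first job, in percent.
    for tenant, progress in simulate([0, 10, 25, 31, 45, 62, 100]):
        print(f"start DI on {tenant} at {progress}% of the first job")
```

Because progress is usually polled rather than observed continuously, a job starts at the first reading at or above its threshold, which may be slightly later than exactly 30% or 60%.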