Overview
This document contains the results of testing the Data Import Splitting Feature for MARC Bibliographic records in the Orchid release, to establish a baseline for the ocp3 environment.
Splitting feature documentation: Detailed Release Notes for Data Import Splitting Feature
Summary
- DI duration correlates with the number of records imported (100k records: 38 min, 250k: 1 hour 32 min, 500k: 3 hours 29 min).
- Multitenant DI completed successfully with up to 9 jobs in parallel. Large jobs start one by one, in order, within each tenant but are processed in parallel across the 3 tenants. A small DI (1 record) can finish sooner, out of order. Check-In/Check-Out duration roughly doubles during DI.
- Memory utilization grows because the modules had been restarted beforehand (the cluster is shut down daily); no memory leak is suspected in any of the modules.
- Average CPU usage was 144% for mod-inventory and about 107% for mod-di-converter-storage; for all other modules it did not exceed 100%. CPU usage of mod-data-import spikes up to 260% at the beginning of Data Import jobs.
- DB CPU usage reaches approximately 95%.
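The correlation between duration and record count can be checked by converting the reported durations into throughput. The following is a minimal Python sketch that uses only the three durations quoted in the summary; the helper name `throughput_per_min` is illustrative and not part of any FOLIO tooling.

```python
# Durations reported in the summary, in minutes, keyed by records imported.
runs = {
    100_000: 38,            # 38 min
    250_000: 60 + 32,       # 1 hour 32 min
    500_000: 3 * 60 + 29,   # 3 hours 29 min
}


def throughput_per_min(records: int, minutes: int) -> float:
    """Return the average number of records imported per minute."""
    return records / minutes


throughputs = {n: throughput_per_min(n, m) for n, m in runs.items()}

for n, rate in throughputs.items():
    print(f"{n:>7} records: {rate:7.0f} records/min")
```

With these inputs the throughput stays between roughly 2,400 and 2,700 records per minute, which is consistent with duration growing approximately linearly with file size.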
Recommendations and Jiras
Results
Test # | Scenario | Job profile | Splitting Feature Enabled: Duration | Results | Splitting Feature Disabled: Duration | Results | Before Splitting Feature released: Duration | Results |
---|---|---|---|---|---|---|---|---|
1 | 100K MARC Create | PTF - Create 2 | 37 min - 39 min | Completed | 40 min | Completed | 32 min - 33 min | Completed |
1 | 250K MARC Create | PTF - Create 2 | 1 hour 32 min | Completed | 1 hour 41 min | Completed | 1 hour 33 min - 1 hour 57 min | Completed |
1 | 500K MARC Create | PTF - Create 2 | 3 hours 29 min | Completed* | 3 hours 55 min | Completed | 3 hours 33 min | Completed |
2 | Multitenant MARC Create (100k, 50k, and 1 record) | PTF - Create 2 | 2 hours 40 min | Completed* | 3 hours 1 min | Completed | | |
3 | CI/CO + DI MARC Create (20 users CI/CO, 25k records DI on 3 tenants) | PTF - Create 2 | 24 min | Completed* | | | | |
4 | 100K MARC Update (Create new file) | PTF - Updates Success - 1 | 58 min 25 sec; 57 min 19 sec | Completed | 1 hour 3 min | Completed | - | - |
4 | 250K MARC Update | PTF - Updates Success - 1 | 2 hours 2 min **; 2 hours 12 min | Completed with errors **; Completed | 1 hour 53 min | Completed | - | - |
4 | 500K MARC Update | PTF - Updates Success - 1 | 4 hours 43 min; 4 hours 38 min | Completed; Completed | 5 hours 59 min | Completed | - | - |
* - One record on one tenant could be discarded with error: io.netty.channel.StacklessClosedChannelException ?
** - up to 10 items were discarded with the error: io.vertx.core.impl.NoStackTraceThrowable: Cannot get actual Item by id: org.folio.inventory.exceptions.InternalServerErrorException: Access for user 'data-import-system-user' (f3486d35-f7f7-4a69-bcd0-d8e5a35cb292) requires permission: inventory-storage.items.item.get
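To compare runs with the splitting feature enabled and disabled, the textual durations in the results table can be converted to minutes. The sketch below is one way to do that in Python; `to_minutes` and `relative_change` are hypothetical helper names, and the parser only handles a single duration per call (ranges and cells holding two runs need to be split first).

```python
import re


def to_minutes(text: str) -> float:
    """Convert a duration such as '3 hours 29 min' or '58 min 25 sec' to minutes."""
    units = {"hour": 60.0, "min": 1.0, "sec": 1.0 / 60.0}
    total = 0.0
    # Match a number followed by a unit word that starts with hour/min/sec.
    for value, unit in re.findall(r"(\d+)\s*(hour|min|sec)", text):
        total += int(value) * units[unit]
    return total


def relative_change(enabled: str, disabled: str) -> float:
    """Percent change of the 'enabled' duration relative to 'disabled'."""
    e, d = to_minutes(enabled), to_minutes(disabled)
    return (e - d) / d * 100.0


# 500K MARC Create and 250K MARC Update (second run), taken from the table.
create_500k = relative_change("3 hours 29 min", "3 hours 55 min")
update_250k = relative_change("2 hours 12 min", "1 hour 53 min")
print(f"500K create: {create_500k:+.1f}%")
print(f"250K update: {update_250k:+.1f}%")
```

A negative value means the run with splitting enabled was shorter. For the two rows used here, the 500K create is about 11% shorter with splitting enabled, while the second 250K update run is about 17% longer.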
Data Import Robustness Enhancement
25K records | RECORDS_PER_SPLIT_FILE | ||||||||
Number of concurrent tenants | Job profile | 500 | 1K | 5K | 10K | ||||
---|---|---|---|---|---|---|---|---|---|
1 Tenant | PTF - Create 2 | 12 minutes 55 seconds | Completed | ||||||
10 minutes 31 seconds | Completed | ||||||||
2 Tenants | PTF - Create 2 | 19 minutes 29 seconds | Completed | ||||||
18 minutes 19 seconds | Completed | ||||||||
3 Tenants | PTF - Create 2 | 24 minutes 15 seconds | Completed | ||||||
24 minutes 38 seconds | Completed |
Test #3 With CI/CO 20 users and DI 25k records on each of the 3 tenants
Test#3 | Duration with DI | Duration without DI |
---|---|---|
Check-In | 1.138 | 0.517 |
Check-Out | 1.552 | 0.796 |
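The effect of DI on Check-In/Check-Out can be expressed as a slowdown factor. A short Python sketch using the values from the table above follows; the values are taken as-is, in whatever unit the table reports (the source does not state it), and the variable names are illustrative.

```python
# Values copied from the table above: (duration with DI, duration without DI).
response_times = {
    "Check-In": (1.138, 0.517),
    "Check-Out": (1.552, 0.796),
}

# Ratio of the duration under DI load to the duration without DI.
slowdown = {
    op: with_di / without_di
    for op, (with_di, without_di) in response_times.items()
}

for op, factor in slowdown.items():
    print(f"{op}: {factor:.2f}x the duration measured without DI")
```

Both factors come out close to 2, which matches the summary statement that Check-In/Check-Out duration roughly doubles during DI.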
Test#3 | DI Duration with CI/CO | DI Duration without CI/CO* |
---|---|---|
Tenant _1 | 20 min | 14 min (18 min for run 2) |
Tenant _2 | 19 min | 16 min (18 min for run 2) |
Tenant _3 | 16 min | 16 min (15 min for run 2) |
* - DI was tested with the same approach: 3 DI jobs in total on 3 tenants, without CI/CO. The second job was started after the first one reached 30%, and the job on the third tenant was started after the first job reached 60% completion. DI file size: 25k
Memory Utilization
Memory utilization grows because the modules had been restarted beforehand (the cluster is shut down daily); no memory leak is suspected for the DI modules.
MARC BIB CREATE
Test#1 100k, 250k, 500k records DI
Test#2 Multitenant DI (9 concurrent jobs)
Test#3 With CI/CO
Service CPU Utilization
MARC BIB CREATE
Average CPU usage was 144% for mod-inventory and about 107% for mod-di-converter-storage; for all other modules it did not exceed 100%. CPU usage of mod-data-import spikes up to 260% at the beginning of Data Import jobs.
Test#1 500k records DI
Test#2 Multitenant
Test#3 With CI/CO
Instance CPU Utilization
Test#1 500k records DI
Test#2 Multitenant DI (9 concurrent jobs)
RDS CPU Utilization
MARC BIB CREATE
DB CPU usage reaches approximately 95%.
Test#1 500k records DI
Test#2 Multitenant DI (9 concurrent jobs)
Test#3 With CI/CO
RDS Database Connections
MARC BIB CREATE
For the DI Create job, the connection count was 535.
Test#1 500k records DI
Test#2 Multitenant
Test#3 With CI/CO
Appendix
Infrastructure ocp3
Record counts:
- tenant0_mod_source_record_storage.marc_records_lb = 9674629
- tenant2_mod_source_record_storage.marc_records_lb = 0
- tenant3_mod_source_record_storage.marc_records_lb = 0
- tenant0_mod_source_record_storage.raw_records_lb = 9604805
- tenant2_mod_source_record_storage.raw_records_lb = 0
- tenant3_mod_source_record_storage.raw_records_lb = 0
- tenant0_mod_source_record_storage.records_lb = 9674677
- tenant2_mod_source_record_storage.records_lb = 0
- tenant3_mod_source_record_storage.records_lb = 0
- tenant0_mod_source_record_storage.marc_indexers = 620042011
- tenant2_mod_source_record_storage.marc_indexers = 0
- tenant3_mod_source_record_storage.marc_indexers = 0
- tenant0_mod_source_record_storage.marc_indexers with field_no 010 = 3285833
- tenant2_mod_source_record_storage.marc_indexers with field_no 010 = 0
- tenant3_mod_source_record_storage.marc_indexers with field_no 010 = 0
- tenant0_mod_source_record_storage.marc_indexers with field_no 035 = 19241844
- tenant2_mod_source_record_storage.marc_indexers with field_no 035 = 0
- tenant3_mod_source_record_storage.marc_indexers with field_no 035 = 0
- tenant0_mod_inventory_storage.authority = 4
- tenant2_mod_inventory_storage.authority = 0
- tenant3_mod_inventory_storage.authority = 0
- tenant0_mod_inventory_storage.holdings_record = 9592559
- tenant2_mod_inventory_storage.holdings_record = 16
- tenant3_mod_inventory_storage.holdings_record = 16
- tenant0_mod_inventory_storage.instance = 9976519
- tenant2_mod_inventory_storage.instance = 32
- tenant3_mod_inventory_storage.instance = 32
- tenant0_mod_inventory_storage.item = 10787893
- tenant2_mod_inventory_storage.item = 19
- tenant3_mod_inventory_storage.item = 19
PTF environment ocp3
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances, one reader and one writer

Name | API Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|---|
R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |

- MSK ptf-kakfa-3
  - 4 m5.2xlarge brokers in 2 zones
  - Apache Kafka version 2.8.0
  - EBS storage volume per broker 300 GiB
  - auto.create.topics.enable=true
  - log.retention.minutes=480
  - default.replication.factor=3
- Kafka topics partitioning: 2 partitions for DI topics
Before Splitting Feature released
Module ocp3-pvt Mon Sep 11 09:33:28 UTC 2023 | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled |
---|---|---|---|---|---|---|---|---|---|---|
mod-remote-storage | 13 | mod-remote-storage:2.0.3 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 | false |
mod-agreements | 8 | mod-agreements:5.5.2 | 2 | 1592 | 1488 | 128 | 968 | 384 | 512 | false |
mod-data-import | 7 | mod-data-import:2.7.1 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | false |
mod-search | 30 | mod-search:2.0.1 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 | false |
mod-authtoken | 7 | mod-authtoken:2.13.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | false |
mod-configuration | 7 | mod-configuration:5.9.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-inventory-storage | 1 | mod-inventory-storage:26.1.0-SNAPSHOT.665 | 0 | 2208 | 1952 | 1024 | 1440 | 384 | 512 | false |
mod-circulation-storage | 15 | mod-circulation-storage:16.0.1 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | false |
mod-source-record-storage | 11 | mod-source-record-storage:5.6.7 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | false |
mod-calendar | 7 | mod-calendar:2.4.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-inventory | 12 | mod-inventory:20.0.6 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | false |
mod-circulation | 9 | mod-circulation:23.5.6 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | false |
mod-di-converter-storage | 8 | mod-di-converter-storage:2.0.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-pubsub | 8 | mod-pubsub:2.9.1 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | false |
mod-users | 8 | mod-users:19.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-patron-blocks | 8 | mod-patron-blocks:1.8.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | false |
mod-source-record-manager | 9 | mod-source-record-manager:3.6.4 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | false |
nginx-edge | 7 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | false |
mod-quick-marc | 7 | mod-quick-marc:3.0.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 | false |
nginx-okapi | 7 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | false |
okapi-b | 8 | okapi:5.0.1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | false |
mod-feesfines | 7 | mod-feesfines:18.2.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-patron | 7 | mod-patron:5.5.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-notes | 7 | mod-notes:5.0.1 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 | false |
pub-okapi | 7 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | false |
Service versions for Splitting Feature test
Module ocp3-pvt Mon Sep 25 12:43:06 UTC 2023 | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled |
---|---|---|---|---|---|---|---|---|---|---|
mod-data-import | 10 | mod-data-import:2.7.2-SNAPSHOT.137 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | false |
mod-search | 30 | mod-search:2.0.1 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 | false |
mod-configuration | 8 | mod-configuration:5.9.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-bulk-operations | 7 | mod-bulk-operations:1.0.6 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 | false |
mod-inventory-storage | 1 | mod-inventory-storage:26.1.0-SNAPSHOT.665 | 0 | 2208 | 1952 | 1024 | 1440 | 384 | 512 | false |
mod-circulation-storage | 15 | mod-circulation-storage:16.0.1 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | false |
mod-source-record-storage | 12 | mod-source-record-storage:5.6.7 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | false |
mod-calendar | 7 | mod-calendar:2.4.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-inventory | 12 | mod-inventory:20.0.6 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | false |
mod-circulation | 9 | mod-circulation:23.5.6 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | false |
mod-di-converter-storage | 8 | mod-di-converter-storage:2.0.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-pubsub | 9 | mod-pubsub:2.9.1 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | false |
mod-users | 9 | mod-users:19.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-patron-blocks | 9 | mod-patron-blocks:1.8.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | false |
mod-source-record-manager | 12 | mod-source-record-manager:3.6.5-SNAPSHOT.245 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | false |
mod-quick-marc | 7 | mod-quick-marc:3.0.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 | false |
nginx-okapi | 7 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | false |
okapi-b | 8 | okapi:5.0.1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | false |
mod-feesfines | 8 | mod-feesfines:18.2.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
mod-notes | 7 | mod-notes:5.0.1 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 | false |
pub-okapi | 7 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | false |
Methodology/Approach
To enable the splitting feature, see: Detailed Release Notes for Data Import Splitting Feature
Test 1: Manually tested 100k, 250k, and 500k record files, started one by one on one tenant only.
Test 2: Manually tested 100k, 50k, and 1 record files, with DI started simultaneously on each of the 3 tenants (9 jobs total).
Test 3: Run CI/CO on one tenant and DI jobs on 3 tenants, including the one that runs CI/CO. Start the second job after the first one reaches 30%, and start another job on a third tenant after the first job reaches 60% completion. CI/CO: 20 users, DI file size: 25k
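The staggered start used in Test 3 could be scripted along the following lines. This is only a sketch of the gating logic, not the procedure actually used for these tests: `start_job` and `get_progress` are hypothetical callables that the caller would have to implement against the Data Import API, and the polling interval is an assumption.

```python
import time
from typing import Callable, Sequence


def run_staggered(
    tenants: Sequence[str],
    thresholds: Sequence[float],
    start_job: Callable[[str], str],
    get_progress: Callable[[str], float],
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Start one DI job per tenant, gating later jobs on the first job's progress.

    tenants[0] starts immediately; tenants[i] (i >= 1) starts once the first
    job's progress reaches thresholds[i - 1] (a percentage, e.g. 30 or 60).
    """
    if len(thresholds) != len(tenants) - 1:
        raise ValueError("need exactly one threshold per tenant after the first")

    # start_job is a caller-supplied (hypothetical) function returning a job id.
    job_ids = [start_job(tenants[0])]
    for tenant, threshold in zip(tenants[1:], thresholds):
        # Poll the first job until it crosses the threshold for this tenant.
        while get_progress(job_ids[0]) < threshold:
            sleep(poll_seconds)
        job_ids.append(start_job(tenant))
    return job_ids
```

For the scenario described above, the call would take three tenant identifiers and the thresholds `[30, 60]`. Passing `sleep` and the two callables as parameters keeps the gating logic testable without a running environment.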