In Progress

Table of Contents

Overview

The Data Import Task Force (DITF) implemented a feature that splits large input MARC files into smaller ones, resulting in smaller jobs, so that large files can be imported, and imported consistently. This document contains the results of performance tests on the feature and an analysis of the feature's performance relative to the baseline tests. The following Jiras were implemented.

...

One record on one tenant could be discarded with the error io.netty.channel.StacklessClosedChannelException (MODDATAIMP-748). This reproduces both with and without the splitting feature enabled, in at least 30% of test runs with 500K-record files and multi-tenant testing.
During testing of the new Data Import splitting feature, items for update were discarded with the error: io.vertx.core.impl.NoStackTraceThrowable: Cannot get actual Item by id: org.folio.inventory.exceptions.InternalServerErrorException: Access for user 'data-import-system-user' (f3486d35-f7f7-4a69-bcd0-d8e5a35cb292) requires permission: inventory-storage.items.item.get. Less than 1% of records were discarded because of the missing permission for 'data-import-system-user'. The permission was not added automatically during the service deployment. After the permission was added manually to the database, the error no longer occurred (MODDATAIMP-930).
UI issue: when a job is canceled or completes with an error, its progress bar cannot be removed from the screen (MODDATAIMP-929).
Usage:
- Do not set RECORDS_PER_SPLIT_FILE below 1000. The system is stable enough to ingest 1000 records consistently, and smaller values incur more overhead, resulting in longer job durations.
- When toggling the file-splitting feature, the tasks of mod-source-record-storage and mod-source-record-manager need to be restarted.
- Keep the Kafka broker's disk size in mind: larger jobs (up to 500K records) can now be run, and consecutive jobs may fill the disk quickly because the message retention time is currently set to 8 hours. For example, with a 300 GB disk, consecutive jobs of 250K, 500K, and 500K records will exhaust the disk.
- More CPU could be allocated to mod-inventory and mod-di-converter-storage.
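
As a rough illustration of the RECORDS_PER_SPLIT_FILE recommendation, the number of split files (and so smaller jobs) produced by one upload can be estimated from the record count. The sketch below is a minimal Python estimate. It assumes each split file holds at most RECORDS_PER_SPLIT_FILE records, with the last file holding the remainder. The function name `split_file_count` is hypothetical and is not part of mod-data-import.

```python
def split_file_count(total_records: int, records_per_split_file: int) -> int:
    """Estimate how many split files one input file produces.

    Assumption (not taken from the module's code): every split file holds at
    most `records_per_split_file` records and the last one holds the remainder.
    """
    if total_records < 0:
        raise ValueError("total_records must be non-negative")
    if records_per_split_file <= 0:
        raise ValueError("records_per_split_file must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return (total_records + records_per_split_file - 1) // records_per_split_file


if __name__ == "__main__":
    # The 25K-record file and split sizes used in the tests in this report.
    for per_file in (500, 1_000, 5_000, 10_000):
        print(per_file, split_file_count(25_000, per_file))
    # prints: 500 50, 1000 25, 5000 5, 10000 3 (one pair per line)
```

Under this assumption, a 25K-record file with a split size of 500 yields ten times as many smaller jobs as with a split size of 5K, which is where the extra per-job overhead comes from.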

...

** Up to 10 items were discarded with the error: io.vertx.core.impl.NoStackTraceThrowable: Cannot get actual Item by id: org.folio.inventory.exceptions.InternalServerErrorException: Access for user 'data-import-system-user' (f3486d35-f7f7-4a69-bcd0-d8e5a35cb292) requires permission: inventory-storage.items.item.get. Less than 1% of records were discarded because of the missing permission for 'data-import-system-user'. The permission was not added automatically during the service deployment. After the permission was added manually to the database, the error no longer occurred (MODDATAIMP-930).

...

With CI/CO at 20 users and DI of 25K records on each of the 3 tenants, splitting feature disabled

ocp3-mod-data-import:12

Data Import Robustness Enhancement (PERF-646)

Number of concurrent tenants (25K records)	Job profile	RECORDS_PER_SPLIT_FILE = 500	Status	RECORDS_PER_SPLIT_FILE = 1K	Status	RECORDS_PER_SPLIT_FILE = 5K	Status	RECORDS_PER_SPLIT_FILE = 10K	Status	Split disabled	Status
1 Tenant test#1	PTF - Create 2	12 min 55 sec	Completed	11 min 48 sec	Completed	9 min 21 sec	Completed	9 min 2 sec	Completed	10 min 35 sec	Completed
1 Tenant test#2	PTF - Create 2	10 min 31 sec	Completed	9 min 32 sec	Completed	9 min 6 sec	Completed	9 min 14 sec	Completed	11 min 27 sec	Completed
2 Tenants test#1	PTF - Create 2	19 min 29 sec	Completed	15 min 47 sec	Completed	16 min 15 sec	Completed	16 min 3 sec	Completed	19 min 18 sec	Completed
2 Tenants test#2	PTF - Create 2	18 min 19 sec	Completed	15 min 47 sec	Completed	16 min 11 sec	Completed	16 min 41 sec	Completed	20 min 33 sec	Completed
3 Tenants test#1	PTF - Create 2	24 min 15 sec	Completed	25 min 47 sec	Completed	23 min	Completed	23 min 27 sec	Completed	30 min 2 sec	Completed
3 Tenants test#2	PTF - Create 2	24 min 38 sec	Completed	23 min 28 sec	Completed	23 min 2 sec	Completed	23 min 26 sec	Completed	T1: 00:33:35.1 (error); T2: 01:23:36.144; T3: 01:16:26.391	Completed with error and long processing*

* On the first tenant, processing stopped with the error " LOGS in progress ". It caused a spike in CPU utilization on Kafka (tenant cluster) of up to 94%.
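
To compare the durations in the table with the run where splitting was disabled, the time strings can be converted to seconds and expressed as a percentage change. The following is a minimal Python sketch using the "1 Tenant test#1" row. The helper names `to_seconds` and `pct_change` are illustrative and are not part of any FOLIO tooling.

```python
import re


def to_seconds(text: str) -> int:
    """Convert '12 minutes 55 seconds', '10 min 35 sec' or '23 minutes' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*sec", text)
    if minutes is None and seconds is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total


def pct_change(split: str, baseline: str) -> float:
    """Percentage change of a split-enabled duration relative to the baseline."""
    base = to_seconds(baseline)
    return (to_seconds(split) - base) / base * 100.0


if __name__ == "__main__":
    # Values copied from the "1 Tenant test#1" row of the table above.
    row = {
        "500": "12 minutes 55 seconds",
        "1K": "11 minutes 48 seconds",
        "5K": "09 minutes 21 seconds",
        "10K": "9 minutes 2 sec",
    }
    baseline = "10 min 35 sec"  # "Test with Split disabled" column
    for size, duration in row.items():
        change = pct_change(duration, baseline)
        print(f"RECORDS_PER_SPLIT_FILE={size}: {change:+.1f}% vs split disabled")
```

A negative value means the split run finished faster than the run with splitting disabled. These are single-run figures, so small differences should be read with caution.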

Instance CPU Utilization

...

Memory utilization reached a maximum of 88% for mod-source-record-storage-b and 85% for mod-source-record-manager-b.

Test 2. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 10K, 2 runs for each test.

...

Test 2. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 10K, 2 runs for each test.

RDS CPU Utilization

Test 1. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 500, 2 runs for each test. Maximum CPU utilization = 95%

...

Test 2. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 10K, 2 runs for each test. Maximum CPU utilization = 94%

RDS Database Connections

Test 1. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 500, 2 runs for each test.

...

tenant0_mod_source_record_storage.marc_records_lb = 9674629
tenant2_mod_source_record_storage.marc_records_lb = 0
tenant3_mod_source_record_storage.marc_records_lb = 0
tenant0_mod_source_record_storage.raw_records_lb = 9604805
tenant2_mod_source_record_storage.raw_records_lb = 0
tenant3_mod_source_record_storage.raw_records_lb = 0
tenant0_mod_source_record_storage.records_lb = 9674677
tenant2_mod_source_record_storage.records_lb = 0
tenant3_mod_source_record_storage.records_lb = 0
tenant0_mod_source_record_storage.marc_indexers = 620042011
tenant2_mod_source_record_storage.marc_indexers = 0
tenant3_mod_source_record_storage.marc_indexers = 0
tenant0_mod_source_record_storage.marc_indexers with field_no 010 = 3285833
tenant2_mod_source_record_storage.marc_indexers with field_no 010 = 0
tenant3_mod_source_record_storage.marc_indexers with field_no 010 = 0
tenant0_mod_source_record_storage.marc_indexers with field_no 035 = 19241844
tenant2_mod_source_record_storage.marc_indexers with field_no 035 = 0
tenant3_mod_source_record_storage.marc_indexers with field_no 035 = 0
tenant0_mod_inventory_storage.authority = 4
tenant2_mod_inventory_storage.authority = 0
tenant3_mod_inventory_storage.authority = 0
tenant0_mod_inventory_storage.holdings_record = 9592559
tenant2_mod_inventory_storage.holdings_record = 16
tenant3_mod_inventory_storage.holdings_record = 16
tenant0_mod_inventory_storage.instance = 9976519
tenant2_mod_inventory_storage.instance = 32
tenant3_mod_inventory_storage.instance = 32
tenant0_mod_inventory_storage.item = 10787893
tenant2_mod_inventory_storage.item = 19
tenant3_mod_inventory_storage.item = 19

PTF environment: ocp3

10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
2 database instances, one reader and one writer
Name	API Name	Memory	vCPUs	max_connections
R6G Extra Large	db.r6g.xlarge	32 GiB	4	2731
MSK ptf-kakfa-3
4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Kafka topics partitioning: 2 partitions for DI topics

...

Version	Old Version 56	New Version 57
Changes made by	Mykhailo Petryshyn	Mykhailo Petryshyn
Saved on	Sep 28, 2023	Sep 28, 2023
