Multiple tenant Data Import test report (Nolana)
IN PROGRESS
- 1 Overview
- 2 Summary
- 3 Results
- 3.1 Baseline tests
- 3.2 Multiple tenant tests
- 3.3 Comparison Multiple tenant tests to baseline
- 3.4 Resources usage
- 3.4.1 Baseline tests
- 3.4.1.1 Resources usage MARC Create
- 3.4.1.1.1 Service Memory usage
- 3.4.1.1.2 Service CPU usage
- 3.4.1.1.3 Instance CPU usage
- 3.4.1.1.4 RDS CPU usage
- 3.4.1.1.5 Kafka resources
- 3.4.1.2 Resources usage MARC Update
- 3.4.1.2.1 Service Memory usage
- 3.4.1.2.2 Service CPU usage
- 3.4.1.2.3 Instance CPU usage
- 3.4.1.2.4 RDS CPU usage
- 3.5 Issues in the logs observed during testing
- 3.6 Infrastructure
Overview
This document contains the results of testing Data Import for MARC Bibliographic records in the Nolana release to detect performance trends for multiple concurrent jobs. Ticket: https://folio-org.atlassian.net/browse/PERF-387
Summary
Data import duration differs between tenants for the same job. For example, 50K MARC BIB Create took 21 min on tenant fs09000000, 34 min on tenant fs07000001, and 42 min on tenant fs07000002 (https://folio-org.atlassian.net/browse/PERF-414).
Data import does not work on multiple tenants concurrently: when data import starts on a second tenant, both jobs get stuck and Data Import becomes unavailable for several hours (https://folio-org.atlassian.net/browse/PERF-415). This could be related to https://folio-org.atlassian.net/browse/PERF-388
Results
Baseline tests
Test | Job profile | Duration tenant fs09000000 | Duration tenant fs07000002 | Duration tenant fs07000001 |
|---|---|---|---|---|
1K MARC Create | PTF - Create 2 | 40 s | 1 min 19 s | 55 s |
1K MARC Update | PTF - Updates Success - 1 | 35 s | 1 min 24 s | 1 min |
2K MARC Create | PTF - Create 2 | 56 s | 2 min 15 s | 1 min 37 s |
2K MARC Update | PTF - Updates Success - 1 | 58 s | 2 min 6 s | 1 min 44 s |
5K MARC Create | PTF - Create 2 | 2 min 8 s | 5 min 19 s | 4 min 14 s |
5K MARC Update | PTF - Updates Success - 1 | 2 min 10 s | 5 min | 3 min 47 s |
10K MARC Create | PTF - Create 2 | 4 min 20 s | 10 min 3 s | 7 min 20 s |
10K MARC Update | PTF - Updates Success - 1 | 4 min 8 s | 8 min 43 s | 8 min 8 s |
25K MARC Create | PTF - Create 2 | 10 min 41 s | 23 min 3 s | 17 min 43 s |
25K MARC Update | PTF - Updates Success - 1 | 10 min 40 s | 20 min 47 s | 17 min 56 s |
50K MARC Create | PTF - Create 2 | 21 min 11 s | 42 min 28 s | 34 min 40 s |
50K MARC Update | PTF - Updates Success - 1 | 20 min 57 s | 40 min 20 s | 35 min 24 s |
100K MARC Create | PTF - Create 2 | 42 min 35 s | 1 hour 29 min | 1 hour 20 min |
100K MARC Update | PTF - Updates Success - 1 | 41 min 56 s | 1 hour 27 min | 1 hour 20 min |
500K MARC Create | PTF - Create 2 | - | - | - |
The 500K MARC Create import failed because the 500K-record file was corrupted.
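To compare tenants on a common scale, the baseline durations can be converted to seconds and to average throughput. The following is a minimal sketch, not part of the test tooling: the helper names `parse_duration` and `throughput` are hypothetical, and the only data used is the 50K MARC Create row of the baseline table.

```python
import re

# Seconds per unit for the duration notation used in the baseline table.
UNIT_SECONDS = {"hour": 3600, "min": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a duration such as '1 hour 29 min' or '21 min 11 s' to seconds."""
    parts = re.findall(r"(\d+)\s*(hour|min|s)\b", text)
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


def throughput(records: int, duration: str) -> float:
    """Average records per second over the whole job."""
    return records / parse_duration(duration)


# 50K MARC Create durations copied from the baseline table.
BASELINE_50K_CREATE = {
    "fs09000000": "21 min 11 s",
    "fs07000002": "42 min 28 s",
    "fs07000001": "34 min 40 s",
}

reference = parse_duration(BASELINE_50K_CREATE["fs09000000"])
for tenant, duration in BASELINE_50K_CREATE.items():
    seconds = parse_duration(duration)
    print(
        f"{tenant}: {seconds} s, "
        f"{throughput(50_000, duration):.1f} records/s, "
        f"{seconds / reference:.2f}x the fs09000000 duration"
    )
```

On these figures the same 50K job runs at roughly 39 records/s on fs09000000 but only about 20 and 24 records/s on fs07000002 and fs07000001, so the slowest tenant takes about twice as long.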
Multiple tenant tests
Test | Job profile | Duration tenant fs09000000 | Duration tenant fs07000002 | Duration tenant fs07000001 |
|---|---|---|---|---|
50K MARC Create (2 concurrent jobs) | PTF - Create 2 | - | | |
50K MARC Update (2 concurrent jobs) | PTF - Updates Success - 1 | - | DNR | DNR |
50K MARC Create (3 concurrent jobs) | PTF - Create 2 | DNR | DNR | DNR |
50K MARC Update (3 concurrent jobs) | PTF - Updates Success - 1 | DNR | DNR | DNR |
Comparison Multiple tenant tests to baseline
Test | Job profile | Duration tenant fs09000000 | Baseline tenant fs09000000 | Duration tenant fs07000002 | Baseline tenant fs07000002 | Duration tenant fs07000001 | Baseline tenant fs07000001 |
|---|---|---|---|---|---|---|---|
50K MARC Create (2 concurrent jobs) | PTF - Create 2 | - | - | | | | |
50K MARC Update (2 concurrent jobs) | PTF - Updates Success - 1 | - | - | | | | |
50K MARC Create (3 concurrent jobs) | PTF - Create 2 | | 21 min 11 s | | | | |
50K MARC Update (3 concurrent jobs) | PTF - Updates Success - 1 | DNR | 20 min 57 s | | | | |
Nolana | version |
|---|---|
mod-data-import | 2.6.2 |
mod-data-import-converter-storage | 1.15.2 |
mod-source-record-manager | 3.5.6 |
mod-source-record-storage | 5.5.2 |
mod-inventory | 19.0.2 |
mod-inventory-storage | 25.0.3 |
Resources usage
Baseline tests
Resources usage MARC Create
Service Memory usage
Service CPU usage
Instance CPU usage
RDS CPU usage
Kafka resources
Resources usage MARC Update
Service Memory usage
Service CPU usage
Instance CPU usage
RDS CPU usage
Issues in the logs observed during testing
Field | Value |
|---|---|
@ingestionTime | 1673614146585 |
@log | 054267740449:ncp3-folio-eis |
@logStream | ncp3/mod-data-export-spring/a353ecd6739e469c8e6cf198b667ab0c |
@message | 12:49:01 [${FolioLoggingContext:requestid}] [${FolioLoggingContext:tenantid}] [${FolioLoggingContext:userid}] [${FolioLoggingContext:moduleid}] ERROR LogAccessor Exception thrown when sending a message with key='6c32c128-fde4-4754-90c4-3499c0c48fca' and payload='JobCommand(type=DELETE, id=6c32c128-fde4-4754-90c4-3499c0c48fca, name=null, description=null, export...' to topic ncp3.fs09000000.data-export.job.command: |
@timestamp | 1673614141676 |

Field | Value |
|---|---|
@ingestionTime | 1673614466212 |
@log | 054267740449:ncp3-folio-eis |
@logStream | ncp3/mod-inventory/b84baa871ca844fda9259c19b83d00ed |
@message | 12:54:24 [] [] [] [] ERROR KafkaConsumerWrapper Error while processing a record - id: 16 subscriptionPattern: SubscriptionDefinition(eventType=DI_INVENTORY_INSTANCE_CREATED, subscriptionPattern=ncp3\.Default\.\w{1,}\.DI_INVENTORY_INSTANCE_CREATED) offset: 32104 |
@timestamp | 1673614464009 |
This is related to https://folio-org.atlassian.net/browse/PERF-388
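The subscriptionPattern in the mod-inventory error above contains a `\w{1,}` wildcard in the tenant position, which suggests that a single consumer subscription covers the corresponding topic of every tenant. The sketch below illustrates this with Python's `re` module. It assumes the pattern is applied as a full-string match, and the topic list is hypothetical, built from the tenant IDs and topic names that appear in this report.

```python
import re

# Pattern copied from the mod-inventory log entry.
SUBSCRIPTION_PATTERN = r"ncp3\.Default\.\w{1,}\.DI_INVENTORY_INSTANCE_CREATED"

# Tenant IDs used in this report.
TENANTS = ["fs09000000", "fs07000002", "fs07000001"]

# Hypothetical topic list: one DI_INVENTORY_INSTANCE_CREATED topic per tenant,
# plus one topic of a different event type that must not match.
TOPICS = [f"ncp3.Default.{tenant}.DI_INVENTORY_INSTANCE_CREATED" for tenant in TENANTS]
TOPICS.append("ncp3.Default.fs07000001.DI_SRS_MARC_BIB_RECORD_MODIFIED")


def matching_topics(pattern: str, topics: list) -> list:
    """Return the topics that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [topic for topic in topics if compiled.fullmatch(topic)]


for topic in matching_topics(SUBSCRIPTION_PATTERN, TOPICS):
    print(topic)
```

Under that assumption all three tenant topics match the one pattern, so records from all tenants would be handled by the same consumer group.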
Alternatively, the job can finish with errors:
io.netty.channel.StacklessClosedChannelException
Instance --- Created
Holdings --- Discarded
Field | Value |
|---|---|
@ingestionTime | 1673876991708 |
@log | 054267740449:ncp3-folio-eis |
@logStream | ncp3/mod-inventory/b84baa871ca844fda9259c19b83d00ed |
@message | 13:49:49 [] [] [] [] WARN Fetcher [Consumer clientId=consumer-DI_SRS_MARC_BIB_RECORD_MODIFIED.mod-inventory-19.0.2-66, groupId=DI_SRS_MARC_BIB_RECORD_MODIFIED.mod-inventory-19.0.2] Received unknown topic or partition error in fetch for partition ncp3.Default.fs07000001.DI_SRS_MARC_BIB_RECORD_MODIFIED-0 |
@timestamp | 1673876989113 |
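The `@timestamp` and `@ingestionTime` values in these log entries are epoch milliseconds. To line them up with the test runs, they can be converted to readable times. This is a minimal sketch with a hypothetical helper name, `epoch_ms_to_utc`. It assumes the module logs its wall-clock time in UTC, which is consistent with the `13:49:49` prefix of the message above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(epoch_ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=epoch_ms)


# Values copied from the log entry above.
timestamp_ms = 1673876989113
ingestion_ms = 1673876991708

event_time = epoch_ms_to_utc(timestamp_ms)
ingestion_lag_ms = ingestion_ms - timestamp_ms

print(event_time.strftime("%Y-%m-%d %H:%M:%S"), "UTC")
print(f"ingestion lag: {ingestion_lag_ms} ms")
```

For this entry the event time is 2023-01-16 13:49:49 UTC, and the entry was ingested about 2.6 seconds later.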
Infrastructure
PTF environment: ncp3
9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
2 db.r6.xlarge database instances, one reader and one writer
MSK ptf-kakfa-3
4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Modules memory and CPU parameters
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
|---|---|---|---|---|---|---|---|---|
mod-inventory | 19.0.2 | 7 | 2 | 1024 | 2880 | 2592 | 512m | 1814m |
mod-inventory-storage | 25.0.3 | 3 | 2 | 1024 | 2208 (1872 in MG) | 1952 (1684 in MG) | 512m | 1440m |
okapi | 4.14.7 | 1 | 3 | 1024 | 1684 (1512 in MG) | 1440 (1360 in MG) | 512m | 922m |
mod-permissions | 6.2.0 | 8 | 2 | 512 | 1684 | 1544 | 512 | 1024 |
mod-search | 1.8.2 | 2 | 2 | 400 | 2592 | 2480 | 1024 | 1440 |
mod-data-import-cs | 1.15.1 | 1 | 2 | 258 | 1024 | 896 | 128m | 768m |
mod-quick-marc | 2.5.0 | 3 | 1 | 128 | 2288 (2098 in MG) | 2176 (1920 in MG) | 512m | 1664m |
mod-source-record-storage | 5.5.2 | 4 | 2 | 1024 | 1536 (1440 in MG) | 1440 (1296 in MG) | 512m | 908m |
mod-data-import |