IN PROGRESS

...

The purpose of this test is to determine how concurrent data import (DI) jobs affect the duration of DI jobs on the central tenant, and to check for possible issues during a smoke test with 50k DI Create jobs running concurrently on all 3 tenants.


Ticket: PERF-715

Summary

Data import duration approximately doubles in the 10k and 25k tests as the number of concurrent jobs on different tenants increases. This trend 

Test Runs 

Test #  Scenario            Load level                                                       Comment
1       DI MARC Bib Create  10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants
2       DI MARC Bib Update  10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants
3       DI MARC Bib Create  50k concurrently on 3 tenants - smoke test

...

Comparison


Service CPU Utilization

CPU utilization

Service                        CPU Create  CPU Update
mod-inventory-b                122.87      181.72
mod-di-converter-storage-b     78.94       75.21
mod-quick-marc-b               75.71       22.16
nginx-okapi                    71.79       78.33
mod-source-record-storage-b    47.36       42.14
okapi-b                        36.99       29.78
mod-source-record-manager-b    30.41       36.98
mod-inventory-storage-b        24.83       19.45
mod-users-b                    19.33       5.61
mod-configuration-b            11.69       2.73
mod-permissions-b              9.19        18.71
mod-pubsub-b                   6.97        6.85
mod-authtoken-b                6.51        3.44
mod-password-validator-b       3.27        2.75
mod-feesfines-b                2.29        2.5
mod-data-import-b              1.84        2.09
mod-circulation-storage-b      1.27        1.65
mod-circulation-b              0.33        0.34
pub-okapi                      0.23        0.24
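To see which services differ most between the Create and Update runs, the CPU figures above can be ranked by their change. The following is a minimal sketch, not part of the test tooling: the values are copied from the CPU utilization table, and the names `CPU` and `top_changes` are hypothetical.

```python
# Service CPU utilization copied from the table above: (Create, Update).
CPU = {
    "mod-inventory-b": (122.87, 181.72),
    "mod-di-converter-storage-b": (78.94, 75.21),
    "mod-quick-marc-b": (75.71, 22.16),
    "nginx-okapi": (71.79, 78.33),
    "mod-source-record-storage-b": (47.36, 42.14),
    "okapi-b": (36.99, 29.78),
    "mod-source-record-manager-b": (30.41, 36.98),
    "mod-inventory-storage-b": (24.83, 19.45),
    "mod-users-b": (19.33, 5.61),
    "mod-configuration-b": (11.69, 2.73),
    "mod-permissions-b": (9.19, 18.71),
    "mod-pubsub-b": (6.97, 6.85),
    "mod-authtoken-b": (6.51, 3.44),
    "mod-password-validator-b": (3.27, 2.75),
    "mod-feesfines-b": (2.29, 2.5),
    "mod-data-import-b": (1.84, 2.09),
    "mod-circulation-storage-b": (1.27, 1.65),
    "mod-circulation-b": (0.33, 0.34),
    "pub-okapi": (0.23, 0.24),
}


def top_changes(n: int) -> list[tuple[str, float]]:
    """Return the n services with the largest absolute Update - Create change."""
    deltas = [(name, round(update - create, 2)) for name, (create, update) in CPU.items()]
    return sorted(deltas, key=lambda item: abs(item[1]), reverse=True)[:n]


for service, delta in top_changes(3):
    print(f"{service}: {delta:+.2f}")
# mod-inventory-b: +58.85
# mod-quick-marc-b: -53.55
# mod-users-b: -13.72
```

For most other services the change between the two runs is below 10 points in either direction.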


DI Create jobs

[Image: service CPU utilization, DI Create jobs]

DI Update jobs

[Image: service CPU utilization, DI Update jobs]


Service Memory Utilization

Memory consumption

Service                        Memory Create  Memory Update
mod-inventory-b                95.16          98.34
mod-permissions-b              75.03          79.63
mod-source-record-storage-b    62.29          72.77
mod-users-b                    61.23          59.93
mod-data-import-b              61.02          68.28
mod-source-record-manager-b    47.76          54.2
okapi-b                        41.84          42.55
mod-di-converter-storage-b     34.62          35.22
mod-feesfines-b                28.63          27.51
mod-quick-marc-b               28.39          30.48
mod-configuration-b            27.57          26.51
mod-pubsub-b                   24.72          24.86
mod-authtoken-b                21.93          20.1
mod-inventory-storage-b        17.21          18.03
mod-circulation-storage-b      17.04          16.55
mod-circulation-b              10.88          11.13
nginx-okapi                    4.69           4.69
pub-okapi                      4.63           4.46


DI Create jobs

[Image: service memory utilization, DI Create jobs]

DI Update jobs

[Image: service memory utilization, DI Update jobs]

DB CPU Utilization

RDS CPU utilization was 97% for all Create jobs and 94% for Update jobs.

Create jobs

[Image: RDS CPU utilization, Create jobs]

Update jobs

[Image: RDS CPU utilization, Update jobs]

DB Connections

DB connections for Create jobs: 710 on 2 tenants, 870 on 3 tenants.

DB connections for Update jobs: 630 on 2 tenants, 785 on 3 tenants.

Each additional job running concurrently on a different tenant requires about 150 more DB connections.
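The per-tenant increment can be checked against the figures above. The sketch below is a rough illustration only: it computes the growth in connections from 2 to 3 tenants for both reported series and, assuming that growth stays linear (an assumption this report does not verify), estimates how many concurrent tenants would fit under the max_connections value of 2731 listed in the Infrastructure section. All function and variable names are hypothetical.

```python
# Connection counts reported above as (2 tenants, 3 tenants), one pair per series.
OBSERVED = [(710, 870), (630, 785)]
MAX_CONNECTIONS = 2731  # db.r6g.xlarge value from the Infrastructure section


def per_tenant_increment(two_tenants: int, three_tenants: int) -> int:
    """Extra connections used when one more tenant runs a job concurrently."""
    return three_tenants - two_tenants


def max_tenants(three_tenants: int, increment: int, limit: int = MAX_CONNECTIONS) -> int:
    """Largest tenant count that stays under `limit`, ASSUMING linear growth."""
    headroom = limit - three_tenants
    return 3 + headroom // increment


increments = [per_tenant_increment(two, three) for two, three in OBSERVED]
print(increments)  # [160, 155]

for (_, three), increment in zip(OBSERVED, increments):
    print(max_tenants(three, increment))
# 14
# 15
```

Both measured increments (160 and 155) are close to the rounded figure of 150. The projection covers connection count only; RDS CPU was already at 94-97% in these runs, so it should not be read as a supported tenant count.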

Create jobs

[Image: DB connections, Create jobs]

Update jobs

[Image: DB connections, Update jobs]

DB load


Appendix

Errors & Exceptions

...

Logs

pcp1/mod-search

10:59:59 [] [] [] [] WARN KafkaMessageListener Failed to index resource event [eventType: CREATE, tenantId: fs09000000, id: 5cc8ef78-cb05-49fa-8274-1cba1d660aad]

index [pcp1_instance_fs09000000], id [f7aea9b8-614e-4050-9dbd-e2f8a884c06b], message [OpenSearchException[OpenSearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [indices:data/write/bulk[s]] would be [16502737514/15.3gb], which is larger than the limit of [16320875724/15.1gb], real usage: [16499671264/15.3gb], new bytes reserved: [3066250/2.9mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=3103382/2.9mb]]]]

feign.FeignException$Unauthorized: [401 Unauthorized] during [GET] to [http://inventory-view/instances?query=id%3D%3D%28%221e9b752b-6cc3-433b-ae90-cbafdc307cb6%22%29&limit=1] [InventoryViewClient#getInstances(CqlQuery,int)]: [Invalid token]
org.folio.search.exception.SearchOperationException: Failed to perform elasticsearch request [index=pcp1_contributor_fs09000000, type=bulkApi, message: 30,000 milliseconds timeout on connection http-outgoing-265 [ACTIVE]]


Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-294 [ACTIVE]
WARN essageBatchProcessor Failed to process batch, attempting to process resources one by one


feign.RetryableException: timeout executing GET http://inventory-view/instances?query=id%3D%3D%28%22c4ac5388-4b64-4de5-8930-d3be806c1b7f%22%20or%20%229228d6de-c8e3-4e62-85c7-f5b8b45fb649%22%20or%20%2288913517-17dc-4d90-be8b-9560c1f30a01%22%20or%20%220514f12a-680d-415d-b262-ba82f4dd3e76%22%20or%20%2205253541-bf8a-465a-b91d-b2cb4ff0944d%22%20or%20%222f7b4924-8781-4834-b25f-36fc430d8f5d%22%20or%20%22b2dad08a-05e9-41d2-8742-25c98daf7fbe%22%20or%20%22be30bb11-58da-4b06-beea-e204a0823438%22%20or%20%22930b4625-8eb3-492a-bae8-388480364e67%22%20or%20%2223925f8a-efb4-41a2-9002-48292f6419f3%22%20or%20%22765cb2e6-96a4-4b33-99fd-b38011be999f%22%20or%20%22a3666aca-4963-4e94-95ab-dd3d790ffdd3%22%20or%20%2226bb45e4-dfc0-4915-9ec8-0187d334651d%22%20or%20%220c536304-7330-46d2-a19e-9b69ca13591a%22%20or%20%22ee237647-875c-405e-8ce1-fd55f701d83b%22%20or%20%22869ba2aa-f465-42b4-b4ba-b47ccd29d6ac%22%20or%20%221aa2e06e-b647-4e7b-8fa8-9804c65e1dc1%22%20or%20%22a7096f26-0363-49b4-803c-31e5661b12de%22%20or%20%22042395fb-34a7-4412-a78a-d541ef948922%22%20or%20%2273342f96-f5b5-42d6-ab41-1ef121aef0d5%22%20or%20%22fa028d18-e45e-434d-85d2-e8c4e5db9519%22%20or%20%22c0d1a681-66ea-4cc8-bf68-5771cf8a93e9%22%20or%20%22d7ece421-5dae-4519-9d09-100f22b47007%22%20or%20%2225ca20d1-d0e1-414a-b228-b74ceaba2512%22%20or%20%2274fcad26-9ddf-4cf9-8206-f665374a37f3%22%20or%20%221fd756cd-e573-4287-9c3e-c29408ee8709%22%20or%20%22a322007c-f578-4c93-9239-e6558a393710%22%20or%20%22a2ed3841-a9d2-4364-a3fe-939e2bffbe24%22%20or%20%22b31f92f9-027e-40bf-a5b1-bb4e50241a46%22%20or%20%2204175b0b-9dea-403b-911c-82d5f6a2fbe2%22%20or%20%22f4ebc0f9-adb7-4458-a4ae-8d7996b3b4f8%22%20or%20%22e1e6f55c-7720-4a23-a73a-3202746c7c75%22%20or%20%2238a43c32-d2de-454f-b7d7-7725b5bab61e%22%20or%20%22b8f17eec-f61c-4b29-9769-3b8d91a6dae4%22%20or%20%22e93b2a0c-939b-4c31-98f9-c85cb52081eb%22%20or%20%22983fd8c1-cd4a-4087-a4b9-b1c3dd11c08e%22%20or%20%222b8a75db-2ec2-4190-8702-c4dca1067bf6%22%20or%20%22f73f7d21-deab-4af4-9163-7f374f1d56d2%22%20or%20%22
9819b074-3174-429e-8f7f-1d6312d9630f%22%20or%20%2256743d6d-3df9-4bbf-9495-cc9f2b95e60b%22%20or%20%229f146268-ad02-46b2-8f0d-a5f64fc8579b%22%20or%20%22fb26676f-2d20-4309-8fc5-aa1c09962618%22%20or%20%222b39c2de-6374-46af-b78e-1c83d669c991%22%20or%20%22a7fd3130-6752-40b4-b2a4-cc6f1fee0349%22%20or%20%22e7561b18-c9da-43f8-97d4-9125261de4b6%22%20or%20%221106c6ed-519b-4efa-a73c-deae1dc0570d%22%20or%20%220ea912aa-6107-490f-8855-350c7edc0060%22%20or%20%225ec701f4-af40-4464-b6a4-ba9f14ca7d28%22%20or%20%22e2a15de7-b793-4949-beed-9f56bd9cde9d%22%20or%20%22879210f9-cae8-4202-9dcd-89657a5f8113%22%29&limit=50



pcp1/mod-authtoken
org.folio.auth.authtokenmodule.tokens.TokenValidationException: Access token has expired
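The circuit_breaking_exception in the pcp1/mod-search log above can be reproduced arithmetically. As a sketch: the parent breaker rejects a request when the current real heap usage plus the newly reserved bytes exceeds the limit. The three byte counts below are copied from the log line. The 16 GiB heap size and the 95% ratio are inferences (together they reproduce the logged limit exactly), not values stated in this report, and the function name is hypothetical.

```python
REAL_USAGE = 16_499_671_264  # "real usage" from the log line
NEW_BYTES = 3_066_250        # "new bytes reserved" from the log line
LIMIT = 16_320_875_724       # "limit of" from the log line


def would_trip(real_usage: int, new_bytes: int, limit: int) -> bool:
    """True when the request would push heap usage past the parent breaker limit."""
    return real_usage + new_bytes > limit


projected = REAL_USAGE + NEW_BYTES
print(projected)                                 # 16502737514, the "would be" value in the log
print(would_trip(REAL_USAGE, NEW_BYTES, LIMIT))  # True
print(projected - LIMIT)                         # 181861790 bytes over the limit

# Inference, not from the report: a 16 GiB heap with a 95% parent limit.
ASSUMED_HEAP_BYTES = 16 * 2**30
print(ASSUMED_HEAP_BYTES * 95 // 100 == LIMIT)   # True
```

Real usage alone was already above the limit, so the 2.9 MB bulk request was rejected regardless of its own size.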



Infrastructure

PTF environment: pcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 2 database instances, writer/reader


    Name           Memory  vCPUs  max_connections
    db.r6g.xlarge  32 GiB  4      2731


  • MSK tenant
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

...