Overview
This document contains the results of testing concurrent Data Import (DI) with the file-splitting feature for MARC Bibliographic records in the Poppy release.
The purpose of this test is to determine how concurrent DI affects the duration of DI jobs on the central tenant and to check for possible issues during a smoke test with 50k DI Create jobs running concurrently on all 3 tenants.
Ticket:
PERF-715
Summary
Data Import duration of 10k and 25k jobs approximately doubles when increasing the number of concurrent jobs on different tenants. This trend is consistent across the main/first tenant and other tenants.
The smoke test with 50k did not reveal any issues. The duration of three concurrent DI Create jobs was about 3x that of a single DI job on the main tenant, which confirms the concurrency effect described above.
Maximum average CPU utilization differed between Create and Update jobs. The largest differences were in mod-inventory-b (123% for Create, 182% for Update) and mod-quick-marc-b (76% for Create, 122% for Update).
Memory consumption was almost the same for DI Create and Update jobs. It was slightly higher for Update jobs: mod-inventory-b - 98%, mod-permissions-b - 79%, mod-source-record-storage-b - 73%.
RDS CPU utilization was 97% for all Create jobs and 94% for Update jobs.
DB connections were higher during DI Create jobs: 710 with 2 tenants and 870 with 3 tenants.
The longest-running query for the failed job on the third tenant during the 10k DI Create job was SELECT jsonb,id FROM fs07000002_mod_inventory_storage.instance_holdings_item_view, with an average latency of 386455.99 ms/call.
Test Runs
Test # | Scenario | Load level |
---|
1 - Concurrent Create imports | DI MARC Bib Create | 10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants |
2 - Concurrent Update imports | DI MARC Bib Update | 10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants |
3 - Concurrent Create imports ("smoke test") of 50K | DI MARC Bib Create | 50k concurrently on 3 tenants |
Test Results
As the number of concurrent Data Import jobs increases and file size grows, the duration of DI jobs grows proportionally.
Smoke Test finished successfully for 3 concurrent DI Create jobs of 50K each.
Job type | File size | Test # | Number of concurrent jobs | Main tenant (fs09000000) | Second tenant (fs07000001) | Third tenant (fs07000002) |
---|
DI Create | 10K | Baseline | 1 | 00:04:56 | - | - |
DI Create | 10K | 1 | 2 | 00:10:43 | 00:10:37 | - |
DI Create | 10K | 2 | 3 | 00:21:12 | 00:21:06 | 00:20:57 * |
DI Create | 25K | Baseline | 1 | 00:11:24 | - | - |
DI Create | 25K | 3 | 2 | 00:23:44 | 00:23:30 | - |
DI Create | 25K | 4 | 3 | 00:37:11 | 00:37:05 | 00:36:58 |
DI Update | 10K | Baseline | 1 | 00:06:32 | - | - |
DI Update | 10K | 5 | 2 | 00:09:47 | 00:11:26 | - |
DI Update | 10K | 6 | 3 | 00:19:08 | 00:19:06 | 00:18:31 |
DI Update | 25K | Baseline | 1 | 00:15:13 | - | - |
DI Update | 25K | 7 | 2 | 00:30:49 | 00:30:52 | - |
DI Update | 25K | 8 | 3 | 00:47:47 | 00:48:17 | 00:47:54 |
DI Create (Smoke test) | 50K | 9 | 1 | 00:22:31 | - | - |
DI Create (Smoke test) | 50K | 10 | 3 | 01:12:54 | 01:12:44 | 01:12:35 |
* - Errors occurred only in the 10K DI Create job running on the third tenant during the test with 3 concurrent jobs. The errors did not reproduce in subsequent tests.
- io.vertx.core.impl.NoStackTraceThrowable: [{"id":"cf64277b-9945-49a1-93c0-007643c46efe","error":"Timeout for DB_HOST:DB_PORT=db.pcp1.folio-eis.us-east-1:5432","holdingId":"bd17bc47-72eb-480b-8a83-e0a1bc16e0f4"}]
- java.lang.NullPointerException: Cannot invoke "org.folio.processing.mapping.defaultmapper.processor.parameters.MappingParameters.getLinkingRules()" because "mappingParameters" is null
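The slowdown relative to the single-job baseline can be checked directly from the duration table above. The following is a minimal sketch, not part of the original test tooling: the helper names `to_seconds` and `slowdown` and the `MAIN_TENANT` dictionary are illustrative, and only the main-tenant (fs09000000) durations are copied in.

```python
from datetime import timedelta

def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS duration string to whole seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return int(timedelta(hours=h, minutes=m, seconds=s).total_seconds())

# Main-tenant durations from the results table, keyed by (job type, file size)
# and then by the number of concurrent jobs. Key 1 is the baseline run.
MAIN_TENANT = {
    ("Create", "10K"): {1: "00:04:56", 2: "00:10:43", 3: "00:21:12"},
    ("Create", "25K"): {1: "00:11:24", 2: "00:23:44", 3: "00:37:11"},
    ("Update", "10K"): {1: "00:06:32", 2: "00:09:47", 3: "00:19:08"},
    ("Update", "25K"): {1: "00:15:13", 2: "00:30:49", 3: "00:47:47"},
    ("Create", "50K"): {1: "00:22:31", 3: "01:12:54"},
}

def slowdown(durations: dict) -> dict:
    """Return duration / baseline duration for every non-baseline run."""
    base = to_seconds(durations[1])
    return {n: round(to_seconds(d) / base, 2)
            for n, d in durations.items() if n > 1}

if __name__ == "__main__":
    for key, durations in MAIN_TENANT.items():
        print(key, slowdown(durations))
```

Running it prints one ratio per concurrency level, which makes it easy to see where the "approximately doubles" trend holds and where it deviates.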
Service CPU Utilization
CPU utilization comparison
Service | CPU Create | CPU Update |
---|
mod-inventory-b | 122.87 | 181.72 |
mod-di-converter-storage-b | 78.94 | 75.21 |
mod-quick-marc-b | 75.7 | 122.16 |
nginx-okapi | 71.79 | 78.33 |
mod-source-record-storage-b | 47.36 | 42.14 |
okapi-b | 36.99 | 29.78 |
mod-source-record-manager-b | 30.41 | 36.98 |
mod-inventory-storage-b | 24.83 | 19.45 |
mod-users-b | 19.33 | 5.61 |
mod-configuration-b | 11.69 | 2.73 |
mod-permissions-b | 9.19 | 18.71 |
mod-pubsub-b | 6.97 | 6.85 |
mod-authtoken-b | 6.51 | 3.44 |
mod-password-validator-b | 3.27 | 2.75 |
mod-feesfines-b | 2.29 | 2.5 |
mod-data-import-b | 1.84 | 2.09 |
mod-circulation-storage-b | 1.27 | 1.65 |
mod-circulation-b | 0.33 | 0.34 |
pub-okapi | 0.23 | 0.24 |
[Graph: service CPU utilization during DI Create jobs]
[Graph: service CPU utilization during DI Update jobs]
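To see which services differ most between Create and Update jobs, the CPU utilization comparison table above can be ranked by the Update minus Create delta. This is a small sketch over a subset of the rows; the `CPU` dictionary and `rank_by_delta` function are illustrative names, not part of any FOLIO tooling.

```python
# (Create %, Update %) pairs copied from the CPU utilization comparison table.
# Only a subset of services is included to keep the example short.
CPU = {
    "mod-inventory-b": (122.87, 181.72),
    "mod-di-converter-storage-b": (78.94, 75.21),
    "mod-quick-marc-b": (75.7, 122.16),
    "nginx-okapi": (71.79, 78.33),
    "mod-source-record-storage-b": (47.36, 42.14),
    "mod-users-b": (19.33, 5.61),
}

def rank_by_delta(table: dict) -> list:
    """Sort services by absolute (Update - Create) CPU delta, largest first."""
    deltas = {svc: round(upd - cre, 2) for svc, (cre, upd) in table.items()}
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)

if __name__ == "__main__":
    for service, delta in rank_by_delta(CPU):
        print(f"{service}: {delta:+.2f}")
```

A positive delta means the service used more CPU during Update jobs; a negative delta means it used more during Create jobs.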
Service Memory Utilization
Memory consumption comparison
Service | Memory Create | Memory Update |
---|
mod-inventory-b | 95.16 | 98.34 |
mod-permissions-b | 75.03 | 79.63 |
mod-source-record-storage-b | 62.29 | 72.77 |
mod-users-b | 61.23 | 59.93 |
mod-data-import-b | 61.02 | 68.28 |
mod-source-record-manager-b | 47.76 | 54.2 |
okapi-b | 41.84 | 42.55 |
mod-di-converter-storage-b | 34.62 | 35.22 |
mod-feesfines-b | 28.63 | 27.51 |
mod-quick-marc-b | 28.39 | 30.48 |
mod-configuration-b | 27.57 | 26.51 |
mod-pubsub-b | 24.72 | 24.86 |
mod-authtoken-b | 21.93 | 20.1 |
mod-inventory-storage-b | 17.21 | 18.03 |
mod-circulation-storage-b | 17.04 | 16.55 |
mod-circulation-b | 10.88 | 11.13 |
nginx-okapi | 4.69 | 4.69 |
pub-okapi | 4.63 | 4.46 |
[Graph: service memory utilization during DI Create jobs]
[Graph: service memory utilization during DI Update jobs]
DB CPU Utilization
RDS CPU utilization was 97% for all Create jobs and 94% for Update jobs
[Graph: RDS CPU utilization during Create jobs]
[Graph: RDS CPU utilization during Update jobs]
DB Connections
Create jobs: 710 DB connections with 2 tenants, 870 with 3 tenants.
Update jobs: 630 DB connections with 2 tenants, 785 with 3 tenants.
Each additional job running concurrently on a different tenant required approximately 150 to 160 more DB connections (870 - 710 = 160 for Create, 785 - 630 = 155 for Update).
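The per-tenant increment can be reproduced from the connection counts above. This sketch only computes the observed difference between the 2-tenant and 3-tenant runs; it does not extrapolate to more tenants, since no such runs were measured. `CONNECTIONS` and `per_extra_tenant` are illustrative names.

```python
# Observed DB connection counts, keyed by job type and number of tenants.
CONNECTIONS = {
    "Create": {2: 710, 3: 870},
    "Update": {2: 630, 3: 785},
}

def per_extra_tenant(by_tenants: dict) -> int:
    """Connections added when going from 2 to 3 concurrently loaded tenants."""
    return by_tenants[3] - by_tenants[2]

if __name__ == "__main__":
    for job_type, counts in CONNECTIONS.items():
        print(job_type, per_extra_tenant(counts))
```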
[Graph: DB connections for Create jobs]
[Graph: DB connections for Update jobs]
DB load
[Graph: DB load during Create jobs]
[Graph: DB load during Update jobs]
Appendix
Errors & Exceptions
The following exceptions were observed during tests that finished successfully:
Logs
Failure in bulk execution: 186 errors during all Update jobs and more than 4000 errors during Create jobs.
10:59:59 [] [] [] [] WARN KafkaMessageListener Failed to index resource event [eventType: CREATE, tenantId: fs09000000, id: 5cc8ef78-cb05-49fa-8274-1cba1d660aad] |
message [OpenSearchException[OpenSearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [indices:data/write/bulk[s]] would be [16502737514/15.3gb], which is larger than the limit of [16320875724/15.1gb], real usage: [16499671264/15.3gb], new bytes reserved: [3066250/2.9mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=3103382/2.9mb]]]]
feign.FeignException$Unauthorized: [401 Unauthorized] during [GET] to [http://inventory-view/instances?query=id%3D%3D%28%221e9b752b-6cc3-433b-ae90-cbafdc307cb6%22%29&limit=1] [InventoryViewClient#getInstances(CqlQuery,int)]: [Invalid token] |
org.folio.search.exception.SearchOperationException: Failed to perform elasticsearch request [index=pcp1_contributor_fs09000000, type=bulkApi, message: 30,000 milliseconds timeout on connection http-outgoing-265 [ACTIVE]]
Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-294 [ACTIVE] |
WARN essageBatchProcessor Failed to process batch, attempting to process resources one by one
org.folio.auth.authtokenmodule.tokens.TokenValidationException: Access token has expired
Number of errors: 23400. These errors occurred only during DI on the fs07000002 tenant.
| filter @logStream like "pcp1/mod-authtoken"
| filter @message like "ERROR FilterApi"
13:48:00 [595516/users] [fs07000002] [] [mod-authtoken] ERROR FilterApi Unable to retrieve permissions for system-user: User does not exist: 8cc96687-ea63-44cb-ab5f-a73bc6985324 request took 7 ms
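Error counts like the ones above can also be reproduced offline from exported log lines. The sketch below parses the line layout seen in the samples (time, four bracketed fields, level, logger, message) and counts ERROR lines for a given logger per tenant. The group names `request`, `tenant`, `user` and `module` are assumptions about what the bracketed fields mean, and `count_errors` is a hypothetical helper, not an existing tool.

```python
import re
from collections import Counter

# Layout inferred from the sample lines: HH:MM:SS, four [bracketed] fields,
# a log level, a logger name, then the free-text message.
LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<request>[^\]]*)\] \[(?P<tenant>[^\]]*)\] "
    r"\[(?P<user>[^\]]*)\] \[(?P<module>[^\]]*)\] "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+(?P<message>.*)$"
)

def count_errors(lines, logger="FilterApi"):
    """Count ERROR lines emitted by `logger`, grouped by tenant id."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match["level"] == "ERROR" and match["logger"] == logger:
            counts[match["tenant"]] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "13:48:00 [595516/users] [fs07000002] [] [mod-authtoken] ERROR FilterApi "
        "Unable to retrieve permissions for system-user: User does not exist: "
        "8cc96687-ea63-44cb-ab5f-a73bc6985324 request took 7 ms",
        "10:59:59 [] [] [] [] WARN KafkaMessageListener Failed to index resource "
        "event [eventType: CREATE, tenantId: fs09000000, "
        "id: 5cc8ef78-cb05-49fa-8274-1cba1d660aad]",
    ]
    print(count_errors(sample))
```

Lines that do not start with a timestamp, such as stack trace continuations, simply do not match and are skipped.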
Infrastructure
Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize |
---|
pcp1-pvt |
mod-remote-storage | 10(11)* | 3.0.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
mod-data-import | 18(20)* | 3.0.7 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
mod-authtoken | 13(16)* | 2.14.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
mod-configuration | 9(10)* | 5.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-users-bl | 9(10)* | 7.6.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
mod-inventory-storage | 12(15)* | 27.0.3(27.0.4)* | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
mod-circulation-storage | 12(14)* | 17.1.3(17.1.7)* | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
mod-source-record-storage | 15(18)* | 5.7.3(5.7.5)* | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
mod-inventory | 11(14)* | 20.1.3(20.1.7)* | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 |
mod-di-converter-storage | 15(18)* | 2.1.2(2.1.5)* | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-circulation | 12(14)* | 24.0.8(24.0.11)* | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
mod-pubsub | 11(13)* | 2.11.2(2.11.3)* | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 |
mod-patron-blocks | 9(10)* | 1.9.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 |
mod-source-record-manager | 14(17)* | 3.7.4(3.7.8)* | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
mod-quick-marc | 9(11)* | 5.0.0(5.0.1)* | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 |
nginx-okapi | 9 | 2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
okapi-b | 11 | 5.1.2 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
mod-feesfines | 10(11)* | 19.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
pub-okapi | 9 | 2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
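When reading the memory utilization figures against this configuration, it can help to know how much of each container's hard limit is reserved for the JVM heap. The sketch below computes Xmx as a percentage of Mem Hard Limit for a few rows of the table above. It assumes both columns use the same unit (the table does not state it), and `MODULES` and `heap_share` are illustrative names.

```python
# (Mem Hard Limit, Xmx) pairs copied from the table above.
# Assumption: both values are expressed in the same unit.
MODULES = {
    "mod-inventory": (2880, 1814),
    "mod-inventory-storage": (4096, 3076),
    "mod-source-record-storage": (5600, 3500),
    "mod-source-record-manager": (5600, 3500),
    "mod-quick-marc": (2288, 1664),
    "mod-data-import": (2048, 1292),
}

def heap_share(hard_limit: int, xmx: int) -> float:
    """Return Xmx as a percentage of the container hard memory limit."""
    return round(100 * xmx / hard_limit, 1)

if __name__ == "__main__":
    for module, (hard_limit, xmx) in MODULES.items():
        print(f"{module}: {heap_share(hard_limit, xmx)}% of hard limit")
```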
All modules
Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize |
---|
pcp1-pvt |
Fri Mar 15 17:12:45 UTC 2024 |
mod-remote-storage | 11 | mod-remote-storage:3.0.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
mod-ncip | 10 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-finance-storage | 10 | mod-finance-storage:8.5.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
mod-agreements | 10 | mod-agreements:6.0.2 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 |
mod-ebsconet | 10 | mod-ebsconet:2.1.1 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 |
edge-sip2 | 8 | edge-sip2:3.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-organizations | 10 | mod-organizations:1.8.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
mod-settings | 11 | mod-settings:1.0.2 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
edge-dematic | 10 | edge-dematic:2.1.0 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-data-import | 20 | mod-data-import:3.0.7 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
mod-search | 20 | mod-search:3.0.5 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 |
mod-tags | 10 | mod-tags:2.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-authtoken | 16 | mod-authtoken:2.14.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
edge-courses | 2 | edge-courses:1.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-notify | 10 | mod-notify:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-inventory-update | 10 | mod-inventory-update:3.2.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-configuration | 10 | mod-configuration:5.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-orders-storage | 10 | mod-orders-storage:13.6.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 |
edge-caiasoft | 10 | edge-caiasoft:2.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-login-saml | 18 | mod-login-saml:2.7.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-erm-usage-harvester | 11 | mod-erm-usage-harvester:4.4.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-password-validator | 10 | mod-password-validator:3.1.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 |
mod-licenses | 10 | mod-licenses:5.0.2 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
mod-gobi | 10 | mod-gobi:2.7.1 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
mod-fqm-manager | 9 | mod-fqm-manager:1.1.0-SNAPSHOT.1078 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-bulk-operations | 9 | mod-bulk-operations:1.1.7 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 |
mod-graphql | 16 | mod-graphql:1.12.0 | 0 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-finance | 10 | mod-finance:4.8.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
mod-erm-usage | 13 | mod-erm-usage:4.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-copycat | 10 | mod-copycat:1.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-lists | 8 | mod-lists:1.1.0-SNAPSHOT.1261 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-entities-links | 15 | mod-entities-links:2.0.4 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 |
mod-permissions | 47 | mod-permissions:6.5.0-SNAPSHOT.369 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 |
pub-edge | 9 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
mod-orders | 10 | mod-orders:12.7.1 | 2 | 2048 | 1440 | 1024 | 1024 | 384 | 512 |
edge-patron | 10 | edge-patron:5.0.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 |
edge-ncip | 11 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-users-bl | 10 | mod-users-bl:7.6.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
mod-inventory-storage | 15 | mod-inventory-storage:27.0.4 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
mod-invoice | 10 | mod-invoice:5.7.2 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
mod-user-import | 10 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-sender | 10 | mod-sender:1.11.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
edge-oai-pmh | 8 | edge-oai-pmh:2.7.1 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 |
mod-data-export-worker | 10 | mod-data-export-worker:3.1.2 | 2 | 3072 | 2800 | 1024 | 2048 | 384 | 512 |
mod-rtac | 10 | mod-rtac:3.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-circulation-storage | 14 | mod-circulation-storage:17.1.7 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
mod-calendar | 10 | mod-calendar:2.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-source-record-storage | 18 | mod-source-record-storage:5.7.5 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
mod-event-config | 10 | mod-event-config:2.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-courses | 10 | mod-courses:1.4.8 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-inventory | 15 | mod-inventory:20.1.8 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 |
mod-email | 10 | mod-email:1.16.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-di-converter-storage | 18 | mod-di-converter-storage:2.1.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-circulation | 14 | mod-circulation:24.0.11 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
mod-pubsub | 13 | mod-pubsub:2.11.3 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 |
edge-orders | 10 | edge-orders:2.9.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
edge-rtac | 7 | edge-rtac:2.6.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-template-engine | 10 | mod-template-engine:1.19.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-users | 34 | mod-users:19.3.0-SNAPSHOT.677 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-patron-blocks | 10 | mod-patron-blocks:1.9.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 |
edge-fqm | 21 | edge-fqm:1.0.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-audit | 10 | mod-audit:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-source-record-manager | 17 | mod-source-record-manager:3.7.8 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
nginx-edge | 9 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
mod-quick-marc | 11 | mod-quick-marc:5.0.1 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 |
nginx-okapi | 9 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
okapi-b | 11 | okapi:5.1.2 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
mod-feesfines | 11 | mod-feesfines:19.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-invoice-storage | 10 | mod-invoice-storage:5.7.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 |
mod-service-interaction | 10 | mod-service-interaction:3.0.2 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 |
mod-data-export | 12 | mod-data-export:4.8.7 | 1 | 1024 | 896 | 1024 | 768 | 88 | 128 |
mod-patron | 10 | mod-patron:6.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-oai-pmh | 5 | mod-oai-pmh:3.12.8 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
edge-connexion | 10 | edge-connexion:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-kb-ebsco-java | 10 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
mod-notes | 10 | mod-notes:5.1.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 |
mod-organizations-storage | 10 | mod-organizations-storage:4.6.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
mod-data-export-spring | 12 | mod-data-export-spring:3.0.2 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 |
mod-login | 10 | mod-login:7.10.1 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 |
pub-okapi | 9 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
mod-eusage-reports | 13 | mod-eusage-reports:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
Methodology/Approach
DI tests were started from the UI. For the two-tenant tests, one job was started on each tenant, first on fs09000000 and then on fs07000001, for a total of two concurrent jobs. For the three-tenant tests, one job was started on each of the three tenants with a delay of several seconds between them: first on fs09000000, then on fs07000001, and finally on fs07000002.
DI Create jobs with 10k and 25k records were run first, followed by DI Update jobs.