Data Import Creates + Updates multi tenant with file split enabled

Overview

This document contains the results of testing concurrent Data Import with the file-splitting feature for MARC Bibliographic records in the Poppy release.

The purpose of this test is to determine how concurrent DI affects the duration of DI jobs on the central tenant, and to check for possible issues during a smoke test with a 50K DI Create job running concurrently on all 3 tenants.


Ticket: PERF-715

Summary

The duration of 10K and 25K Data Import jobs grows roughly in proportion to the number of concurrent jobs on different tenants: it approximately doubles with two concurrent jobs and is about 3x to 4x the baseline with three. This trend is consistent across the main (first) tenant and the other tenants.

The smoke test with 50K records did not reveal any issues. The duration of three concurrent DI Create jobs was about 3x that of a single DI job on the main tenant (01:12:54 vs. 00:22:31), which confirms the concurrency effect described above.

Maximum average CPU utilization differed between Create and Update jobs. The top two modules during DI Create jobs were mod-inventory-b (123%) and mod-quick-marc-b (76%); during DI Update jobs they were mod-inventory-b (182%) and mod-quick-marc-b (122%).

Memory consumption was almost the same for DI Create and Update jobs, although it was slightly higher for Update jobs: mod-inventory-b - 98%, mod-permissions-b - 79%, mod-source-record-storage-b - 73%.

RDS CPU utilization was 97% for all Create jobs and 94% for Update jobs.

The number of DB connections was higher during DI Create jobs: 710 with Create jobs on 2 tenants and 870 with Create jobs on 3 tenants.

The longest-running query for the failed 10K DI Create job on the third tenant was SELECT jsonb,id FROM fs07000002_mod_inventory_storage.instance_holdings_item_view, with an average latency of 386455.99 ms/call (about 6.4 minutes).

Test Runs 

| Test # | Scenario | Load level |
|---|---|---|
| 1 - Concurrent Create imports | DI MARC Bib Create | 10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants |
| 2 - Concurrent Update imports | DI MARC Bib Update | 10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants |
| 3 - Concurrent Create imports ("smoke test") of 50K | DI MARC Bib Create | 50K concurrently on 3 tenants |

Test Results

As the number of concurrent Data Import jobs increases and file size grows, the duration of DI jobs grows proportionally. 

Smoke Test finished successfully for 3 concurrent DI Create jobs of 50K each.

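| Job type | File size | Test # | Number of concurrent jobs | Main tenant (fs09000000) | Second tenant (fs07000001) | Third tenant (fs07000002) |
|---|---|---|---|---|---|---|
| DI Create | 10K | Baseline | 1 | 00:04:56 | - | - |
| DI Create | 10K | 1 | 2 | 00:10:43 | 00:10:37 | - |
| DI Create | 10K | 2 | 3 | 00:21:12 | 00:21:06 | 00:20:57 * |
| DI Create | 25K | Baseline | 1 | 00:11:24 | - | - |
| DI Create | 25K | 3 | 2 | 00:23:44 | 00:23:30 | - |
| DI Create | 25K | 4 | 3 | 00:37:11 | 00:37:05 | 00:36:58 |
| DI Update | 10K | Baseline | 1 | 00:06:32 | - | - |
| DI Update | 10K | 5 | 2 | 00:09:47 | 00:11:26 | - |
| DI Update | 10K | 6 | 3 | 00:19:08 | 00:19:06 | 00:18:31 |
| DI Update | 25K | Baseline | 1 | 00:15:13 | - | - |
| DI Update | 25K | 7 | 2 | 00:30:49 | 00:30:52 | - |
| DI Update | 25K | 8 | 3 | 00:47:47 | 00:48:17 | 00:47:54 |
| DI Create (Smoke test) | 50K | 9 | 1 | 00:22:31 | - | - |
| DI Create (Smoke test) | 50K | 10 | 3 | 01:12:54 | 01:12:44 | 01:12:35 |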

* - Errors occurred only in the 10K DI Create job running on the third tenant during the test with 3 concurrent jobs. The errors were not reproduced in subsequent tests.

  • io.vertx.core.impl.NoStackTraceThrowable: [{"id":"cf64277b-9945-49a1-93c0-007643c46efe","error":"Timeout for DB_HOST:DB_PORT=db.pcp1.folio-eis.us-east-1:5432","holdingId":"bd17bc47-72eb-480b-8a83-e0a1bc16e0f4"}]
  • java.lang.NullPointerException: Cannot invoke "org.folio.processing.mapping.defaultmapper.processor.parameters.MappingParameters.getLinkingRules()" because "mappingParameters" is null
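The concurrency effect can be checked directly from the reported durations. The following is a minimal sketch that converts the main-tenant durations from the results table into slowdown ratios relative to the single-job baseline. The helper names (`to_seconds`, `slowdown`) and the dictionary layout are illustrative assumptions, not part of the test tooling.

```python
from datetime import timedelta


def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS duration string into whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return int(timedelta(hours=hours, minutes=minutes, seconds=seconds).total_seconds())


# Main-tenant durations copied from the results table:
# (baseline with 1 job, 2 concurrent jobs, 3 concurrent jobs)
MAIN_TENANT = {
    "Create 10K": ("00:04:56", "00:10:43", "00:21:12"),
    "Create 25K": ("00:11:24", "00:23:44", "00:37:11"),
    "Update 10K": ("00:06:32", "00:09:47", "00:19:08"),
    "Update 25K": ("00:15:13", "00:30:49", "00:47:47"),
}


def slowdown(durations):
    """Return each concurrent duration divided by the baseline duration."""
    base, *concurrent = (to_seconds(d) for d in durations)
    return [round(value / base, 2) for value in concurrent]


ratios = {job: slowdown(durations) for job, durations in MAIN_TENANT.items()}

for job, (two_jobs, three_jobs) in ratios.items():
    print(f"{job}: 2 jobs = {two_jobs}x baseline, 3 jobs = {three_jobs}x baseline")
```

On these numbers the ratios fall between about 1.5x and 2.2x for two concurrent jobs and between about 2.9x and 4.3x for three, which is in line with the roughly proportional growth reported in the Summary.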

Service CPU Utilization

CPU utilization comparison

| Service | CPU Create (%) | CPU Update (%) |
|---|---|---|
| mod-inventory-b | 122.87 | 181.72 |
| mod-di-converter-storage-b | 78.94 | 75.21 |
| mod-quick-marc-b | 75.7 | 122.16 |
| nginx-okapi | 71.79 | 78.33 |
| mod-source-record-storage-b | 47.36 | 42.14 |
| okapi-b | 36.99 | 29.78 |
| mod-source-record-manager-b | 30.41 | 36.98 |
| mod-inventory-storage-b | 24.83 | 19.45 |
| mod-users-b | 19.33 | 5.61 |
| mod-configuration-b | 11.69 | 2.73 |
| mod-permissions-b | 9.19 | 18.71 |
| mod-pubsub-b | 6.97 | 6.85 |
| mod-authtoken-b | 6.51 | 3.44 |
| mod-password-validator-b | 3.27 | 2.75 |
| mod-feesfines-b | 2.29 | 2.5 |
| mod-data-import-b | 1.84 | 2.09 |
| mod-circulation-storage-b | 1.27 | 1.65 |
| mod-circulation-b | 0.33 | 0.34 |
| pub-okapi | 0.23 | 0.24 |
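To quantify how much more CPU the Update jobs used, the relative increase can be computed per service. The sketch below uses only the two services called out in the Summary (mod-inventory-b and mod-quick-marc-b), with the values from the comparison table. The function name `relative_increase` is a hypothetical helper introduced for illustration.

```python
# (CPU Create %, CPU Update %) for the two services highlighted in the Summary
CPU_UTILIZATION = {
    "mod-inventory-b": (122.87, 181.72),
    "mod-quick-marc-b": (75.7, 122.16),
}


def relative_increase(create: float, update: float) -> float:
    """Percentage increase of Update CPU utilization over Create CPU utilization."""
    return round((update - create) / create * 100, 1)


increase = {
    service: relative_increase(create, update)
    for service, (create, update) in CPU_UTILIZATION.items()
}

for service, pct in increase.items():
    print(f"{service}: Update CPU is {pct}% higher than Create CPU")
```

For these two services the increase works out to roughly 48% and 61% respectively, so the difference between Create and Update jobs is substantial rather than marginal.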

DI Create jobs

DI Update jobs


Service Memory Utilization

 Memory consumption comparison