Overview
This document contains the results of testing Data Import in the Kiwi release with optimistic locking (OL) enabled and an increased number of partitions in the Kafka MQ topics.
Infrastructure
- 6 m5.xlarge EC2 instances
- 2 instances of db.r6.xlarge database instances, one reader and one writer
- MSK
  - 4 m5.2xlarge brokers in 2 zones
  - auto.create.topics.enable=true
  - log.retention.minutes=120
- mod-inventory
  - 256 CPU units, 1814 MB memory
  - inventory.kafka.DataImportConsumerVerticle.instancesNumber=10
  - inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber=10
  - kafka.consumer.max.poll.records=10
- mod-inventory-storage
  - 128 CPU units, 544 MB memory
- mod-source-record-storage
  - 128 CPU units, 908 MB memory
- mod-source-record-manager
  - 128 CPU units, 1292 MB memory
- mod-data-import
  - 128 CPU units, 1024 MB memory
Software versions
- mod-data-import:2.2.0
- mod-inventory:18.0.4
- mod-inventory-storage:22.0.2-optimistic-locking.559
- mod-source-record-storage:5.2.5
- mod-source-record-manager:3.2.6
Results
Tests performed:
Test | Profile | KIWI | KIWI (with OL) | KIWI with 2 partitions | KIWI with 4 partitions |
---|---|---|---|---|---|
5K MARC Create | PTF - Create 2 | 5 min, 8 min | 8 min | 5 min | 5 min, 7 min |
5K MARC Update | PTF - Updates Success - 1 | 11 min, 13 min | 6 min | 7 min, 6 min | 6 min |
10K MARC Create | PTF - Create 2 | 11 min, 14 min | 12 min | 10 min, 12 min | 16 min |
10K MARC Update | PTF - Updates Success - 1 | 22 min, 24 min | 15 min | 11 min | failed |
25K MARC Create | PTF - Create 2 | 23 min, 25 min, 26 min | 24 min | 23 min, 26 min | 25 min |
25K MARC Update | PTF - Updates Success - 1 | 1 hr 20 min (completed with errors) *, 56 min | 40 min | failed | failed |
50K MARC Create | PTF - Create 2 | Completed with errors, 1 hr 40 min | 43 min | failed | failed |
50K MARC Update | PTF - Updates Success - 1 | 2 hr 32 min (job stuck at 76% completion) | 1 hr 4 min | failed | failed |
Increasing the number of partitions produced no noticeable change in performance. It did have negative effects, however: the number of errors increased and data import jobs failed more often.
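To make the comparison concrete, the Create durations from the table can be converted to throughput. The following is a minimal sketch, not part of the original test tooling: the function names are hypothetical, comma-separated values such as "5,7 min" are assumed to be two separate runs, and repeated runs are averaged.

```python
from statistics import mean

# Durations in minutes copied from the results table (Create jobs only).
# Several values mean repeated runs (an assumption for entries like "5,7 min");
# an empty list means the job failed.
CREATE_RUNS = {
    5_000: {"KIWI (with OL)": [8], "KIWI with 2 partitions": [5],
            "KIWI with 4 partitions": [5, 7]},
    10_000: {"KIWI (with OL)": [12], "KIWI with 2 partitions": [10, 12],
             "KIWI with 4 partitions": [16]},
    25_000: {"KIWI (with OL)": [24], "KIWI with 2 partitions": [23, 26],
             "KIWI with 4 partitions": [25]},
    50_000: {"KIWI (with OL)": [43], "KIWI with 2 partitions": [],
             "KIWI with 4 partitions": []},
}


def records_per_minute(records, runs):
    """Return mean throughput in records/minute, or None if every run failed."""
    if not runs:
        return None
    return records / mean(runs)


def throughput_table(create_runs):
    """Map job size -> configuration -> throughput (or None for failed jobs)."""
    return {
        records: {
            config: records_per_minute(records, runs)
            for config, runs in by_config.items()
        }
        for records, by_config in create_runs.items()
    }


if __name__ == "__main__":
    for records, by_config in throughput_table(CREATE_RUNS).items():
        for config, rate in by_config.items():
            label = "failed" if rate is None else f"{rate:.0f} records/min"
            print(f"{records:>6} | {config:<22} | {label}")
```

Under these assumptions, the 25K Create job runs at roughly 1,042 records/min with OL, 1,020 records/min with 2 partitions, and 1,000 records/min with 4 partitions, which is consistent with the observation that extra partitions did not improve throughput.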
Number of partitions: 2
The table below shows the results of a series of sequential data import tests with 2 partitions. For jobs that completed with errors, the per-table record counts in the database were collected to determine how many entities were missing.
Job and start time | End time | Record counts in DB |
---|---|---|
Create 5,000 records. Began 8:20 AM | 1/28/2022, 8:26 AM | |
Create 10,000 records. Began 8:48 AM | 1/28/2022, 9:00 AM | |
Update 10,000 records. Began 9:17 AM | FAILED | |
Create 25,000 records. Began 9:37 AM. Completed with errors | 1/28/2022, 10:04 AM | fs09000000_mod_inventory_storage.item: 24996; fs09000000_mod_inventory_storage.holdings_record: 24996; fs09000000_mod_inventory_storage.instance: 25000; fs09000000_mod_source_record_storage.records_lb: 25000 |
Restart modules and clean Kafka MQ | | |
Create 25,000 records. Began 10:31 AM | 1/28/2022, 10:57 AM | |
Restart modules and clean Kafka MQ | | |
50,000 records. Began 11:18 AM. Completed with errors | 1/28/2022, 12:43 PM | fs09000000_mod_inventory_storage.item: 49284; fs09000000_mod_inventory_storage.holdings_record: 48684; fs09000000_mod_source_record_storage.records_lb: 50000; fs09000000_mod_inventory_storage.instance: 50000 |
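The number of missing entities follows from the counts above by subtraction. The sketch below is illustrative only: the helper name is hypothetical, and it assumes each imported record should produce exactly one row in every listed table (which matches the instance and records_lb counts equalling the job size).

```python
# Row counts copied from the table for the two jobs that completed with errors.
RUN_25K = {
    "fs09000000_mod_inventory_storage.item": 24996,
    "fs09000000_mod_inventory_storage.holdings_record": 24996,
    "fs09000000_mod_inventory_storage.instance": 25000,
    "fs09000000_mod_source_record_storage.records_lb": 25000,
}
RUN_50K = {
    "fs09000000_mod_inventory_storage.item": 49284,
    "fs09000000_mod_inventory_storage.holdings_record": 48684,
    "fs09000000_mod_source_record_storage.records_lb": 50000,
    "fs09000000_mod_inventory_storage.instance": 50000,
}


def missing_counts(expected, counts):
    """Return {table: missing rows} for tables holding fewer rows than expected.

    Assumes one row per imported record in every table (an assumption,
    not something the report states explicitly).
    """
    return {
        table: expected - found
        for table, found in counts.items()
        if found < expected
    }


if __name__ == "__main__":
    for label, expected, counts in (
        ("25,000 records", 25_000, RUN_25K),
        ("50,000 records", 50_000, RUN_50K),
    ):
        for table, missing in missing_counts(expected, counts).items():
            print(f"{label}: {table} is missing {missing} rows")
```

Under that assumption, the 25,000-record job is missing 4 items and 4 holdings records, while the 50,000-record job is missing 716 items and 1,316 holdings records. Instances and source records are complete in both runs.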
In terms of dynamic characteristics (CPU load, memory usage), there were no changes compared to 1 partition. The failures are caused by peculiarities, and possibly bugs, in how the data import modules (Inventory and source-record-storage) process Kafka MQ messages.
Memory usage