Data Import test report (Morning Glory)

Note: After testing, it was found that the actual import durations were about two times longer than reported here. The PTF environment was missing a DB trigger; once the trigger was restored, import durations doubled.

Overview

This document contains the results of testing Data Import in Morning Glory to detect performance trends (ticket PERF-265).

Infrastructure

  • 10 m6i.2xlarge EC2 instances (changed from m5.xlarge in Lotus)
  • 2 db.r6.xlarge database instances, one reader and one writer
  • MSK
    • 4 m5.2xlarge brokers in 2 zones
    • auto.create.topics.enable=true
    • log.retention.minutes=120
    • 2 partitions per DI topic
  • mod-inventory
    • 1024 CPU units, 2592MB mem
    • inventory.kafka.DataImportConsumerVerticle.instancesNumber=10
    • inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber=10
    • kafka.consumer.max.poll.records=10
  • mod-inventory-storage
    • 1024 CPU units, 1684MB mem
  • mod-source-record-storage
    • 1024 CPU units, 1296MB mem
  • mod-source-record-manager
    • 1024 CPU units, 1844MB mem
  • mod-data-import
    • 256 CPU units, 1844MB mem
  • mod-data-import-cs 
    • 128 CPU units, 896MB mem


Infrastructure comparison: Morning Glory vs Lotus

Please note that the infrastructure has been migrated from 6 m5.xlarge EC2 instances to 10 m6i.2xlarge instances. Key differences are below:

Instance       CPU (vCPU)   RAM (GiB)
m5.xlarge      4            16
m6i.2xlarge    8            32


Differences in modules memory and CPU parameters

Module                  Lotus CPU   Lotus RAM   Morning Glory CPU   Morning Glory RAM
mod-inventory           256         1814MB      1024                2592MB
mod-inventory-storage   128         544MB       1024                1684MB
mod-SRS                 128         908MB       1024                1296MB
mod-SRM                 128         1292MB      1024                1844MB
mod-data-import         128         1024MB      256                 1844MB
mod-data-import-cs      -           -           128                 896MB
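
To make the size of these changes easier to see, the following sketch computes how much each module's CPU and memory allocation grew between the two releases. It uses only the values from the table above. The names LOTUS, MORNING_GLORY, FACTORS and scale_factors are illustrative and not part of any test tooling. mod-data-import-cs is left out because no Lotus values are listed for it.

```python
# Container limits copied from the table above: (CPU units, memory in MB).
LOTUS = {
    "mod-inventory": (256, 1814),
    "mod-inventory-storage": (128, 544),
    "mod-SRS": (128, 908),
    "mod-SRM": (128, 1292),
    "mod-data-import": (128, 1024),
}
MORNING_GLORY = {
    "mod-inventory": (1024, 2592),
    "mod-inventory-storage": (1024, 1684),
    "mod-SRS": (1024, 1296),
    "mod-SRM": (1024, 1844),
    "mod-data-import": (256, 1844),
}


def scale_factors(old, new):
    """Return {module: (cpu_factor, mem_factor)} for modules present in both."""
    return {
        name: (new[name][0] / old[name][0], round(new[name][1] / old[name][1], 2))
        for name in old
        if name in new
    }


FACTORS = scale_factors(LOTUS, MORNING_GLORY)
for name, (cpu, mem) in FACTORS.items():
    print(f"{name}: CPU x{cpu:g}, memory x{mem:g}")
```

With these inputs, CPU allocations grew by a factor of 2 to 8 depending on the module, which should be kept in mind when reading the duration comparison.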


Software versions

  • mod-data-import v2.5.0
  • mod-data-import-converter-storage v1.14.0
  • mod-source-record-manager v3.4.0
  • mod-source-record-storage v5.4.0
  • mod-inventory v18.2.0
  • mod-inventory-storage v24.0.0


Results

Summary

The Morning Glory release is faster than Lotus. Note, however, that most of the memory and CPU parameters of the module containers have been changed between the two releases.

So far, only the results of the PTF-Create-2 job profile can be compared, because Update-success-2 is not available on our Morning Glory environment.




Test               Profile                     Duration Morning Glory                Duration Lotus                        Delta
1K MARC Create     PTF - Create 2              50 s                                  1 min 9 s                             38%
1K MARC Update     PTF - Updates Success - 1   39 s                                  1 min 30 s
2K MARC Create     PTF - Create 2              1 min 2 s                             1 min 34 s                            51%
2K MARC Update     PTF - Updates Success - 1   1 min 11 s                            1 min 54 s
5K MARC Create     PTF - Create 2              2 min 20 s                            3 min 54 s                            67%
5K MARC Update     PTF - Updates Success - 1   3 min 4 s                             4 min 12 s
10K MARC Create    PTF - Create 2              4 min 33 s                            6 min 45 s                            48%
10K MARC Update    PTF - Updates Success - 1   5 min 29 s                            8 min 4 s
25K MARC Create    PTF - Create 2              10 min 55 s                           16 min 8 s                            47%
25K MARC Update    PTF - Updates Success - 1   13 min 37 s                           19 min 50 s
50K MARC Create    PTF - Create 2              21 min 37 s                           32 min 28 s                           50%
50K MARC Update    PTF - Updates Success - 1   26 min 10 s                           39 min 5 s
100K MARC Create   PTF - Create 2              44 min 4 s                            1 hr 11 min                           86%
100K MARC Update   PTF - Updates Success - 1   55 min 33 s                           1 hr 19 min
500K MARC Create   PTF - Create 2              3 hr 55 min (Completed with errors)   7 hr 4 min (Completed with errors)
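
The report does not state how the delta column is calculated. The sketch below assumes it is the extra time Lotus needed relative to the Morning Glory duration, truncated to a whole percent. This reading is an inference: it reproduces the reported values for the 1K through 50K Create rows, but it is not a formula given in the report. The helper names to_seconds and delta_percent are illustrative.

```python
import re

UNIT_SECONDS = {"hr": 3600, "min": 60, "s": 1}


def to_seconds(text):
    """Convert a duration such as '1 min 9 s' or '1 hr 11 min' to seconds."""
    parts = re.findall(r"(\d+)\s*(hr|min|s)", text)
    if not parts:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


def delta_percent(morning_glory, lotus):
    """Assumed definition: (Lotus - Morning Glory) / Morning Glory, truncated to a whole percent."""
    mg = to_seconds(morning_glory)
    return (to_seconds(lotus) - mg) * 100 // mg


print(delta_percent("50 s", "1 min 9 s"))           # 38
print(delta_percent("21 min 37 s", "32 min 28 s"))  # 50
```

Integer arithmetic is used so that the truncation does not depend on floating-point rounding.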


Resources usage

Note: Service CPU utilisation is much lower than it was in the Lotus release. This is a result of the changed CPU and memory parameters.


Note: mod-source-record-manager memory consumption started to grow severely after a restart (and after an additional partition was added to the Kafka topics). However, after establishing all needed connections, it stayed at the same level for all subsequent tests.

Note: Instance CPU usage is also lower than in Lotus because the instance type was changed for the Morning Glory release (4 vCPU vs 8 vCPU).


Note: RDS CPU usage is more or less the same for both releases, at approximately 80%.




Update success 1 resource usage