Data Import series of tests to understand the range of duration of each DI test (Orchid)


Overview

This document contains the results of testing Data Import for MARC Bibliographic records in the Orchid release to determine the range of durations of each DI test on ncp5 (PERF-606).

Summary

  • Duration for Orchid almost doubled due to the fixing of differences in the database schemas, mostly the addition of trigger functions. For example, 50K MARC Create took 39 min 27 sec for Orchid, compared to 21 min 11 sec and 21 min 37 sec for Nolana and Morning Glory; for Lotus it was 32 min 28 sec. So we can assume that the trigger was missing from our database for the previous 2 releases.
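To make the "almost doubled" comparison concrete, the 50K MARC Create durations quoted in the Summary can be converted to seconds and compared as ratios. This is only an illustrative sketch: the helper names are hypothetical, and it assumes the two shorter durations map to Nolana and Morning Glory in the order the releases are listed.

```python
# Durations for 50K MARC Create, copied from the Summary.
# Assumption: 21 min 11 sec is Nolana and 21 min 37 sec is Morning Glory
# (the order in which the releases are listed).
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds


durations = {
    "Orchid": to_seconds(39, 27),
    "Nolana": to_seconds(21, 11),
    "Morning Glory": to_seconds(21, 37),
    "Lotus": to_seconds(32, 28),
}


def ratio_to(release: str, baseline: str = "Orchid") -> float:
    """How many times longer the baseline release ran than `release`."""
    return durations[baseline] / durations[release]


for name in ("Nolana", "Morning Glory", "Lotus"):
    print(f"Orchid vs {name}: {ratio_to(name):.2f}x")
```

Under these assumptions, Orchid comes out at a bit under twice the Nolana and Morning Glory durations, and noticeably closer to Lotus.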

Results


| Test | Profile | Duration ncp5, test1 | Duration ncp5, test2 | Duration ncp5, test3 | Duration ncp5, test4 | Duration ncp5, test5 | Results | AVG | Max Dev from AVG | R/W split enabled * | R/W split enabled ** |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1K MARC Create | PTF - Create 2 | 34 sec | 38 sec | 30 sec | 31 sec | 31 sec | Completed | 33 sec | 15.8% | 30 sec | 29 sec |
| 5K MARC Create | PTF - Create 2 | 2 min 27 sec | 2 min 21 sec | 4 min 42 sec | 2 min 21 sec | 2 min 21 sec | Completed | 2 min 50 sec | 65.4% | 2 min 23 sec | 2 min 3 sec |
| 10K MARC Create | PTF - Create 2 | 4 min 47 sec | 4 min 34 sec | 4 min 50 sec | 4 min 41 sec | 4 min 41 sec | Completed | 4 min 43 sec | 2.6% | 5 min 12 sec | 3 min 58 sec |
| 25K MARC Create | PTF - Create 2 | 11 min 29 sec | 11 min 27 sec | 12 min | 11 min 43 sec | 12 min 40 sec | Completed | 11 min 52 sec | 6.7% | 11 min 45 sec | 10 min 5 sec |
| 50K MARC Create | PTF - Create 2 | 23 min 02 sec | 22 min 14 sec | 23 min 40 sec | 23 min 55 sec | 23 min 47 sec | Completed | 23 min 20 sec | 2.5% | 23 min 36 sec | 20 min 46 sec |
| 100K MARC Create | PTF - Create 2 | 48 min 36 sec | 47 min 47 sec | 47 min 50 sec | 50 min 37 sec | 49 min | Completed | 48 min 46 sec | 3.8% | 49 min 28 sec | 44 min 18 sec |
| 5K MARC Update | PTF - Updates Success - 1 | | | | | | | | | 2 min 48 sec | 2 min 45 sec |
| 10K MARC Update | PTF - Updates Success - 1 | | | | | | | | | 5 min 23 sec | 5 min 23 sec |
| 25K MARC Update | PTF - Updates Success - 1 | | | | | | | | | 14 min 12 sec | 14 min 19 sec |
| 50K MARC Update | PTF - Updates Success - 1 | | | | | | | | | 27 min 52 sec | 28 min |
| 100K MARC Update | PTF - Updates Success - 1 | | | | | | | | | 57 min 41 sec | 55 min |

 * - enabled for:

  • mod-data-import
  • mod-source-record-storage
  • mod-source-record-manager
  • mod-di-converter-storage

 ** - enabled for:

  • mod-data-import
  • mod-source-record-storage
  • mod-source-record-manager
  • mod-di-converter-storage
  • mod-inventory-storage
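The AVG and Max Dev from AVG columns in the Results table can be approximately reproduced from the five per-test durations. The sketch below assumes that Max Dev from AVG means (longest run − average) / average, which matches the reported figures to within about 0.1 percentage point. The exact rounding used in the report is not stated, and the function names are hypothetical.

```python
import re


def parse_duration(text: str) -> int:
    """Convert strings like '2 min 27 sec', '12 min' or '34 sec' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*sec", text)
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total


def summarize(runs: list[str]) -> tuple[float, float]:
    """Return (average in seconds, max deviation above the average in percent).

    Assumption: deviation is measured from the slowest run only.
    """
    values = [parse_duration(run) for run in runs]
    avg = sum(values) / len(values)
    max_dev_pct = (max(values) - avg) / avg * 100
    return avg, max_dev_pct


def format_duration(total_seconds: float) -> str:
    """Format seconds in the 'N min M sec' style used in the Results table."""
    minutes, seconds = divmod(round(total_seconds), 60)
    return f"{minutes} min {seconds} sec" if minutes else f"{seconds} sec"


# The five 25K MARC Create runs from the Results table.
runs_25k = ["11 min 29 sec", "11 min 27 sec", "12 min", "11 min 43 sec", "12 min 40 sec"]
avg, dev = summarize(runs_25k)
print(format_duration(avg), f"{dev:.1f}%")
```

The same helper can be applied to any other Create row. The 5K row shows how a single slow run (4 min 42 sec in test3) dominates the deviation figure.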


Appendix

Infrastructure ncp5

PTF environment ncp5

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 2 database instances: one reader and one writer

    | Name | API Name | Memory | vCPUs | max_connections |
    |---|---|---|---|---|
    | R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Kafka topics partitioning: 2 partitions for DI topics


Modules memory and CPU parameters before update

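| Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
|---|---|---|---|---|---|---|---|---|
| mod-inventory-storage | 26.0.0 | 12 | 2 | 2048 | 4096 | 3690 | 512 | 3076 |
| mod-inventory | 20.0.4 | 1 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
| mod-source-record-storage | 5.6.7 | 27 | 2 | 2048 | 5600 | 5000 | 512 | 3500 |
| mod-quick-marc | 3.0.0 | 6 | 1 | 128 | 2288 | 2176 | 512 | 1664 |
| mod-source-record-manager | 3.6.4 | 18 | 2 | 2048 | 5600 | 5000 | 512 | 3500 |
| mod-di-converter-storage | 2.0.5 | 8 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-data-import | 2.7.1 | 10 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
| okapi | 5.0.1 | 8 | 3 | 1024 | 1684 | 1440 | 512 | 922 |
| nginx-okapi | 2023.06.14 | 7 | 2 | 128 | 1024 | 896 | - | - |
| pub-okapi | 2023.06.14 | 7 | 2 | 128 | 1024 | 896 | - | 768 |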

Methodology/Approach

Tested 1K, 5K, 10K, 25K, 50K, and 100K record files, 5 times each.

JMeter scripts were used to run the baseline multitenant DI tests.