Info

Testing later showed that the actual durations of the imports were about 2 (two) times longer than reported. The PTF environment was missing a DB trigger; once the trigger was restored, the imports' durations doubled.
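To read the figures in this report with the note above in mind, a reported duration can be scaled by the approximate factor. This is a minimal sketch: the factor 2.0 comes from "about 2 times" and is an approximation, and the function name and the 10-minute input are hypothetical, not values from the tests.

```python
from datetime import timedelta

# Assumption: "about 2 times longer" is treated as a flat factor of 2.0.
# It is an approximation from the note above, not a measured constant.
APPROX_CORRECTION_FACTOR = 2.0


def estimate_actual_duration(reported: timedelta,
                             factor: float = APPROX_CORRECTION_FACTOR) -> timedelta:
    """Scale a reported import duration to a rough estimate of the actual one."""
    return reported * factor


# Hypothetical reported duration, used only to illustrate the scaling.
reported = timedelta(minutes=10)
print(estimate_actual_duration(reported))
```

The result is only a rough estimate; the exact ratio may differ per job type and record count.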



Overview

This document contains the results of testing Data Import for MARC Bibliographic records in the Nolana release to detect performance trends.

PERF-341

The figures achieved in PTF performance testing have not been achieved in Nolana Bugfest. Developers are reviewing the results to determine the causes for the differences in MODDATAIMP-752.

Infrastructure

  • 10 m6i.2xlarge EC2 instances  
  • 2 db.r6.xlarge database instances, one reader and one writer
  • MSK
    • 4 m5.2xlarge brokers in 2 zones 
    • auto.create-topics.enable = true
    • log.retention.minutes=480
    • 2 partitions per DI topic
    • default.replication.factor=3
  • mod-inventory memory
    • 1024 CPU units, 2592MB mem
    • inventory.kafka.DataImportConsumerVerticle.instancesNumber=10
    • inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber=10
    • kafka.consumer.max.poll.records=10
  • mod-inventory-storage
    • 1024 CPU units, 1962MB mem
  • mod-source-record-storage
    • 1024 CPU units, 1440MB mem
  • mod-source-record-manager
    • 1024 CPU units, 3688MB mem
  • mod-data-import
    • 256 CPU units, 1844MB mem
  • mod-data-import-cs 
    • 128 CPU units, 896MB mem

...

  • Data Import in Nolana has roughly the same DI durations as Morning Glory. For 10K records, creates take 20 seconds longer and updates take 40 seconds less; for 50K records, creates take 2 minutes longer and updates take 2 minutes less.
  • One issue was detected: MODSOURMAN-908, database deadlocks that slow DI down (when the issue occurs on a 50K import, the duration increases to up to 6 hours).
    • After MODSOURMAN-908 was fixed, we were not able to reproduce the deadlock.
  • R/W Split Enabled:
    • For most tests, DI duration improved. For example, a 10K create takes 3m 43s with R/W split and 4m 55s without it.
    • RDS CPU usage on the writer node is even higher than it was without read/write split enabled.
    • With R/W split, the reader node took on 15-17% of the DB load for data import creates/updates.
  • MARC BIB Update and Create take less time in Nolana with the new version of DI modules*.
    PERF-388 was reproduced for a 10k MARC BIB Update without Check-In/Check-Out, for a job started after a 5k MARC BIB Update with less than a minute between jobs.
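The R/W split example above (10K create: 3m 43s with the split, 4m 55s without) can be expressed as a relative improvement. The sketch below only does the arithmetic on those two figures; the helper names are illustrative and not part of any test tooling.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'Xm Ys' duration to whole seconds."""
    return minutes * 60 + seconds


def relative_improvement(baseline_s: float, candidate_s: float) -> float:
    """Fraction of the baseline duration saved by the candidate run."""
    return (baseline_s - candidate_s) / baseline_s


# 10K create durations quoted in this report.
without_split = to_seconds(4, 55)  # 295 s
with_split = to_seconds(3, 43)     # 223 s

saved = without_split - with_split
improvement = relative_improvement(without_split, with_split)
print(f"{saved} s faster ({improvement:.1%})")  # prints "72 s faster (24.4%)"
```

This covers a single data point; other job sizes and job types in the tests may show a different ratio.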

...

We'll investigate the potential memory leak in the future (PERF-358).

Update: After additional investigation in the scope of PERF-358, we can say that there are no memory leaks in any of the modules. For most of them, memory grows up to some point and then stops growing. See details in the ticket.

...

Note: Memory usage grows here in the same way as in previous tests. Ticket to investigate: PERF-358

Update: After additional investigation in the scope of PERF-358, we can say that there are no memory leaks in any of the modules. For most of them, memory grows up to some point and then stops growing. See details in the ticket.

...

Note: All tests were performed on "fresh" modules, which can explain the growing memory usage. According to PERF-358, memory usage on the modules stops growing after some point.


Note: During "create" imports with R/W split, the reader DB node carried a 15% load.

...