Data Import Test report (Lotus)

It's been found after testing that the actual durations of the imports performed were about 2 (two) times longer than what was reported. This is due to the PTF environment missing a DB trigger that, when restored, doubled the imports' durations.

Overview

This document contains the results of testing Data Import in Lotus to detect performance trends. 

Infrastructure

  • 6 m5.xlarge EC2 instances 
  • 2 instances of db.r6.xlarge database instances, one reader and one writer
  • MSK
    • 4 m5.2xlarge brokers in 2 zones
    • auto.create-topics.enable = true
    • log.retention.minutes=480
  • mod-inventory memory
    • 256 CPU units, 1814MB mem
    • inventory.kafka.DataImportConsumerVerticle.instancesNumber=10
    • inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber=10
    • kafka.consumer.max.poll.records=10
  • mod-inventory-storage
    • 128 CPU units, 544MB mem
  • mod-source-record-storage
    • 128 CPU units, 908MB mem
  • mod-source-record-manager
    • 128 CPU units, 1292MB mem
  • mod-data-import
    • 128 CPU units, 1024MB mem

Software versions

  • mod-data-import v2.4.0
  • mod-source-record-manager v3.3.0
  • mod-source-record-storage v5.3.0
  • mod-inventory v18.1.0
  • mod-inventory-storage v23.0.0


Results

Summary

Lotus release is more stable and faster than Kiwi. We was able to run imports up to 100K including "update success 1" job profile. 

All of jobs in table below were performed one by one in a row without containers restart and errors. 



Profile

Duration Lotus

Lotus rerunDuration KIWI
1K MARC CreatePTF - Create 21 min 9 s
DNR
1K MARC CreatePTF - Updates Success - 11 min 30 s
DNR
2K MARC CreatePTF - Create 21 min 34 s
DNR
2K MARC CreatePTF - Updates Success - 11 min 54 s
DNR

5K MARC Create

PTF - Create 2

3 min 54 s


5 min, 8 min

5K MARC Update

PTF - Updates Success - 1

4 min 12 s


11 min, 13 min

10K MARC Create 

PTF - Create 2

6 min 45 s

8 min 12 s

11 min , 14 min

10K MARC Update

PTF - Updates Success - 1

8 min 4 s


22 min, 24 min

25K MARC CreatePTF - Create 216 min 8s20 min 3 s23 mins, 25 mins, 26 mins
25K MARC UpdatePTF - Updates Success - 119 min 50s
1 hour 20 mins (completed with errors) *, 56 mins
50K MARC CreatePTF - Create 232 min 28 s40 min 40 sCompleted with errors, 1 hr 40 mins
50K UpdatePTF - Updates Success - 139 min 5 s
2 hr 32 mins (job stuck at 76% completion)
100K MARC CreatePTF - Create 21 hr 11 min1 hr 23 minDNR
100K UpdatePTF - Updates Success - 11 hr 19 min
DNR
500K MARC CreatePTF - Create 27 hr 4 min (Completed with errors)
DNR



Resources usage

 

CPU usage shows supposed trend without spikes. In each test CPU behaves as expected.






NOTE: This time we can't see spikes of events cache topic. All imports become more stable and fast in Lotus release.


500K Create analysis

Job has status "Completed with Errors" and it's finished (didn't stack)


Records created: 

SRSInstancesHoldingsItems
500000499992499992499992





CPU usage shows stable trend as it supposed to be, without sudden spikes and anomalies






ERRORS

mod-inventory-storage


23:55:36 978559/instance-storage fs09000000 9eb67301-6f6e-468f-9b1a-6134dc39a684 mod_inventory_storage ERROR Conn ERROR: Cannot update record af8bcc5f-2aa1-49f3-9287-3a01bb118e75 because it has been changed (optimistic locking): Stored _version is 3, _version of request is 2 (23F09)
io.vertx.pgclient.PgException: ERROR: Cannot update record af8bcc5f-2aa1-49f3-9287-3a01bb118e75 because it has been changed (optimistic locking): Stored _version is 3, _version of request is 2 (23F09)
at io.vertx.pgclient.impl.codec.ErrorResponse.toException(ErrorResponse.java:31) ~?
at io.vertx.pgclient.impl.codec.QueryCommandBaseCodec.handleErrorResponse(QueryCommandBaseCodec.java:57) ~?
at io.vertx.pgclient.impl.codec.ExtendedQueryCommandCodec.handleErrorResponse(ExtendedQueryCommandCodec.java:90) ~?
at io.vertx.pgclient.impl.codec.PgDecoder.decodeError(PgDecoder.java:246) ~?
at io.vertx.pgclient.impl.codec.PgDecoder.decodeMessage(PgDecoder.java:132) ?
at io.vertx.pgclient.impl.codec.PgDecoder.channelRead(PgDecoder.java:112) ?
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) ?
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ?
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ?
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ?
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ?
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ?
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ?
at io.netty.channel.DefaultChannelPipeline.fire