Data Import MARC Authorities (Lotus Snapshot)

Later testing found that the actual import durations were about two times longer than reported here. The PTF environment was missing a DB trigger; once the trigger was restored, import durations doubled.


PERF-210

UIDATIMP-1033


Overview 

This document contains the results of testing Data Import of MARC Authorities in pre-Lotus to detect performance trends.

It covers only standalone MARC Authorities imports, without background activities (Check-in/Check-out). Check-in/Check-out testing was not performed in parallel with the imports because of issues with the Check-in/Check-out workflow (possibly caused by the snapshot versions of the modules).

Infrastructure

  • 6 m5.xlarge EC2 instances 
  • 2 db.r6.xlarge database instances, one reader and one writer
  • MSK
    • 4 m5.2xlarge brokers in 2 zones
    • auto.create-topics.enable = true
    • log.retention.minutes=120
  • mod-inventory memory
    • 256 CPU units, 1814MB mem
    • inventory.kafka.DataImportConsumerVerticle.instancesNumber=10
    • inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber=10
    • kafka.consumer.max.poll.records=10
  • mod-inventory-storage
    • 128 CPU units, 544MB mem
  • mod-source-record-storage
    • 128 CPU units, 908MB mem
  • mod-source-record-manager
    • 128 CPU units, 1292MB mem
  • mod-data-import
    • 128 CPU units, 1024MB mem


Software versions

All of these snapshot versions were taken in the week of January 10.

  • mod-data-import-2.3.0-SNAPSHOT.220
  • mod-data-import-converter-storage-1.13.0-SNAPSHOT.183
  • mod-source-record-storage-5.3.0-SNAPSHOT.378
  • mod-source-record-manager-3.3.0-SNAPSHOT.526
  • mod-inventory-18.1.0-SNAPSHOT.475
  • mod-inventory-storage-22.1.0-SNAPSHOT.650
  • mod-search-1.6.0-SNAPSHOT.172
  • mod-quick-marc-2.3.0-SNAPSHOT.145
  • folio/marc-authorities 1.0.100034 (folio_marc-authorities-1.0.100034)


Results


Test | Profile                             | Morning Glory Duration
1K   | Default - Create SRS MARC Authority | 40 sec
5K   | Default - Create SRS MARC Authority | 1 min 14 sec
10K  | Default - Create SRS MARC Authority | 2 min 32 sec
25K  | Default - Create SRS MARC Authority | 6 min
50K  | Default - Create SRS MARC Authority | 12 min
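
The durations in the table above can be turned into throughput figures with simple arithmetic. The following is a minimal sketch, not part of the original test tooling: the helper names `to_seconds` and `records_per_second` are hypothetical, and the record counts assume that "1K" means 1,000 records, and so on.

```python
import re

# Durations copied from the results table (profile: Default - Create SRS MARC Authority).
# Assumption: "1K" means 1,000 records, "5K" means 5,000 records, etc.
RESULTS = {
    1_000: "40 sec",
    5_000: "1 min 14 sec",
    10_000: "2 min 32 sec",
    25_000: "6 min",
    50_000: "12 min",
}


def to_seconds(text: str) -> int:
    """Convert a duration such as '1 min 14 sec' or '6 min' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*s", text)
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total


def records_per_second(records: int, duration: str) -> float:
    """Average throughput for one import, in records per second."""
    return records / to_seconds(duration)


for records, duration in RESULTS.items():
    rate = records_per_second(records, duration)
    print(f"{records:>6} records: {to_seconds(duration):>4} s, {rate:.1f} records/s")
```

These figures use the durations exactly as reported in the table, so they inherit any measurement error in those durations.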

User Story Acceptance Criteria:

Answering questions:
  • What is the recommended maximum file size a user can upload to create MARC authority records?    
    • Maximum file size tried: 371K; however, there are problems with this particular file (chunk issue). Largest file that passed successfully: 50K.
  • How many MARC authority records can be created in 30 minutes?
    • ±100K. All of the tests above were performed in a row.
  • How many MARC authority records can be created in an hour?
    • -----
  • How long will it take to create 1,000 MARC authority records via data import?
    • 40 sec
  • How long will it take to create 5,000 MARC authority records via data import?
    • 1 min 14 sec
  • How long will it take to create 10,000 MARC authority records via data import?
    • 2 min 32 sec
  • How long will it take to create 20,000 MARC authority records via data import?
    • <6 min

Resource Usages


Notable observations: 

  • Resource usage is roughly the same as in the Kiwi release; see Data Import Test Report (Kiwi)#5K-10KImports.
  • Some of the files failed with the following error (UIDATIMP-1087): [vert.x-worker-thread-10] ERROR rdChunksKafkaHandler [72160511eqId] RecordCollection processing has failed with errors with event: 'null', chunkId: '6c96f806-e9f5-4720-b680-b44361a3a0e7', chunkNumber '2119'-'19' with recordId: 'null'.


Max resource usage (CPU):

Test | Module                    | CPU %
50K  | mod-source-record-manager | 685
50K  | mod-data-import-cs        | 404
50K  | okapi                     | 385
50K  | mod-inventory             | 235
50K  | mod-source-record-storage | 231
50K  | mod-inventory-storage     | 117
50K  | mod-data-import           | 111




Max resource usage (RAM):

Module                    | RAM %
mod-source-record-manager | 60
mod-data-import-cs        | 80
okapi                     | 76
mod-inventory             | 87
mod-source-record-storage | 78-80
mod-inventory-storage     | 51
mod-data-import           | 22






Across the whole test set, instance CPU usage was below 50%.





Results after updating the modules



Test | Profile                             | Duration new | Duration old
5K   | Default - Create SRS MARC Authority | 2 min 15 s   | 1 min 14 sec
10K  | Default - Create SRS MARC Authority | 3 min 58 s   | 2 min 32 sec
25K  | Default - Create SRS MARC Authority | 10 min 47 s  | 6 min
50K  | Default - Create SRS MARC Authority | 18 min 18 s  | 12 min
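
To compare the new and old durations in the table above, the slowdown factor for each test can be computed as new duration divided by old duration. This is a minimal sketch under that assumption; the helper names `to_seconds` and `slowdown` are hypothetical and not part of any FOLIO tooling.

```python
import re

# (new, old) durations copied from the table above.
DURATIONS = {
    "5K": ("2 min 15 s", "1 min 14 sec"),
    "10K": ("3 min 58 s", "2 min 32 sec"),
    "25K": ("10 min 47 s", "6 min"),
    "50K": ("18 min 18 s", "12 min"),
}


def to_seconds(text: str) -> int:
    """Convert a duration such as '2 min 15 s' or '6 min' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*s", text)
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total


def slowdown(new: str, old: str) -> float:
    """Ratio of the new duration to the old one (values above 1 mean slower)."""
    return to_seconds(new) / to_seconds(old)


for test, (new, old) in DURATIONS.items():
    print(f"{test:>4}: {to_seconds(old):>4} s -> {to_seconds(new):>4} s, x{slowdown(new, old):.2f}")
```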