Overview

  • This document contains the results of testing Data Import (DI) for MARC Bibliographic records with an update job in the Quesnelia release on qcp1 environments, with the Kafka consolidated topics and file splitting features enabled, on a non-ECS environment.

Jira: PERF-832

Summary

  • Data import tests finished successfully; only Test №5 had one failed record, for Tenant 2 (qcp1-01), when processing the 50k file. DI duration grew with the number of records in the files.
  • Check-in and Check-out with 5 virtual users were performed during the DI "Create new MARC authority records" jobs for non-matches. No issues were found.
  • Data Import in Quesnelia performs faster without CICO than with it.
  • Comparing the Poppy and Quesnelia releases:
    • Check-in / Check-out perform better in Quesnelia. Response times during long-running Create jobs improved by 15% on average.
    • DI durations improved by 11%-14% on average.
  • Files were loaded in a different tenant order, Tenant 3 (qcp1-02) → Tenant 2 (qcp1-01) → Tenant 1 (qcp1-00), to avoid the problem where mod-permissions spiked and the system got stuck.

Test Results and Comparison

Test №1

DI test with 1k, 10k, 25k and 50k record files, started on one tenant only, with comparative results between Poppy and Quesnelia.


| # of records | % creates | File | DI duration Morning Glory | DI duration Nolana | DI duration Orchid | DI duration Poppy | DI duration Quesnelia (delta vs Poppy) |
|---|---|---|---|---|---|---|---|
| 1,000 | 100 | 1k_marc_authority.mrc | 24 sec | 27 sec | 41 sec | 29 sec | 22 sec (-24%) |
| 5,000 | 100 | LC_SUBJ_msplit00000000.mrc | 1 min 21 sec | 1 min 15 sec | 1 min 21 sec | 1 min 38 sec | 1 min 19 sec (-19%) |
| 10,000 | 100 | msplit00000000.mrc | 2 min 32 sec | 2 min 31 sec | 2 min 53 sec | 2 min 53 sec | 2 min 36 sec (-9.8%) |
| 22,778 (for Poppy test), 25,000 (for Quesnelia test) | 100 | msplit00000013.mrc | 11 min 14 sec | 7 min 7 sec | 5 min 42 sec | 6 min 24 sec | 6 min 19 sec (-1.3%) |
| 50,000 | 100 | 50000_authorityrecords.mrc | 22 min | 11 min 24 sec | 11 min 11 sec | 13 min 48 sec | 11 min 59 sec (-13%) |
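
The percentages in the Quesnelia column are consistent with a relative change against the Poppy duration, (Quesnelia - Poppy) / Poppy. The following is a minimal sketch of that arithmetic, not the tooling used for this report; the helper names `to_seconds` and `pct_delta` are hypothetical.

```python
import re


def to_seconds(text: str) -> int:
    """Convert a duration such as '1 min 19 sec', '1min 21s' or '22 min' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*s", text)
    total = 0
    if minutes:
        total += int(minutes.group(1)) * 60
    if seconds:
        total += int(seconds.group(1))
    return total


def pct_delta(baseline: str, current: str) -> float:
    """Relative change of `current` against `baseline`, in percent (negative means faster)."""
    base = to_seconds(baseline)
    return (to_seconds(current) - base) / base * 100


# 1k file from the table above: Poppy 29 sec, Quesnelia 22 sec
print(round(pct_delta("29 sec", "22 sec"), 1))
```

A negative value means the Quesnelia run was shorter than the Poppy run for the same file.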


Test №2

Test with CICO for 5 concurrent users and DI with 1K, 5K, 10K, 25K and 50K record files, started on one tenant only.


  • Comparative baseline Check-In/Check-Out results without Data Import between Poppy and Quesnelia.

| | CICO, Median time without DI (Poppy) | CICO, 95% time without DI (Poppy) | CICO, Median time without DI (Quesnelia) | CICO, 95% time without DI (Quesnelia) | CICO, Avg time without DI (Quesnelia) |
|---|---|---|---|---|---|
| Check-In | 516 ms | 567 ms | 503 ms (-2.5%) | 593 ms (+4.5%) | 511 ms |
| Check-Out | 910 ms | 2094 ms | 836 ms (-8%) | 1117 ms (-46%) | 876 ms |


  • Comparative Check-In/Check-Out results between Baseline (Quesnelia) and Check-In/Check-Out plus Data Import (Quesnelia).

All columns are for Quesnelia; CI and CO times are in seconds.

| # of records | DI Duration with CICO | CI time Avg | CI time 95th pct | CO time Avg | CO time 95th pct | Baseline CI Avg delta | Baseline CI 95th pct delta | Baseline CO Avg delta | Baseline CO 95th pct delta |
|---|---|---|---|---|---|---|---|---|---|
| 1,000 | 20 sec | 0.560 | 0.754 | 1.164 | 1.313 | +9% | +27% | +32% | +17% |
| 5,000 | 1 min 19 sec | 0.701 | 1.171 | 1.141 | 1.790 | +37% | +97% | +30% | +60% |
| 10,000 | 2 min 35 sec | 0.723 | 1.024 | 1.179 | 1.494 | +41% | +72% | +34% | +34% |
| 25,000 | 6 min 26 sec | 0.722 | 1.024 | 1.180 | 1.494 | +41% | +72% | +35% | +34% |
| 50,000 | 12 min 16 sec | 0.777 | 1.045 | 1.265 | 1.550 | +52% | +76% | +44% | +39% |


...

  • Comparative Data Import and Check-In/Check-Out results between Poppy and Quesnelia.

CI and CO times are in seconds. Percentages in the Quesnelia columns are deltas against the corresponding Poppy value.

| # of records (Poppy) | DI Duration with CICO (Poppy) | CI time Avg (Poppy) | CI time 95th pct (Poppy) | CO time Avg (Poppy) | CO time 95th pct (Poppy) | # of records (Quesnelia) | DI Duration with CICO (Quesnelia) | CI time Avg (Quesnelia) | CI time 95th pct (Quesnelia) | CO time Avg (Quesnelia) | CO time 95th pct (Quesnelia) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1,000 | 35 sec | 0.525 | 0.576 | 1.078 | 1.326 | 1,000 | 20 sec (-42.8%) | 0.560 (+6%) | 0.754 (+30%) | 1.164 (+8%) | 1.313 (-1%) |
| 5,000 | 1 min 41 sec | 0.513 | 0.612 | 0.9 | 1.019 | 5,000 | 1 min 19 sec (-21.7%) | 0.701 (+36%) | 1.171 (+91%) | 1.141 (+26%) | 1.790 (+75%) |
| 10,000 | 3 min 4 sec | 0.581 | 0.685 | 1.016 | 1.321 | 10,000 | 2 min 35 sec (-15.7%) | 0.723 (+24%) | 1.024 (+49%) | 1.179 (+16%) | 1.494 (+13%) |
| 22,778 | 6 min 32 sec | 0.598 | 1.542 | 1.244 | 1.729 | 25,000 | 6 min 26 sec (-1.5%) | 0.722 (+20%) | 1.024 (-33%) | 1.180 (-5%) | 1.494 (-13%) |
| 50,000 | 13 min 48 sec | 0.671 | 1.953 | 1.51 | 2.09 | 50,000 | 12 min 16 sec (-11%) | 0.777 (+15%) | 1.045 (-46%) | 1.265 (-16%) | 1.550 (-25%) |


...

PTF - environment Quesnelia (qcp1)

  • 10 db.r6g.xlarge EC2 instances located in US East (N. Virginia) us-east-1
  • 1 database instance, writer


    | Name | Memory | vCPUs |
    |---|---|---|
    | db.r6g.xlarge | 32 GiB | |


  • MSK ptf-mobius-testing2
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

...

  • The above files are all stored here - MARC Resources
    - The 22k file provided in MARC Resources does not work, so the 50k file was split into a file with 25k records, which was used instead of the 22k file.
  • At the time of the test run, Grafana was not available. As a result, response times for Check-In/Check-Out were parsed manually from a .jtl file, using the start and finish dates of the data import tests. These results were visualized in JMeter using a Listener (Response Times Over Time).
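
The manual .jtl parsing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script used for this report: it assumes the .jtl file is in JMeter's CSV format with a header row containing the `timeStamp` (epoch milliseconds), `elapsed` (milliseconds) and `label` columns, it uses a nearest-rank 95th percentile, and the function name `summarize_jtl` and the file name in the usage comment are hypothetical.

```python
import csv
import math
from collections import defaultdict


def summarize_jtl(lines, start_ms, end_ms):
    """Average and 95th percentile of `elapsed` per label for samples
    whose timeStamp falls inside [start_ms, end_ms] (the DI job window)."""
    samples = defaultdict(list)
    for row in csv.DictReader(lines):
        timestamp = int(row["timeStamp"])
        if start_ms <= timestamp <= end_ms:
            samples[row["label"]].append(int(row["elapsed"]))

    stats = {}
    for label, values in samples.items():
        values.sort()
        # Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
        rank = math.ceil(0.95 * len(values))
        stats[label] = {
            "count": len(values),
            "avg_ms": sum(values) / len(values),
            "p95_ms": values[rank - 1],
        }
    return stats


# Usage with a real file (hypothetical name), passing the DI start and finish as epoch ms:
# with open("cico_results.jtl", newline="") as jtl:
#     print(summarize_jtl(jtl, di_start_ms, di_end_ms))
```

Filtering on the DI start and finish timestamps keeps only the Check-In/Check-Out samples that overlapped the import job, which is what the per-file rows in the CICO tables need.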

Test set

  • Test 1: Manually tested DI with 1k, 10k, 25k and 50k record files, started on one tenant only.
  • Test 2: Manually tested DI with 1k, 10k, 25k and 50k record files, started on one tenant only, plus Check-in and Check-out (CICO) for 5 concurrent users.
  • Test 3: Manually tested DI with 1k, 10k, 25k and 50k record files, started on 3 tenants concurrently. Files were loaded without a pause between files in the order 50k, 25k, 10k, 5k and 1k; tenant order: Tenant 3 (qcp1-02), Tenant 2 (qcp1-01) and Tenant 1 (qcp1-00).
  • Test 4: Manually tested DI with 1k, 10k, 25k and 50k record files, started on 3 tenants concurrently. Files were loaded with a pause between files in the order 50k, 25k, 10k, 5k and 1k; tenant order: Tenant 3 (qcp1-02), Tenant 1 (qcp1-00) and Tenant 2 (qcp1-01).
  • Test 5: Manually tested DI with 1k, 10k, 25k and 50k record files, started on 3 tenants concurrently. Files were loaded without a pause between files in the order 1k, 5k, 10k, 25k and 50k; tenant order: Tenant 3 (qcp1-02), Tenant 2 (qcp1-01) and Tenant 1 (qcp1-00).

...