Data Import MARC Authorities Testing - 8/16/2022

Overview

This is an effort to retest MARC Authorities data import on the Lotus release (lcp1 ECS cluster) and to compare the results with those of the Morning Glory release tests that were performed on the lcp2 ECS cluster.  The retests were done with infrastructure comparable to the Morning Glory environment's so that we could make an apples-to-apples comparison.

Summary

  1. Morning Glory's DI MARC Authorities performance is about the same as Lotus's.
  2. mod-source-record-manager's memory setting has a small impact on performance; doubling the memory allocation improves import speed in some cases.
  3. DI performance improves when the number of partitions equals the number of topic consumers.
  4. Having many more partitions (50) without a matching number of topic consumers does not improve performance.

Bugfest-like Settings

Tests were done with Morning Glory's Bugfest-like settings in order to allow an apples-to-apples comparison with Morning Glory's results.

Module                     Version  CPU   Memory       Number of Tasks  Task Definition Revision
mod-data-import            2.4.2    256   1844 | 2048  1                6
mod-data-import-cs         1.13.3   128   896 | 1024   2                6
mod-source-record-storage  5.3.3    1024  1296 | 1440  2                13
mod-source-record-manager  3.3.8    1024  3688 | 4096  2                17
mod-inventory              18.1.6   1024  2592 | 2880  2                26
mod-inventory-storage      23.0.5   1024  1684 | 1872  5                12

Kafka

All DI topics had 1 partition and all inventory topics (item | holdings-record | instance) also had 1 partition each.


Tests 1 Results (On lcp1)

File Size  Import Duration
1K         46s
5K         159s (2m 39s)
10K        239s (3m 59s)
25K        690s (11m 30s)
50K        1286s (21m 26s)
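As a rough sanity check (not part of the original report), the per-record throughput implied by the Test 1 durations can be computed, assuming "1K" means 1,000 MARC Authority records and so on:

```python
# Throughput implied by the Test 1 durations on lcp1 (Lotus release).
# Assumption: file-size labels map directly to record counts (1K = 1,000 records).
durations_s = {1_000: 46, 5_000: 159, 10_000: 239, 25_000: 690, 50_000: 1286}

for records, seconds in durations_s.items():
    print(f"{records:>6} records: {records / seconds:.1f} records/s")
```

Throughput rises from roughly 22 records/s for the 1K file to a plateau of roughly 36-42 records/s for the larger files, suggesting a fixed per-job overhead that dominates small imports.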

Morning Glory (lcp2) Settings

Tests were done with the PTF Morning Glory environment's settings for an apples-to-apples comparison.  Here only mod-srm's memory deviates from the Bugfest-like settings, at 50% less.

Module                     Version  CPU   Memory       Number of Tasks  Task Definition Revision
mod-data-import            2.4.2    256   1844 | 2048  1                6
mod-data-import-cs         1.13.3   128   896 | 1024   2                6
mod-source-record-storage  5.3.3    1024  1296 | 1440  2                13
mod-source-record-manager  3.3.8    1024  1844 | 2048  2                18
mod-inventory              18.1.6   1024  2592 | 2880  2                26
mod-inventory-storage      23.0.5   1024  1684 | 1872  2                12

Kafka

All DI topics had 1 partition and all inventory topics (item | holdings-record | instance) also had 1 partition each.

Tests 2 Results (On lcp2)

File Size  Import Duration  Versus Test 1 Results
1K         79s (1m 19s)     72% slower
5K         125s (2m 5s)     21% faster
10K        322s (5m 22s)    35% slower
25K        600s (10m)       13% faster
50K        1387s (23m 7s)   8% slower
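The "versus" percentages follow a simple relative-change formula against the Test 1 baseline. A quick sketch, with the durations (in seconds) copied from the two results tables:

```python
# Relative change of Test 2 durations (lcp2) vs. the Test 1 baseline (lcp1).
test1 = {"1K": 46, "5K": 159, "10K": 239, "25K": 690, "50K": 1286}
test2 = {"1K": 79, "5K": 125, "10K": 322, "25K": 600, "50K": 1387}

for size in test1:
    delta = (test2[size] - test1[size]) / test1[size] * 100
    label = "slower" if delta > 0 else "faster"
    print(f"{size:>3}: {abs(delta):.0f}% ({label})")
```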

Conclusion 1

  • When mod-srm has less memory, data import of MARC Authority records is a bit slower: 33s slower for 1K, 1m 23s for 10K, and 1m 41s for 50K.  Interestingly, the 5K and 25K imports were a bit faster.

Tests 3 Results 

These tests were run after giving each DI topic 2 partitions.

File Size  Import Duration  Versus Test 2 Results
1K         35s              56% faster
5K         82s (1m 22s)     34% faster
10K        189s (3m 9s)     41% faster
25K        520s (8m 40s)    13% faster
50K        1341s (22m 21s)  3% faster

Conclusion 2

  • Comparing the results of Tests 3 to Tests 2, when each DI topic has two partitions instead of one, import performance improved remarkably, especially for smaller files.


By default, all inventory topics (inventory.holdings-record, inventory.item, inventory.instance) have 50 partitions each after the modules are first enabled.  However, when Kafka automatically recreates the topics (if they had been deleted), they default to 1 partition, and such was the case in the PTF Kafka cluster.  The following tests were done after restoring all inventory topics to 50 partitions.

File Size  Import Duration (Tests 4)  Import Duration (Tests 5)  Import Duration (Tests 6)
1K         1m 56s                     55s                        23s
5K         1m 56s                     7m 7s                      1m 39s
10K        2m 56s                     3m 12s                     2m 26s
25K        7m 52s                     16m 43s                    6m 57s
50K        42m 51s                    31m 2s                     18m 27s

Results of Tests 5 and 6 were unexpected because the response times got faster.  We had hypothesized that having 50 partitions would not improve performance, or would even make it worse, because there are not enough consumers (only 2 mod-inventory-storage tasks) to take advantage of the 50 "lanes" of concurrency.
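The hypothesis rests on how Kafka balances partitions within a consumer group: each partition is owned by exactly one consumer, so effective parallelism is capped by the smaller of the partition count and the consumer count. A minimal round-robin illustration (simplified; real Kafka assignors differ in detail):

```python
def assign_partitions(partitions: int, consumers: int) -> dict[int, list[int]]:
    """Round-robin partition assignment, roughly as a consumer group balances topic partitions."""
    assignment = {c: [] for c in range(consumers)}
    for p in range(partitions):
        assignment[p % consumers].append(p)
    return assignment

# 50 partitions but only 2 consumer tasks: each consumer still works through
# 25 partitions, so parallelism is min(50, 2) == 2, not 50.
many = assign_partitions(partitions=50, consumers=2)
print(len(many[0]), len(many[1]))  # 25 25

# 2 partitions and 2 consumers: every "lane" has a dedicated consumer.
balanced = assign_partitions(partitions=2, consumers=2)
print(len(balanced[0]), len(balanced[1]))  # 1 1
```

This is why adding partitions beyond the consumer count adds rebalancing and bookkeeping overhead without adding concurrency.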

Tests 7 were run after setting the inventory topics back to 1 partition to see if performance improves, and it does.

File Size  Import Duration (Tests 4)  Import Duration w/ 1 partition (Tests 7)
1K         1m 56s                     38s
5K         1m 56s                     1m 47s
10K        2m 56s                     3m 02s
25K        7m 52s                     9m 50s
50K        42m 51s                    21m 1s

Here we picked Tests 4 to compare against Tests 7, and did not choose Tests 5 or 6, simply because we do not think those results are consistent with our hypothesis.  More testing is needed going forward to determine whether a higher number of partitions helps or hurts performance.  For now we can tentatively conclude the following:

Conclusion 3

  • Having fewer partitions (1 or 2), up to the number of consumers (2), optimizes performance.

Tests 8: Lotus vs. Morning Glory - comparison with different numbers of Kafka partitions


           50 Kafka partitions                       1 Kafka partition
File Size  Import Duration   Import Duration         Import Duration   Import Duration
           (lcp1)            (lcp2)                  (lcp1)            (lcp2)
1K         23s               21s                     39s               26s
5K         1m 39s            1m 18s                  1m 48s            1m 58s
10K        2m 26s            2m 19s                  3m 2s             4m 45s
25K        6m 57s            6m 25s                  9m 50s            12m 25s
50K        18m 27s           15m 53s                 21m 1s            19m 49s

Test 8 results show that the DI MARC Authority performance is about the same between the Lotus and Morning Glory releases.

Job comparison with different user network connection speeds, on lcp2 with 50 Kafka partitions (8/18/2022).

File Size  Import Duration   Import Duration   Import Duration
           (No throttling)   (Fast 3G)         (Slow 3G)
1K         19s               39s               55s
5K         1m 10s            2m 1s             2m 46s
10K        2m 14s            3m 54s            5m 40s
25K        9m 59s            13m 2s            13m 59s
50K        16m 52s           24m 45s           32m 13s
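To quantify the throttling effect, the "Xm Ys" durations above can be parsed back into seconds and compared against the unthrottled run. A small sketch using the 50K row:

```python
import re

def to_seconds(duration: str) -> int:
    """Parse durations like '16m 52s', '2m 1s', or '55s' into seconds."""
    m = re.fullmatch(r"(?:(\d+)m\s*)?(\d+)s", duration.strip())
    minutes, seconds = m.group(1) or 0, m.group(2)
    return int(minutes) * 60 + int(seconds)

# 50K import durations from the table above (lcp2, 50 Kafka partitions).
baseline = to_seconds("16m 52s")   # no throttling
fast_3g = to_seconds("24m 45s")
slow_3g = to_seconds("32m 13s")

print(f"Fast 3G: {fast_3g / baseline:.2f}x the unthrottled time")  # 1.47x
print(f"Slow 3G: {slow_3g / baseline:.2f}x the unthrottled time")  # 1.91x
```

So at 50K records the slow 3G connection nearly doubles the overall job time, which supports the observation below that upload speed is a large part of the measured total.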

Observations:

A job's overall time runs from the beginning of the file upload (which is strongly affected by the user's network speed and by user interaction) to the actual finish of the import.  For more accurate test results, it is better to start the import right after the file upload finishes.  For example, if a user uploads the file and then leaves the computer for 5 minutes before kicking off the import, the overall import time also includes those 5 idle minutes.