Data Import MARC Authorities Testing - 8/16/2022
Overview
This is an effort to retest MARC Authorities data import in the Lotus release on the lcp1 ECS cluster and compare the results with those of the Morning Glory release tests that were performed on the lcp2 ECS cluster. The retests were done with infrastructure comparable to that of the Morning Glory release tests so that we could have an apples-to-apples comparison.
Summary
- Morning Glory's DI MARC Authorities performance is about the same as Lotus'
- mod-source-record-manager's memory setting has a small impact on performance. Doubling the memory allocation in some cases improves the speed of import.
- DI performance improves when the number of partitions equals the number of topic consumers.
- Having many more partitions (50) without an equal number of topic consumers does not improve performance.
Bugfest-like Settings
Tests were done with Morning Glory's Bugfest-like settings so that the results could be compared apples to apples with Morning Glory's.
Module | Version | CPU | Memory (lower value) | Memory (higher value) | Number of Tasks | Task Definition Revision |
---|---|---|---|---|---|---|
mod-data-import | 2.4.2 | 256 | 1844 | 2048 | 1 | 6 |
mod-data-import-cs | 1.13.3 | 128 | 896 | 1024 | 2 | 6 |
mod-source-record-storage | 5.3.3 | 1024 | 1296 | 1440 | 2 | 13 |
mod-source-record-manager | 3.3.8 | 1024 | 3688 | 4096 | 2 | 17 |
mod-inventory | 18.1.6 | 1024 | 2592 | 2880 | 2 | 26 |
mod-inventory-storage | 23.0.5 | 1024 | 1684 | 1872 | 5 | 12 |
Kafka
All DI topics had 1 partition and all inventory topics (item | holdings-record | instance) also had 1 partition each.
Tests 1 Results (On lcp1)
File Size | Import Duration |
---|---|
1K | 46s |
5K | 159s (2m 39s) |
10K | 239s (3m 59s) |
25K | 690s (11m 30s) |
50K | 1286s (21m 26s) |
Morning Glory (with lcp2's) Settings
Tests were done with the PTF Morning Glory environment's settings for an apples-to-apples comparison. Only mod-srm's memory deviates from Bugfest's here: it is 50% lower.
Module | Version | CPU | Memory (lower value) | Memory (higher value) | Number of Tasks | Task Definition Revision |
---|---|---|---|---|---|---|
mod-data-import | 2.4.2 | 256 | 1844 | 2048 | 1 | 6 |
mod-data-import-cs | 1.13.3 | 128 | 896 | 1024 | 2 | 6 |
mod-source-record-storage | 5.3.3 | 1024 | 1296 | 1440 | 2 | 13 |
mod-source-record-manager | 3.3.8 | 1024 | 1844 | 2048 | 2 | 18 |
mod-inventory | 18.1.6 | 1024 | 2592 | 2880 | 2 | 26 |
mod-inventory-storage | 23.0.5 | 1024 | 1684 | 1872 | 2 | 12 |
Kafka
All DI topics had 1 partition and all inventory topics (item | holdings-record | instance) also had 1 partition each.
Tests 2 Results (On lcp2)
File Size | Import Duration | versus Test 1 Results |
---|---|---|
1K | 79s (1m 19s) | 72% (slower) |
5K | 125s (2m 5s) | 21% (faster) |
10K | 322s (5m 22s) | 35% (slower) |
25K | 600s (10m) | 13% (faster) |
50K | 1387s (23m 7s) | 8% (slower) |
Conclusion 1
- When mod-srm has less memory, MARC Authority data import is a bit slower: by 33s for 1K, 1m 23s for 10K, and 1m 41s for 50K. Interestingly, the 5K and 25K imports were a bit faster.
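The percentages in the "versus Test 1 Results" column can be reproduced from the durations in seconds. The sketch below assumes each percentage is the difference divided by the Tests 1 duration; the function name is illustrative and not part of any test tooling.

```python
def percent_vs_baseline(baseline_s: int, other_s: int) -> str:
    """Describe other_s relative to baseline_s, e.g. '72% (slower)'."""
    change = (other_s - baseline_s) / baseline_s * 100
    direction = "slower" if change > 0 else "faster"
    return f"{round(abs(change))}% ({direction})"


# Durations in seconds, taken from the Tests 1 and Tests 2 tables.
tests_1 = {"1K": 46, "5K": 159, "10K": 239, "25K": 690, "50K": 1286}
tests_2 = {"1K": 79, "5K": 125, "10K": 322, "25K": 600, "50K": 1387}

for size in tests_1:
    print(size, percent_vs_baseline(tests_1[size], tests_2[size]))
```

With this definition the 25K row evaluates to 13% (faster); dividing by the Tests 2 duration instead would give 15%.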
Tests 3 Results
After giving every DI topic 2 partitions.
File Size | Import Duration | Versus Test 2 Results |
---|---|---|
1K | 35s | 56% (faster) |
5K | 82s (1m 22s) | 34% (faster) |
10K | 189s (3m 9s) | 41% (faster) |
25K | 520s (8m 40s) | 13% (faster) |
50K | 1341s (22m 21s) | 3% (faster) |
Conclusion 2
- Comparing the results of Tests 3 to Tests 2, when each DI topic has two partitions instead of one, import performance improved remarkably, especially for smaller files.
By default, all inventory topics (inventory.holdings-record, inventory.item, inventory.instance) have 50 partitions each after the modules are enabled for the first time. However, when Kafka automatically recreates the topics (for example, after they have been deleted), they default to 1 partition, and such was the case in the PTF Kafka cluster. The following tests were done after giving all inventory topics 50 partitions again.
File Size | Import Duration (Tests 4) | Import Duration (Tests 5) | Import Duration (Tests 6) |
---|---|---|---|
1K | 1m 56s | 55s | 23s |
5K | 1m 56s | 7m 7s | 1m 39s |
10K | 2m 56s | 3m 12s | 2m 26s |
25K | 7m 52s | 16m 43s | 6m 57s |
50K | 42m 51s | 31m 2s | 18m 27s |
The results of Tests 5 and 6 were unexpected because the imports got faster. We had hypothesized that having 50 partitions would not improve performance, and might even make it worse, because there are not enough consumers available (only 2 mod-inventory-storage tasks) to take advantage of the concurrency of 50 "lanes".
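The reasoning behind this hypothesis can be illustrated with a small model. Within a single Kafka consumer group, each partition is read by at most one consumer at a time, so the number of consumers doing work cannot exceed the number of partitions, and extra partitions only mean more partitions per consumer. The sketch below assumes one consumer per task and a simple round-robin assignment; it is a toy model, not Kafka's actual assignor, and the function names are illustrative.

```python
from typing import List


def assign_round_robin(partitions: int, consumers: int) -> List[List[int]]:
    """Spread partition ids over consumers; each partition has exactly one owner."""
    assignment: List[List[int]] = [[] for _ in range(consumers)]
    for partition_id in range(partitions):
        assignment[partition_id % consumers].append(partition_id)
    return assignment


def active_consumers(partitions: int, consumers: int) -> int:
    """Count the consumers that own at least one partition."""
    return sum(1 for owned in assign_round_robin(partitions, consumers) if owned)


# Two consumers (assumed: one per task), with 1, 2 and 50 partitions.
for partitions in (1, 2, 50):
    owned = assign_round_robin(partitions, consumers=2)
    print(partitions, "partition(s):", active_consumers(partitions, 2),
          "active consumer(s), partitions per consumer:", [len(p) for p in owned])
```

In this model, going from 2 to 50 partitions leaves the number of active consumers at 2, which is why the hypothesis expected no gain from the extra partitions.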
Tests 7 were run after setting the inventory topics back to 1 partition to see whether performance improves, and it does.
File Size | Import Duration (Tests 4) | Import Duration w/1 partition (Tests 7) |
---|---|---|
1K | 1m 56s | 38s |
5K | 1m 56s | 1m 47s |
10K | 2m 56s | 3m 2s |
25K | 7m 52s | 9m 50s |
50K | 42m 51s | 21m 1s |
Here we compared Tests 7 against Tests 4 rather than Tests 5 or 6 simply because we don't think Tests 5 and 6 are consistent with our hypothesis. However, more testing is needed going forward to see whether a higher number of partitions helps or hurts performance. For now we can tentatively conclude the following:
Conclusion 3
- Having fewer partitions (1 or 2), or as many as there are consumers (2), optimizes performance.
Tests 8: Lotus vs Morning Glory - comparison with different numbers of Kafka partitions.
File Size | Import Duration (lcp1, 50 Kafka partitions) | Import Duration (lcp2, 50 Kafka partitions) | Import Duration (lcp1, 1 Kafka partition) | Import Duration (lcp2, 1 Kafka partition) |
---|---|---|---|---|
1K | 23s | 21s | 39s | 26s |
5K | 1m 39s | 1m 18s | 1m 48s | 1m 58s |
10K | 2m 26s | 2m 19s | 3m 2s | 4m 45s |
25K | 6m 57s | 6m 25s | 9m 50s | 12m 25s |
50K | 18m 27s | 15m 53s | 21m 1s | 19m 49s |
Test 8 results show that the DI MARC Authority performance is about the same between the Lotus and Morning Glory releases.
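One way to quantify "about the same" is to convert the durations to seconds and take the lcp2/lcp1 ratio for each file size. The sketch below parses the duration format used in the tables; the helper name is illustrative, and only the 50-partition columns are included.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert strings such as '18m 27s', '10m' or '23s' to seconds."""
    text = duration.strip()
    match = re.fullmatch(r"(?:(\d+)m)?\s*(?:(\d+)s)?", text)
    if not text or not match:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return minutes * 60 + seconds


# 50-partition columns of the Tests 8 table: (lcp1, lcp2).
tests_8 = {
    "1K": ("23s", "21s"),
    "5K": ("1m 39s", "1m 18s"),
    "10K": ("2m 26s", "2m 19s"),
    "25K": ("6m 57s", "6m 25s"),
    "50K": ("18m 27s", "15m 53s"),
}

for size, (lcp1, lcp2) in tests_8.items():
    ratio = to_seconds(lcp2) / to_seconds(lcp1)
    print(f"{size}: lcp2/lcp1 = {ratio:.2f}")
```

A ratio below 1 means the lcp2 run was faster than the lcp1 run for that file size.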
Jobs comparison with different users' network connections for lcp2, 50 Kafka partitions (18/08/2022).
File Size | Import Duration (No throttling) | Import Duration (Fast 3G) | Import Duration (Slow 3G) |
---|---|---|---|
1K | 19s | 39s | 55s |
5K | 1m 10s | 2m 1s | 2m 46s |
10K | 2m 14s | 3m 54s | 5m 40s |
25K | 9m 59s | 13m 2s | 13m 59s |
50K | 16m 52s | 24m 45s | 32m 13s |
Observations:
A job's overall time runs from the beginning of the file upload (which is strongly affected by the user's network speed and by user interaction) to the actual finish of the import. For more accurate test results, it is better to start the import right after the file upload completes. For example, if a user uploads the file and then leaves the computer for 5 minutes before coming back to kick off the import, the overall import time also includes those 5 idle minutes.
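To separate the import itself from upload and idle time, the timestamps of the three events can be recorded and subtracted. The sketch below uses made-up timestamps and hypothetical parameter names; neither comes from the data import API or from the test runs above.

```python
from datetime import datetime


def split_job_time(upload_started: datetime, import_started: datetime,
                   import_finished: datetime) -> dict:
    """Split a job's overall time into pre-import time and import time, in seconds."""
    return {
        "overall_s": (import_finished - upload_started).total_seconds(),
        "upload_and_idle_s": (import_started - upload_started).total_seconds(),
        "import_s": (import_finished - import_started).total_seconds(),
    }


# Hypothetical job: upload starts at 10:00:00, the import is kicked off
# 5 min 30 s later (upload plus idle time), and it finishes 19 s after that.
timings = split_job_time(
    datetime(2022, 8, 18, 10, 0, 0),
    datetime(2022, 8, 18, 10, 5, 30),
    datetime(2022, 8, 18, 10, 5, 49),
)
print(timings)
```

In this made-up case the overall time is 349 seconds, of which only 19 seconds are the import itself.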