[Ramsons] Migrate/Update large number of Marc Authority records [NON-ECS]
Overview
This document contains the results of testing marc-migration for over 16 million records in the Ramsons release. Ticket: https://folio-org.atlassian.net/browse/PERF-929
Summary
The duration for marc-migration is about 9.5 hours (7 hours for data mapping, 2.5 hours for data saving) with the default configuration and background jobs enabled (e.g. "Deleting old records from marc_indexers").
The best results are for test #9 with CHUNK_FETCH_IDS_COUNT=12000 and RECORDS_CHUNK_SIZE=4000 - total duration 4 hours 2 min (3 hours 35 minutes for data mapping and 27 minutes for data saving).
Memory utilization increased up to 75% for mod-entities-links, 62% for mod-search, and 60% for mod-marc-migrations. The increase is attributed to earlier module restarts and the daily cluster shutdown process; no memory leak is suspected in any module.
CPU utilization was up to 453% for mod-entities-links, 20% for mod-search, and 15% for mod-marc-migrations. Average CPU usage did not exceed 10% for all other modules. Note that the CPU allocation for mod-marc-migrations is set to 0, and actual CPU usage was about 30% on the AWS instance where mod-marc-migrations was located.
For all tests, approximate DB CPU usage is up to 70%.
Additional tests
Test with 2 containers (tasks) for mod-marc-migrations and 2 tenants in parallel:
Running migrations for 2 tenants in parallel does not work, even with 2 containers for mod-marc-migrations. The 2nd job is queued with the status "new" while the 1st is running. The 2nd job starts and finishes successfully after the 1st one completes.
Test with 1 container (task) for mod-marc-migrations:
There is no reason to use 2 containers for mod-marc-migrations because the duration with 1 container is about the same as with 2 (the process uses only one of the 2 allocated mod-marc-migrations containers).
Recommendations and Jiras
Increase the default CPU allocation for the mod-entities-links service or set it to 0.
Use CHUNK_FETCH_IDS_COUNT=12000 and RECORDS_CHUNK_SIZE=4000 to decrease migration time. Note that mod-entities-links will use 25% more CPU with these settings.
Use only 1 container (task) for mod-marc-migrations.
While data mapping is running, the data files are stored directly in the working mod-marc-migrations container. Afterwards, all files are deleted from the container and moved to the S3 bucket (if no S3 bucket is provided, data mapping fails).
If the container goes down during data mapping, all files are lost and the data mapping job hangs indefinitely.
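The chunk-size recommendation above can be illustrated with a minimal sketch. It assumes that CHUNK_FETCH_IDS_COUNT and RECORDS_CHUNK_SIZE are passed to mod-marc-migrations as container environment variables in an ECS task definition; this report does not state how they were set, and the helper name `container_overrides` is hypothetical.

```python
import json


def container_overrides(fetch_ids_count: int, records_chunk_size: int) -> dict:
    """Build a container-definition fragment with the two chunk settings.

    Assumption: the module reads these values from environment variables.
    ECS expects environment values as strings, hence the str() calls.
    """
    return {
        "name": "mod-marc-migrations",
        "environment": [
            {"name": "CHUNK_FETCH_IDS_COUNT", "value": str(fetch_ids_count)},
            {"name": "RECORDS_CHUNK_SIZE", "value": str(records_chunk_size)},
        ],
    }


# Values recommended in this report (test #9).
fragment = container_overrides(12000, 4000)
print(json.dumps(fragment, indent=2))
```

The fragment would still need to be merged into the full task definition used on the target environment.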
Results
Migrated and saved 16 462 866 records.
| Test # | CHUNK_FETCH_IDS_COUNT | RECORDS_CHUNK_SIZE | Data mapping (hh:mm:ss) | Data saving (hh:mm:ss) |
|---|---|---|---|---|
| 1* | default | default | 07:09:31 | 02:27:48 |
| 2* | 500 | 500 | 06:00:51 | 01:47:44 |
| 3 | 500 | 500 | 04:48:07 | 01:21:40 |
| 4 | 2000 | 1000 | 04:14:25 | 00:45:15 |
| 4a | 2000 | 1000 | 04:10:54 | 00:46:32 |
| 5 | 4000 | 2000 | 03:42:54 | 00:30:56 |
| 6 | 5000 | 2500 | 04:05:18 | 00:35:02 |
| 6a | 5000 | 2500 | 03:58:10 | 00:34:02 |
| 7 | 7000 | 3500 | 03:40:23 | 00:38:46 |
| 8 | 10000 | 5000 | 03:34:16 | 00:33:35 |
| 9 | 12000 | 4000 | 03:35:15 | 00:27:19 |

* - Tests were performed with the "Deleting old records from marc_indexers" jobs enabled
4a - repeat of test 4
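The total duration per test (data mapping plus data saving) can be recomputed from the table with a short script. This is a sketch for checking the figures, not part of the test tooling. The records-per-second value is a derived figure (total records divided by total duration) and was not measured directly.

```python
from datetime import timedelta

TOTAL_RECORDS = 16_462_866

# (test label, data mapping, data saving), copied from the results table
RESULTS = [
    ("1*", "07:09:31", "02:27:48"),
    ("2*", "06:00:51", "01:47:44"),
    ("3", "04:48:07", "01:21:40"),
    ("4", "04:14:25", "00:45:15"),
    ("4a", "04:10:54", "00:46:32"),
    ("5", "03:42:54", "00:30:56"),
    ("6", "04:05:18", "00:35:02"),
    ("6a", "03:58:10", "00:34:02"),
    ("7", "03:40:23", "00:38:46"),
    ("8", "03:34:16", "00:33:35"),
    ("9", "03:35:15", "00:27:19"),
]


def parse_hms(text: str) -> timedelta:
    """Parse an hh:mm:ss string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta: timedelta) -> str:
    """Format a timedelta back into hh:mm:ss."""
    total = int(delta.total_seconds())
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


totals = {label: parse_hms(m) + parse_hms(s) for label, m, s in RESULTS}
best = min(totals, key=totals.get)

for label, total in totals.items():
    rate = TOTAL_RECORDS / total.total_seconds()
    print(f"test {label:>3}: total {format_hms(total)}, ~{rate:,.0f} records/s")
print("fastest test:", best)
```

Under this calculation test #9 has the shortest total, 04:02:34, which matches the 4 hours 2 minutes quoted in the summary.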
Additional testing
Tested with 1 container for mod-marc-migrations.
| Test # | CHUNK_FETCH_IDS_COUNT | RECORDS_CHUNK_SIZE | Data mapping (hh:mm:ss) | Data saving (hh:mm:ss) |
|---|---|---|---|---|
| 10 | 12000 | 4000 | 03:48:11 | 00:29:12 |
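To put the 1-container result in context, the sketch below compares the total duration of test #10 with test #9, which used the same chunk settings. It assumes test #9 ran with 2 mod-marc-migrations containers, as the surrounding sections imply. The helper `to_seconds` is illustrative only.

```python
def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Data mapping + data saving, from the results tables
two_containers = to_seconds("03:35:15") + to_seconds("00:27:19")  # test #9
one_container = to_seconds("03:48:11") + to_seconds("00:29:12")   # test #10

diff = one_container - two_containers
pct = 100 * diff / two_containers
print(f"difference: {diff} s ({pct:.1f}% of the test #9 total)")
```

The two runs differ by about 15 minutes, roughly 6% of the total. A single run per configuration is not enough to separate this from run-to-run variation.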
Test with 2 containers for mod-marc-migrations and 2 tenants in parallel.
| Test #11 | Records number | CHUNK_FETCH_IDS_COUNT | RECORDS_CHUNK_SIZE | Started | Finished | Data mapping (hh:mm:ss) | Data saving (hh:mm:ss) |
|---|---|---|---|---|---|---|---|
| Tenant 1 | 16 462 866 | 12000 | 4000 | 2024-09-17 09:37 | 2024-09-17 13:15 | 03:38:16 | 00:28:18 |
| Tenant 2 | 3 514 957 | 12000 | 4000 | 2024-09-17 09:38 (actual mapping start 2024-09-17 13:15) | 2024-09-17 13:49 | 00:34:01 (actual running time) | 00:06:01 |
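The timestamps in the Test #11 table show that the two jobs ran one after the other rather than in parallel. The sketch below derives the queue wait for Tenant 2 and the overlap between the two mapping runs from those timestamps. The variable names are illustrative, and the timestamps have minute precision only.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Timestamps copied from the Test #11 table
tenant1_started = datetime.strptime("2024-09-17 09:37", FMT)
tenant1_finished = datetime.strptime("2024-09-17 13:15", FMT)
tenant2_submitted = datetime.strptime("2024-09-17 09:38", FMT)
tenant2_mapping_started = datetime.strptime("2024-09-17 13:15", FMT)
tenant2_finished = datetime.strptime("2024-09-17 13:49", FMT)

# Time the 2nd job spent queued with the status "new"
queue_wait = tenant2_mapping_started - tenant2_submitted
# Time the 2nd job actually spent mapping
run_time = tenant2_finished - tenant2_mapping_started
# Overlap between the two mapping runs (zero means strictly sequential)
overlap = (min(tenant1_finished, tenant2_finished)
           - max(tenant1_started, tenant2_mapping_started))

print("Tenant 2 queue wait:", queue_wait)
print("Tenant 2 mapping run time:", run_time)
print("Overlap between mapping runs:", overlap)
```

Tenant 2 waited 3 hours 37 minutes in the queue and began mapping in the same minute that Tenant 1 finished, so the overlap is zero.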
Resource utilization for Test #9
Memory Utilization
Memory utilization increased up to 75% for mod-entities-links, 62% for mod-search, and 60% for mod-marc-migrations. The increase is attributed to earlier module restarts and the daily cluster shutdown process; no memory leak is suspected in any module.
mod-entities-links, mod-search, mod-marc-migrations
Instance CPU Utilization
Instance CPU utilization was up to 32%. The red line on the graph is the instance i-063e3abc2c04eb056, which hosts one of the mod-marc-migrations containers.
Service CPU Utilization
CPU utilization was up to 453% for mod-entities-links, 20% for mod-search, and 15% for mod-marc-migrations (its CPU allocation is set to 0 in the configuration). Average CPU usage did not exceed 10% for all other modules.
mod-entities-links, mod-search, mod-marc-migrations
RDS CPU Utilization
Approximate DB CPU usage is up to 70%.
RDS Database Connections
Cluster metrics
Appendix
Infrastructure
Records count :
mod_entities_links.authority = 16462866
PTF environment: rcp1
10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
1 database instance - writer
MSK fse-tenant
4 kafka.m7g.xlarge brokers in 2 zones
Apache Kafka version 3.7.x
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Modules memory and CPU parameters
Cluster Resources - rcp1-pvt (Wed Sep 11 11:00:51 UTC 2024)

| Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 1 | mod-remote-storage:3.2.1-SNAPSHOT.171 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
| mod-ncip | 1 | mod-ncip:1.14.6-SNAPSHOT.233 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-finance-storage | 1 | mod-finance-storage:8.7.0-SNAPSHOT.181 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
| mod-agreements | 1 | mod-agreements:7.1.0-SNAPSHOT.237 | 2 | 1592 | 1488 | 0 | 0 | 0 | 0 |
| mod-ebsconet | 1 | mod-ebsconet:2.3.0-SNAPSHOT.80 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 |
| mod-organizations | 1 | mod-organizations:2.0.0-SNAPSHOT.93 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| edge-sip2 | 1 | edge-sip2:3.3.0-SNAPSHOT.264 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-settings | 1 | mod-settings:1.0.4-SNAPSHOT.67 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
| mod-serials-management | 1 | mod-serials-management:1.1.0-SNAPSHOT.46 | 2 | 2480 | 2312 | 0 | 1792 | 384 | 512 |
| edge-dematic | 1 | edge-dematic:2.3.0-SNAPSHOT.143 | 1 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-data-import | 1 | mod-data-import:3.2.0-SNAPSHOT.189 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
| mod-search | 1 | mod-search:3.3.0-SNAPSHOT.261 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 |
| mod-inn-reach | 1 | mod-inn-reach:3.2.1-SNAPSHOT.102 | 2 | 3600 | 3240 | 1024 | 2880 | 512 | 1024 |
| mod-record-specifications | 10 | mod-record-specifications:1.0.0-SNAPSHOT.4 | 2 | 1024 | 896 | 0 | 768 | | |