
...

Test №    status               total_num_of_records   mapped_num_of_records   saved_num_of_records   percentage of Unsaved Records
Test №1   DATA_SAVING_FAILED   12067250               12067250                11983692               0.69%
Test №2   DATA_SAVING_FAILED   12067250               12067250                11959281               0.89%
Test №3   DATA_SAVING_FAILED   12067250               12067250                11921442               1.21%
Test №4   DATA_SAVING_FAILED   12067250               12067250                11811976               2.12%
Test №5   DATA_SAVING_FAILED   12067250               12067250                11815927               2.08%
Test №6   DATA_SAVING_FAILED   12067250               12067250                11672697               3.27%
Test №7   DATA_SAVING_FAILED   12067250               12067250                11421743               5.35%
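
The percentage of unsaved records in the table above can be reproduced from the total and saved counts. The following is a minimal sketch, assuming the percentage is computed as (total - saved) / total; the function and variable names are illustrative and do not come from the test tooling.

```python
# Counts taken from the results table; names are illustrative only.
TOTAL_RECORDS = 12_067_250

SAVED_PER_TEST = {
    1: 11_983_692,
    2: 11_959_281,
    3: 11_921_442,
    4: 11_811_976,
    5: 11_815_927,
    6: 11_672_697,
    7: 11_421_743,
}


def unsaved_percentage(total: int, saved: int) -> float:
    """Share of records that were mapped but not saved, in percent (assumed formula)."""
    return (total - saved) / total * 100


if __name__ == "__main__":
    for test_no, saved in SAVED_PER_TEST.items():
        unsaved = TOTAL_RECORDS - saved
        pct = unsaved_percentage(TOTAL_RECORDS, saved)
        print(f"Test №{test_no}: {unsaved} unsaved records ({pct:.2f}%)")
```

Rounded to two decimals, these values match the last column of the table, which suggests the reported percentages were derived the same way.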

Recommendations and Jiras

  • Increase the default CPU allocation for the mod-entities-links service, or set it to 0.
  • Set CHUNK_FETCH_IDS_COUNT=12000 and RECORDS_CHUNK_SIZE=4000 to decrease migration time; note that mod-entities-links will then use 25% more CPU.
  • Use only 1 container (task) for mod-marc-migrations.
  • While data mapping is running, the data files are stored directly in the running mod-marc-migrations container. Afterwards, all files are deleted from the container and moved to the S3 bucket (if no S3 bucket is provided, data mapping fails).
    If the container fails during the data mapping process, all files are lost and data mapping hangs indefinitely.
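
To get a rough feel for the recommended chunk settings, the sketch below estimates how many batches the 12,067,250 tested records split into. It assumes that RECORDS_CHUNK_SIZE is the maximum number of records per chunk and that CHUNK_FETCH_IDS_COUNT is the number of record IDs fetched per query; both interpretations are assumptions, not something this report confirms, and the helper name is hypothetical.

```python
# Rough batch arithmetic for the recommended settings.
# ASSUMPTION: RECORDS_CHUNK_SIZE = max records per chunk,
#             CHUNK_FETCH_IDS_COUNT = record IDs fetched per query.
TOTAL_RECORDS = 12_067_250
CHUNK_FETCH_IDS_COUNT = 12_000
RECORDS_CHUNK_SIZE = 4_000


def batches_needed(total: int, batch_size: int) -> int:
    """Number of batches required to cover `total` items (ceiling division)."""
    return -(-total // batch_size)


id_fetches = batches_needed(TOTAL_RECORDS, CHUNK_FETCH_IDS_COUNT)
record_chunks = batches_needed(TOTAL_RECORDS, RECORDS_CHUNK_SIZE)

if __name__ == "__main__":
    print(f"ID fetch queries: {id_fetches}")
    print(f"Record chunks:    {record_chunks}")
```

Under these assumptions, larger values mean fewer, heavier batches, which would be consistent with the shorter migration time and the higher CPU usage on mod-entities-links noted above.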

Test Results

This table shows the duration of migrating and saving MARC authority records.

...

Here we can see that the mod-entities-links module had spikes of up to 90% of the instance CPU capacity, while the mod-marc-migrations module used 20%.

...

Here we can see that mod-entities-links had spikes of up to 90% memory usage.


Kafka metrics

OpenSearch Data Nodes metrics

...

PTF - Baseline RCON environment configuration

  • 10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    Name            Memory (GiB)   vCPUs
    db.r6g.xlarge   32             4

  • OpenSearch ptf-test
    • Data nodes
      • Instance type - r6g.2xlarge.search
      • Number of nodes - 4
      • Version: OpenSearch_2_7_R20240502
    • Dedicated master nodes
      • Instance type - r6g.large.search
      • Number of nodes - 3
  • MSK fse-tenant
    • brokers, kafka.m7g.xlarge brokers in 2 zones
    • Apache Kafka version 3.7.x
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

...