This document contains the results of testing the Check-in/Check-out and Data Import workflows for MARC Bibliographic records in the Quesnelia release with a new MSK instance type. The goal is to see how kafka.m7g.2xlarge affects FOLIO performance. Results for the main workflows are compared across two instance types: kafka.m5.2xlarge and kafka.m7g.2xlarge.

Ticket: PERF-921

Summary

  • Comparing the kafka.m5.2xlarge and kafka.m7g.2xlarge instance types
    • The main KPIs for the workflows (Data Import durations and CI/CO response times) do not differ significantly. During the 2-hour CI/CO with Data Import tests, the number of requests was similar for both MSK clusters: 287669 with m5 and 287155 with m7g. The data import update job with 25k records took about 2 minutes longer with the m7g instance type.
    • MSK resource utilization: average CPU utilization per broker was slightly lower with the m7g instance type (by about 1 to 7 percentage points). Memory usage stayed at the same level.
  • Resource utilization
    • Memory utilization did not differ much between the two MSK clusters.
    • Average DB CPU utilization was 85% during create jobs and 87% during update jobs with both MSK instance types. DB CPU utilization was 15% during the Check-In/Check-Out period without DI.
    • Average connection count was about 850 for create and update jobs with CI/CO and about 730 for CI/CO without data import, with both MSK instance types.
    • MSK instance CPU and disk utilization are similar for kafka.m7g.2xlarge and kafka.m5.2xlarge.
    • Service CPU utilization deltas: mod-di-converter-storage-b shows a 20% decrease for the update job and mod-feesfines-b shows a 10% decrease. For the remaining modules, CPU utilization deltas stay under 10%.
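
As a quick check of the claim that request counts were similar, the relative difference between the two totals quoted in the summary can be computed directly. This is a minimal sketch; the helper name `relative_difference` is illustrative and not part of the test tooling.

```python
def relative_difference(baseline: float, candidate: float) -> float:
    """Return (candidate - baseline) / baseline as a percentage."""
    return (candidate - baseline) / baseline * 100.0


# Request totals quoted in the summary for the 2-hour CI/CO + DI tests.
m5_requests = 287669   # kafka.m5.2xlarge
m7g_requests = 287155  # kafka.m7g.2xlarge

diff_requests = m7g_requests - m5_requests
diff_pct = relative_difference(m5_requests, m7g_requests)
print(f"{diff_requests} requests, {diff_pct:.2f}%")  # -514 requests, -0.18%
```

A difference of well under 1% supports treating the two runs as carrying the same load.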

Test Runs 

| Test # | MSK instance type | Scenario | Load level |
|---|---|---|---|
| 1 | kafka.m5.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 2 | kafka.m5.2xlarge | DI MARC Bib Create | 5K, 25K sequentially |
| 3 | kafka.m5.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 4 | kafka.m5.2xlarge | DI MARC Bib Update | 5K, 25K sequentially |
| 5 | kafka.m7g.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 6 | kafka.m7g.2xlarge | DI MARC Bib Create | 5K, 25K sequentially |
| 7 | kafka.m7g.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 8 | kafka.m7g.2xlarge | DI MARC Bib Update | 5K, 25K sequentially |

Test Results

This table shows results of Check-In/Check-out and Data Import create and update jobs.

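MSK instance: kafka.m5.2xlarge

| Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:02:31 | 00:02:54 | 0.899 | 1.409 |
| PTF - Create 2 | 25k | 00:11:49 | 00:12:49 | 0.724 | 1.152 |
| PTF - Updates Success - 6 | 5k | 00:03:06 | 00:03:14 | 0.807 | 1.257 |
| PTF - Updates Success - 6 | 25k | 00:15:00 | 00:15:30 | 0.784 | 1.275 |

MSK instance: kafka.m7g.2xlarge

| Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:03:05 | 00:02:39 | 0.707 | 1.104 |
| PTF - Create 2 | 25k | 00:12:03 | 00:12:08 | 0.718 | 1.129 |
| PTF - Updates Success - 6 | 5k | 00:03:36 | 00:03:34 | 0.742 | 1.124 |
| PTF - Updates Success - 6 | 25k | 00:17:05 | 00:17:33 | 0.756 | 1.148 |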


Check-in/Check-out without DI

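| Scenario | Load level | Request | kafka.m5.2xlarge response time, sec (95 perc) | kafka.m5.2xlarge response time, sec (average) | kafka.m7g.2xlarge response time, sec (95 perc) | kafka.m7g.2xlarge response time, sec (average) |
|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.669 | 0.570 | 0.720 | 0.606 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.152 | 0.960 | 1.241 | 0.969 |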

Comparison

Data Import durations and Check-In/Check-Out response time comparison

  • Data Import durations and CI/CO response times do not differ significantly. The number of requests in the 2-hour CI/CO with Data Import tests was similar for both MSK clusters: 287669 with m5 and 287155 with m7g.

| Job Profile | File size | DELTA, DI without CI/CO | DELTA, DI+CI/CO | DELTA, CI with DI | DELTA, CO with DI |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:00:34 | 00:00:15 | 0.192 | 0.305 |
| PTF - Create 2 | 25k | 00:00:14 | 00:00:41 | 0.006 | 0.023 |
| PTF - Updates Success - 6 | 5k | 00:00:31 | 00:00:20 | 0.065 | 0.133 |
| PTF - Updates Success - 6 | 25k | 00:02:06 | 00:02:03 | 0.028 | 0.127 |
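
The duration deltas in this table appear to be absolute differences between the two Test Results tables. The sketch below shows one way to reproduce them from the HH:MM:SS values; the function names are illustrative assumptions, not part of the test tooling.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an HH:MM:SS duration as it appears in the results tables."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta: timedelta) -> str:
    """Format the absolute value of a timedelta back to HH:MM:SS."""
    total = int(abs(delta).total_seconds())
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def duration_delta(m5: str, m7g: str) -> str:
    """Absolute difference between the m5 and m7g durations."""
    return format_hms(parse_hms(m7g) - parse_hms(m5))


# PTF - Create 2, 5k file: DI without CI/CO, then DI with CI/CO.
print(duration_delta("00:02:31", "00:03:05"))  # 00:00:34
print(duration_delta("00:02:54", "00:02:39"))  # 00:00:15
```

For the update jobs without CI/CO, whole-second inputs give 00:00:30 and 00:02:05 rather than the reported 00:00:31 and 00:02:06; this may be a rounding effect in the source data.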


Check-in/Check-out without DI

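| Scenario | Load level | Request | kafka.m5.2xlarge response time, sec (95 perc) | kafka.m5.2xlarge response time, sec (average) | kafka.m7g.2xlarge response time, sec (95 perc) | kafka.m7g.2xlarge response time, sec (average) | Delta (average) |
|---|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.669 | 0.570 | 0.720 | 0.606 | 0.036 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.152 | 0.960 | 1.241 | 0.969 | 0.009 |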


Response time

MSK instance: kafka.m5.2xlarge


MSK instance: kafka.m7g.2xlarge

Service CPU Utilization

CPU utilization deltas: mod-di-converter-storage-b shows a 20% decrease for the update job and mod-feesfines-b shows a 10% decrease. For most other modules, CPU utilization deltas stay under 10%.

MSK instance: kafka.m7g.2xlarge

Service Memory Utilization

MSK instance: kafka.m5.2xlarge vs MSK instance: kafka.m7g.2xlarge

| Module | Memory (kafka.m5.2xlarge) | Memory (kafka.m7g.2xlarge) | Delta |
|---|---|---|---|
| mod-dcb-b | 68.81 | 74.37 | 5.56 |
| mod-inventory-b | 68.23 | 70.81 | 2.58 |
| mod-users-b | 50.17 | 50.37 | 0.2 |
| mod-di-converter-storage-b | 48.62 | 46.7 | -1.92 |
| mod-feesfines-b | 45.56 | 45.44 | -0.12 |
| mod-inventory-storage-b | 45.32 | 33.4 | -11.92 |
| mod-source-record-storage-b | 44.27 | 55.53 | 11.26 |
| okapi-b | 41.85 | 42.5 | 0.65 |
| mod-data-import-b | 41.42 | 43.55 | 2.13 |
| mod-patron-blocks-b | 41.04 | 42.38 | 1.34 |
| mod-search-b | 40.55 | 45.58 | 5.03 |
| mod-users-bl-b | 39.82 | 45.36 | 5.54 |
| mod-configuration-b | 38.78 | 38.68 | -0.1 |
| mod-source-record-manager-b | 38.45 | 41.91 | 3.46 |
| mod-pubsub-b | 36.86 | 35.94 | -0.92 |
| mod-quick-marc-b | 31.25 | 42.6 | 11.35 |
| mod-patron-b | 31.19 | 30.52 | -0.67 |
| mod-entities-links-b | 27.12 | 34.49 | 7.37 |
| mod-authtoken-b | 26.17 | 27.32 | 1.15 |
| mod-circulation-b | 24.17 | 25.15 | 0.98 |
| edge-patron-b | 22.77 | 22.38 | -0.39 |
| mod-circulation-storage-b | 20.02 | 22.34 | 2.32 |
| nginx-okapi | 4.69 | 4.58 | -0.11 |
| pub-okapi | 4.52 | 4.46 | -0.06 |
| pub-edge | 4.46 | 4.41 | -0.05 |
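
To see which services moved the most in the memory table, the deltas can be recomputed and filtered. This is a minimal sketch over a subset of rows copied from the table; the 10-point threshold and the function name are assumptions chosen for illustration only.

```python
# (module, memory with kafka.m5.2xlarge, memory with kafka.m7g.2xlarge),
# a subset of rows copied from the table above.
memory = [
    ("mod-dcb-b", 68.81, 74.37),
    ("mod-inventory-storage-b", 45.32, 33.40),
    ("mod-source-record-storage-b", 44.27, 55.53),
    ("mod-quick-marc-b", 31.25, 42.60),
    ("mod-entities-links-b", 27.12, 34.49),
    ("mod-circulation-b", 24.17, 25.15),
]

THRESHOLD = 10.0  # hypothetical cut-off, not defined in the report


def large_deltas(rows, threshold):
    """Return (module, delta) pairs whose absolute delta meets the threshold."""
    deltas = [(name, round(m7g - m5, 2)) for name, m5, m7g in rows]
    flagged = [item for item in deltas if abs(item[1]) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


flagged = large_deltas(memory, THRESHOLD)
for name, delta in flagged:
    print(f"{name}: {delta:+.2f}")
```

With these rows, the flagged modules are mod-inventory-storage-b, mod-quick-marc-b and mod-source-record-storage-b; all other deltas in the table are smaller.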

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

DB CPU Utilization

Average DB CPU utilization was 85% during create jobs and 87% during update jobs with both MSK instance types. DB CPU utilization was 15% during the Check-In/Check-Out period without DI.

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

DB Connections

Average connection count was about 850 for create and update jobs with CI/CO and about 730 for CI/CO without data import, with both MSK instance types.

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

MSK instance resource utilization

MSK resources table

MSK resource utilization (CPU)

| Load scenario | Broker | MSK instance: kafka.m5.2xlarge, CPU % | MSK instance: kafka.m7g.2xlarge, CPU % | Delta (percentage points) | Delta, % |
|---|---|---|---|---|---|
| CICO | 1 | 13.7625025 | 10.6770835 | -3.08542 | -22.42% |
| CICO | 2 | 11.94791625 | 9.87916575 | -2.06875 | -17.31% |
| CICO+DI | 1 | 38.09166625 | 31.13749875 | -6.95417 | -18.26% |
| CICO+DI | 2 | 33.82291125 | 32.53334625 | -1.28957 | -3.81% |
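
The relative deltas for MSK CPU utilization can be reproduced from the per-broker averages. The sketch below assumes the delta is computed as (m7g - m5) / m5; the dictionary layout and function name are illustrative.

```python
# (scenario, broker) -> (average CPU % on kafka.m5.2xlarge, on kafka.m7g.2xlarge)
cpu_avg = {
    ("CICO", 1): (13.7625025, 10.6770835),
    ("CICO", 2): (11.94791625, 9.87916575),
    ("CICO+DI", 1): (38.09166625, 31.13749875),
    ("CICO+DI", 2): (33.82291125, 32.53334625),
}


def cpu_delta(m5: float, m7g: float) -> tuple[float, float]:
    """Return (difference in percentage points, relative difference in %)."""
    points = m7g - m5
    return points, points / m5 * 100.0


results = {key: cpu_delta(*values) for key, values in cpu_avg.items()}
for (scenario, broker), (points, relative) in results.items():
    print(f"{scenario} broker {broker}: {points:.2f} points, {relative:.2f}%")
```

Keeping both forms side by side avoids confusing a drop of about 3 percentage points with a drop of about 22% relative to the m5 baseline.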

MSK disk utilization was 4.6% with kafka.m5.2xlarge and 4.3% with kafka.m7g.2xlarge; the difference is negligible.

Disk usage by broker

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

CPU (User) usage by broker

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

DB load

MSK instance: kafka.m5.2xlarge

Top SQL-queries:

MSK instance: kafka.m7g.2xlarge


Top SQL-queries:

insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)

INSERT INTO fs09000000_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)

INSERT INTO fs09000000_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)

Appendix

Infrastructure

PTF environment qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    | Name | Memory GiB | vCPUs | max_connections |
    |---|---|---|---|
    | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |

  • MSK ptf-mobius-testing2
    • 2 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2
  • MSK perf-921-g2
    • m7g.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.2.tiered

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2

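All qcp1 modules

qcp1-pvt, Tue Jun 04 07:31:53 UTC 2024

| Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 4 | mod-remote-storage:3.2.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
| mod-ncip | 4 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance-storage | 4 | mod-finance-storage:8.6.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
| mod-agreements | 4 | mod-agreements:7.0.0 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 |
| mod-ebsconet | 4 | mod-ebsconet:2.2.0 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 |
| mod-organizations | 4 | mod-organizations:1.9.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-consortia | 2 | mod-consortia:1.1.0 | 2 | 3072 | 2048 | 128 | 2048 | 512 | 1024 |
| edge-sip2 | 2 | edge-sip2:3.2.0-SNAPSHOT.209 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-serials-management | 4 | mod-serials-management:1.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-settings | 4 | mod-settings:1.0.3 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
| mod-data-import | 7 | mod-data-import:3.1.0 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
| edge-dematic | 4 | edge-dematic:2.2.0 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-search | 4 | mod-search:3.2.0 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 |
| mod-inn-reach | 2 | mod-inn-reach:3.2.0-SNAPSHOT.86 | 2 | 3600 | 3240 | 1024 | 2880 | 512 | 1024 |
| mod-tags | 4 | mod-tags:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-courses | 4 | edge-courses:1.4.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-authtoken | 5 | mod-authtoken:2.15.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-inventory-update | 4 | mod-inventory-update:3.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notify | 4 | mod-notify:3.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-configuration | 4 | mod-configuration:5.10.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-orders-storage | 4 | mod-orders-storage:13.7.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 |
| edge-caiasoft | 4 | edge-caiasoft:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-login-saml | 4 | mod-login-saml:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-erm-usage-harvester | 4 | mod-erm-usage-harvester:4.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-licenses | 4 | mod-licenses:6.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-gobi | 4 | mod-gobi:2.8.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-password-validator | 4 | mod-password-validator:3.2.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 |
| mod-bulk-operations | 4 | mod-bulk-operations:2.0.0 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 |
| mod-fqm-manager | 4 | mod-fqm-manager:2.0.1 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 |
| edge-dcb | 4 | edge-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-graphql | 5 | mod-graphql:1.12.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance | 4 | mod-finance:4.9.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-erm-usage | 4 | mod-erm-usage:4.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-batch-print | 5 | mod-batch-print:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-copycat | 4 | mod-copycat:1.6.0 | 2 | 1024 | 512 | 128 | 768 | 88 | 128 |
| mod-lists | 4 | mod-lists:2.0.0 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 |
| mod-entities-links | 5 | mod-entities-links:3.0.0 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 |
| mod-permissions | 8 | mod-permissions:6.5.0 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 |
| pub-edge | 3 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-orders | 4 | mod-orders:12.8.0 | 2 | 2048 | 1440 | 1024 | 1024 | 384 | 512 |
| edge-patron | 4 | edge-patron:5.1.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 |
| edge-ncip | 4 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-inn-reach | 2 | edge-inn-reach:3.1.1-SNAPSHOT.45 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users-bl | 4 | mod-users-bl:7.7.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-oa | 2 | mod-oa:2.1.0-SNAPSHOT.62 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-inventory-storage | 4 | mod-inventory-storage:27.1.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| mod-invoice | 5 | mod-invoice:5.8.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-user-import | 4 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-sender | 5 | mod-sender:1.12.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-oai-pmh | 4 | edge-oai-pmh:2.9.0 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 |
| mod-data-export-worker | 4 | mod-data-export-worker:3.2.1 | 2 | 3072 | 2048 | 1024 | 2048 | 384 | 512 |
| mod-rtac | 4 | mod-rtac:3.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-storage | 4 | mod-circulation-storage:17.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-calendar | 4 | mod-calendar:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-storage | 4 | mod-source-record-storage:5.8.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| mod-event-config | 4 | mod-event-config:2.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-courses | 4 | mod-courses:1.4.10 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-item | 4 | mod-circulation-item:1.0.0 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-inventory | 4 | mod-inventory:20.2.0 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 |
| mod-email | 4 | mod-email:1.17.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-pubsub | 4 | mod-pubsub:2.13.0 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 |
| mod-circulation | 4 | mod-circulation:24.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-di-converter-storage | 4 | mod-di-converter-storage:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-rtac | 4 | edge-rtac:2.7.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-orders | 4 | edge-orders:3.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users | 5 | mod-users:19.3.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-template-engine | 4 | mod-template-engine:1.20.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-patron-blocks | 4 | mod-patron-blocks:1.10.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 |
| mod-audit | 4 | mod-audit:2.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-fqm | 4 | edge-fqm:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-manager | 5 | mod-source-record-manager:3.9.0-SNAPSHOT.330 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| nginx-edge | 3 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-quick-marc | 4 | mod-quick-marc:5.1.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 |
| nginx-okapi | 3 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| okapi-b | 4 | okapi:5.3.0 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
| mod-feesfines | 4 | mod-feesfines:19.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-invoice-storage | 4 | mod-invoice-storage:5.8.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 |
| mod-dcb | 5 | mod-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-service-interaction | 4 | mod-service-interaction:4.0.1 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 |
| mod-data-export | 13 | mod-data-export:5.0.4 | 1 | 2048 | 1844 | 2048 | 0 | 0 | 0 |
| mod-patron | 4 | mod-patron:6.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-oai-pmh | 4 | mod-oai-pmh:3.13.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| edge-connexion | 4 | edge-connexion:1.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-kb-ebsco-java | 4 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notes | 4 | mod-notes:5.2.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 |
| mod-data-export-spring | 4 | mod-data-export-spring:3.2.0 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 |
| mod-organizations-storage | 4 | mod-organizations-storage:4.7.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-login | 4 | mod-login:7.11.0 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 |
| pub-okapi | 3 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-eusage-reports | 4 | mod-eusage-reports:2.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
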

Methodology/Approach

  • Populate the ptf-mobius-testing2 cluster with topics from the tenant cluster
  • Run CICO for 2 hours
  • 10 minutes after the start of CICO, run DI Create - Export - Update for 5k and 25k
  • Run the Data Imports alone
  • Create a new Kafka cluster
  • Populate the new cluster with topics from the tenant cluster
  • Run CICO for 2 hours
  • 10 minutes after the start of CICO, run DI Create - Export - Update for 5k and 25k
  • Run the Data Imports alone
  • Compare MSK resource utilization and the main KPIs for CICO and DI

Additional/Files

Topics: ptf-kafka-tenantCluster-topics_2replicationfactor.csv

Comparison data: PERF-921_MSK_Instance_Comparison.xlsx