
Data Import with Check-ins Check-outs (Quesnelia)[non-ECS] MSK instance type comparison


Overview IN PROGRESS

This document presents the results of testing the Check-in/Check-out (CI/CO) and Data Import (DI) workflows for MARC Bibliographic records in the Quesnelia release with a new MSK instance type. The goal is to see how the m7g series affects FOLIO performance. Results for the main workflows are compared between two instance types: kafka.m5.2xlarge and kafka.m7g.2xlarge.

Ticket: PERF-921

Summary

  • Comparing kafka.m5.2xlarge against kafka.m7g.2xlarge instance type
    • Data Import durations and CI/CO response times do not differ significantly. The number of requests during the 2-hour CI/CO with Data Import test was nearly the same: 287669 on the m5 and 287155 on the m7g MSK instance type.
  • Resource utilization
    • Memory utilization did not differ much between the two MSK clusters.
    • Disk usage is lower with the m7g instance type in both scenarios (idle and CICO+DI). CPU utilization is about the same in the idle state; with the m7g instance type it is about 4% to 18% lower under CICO+DI load and 17% to 22% lower under CICO alone.
    • The CPU utilization delta shows a decrease of about 20 points for mod-di-converter-storage-b on the update job (121.08 on m5, 100.43 on m7g). For most other modules the difference between the two clusters stays under 10 points.
    • Average DB CPU usage for both MSK clusters during data import is 85% for create jobs and 87% for update jobs. During the Check-In/Check-Out period without DI it is 15%.
    • Average connection count for both MSK clusters is about 850 during create and update jobs with CI/CO, and about 730 for CI/CO without data import.
    • MSK instance CPU and disk utilization on kafka.m7g.2xlarge stay at the same level as on kafka.m5.2xlarge or decrease.

Test Runs 

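| Test # | MSK instance type | Scenario | Load level |
| --- | --- | --- | --- |
| 1 | kafka.m5.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 2 | kafka.m5.2xlarge | DI MARC Bib Create | 5K, 25K sequentially |
| 3 | kafka.m5.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 4 | kafka.m5.2xlarge | DI MARC Bib Update | 5K, 25K sequentially |
| 5 | kafka.m7g.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 6 | kafka.m7g.2xlarge | DI MARC Bib Create | 5K, 25K sequentially |
| 7 | kafka.m7g.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 8 | kafka.m7g.2xlarge | DI MARC Bib Update | 5K, 25K sequentially |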

Test Results

This table shows results of Check-In/Check-out and Data Import create and update jobs.

The only difference between the tests is the MSK cluster instance type: cluster ptf-mobius-testing2 uses kafka.m5.2xlarge and cluster PERF-921 uses kafka.m7g.2xlarge.

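| MSK instance | Job | File size | DI duration without CI/CO | DI duration with CI/CO | CI average, sec | CO average, sec |
| --- | --- | --- | --- | --- | --- | --- |
| kafka.m5.2xlarge | Create | 5k | 00:02:31 | 00:02:54 | 0.899 | 1.409 |
| kafka.m5.2xlarge | Create | 25k | 00:11:49 | 00:12:49 | 0.724 | 1.152 |
| kafka.m5.2xlarge | Update | 5k | 00:03:06 | 00:03:14 | 0.807 | 1.257 |
| kafka.m5.2xlarge | Update | 25k | 00:15:00 | 00:15:30 | 0.784 | 1.275 |
| kafka.m7g.2xlarge | Create | 5k | 00:03:05 | 00:02:39 | 0.707 | 1.104 |
| kafka.m7g.2xlarge | Create | 25k | 00:12:03 | 00:12:08 | 0.718 | 1.129 |
| kafka.m7g.2xlarge | Update | 5k | 00:03:36 | 00:03:34 | 0.742 | 1.124 |
| kafka.m7g.2xlarge | Update | 25k | 00:17:05 | 00:17:33 | 0.756 | 1.148 |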

Check-in/Check-out without DI

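Response time, sec

| Scenario | Load level | Request | kafka.m5.2xlarge, 95 perc | kafka.m5.2xlarge, average | kafka.m7g.2xlarge, 95 perc | kafka.m7g.2xlarge, average |
| --- | --- | --- | --- | --- | --- | --- |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.669 | 0.570 | 0.720 | 0.606 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.152 | 0.960 | 1.241 | 0.969 |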

Comparison

Data Import durations and Check-In/Check-Out response time comparison

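Data Import durations and CI/CO response times do not differ significantly. The largest duration difference is about 2 minutes, for the 25k update job. The number of requests during the 2-hour CI/CO with Data Import test was nearly the same: 287669 on the m5 and 287155 on the m7g MSK instance type. The deltas in the table are absolute differences between the two clusters.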

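| Job Profile | File size | DELTA, DI | DELTA, DI+CICO | DELTA, CI | DELTA, CO |
| --- | --- | --- | --- | --- | --- |
| PTF - Create 2 | 5k | 00:00:34 | 00:00:15 | 0.192 | 0.305 |
| PTF - Create 2 | 25k | 00:00:14 | 00:00:41 | 0.006 | 0.023 |
| PTF - Updates Success - 6 | 5k | 00:00:31 | 00:00:20 | 0.065 | 0.133 |
| PTF - Updates Success - 6 | 25k | 00:02:06 | 00:02:03 | 0.028 | 0.127 |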
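
The deltas above can be recomputed from the Test Results values. The following is a minimal sketch, not the script used for this report; the helper names `parse_hms`, `abs_delta`, `format_hms` and `relative_diff_percent` are hypothetical, and it assumes durations are always given as HH:MM:SS.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse a duration written as HH:MM:SS."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def abs_delta(m5: str, m7g: str) -> timedelta:
    """Absolute difference between two HH:MM:SS durations."""
    return abs(parse_hms(m7g) - parse_hms(m5))


def format_hms(delta: timedelta) -> str:
    """Format a non-negative timedelta back to HH:MM:SS."""
    total = int(delta.total_seconds())
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def relative_diff_percent(baseline: float, candidate: float) -> float:
    """Relative difference of candidate against baseline, in percent."""
    return (candidate - baseline) / baseline * 100


# DI Create 5k without CI/CO: m5 00:02:31, m7g 00:03:05
print(format_hms(abs_delta("00:02:31", "00:03:05")))

# Requests in the 2-hour CI/CO + DI test: m5 287669, m7g 287155
print(round(relative_diff_percent(287669, 287155), 2))
```

With the values from this report, the request counts differ by less than 0.2%, which supports treating them as the same load.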

Kafka resource utilization comparison table

Disk usage is lower with the m7g instance type in both scenarios (idle and CICO+DI). CPU utilization is about the same in the idle state; with the m7g instance type it is about 4% to 18% lower under CICO+DI load and 17% to 22% lower under CICO alone.

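| Utilized resources | Scenario/Load | Broker | MSK instance: kafka.m5.2xlarge | MSK instance: kafka.m7g.2xlarge | Delta |
| --- | --- | --- | --- | --- | --- |
| Disk usage | Idle state | 1 | 0.474 | 0.356 | -0.118 |
| Disk usage | Idle state | 2 | 0.476 | 0.357 | -0.119 |
| Disk usage | CICO+DI (update 25k) | 1 | 4.6120204 | 4.35 | -0.26202 |
| Disk usage | CICO+DI (update 25k) | 2 | 4.611104 | 4.35 | -0.2611 |
| CPU usage | Idle state | 1 | 6.431249875 | 6.63958275 | 0.208333 |
| CPU usage | Idle state | 2 | 6.1516666 | 6.227083625 | 0.075417 |
| CPU usage | CICO | 1 | 13.7625025 | 10.6770835 | -3.08542 |
| CPU usage | CICO | 2 | 11.94791625 | 9.87916575 | -2.06875 |
| CPU usage | CICO+DI (update 25k) | 1 | 38.09166625 | 31.13749875 | -6.95417 |
| CPU usage | CICO+DI (update 25k) | 2 | 33.82291125 | 32.53334625 | -1.28957 |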

This table shows the comparison of CICO results without Data Import on the two MSK clusters.

Response time, sec

| Scenario | Load level | Request | kafka.m5.2xlarge, 95 perc | kafka.m5.2xlarge, average | kafka.m7g.2xlarge, 95 perc | kafka.m7g.2xlarge, average | Delta, average |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.669 | 0.57 | 0.72 | 0.606 | 0.036 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.152 | 0.96 | 1.241 | 0.969 | 0.009 |

Response time

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

Service CPU Utilization

The CPU utilization delta shows a decrease of about 20 points for mod-di-converter-storage-b on the update job (121.08 on m5, 100.43 on m7g). For most other modules the difference between the two clusters stays under 10 points.

 MSK instance: kafka.m5.2xlarge vs MSK instance: kafka.m7g.2xlarge
Values are matched by module name; deltas are kafka.m7g.2xlarge minus kafka.m5.2xlarge.

| Module | m5: CPU (CICO + 25k Create) | m5: CPU (CICO + 25k Update) | m7g: CPU (CICO + 25k Create) | m7g: CPU (CICO + 25k Update) | Delta, Create | Delta, Update |
| --- | --- | --- | --- | --- | --- | --- |
| mod-inventory-b | 110.54 | 145.5 | 115.21 | 136.94 | 4.67 | -8.56 |
| mod-quick-marc-b | 90.64 | 102.38 | 95.15 | 96.4 | 4.51 | -5.98 |
| mod-di-converter-storage-b | 78.09 | 121.08 | 81.26 | 100.43 | 3.17 | -20.65 |
| nginx-okapi | 64.1 | 98.19 | 70.58 | 88.94 | 6.48 | -9.25 |
| okapi-b | 39.14 | 58.73 | 38.89 | 50.55 | -0.25 | -8.18 |
| mod-source-record-storage-b | 28.06 | 44.84 | 31.61 | 39.13 | 3.55 | -5.71 |
| mod-users-b | 23.41 | 20.28 | 23.6 | 22.12 | 0.19 | 1.84 |
| mod-inventory-storage-b | 20.17 | 24.74 | 21.37 | 19.9 | 1.2 | -4.84 |
| mod-source-record-manager-b | 18.9 | 19.54 | 17.39 | 18.27 | -1.51 | -1.27 |
| mod-feesfines-b | 17.74 | 8.11 | 18.28 | 9.11 | 0.54 | 1 |
| mod-configuration-b | 14.55 | 10.3 | 17.6 | 10.52 | 3.05 | 0.22 |
| mod-dcb-b | 12.3 | 11.91 | 13.23 | 12.33 | 0.93 | 0.42 |
| mod-authtoken-b | 7.67 | 11.87 | 17.04 | 13.37 | 9.37 | 1.5 |
| mod-search-b | 7.32 | 6 | 7.95 | 1.83 | 0.63 | -4.17 |
| mod-pubsub-b | 6.35 | 6.8 | 6.82 | 6.49 | 0.47 | -0.31 |
| mod-entities-links-b | 3.58 | 2.26 | 2.24 | 2.23 | -1.34 | -0.03 |
| pub-okapi | 3.42 | 3.4 | 3.56 | 3.64 | 0.14 | 0.24 |
| mod-patron-b | 2.84 | 2.77 | 2.72 | 2.79 | -0.12 | 0.02 |
| mod-circulation-storage-b | 2.83 | 2.91 | 3.35 | 2.7 | 0.52 | -0.21 |
| mod-data-import-b | 2.04 | 1.65 | 1.76 | 1.88 | -0.28 | 0.23 |
| mod-circulation-b | 1.92 | 1.6 | 1.98 | 1.8 | 0.06 | 0.2 |
| edge-patron-b | 1.15 | 1.16 | 1.13 | 1.16 | -0.02 | 0 |
| mod-patron-blocks-b | 0.99 | 0.81 | 0.97 | 1 | -0.02 | 0.19 |
| mod-users-bl-b | 0.85 | 2.51 | 0.68 | 0.68 | -0.17 | -1.83 |
| pub-edge | 0.07 | 0.07 | 0.06 | 0.06 | -0.01 | -0.01 |
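
When the per-module values of the two runs come from separately sorted lists, subtracting row by row can pair different modules. The following is a minimal sketch of computing the deltas keyed by module name instead, using three modules from the table above; the function name `deltas_by_module` and the tuple layout (Create, Update) are assumptions made for illustration.

```python
# CPU values as (CICO + 25k Create, CICO + 25k Update), taken from the table above.
m5 = {
    "mod-di-converter-storage-b": (78.09, 121.08),
    "mod-source-record-manager-b": (18.9, 19.54),
    "mod-feesfines-b": (17.74, 8.11),
}
# Deliberately listed in a different order than m5.
m7g = {
    "mod-feesfines-b": (18.28, 9.11),
    "mod-di-converter-storage-b": (81.26, 100.43),
    "mod-source-record-manager-b": (17.39, 18.27),
}


def deltas_by_module(baseline: dict, candidate: dict) -> dict:
    """Return {module: (delta_create, delta_update)} for modules present in both runs."""
    result = {}
    for module in sorted(baseline.keys() & candidate.keys()):
        base_create, base_update = baseline[module]
        cand_create, cand_update = candidate[module]
        result[module] = (
            round(cand_create - base_create, 2),
            round(cand_update - base_update, 2),
        )
    return result


for module, (d_create, d_update) in deltas_by_module(m5, m7g).items():
    print(f"{module}: create {d_create:+.2f}, update {d_update:+.2f}")
```

Keying on the module name makes the result independent of how each list was sorted.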

DI MARC BIB Create and Update + CICO

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

Service Memory Utilization

 MSK instance: kafka.m5.2xlarge vs MSK instance: kafka.m7g.2xlarge
| Module | Memory (kafka.m5.2xlarge) | Memory (kafka.m7g.2xlarge) | Delta |
| --- | --- | --- | --- |
| mod-dcb-b | 68.81 | 74.37 | 5.56 |
| mod-inventory-b | 68.23 | 70.81 | 2.58 |
| mod-users-b | 50.17 | 50.37 | 0.2 |
| mod-di-converter-storage-b | 48.62 | 46.7 | -1.92 |
| mod-feesfines-b | 45.56 | 45.44 | -0.12 |
| mod-inventory-storage-b | 45.32 | 33.4 | -11.92 |
| mod-source-record-storage-b | 44.27 | 55.53 | 11.26 |
| okapi-b | 41.85 | 42.5 | 0.65 |
| mod-data-import-b | 41.42 | 43.55 | 2.13 |
| mod-patron-blocks-b | 41.04 | 42.38 | 1.34 |
| mod-search-b | 40.55 | 45.58 | 5.03 |
| mod-users-bl-b | 39.82 | 45.36 | 5.54 |
| mod-configuration-b | 38.78 | 38.68 | -0.1 |
| mod-source-record-manager-b | 38.45 | 41.91 | 3.46 |
| mod-pubsub-b | 36.86 | 35.94 | -0.92 |
| mod-quick-marc-b | 31.25 | 42.6 | 11.35 |
| mod-patron-b | 31.19 | 30.52 | -0.67 |
| mod-entities-links-b | 27.12 | 34.49 | 7.37 |
| mod-authtoken-b | 26.17 | 27.32 | 1.15 |
| mod-circulation-b | 24.17 | 25.15 | 0.98 |
| edge-patron-b | 22.77 | 22.38 | -0.39 |
| mod-circulation-storage-b | 20.02 | 22.34 | 2.32 |
| nginx-okapi | 4.69 | 4.58 | -0.11 |
| pub-okapi | 4.52 | 4.46 | -0.06 |
| pub-edge | 4.46 | 4.41 | -0.05 |

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

DB CPU Utilization

Average DB CPU usage for both MSK clusters during data import is 85% for create jobs and 87% for update jobs. During the Check-In/Check-Out period without DI it is 15%.

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge


DB Connections

Average connection count for both MSK clusters is about 850 during create and update jobs with CI/CO, and about 730 for CI/CO without data import.
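
To put these counts in context, they can be related to the max_connections value of the database instance listed in the Infrastructure section (2731). This is a minimal sketch; the function name `connection_utilization_percent` is hypothetical and the averages are the approximate figures quoted above.

```python
# max_connections of the db.r6g.xlarge instance, from the Infrastructure section.
MAX_CONNECTIONS = 2731


def connection_utilization_percent(active: int, limit: int = MAX_CONNECTIONS) -> float:
    """Share of the connection limit in use, in percent."""
    return active / limit * 100


# Approximate averages reported for both MSK clusters.
for label, count in (("CI/CO + DI", 850), ("CI/CO only", 730)):
    print(f"{label}: {connection_utilization_percent(count):.1f}% of max_connections")
```

Under these figures, both scenarios use roughly a quarter to a third of the available connections.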

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

MSK instance resource utilization

 MSK resources table

MSK instance CPU and disk utilization on kafka.m7g.2xlarge stay at the same level as on kafka.m5.2xlarge or decrease.



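| Resource | Scenario | Broker | MSK instance: kafka.m5.2xlarge | MSK instance: kafka.m7g.2xlarge | Delta | Delta, % |
| --- | --- | --- | --- | --- | --- | --- |
| Disk usage | Idle state | 1 | 0.474 | 0.356 | -0.118 | |
| Disk usage | Idle state | 2 | 0.476 | 0.357 | -0.119 | |
| Disk usage | Under load | 1 | 4.6120204 | 4.35 | -0.26202 | |
| Disk usage | Under load | 2 | 4.611104 | 4.35 | -0.2611 | |
| CPU usage | Idle state | 1 | 6.431249875 | 6.63958275 | 0.208333 | |
| CPU usage | Idle state | 2 | 6.1516666 | 6.227083625 | 0.075417 | |
| CPU usage | Under load, CICO | 1 | 13.7625025 | 10.6770835 | -3.08542 | -22.42% |
| CPU usage | Under load, CICO | 2 | 11.94791625 | 9.87916575 | -2.06875 | -17.31% |
| CPU usage | Under load, CICO+DI | 1 | 38.09166625 | 31.13749875 | -6.95417 | -18.26% |
| CPU usage | Under load, CICO+DI | 2 | 33.82291125 | 32.53334625 | -1.28957 | -3.81% |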
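
The percentage column is the delta relative to the kafka.m5.2xlarge value. The following is a minimal sketch that reproduces it for the CPU rows under load; the function name `delta_percent` and the dictionary layout are assumptions made for illustration.

```python
def delta_percent(m5: float, m7g: float) -> float:
    """Change of the m7g value relative to the m5 value, in percent."""
    return (m7g - m5) / m5 * 100


# (scenario, broker) -> (m5 CPU usage, m7g CPU usage), from the table above.
cpu_under_load = {
    ("CICO", 1): (13.7625025, 10.6770835),
    ("CICO", 2): (11.94791625, 9.87916575),
    ("CICO+DI", 1): (38.09166625, 31.13749875),
    ("CICO+DI", 2): (33.82291125, 32.53334625),
}

for (scenario, broker), (m5, m7g) in cpu_under_load.items():
    print(f"{scenario} broker {broker}: delta {m7g - m5:.5f}, {delta_percent(m5, m7g):.2f}%")
```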

Disk usage by broker

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

CPU (User) usage by broker

MSK instance: kafka.m5.2xlarge

MSK instance: kafka.m7g.2xlarge

DB load

MSK instance: kafka.m5.2xlarge

Top SQL-queries:

MSK instance: kafka.m7g.2xlarge


Top SQL-queries:

insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)

INSERT INTO fs09000000_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)

INSERT INTO fs09000000_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)

Appendix

Infrastructure

PTF environment: qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    | Name | Memory | vCPUs | max_connections |
    | --- | --- | --- | --- |
    | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
  • MSK ptf-mobius-testing2
    • 2 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2
  • MSK perf-921-g2
    • 2 m7g.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2
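
As a sanity check that the two clusters differ only in broker instance type, their settings can be compared as plain dictionaries. This is a minimal sketch, not part of the test tooling: the keys `instance_type`, `brokers`, `zones`, `kafka_version` and `ebs_gib_per_broker` are hypothetical labels, the function name `diff_settings` is an assumption, and the instance type of the second cluster is taken from the Test Results section.

```python
# Settings of MSK ptf-mobius-testing2 as listed above.
ptf_mobius_testing2 = {
    "instance_type": "kafka.m5.2xlarge",
    "brokers": 2,
    "zones": 2,
    "kafka_version": "2.8.0",
    "ebs_gib_per_broker": 300,
    "auto.create.topics.enable": "true",
    "log.retention.minutes": "480",
    "default.replication.factor": "2",
}

# Settings of the second cluster; instance type as stated in Test Results.
perf_921 = {
    "instance_type": "kafka.m7g.2xlarge",
    "brokers": 2,
    "zones": 2,
    "kafka_version": "2.8.0",
    "ebs_gib_per_broker": 300,
    "auto.create.topics.enable": "true",
    "log.retention.minutes": "480",
    "default.replication.factor": "2",
}


def diff_settings(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


print(diff_settings(ptf_mobius_testing2, perf_921))
```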

Task count for modules mod-oa-b and mod-graphql was set to 0 before the tests started.

Modules

 All qcp1 modules
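qcp1-pvt, Tue Jun 04 07:31:53 UTC 2024

| Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mod-remote-storage | 4 | mod-remote-storage:3.2.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
| mod-ncip | 4 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance-storage | 4 | mod-finance-storage:8.6.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
| mod-agreements | 4 | mod-agreements:7.0.0 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 |
| mod-ebsconet | 4 | mod-ebsconet:2.2.0 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 |
| mod-organizations | 4 | mod-organizations:1.9.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-consortia | 2 | mod-consortia:1.1.0 | 2 | 3072 | 2048 | 128 | 2048 | 512 | 1024 |
| edge-sip2 | 2 | edge-sip2:3.2.0-SNAPSHOT.209 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-serials-management | 4 | mod-serials-management:1.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-settings | 4 | mod-settings:1.0.3 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
| mod-data-import | 7 | mod-data-import:3.1.0 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
| edge-dematic | 4 | edge-dematic:2.2.0 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-search | 4 | mod-search:3.2.0 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 |
| mod-inn-reach | 2 | mod-inn-reach:3.2.0-SNAPSHOT.86 | 2 | 3600 | 3240 | 1024 | 2880 | 512 | 1024 |
| mod-tags | 4 | mod-tags:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-courses | 4 | edge-courses:1.4.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-authtoken | 5 | mod-authtoken:2.15.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-inventory-update | 4 | mod-inventory-update:3.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notify | 4 | mod-notify:3.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-configuration | 4 | mod-configuration:5.10.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-orders-storage | 4 | mod-orders-storage:13.7.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 |
| edge-caiasoft | 4 | edge-caiasoft:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-login-saml | 4 | mod-login-saml:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-erm-usage-harvester | 4 | mod-erm-usage-harvester:4.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-licenses | 4 | mod-licenses:6.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-gobi | 4 | mod-gobi:2.8.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-password-validator | 4 | mod-password-validator:3.2.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 |
| mod-bulk-operations | 4 | mod-bulk-operations:2.0.0 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 |
| mod-fqm-manager | 4 | mod-fqm-manager:2.0.1 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 |
| edge-dcb | 4 | edge-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-graphql | 5 | mod-graphql:1.12.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance | 4 | mod-finance:4.9.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-erm-usage | 4 | mod-erm-usage:4.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-batch-print | 5 | mod-batch-print:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-copycat | 4 | mod-copycat:1.6.0 | 2 | 1024 | 512 | 128 | 768 | 88 | 128 |
| mod-lists | 4 | mod-lists:2.0.0 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 |
| mod-entities-links | 5 | mod-entities-links:3.0.0 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 |
| mod-permissions | 8 | mod-permissions:6.5.0 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 |
| pub-edge | 3 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-orders | 4 | mod-orders:12.8.0 | 2 | 2048 | 1440 | 1024 | 1024 | 384 | 512 |
| edge-patron | 4 | edge-patron:5.1.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 |
| edge-ncip | 4 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-inn-reach | 2 | edge-inn-reach:3.1.1-SNAPSHOT.45 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users-bl | 4 | mod-users-bl:7.7.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-oa | 2 | mod-oa:2.1.0-SNAPSHOT.62 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-inventory-storage | 4 | mod-inventory-storage:27.1.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| mod-invoice | 5 | mod-invoice:5.8.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-user-import | 4 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-sender | 5 | mod-sender:1.12.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-oai-pmh | 4 | edge-oai-pmh:2.9.0 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 |
| mod-data-export-worker | 4 | mod-data-export-worker:3.2.1 | 2 | 3072 | 2048 | 1024 | 2048 | 384 | 512 |
| mod-rtac | 4 | mod-rtac:3.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-storage | 4 | mod-circulation-storage:17.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-calendar | 4 | mod-calendar:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-storage | 4 | mod-source-record-storage:5.8.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| mod-event-config | 4 | mod-event-config:2.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-courses | 4 | mod-courses:1.4.10 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-item | 4 | mod-circulation-item:1.0.0 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-inventory | 4 | mod-inventory:20.2.0 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 |
| mod-email | 4 | mod-email:1.17.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-pubsub | 4 | mod-pubsub:2.13.0 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 |
| mod-circulation | 4 | mod-circulation:24.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-di-converter-storage | 4 | mod-di-converter-storage:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-rtac | 4 | edge-rtac:2.7.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-orders | 4 | edge-orders:3.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users | 5 | mod-users:19.3.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-template-engine | 4 | mod-template-engine:1.20.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-patron-blocks | 4 | mod-patron-blocks:1.10.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 |
| mod-audit | 4 | mod-audit:2.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-fqm | 4 | edge-fqm:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-manager | 5 | mod-source-record-manager:3.9.0-SNAPSHOT.330 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| nginx-edge | 3 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-quick-marc | 4 | mod-quick-marc:5.1.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 |
| nginx-okapi | 3 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| okapi-b | 4 | okapi:5.3.0 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
| mod-feesfines | 4 | mod-feesfines:19.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-invoice-storage | 4 | mod-invoice-storage:5.8.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 |
| mod-dcb | 5 | mod-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-service-interaction | 4 | mod-service-interaction:4.0.1 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 |
| mod-data-export | 13 | mod-data-export:5.0.4 | 1 | 2048 | 1844 | 2048 | 0 | 0 | 0 |
| mod-patron | 4 | mod-patron:6.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-oai-pmh | 4 | mod-oai-pmh:3.13.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| edge-connexion | 4 | edge-connexion:1.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-kb-ebsco-java | 4 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notes | 4 | mod-notes:5.2.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 |
| mod-data-export-spring | 4 | mod-data-export-spring:3.2.0 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 |
| mod-organizations-storage | 4 | mod-organizations-storage:4.7.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-login | 4 | mod-login:7.11.0 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 |
| pub-okapi | 3 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-eusage-reports | 4 | mod-eusage-reports:2.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |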

Methodology/Approach

  • Populate the ptf-mobius-testing2 cluster with topics from the tenant cluster
  • Run CICO for 2 hours
  • 10 minutes after the start of CICO, run DI Create - Export - Update for 5k and 25k files
  • Run the Data Imports alone (without CICO)
  • Create a new Kafka cluster
  • Populate the new cluster with topics from the tenant cluster
  • Run CICO for 2 hours
  • 10 minutes after the start of CICO, run DI Create - Export - Update for 5k and 25k files
  • Run the Data Imports alone (without CICO)
  • Compare MSK resource utilization and the main KPIs for CICO and DI