Data Import Create MARC Holdings Records [ECS] [Quesnelia]

Overview (status: IN PROGRESS)

This document presents the results of testing Data Import Create MARC holdings records and identifies performance trends in Quesnelia, in scope of ticket PERF-855.

Results are compared with the previous test report: Data Import Create MARC holdings records [Poppy]

Summary

  • Data import create holdings job durations increased significantly in the Quesnelia release. With the 10k file the job ran about 4 times longer than in Poppy (22 min 40 sec vs. 4 min 35 sec, a 395% increase). The increase for the 80k file could not be determined because the test was stopped after 4 hours of running with only 46 committed jobs (out of 81 in total for the test).
  • Top CPU utilization: mod-inventory-b - 16%, nginx-okapi - 5%, mod-source-record-storage-b - 4%, mod-quick-marc-b - 7%. Such low resource utilization on the module side can be explained by the very high average latency of DB queries during INSERT and UPDATE operations that held a lock on the same tuple.
  • Top memory consumption: mod-inventory-storage-b - 85%, mod-data-import-b - 52%, mod-source-record-storage-b - 45%, mod-source-record-manager-b - 43%. A growing trend was observed in test set #1 for mod-inventory-storage-b, which reached 85%.
  • DI job duration for the same file size grew from test to test when the same instance HRID was used to create holdings.
  • DI performs faster with files that contain 1 unique instance HRID for every 1000 records. With this approach DI duration scales with file size and memory utilization shows no growing trend. CPU and RDS utilization increased because there are fewer locks in the DB.

Recommendations & Jiras

  • Investigate the growing memory trend for mod-inventory-storage in test set #1 (using 1 instance HRID to create all holdings).

Errors

  • One job in the 80k test of set #1 finished with status ERROR; its error status was SNAPSHOT_UPDATE_ERROR.

Test Runs 

Profile used for testing - Default - Create Holdings and SRS MARC Holdings

Set of tests № | Scenario | Test Conditions | Status
1 | DI Holdings Create (previous approach): 1 instance HRID for all created holdings | 1K, 5K, 10K, 80K sequentially | Completed
2 | DI Holdings Create (new approach): 1 instance HRID for every 1000 created holdings | 1K, 5K, 10K, 80K sequentially | Completed
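
The difference between the two scenarios is how holdings records are mapped to instance HRIDs. The following is only a minimal sketch of that mapping, not the script used to prepare the test files. The HRID format (prefix "in" followed by an 11-digit number) and the function names are assumptions made for illustration.

```python
from collections import Counter


def instance_hrid_for(record_index: int, records_per_instance: int = 1000,
                      prefix: str = "in", width: int = 11) -> str:
    """Return the instance HRID for the holdings record at a 0-based index.

    The prefix and width are hypothetical; real HRIDs depend on the tenant.
    """
    if record_index < 0 or records_per_instance <= 0:
        raise ValueError("index must be >= 0 and group size must be > 0")
    group = record_index // records_per_instance
    return f"{prefix}{group + 1:0{width}d}"


def assign_hrids(total_records: int, records_per_instance: int = 1000) -> list[str]:
    """Assign an instance HRID to every holdings record in a file."""
    return [instance_hrid_for(i, records_per_instance) for i in range(total_records)]


# Set 2 (new approach): a new instance HRID every 1000 holdings.
new_approach = Counter(assign_hrids(5000))
# Set 1 (previous approach): one instance HRID for the whole file.
previous_approach = Counter(assign_hrids(5000, records_per_instance=5000))

print(len(new_approach), "distinct HRIDs in the new approach")
print(len(previous_approach), "distinct HRID in the previous approach")
```

With a 5K file this gives 5 distinct instance HRIDs in the new approach and a single one in the previous approach, so in set 1 every created holdings record refers to the same instance.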

Test Results

Set 1 - Files used to test DI create Holdings had 1 instance HRID for all created Holdings

Test | File | Duration: Orchid (previous results) | Duration: Poppy (previous results) | Duration: Quesnelia
1 | 1k | 45s | 32s | 1 min 22 sec
2 | 5k | 7m 47s | 2m 14s | 8 min
3 | 10k | 19m 46s | 4m 35s | 22 min 40 sec
4 | 80k | 20m (error*) | 36m 25s | 4 hours 13 min

Notes on the 80k test in Quesnelia:

  • Stopped by the user after 46 of 81 jobs were COMMITTED (56% finished)
  • 1 job finished with status ERROR, with error status SNAPSHOT_UPDATE_ERROR (job number - 32, file_name = '1718290065265-80k_holdings_Create_32.mrc')

Set 2 - Files used to test DI create Holdings had 1 unique instance HRID for every 1000 created Holdings (new approach)

Test | File | Duration: Orchid (previous results) | Duration: Poppy (previous results) | Duration: Quesnelia
1 | 1k | 45s | 32s | 1 min 3 sec
2 | 5k | 7m 47s | 2m 14s | 4 min 16 sec
3 | 10k | 19m 46s | 4m 35s | 8 min 59 sec
4 | 80k | 20m (error*) | 36m 25s | 52 min 5 sec
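
To check how duration relates to file size in set 2, the Quesnelia durations can be converted to throughput. This is a rough sketch only: it assumes each file contains exactly the number of records in its name (1k = 1000 records and so on), and the helper names are illustrative.

```python
import re

# Seconds per unit as written in the duration column ("4 hours 13 min", "52 min 5 sec").
UNIT_SECONDS = {"hour": 3600, "min": 60, "sec": 1}


def duration_seconds(text: str) -> int:
    """Convert a duration such as '52 min 5 sec' into seconds."""
    return sum(int(amount) * UNIT_SECONDS[unit]
               for amount, unit in re.findall(r"(\d+)\s*(hour|min|sec)", text))


def records_per_second(record_count: int, duration: str) -> float:
    """Average throughput for one test, in records per second."""
    return record_count / duration_seconds(duration)


# Set 2 Quesnelia durations from the table; record counts are assumed from file names.
SET_2_QUESNELIA = {
    1_000: "1 min 3 sec",
    5_000: "4 min 16 sec",
    10_000: "8 min 59 sec",
    80_000: "52 min 5 sec",
}

for count, duration in SET_2_QUESNELIA.items():
    print(f"{count:>6} records: {records_per_second(count, duration):.1f} records/s")
```

Under these assumptions throughput stays between roughly 16 and 26 records per second across all four files, which is consistent with duration following file size in set 2.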

Comparison

The tables below compare Quesnelia durations with Poppy durations for each test set.

Set #1

Test | File | Duration: Poppy | Duration: Quesnelia set #1 | Difference absolute | Difference percentage
1 | 1k | 00:00:32 | 00:01:22 | 00:00:50 | 156%
2 | 5k | 00:02:14 | 00:08:00 | 00:05:46 | 258%
3 | 10k | 00:04:35 | 00:22:40 | 00:18:05 | 395%
4 | 80k | 00:36:25 | 04:13:00 | 03:36:35 | 595%

Set #2

Test | File | Duration: Poppy | Duration: Quesnelia set #2 | Difference absolute | Difference percentage
1 | 1k | 00:00:32 | 00:01:03 | 00:00:31 | 97%
2 | 5k | 00:02:14 | 00:04:16 | 00:02:02 | 91%
3 | 10k | 00:04:35 | 00:08:59 | 00:04:24 | 96%
4 | 80k | 00:36:25 | 00:52:05 | 00:15:40 | 43%
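
The difference columns can be reproduced from the two duration columns. The sketch below assumes the percentage is the absolute difference divided by the Poppy duration, rounded to a whole percent, and that the Quesnelia duration is not shorter than the Poppy one. The function names are illustrative.

```python
def to_seconds(hms: str) -> int:
    """Convert 'HH:MM:SS' into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Convert a non-negative number of seconds into 'HH:MM:SS'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def compare(poppy: str, quesnelia: str) -> tuple[str, int]:
    """Return (absolute difference, percentage difference relative to Poppy)."""
    baseline = to_seconds(poppy)
    difference = to_seconds(quesnelia) - baseline
    return to_hms(difference), round(difference / baseline * 100)


# 10k file, set #1 and set #2
print(compare("00:04:35", "00:22:40"))  # ('00:18:05', 395)
print(compare("00:04:35", "00:08:59"))  # ('00:04:24', 96)
```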

Service CPU Utilization

 CPU utilization, 1k, 5k, 10k, 80k

Set #1

Module | CPU % (1k) | CPU % (5k) | CPU % (10k) | CPU % (80k)
mod-inventory-b | 20.23 | 24.03 | 20.5 | 16.34
mod-di-converter-storage-b | 12.313.516.213.94
nginx-okapi | 10.67 | 13.07 | 10.37 | 5.13
mod-inventory-storage-b | 9.93 | 14.49 | 12.87 | 10.95
mod-quick-marc-b | 8.24 | 7.34 | 7.95 | 6.82
mod-source-record-storage-b | 7.99 | 8.42 | 6.47 | 3.79
mod-users-b | 7.16 | 5.95 | 5.93 | 6.58
okapi-b | 6.76 | 7.8 | 6.15 | 3.75
mod-pubsub-b | 5.69 | 5.73 | 5.78 | 5.69
mod-data-import-b | 5.15 | 1.27 | 1.6 | 1.23
mod-source-record-manager-b | 3.7 | 4.55 | 4.01 | 3.06
mod-authtoken-b | 3.69 | 1.26 | 1.46 | 0.99
mod-password-validator-b | 2.36 | 2.34 | 2.34 | 3.33
mod-feesfines-b | 2.34 | 2.12 | 2.2 | 2.16
mod-configuration-b | 2.22 | 2.08 | 1.98 | 2.71
mod-permissions-b | 1.66 | 0.8 | 1.11 | 0.94
mod-circulation-storage-b | 0.61 | 0.61 | 0.61 | 0.68
mod-circulation-b | 0.36 | 0.39 | 0.33 | 0.4
pub-okapi | 0.19 | 0.13 | 0.13 | 0.13

Set #2

Module | CPU % (1k) | CPU % (5k) | CPU % (10k) | CPU % (80k)
mod-inventory-b | 11.93 | 28.85 | 32.97 | 33.23
mod-quick-marc-b | 7.19 | 8.1 | 8.19 | 7.64
mod-pubsub-b | 6.2 | 6.56 | 6.72 | 6.55
mod-users-b | 5.79 | 6.47 | 6.09 | 6.82
mod-configuration-b | 3.12 | 3.21 | 3.47 | 3.4
mod-feesfines-b | 2.39 | 2.47 | 2.55 | 2.34
mod-password-validator-b | 2.26 | 2.4 | 3.27 | 2.41
mod-di-converter-storage-b | 1.95 | 10.22 | 9.39 | 8.98
mod-source-record-storage-b | 1.79 | 11.45 | 11.64 | 10.84
mod-source-record-manager-b | 1.65 | 6.44 | 6.42 | 5.87
mod-data-import-b | 1.36 | 1.48 | 1.83 | 1.48
okapi-b | 1.28 | 13.81 | 14 | 14.93
mod-authtoken-b | 1.02 | 1.24 | 1.6 | 1.6
mod-circulation-storage-b | 0.7 | 0.71 | 0.71 | 0.73
nginx-okapi | 0.59 | 20.99 | 20.39 | 22.89
mod-permissions-b | 0.48 | 4.95 | 1.65 | 1.63
mod-circulation-b | 0.35 | 0.36 | 0.35 | 0.36
mod-inventory-storage-b | 0.33 | 14.24 | 14.54 | 13.96
pub-okapi | 0.17 | 0.24 | 0.24 | 0.23

Set #1: mod-inventory-b - 16%, nginx-okapi - 5%, mod-source-record-storage-b - 4%, mod-quick-marc-b - 7%

Set #2: mod-inventory-b - 33%, nginx-okapi - 23%, mod-source-record-storage-b - 11%, mod-quick-marc-b - 7%

Memory Utilization

 Memory consumption

Set #1

Module | Memory %
mod-inventory-storage-b | 85.62
mod-data-import-b | 51.63
mod-source-record-storage-b | 44.97
mod-source-record-manager-b | 42.86
mod-users-b | 40.38
mod-inventory-b | 39.47
mod-permissions-b | 35.82
okapi-b | 33.4
mod-di-converter-storage-b | 33.26
mod-feesfines-b | 32.37
mod-quick-marc-b | 31.46
mod-configuration-b | 29.41
mod-pubsub-b | 25.66
mod-authtoken-b | 20.55
mod-circulation-storage-b | 18.93
mod-circulation-b | 17.87
nginx-okapi | 4.8
pub-okapi | 4.8

Set #2

Module | Memory %
mod-inventory-storage-b | 56.04
mod-data-import-b | 55.45
mod-inventory-b | 45.63
mod-source-record-manager-b | 41.19
mod-users-b | 38.95
mod-source-record-storage-b | 37.37
mod-quick-marc-b | 33.59
mod-permissions-b | 33.45
okapi-b | 32.82
mod-feesfines-b | 32.65
mod-di-converter-storage-b | 31.91
mod-configuration-b | 28.49
mod-circulation-storage-b | 26.86
mod-pubsub-b | 25.83
mod-circulation-b | 20.14
mod-authtoken-b | 19.97
nginx-okapi | 4.69
pub-okapi | 4.58
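
To see which modules changed most between the two sets, the memory values can be compared per module. The sketch below uses only four modules copied from the tables above. The variable and function names are illustrative, and the delta is simply set #2 minus set #1 in percentage points.

```python
# Memory consumption (%) taken from the Set #1 and Set #2 tables (subset of modules).
SET_1 = {
    "mod-inventory-storage-b": 85.62,
    "mod-data-import-b": 51.63,
    "mod-source-record-storage-b": 44.97,
    "mod-inventory-b": 39.47,
}
SET_2 = {
    "mod-inventory-storage-b": 56.04,
    "mod-data-import-b": 55.45,
    "mod-source-record-storage-b": 37.37,
    "mod-inventory-b": 45.63,
}


def memory_deltas(before: dict[str, float], after: dict[str, float]) -> list[tuple[str, float]]:
    """Return (module, after - before) pairs, largest absolute change first."""
    common = before.keys() & after.keys()
    deltas = [(module, round(after[module] - before[module], 2)) for module in common]
    return sorted(deltas, key=lambda item: abs(item[1]), reverse=True)


for module, delta in memory_deltas(SET_1, SET_2):
    print(f"{module}: {delta:+.2f}")
```

For this subset the largest change is mod-inventory-storage-b, which is about 30 percentage points lower in set #2.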


RDS CPU Utilization

Set #1

RDS CPU utilization was 62% for most of the tests, which is 20% lower than in Poppy. It rose to 73% with the 80k file after 50 minutes of the test run.

Set #2

99% during all tests

DB Connections

Set #1

Number of DB connections: 1430

Set #2

Number of DB connections: 1500

SQL queries

TOP SQL Queries for Set #1
UPDATE cs00000int_0001_mod_inventory_storage.holdings_record SET jsonb = $1::jsonb WHERE id = '[UUID]'
INSERT INTO cs00000int_0001_mod_inventory_storage.holdings_record (id, jsonb) VALUES ($1, $2) RETURNING jsonb
autovacuum: VACUUM cs00000int_mod_entities_links.authority
autovacuum: VACUUM cs00000int_mod_entities_links.authority_archive
autovacuum: VACUUM pg_toast.pg_toast_40004

  INSERT INTO cs00000int_mod_search.consortium_instance (tenant_id, instance_id, json, created_date, updated_date)
  VALUES ($1, $2, $3::json, $4, $5)
  ON CONFLICT (tenant_id, instance_id)
  DO UPDATE SET json = EXCLUDED.json, updated_date = EXCLUDED.updated_date

SELECT jsonb,id FROM cs00000int_0001_mod_inventory_storage.instance_holdings_item_view WHERE id='db87a6b4-d1f5-4e3d-b34b-d4bf06426127' LIMIT 1 OFFSET 0

TOP SQL Queries for Set #2
INSERT INTO cs00000int_0001_mod_inventory_storage.holdings_record (id, jsonb) VALUES ($1, $2) RETURNING jsonb
UPDATE cs00000int_0001_mod_inventory_storage.holdings_record SET jsonb = $1::jsonb WHERE id = '47ee9b78-3d8f-4e8b-b09e-82e9396eb3b3'

with "cte" as (select count(*) from "records_lb" where ("records_lb"."snapshot_id" <> cast($1 as uuid) and "records_lb"."external_id" = cast($2 as uuid) and "records_lb"."record_type" = $3::"record_type")) select "records_lb"."id", "records_lb"."snapshot_id", "records_lb"."matched_id", "records_lb"."generation", "records_lb"."record_type", "records_lb"."external_id", "records_lb"."state", "records_lb"."leader_record_status", "records_lb"."order", "records_lb"."suppress_discovery", "records_lb"."created_by_user_id", "records_lb"."created_date", "records_lb"."updated_by_user_id", "records_lb"."updated_date", "records_lb"."external_hrid", "marc_records_lb"."content" as "parsed_record_content", "raw_records_lb"."content" as "raw_record_content", "error_records_lb"."content" as "error_record_content", "error_records_lb"."description", "count" from "records_lb" left outer join "marc_records_lb" on "records_lb"."id" = "marc_records_lb"."id" left outer join "raw_records_lb" on "records_lb"."id" = "raw_records_lb"."id" left outer join "error_records_lb" on "records_lb"."id" = "error_records_lb"."id" right outer join (select * from "cte") as "alias_80949780" on true where ("records_lb"."snapshot_id" <> cast($4 as uuid) and "records_lb"."external_id" = cast($5 as uuid) and "records_lb"."record_type" = $6::"record_type") offset $7 rows fetch next $8 rows only

INSERT INTO cs00000int_0001_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)
INSERT INTO cs00000int_0001_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)

autovacuum: VACUUM pg_toast.pg_toast_40004

Infrastructure

PTF - environment qcon

Name | Memory GiB | vCPUs
db.r6g.4xlarge | 128 GiB | 16 vCPUs
  • MSK ptf-mobius-testing2
    • 2 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2
 Modules
Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize
qcon-pvt
16/06/2024
mod-inventory-b
mod-quick-marc-b
nginx-okapi
mod-di-converter-storage-b
okapi-b
mod-source-record-storage-b
mod-source-record-manager-b
mod-inventory-storage-b
mod-pubsub-b
mod-users-b
mod-data-import-b
mod-organizations-storage-b
mod-notes-b
mod-gobi-b
mod-permissions-b
mod-search-b
mod-circulation-storage-b
mod-circulation-b
pub-okapi