Data Import with Check-ins Check-outs (Quesnelia)[non-ECS]
Overview
This document contains the results of testing Check-in/Check-out and Data Import for MARC Bibliographic records in the Quesnelia release.
Ticket: PERF-824
Summary
- Data import tests finished successfully for all files. DI duration grew in proportion to the number of records in the file.
- Check-in and Check-out with 8 virtual users ran during the DI Create and Update jobs without issues.
- Data Import in Quesnelia with CICO ran faster than without it for Create files up to 25K and Update files up to 50K (12% to 52% faster). For the larger files it ran 6% to 14% slower.
- Comparison of the Poppy and Quesnelia releases
  - Check-in / Check-out perform better in Quesnelia. Average response time improved by about 30% during DI Create jobs and by about 15% during DI Update jobs.
  - DI durations do not differ much.
- Resource utilization
  - Average CPU utilization did not exceed 150% for any module. The highest consumption was observed for mod-inventory: 144% in the DI Update job with the 25K file, the same maximum level as in Poppy.
  - No memory leaks were observed during the tests.
  - Average DB CPU usage during data import decreased by 5 percentage points in Quesnelia, to about 90%.
  - Average connection count during data import is about 750 connections for Create jobs, which is 450 connections higher than in Poppy, and about 730 connections for Update jobs.
Test Runs
Test # | Scenario | Load level | Comment |
---|---|---|---|
1 | DI MARC Bib Create | 5K, 10K, 25K, 50K, 100K consecutively | |
1 | CICO | 8 users | |
2 | DI MARC Bib Update | 5K, 10K, 25K, 50K, 100K consecutively | |
2 | CICO | 8 users | |
Test Results
Data import
Files for the Data Import Update jobs were prepared during previous tests, so there was no need to run Data Export.
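Profile | MARC File | DI duration, Quesnelia (hh:mm:ss) | CI average, sec (8 users, Quesnelia) | CO average, sec (8 users, Quesnelia) |
---|---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | 5K.mrc | 00:02:32 | 0.645 | 0.901 |
DI MARC Bib Create (PTF - Create 2) | 10K.mrc | 00:05:03 | 0.628 | 0.922 |
DI MARC Bib Create (PTF - Create 2) | 25K.mrc | 00:11:58 | 0.639 | 0.960 |
DI MARC Bib Create (PTF - Create 2) | 50K.mrc | 00:23:29 | 0.678 | 1.003 |
DI MARC Bib Create (PTF - Create 2) | 100K.mrc | 00:46:07 | 0.686 | 0.998 |
DI MARC Bib Update (PTF - Updates Success - 1) | 5K.mrc | 00:03:24 | 0.628 | 0.975 |
DI MARC Bib Update (PTF - Updates Success - 1) | 10K.mrc | 00:06:29 | 0.664 | 1.018 |
DI MARC Bib Update (PTF - Updates Success - 1) | 25K.mrc | 00:16:15 | 0.717 | 1.062 |
DI MARC Bib Update (PTF - Updates Success - 1) | 50K.mrc | 00:33:33 | 0.721 | 1.071 |
DI MARC Bib Update (PTF - Updates Success - 1) | 100K.mrc | 01:10:14 | 0.739 | 1.081 |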
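The claim that DI duration grows with file size can be checked by converting each duration to a throughput figure. The following is a minimal sketch, not part of the test tooling: the helper names `to_seconds` and `throughput` are illustrative, and the input values are copied from the table above.

```python
# Sketch: derive records-per-second throughput from the reported DI durations.
# Helper names are illustrative; the figures come from the table above.

def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def throughput(records: int, hms: str) -> float:
    """Records processed per second for a job of the given duration."""
    return records / to_seconds(hms)


# (records, Create duration, Update duration) with CICO running, Quesnelia
RESULTS = [
    (5_000, "00:02:32", "00:03:24"),
    (10_000, "00:05:03", "00:06:29"),
    (25_000, "00:11:58", "00:16:15"),
    (50_000, "00:23:29", "00:33:33"),
    (100_000, "00:46:07", "01:10:14"),
]

for records, create, update in RESULTS:
    print(
        f"{records:>7} records: "
        f"create {throughput(records, create):5.1f} rec/s, "
        f"update {throughput(records, update):5.1f} rec/s"
    )
```

With these inputs the Create throughput stays at roughly 33 to 36 records per second and the Update throughput at roughly 24 to 26, which is consistent with duration growing about linearly with the number of records.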
Check-in/Check-out without DI
Scenario | Load level | Request | Response time 95th percentile, sec (Quesnelia) | Response time average, sec (Quesnelia) |
---|---|---|---|---|
Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.609 | 0.521 |
Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.070 | 0.803 |
Comparison
CICO with DI comparison
DI duration results without Check-In and Check-Out for Quesnelia were taken from the report Data Import test report (Quesnelia)[non-ECS].
Profile | MARC File | DI duration without CI/CO, Poppy | DI duration without CI/CO, Quesnelia | DI duration with CI/CO, Poppy | DI duration with CI/CO, Quesnelia | Deviation, % (Quesnelia without CICO vs. with CICO) | DI delta, hh:mm:ss, Poppy/Quesnelia (with CICO) | CI average, sec, Poppy (8 users) | CO average, sec, Poppy (8 users) | CI average, sec, Quesnelia (8 users) | CO average, sec, Quesnelia (8 users) | CI delta, %, Poppy/Quesnelia | CO delta, %, Poppy/Quesnelia |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | 5K.mrc | 00:02:39 | 00:03:20 | 00:02:53 | 00:02:32 | - 24% / 48 sec | - 00:00:21 | 0.901 | 1.375 | 0.645 | 0.901 | -28.41% | -34.47% |
DI MARC Bib Create (PTF - Create 2) | 10K.mrc | 00:05:00 | 00:06:00 | 00:04:32 | 00:05:03 | - 15% / 57 sec | + 00:00:31 | 0.902 | 1.47 | 0.628 | 0.922 | -30.38% | -37.28% |
DI MARC Bib Create (PTF - Create 2) | 25K.mrc | 00:11:15 | 00:13:41 | 00:11:14 | 00:11:58 | - 12% / 1 min 43 sec | + 00:00:44 | 1 | 1.571 | 0.639 | 0.96 | -36.10% | -38.89% |
DI MARC Bib Create (PTF - Create 2) | 50K.mrc | 00:22:16 | 00:21:59 | 00:21:55 | 00:23:29 | + 6% / 1 min 34 sec | + 00:01:34 | 0.981 | 1.46 | 0.678 | 1.003 | -30.89% | -31.30% |
DI MARC Bib Create (PTF - Create 2) | 100K.mrc | 00:49:58 | 00:40:16 | 00:47:02 | 00:46:07 | + 14% / 5 min 51 sec | - 00:00:55 | 1.018 | 1.491 | 0.686 | 0.998 | -32.61% | -33.07% |
DI MARC Bib Update (PTF - Updates Success - 6) | 5K.mrc | 00:02:28 | 00:07:10 | 00:03:19 | 00:03:24 | - 52% / 3 min 46 sec | + 00:00:05 | 0.755 | 1.169 | 0.628 | 0.975 | -16.82% | -16.60% |
DI MARC Bib Update (PTF - Updates Success - 6) | 10K.mrc | 00:05:31 | 00:10:27 | 00:06:20 | 00:06:29 | - 37% / 3 min 58 sec | + 00:00:09 | 0.75 | 1.307 | 0.664 | 1.018 | -11.47% | -22.11% |
DI MARC Bib Update (PTF - Updates Success - 6) | 25K.mrc | 00:14:50 | 00:23:16 | 00:14:04 | 00:16:15 | - 30% / 7 min 1 sec | + 00:02:11 | 0.822 | 1.403 | 0.717 | 1.062 | -12.77% | -24.31% |
DI MARC Bib Update (PTF - Updates Success - 6) | 50K.mrc | 00:32:53 | 00:40:52 | 00:29:59 | 00:33:33 | - 17% / 7 min 19 sec | + 00:03:34 | 0.893 | 1.424 | 0.721 | 1.071 | -19.26% | -24.79% |
DI MARC Bib Update (PTF - Updates Success - 6) | 100K.mrc | 01:14:39 | 01:02:00 | 01:03:03 | 01:10:14 | + 13% / 8 min 14 sec | + 00:07:11 | 0.908 | 1.51 | 0.739 | 1.081 | -18.61% | -28.41% |
The table above compares the test results of the current release (Quesnelia) with those of the previous release (Poppy).
* Poppy DI and CICO results are taken from the report Data Import with Check-ins Check-outs Poppy.
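The derived columns of the comparison table (deviation, DI delta, and CI/CO delta) follow from simple arithmetic on the measured values. The sketch below shows one way to reproduce them, assuming the deviation is taken relative to the run without CICO and the CI/CO delta relative to Poppy. The function names are hypothetical and not taken from any FOLIO tooling.

```python
# Sketch: reproduce the derived columns of the comparison table.
# Function names are illustrative assumptions, not an existing API.

def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_delta(seconds: int) -> str:
    """Render a signed number of seconds as '+ hh:mm:ss' or '- hh:mm:ss'."""
    sign = "+" if seconds >= 0 else "-"
    seconds = abs(seconds)
    return f"{sign} {seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


def deviation_pct(baseline: str, measured: str) -> float:
    """Percentage change of a duration relative to the baseline duration."""
    base = to_seconds(baseline)
    return (to_seconds(measured) - base) / base * 100


def pct_change(old: float, new: float) -> float:
    """Percentage change of a response time relative to the old value."""
    return (new - old) / old * 100


# 5K Create row: Quesnelia without CICO vs. with CICO
print(f"deviation: {deviation_pct('00:03:20', '00:02:32'):.0f}%")
# 5K Create row: Poppy with CICO vs. Quesnelia with CICO
print("DI delta:", format_delta(to_seconds("00:02:32") - to_seconds("00:02:53")))
# 5K Create row: check-in and check-out averages, Poppy vs. Quesnelia
print(f"CI delta: {pct_change(0.901, 0.645):.2f}%")
print(f"CO delta: {pct_change(1.375, 0.901):.2f}%")
```

For the 5K Create row this yields -24%, - 00:00:21, -28.41%, and -34.47%, matching the table. For other rows the whole-number deviation percentages in the table appear to be truncated rather than rounded, so small differences are expected.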
Detailed CICO response time comparison
Scenario | Load level | Request | 95th percentile, sec (Poppy) | Average, sec (Poppy) | 95th percentile, sec (Quesnelia) | Average, sec (Quesnelia) |
---|---|---|---|---|---|---|
Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.489 | 0.431 | 0.609 | 0.521 |
Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 0.969 | 0.828 | 1.070 | 0.803 |
Detailed CICO response time for CICO with DI in Quesnelia
Request* | Pure CICO, avg ms | CICO + 100K MARC BIB Create, avg ms | CICO + 100K MARC BIB Update, avg ms |
---|---|---|---|
Check-Out Controller | 803.14 | 998.18 | 1081.19 |
Check-In Controller | 521.62 | 686.69 | 739.48 |
POST_circulation/check-out-by-barcode (Submit_barcode_checkout) | 289.16 | 397.54 | 432.5 |
POST_circulation/check-in-by-barcode (Submit_barcode_checkin) | 210.45 | 318.34 | 331.21 |
GET_circulation/loans (Submit_barcode_checkout) | 150.84 | 186.52 | 203.77 |
GET_users (Get_check_in_page) | 76.98 | 90.13 | 113.47 |
GET_inventory/items (Submit_barcode_checkin) | 59.48 | 89.73 | 99.01 |
GET_inventory/items (Submit_barcode_checkout) | 55.23 | 79.96 | 85.51 |
GET_circulation/requests_status_openAwaitingPickup (Submit_patron_barcode) | 20.51 | 22.54 | 24.33 |
GET_circulation/requests (Submit_barcode_checkin) | 20.17 | 22.5 | 23.68 |
GET_circulation/loans (Submit_patron_barcode) | 20.06 | 22.33 | 23.63 |
*Top-10 requests were taken for analysis.
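To quantify how much a running DI job slows CICO down, the per-request averages can be compared against the pure CICO baseline. The sketch below does this for the first four rows of the table above. The values match the Quesnelia averages reported earlier once they are read as milliseconds, and `overhead_pct` is an illustrative helper name.

```python
# Sketch: relative slowdown of CICO requests while a 100K DI job is running.
# Values are average response times in ms, copied from the table above.

TIMINGS_MS = {
    # request: (pure CICO, CICO + 100K Create, CICO + 100K Update)
    "Check-Out Controller": (803.14, 998.18, 1081.19),
    "Check-In Controller": (521.62, 686.69, 739.48),
    "POST_circulation/check-out-by-barcode": (289.16, 397.54, 432.5),
    "POST_circulation/check-in-by-barcode": (210.45, 318.34, 331.21),
}


def overhead_pct(pure: float, loaded: float) -> float:
    """Percentage increase of the loaded response time over the baseline."""
    return (loaded - pure) / pure * 100


for request, (pure, create, update) in TIMINGS_MS.items():
    print(
        f"{request}: "
        f"+{overhead_pct(pure, create):.1f}% with Create, "
        f"+{overhead_pct(pure, update):.1f}% with Update"
    )
```

For the two controllers this gives an overhead of about 24% (check-out) and 32% (check-in) during the Create job, and about 35% and 42% during the Update job.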
Response time
DI MARC BIB Create + CICO
DI MARC BIB Update + CICO
Service CPU Utilization
Average CPU utilization did not exceed 150% for any module. The highest consumption was observed for mod-inventory, at the same level as in Poppy.
A spike of up to 150% was observed for mod-data-import immediately after the start of the DI Create job with the 100K file. In the other tests it did not exceed 110%.
DI MARC BIB Create and Update + CICO
Service Memory Utilization
An increase in memory utilization was observed. It is caused by the module restarts that preceded the tests (daily cluster shutdown process).
Top 5 modules with the highest memory consumption: mod-inventory - 87%, mod-search - 87%, mod-oa - 75%, mod-dcb - 68%, mod-source-record-manager - 66%.
mod-data-export-worker-b was at the level of 93% before and after the tests.
DB CPU Utilization
Average DB CPU usage during data import is about 90%. It decreased by 5 percentage points, from 95% in Poppy.
DB Connections
Average connection count during data import is about 750 connections for Create jobs, which is 450 connections higher than in Poppy, and about 730 connections for Update jobs.
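To put these counts in context, they can be related to the `max_connections` limit of the database instance listed in the Appendix (2731). The sketch below is only illustrative arithmetic on the reported averages. The helper name `pool_usage_pct` is an assumption, and peak connection counts may be higher than the averages used here.

```python
# Sketch: share of the database connection limit used on average during DI.
# MAX_CONNECTIONS is the db.r6g.xlarge value listed in the Appendix.

MAX_CONNECTIONS = 2731


def pool_usage_pct(connections: int, limit: int = MAX_CONNECTIONS) -> float:
    """Percentage of the connection limit consumed by the given count."""
    return connections / limit * 100


# Average connection counts reported above
AVERAGE_CONNECTIONS = {"create": 750, "update": 730}

for job, connections in AVERAGE_CONNECTIONS.items():
    print(f"{job}: {connections} connections = {pool_usage_pct(connections):.1f}% of max_connections")
```

Both job types use a little over a quarter of the limit on average.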
DB load
Create jobs
Update jobs
Top SQL-queries:
Create jobs
insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)
INSERT INTO fs09000000_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)
INSERT INTO fs09000000_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)
Update jobs
insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)
INSERT INTO fs09000000_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)
INSERT INTO fs09000000_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)
UPDATE fs09000000_mod_inventory_storage.instance SET jsonb = $1::jsonb WHERE id='99b071cf-f789-4dfe-a238-d62c40bccfc0'
Appendix
Infrastructure
PTF - environment qcp1
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 1 database instance, writer
Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|
db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
- MSK tenant
  - 4 m5.2xlarge brokers in 2 zones
  - Apache Kafka version 2.8.0
  - EBS storage volume per broker 300 GiB
  - auto.create.topics.enable=true
  - log.retention.minutes=480
  - default.replication.factor=3
Task count for the modules mod-agreements, mod-serials-management, and mod-graphql was set to 0 during the tests.
Modules
Methodology/Approach
- An Ubuntu AWS instance was used as the load generator to run CI/CO.
- DI tests were started from the UI.