Data Import MARC BIB + CI/CO (Ramsons) [ECS]
Overview
- This document contains the results of testing Check-in/Check-out and Data Import for MARC Bibliographic records on the Ramsons[ECS] release environment.
- PERF-978
Summary
- Data Import (DI) tests with Check-in/Check-out (CI/CO) finished successfully for the PTF - Create 2 and PTF - Updates Success - 6 job profiles with files of 5K, 10K, 25K, 50K, and 100K records.
- Comparison results:
- DI create job durations with CI/CO degraded by 25% on average. Without CI/CO, durations degraded by 22% with the 10K file and 8% with the 25K file, but improved by 14% with the 50K file and 17% with the 100K file.
- DI update job durations with CI/CO degraded by 20% for the 25K, 50K, and 100K files and by 50% for the 5K and 10K files. Without CI/CO, durations degraded by 20% with the 10K file, 3% with the 50K file, and 12% with the 100K file, but improved by 4% with the 25K file.
- On Ramsons, DI with CI/CO is slower than DI without CI/CO: by 70% on average for DI create jobs and by 30% on average for DI update jobs.
- CI/CO response times with DI degraded by 20% on average.
- The DI create job with the 100K file finished successfully, but its total duration in the database was updated only 6 hours after the job finished. An additional 100K run finished in 56 minutes (total duration on the DB side).
- Memory utilization of mod-pubsub showed a growing trend, which may indicate a memory leak in the module.
Recommendations & Jiras
- The previous results report:
- The mod-pubsub module memory leak investigation ticket: MODPUBSUB-311
- New behaviour from the mod_orders_storage.po_line table was detected during data import create jobs. Consider also the [tenant]_mod_orders_storage.internal_lock query. It should be investigated.
Test Runs
Test № | Scenario | Test Conditions | Results |
---|---|---|---|
1 | DI MARC Bib Create | 5K, 10K, 25K, 50K, 100K consecutively (with 5 min pause) | Completed |
1 | CICO | 8 users | |
2 | DI MARC Bib Update | 5K, 10K, 25K, 50K, 100K consecutively (with 5 min pause) | |
2 | CICO | 8 users | |
Test Results
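This table contains Data Import durations and CI/CO response times measured during the DI jobs on Ramsons.
Profile | MARC File | DI Duration Ramsons (hh:mm:ss) | CI Average, sec (8 users, Ramsons) | CO Average, sec (8 users, Ramsons) |
---|---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | 5K | 0:04:11 | 1.01 | 2 |
DI MARC Bib Create (PTF - Create 2) | 10K | 0:06:39 | 0.95 | 1.88 |
DI MARC Bib Create (PTF - Create 2) | 25K | 0:16:13 | 1.18 | 2.25 |
DI MARC Bib Create (PTF - Create 2) | 50K | 0:29:47 | 1.11 | 2.29 |
DI MARC Bib Create (PTF - Create 2) | 100K | 0:56:00 | 1.6 | 2.4 |
DI MARC Bib Update (PTF - Updates Success - 6) | 5K | 0:06:19 | 0.99 | 2.2 |
DI MARC Bib Update (PTF - Updates Success - 6) | 10K | 0:12:10 | 1.1 | 2.5 |
DI MARC Bib Update (PTF - Updates Success - 6) | 25K | 0:24:31 | 1.04 | 2.1 |
DI MARC Bib Update (PTF - Updates Success - 6) | 50K | 0:49:53 | 1.03 | 2.2 |
DI MARC Bib Update (PTF - Updates Success - 6) | 100K | 1:48:00 | 1 | 2.1 |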
Check-in/Check-out without DI
Scenario | Load level | Request | Response time 95 perc, sec (Ramsons) | Response time average, sec (Ramsons) |
---|---|---|---|---|
Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 1 | 0.65 |
Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.9 | 1.2 |
Comparison
This table compares DI durations, with and without CI/CO, and CI/CO response times between the Quesnelia and Ramsons releases.
Profile | MARC File | DI Duration without CI/CO, Quesnelia | DI Duration without CI/CO, Ramsons | DI Duration with CI/CO, Quesnelia | DI Duration with CI/CO, Ramsons | DI Delta without CI/CO, Quesnelia/Ramsons, % | DI Delta with CI/CO, Quesnelia/Ramsons, % | DI Delta Ramsons without/with CI/CO, % | CI Average, sec, Quesnelia (8 users) | CO Average, sec, Quesnelia (8 users) | CI Average, sec, Ramsons (8 users) | CO Average, sec, Ramsons (8 users) | CI Delta, % | CO Delta, % |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | 5K.mrc | | 00:04:11 | 00:03:21 | 00:04:11 | | 24.88% | 0.00% | 0.831 | 1.357 | 1.01 | 2 | 21.54% | 47.38% |
DI MARC Bib Create (PTF - Create 2) | 10K.mrc | 00:04:14 | 00:05:10 | 00:06:51 | 00:06:39 | 22.05% | -2.92% | 28.71% | 0.845 | 1.41 | 0.95 | 1.88 | 12.43% | 33.33% |
DI MARC Bib Create (PTF - Create 2) | 25K.mrc | 00:09:41 | 00:10:30 | 00:12:41 | 00:16:13 | 8.43% | 27.86% | 54.44% | 0.719 | 1.333 | 1.18 | 2.25 | 64.12% | 68.79% |
DI MARC Bib Create (PTF - Create 2) | 50K.mrc | 00:18:18 | 00:15:43 | 00:23:19 | 00:29:47 | -14.12% | 27.73% | 89.50% | 0.691 | 1.327 | 1.11 | 2.29 | 60.64% | 72.57% |
DI MARC Bib Create (PTF - Create 2) | 100K.mrc | 00:38:36 | 00:31:51 | 00:51:24 | 00:56:00 | -17.49% | 8.95% | 75.82% | 0.664 | 1.335 | 1.6 | 2.4 | 140.96% | 79.78% |
DI MARC Bib Update (PTF - Updates Success - 6) | 5K.mrc | | | 00:04:12 | 00:06:19 | | 50.40% | | 0.764 | 1.458 | 0.99 | 2.2 | 29.58% | 50.89% |
DI MARC Bib Update (PTF - Updates Success - 6) | 10K.mrc | 00:05:59 | 00:07:10 | 00:08:15 | 00:12:10 | 19.78% | 47.47% | 69.77% | 0.779 | 1.377 | 1.1 | 2.5 | 41.21% | 81.55% |
DI MARC Bib Update (PTF - Updates Success - 6) | 25K.mrc | 00:19:52 | 00:19:03 | 00:20:38 | 00:24:31 | -4.11% | 18.82% | 28.70% | 0.755 | 1.401 | 1.04 | 2.1 | 37.75% | 49.89% |
DI MARC Bib Update (PTF - Updates Success - 6) | 50K.mrc | 00:37:53 | 00:38:53 | 00:43:06 | 00:49:53 | 2.64% | 15.74% | 28.29% | 0.75 | 1.444 | 1.03 | 2.2 | 37.33% | 52.35% |
DI MARC Bib Update (PTF - Updates Success - 6) | 100K.mrc | 01:14:00 | 01:23:00 | 01:29:09 | 01:48:00 | 12.16% | 21.14% | 30.12% | 0.73 | 1.458 | 1 | 2.1 | 36.99% | 44.03% |
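The delta columns in the comparison table are consistent with a plain relative change between two durations. A minimal Python sketch of that calculation is given below; the function names `to_seconds` and `delta_percent` are illustrative and are not part of any PTF tooling.

```python
def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss duration string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def delta_percent(baseline: str, current: str) -> float:
    """Relative change of `current` against `baseline`, in percent.

    Positive values mean the job took longer (degradation),
    negative values mean it finished faster (improvement).
    """
    base = to_seconds(baseline)
    return round((to_seconds(current) - base) / base * 100, 2)


# 10K create job without CI/CO: Quesnelia 00:04:14 vs Ramsons 00:05:10
print(delta_percent("00:04:14", "00:05:10"))  # 22.05
# 10K create job on Ramsons: without CI/CO 00:05:10 vs with CI/CO 00:06:39
print(delta_percent("00:05:10", "00:06:39"))  # 28.71
```

The same function applies to all three DI delta columns; only the pair of durations taken as baseline and current changes.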
Detailed CICO response time comparison without DI
Scenario | Load level | Request | Response time 95 perc, sec (Quesnelia) | Response time average, sec (Quesnelia) | Response time 95 perc, sec (Ramsons) | Response time average, sec (Ramsons) |
---|---|---|---|---|---|---|
Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.64 | 0.49 | 1 | 0.65 |
Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.24 | 1.08 | 1.9 | 1.2 |
Test №1
Response times
Cluster resource utilization
Service CPU Utilization
Service Memory Utilization
DB resources utilization
RDS CPU Utilization
DB Connections
DB load
Top SQL-queries
Top applications
MSK resources utilization
CPU (User) usage by broker
Maximum utilization was 50% with 100k file
Test №2
Response times
Cluster resource utilization
Service CPU Utilization
Service Memory Utilization
DB resources utilization
RDS CPU Utilization
DB Connections
DB load
Top SQL-queries
MSK resources utilization
CPU (User) usage by broker
Maximum utilization was 60% with 50k file
Appendix
Infrastructure
PTF - environment rcon
DB table records size:
Modules:
Methodology/Approach
DI test scenarios (DI MARC Bib Create and Update) were started from the UI with a delay.
Test runs:
- Test 1: Files with 5K, 10K, 25K, 50K, and 100K records were tested manually and consecutively (with a 5 min pause). DI (DI MARC Bib Create) was started on the College tenant (cs00000int_0001) only, with CICO running with 8 users in the background.
- Test 2: Files with 5K, 10K, 25K, 50K, and 100K records were tested manually and consecutively (with a 5 min pause). DI (DI MARC Bib Update) was started on the College tenant (cs00000int_0001) only, with CICO running with 8 users in the background.
Description
Testing includes a data preparation step for Check-in/Check-out with 8 virtual users and a 3-hour duration (as a rule, this is enough for a series of DI create/update jobs). Any convenient duration may be applied.
- Data preparation of Check-in/Check-out for DI tests takes up to 20 minutes and consists of truncating the tables involved in testing, populating data, and updating item statuses.
- The test itself depends on the duration and on the number of virtual users creating the necessary load.
In Ramsons, token expiration is set to 10 minutes by default, so use the new login implementation from the script to run any tests. Pay attention to the Backend Listener: replace the value of the application parameter to make the results visible in the Grafana dashboard.
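The login implementation itself lives in the test script and is not reproduced here. As a rough illustration of why a 10-minute token lifetime matters for a 3-hour run, the Python sketch below caches a token and renews it shortly before it expires. Everything in it is hypothetical: the `TokenCache` class, the `login` callable, and the 60-second safety margin are assumptions for illustration, not the script's actual code.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches an access token and renews it shortly before expiry.

    `login` is a hypothetical callable that performs the real login
    request and returns a fresh token string.
    """

    def __init__(
        self,
        login: Callable[[], str],
        ttl_seconds: float = 600.0,     # 10 minutes, the Ramsons default
        margin_seconds: float = 60.0,   # assumed safety margin
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._login = login
        self._ttl = ttl_seconds
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, logging in again if it is about to expire."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._login()
            self._expires_at = now + self._ttl
        return self._token
```

Each virtual user would call `get()` before a request, so a run longer than 10 minutes logs in again roughly every 9 minutes instead of failing on an expired token.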
Module configuration recommended setup
Update the revision of the source-record-storage module with the following parameter to exclude the SQL statements that run every 30 minutes (delete rows in marc_indexers (mi) WITH deleted_rows):
{ "name": "srs.marcIndexers.delete.interval.seconds", "value": "86400" },
Update the mod-serials module: set the number of tasks to 0 to prevent significant growth in database connections.
DB trigger setup in Ramsons
The usual PTF CI/CO data preparation script won't work in Ramsons. To solve that, disable the trigger updatecompleteupdateddate_item_insert_update for the tenant before data preparation and enable it again before the test starts.
The SQL file was updated so that the script performs this step.
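The updated SQL file is not shown on this page. As a sketch of what the disable/enable step could look like, the Python helper below builds the two PostgreSQL `ALTER TABLE ... DISABLE/ENABLE TRIGGER` statements for a tenant. The table and schema (`item` in `[tenant]_mod_inventory_storage`) are assumptions inferred from the trigger name, and `trigger_statements` is a hypothetical helper; check both against the actual SQL file before use.

```python
from typing import Tuple


def trigger_statements(
    tenant: str,
    module_schema: str = "mod_inventory_storage",  # assumed schema suffix
    table: str = "item",                            # assumed table name
    trigger: str = "updatecompleteupdateddate_item_insert_update",
) -> Tuple[str, str]:
    """Return (disable_sql, enable_sql) for one tenant's trigger."""
    qualified = f"{tenant}_{module_schema}.{table}"
    disable = f"ALTER TABLE {qualified} DISABLE TRIGGER {trigger};"
    enable = f"ALTER TABLE {qualified} ENABLE TRIGGER {trigger};"
    return disable, enable


disable_sql, enable_sql = trigger_statements("cs00000int_0001")
print(disable_sql)  # run before data preparation
print(enable_sql)   # run before the test starts
```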
Data preparation
- To prepare data, establish a connection using the AWS key.
- Run the CICO_db_preparation.sh script located in the /scripts folder. Before running it, edit the file tenats.csv to list the tenants whose database should be restored.
- Files location: Buckets/fse-ptf/Scripts/CICO/Ramsons/
To start the test from an AWS instance (load generator), use the template for the command. Test locally before starting.
To create widgets in the AWS dashboard that monitor and collect parameters (service CPU and memory) of the CI/CO-related modules, use these JSON definitions: