Circulation log export report [Nolana Snapshot - 10/24-2022]
Overview
Circulation log is one of the export workflows that are being tested together - PERF-315Getting issue details... STATUS to see how the effect they have on each other. This document contains the results of standalone (export) Circulation Log testing, which would be used to compare with the results from testing all the workflows. - PERF-314Getting issue details... STATUS . The modules tested are running snapshot versions of Nolana, so this is a pre-release of Nolana testing effort.
Infrastructure
PTF -environment
- 9 m6i.2xlarge EC2 instances located in us-west-2.
- 2 instances of db.r6.xlarge database instances, one reader and one writer
- MSK (ptf-kafka-1 cluster)
- 4 m5.2xlarge brokers in 2 zones
- auto.create-topics.enable = true
- log.retention.minutes=480
- default.replication.factor=3
- Apache Kafka v2.8.0
- EBS storage volume per broker = 300GB
- Kafka topics
- .data-export.job.command - 50 partitions
- data-export.job.update - 50 partitions
Memory parameters for relevant modules:
Module | Version | Max Metaspace Size (MB) | XmX (MB) | Soft Limit (MB) | Hard Limit (MB) |
---|---|---|---|---|---|
mod-data-export-spring | 1.5.0-SNAPSHOT.58 | 512 | 1536 | 1844 | 2048 |
mod-data-export-worker | 1.5.0-SNAPSHOT.76 | 512 | 2048 | 2600 | 3072 |
mod-circulation | 23.2.0-SNAPSHOT.539 | 128 | 768 | 896 | 1024 |
mod-circulation-storage | 15.1.0-SNAPSHOT.287 | 512 | 896 | 1440 | 1536 |
mod-audit | 2.6.0-SNAPSHOT.148 | 128 | 768 | 896 | 1024 |
Summary
- Circulation log workflow is working stable and without any issues until the number of records to be exported reaches 100K records. After this number system becomes unstable and the job becomes unpredictable. The high load causes high DB CPU usage. Moreover the timestamp of when the job starts changes during that behaviour. See more information here MODAUD-136
- To manipulate with number of records to export we just change dates in from-to section.
- Most loaded part of the system during this workflow is Database. CPU usage on DB reaching 70% during tests;
- Most loaded service is mod-audit and it CPU level reached 10%;
- We can observe growing of memory usage on mod-data-export-worker (from 54% to 62%). It may need additional investigation in future.
DB query used:
2022-10-18 20:06:19 UTC:10.23.10.209(58674):fs09000000_mod_audit@folio:[7873]:LOG: duration: 1363.198 ms statement: SELECT * FROM fs09000000_mod_audit.circulation_logs WHERE (CASE WHEN length(lower(f_unaccent('2022-10-01T00:00:00.000'))) <= 600 THEN left(lower(f_unaccent(circulation_logs.jsonb->>'date')),600) >= lower(f_unaccent('2022-10-01T00:00:00.000')) ELSE left(lower(f_unaccent(circulation_logs.jsonb->>'date')),600) >= left(lower(f_unaccent('2022-10-01T00:00:00.000')),600) AND lower(f_unaccent(circulation_logs.jsonb->>'date')) >= lower(f_unaccent('2022-10-01T00:00:00.000')) END) AND (CASE WHEN length(lower(f_unaccent('2022-10-18T23:59:59.999'))) <= 600 THEN left(lower(f_unaccent(circulation_logs.jsonb->>'date')),600) <= lower(f_unaccent('2022-10-18T23:59:59.999')) ELSE left(lower(f_unaccent(circulation_logs.jsonb->>'date')),600) <= left(lower(f_unaccent('2022-10-18T23:59:59.999')),600) AND lower(f_unaccent(circulation_logs.jsonb->>'date')) <= lower(f_unaccent('2022-10-18T23:59:59.999')) END) ORDER BY left(lower(f_unaccent(circulation_logs.jsonb->>'date')),600) DESC, lower(f_unaccent(circulation_logs.jsonb->>'date')) DESC LIMIT 100 OFFSET 106100
Note: ticker to investigate slow query PERF-347
Results
test # | number of records | time |
---|---|---|
1 | 20K | ±5 min |
2 | 40K | ±11 min |
3 | 60K | ±17 min |
Resources usage
Service CPU Utilization
CPU usage on related modules is hardly visible (usually lower than 10%). Most loaded module is mod-audit
Service memory usage
mod-data-export worker has a memory leak suspect, possibly we'll need to do some investigation on that.
ticket created to investigate memory leak PERF-348
DB CPU Utilization
Failed test
Note: Here resource usage of failed test.
Service CPU usage
Service memory usage
DB CPU usage