Bursar (Nolana)
Overview
The Bursar workflow was tested against a Nolana snapshot. The purpose of this testing is to establish a baseline for the Bursar workflow and to find possible memory leaks, bottlenecks, etc.
Following the numbers given in the Feature - Team Responsibility Matrix, we tested the Bursar workflow with successively larger data sets, from 0 records to 2000 records.
Infrastructure
PTF environment (ncp1)
- 9 m6i.2xlarge EC2 instances located in us-west-2
- 2 db.r6.xlarge database instances, one reader and one writer
- MSK (ptf-kafka-1 cluster)
  - 4 m5.2xlarge brokers in 2 zones
  - auto.create.topics.enable=true
  - log.retention.minutes=480
  - default.replication.factor=3
  - Apache Kafka v2.8.0
  - EBS storage volume per broker = 300 GB
- Kafka topics
  - .data-export.job.command - 50 partitions
  - data-export.job.update - 50 partitions
Memory parameters for relevant modules:
Module | Version | Max Metaspace Size (MB) | Xmx (MB) | Soft Limit (MB) | Hard Limit (MB) | CPU | Number of ECS Tasks |
---|---|---|---|---|---|---|---|
mod-data-export-spring | 1.5.0-SNAPSHOT.58 | 512 | 1536 | 1844 | 2048 | 256 | 1 |
mod-data-export-worker | 2.0.2 | 512 | 2048 | 2600 | 3072 | 1024 | 2 |
mod-feesfines | 18.2.0-SNAPSHOT.132 | 128 | 768 | 896 | 1024 | 128 | 2 |
mod-users | | 128 | 768 | 896 | 1024 | 128 | 2 |
okapi | 4.14.4 | 512 | 922 | 1360 | 1512 | 1024 | 3 |
nginx-okapi | 2022.03.02 | - | - | 896 | 1024 | 128 | 2 |
pub-okapi | 2022.03.02 | - | - | 896 | 1024 | 128 | 2 |
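As a quick sanity check on the table above, the sketch below compares Xmx plus the maximum metaspace size against each module's hard limit. It is only an illustration: the names `MODULES`, `headroom_mb` and `headroom_report` are made up for this example, and the mod-users row is read as metaspace 128, Xmx 768, soft limit 896, hard limit 1024. Note that Xmx plus metaspace is only a lower bound on what a JVM can use (thread stacks, code cache and direct buffers are not counted), so a small or zero result means the container limit leaves little room for anything else.

```python
# Values copied from the memory-parameter table above (all sizes in MB).
# nginx-okapi and pub-okapi are omitted because they list no JVM settings.
# ASSUMPTION: the mod-users row is read as 128 / 768 / 896 / 1024.
MODULES = {
    "mod-data-export-spring": {"metaspace": 512, "xmx": 1536, "soft": 1844, "hard": 2048},
    "mod-data-export-worker": {"metaspace": 512, "xmx": 2048, "soft": 2600, "hard": 3072},
    "mod-feesfines": {"metaspace": 128, "xmx": 768, "soft": 896, "hard": 1024},
    "mod-users": {"metaspace": 128, "xmx": 768, "soft": 896, "hard": 1024},
    "okapi": {"metaspace": 512, "xmx": 922, "soft": 1360, "hard": 1512},
}


def headroom_mb(params):
    """Hard limit minus (Xmx + max metaspace), in MB."""
    return params["hard"] - (params["xmx"] + params["metaspace"])


def headroom_report(modules):
    """Return {module name: headroom in MB} for every module."""
    return {name: headroom_mb(p) for name, p in modules.items()}


if __name__ == "__main__":
    for name, spare in headroom_report(MODULES).items():
        print(f"{name}: {spare} MB between Xmx + metaspace and the hard limit")
```

With these numbers, mod-data-export-spring has no headroom at all (1536 + 512 equals its 2048 MB hard limit), which may be worth keeping in mind when reading its memory graphs.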
High Level Summary
Execution times for all jobs, from 100 records to 2000 records, lie between 9 and 12 seconds. We performed 69 jobs, all of which completed successfully.
Job # | Records | Time |
---|---|---|
1 | 100 | 11s |
2 | 200 | 11s |
3 | 300 | 10s |
4 | 400 | 9s |
5 | 500 | 9s |
6 | 1000 | 10s |
7 | 1200 | 9s |
8 | 1500 | 9s |
9 | 1700 | 9s |
10 | 2000 | 9s |
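To make the trend in the table easier to see, the sketch below derives records per second for each listed job. The names `RESULTS` and `throughput` are illustrative only, and the figures are wall-clock job times from the table, not per-record processing costs. Because the job time stays nearly flat while the record count grows twentyfold, the derived throughput mostly reflects a fixed per-job overhead, which is one plausible reading rather than a measured conclusion.

```python
# (records, seconds) pairs copied from the results table above.
RESULTS = [
    (100, 11), (200, 11), (300, 10), (400, 9), (500, 9),
    (1000, 10), (1200, 9), (1500, 9), (1700, 9), (2000, 9),
]


def throughput(records, seconds):
    """Records handled per second of wall-clock job time."""
    return records / seconds


if __name__ == "__main__":
    for records, seconds in RESULTS:
        rate = throughput(records, seconds)
        print(f"{records:>5} records in {seconds:>2} s: {rate:6.1f} records/s")
```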
Resource Usage
Note: Each spike on the CPU graph corresponds to one test. CPU usage increases with the number of records.
Note: Memory usage on mod-feesfines grows over the set of tests, from 31% to 43%, which looks like a memory leak. Ticket to investigate the potential memory leak: PERF-357
The same graph at a different scale:
Note: Maximum CPU usage on the RDS database was 10%.
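For the mod-feesfines memory growth noted above (31% to 43%), the sketch below spells out the arithmetic. The helper names `growth` and `points_to_mb` are illustrative. The report does not state what 100% on the memory graph corresponds to, so the conversion to MB assumes, purely for illustration, that it is the module's 1024 MB hard limit; with a different baseline the MB figure changes accordingly.

```python
def growth(start_pct, end_pct):
    """Return (percentage-point increase, relative increase in percent)."""
    points = end_pct - start_pct
    relative = points / start_pct * 100
    return points, relative


def points_to_mb(points, base_mb):
    """Convert percentage points to MB for an ASSUMED 100% baseline."""
    return points / 100 * base_mb


if __name__ == "__main__":
    points, relative = growth(31, 43)
    print(f"+{points} percentage points ({relative:.1f}% relative growth)")
    # ASSUMPTION: 100% on the graph equals the 1024 MB hard limit.
    print(f"about {points_to_mb(points, 1024):.0f} MB if 100% = 1024 MB")
```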