Overview
This page investigates Aurora Serverless performance by comparing it with a database configured with the xlarge DB instance type. Data Import (DI) was run on the oasl-pvt cluster with the Check-in Check-out (CICO) script running in the background.
For Aurora Serverless, an additional set of DI tests without CICO was run after the Deployments and tasks parameter was changed from 2 to 4 in mod-inventory, mod-inventory-storage, mod-circulation, and mod-circulation-storage.
Summary
A set of tests were carried out in accordance with the conditions described in
The Serverless v2 (32 - 128 ACUs) DB configuration performs much better than (0.5 - 128 ACUs) because of its higher minimum capacity, and its performance is closer to 8xlarge (for example, the 10k DI took 00:07:17 versus 00:10:07). However, to cut costs more efficiently, it is better to use (0.5 - 128 ACUs) for the DB reader instance role.
In addition, the CICO tests on db.r6g.xlarge used more DB connections than on any other DB instance type (430 - 460 versus 364 - 396 in the CICO results table). The results in the summary table show better response times for runs with 20 users, and no significant differences between DB instance types. High latency was observed in all tests. I could observe that xlarge
To capture additional data from Performance Insights during DI with a 50K file, three DI runs were carried out on different DB instance types. All snapshots are located in the Average Active Sessions table.
The table includes test results from runs on different database instance types. Here we observe that RDS CPU utilization for db.r6g.xlarge reaches its maximum values (93 - 96%) and that test duration grows roughly in proportion to file size.
After the database was switched to Aurora Serverless, RDS CPU did not exceed 25% for any file size in the DI-only runs (26% at most with CICO in the background). Relative to db.r6g.xlarge, test duration tends to improve as file size grows, because Serverless scales up to more ACUs under load.
DI CICO Results
# | Test | Users | File - Records | Duration (CICO) | 8xlarge: RDS max CPU utilization, % | 8xlarge: Duration | xlarge (db.r6g.xlarge): RDS max CPU utilization, % | xlarge (db.r6g.xlarge): Duration | Serverless v2 (0.5 - 128 ACUs): RDS max CPU utilization, % | Serverless v2 (0.5 - 128 ACUs): Duration | Serverless v2 (32 - 128 ACUs): RDS max CPU utilization, % | Serverless v2 (32 - 128 ACUs): Duration |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | DI | | 10k | | 37 27 | 00:05:15 00:03:21 | 96 | 00:09:59 | 17 | 00:10:07 | 16 | 00:07:17 |
1 | DI | | 25k | | 45 30 | 00:10:04 00:08:08 | 96 | 00:18:19 | 24 | 00:13:43 | 22 | 00:11:44 |
1 | DI | | 50k | | 30 | 00:15:54 | 93 | 00:37:05 | 25 | 00:22:57 | 24 | 00:20:01 |
2 | CICO + DI | 20 | 10k | 90 min | 39 | 00:04:32 | 94 | 00:08:08 | 19 | 00:09:12 | | |
2 | CICO + DI | 20 | 25k | 90 min | 47 | 00:09:01 | 96 | 00:19:21 | 26 | 00:14:30 | | |
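The durations in the table above are easier to compare as ratios. The following is a minimal sketch, not part of the original test tooling, that converts the DI-only `hh:mm:ss` durations into seconds and expresses each Serverless duration as a percentage change relative to xlarge. The names `DURATIONS`, `to_seconds` and `pct_change`, and the short configuration keys, are illustrative assumptions.

```python
# DI-only durations copied from the results table above (hh:mm:ss).
# The dictionary keys are shorthand labels chosen for this sketch.
DURATIONS = {
    "10k": {"xlarge": "00:09:59", "serverless_0.5_128": "00:10:07", "serverless_32_128": "00:07:17"},
    "25k": {"xlarge": "00:18:19", "serverless_0.5_128": "00:13:43", "serverless_32_128": "00:11:44"},
    "50k": {"xlarge": "00:37:05", "serverless_0.5_128": "00:22:57", "serverless_32_128": "00:20:01"},
}


def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pct_change(baseline: str, candidate: str) -> float:
    """Percentage change of candidate vs. baseline; negative means faster."""
    base = to_seconds(baseline)
    return round((to_seconds(candidate) - base) / base * 100, 1)


if __name__ == "__main__":
    for size, row in DURATIONS.items():
        for config in ("serverless_0.5_128", "serverless_32_128"):
            print(size, config, pct_change(row["xlarge"], row[config]))
```

With these inputs, the (0.5 - 128 ACUs) configuration goes from about 1% slower than xlarge on the 10k file to about 38% faster on the 50k file, which matches the observation that the gap widens as file size grows.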
CPU Utilization DI and DI+CICO | |||
---|---|---|---|
 | 8xlarge | xlarge | serverless |
RDS | CPU spikes at the beginning of the tests and returns to normal after they finish. Test date: 2023-05-29 | For the xlarge database instance type, CPU was at its maximum, but this did not affect DI in any way, and it ran successfully | For serverless, CPU was stable and did not exceed 25% |
Service | Data imports ran during CICO. The services were stable and returned to their normal state after the tests | The CICO background process did not affect DI, and it worked as expected | Services were stable |
CICO Results
Additional set of tests in accordance with
Testing results for CICO. Test date: 2023-06-02. LG: us-west-2a
# | Test | Users | Duration (CICO) | db.r6g.xlarge: RDS max CPU utilization | db.r6g.xlarge: DB connections | db.r6g.8xlarge: RDS max CPU utilization | db.r6g.8xlarge: DB connections | Serverless v2 (0.5 - 128 ACUs): RDS max CPU utilization | Serverless v2 (0.5 - 128 ACUs): ACUs | Serverless v2 (0.5 - 128 ACUs): DB connections | Serverless v2 (32 - 128 ACUs): RDS max CPU utilization | Serverless v2 (32 - 128 ACUs): ACUs | Serverless v2 (32 - 128 ACUs): DB connections |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | CICO | 8 | 30 min | 16 | 460 | 2 | 364 | 2.5 | 7.5 | 380 | 1.5 | 32 | 380 |
1 | CICO | 20 | 30 min | 21 | 430 | 2.5 | 378 | 4.7 | 6.2 | 396 | 2 | 32 | 380 |
CICO
db.r6g.xlarge | db.r6g.8xlarge | Serverless v2 (0.5 - 128 ACUs) | Serverless v2 (32 - 128 ACUs) | |
---|---|---|---|---|
Response Times Over Time | 8 users 20 users | 8 users 20 users | 8 users 20 users | 8 users 20 users |
Throughput | 8 users 20 users | 8 users 20 users | 8 users 20 users | 8 users 20 users |
RDS CPU utilization | 8 users 20 users | 8 users 20 users | 8 users 20 users | 8 users 20 users |
Service CPU utilization | 8 users 20 users | 8 users 20 users | 8 users 20 users | 8 users 20 users |
Summary table for CICO
DB instance type | Requests | 8 users: % KO | 8 users: 75th pct | 8 users: 95th pct | 8 users: Average | 8 users: Latency | 20 users: % KO | 20 users: 75th pct | 20 users: 95th pct | 20 users: Average | 20 users: Latency |
---|---|---|---|---|---|---|---|---|---|---|---|
db.r6g.xlarge | Check-In Controller | 0 | 2.878 | 3.114 | 2.785 | 2.16 | 0 | 2.889 | 3.116 | 2.784 | 2.118 |
db.r6g.xlarge | Check-Out Controller | 9.173 | 4.103 | 4.526 | 3.948 | 3.212 | 13.786 | 4.061 | 4.422 | 3.862 | 3.079 |
db.r6g.8xlarge | Check-In Controller | 0 | 2.946 | 3.203 | 2.849 | 2.17 | 0 | 2.914 | 3.121 | 2.805 | 2.107 |
db.r6g.8xlarge | Check-Out Controller | 10.419 | 4.178 | 4.565 | 3.973 | 3.239 | 13.683 | 4.075 | 4.434 | 3.875 | 3.112 |
Serverless v2 (0.5 - 128 ACUs) | Check-In Controller | 0 | 3.088 | 3.372 | 2.99 | 2.361 | 0 | 2.971 | 3.214 | 2.86 | 2.24 |
Serverless v2 (0.5 - 128 ACUs) | Check-Out Controller | 9.255 | 4.465 | 4.862 | 4.268 | 3.453 | 13.099 | 4.236 | 4.696 | 4.039 | 3.291 |
Serverless v2 (32 - 128 ACUs) | Check-In Controller | 0 | 2.972 | 3.238 | 2.86 | 2.212 | 0 | 2.933 | 3.149 | 2.825 | 2.135 |
Serverless v2 (32 - 128 ACUs) | Check-Out Controller | 10.545 | 4.191 | 4.652 | 3.998 | 3.274 | 13.477 | 4.106 | 4.525 | 3.915 | 3.174 |
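To put a number on how little the DB instance types differ in the CICO runs, the spread can be computed directly from the summary table. The sketch below is illustrative only: it takes the 95th percentile values for 20 users from the table above and reports, per request, how far the slowest configuration is from the fastest, in percent. The names `P95_20_USERS` and `spread_pct` are hypothetical.

```python
from typing import Mapping

# 95th percentile response times for 20 users, copied from the summary table.
P95_20_USERS = {
    "Check-In Controller": {
        "db.r6g.xlarge": 3.116,
        "db.r6g.8xlarge": 3.121,
        "Serverless v2 (0.5 - 128 ACUs)": 3.214,
        "Serverless v2 (32 - 128 ACUs)": 3.149,
    },
    "Check-Out Controller": {
        "db.r6g.xlarge": 4.422,
        "db.r6g.8xlarge": 4.434,
        "Serverless v2 (0.5 - 128 ACUs)": 4.696,
        "Serverless v2 (32 - 128 ACUs)": 4.525,
    },
}


def spread_pct(values: Mapping[str, float]) -> float:
    """Gap between the slowest and fastest configuration, in % of the fastest."""
    fastest = min(values.values())
    slowest = max(values.values())
    return round((slowest - fastest) / fastest * 100, 1)


if __name__ == "__main__":
    for request, by_db in P95_20_USERS.items():
        print(request, spread_pct(by_db))
```

For these values the spread is about 3% for Check-In and about 6% for Check-Out.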
Comparison table for response times during 10k and 25k Data Import
Relative to db.r6g.xlarge, Serverless response times improve for bigger files during DI. Delta shows the difference in %, where a positive value means Serverless was faster.
DI file | Requests | db.r6g.xlarge: 75th pct | db.r6g.xlarge: 95th pct | db.r6g.xlarge: Average | Serverless: 75th pct | Serverless: 95th pct | Serverless: Average | delta, 75% | delta, 95% |
---|---|---|---|---|---|---|---|---|---|
10k DI | Check-In Controller | 3.218 | 3.71 | 3.138 | 3.347 | 3.867 | 3.118 | -4.01 | -4.23 |
10k DI | Check-Out Controller | 4.989 | 6.361 | 4.834 | 5.006 | 5.986 | 4.602 | -0.34 | 5.90 |
25k DI | Check-In Controller | 3.249 | 3.665 | 3.076 | 3.134 | 3.398 | 2.99 | 3.54 | 7.29 |
25k DI | Check-Out Controller | 5.246 | 6.298 | 4.666 | 4.719 | 5.19 | 4.333 | 10.05 | 17.59 |
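The page does not state how the delta columns were calculated. The values in the table are consistent with (db.r6g.xlarge - Serverless) / db.r6g.xlarge × 100, so the sketch below uses that formula as an assumption inferred from the numbers. The function name `delta_pct` is hypothetical.

```python
def delta_pct(xlarge: float, serverless: float) -> float:
    """Relative difference in %, positive when Serverless is faster.

    Assumed formula, inferred from the table: (xlarge - serverless) / xlarge * 100.
    """
    return round((xlarge - serverless) / xlarge * 100, 2)


if __name__ == "__main__":
    # Check-Out Controller, 95th pct, values copied from the table above.
    print(delta_pct(6.361, 5.986))  # 10k DI
    print(delta_pct(6.298, 5.19))   # 25k DI
```

Applied to the Check-Out Controller 95th percentile, this gives 5.9 for the 10k file and 17.59 for the 25k file, matching the delta, 95% column.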
Average Active Sessions for DI with 50k file
Serverless v2 (0.5 - 128 ACUs) | db.r6g.8xlarge | db.r6g.xlarge |
---|---|---|
Example of growing ACUs for data import
Aurora Capacity Units |
---|
serverless |
Test date: 2023-05-31. ACUs grow in accordance with load and gradually scale down once the load is gone |
Response times for all DB configurations
db.r6g.8xlarge
Period | Requests | 75th pct | 95th pct | Average |
---|---|---|---|---|
All | Check-In Controller | 2.901 | 3.103 | 2.792 |
All | Check-Out Controller | 4.255 | 4.767 | 3.956 |
Before 10K DI | Check-In Controller | 2.866 | 3.128 | 2.772 |
Before 10K DI | Check-Out Controller | 4.212 | 4.6 | 4.017 |
During 10K DI | Check-In Controller | 2.936 | 3.232 | 2.827 |
During 10K DI | Check-Out Controller | 4.333 | 4.728 | 4.088 |
During 25K DI | Check-In Controller | 2.933 | 3.138 | 2.815 |
During 25K DI | Check-Out Controller | 4.352 | 4.787 | 4.065 |
After 25K DI | Check-In Controller | 2.893 | 3.064 | 2.764 |
After 25K DI | Check-Out Controller | 4.259 | 4.731 | 3.902 |
db.r6g.xlarge
Period | Requests | % KO | 75th pct | 95th pct | Average | Latency |
---|---|---|---|---|---|---|
All | Check-In Controller | 0 | 3.053 | 3.472 | 2.942 | 2.506 |
All | Check-Out Controller | 43.379 | 4.656 | 5.824 | 4.284 | 4.343 |
Before 10K DI | Check-In Controller | 0 | 2.904 | 3.2 | 2.837 | 2.199 |
Before 10K DI | Check-Out Controller | 9.188 | 4.322 | 4.9 | 4.205 | 3.474 |
During 10K DI | Check-In Controller | 0 | 3.218 | 3.71 | 3.138 | 2.726 |
During 10K DI | Check-Out Controller | 16.061 | 4.989 | 6.36 | 4.834 | 4.914 |
During 25K DI | Check-In Controller | 0 | 3.249 | 3.67 | 3.076 | 2.672 |
During 25K DI | Check-Out Controller | 36.691 | 5.246 | 6.3 | 4.666 | 4.841 |
After 25K DI | Check-In Controller | 0 | 2.952 | 3.17 | 2.856 | 2.242 |
After 25K DI | Check-Out Controller | 67.369 | 4.271 | 4.83 | 3.935 | 3.427 |
Serverless
Period | Requests | % KO | 75th pct | 95th pct | Average | Latency |
---|---|---|---|---|---|---|
Before 10K DI | Check-In Controller | 0 | 2.992 | 3.315 | 2.888 | 2.33 |
Before 10K DI | Check-Out Controller | 13.753 | 4.382 | 4.923 | 4.176 | 3.481 |
During 10K DI | Check-In Controller | 0 | 3.347 | 3.867 | 3.118 | 2.854 |
During 10K DI | Check-Out Controller | 15.459 | 5.006 | 5.986 | 4.602 | 4.506 |
During 25K DI | Check-In Controller | 0 | 3.134 | 3.398 | 2.99 | 2.45 |
During 25K DI | Check-Out Controller | 27.453 | 4.719 | 5.19 | 4.333 | 3.786 |
After 25K DI | Check-In Controller | 0 | 2.961 | 3.164 | 2.85 | 2.237 |
After 25K DI | Check-Out Controller | 61.16 | 4.351 | 4.892 | 3.984 | 3.461 |
Links to Grafana
Test date: 2023-05-29 - 2023-05-31
Baseline 8xlarge
Baseline xlarge
Aurora Serverless
Test date: 2023-06-02 - 2023-06-06
db.r6g.xlarge
8 users:
20 users:
db.r6g.8xlarge
8 users:
20 users:
Serverless v2 (0.5 - 128 ACUs)
8 users:
20 users:
Serverless v2 (32 - 128 ACUs)
8 users:
20 users:
Configuration
DI
Version of modules: |
---|
Source Record Manager Module (mod-source-record-manager-3.6.2) |
Source Record Storage Module (mod-source-record-storage-5.6.5) |
Inventory Module (mod-inventory-20.0.4) |
Inventory Storage Module (mod-inventory-storage-26.0.0) |
Inventory Update Module (mod-inventory-update-3.0.1) |
Data Import Module (mod-data-import-2.7.1) |
quickMARC (mod-quick-marc-3.0.0) |
CICO
Version of modules: |
---|
Okapi (okapi-5.0.1) |
users (mod-users-19.1.1) |
Remote storage API module (mod-remote-storage-2.0.2) |
Pubsub (mod-pubsub-2.9.1) |
Patron Blocks Module (mod-patron-blocks-1.8.0) |
Inventory Storage Module (mod-inventory-storage-26.0.0) |
Inventory Module (mod-inventory-20.0.4) |
feesfines (mod-feesfines-18.2.1) |
Configuration (mod-configuration-5.9.1) |
Circulation Storage Module (mod-circulation-storage-16.0.0) |
Circulation Module (mod-circulation-23.5.4) |
authtoken (mod-authtoken-2.13.0) |
Kafka version
kafka version | 2.8.0 | q-ty | |
aurora-serverless-test | broker type serverless | kafka.m5.large | 2 |
tenant | broker type | kafka.m5.2xlarge | 2 |
Environment:
- Cluster: oasl (INT acc us-west-2)
- UI endpoint: https://aurora-serverless-test.int.aws.folio.org/
- Okapi endpoint: https://okapi-aurora-serverless-test.int.aws.folio.org/
- Environment is configured to use shared MSK and ES