Data Import on Aurora Serverless with multiple tenants
Overview
This page investigates Aurora Serverless performance by comparing an RDS xlarge instance with an Aurora Serverless v2 configuration under load from Data Import (DI) jobs running on multiple tenants.
Ticket: PERF-628
Summary
- With three parallel data import jobs, overall duration was shorter on the Serverless 0.5-128 ACU configuration than on RDS xlarge:
- Overall DI Create duration decreased from 40 min to 30 min.
- Overall DI Update duration decreased from 53 min to 48 min.
- With the Serverless DB configuration, maximum service CPU utilization (mod-inventory) was higher than with RDS: 88% → 130% for DI Create, 160% → 210% for DI Update. Instance CPU showed the same trend: 34% → 43% for DI Create, 45% → 51% for DI Update.
- With the Serverless DB configuration, maximum DB CPU utilization was lower than with RDS: 96% → 24% for DI Create, 93% → 18% for DI Update.
In conclusion, the Serverless DB configuration shortened overall DI duration in these tests (by 10 min for DI Create and by 5 min for DI Update). According to the results, it could be an adequate replacement for regular RDS.
Results
Load schema:
- The job on the 2nd tenant was started when the job on the 1st tenant reached 30% completion.
- The job on the 3rd tenant was started when the job on the 1st tenant reached 60% completion.
Overall duration: time from the beginning of the job on the 1st tenant until the end of the job on the 3rd tenant.
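The overall duration defined above can be computed from per-tenant job timestamps. The following is a minimal sketch, assuming each job's start and end times are available as `datetime` values. The `JobRun` type, the `overall_duration` function, and the timestamps in the usage lines are hypothetical and are not taken from the test runs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobRun:
    """Start and end time of one DI job (hypothetical structure)."""
    tenant: str
    started: datetime
    finished: datetime


def overall_duration(first: JobRun, third: JobRun) -> timedelta:
    """Time from the start of the 1st tenant's job to the end of the 3rd tenant's job."""
    return third.finished - first.started


# Illustrative timestamps only, not measured values.
first = JobRun("ptf-ncp5-00", datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 20))
third = JobRun("ptf-ncp5-02", datetime(2024, 1, 1, 10, 12), datetime(2024, 1, 1, 10, 35))
print(overall_duration(first, third))  # 0:35:00
```

Because the jobs overlap, the overall duration is shorter than the sum of the individual job durations.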
Job profiles: PTF - Create 2, PTF - Updates Success - 1
DI Job | Tenant | RDS xlarge: Duration | RDS xlarge: max DB CPU | Serverless 0.5-128 ACU: Duration | Serverless 0.5-128 ACU: max DB CPU |
---|---|---|---|---|---|
DI Create 25K | ptf-ncp5-00 | 30 min | 96% | 24 min | 24% |
DI Create 25K | ptf-ncp5-01 | 37 min | | 21 min | |
DI Create 25K | ptf-ncp5-02 | 31 min | | 15 min | |
DI Create 25K | Overall | 40 min | | 30 min (-10 min) | |
DI Update 25K | ptf-ncp5-00 | 23 min | 93% | 24 min | 18% |
DI Update 25K | ptf-ncp5-01 | 47 min* | | 41 min | |
DI Update 25K | ptf-ncp5-02 | 42 min* | | 35 min | |
DI Update 25K | Overall | 53 min | | 48 min (-5 min) | |
* some records were discarded (less than 1%)
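The absolute differences in the Overall rows can also be expressed as relative changes. The following is a minimal sketch using the overall durations from the table above; the `relative_change` helper is an illustrative name, not part of any test tooling.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after` (negative means a decrease)."""
    return (after - before) / before * 100


# Overall durations in minutes: (RDS xlarge, Serverless 0.5-128 ACU).
overall_minutes = {
    "DI Create 25K": (40, 30),
    "DI Update 25K": (53, 48),
}

for job, (rds, serverless) in overall_minutes.items():
    print(f"{job}: {relative_change(rds, serverless):+.1f}%")
# DI Create 25K: -25.0%
# DI Update 25K: -9.4%
```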
Instance CPU Utilization
RDS xlarge
DI Create
Maximum CPU utilization - 34%
DI Update
Maximum CPU utilization - 45%
Serverless 0.5-128 ACU
DI Create
Maximum CPU utilization - 43%
DI Update
Maximum CPU utilization - 51%
Service CPU Utilization
RDS xlarge
DI Create
Maximum CPU utilization - 88% (mod-inventory)
DI Update
Maximum CPU utilization - 160% (mod-inventory)
Serverless 0.5-128 ACU
DI Create
Maximum CPU utilization - 130% (mod-inventory)
DI Update
Maximum CPU utilization - 210% (mod-inventory)
Memory Utilization
RDS xlarge
DI Create
DI Update
Serverless 0.5-128 ACU
DI Create
DI Update
DB CPU Utilization
RDS xlarge
DI Create
Maximum CPU utilization - 96%
DI Update
Maximum CPU utilization - 93%
Serverless 0.5-128 ACU
DI Create
Maximum CPU utilization - 24%
DI Update
Maximum CPU utilization - 18%
DB Connections
RDS xlarge
DI Create
DI Update
Serverless 0.5-128 ACU
DI Create
DI Update
ACU utilization
Serverless 0.5-128 ACU
DI Create
DI Update
Appendix
Environment
PTF environment: ncp5
- 9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker: 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- RDS Configuration 1: 1 writer instance of db.r6.xlarge type
- RDS Configuration 2: 1 writer instance of Serverless v2 (0.5 - 128 ACUs) type
Configuration
Testing approach
DI jobs were run on three tenants: ptf-ncp5-00, ptf-ncp5-01, ptf-ncp5-02.
Load schema:
- The job on the 2nd tenant was started when the job on the 1st tenant reached 30% completion.
- The job on the 3rd tenant was started when the job on the 1st tenant reached 60% completion.
Overall duration was calculated as the time from the beginning of the job on the 1st tenant until the end of the job on the 3rd tenant.
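The staggered start described in the load schema could be automated along the following lines. This is only a sketch under stated assumptions: it does not describe how the jobs were actually launched in these tests. `start_job` and `get_progress` are hypothetical callables standing in for whatever mechanism starts a DI job and reports its completion as a fraction between 0 and 1.

```python
import time
from typing import Callable, Sequence


def run_staggered(
    tenants: Sequence[str],
    start_job: Callable[[str], None],
    get_progress: Callable[[str], float],
    thresholds: Sequence[float] = (0.30, 0.60),
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Start the first tenant's job, then start each later tenant's job
    once the first job's progress reaches the matching threshold."""
    start_job(tenants[0])
    pending = list(zip(thresholds, tenants[1:]))
    while pending:
        progress = get_progress(tenants[0])
        # Start every job whose threshold has been reached.
        while pending and progress >= pending[0][0]:
            _, tenant = pending.pop(0)
            start_job(tenant)
        if pending:
            sleep(poll_seconds)


# Dry run with simulated progress values (no real jobs are started).
simulated = iter([0.10, 0.35, 0.50, 0.70])
started = []
run_staggered(
    ["ptf-ncp5-00", "ptf-ncp5-01", "ptf-ncp5-02"],
    start_job=started.append,
    get_progress=lambda tenant: next(simulated),
    sleep=lambda seconds: None,
)
print(started)  # ['ptf-ncp5-00', 'ptf-ncp5-01', 'ptf-ncp5-02']
```

Polling the first job's progress keeps the trigger points tied to the 30% and 60% marks rather than to fixed wall-clock offsets, which matters because job duration differs between DB configurations.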