Data Import on Aurora Serverless

Overview

This page investigates Aurora Serverless performance by comparing the db.r6g.xlarge, db.r6g.8xlarge, and Aurora Serverless DB instance types under load from Data Import (DI), with Check-in Check-out (CICO) running in the background.

Summary

  • The environment handles the load with all compared DB instance types.
  • No significant differences in CICO response times were observed between the db.r6g.xlarge and serverless instance types.
  • With Aurora Serverless, DI duration is comparatively better for larger DI files.
  • The Serverless v2 (32 - 128 ACUs) configuration performs better from the start than (0.5 - 128 ACUs) because of its higher starting capacity, and its performance is closer to that of 8xlarge. To cut costs, however, (0.5 - 128 ACUs) is the better choice for the DB reader instance role.
  • With Aurora Serverless, RDS CPU utilization did not exceed 25% for any file size. Relative to file size, test duration tends to decrease for larger files because more ACUs are allocated.
  • The duration of DI without CICO did not change after the task count was changed: mod-inventory, mod-inventory-storage, mod-circulation, mod-circulation-storage x 4.
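
The two Serverless v2 capacity ranges compared in the summary differ only in their minimum ACU setting. As a minimal sketch, assuming the cluster is managed with boto3, the ranges could be expressed as arguments to `modify_db_cluster`. The cluster identifier `example-cluster` and the helper name `scaling_kwargs` are hypothetical and are not taken from the test environment.

```python
# Sketch only: "example-cluster" is a hypothetical placeholder identifier.
def scaling_kwargs(cluster_id: str, min_acu: float, max_acu: float) -> dict:
    """Build keyword arguments for boto3's rds.modify_db_cluster()."""
    if min_acu > max_acu:
        raise ValueError("min_acu must not exceed max_acu")
    return {
        "DBClusterIdentifier": cluster_id,
        "ServerlessV2ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
        },
        "ApplyImmediately": True,
    }


# The two ranges compared on this page.
low_floor = scaling_kwargs("example-cluster", 0.5, 128)
high_floor = scaling_kwargs("example-cluster", 32, 128)

# Applying a range needs AWS credentials, so the call is left commented out:
# import boto3
# boto3.client("rds").modify_db_cluster(**high_floor)
```

Keeping the range in one helper makes it easy to give the writer and reader instances different minimums, which is the cost trade-off described in the summary.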

Results

The table shows test results for the different database instance types. RDS CPU utilization for db.r6g.xlarge reaches its maximum values, and test duration grows in proportion to file size.

After the database was switched to Aurora Serverless, RDS CPU utilization did not exceed 25% for any file size. Relative to file size, test duration tends to decrease for larger files because more ACUs are allocated.
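
One way to make the file-size effect concrete is to convert each DI Create duration into records per second. The sketch below uses the db.r6g.xlarge and Serverless v2 (0.5 - 128 ACUs) durations from the results table and assumes that "10k", "25k" and "50k" mean 10,000, 25,000 and 50,000 records. The helper names are illustrative.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def records_per_second(records: int, duration: str) -> float:
    """Average import throughput for one DI run."""
    return records / to_seconds(duration)


# DI Create durations from the results table (assumed record counts as keys).
DI_CREATE = {
    "db.r6g.xlarge": {10_000: "00:09:59", 25_000: "00:18:19", 50_000: "00:37:05"},
    "Serverless v2 (0.5 - 128 ACUs)": {10_000: "00:10:07", 25_000: "00:13:43", 50_000: "00:22:57"},
}

for db, runs in DI_CREATE.items():
    for records, duration in runs.items():
        rate = records_per_second(records, duration)
        print(f"{db}: {records} records in {duration} = {rate:.1f} records/s")
```

Under these inputs, Serverless throughput rises from about 16.5 records/s for the 10k file to about 36.3 records/s for the 50k file, while db.r6g.xlarge stays between about 16.7 and 22.7 records/s.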

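DI CICO Total results

Create
Job profile: Default - Create instance and SRS MARC Bib

Columns are grouped by DB configuration: RDS db.r6g.8xlarge, RDS db.r6g.xlarge, Serverless v2 (0.5 - 128 ACUs), and Serverless v2 (32 - 128 ACUs). "Max CPU" is the maximum RDS CPU utilization.

| # | Test | Users | File - Records | Duration (CICO) | 8xlarge Max CPU | 8xlarge Duration | xlarge Max CPU | xlarge Duration | Serverless (0.5 - 128) Max CPU | Serverless (0.5 - 128) Duration | Serverless (0.5 - 128) ACUs | Serverless (32 - 128) Max CPU | Serverless (32 - 128) Duration |
|---|------|-------|----------------|-----------------|-----------------|------------------|----------------|-----------------|--------------------------------|---------------------------------|-----------------------------|-------------------------------|--------------------------------|
| 1 | DI Create | | 10k | | 37 / 27 | 00:05:15 / 00:03:21 | 96 | 00:09:59 | 17 | 00:10:07 | | 16 | 00:07:17 ↓ 28% |
| | | | 25k | | 45 / 30 | 00:10:04 / 00:08:08 | 96 | 00:18:19 | 24 | 00:13:43 | | 22 | 00:11:44 ↓ 15% |
| | | | 50k | | 30 | 00:15:54 | 93 | 00:37:05 | 25 | 00:22:57 | | 24 | 00:20:01 ↓ 11% |
| 2 | CICO + DI Create | 20 | 10k | 90 min | 39 | 00:04:32 | 94 | 00:08:08 | 19 | 00:09:12 | | | |
| | | | 25k | | 47 | 00:09:01 | 96 | 00:19:21 | 26 | 00:14:30 | | | |
| 3 | CICO DI Create (JP: PTF - Create 2) | 20 | 10k | 90 min | | | 94 | 00:09:56 | 14 | 00:13:22 | 19 | | |
| | | | 25k | | | | 94 | 00:21:06 | 24 | 00:23:49 | 25 | | |
| | CICO DI Update (JP: PTF - Updates Success - 1) | 20 | 10k | 90 min | | | 70 | 00:12:31 | 12 | 00:17:44 | 12 | | |
| | | | 25k | | | | 70 | 00:29:12 | 12 | 00:31:35 | 13 | | |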

RDS CPU Utilization

RDS

  • 8xlarge (test date: 2023-05-25): CPU spikes at the beginning of the tests and returns to normal after they finish.
  • xlarge (test date: 2023-05-29): CPU was at its maximum, but this did not affect DI in any way, and DI completed successfully.
  • serverless (test date: 2023-05-30): CPU was stable and did not exceed 25%.

Service

  • 8xlarge: Data imports during CICO. The services were stable and returned to their normal state after the tests.
  • xlarge: The CICO background process did not affect DI, which worked as expected.
  • serverless: The services worked stably.

CICO resource consumption

While running the CICO tests (PERF-593), I observed that xlarge used more DB connections than any of the other DB instance types. The results in the summary table show better response times for the runs with 20 users, and no significant differences between DB instance types. High latency was observed in all tests.

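Testing results for CICO

Test date: 2023-06-02
LG: us-west-2a

| # | Test | Users | Duration (CICO) | RDS (db.r6g.xlarge) RDS max CPU utilization | RDS (db.r6g.xlarge) DB connections | RDS (db.r6g.8xlarge) RDS max CPU utilization | RDS (db.r6g.8xlarge) DB connections | Serverless v2 (0.5 - 128 ACUs) RDS max CPU utilization | Serverless v2 (0.5 - 128 ACUs) ACUs | Serverless v2 (0.5 - 128 ACUs) DB connections | Serverless v2 (32 - 128 ACUs) RDS max CPU utilization | Serverless v2 (32 - 128 ACUs) ACUs | Serverless v2 (32 - 128 ACUs) DB connections |
|---|------|-------|-----------------|------|------|------|------|------|------|------|------|------|------|
| 1 | CICO | 8 | 30 min | 16 | 460 | 2 | 364 | 2.5 | 7.5 | 380 | 1.5 | 32 | 380 |
| | | 20 | 30 min | 21 | 430 | 2.5 | 378 | 4.7 | 6.2 | 396 | 2 | 32 | 380 |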

CICO Graphs

Graphs were captured for 8 users and 20 users on each DB configuration (db.r6g.xlarge, db.r6g.8xlarge, Serverless v2 (0.5 - 128 ACUs), Serverless v2 (32 - 128 ACUs)):

  • Response Times Over Time
  • Throughput
  • RDS CPU utilization
  • Service CPU utilization

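Summary table for CICO

| DB configuration | Requests | Users | % KO | 75th pct | 95th pct | Average | Latency |
|------------------|----------|-------|------|----------|----------|---------|---------|
| db.r6g.xlarge | Check-In Controller | 8 | 0 | 2.878 | 3.114 | 2.785 | 2.16 |
| db.r6g.xlarge | Check-In Controller | 20 | 0 | 2.889 | 3.116 | 2.784 | 2.118 |
| db.r6g.xlarge | Check-Out Controller | 8 | 9.173 | 4.103 | 4.526 | 3.948 | 3.212 |
| db.r6g.xlarge | Check-Out Controller | 20 | 13.786 | 4.061 | 4.422 | 3.862 | 3.079 |
| db.r6g.8xlarge | Check-In Controller | 8 | 0 | 2.946 | 3.203 | 2.849 | 2.17 |
| db.r6g.8xlarge | Check-In Controller | 20 | 0 | 2.914 | 3.121 | 2.805 | 2.107 |
| db.r6g.8xlarge | Check-Out Controller | 8 | 10.419 | 4.178 | 4.565 | 3.973 | 3.239 |
| db.r6g.8xlarge | Check-Out Controller | 20 | 13.683 | 4.075 | 4.434 | 3.875 | 3.112 |
| Serverless v2 (0.5 - 128 ACUs) | Check-In Controller | 8 | 0 | 3.088 | 3.372 | 2.99 | 2.361 |
| Serverless v2 (0.5 - 128 ACUs) | Check-In Controller | 20 | 0 | 2.971 | 3.214 | 2.86 | 2.24 |
| Serverless v2 (0.5 - 128 ACUs) | Check-Out Controller | 8 | 9.255 | 4.465 | 4.862 | 4.268 | 3.453 |
| Serverless v2 (0.5 - 128 ACUs) | Check-Out Controller | 20 | 13.099 | 4.236 | 4.696 | 4.039 | 3.291 |
| Serverless v2 (32 - 128 ACUs) | Check-In Controller | 8 | 0 | 2.972 | 3.238 | 2.86 | 2.212 |
| Serverless v2 (32 - 128 ACUs) | Check-In Controller | 20 | 0 | 2.933 | 3.149 | 2.825 | 2.135 |
| Serverless v2 (32 - 128 ACUs) | Check-Out Controller | 8 | 10.545 | 4.191 | 4.652 | 3.998 | 3.274 |
| Serverless v2 (32 - 128 ACUs) | Check-Out Controller | 20 | 13.477 | 4.106 | 4.525 | 3.915 | 3.174 |
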


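Comparison table for response times during 10k and 25k Data Import

Relative to db.r6g.xlarge, Serverless response times improve for larger files during DI. Delta shows the difference between RDS (db.r6g.xlarge) and Serverless in %, relative to the db.r6g.xlarge value; a positive delta means Serverless was faster.

| Requests | DI file | RDS (db.r6g.xlarge) 75th pct | RDS (db.r6g.xlarge) 95th pct | RDS (db.r6g.xlarge) Average | Serverless 75th pct | Serverless 95th pct | Serverless Average | delta, 75% | delta, 95% |
|----------|---------|------|------|------|------|------|------|------|------|
| Check-In Controller | 10k DI | 3.218 | 3.71 | 3.138 | 3.347 | 3.867 | 3.118 | -4.01 | -4.23 |
| Check-In Controller | 25k DI | 3.249 | 3.665 | 3.076 | 3.134 | 3.398 | 2.99 | 3.54 | 7.29 |
| Check-Out Controller | 10k DI | 4.989 | 6.361 | 4.834 | 5.006 | 5.986 | 4.602 | -0.34 | 5.90 |
| Check-Out Controller | 25k DI | 5.246 | 6.298 | 4.666 | 4.719 | 5.19 | 4.333 | 10.05 | 17.59 |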
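
The delta columns appear to be the relative difference between the db.r6g.xlarge and Serverless percentiles, taken against the db.r6g.xlarge value. The page does not state the formula, so the sketch below is an assumption, but it is consistent with the delta values listed in the comparison table. The function name `delta_pct` is illustrative.

```python
def delta_pct(baseline: float, candidate: float) -> float:
    """Relative difference in %, positive when the candidate is faster."""
    return round((baseline - candidate) / baseline * 100, 2)


# Check-In Controller, 10k DI, 75th pct: db.r6g.xlarge vs Serverless
print(delta_pct(3.218, 3.347))  # -4.01

# Check-Out Controller, 25k DI, 95th pct: db.r6g.xlarge vs Serverless
print(delta_pct(6.298, 5.19))  # 17.59
```

A negative result means Serverless was slower than db.r6g.xlarge for that percentile, which is the case for the Check-In Controller during the 10k import.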

Average Active Sessions for DI with 50k file

To capture additional data from Performance Insights during DI with a 50K file (PERF-602), three DI operations were carried out, one for each DB instance type: Serverless v2 (0.5 - 128 ACUs), RDS (db.r6g.8xlarge), and db.r6g.xlarge.

Example of growing ACUs for data import

Aurora Capacity Units, serverless

Test date: 2023-05-31

ACUs grow with the load and gradually scale back down once the load is removed.
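
The ACU curve can also be read programmatically. As a minimal sketch, assuming the capacity is taken from the CloudWatch `ServerlessDatabaseCapacity` metric in the `AWS/RDS` namespace, the function below returns the peak ACUs in a time window. The `cloudwatch` argument is expected to be a boto3 CloudWatch client; the cluster identifier and time window in the usage comment are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def peak_acus(cloudwatch, cluster_id: str, start: datetime, end: datetime) -> Optional[float]:
    """Return the highest ACU value reported in the window, or None if no data."""
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="ServerlessDatabaseCapacity",
        Dimensions=[{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        StartTime=start,
        EndTime=end,
        Period=60,  # one datapoint per minute
        Statistics=["Maximum"],
    )
    datapoints = response.get("Datapoints", [])
    return max((point["Maximum"] for point in datapoints), default=None)


# Usage (needs AWS credentials; identifier and window are placeholders):
# import boto3
# client = boto3.client("cloudwatch")
# start = datetime(2023, 5, 31, 10, 0, tzinfo=timezone.utc)
# end = datetime(2023, 5, 31, 11, 0, tzinfo=timezone.utc)
# print(peak_acus(client, "example-cluster", start, end))
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without touching a real AWS account.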

Response times for all DB configurations

Error rate correlates with DI file size: it grows with larger files. The lowest error rate was with Serverless during the 25K DI. All errors are in the Check-Out Controller, for POST_circulation/check-out-by-barcode (Submit_barcode_checkout)_POST_422.

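RDS db.r6g.8xlarge

| Requests | Phase | 75th pct | 95th pct | Average |
|----------|-------|----------|----------|---------|
| Check-In Controller | All | 2.901 | 3.103 | 2.792 |
| Check-In Controller | Before 10K DI | 2.866 | 3.128 | 2.772 |
| Check-In Controller | During 10K DI | 2.936 | 3.232 | 2.827 |
| Check-In Controller | During 25K DI | 2.933 | 3.138 | 2.815 |
| Check-In Controller | After 25K DI | 2.893 | 3.064 | 2.764 |
| Check-Out Controller | All | 4.255 | 4.767 | 3.956 |
| Check-Out Controller | Before 10K DI | 4.212 | 4.6 | 4.017 |
| Check-Out Controller | During 10K DI | 4.333 | 4.728 | 4.088 |
| Check-Out Controller | During 25K DI | 4.352 | 4.787 | 4.065 |
| Check-Out Controller | After 25K DI | 4.259 | 4.731 | 3.902 |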

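RDS db.r6g.xlarge

| Requests | Phase | % KO | 75th pct | 95th pct | Average | Latency |
|----------|-------|------|----------|----------|---------|---------|
| Check-In Controller | All | 0 | 3.053 | 3.472 | 2.942 | 2.506 |
| Check-In Controller | Before 10K DI | 0 | 2.904 | 3.2 | 2.837 | 2.199 |
| Check-In Controller | During 10K DI | 0 | 3.218 | 3.71 | 3.138 | 2.726 |
| Check-In Controller | During 25K DI | 0 | 3.249 | 3.67 | 3.076 | 2.672 |
| Check-In Controller | After 25K DI | 0 | 2.952 | 3.17 | 2.856 | 2.242 |
| Check-Out Controller | All | 43.379 | 4.656 | 5.824 | 4.284 | 4.343 |
| Check-Out Controller | Before 10K DI | 9.188 | 4.322 | 4.9 | 4.205 | 3.474 |
| Check-Out Controller | During 10K DI | 16.061 | 4.989 | 6.36 | 4.834 | 4.914 |
| Check-Out Controller | During 25K DI | 36.691 | 5.246 | 6.3 | 4.666 | 4.841 |
| Check-Out Controller | After 25K DI | 67.369 | 4.271 | 4.83 | 3.935 | 3.427 |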

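Serverless

| Requests | Phase | % KO | 75th pct | 95th pct | Average | Latency |
|----------|-------|------|----------|----------|---------|---------|
| Check-In Controller | Before 10K DI | 0 | 2.992 | 3.315 | 2.888 | 2.33 |
| Check-In Controller | During 10K DI | 0 | 3.347 | 3.867 | 3.118 | 2.854 |
| Check-In Controller | During 25K DI | 0 | 3.134 | 3.398 | 2.99 | 2.45 |
| Check-In Controller | After 25K DI | 0 | 2.961 | 3.164 | 2.85 | 2.237 |
| Check-Out Controller | Before 10K DI | 13.753 | 4.382 | 4.923 | 4.176 | 3.481 |
| Check-Out Controller | During 10K DI | 15.459 | 5.006 | 5.986 | 4.602 | 4.506 |
| Check-Out Controller | During 25K DI | 27.453 | 4.719 | 5.19 | 4.333 | 3.786 |
| Check-Out Controller | After 25K DI | 61.16 | 4.351 | 4.892 | 3.984 | 3.461 |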
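
The error-rate observations for the Check-Out Controller can be checked directly against the % KO values in the db.r6g.xlarge and Serverless tables. The sketch below is illustrative only: the dictionary copies those % KO values per phase, and the helper names are not part of any test tooling.

```python
# % KO for the Check-Out Controller, copied from the per-phase tables.
CHECK_OUT_KO = {
    "db.r6g.xlarge": {
        "Before 10K DI": 9.188,
        "During 10K DI": 16.061,
        "During 25K DI": 36.691,
        "After 25K DI": 67.369,
    },
    "Serverless": {
        "Before 10K DI": 13.753,
        "During 10K DI": 15.459,
        "During 25K DI": 27.453,
        "After 25K DI": 61.16,
    },
}


def lowest_ko(phase: str) -> str:
    """Name of the DB configuration with the lowest % KO in a phase."""
    return min(CHECK_OUT_KO, key=lambda db: CHECK_OUT_KO[db][phase])


def grows_with_file_size(db: str) -> bool:
    """True if % KO is higher during the 25K import than during the 10K import."""
    rates = CHECK_OUT_KO[db]
    return rates["During 10K DI"] < rates["During 25K DI"]


for phase in ("During 10K DI", "During 25K DI"):
    print(phase, "->", lowest_ko(phase))
for db in CHECK_OUT_KO:
    print(db, "error rate grows with file size:", grows_with_file_size(db))
```

With these values, Serverless has the lower % KO during both imports, and the error rate is higher during the 25K import than during the 10K import for both configurations.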