Data Import on Aurora Serverless with multiple tenants

Overview

This page investigates Aurora Serverless performance by comparing the RDS xlarge and Aurora Serverless database configurations under Data Import (DI) load.

Ticket: PERF-628

Summary

  • Tests showed a performance improvement for three parallel data import jobs when comparing the RDS xlarge and Serverless 0.5-128 ACU configurations:
  • DI Create overall duration decreased from 40 min to 30 min.
  • DI Update overall duration decreased from 53 min to 48 min.
  • The Serverless DB configuration consumes more service CPU than the RDS configuration (maximum, mod-inventory): 88% → 130% for DI Create, 160% → 210% for DI Update. Instance CPU shows the same trend: 34% → 43% for DI Create, 45% → 51% for DI Update.
  • The Serverless DB configuration consumes less DB CPU than the RDS configuration: 96% → 24% for DI Create, 93% → 18% for DI Update.

In conclusion, using the serverless DB improved performance: the overall duration was 10 min shorter for DI Create and 5 min shorter for DI Update, with much lower DB CPU utilization. According to these results, it could be an adequate replacement for regular RDS.
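The relative size of the duration improvement can be recomputed from the overall durations reported on this page. The following is a minimal sketch; the helper name `relative_change` is illustrative and not part of any test tooling.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Overall durations in minutes, taken from the results on this page:
# (RDS xlarge, Serverless 0.5-128 ACU)
overall_minutes = {
    "DI Create": (40, 30),
    "DI Update": (53, 48),
}

for job, (rds, serverless) in overall_minutes.items():
    change = relative_change(rds, serverless)
    print(f"{job}: {rds} -> {serverless} min ({change:+.1f}%)")
```

With these inputs the overall duration is 25% shorter for DI Create and about 9% shorter for DI Update.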

Results

Load schema:

  • The job on the 2nd tenant was started when the job on the 1st tenant was 30% complete.
  • The job on the 3rd tenant was started when the job on the 1st tenant was 60% complete.

Overall duration: the time from the beginning of the job on the 1st tenant until the end of the job on the 3rd tenant.

Job profiles: PTF - Create 2, PTF - Updates Success - 1

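| DI Job | Tenant | RDS xlarge: Duration | RDS xlarge: Max DB CPU | Serverless 0.5-128 ACU: Duration | Serverless 0.5-128 ACU: Max DB CPU |
|---|---|---|---|---|---|
| DI Create 25K | ptf-ncp5-00 | 30 min | 96% | 24 min | 24% |
| | ptf-ncp5-01 | 37 min | | 21 min | |
| | ptf-ncp5-02 | 31 min | | 15 min | |
| | Overall | 40 min | | 30 min (-10 min) | |
| DI Update 25K | ptf-ncp5-00 | 23 min | 93% | 24 min | 18% |
| | ptf-ncp5-01 | 47 min* | | 41 min | |
| | ptf-ncp5-02 | 42 min* | | 35 min | |
| | Overall | 53 min | | 48 min (-5 min) | |

* Some records were discarded (less than 1%).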

Instance CPU Utilization

RDS xlarge

DI Create

Maximum CPU utilization - 34%

DI Update

Maximum CPU utilization - 45%

Serverless 0.5-128 ACU

DI Create

Maximum CPU utilization - 43%

DI Update

Maximum CPU utilization - 51%

Service CPU Utilization

RDS xlarge

DI Create

Maximum CPU utilization - 88% (mod-inventory)

DI Update

Maximum CPU utilization - 160% (mod-inventory)

Serverless 0.5-128 ACU

DI Create

Maximum CPU utilization - 130% (mod-inventory)

DI Update

Maximum CPU utilization - 210% (mod-inventory)


Memory Utilization

RDS xlarge

DI Create

DI Update

Serverless 0.5-128 ACU

DI Create

DI Update


DB CPU Utilization

RDS xlarge

DI Create

Maximum CPU utilization - 96%

DI Update

Maximum CPU utilization - 93%

Serverless 0.5-128 ACU

DI Create

Maximum CPU utilization - 24%

DI Update

Maximum CPU utilization - 18%

DB Connections

RDS xlarge

DI Create

DI Update

Serverless 0.5-128 ACU

DI Create

DI Update

ACU utilization

Serverless 0.5-128 ACU

DI Create

DI Update

Appendix

Environment

PTF environment: ncp5

  • m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • RDS Configuration 1: 1 writer instance of db.r6.xlarge type
  • RDS Configuration 2: 1 writer instance of Serverless v2 (0.5 - 128 ACUs) type

Configuration

| Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled |
|---|---|---|---|---|---|---|---|---|---|---|
| mod-inventory-storage-b | 10 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-inventory-storage:26.0.0 | 2 | 2208 | 1952 | 1024 | 1440 | 384 | 512 | false |
| mod-data-import-b | 8 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-data-import:2.7.1 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | false |
| mod-data-import-cs-b | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-data-import-converter-storage:1.16.0-SNAPSHOT.132 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | false |
| mod-source-record-storage-b | 25 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-source-record-storage:5.6.6 | 2 | 4096 | 3688 | 1024 | 3076 | 384 | 512 | false |
| mod-quick-marc-b | 5 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-quick-marc:3.0.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 | false |
| mod-inventory-b | 8 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-inventory:20.0.4 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | false |
| mod-source-record-manager-b | 17 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-source-record-manager:3.6.3 | 2 | 4096 | 3688 | 1024 | 3076 | 384 | 512 | false |

Testing approach

DI jobs were run on three tenants: ptf-ncp5-00, ptf-ncp5-01, ptf-ncp5-02.

Load schema:

  • The job on the 2nd tenant was started when the job on the 1st tenant was 30% complete.
  • The job on the 3rd tenant was started when the job on the 1st tenant was 60% complete.
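The staggered start described by this load schema can be sketched as a small decision function. This page does not describe how the jobs were actually triggered, so the progress-based check below is an assumption, and the names `START_THRESHOLDS` and `tenants_to_start` are hypothetical.

```python
from typing import List, Set

# Completion percentage of the 1st tenant's job at which each tenant's job starts.
# The 30% and 60% thresholds come from the load schema; the rest is illustrative.
START_THRESHOLDS = {
    "ptf-ncp5-00": 0,
    "ptf-ncp5-01": 30,
    "ptf-ncp5-02": 60,
}


def tenants_to_start(first_job_progress: float, already_started: Set[str]) -> List[str]:
    """Return tenants whose jobs should start now, ordered by threshold."""
    ordered = sorted(START_THRESHOLDS.items(), key=lambda item: item[1])
    return [
        tenant
        for tenant, threshold in ordered
        if first_job_progress >= threshold and tenant not in already_started
    ]


# Example: walk through some progress readings of the 1st tenant's job.
started: Set[str] = set()
for progress in (0, 10, 30, 45, 60, 100):
    for tenant in tenants_to_start(progress, started):
        started.add(tenant)
        print(f"1st job at {progress}%: start job on {tenant}")
```

Keeping the decision separate from the code that submits a job means the thresholds can be checked without a running environment.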

Overall duration was calculated as the time from the beginning of the job on the 1st tenant until the end of the job on the 3rd tenant.
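As a minimal sketch of this calculation, the overall duration is the difference between two timestamps. The function name `overall_duration` and the timestamps are illustrative only; they are not measurements from the test runs.

```python
from datetime import datetime, timedelta


def overall_duration(first_job_start: datetime, third_job_end: datetime) -> timedelta:
    """Time from the start of the 1st tenant's job to the end of the 3rd tenant's job."""
    if third_job_end < first_job_start:
        raise ValueError("the 3rd job cannot end before the 1st job starts")
    return third_job_end - first_job_start


# Illustrative timestamps (assumed values, not taken from the test logs).
start = datetime(2023, 1, 1, 10, 0)
end = datetime(2023, 1, 1, 10, 40)
print(overall_duration(start, end))
```

Because the three jobs overlap, the overall duration is shorter than the sum of the per-tenant durations.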
