Overview

...

  • Tests of concurrent list refreshes in the Lists App on 3 tenants showed the following durations:
    • 1.5 min for the 3 concurrent list refreshes test (1 list refresh on each tenant);
    • 2.3 min for the 10 concurrent list refreshes test (3-4 list refreshes on each tenant).
  • The load test for 30 lists (10 lists per tenant) failed due to DB overload (100% of refresh transactions failed). After the test ended, the "isRefreshing" status remained "true" for each list. It was reset manually, directly in the database.
  • During the 10-list test, CPU utilization reached 200% for mod-fqm-manager and 111% for mod-lists. CPU utilization for mod-permissions also exceeded 100% during the 30-list test.
  • Maximum DB CPU utilization reached 83% (writer instance) and 99% (reader instance) during the 30-list test. Compared with tests run with R/W split disabled, RDS CPU utilization did not decrease when DB R/W split was enabled.
  • Memory utilization for mod-permissions increased from 48% to 76% during the tests. No memory leak is suspected in any of the modules.

...

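Transaction | Duration, avg | Release | Tenants | Number of lists | R/W split | Other conditions
Lists App refresh, previous test results* | 10 min 40 sec | [Orchid] | 1 tenant | 10 | disabled |
Lists App refresh, previous test results* | 8.5 min | [Poppy] | 1 tenant | 10 | disabled |
Lists App refresh, previous test results* | 17.7 min | [Poppy] | 1 tenant | 10 | disabled | Testing in parallel with DI and CICO
Lists App refresh, current test results** | 1.5 min | [Poppy] | 3 tenants | 3 | enabled |
Lists App refresh, current test results** | 2.3 min | [Poppy] | 3 tenants | 10 | enabled |
Lists App refresh, current test results** | error | [Poppy] | 3 tenants | 30 | enabled | 1 list refresh failed for the 1st tenant; 8 list refreshes failed for the 3rd tenant***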

...

**Query used in the lists: "Item status != Available". A list refresh returns about 200K records.

***Details on the issue can be found in the "Failed list refresh investigation" section.

Instance CPU Utilization

Service CPU Utilization

...

To investigate the issue, additional tests were conducted for a single tenant (30 lists in parallel) and for multiple tenants (10 lists on each of three tenants in parallel).

Test | Failed list refreshes, 1 tenant (30 lists in parallel) | Failed list refreshes, 3 tenants (10 lists on each in parallel)
1 | 1 list failed | 2 lists failed for the 1st tenant; 10 lists failed for the 3rd tenant
2 | 1 list failed | 1 list failed for the 1st tenant; 8 lists failed for the 3rd tenant
3 | 2 lists failed | 2 lists failed for the 1st tenant; 8 lists failed for the 3rd tenant

! No list refreshes failed for the second tenant in any of the tests.

! After the tests ended, the "isRefreshing" status remained "true" for each of the failed lists. It was reset manually, directly in the database.
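
The failure counts above can be turned into failure rates with a few lines of arithmetic. The sketch below is illustrative only: it assumes the single-tenant counts read as 1, 1 and 2 failed lists for tests 1 to 3, that every test refreshed 30 lists in total, and that the second tenant had 0 failures in each test. The variable and function names are hypothetical and not part of any FOLIO module.

```python
# Failed refresh counts taken from the investigation results above.
# Assumption: each test refreshed 30 lists in total in both setups.
LISTS_PER_TEST = 30

# Tests 1-3, single tenant with 30 lists in parallel.
single_tenant_failures = [1, 1, 2]

# Tests 1-3, three tenants with 10 lists each:
# failures per tenant as (1st, 2nd, 3rd).
multi_tenant_failures = [
    (2, 0, 10),
    (1, 0, 8),
    (2, 0, 8),
]


def failure_rate(failed: int, total: int = LISTS_PER_TEST) -> float:
    """Return the share of failed list refreshes as a percentage."""
    return round(100.0 * failed / total, 1)


single_rates = [failure_rate(f) for f in single_tenant_failures]
multi_rates = [failure_rate(sum(t)) for t in multi_tenant_failures]

for n, (s, m) in enumerate(zip(single_rates, multi_rates), start=1):
    print(f"Test {n}: 1 tenant {s}% failed, 3 tenants {m}% failed")
```

Under these assumptions, the multi-tenant runs lose roughly a third or more of the refreshes, while the single-tenant runs lose only one or two lists out of 30.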

Log messages during the tests:

...

Infrastructure

PTF environment pcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    Name | API Name | Memory | vCPUs | max_connections
    R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731


  • MSK tenant
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
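
As a quick sanity check on the broker settings listed above, the sketch below parses them and derives two facts: the log retention expressed in hours, and whether the default replication factor fits within the number of brokers (Kafka cannot place more replicas of a partition than there are brokers). The property values and broker count are copied from the list above; the helper name `parse_properties` is hypothetical.

```python
# Broker settings copied from the MSK list above.
MSK_PROPERTIES = """
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
"""

# "4 m5.2xlarge brokers in 2 zones"
BROKER_COUNT = 4


def parse_properties(text: str) -> dict:
    """Parse key=value lines into a dict, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


props = parse_properties(MSK_PROPERTIES)

# 480 minutes of retention expressed in hours.
retention_hours = int(props["log.retention.minutes"]) / 60

# The replication factor must not exceed the number of brokers.
replication_ok = int(props["default.replication.factor"]) <= BROKER_COUNT

print(f"Log retention: {retention_hours} h")
print(f"Replication factor fits broker count: {replication_ok}")
```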

...