Folijet - Lotus Snapshot Performance testing

The following resources are used:

  • 6 m4.large EC2 spot instances for the Kubernetes cluster;
  • 1 db.r5.xlarge instance for the RDS service (writer);
  • one m5.large per 2 zones for Kafka on MSK.

Previous Kiwi testing performance results:

Data Import Test Report (Kiwi)

Modules:

Data Import Module (mod-data-import-2.3.0-SNAPSHOT.224)

Source Record Manager Module (mod-source-record-manager-3.3.0-SNAPSHOT.556)

Source Record Storage Module (mod-source-record-storage-5.3.0-SNAPSHOT.394)

Inventory Module (mod-inventory-18.1.0-SNAPSHOT.493)

Inventory Storage Module (mod-inventory-storage-23.0.0-SNAPSHOT.657)

Invoice business logic module (mod-invoice-5.3.0-SNAPSHOT.284)


Initial configuration:

kcp1-mod-data-import: "-Dvertx.logger-delegate-factory-class-name=io.vertx.core.logging.SLF4JLogDelegateFactory -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/ms/mod-data-import.hprof -XX:OnOutOfMemoryError=/usr/ms/heapdump.sh -XX:MaxRAMPercentage=66.0 -XX:MetaspaceSize=384m -XX:MaxMetaspaceSize=384m -Xmx1024m"  (Hard/Soft limits: 2048/1024)

kcp1-mod-source-record-manager: "-Dvertx.logger-delegate-factory-class-name=io.vertx.core.logging.SLF4JLogDelegateFactory -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/ms/mod-source-record-manager.hprof -XX:OnOutOfMemoryError=/usr/ms/heapdump.sh -XX:MaxRAMPercentage=66.0 -XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=512m -Xmx1292m"  (Hard/Soft limits: 2048/1844)

kcp1-mod-source-record-storage: "-Dvertx.logger-delegate-factory-class-name=io.vertx.core.logging.SLF4JLogDelegateFactory -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/ms/mod-source-record-storage.hprof -XX:OnOutOfMemoryError=/usr/ms/heapdump.sh -XX:MaxRAMPercentage=66.0 -XX:MetaspaceSize=384m -XX:MaxMetaspaceSize=384m -Xmx908m"  (Hard/Soft limits: 1440/1296)

kcp1-mod-inventory: "-Dport=8082 -XX:MaxRAMPercentage=66.0 -Dorg.folio.metadata.inventory.storage.type=okapi -XX:MetaspaceSize=182m -XX:MaxMetaspaceSize=182m -Xmx1814m" (Hard/Soft limits: 2880/2592)

kcp1-mod-inventory-storage: "-Dvertx.logger-delegate-factory-class-name=io.vertx.core.logging.SLF4JLogDelegateFactory -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/ms/mod-inventory-storage.hprof -XX:OnOutOfMemoryError=/usr/ms/heapdump.sh -XX:MetaspaceSize=232m -XX:MaxMetaspaceSize=232m -Xmx544m"  (Hard/Soft limits: 846/778)

kcp1-mod-invoice: "-Dvertx.logger-delegate-factory-class-name=io.vertx.core.logging.SLF4JLogDelegateFactory -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/ms/mod-invoice.hprof -XX:OnOutOfMemoryError=/usr/ms/heapdump.sh -XX:MetaspaceSize=88m -XX:MaxMetaspaceSize=88m -Xmx360m" (Hard/Soft limits: 512/360)
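The JVM options above can be cross-checked against the container limits with a few lines of arithmetic. The sketch below is illustrative only: it assumes the Hard/Soft limits are in MiB (the unit is not stated for the initial configuration), copies only the -Xmx and -XX:MaxMetaspaceSize flags from the lines above, and uses helper names (flag_mb, heap_plus_metaspace, report) that are made up for this example. Heap plus metaspace is only a lower bound on the JVM footprint, since thread stacks, direct buffers and the code cache are not counted.

```python
import re

# (relevant JVM flags, hard limit, soft limit), copied from the initial configuration.
# Assumption: limits are MiB, the same unit as the "m" suffix of the JVM flags.
CONFIG = {
    "mod-data-import": ("-XX:MaxMetaspaceSize=384m -Xmx1024m", 2048, 1024),
    "mod-source-record-manager": ("-XX:MaxMetaspaceSize=512m -Xmx1292m", 2048, 1844),
    "mod-source-record-storage": ("-XX:MaxMetaspaceSize=384m -Xmx908m", 1440, 1296),
    "mod-inventory": ("-XX:MaxMetaspaceSize=182m -Xmx1814m", 2880, 2592),
    "mod-inventory-storage": ("-XX:MaxMetaspaceSize=232m -Xmx544m", 846, 778),
    "mod-invoice": ("-XX:MaxMetaspaceSize=88m -Xmx360m", 512, 360),
}


def flag_mb(opts: str, pattern: str) -> int:
    """Return the numeric value of a '<flag><N>m' option, or 0 if it is absent."""
    match = re.search(pattern, opts)
    return int(match.group(1)) if match else 0


def heap_plus_metaspace(opts: str) -> int:
    """Sum of max heap and max metaspace in MiB."""
    return flag_mb(opts, r"-Xmx(\d+)m") + flag_mb(opts, r"-XX:MaxMetaspaceSize=(\d+)m")


def report(config):
    """Return (module, heap+metaspace, hard, soft, fits_soft) for every module."""
    rows = []
    for module, (opts, hard, soft) in config.items():
        total = heap_plus_metaspace(opts)
        rows.append((module, total, hard, soft, total <= soft))
    return rows


for module, total, hard, soft, fits_soft in report(CONFIG):
    status = "within soft limit" if fits_soft else "above soft limit"
    print(f"{module}: heap+metaspace={total} hard={hard} soft={soft} ({status})")
```

A module reported as above its soft limit is not necessarily misconfigured; the check only points at the entries worth a second look.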


Performance-optimized configuration:

MAX_REQUEST_SIZE = 4000000 (for all modules)

2 items for all items

2 partitions for all DI Kafka topics:

Examples:

delete old topic
# ./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --delete --topic perf-eks-folijet.Default.fs09000000.DI_ERROR

recreate topic with "--partitions 2 --replication-factor 1"
# ./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --create --topic perf-eks-folijet.Default.fs09000000.DI_ERROR --partitions 2 --replication-factor 1
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic perf-eks-folijet.Default.fs09000000.DI_ERROR.

get topic info
# ./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --describe --topic perf-eks-folijet.Default.fs09000000.DI_ERROR
Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR PartitionCount: 2   ReplicationFactor: 1    Configs: min.insync.replicas=1,message.format.version=2.6-IV0,unclean.leader.election.enable=true
    Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR Partition: 0    Leader: 1   Replicas: 1 Isr: 1
    Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR Partition: 1    Leader: 2   Replicas: 2 Isr: 2
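Repeating the delete and create steps for every DI topic by hand is error-prone, so the command lines can be generated instead. The following is a minimal sketch that only builds the same kafka-topics.sh invocations shown above and prints them; it does not run anything. The function name recreate_topic_commands, the placeholder host kafka-host:9092 and the EVENT_TYPES list are assumptions for illustration; only DI_ERROR and the topic prefix are taken from the example above.

```python
# Topic prefix taken from the example above.
TOPIC_PREFIX = "perf-eks-folijet.Default.fs09000000."

# Assumption: fill in the real list of DI event types; only DI_ERROR appears in this page.
EVENT_TYPES = ["DI_ERROR"]


def recreate_topic_commands(bootstrap, topic, partitions=2, replication_factor=1):
    """Build the delete and create command lines (as argv lists) for one topic."""
    base = ["./kafka-topics.sh", f"--bootstrap-server={bootstrap}"]
    delete = base + ["--delete", "--topic", topic]
    create = base + [
        "--create", "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]
    return [delete, create]


# "kafka-host:9092" is a hypothetical placeholder for <kafka-ip>:9092.
for event_type in EVENT_TYPES:
    for argv in recreate_topic_commands("kafka-host:9092", TOPIC_PREFIX + event_type):
        print(" ".join(argv))
```

Printing the commands first allows a review before they are executed, which matters because deleting a topic discards its messages.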


  • mod-source-record-manager: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/mod-source-record-manager.hprof -XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=512m -Xmx1292m -XX:+UseG1GC"  (Memory Hard/Soft limits: 2048Mi/1844Mi, CPU Hard/Soft limits: 1024m/512m)
  • mod-source-record-storage: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/mod-source-record-storage.hprof -XX:MetaspaceSize=384m -XX:MaxMetaspaceSize=384m -Xmx908m -XX:+UseG1GC"  (Memory Hard/Soft limits: 1440Mi/1296Mi, CPU Hard/Soft limits: 1024m/512m)
  • mod-inventory: "-Dorg.folio.metadata.inventory.storage.type=okapi -XX:MetaspaceSize=182m -XX:MaxMetaspaceSize=182m -Xmx1814m -XX:+UseG1GC" (Memory Hard/Soft limits: 2880Mi/2592Mi, CPU Hard/Soft limits: 1024m/512m)
  • mod-inventory-storage: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/mod-inventory-storage.hprof -XX:MetaspaceSize=232m -XX:MaxMetaspaceSize=232m -Xmx544m -XX:+UseG1GC"  (Memory Hard/Soft limits: 846Mi/778Mi, CPU Hard/Soft limits: 512m/512m)

Tests:

# | env | profile | records number | time | kafka partition number | module instance number | cpu | description
1 | Kiwi (https://folio-kiwi.dev.folio.org) | default | 1025 | 8 min | 1 | 1 | |
2 | Lotus Perf Rancher | default | 1025 | 23 min | 1 | 1 | 128 | inventory - CPU 200
3 | Lotus Perf Rancher | default | 1025 | 25 min | 1 | 2 | 128 | inventory - CPU 256; 2 pods for every module, except data-import
4 | Lotus Perf Rancher | default | 1025 | 11 min | 1 | 1 | 256 | all modules - CPU 256, except data-import
5 | Lotus Perf Rancher | ptf-create2 | 1025 | 35 min | 1 | 1 | 128 | inventory - CPU 200
6 | Lotus Perf Rancher | ptf-create2 | 1025 | 15 min | 1 | 1 | 256 | all modules - CPU 256, except data-import; 4 errors
7 | Lotus Perf Rancher | ptf-create2 | 1025 | 6 min | 1 | 1 | 512 | all modules - CPU 512, except data-import
8 | Lotus Perf Rancher | ptf-create2 | 1025 | 4 min | 1 | 1 | srm, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
9 | Lotus Perf Rancher | ptf-create2 | 1025 | 4 min | 2 | 1 | srm, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
10 | Lotus Perf Rancher | ptf-create2 | 1025 | 3 min | 2 | 2 | srm, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
11 | Lotus Perf Rancher | PTF - Updates Success - 1 | 1025 | 4 min | 2 | 2 | srm, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
12 | Lotus Perf Rancher | ptf-create2 | 5000 | 56 min | 1 | 1 | 256 | all modules - CPU 256, except data-import
13 | Lotus Perf Rancher | ptf-create2 | 5000 | 16 min | 1 | 1 | srm, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
14 | Lotus Perf Rancher | ptf-create2 | 5000 | 11 min | 2 | 2 | srm, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
15 | Lotus Perf Rancher | ptf-create2 | 5000 | 8 min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
16 | Lotus Perf Rancher | PTF - Updates Success - 1 | 5000 | 13 min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
17 | Lotus Perf Rancher | ptf-create2 | 10000 | 36 min | 1 | 1 | srm, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
18 | Lotus Perf Rancher | ptf-create2 | 10000 | 19 min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import
19 | Lotus Perf Rancher | PTF - Updates Success - 1 | 10000 | 25 min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, inventory 512/1024, other modules 512, except data-import; completed with errors
20 | Lotus Perf Rancher | ptf-create2 | 30000 | 5h 34min | 1 | 1 | 256 | all modules - CPU 256, except data-import
21 | Lotus Perf Rancher | ptf-create2 | 30000 | 1h 4 min | 1 | 1 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
22 | Lotus Perf Rancher | PTF - Updates Success - 1 | 30000 | 1h 31min | 1 | 1 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
23 | Lotus Perf Rancher | ptf-create2 | 30000 | 45 min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
24 | Lotus Perf Rancher | ptf-create2 | 50000 | 1h 19min / 1h 25min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
25 | Lotus Perf Rancher | ptf-create2 | 50000 | 1h 34min | 1 | 1 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
26 | Lotus Perf Rancher | PTF - Updates Success - 1 | 50000 | 2h 17min | 1 | 1 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
27 | Lotus Perf Rancher | ptf-create2 | 100000 | 3h 21min | 1 | 1 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import
28 | Lotus Perf Rancher | PTF - Updates Success - 1 | 100000 | 4h 40min | 1 | 1 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import; 22 errors
29 | Lotus Perf Rancher | ptf-create2 | 100000 | 2h 24min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import; 22 errors
30 | Lotus Perf Rancher | ptf-create2 | 500000 | 12h 42min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import; 61 errors
31 | Lotus Perf Rancher | ptf-create2 | 500000 | 15h 37min | 2 | 2 | srm, srs, inventory 512/1024; 512 | srm, srs, inventory 512/1024, other modules 512, except data-import; 31 errors

Errors by row:

Row 6: 4 errors: "id value already exists in table holdings_record"

Row 19: Completed with errors: org.folio.processing.exceptions.MatchingException "Found multiple records matching"

Row 28: 22 errors → io.vertx.core.impl.NoStackTraceThrowable: Current retry number 1 exceeded or equal given number 1 for the Item update for jobExecutionId '3eff3c04-055c-4663-8730-05985688a911'

Row 29: 22 errors → 11 selected as ERROR without any error messages + 2 + 9

  1. io.vertx.core.impl.NoStackTraceThrowable
  2. id value already exists in table holdings_record/instances/items - (9)
  3. proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.657 http://mod-inventory-storage: Connection was closed: POST /item-storage/items

Row 30: 61 errors:

  • "contributors[0].name" = "null" - (7)
  • Field 'title' is a required field and can not be null - (1)
  • proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.657 http://mod-inventory-storage: Connection was closed - (53)

Row 31: 31 errors:

  • "contributors[0].name" = "null" - (7)
  • Field 'title' is a required field and can not be null - (1)
  • proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.657 http://mod-inventory-storage: Connection was closed - (23)
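To compare runs of different sizes, the durations in the results above can be converted to a records-per-minute rate. The sketch below does this for the ptf-create2 runs with 2 Kafka partitions and 2 module instances; the record counts and times are copied from those results, while the helper names (to_minutes, records_per_minute) are made up for this example. It is a rough comparison aid and does not account for errors or restarts during a run.

```python
import re

# Matches durations written as in the results, e.g. "45 min", "1h 4 min", "12h 42min".
DURATION_RE = re.compile(r"(?:(\d+)\s*h)?\s*(?:(\d+)\s*min)?")


def to_minutes(text: str) -> int:
    """Convert a duration string such as '2h 24min' to whole minutes."""
    match = DURATION_RE.fullmatch(text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


def records_per_minute(records: int, duration: str) -> float:
    """Average import rate for one run."""
    return records / to_minutes(duration)


# (records, time) for ptf-create2 runs with 2 partitions and 2 instances.
RUNS = [
    (1025, "3 min"),
    (5000, "8 min"),
    (10000, "19 min"),
    (30000, "45 min"),
    (100000, "2h 24min"),
    (500000, "12h 42min"),
]

for records, duration in RUNS:
    rate = records_per_minute(records, duration)
    print(f"{records:>7} records in {duration:<10} -> {rate:7.1f} records/min")
```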

Graphs for 1 pod (srm, inventory 512/1024): SRM, inventory, SRS, inventory-storage

Graphs for 2 pods (srm, srs, inventory 512/1024): SRM, inventory, SRS, inventory-storage

Graphs for 24h: SRM, SRS, Inventory, Inventory storage (2 restarts)


Comparison of SRM with CPU 256m and 999m

Graphs: SRM CPU 256m, SRM CPU 999m



Improvement tasks created as a result of the tests:

  1. MODSOURMAN-695
  2. MODSOURCE-461
  3. MODSOURMAN-694
  4. MODINV-643
  5. MODSOURCE-464
  6. MODSOURCE-462
  7. MODINV-646
  8. MODSOURMAN-699
  9. MODINV-642
  10. MODINVSTOR-878
  11. MODSOURCE-467
  12. MODSOURMAN-710
  13. MODSOURMAN-711
  14. MODSOURCE-468
  15. MODINV-648


Attempt 1: Grouping ERROR import messages of 500K records by type:

  1. io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[0].name","value":"null"}]}]} - (6)
  2. io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[2].name","value":"null"}]}]} - (1)
  3. io.vertx.core.impl.NoStackTraceThrowable: Mapped Instance is invalid: [Field 'title' is a required field and can not be null], by jobExecutionId: '851a4c9a-7640-4cfc-813a-ff90973bec9d' and recordId: '31932487-67f6-466e-b037-736dc21975f2' and chunkId: '6e166525-8244-48a3-a367-b5e83ee18aad'  - (1)
  4. io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.657 http://mod-inventory-storage: Connection was closed: POST /holdings-storage/holdings - (7)
    1. /instance-storage/instances - (9)
    2. /item-storage/items - (9)
  5. io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.657 http://mod-inventory-storage: finishConnect(..) failed: Connection refused: mod-inventory-storage.folijet.svc.cluster.local/172.20.83.37:80: POST /holdings-storage/holdings - (7)
    1. /instance-storage/instances - (3)
    2. /item-storage/items - (13)
  6. io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.657 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /holdings-storage/holdings - (5)

Attempt 2: Grouping ERROR import messages of 500K records by type:

  1. io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[0].name","value":"null"}]}]} - 6
  2. io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[2].name","value":"null"}]}]} - 1
  3. io.vertx.core.impl.NoStackTraceThrowable: Mapped Instance is invalid: [Field 'title' is a required field and can not be null], by jobExecutionId: '3d4a1812-8478-4554-b0c1-027afd3ff117' and recordId: '1d251c07-b7a3-4f22-8426-34d415b88f9c' and chunkId: '5769decb-567f-47b3-8013-172c48f8ee83' - 1
  4. io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.666 http://mod-inventory-storage: Connection was closed: POST /holdings-storage/holdings - 6
    1. POST /instance-storage/instances - 4
    2. POST /item-storage/items - 3
  5. io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.666 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /holdings-storage/holdings - 2
  6. io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.0.0-SNAPSHOT.666 http://mod-inventory-storage: finishConnect(..) failed: Connection refused: mod-inventory-storage.folijet.svc.cluster.local/172.20.83.37:80: POST /holdings-storage/holdings - 2
    1. POST /instance-storage/instances - 3
    2. POST /item-storage/items - 1
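The grouping in the two attempts above can be reproduced mechanically once the per-run identifiers are masked, because messages such as the 'title' validation error differ only in their jobExecutionId, recordId and chunkId. The following is a minimal sketch of that idea, not the procedure actually used for this report: the function names (normalize, group_errors) are hypothetical and the sample messages are abbreviated copies of the ones listed above.

```python
import re
from collections import Counter

# UUIDs (jobExecutionId, recordId, chunkId) differ per record and would split groups.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def normalize(message: str) -> str:
    """Mask UUIDs so that messages differing only in identifiers compare equal."""
    return UUID_RE.sub("<uuid>", message).strip()


def group_errors(messages):
    """Count error messages by their normalized text."""
    return Counter(normalize(m) for m in messages)


# Abbreviated sample messages based on the lists above.
SAMPLE = [
    "Mapped Instance is invalid: [Field 'title' is a required field and can not be null], "
    "by jobExecutionId: '851a4c9a-7640-4cfc-813a-ff90973bec9d'",
    "Mapped Instance is invalid: [Field 'title' is a required field and can not be null], "
    "by jobExecutionId: '3d4a1812-8478-4554-b0c1-027afd3ff117'",
    "proxyClient failure: http://mod-inventory-storage: Connection was closed: "
    "POST /item-storage/items",
]

groups = group_errors(SAMPLE)
for text, count in groups.most_common():
    print(f"{count:>3}  {text}")
```

Array indices such as contributors[0] and contributors[2] are deliberately left unmasked, since the lists above count them as separate groups.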


Kafka metrics during 500K DI create


Database metrics during 500K DI create

Update 10K:

org.folio.processing.exceptions.MatchingException:

Failed to encode as JSON: No serializer found for class org.folio.inventory.domain.items.LastCheckIn and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS)
(through reference chain: java.util.ArrayList[3]->org.folio.inventory.domain.items.Item["lastCheckIn"])