The following resources are used:
- 6 m4.large EC2 spot instances for the Kubernetes cluster;
- 1 db.r5.xlarge instance for the RDS service (writer);
- one m5.large per 2 zones for Kafka on MSK.
Previous Lotus testing performance results:
Lotus Snapshot Performance testing
Modules:
Data Import Module (mod-data-import-2.5.0-SNAPSHOT.231)
Source Record Manager Module (mod-source-record-manager-3.4.0-SNAPSHOT.621)
Source Record Storage Module (mod-source-record-storage-5.4.0-SNAPSHOT.426)
Inventory Module (mod-inventory-18.2.0-SNAPSHOT.537) - mod-inventory-18.0.0
Inventory Storage Module (mod-inventory-storage-23.1.0-SNAPSHOT.692)
Data Import Converter Storage (mod-data-import-converter-storage-1.14.0-SNAPSHOT.202)
Invoice business logic module (mod-invoice-5.4.0-SNAPSHOT.306)
Data Export Module (mod-data-export-4.5.0-SNAPSHOT.319)
Performance-optimized configuration:
MAX_REQUEST_SIZE = 4000000 (for all modules)
2 instances for all modules (except mod-data-import)
2 partitions for all DI Kafka topics:
Examples:
delete old topic
# ./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --delete --topic perf-eks-folijet.Default.fs09000000.DI_ERROR
recreate topic with "--partitions 2 --replication-factor 1"
# ./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --create --topic perf-eks-folijet.Default.fs09000000.DI_ERROR --partitions 2 --replication-factor 1
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic perf-eks-folijet.Default.fs09000000.DI_ERROR.
get topic info
# ./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --describe --topic perf-eks-folijet.Default.fs09000000.DI_ERROR
Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR PartitionCount: 2 ReplicationFactor: 1 Configs: min.insync.replicas=1,message.format.version=2.6-IV0,unclean.leader.election.enable=true
Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR Partition: 1 Leader: 2 Replicas: 2 Isr: 2
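The delete, create, and describe steps above can be wrapped in a small helper so the same sequence is applied to every DI topic. This is only a sketch: the `recreate_topic` and `run` helpers, the `KAFKA_BIN`, `BOOTSTRAP`, and `DRY_RUN` variables, and the dry-run default are assumptions made for illustration, and only the `DI_ERROR` topic from the example is listed. By default the commands are printed rather than executed.

```shell
# Hypothetical helper around the kafka-topics.sh commands shown above.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
KAFKA_BIN="${KAFKA_BIN:-./kafka-topics.sh}"
BOOTSTRAP="${BOOTSTRAP:-<kafka-ip>:9092}"   # placeholder, as in the examples above
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# recreate_topic TOPIC [PARTITIONS]: delete, recreate, then describe one topic.
recreate_topic() {
  topic="$1"
  partitions="${2:-2}"
  run "$KAFKA_BIN" --bootstrap-server="$BOOTSTRAP" --delete --topic "$topic"
  run "$KAFKA_BIN" --bootstrap-server="$BOOTSTRAP" --create --topic "$topic" \
      --partitions "$partitions" --replication-factor 1
  run "$KAFKA_BIN" --bootstrap-server="$BOOTSTRAP" --describe --topic "$topic"
}

# Only the topic from the example is listed; add the other DI topics here.
for topic in perf-eks-folijet.Default.fs09000000.DI_ERROR; do
  recreate_topic "$topic" 2
done
```

Topic deletion in Kafka is asynchronous, so a real run may need a short wait between the delete and create steps.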
mod-data-import: -XX:MaxRAMPercentage=85.0 -XX:+UseG1GC / cpu: 128m/192m | memory: 1Gi/1Gi
mod-source-record-manager: -XX:MaxRAMPercentage=65 -XX:MetaspaceSize=120M -XX:+UseG1GC / DB_MAXPOOLSIZE = 15 / DB_RECONNECTATTEMPTS = 3 / DB_RECONNECTINTERVAL = 1000 / cpu: 512m/1024m | memory: 1844Mi / 2Gi
mod-source-record-storage: -XX:MaxRAMPercentage=65 -XX:MetaspaceSize=120M -XX:+UseG1GC / DB_MAXPOOLSIZE = 15 / cpu: 512m/1024m | memory: 1296Mi/1440Mi
mod-inventory: -XX:MaxRAMPercentage=80 -XX:MetaspaceSize=120M -XX:+UseG1GC -Dorg.folio.metadata.inventory.storage.type=okapi / DB_MAXPOOLSIZE = 15 / cpu: 512m/1024m | memory: 2592Mi/2880Mi
mod-inventory-storage: -XX:MaxRAMPercentage=80 -XX:MetaspaceSize=120M -XX:+UseG1GC / DB_MAXPOOLSIZE = 15 / cpu: 512m/1024m | memory: 1024Mi/1200Mi
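To see what heap size these settings imply, the percentage can be applied to the container memory limit. The sketch below assumes that the second value in each `memory: X/Y` pair is the container limit and that the JVM sizes its maximum heap from that limit via `-XX:MaxRAMPercentage`; the `max_heap_mib` helper is hypothetical and rounds down to whole MiB.

```shell
# max_heap_mib LIMIT_MIB PERCENT: approximate max heap in MiB (integer, rounded down).
# Assumption: LIMIT_MIB is the container memory limit seen by the JVM.
max_heap_mib() {
  echo $(( $1 * $2 / 100 ))
}

printf 'mod-data-import            %s MiB\n' "$(max_heap_mib 1024 85)"
printf 'mod-source-record-manager  %s MiB\n' "$(max_heap_mib 2048 65)"
printf 'mod-source-record-storage  %s MiB\n' "$(max_heap_mib 1440 65)"
printf 'mod-inventory              %s MiB\n' "$(max_heap_mib 2880 80)"
printf 'mod-inventory-storage      %s MiB\n' "$(max_heap_mib 1200 80)"
```

The heap is only part of the JVM footprint (metaspace, thread stacks, and direct buffers come on top), so the remaining headroom under the limit matters when diagnosing restarts.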
Tests:
env | profile | records number | time in Morning Glory | time in Lotus | Kafka partition number | module instance number | CPU | description |
---|---|---|---|---|---|---|---|---|
MG Perf Rancher | PTF Create - 2 | 5000 | 7 min | 8 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-10T07:44:27.576+00:00 2022-06-10T07:51:11.140+00:00 |
MG Perf Rancher | PTF Create - 2 | 5000 | 7 min | 8 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ -Ddi.flow.control.enable=false 2022-06-14T10:24:44.093+00:00 2022-06-14T10:31:54.725+00:00 |
MG Perf Rancher | PTF Update - 1 | 5000 | 11 min | 13 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-20T06:18:46.748+00:00 2022-06-20T06:30:02.991+00:00 |
MG Perf Rancher | PTF Create - 2 | 10`000 | 16 min | 19 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-10T07:54:23.720+00:00 2022-06-10T08:08:48.484+00:00 |
MG Perf Rancher | PTF Create - 2 | 10`000 | 16 min | 19 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-14T10:36:41.482+00:00 2022-06-14T10:53:03.556+00:00 |
MG Perf Rancher | PTF Update - 1 | 10`000 | 22 min | 25 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-20T07:07:00.594+00:00 2022-06-20T07:28:54.905+00:00 |
MG Perf Rancher | PTF Create - 2 | 50`000 | 59 min | 1h 25min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-10T08:12:29.178+00:00 2022-06-10T09:11:34.642+00:00 |
MG Perf Rancher | PTF Update - 1 | 50`000 | 1h 42 min | 2h 17min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-20T09:11:41.701+00:00 2022-06-20T10:54:29.378+00:00 |
MG Perf Rancher | PTF Create - 2 | 100`000 | 2h 20min | 2h 24min (22 errors) | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-13T09:30:35.574+00:00 2022-06-13T12:26:52.484+00:00 |
MG Perf Rancher | PTF Update - 1 | 100`000 | 2h 49min | 4h 40min (tests were run with an instance number and partition number of 1) | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 2022-06-21T11:46:43.175+00:00 2022-06-21T14:36:05.532+00:00; 57 errors. Inventory/Inventory-storage errors: io.netty.channel.StacklessClosedChannelException; io.vertx.core.impl.NoStackTraceThrowable: Connection is not active now, current status: CLOSED; io.vertx.core.impl.NoStackTraceThrowable: Timeout |
MG Perf Rancher | PTF Create - 2 | 500`000 | 14h 46min (60 errors) | 15h 37min (31 errors) | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ 60 errors 2022-06-13T14:27:40.568+00:00 2022-06-14T05:14:27.458+00:00 |
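
The durations in the table can be cross-checked against the start and end timestamps recorded in the description column. The following is a minimal sketch, assuming GNU `date` (for the `-d` option) and timestamps that all carry a `+00:00` offset, as they do in this table; `to_epoch` and `duration_hms` are hypothetical helper names.

```shell
# to_epoch TIMESTAMP: seconds since the epoch for e.g. 2022-06-10T07:44:27.576+00:00.
# Fractional seconds and the offset are dropped; the value is parsed as UTC.
# Assumption: GNU date is available (date -d is not portable to BSD/macOS).
to_epoch() {
  ts=$(printf '%s' "${1%%.*}" | tr 'T' ' ')
  date -u -d "$ts" +%s
}

# duration_hms START END: elapsed time between two timestamps as "Xh Ym Zs".
duration_hms() {
  secs=$(( $(to_epoch "$2") - $(to_epoch "$1") ))
  printf '%dh %dm %ds\n' $(( secs / 3600 )) $(( secs % 3600 / 60 )) $(( secs % 60 ))
}

# 5000-record PTF Create - 2 run from the first table row
duration_hms 2022-06-10T07:44:27.576+00:00 2022-06-10T07:51:11.140+00:00
# 100`000-record PTF Update - 1 run
duration_hms 2022-06-21T11:46:43.175+00:00 2022-06-21T14:36:05.532+00:00
```

Because sub-second precision is discarded, results can differ from the true elapsed time by up to one second, which is negligible at the minute granularity used in the table.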
Results before flow control fix: MODSOURMAN-811
env | profile | records number | time in Morning Glory | time in Lotus | Kafka partition number | module instance number | CPU | description |
---|---|---|---|---|---|---|---|---|
MG Perf Rancher | PTF Create - 2 | 5000 | 7 min | 8 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.621 2022-05-27T12:58:30.331+00:00 2022-05-27T13:05:08.683+00:00 |
MG Perf Rancher | PTF Update - 1 | 5000 | 10 min | 13 min | 2 | 2 | 512/1024 | 2022-05-27T13:22:35.123+00:00 2022-05-27T13:32:35.344+00:00 |
MG Perf Rancher | PTF Create - 2 | 10`000 | 21 min / 27 min | 19 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=false; run 1: 2022-05-30T09:51:13.876+00:00 2022-05-30T10:12:33.982+00:00; run 2: 2022-05-31T18:13:05.977+00:00 2022-05-31T18:40:58.928+00:00 |
MG Perf Rancher | PTF Update - 1 | 10`000 | 30 min | 25 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=false 2022-05-31T19:19:46.296+00:00 2022-05-31T19:49:59.651+00:00 |
MG Perf Rancher | PTF Create - 2 | 10`000 | 21 min | 19 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=true 2022-05-31T20:02:06.368+00:00 2022-05-31T20:23:19.490+00:00 |
MG Perf Rancher | PTF Update - 1 | 10`000 | 31 min | 25 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=true 2022-06-01T19:08:11.563+00:00 2022-06-01T19:39:58.803+00:00 |
MG Perf Rancher | PTF Create - 2 | 10`000 | 17 min | 19 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=true 2022-06-03T09:20:07.654+00:00 2022-06-03T09:37:51.631+00:00 |
MG Perf Rancher | PTF Create - 2 | 30`000 | 1h 6 min | 45 min | 2 | 2 | 512/1024 | 2022-05-27T13:37:12.980+00:00 2022-05-27T14:31:52.595+00:00 |
MG Perf Rancher | PTF Update - 1 | 30`000 | 1h 26min | - | 2 | 2 | 512/1024 | 2022-05-27T15:37:33.580+00:00 2022-05-27T17:03:15.702+00:00 |
MG Perf Rancher | PTF Create - 2 | 50`000 | 2h 37 min | 1h 25min | 2 | 2 | 512/1024 | 3 errors: 2022-06-01T19:48:33.977+00:00 2022-06-01T22:25:59.700+00:00 |
60 errors (500K - PTF Create - 2):
Almost all errors involve mod-inventory-storage and are related to its instances not having enough memory (memory: 778Mi/846Mi). The mod-inventory-storage instances were restarted 2 times.
io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[0].name","value":"null"}]}]}
io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[2].name","value":"null"}]}]}
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: Connection was closed: POST /holdings-storage/holdings
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: Connection was closed: POST /instance-storage/instances
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: finishConnect(..) failed: Connection refused: mod-inventory-storage.folijet.svc.cluster.local/172.20.250.48:80: POST /holdings-storage/holdings
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: finishConnect(..) failed: Connection refused: mod-inventory-storage.folijet.svc.cluster.local/172.20.250.48:80: POST /instance-storage/instances
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /holdings-storage/holdings
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /instance-storage/instances
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /item-storage/items