IN PROGRESS


...

In the scope of PERF-531, tests need to be run to answer the following questions:

  • Can the current system accommodate a high use case load (NLA load model investigation and creation)?
  • What happens at peak times when all workflows are running at once?
  • Typical KPIs:
    • Service CPU
    • Service Memory
    • DB CPU
    • DB Memory
  • Response times
  • Durations of long workflows
  • Recommendations to improve on scaling up/out modules to accommodate peak times

...

Tested without Data Import due to PERF-582.

  • The current system can accommodate an average load only without Data Import. With Data Import, several workflows fail with 'HTTP 500 Internal Server Error. If the issue persists, please report it to EBSCO Connect.' on source-storage/records/{id}/formatted requests (PERF-582), and the general response time is up to 2 times longer for all other workflows.
  • Response time for high load (without DI & FYR) in general is the same as for normal load.
  • Service CPU utilization did not exceed 33%, even at the start when all processes began together. Without the spikes at the beginning and at the start of DI jobs, average CPU usage was about 15%. Instance CPU utilization did not exceed 13%.
  • Service memory utilization was stable, and no memory leaks were suspected during tests.
  • Each FYR (Fiscal close - end of FY rollover) job consumes a lot of DB CPU. DB CPU usage is approximately up to 95.2% with FYR and up to 75% without.
  • Increasing the number of database connections to DB_MAXPOOLSIZE = 200 for mod-source-record-manager and mod-source-record-storage had no positive effect on the 'HTTP 500 Internal Server Error. If the issue persists, please report it to EBSCO Connect.' errors, but the duration of the Data Import job was reduced by about half.
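To make the comparison between load profiles concrete, the following sketch computes how much slower a few workflows are under each profile, using average response times copied from the results table in this report. The helper name `slowdown` and the selection of workflows are illustrative assumptions, not part of the original test tooling.

```python
# Average response times in seconds, copied from the results table:
# (high load with FYR, high load, normal load)
AVG_SEC = {
    "Checkin": (1.218, 1.093, 1.054),
    "Users loan renewal": (2.167, 1.605, 1.467),
    "Edit vendor records": (8.626, 5.825, 5.199),
}


def slowdown(value: float, baseline: float) -> float:
    """Return value / baseline, i.e. how many times slower than the baseline."""
    return value / baseline


for name, (with_fyr, high, normal) in AVG_SEC.items():
    print(
        f"{name}: high/normal = {slowdown(high, normal):.2f}x, "
        f"FYR/normal = {slowdown(with_fyr, normal):.2f}x"
    )
```

A ratio close to 1.00 means the workflow is about as fast as under normal load; larger ratios show how strongly FYR affects that workflow.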

Recommendations & Jiras (Optional)

Jiras

PERF-582

...

 Test results from 1st test run (1st, 2nd and 3rd test run results are similar):

Columns 3 to 8 show the total time it takes to complete the workflow.

| Test # | Workflow name | Avg (sec), high load with FYR | 95th pct (sec), high load with FYR | Avg (sec), high load, no DI | 95th pct (sec), high load, no DI | Avg (sec), normal load, no DI | 95th pct (sec), normal load, no DI | Comments |
|---|---|---|---|---|---|---|---|---|
| 1 | Checkin | 1.218 | 1.488 | 1.093 | 1.362 | 1.054 | 1.591 | |
| 2 | Checkout | 2.272 | 2.780 | 1.830 | 2.187 | 1.650 | 1.948 | |
| 3 | View invoices | 0.825 | 1.056 | 0.565 | 0.908 | 0.763 | 0.913 | |
| 4 | Create invoices | 1.365 | 1.639 | 1.076 | 1.341 | 1.174 | 1.370 | |
| 5 | Edit invoices | 1.734 | 2.509 | 0.531 | 1.749 | 1.581 | 1.897 | |
| 6 | Delete invoices | 0.982 | 1.209 | 0.372 | 0.884 | 0.804 | 0.927 | |
| 7 | Approving Invoices | 0.943 | 1.091 | 0.758 | 0.920 | 1.453 | 1.940 | |
| 8 | View Authority records | 0.476 | 0.326 | 0.29 | 0.474 | 0.289 | 0.381 | |
| 9 | View MARC tag table | 1.329 | 2.045 | 1.028 | 1.306 | 0.987 | 1.284 | |
| 10 | View holdings records | 1.917 | 2.521 | 1.507 | 2.056 | 1.526 | 1.922 | |
| 11 | View Bib | 1.027 | 1.359 | 0.808 | 1.171 | 0.841 | 1.168 | |
| 12 | View patron records | 0.699 | 1.106 | 0.526 | 0.899 | 0.566 | 0.883 | |
| 13 | Delete patron records | 1.016 | 1.264 | 0.627 | 1.109 | 0.638 | 1.070 | |
| 14 | Update patron records | 1.412 | 2.138 | 0.976 | 1.615 | 1.043 | 1.625 | |
| 15 | Create patron records | 1.273 | 1.569 | 0.972 | 1.308 | 1.098 | 1.261 | |
| 16 | View Ledger | 0.060 | 0.102 | 0.054 | 0.085 | 0.050 | 0.088 | |
| 17 | Create ledger | 0.723 | 0.923 | 0.661 | 0.798 | 0.616 | 0.761 | |
| 18 | Edit ledger | 0.146 | 0.158 | 0.054 | 0.088 | 0.054 | 0.085 | |
| 19 | Delete a ledger | 0.065 | 0.097 | 0.055 | 0.099 | 0.046 | 0.080 | |
| 20 | Export bib "Default instances export job profile" | 16 sec (5000 records) | - | 13 sec (5000 records) | - | 5 sec (5000 records) | - | |
| 21 | Export holdings "Default holdings export job profile" | 40 sec (5000 records) | - | 37 sec (5000 records) | - | 26 sec (5000 records) | - | |
| 22 | Export authority records "Default authority export job profile" | 8 sec (5000 records) | - | 5 sec (5000 records) | - | 3 sec (5000 records) | - | |
| 23 | DI "DISC HRID match" | - | - | - | - | - | - | |
| 24 | DI "DS LA edeposit records update" | - | - | - | - | - | - | |
| 25 | DI "DISC New edeposit records" | - | - | - | - | - | - | |
| 26 | DI "DISC New NON edeposit records" | - | - | - | - | - | - | |
| 27 | View item records | 1.545 | 1.984 | 1.221 | 1.598 | 1.289 | 1.649 | |
| 28 | update item records | 1.134 | 1.460 | 0.920 | 1.212 | 0.998 | 1.250 | |
| 29 | delete item records | 1.287 | 1.617 | 0.910 | 1.216 | 0.927 | 1.099 | |
| 30 | Monitoring Pick Slips and Requests GET /circulation/requests | 0.443 | 0.571 | 0.365 | 0.544 | 0.359 | 0.480 | |
| 31 | Monitoring Pick Slips and Requests GET /circulation/pick-slips/ | 0.112 | 0.376 | 0.130 | 0.303 | 0.112 | 0.256 | |
| 32 | Monitoring Pick Slips and Requests | 0.233 | 0.233 | 0.112 | 0.112 | 0.303 | 0.303 | |
| 33 | Users loan renewal | 2.167 | 2.480 | 1.605 | 1.865 | 1.467 | 1.661 | |
| 34 | Item-level requests | 0.854 | 1.058 | 0.709 | 0.966 | 0.669 | 0.973 | |
| 35 | View vendor records | 1.095 | 1.939 | 0.740 | 1.308 | 0.713 | 1.165 | |
| 36 | Edit vendor records | 8.626 | 10.208 | 5.825 | 6.578 | 5.199 | 6.190 | |
| 37 | Create vendor records | 1.598 | 1.983 | 1.062 | 1.299 | 1.064 | 1.200 | |
| 38 | Delete vendor records | 0.578 | 0.714 | 0.356 | 0.454 | 0.412 | 0.522 | |
| 39 | Create purchase orders | 2.188 | 2.748 | 1.750 | 1.868 | 1.625 | 1.733 | |
| 40 | View purchase orders | 1.792 | 1.886 | 1.347 | 1.591 | 1.205 | 1.435 | |
| 41 | Edit purchase orders | 3.381 | 3.795 | 2.915 | 3.134 | 2.076 | 2.984 | |
| 42 | Delete purchase orders | 2.093 | 2.093 | 1.684 | 1.818 | 1.432 | 1.830 | |
| 43 | Retrieving instances and holdings | 0.057 | 0.097 | 0.042 | 0.084 | 0.035 | 0.073 | |
| 44 | Edit MARC tag table | 4.380 | 4.919 | 3.430 | 3.939 | 3.424 | 4.257 | |
| 45 | Fiscal close - end of FY rollover | 12 min 30 sec | - | - | - | 11 min | - | |
| 46 | Blacklight: View an inventory record JMeter script | 1.415 | 1.959 | 0.969 | 1.346 | 0.821 | 1.042 | |
| 47 | Blacklight: Create a Request JMeter script | 1.730 | 2.086 | 1.340 | 1.841 | 1.122 | 1.404 | |
| 48 | Blacklight: Create a View Patron record JMeter script | 0.108 | 0.126 | 0.066 | 0.102 | 0.073 | 0.110 | |
| 49 | VIH_View instance holdings details | 1.959 | 2.521 | 1.591 | 1.837 | 1.456 | 1.572 | |
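The Avg and 95th pct figures summarize per-request response times collected by JMeter. As a rough illustration of how such figures can be derived from raw samples, here is a minimal sketch using the nearest-rank percentile method. The sample values are made up for the example, and JMeter's own reports may use a slightly different percentile definition.

```python
import math


def average(samples):
    """Arithmetic mean of the response times."""
    return sum(samples) / len(samples)


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in seconds for a single workflow
samples = [1.02, 1.10, 0.98, 1.35, 1.07, 1.21, 1.05, 1.60, 1.12, 1.09]

print(f"avg = {average(samples):.3f} sec")
print(f"95th pct = {percentile_nearest_rank(samples, 95):.3f} sec")
```

The 95th percentile is more sensitive to slow outliers than the average, which is why both are reported for each workflow.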

Throughput graphs

The throughput graphs for Test #1, Test #2 and Test #3 are almost the same. With FYR, overall throughput decreases, possibly because of the high database resource usage of FYR.

Image Added

Memory Utilization

This graph shows memory usage during the first 3 test runs. No memory leak is suspected in any of the modules.


Image Added



Service CPU Utilization 

Average CPU usage did not exceed 33% for any module. Spikes in CPU usage can be observed for DI modules at the beginning of the Data Import jobs; without DI spikes, average CPU usage was about 15%. Spikes are also observed for mod-authtoken, as in every CICO test.

Image Added

Most CPU-consuming modules: 

  • mod-inventory configuration - 36%33%mod
  • nginx-authtoken okapi - 25%24%
  • mod-datausers - import 22%
  • okapi - 16%21
  • mod-di-converterfinance-storage - 15,6%20%
  • mod-quickauthtoken - marc - 15%mod-finance-storage 10-22% (spiking)
  • pub-okapi - 14%nginx
  • mod-okapi finance - 14%13%okapi - 11%
  • mod-inventory - 12,5%
  • mod-tags -10,6%others - usage less than 10%inventory-storage -12%
  • mod-sourcequick-record-manager - up to 7%marc - 11%
  • mod-source-record-storage - up to 4%invoice-storage -11%
  • others - usage less than - 10%


Instance CPU Utilization

Instance CPU Utilization did not exceed 13%.

Image Added


RDS CPU Utilization 

Each FYR (Fiscal close - end of FY rollover) job consumes a lot of DB CPU (each spike at the start corresponds to an FYR job).

DB CPU usage is approximately up to 95.2% with FYR and up to 75% without.


Image Added

RDS Database Connections


Test #1 to Test #3: 420 connections.
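For context, the observed connection count can be compared with the max_connections value listed for the db.r6g.xlarge instance in the Appendix. The sketch below is simple arithmetic on those two numbers. Whether the 420 connections are counted per instance or across both database instances is not stated in this report, so the result should be treated as indicative only.

```python
OBSERVED_CONNECTIONS = 420  # from the RDS Database Connections section
MAX_CONNECTIONS = 2731      # max_connections for db.r6g.xlarge (Appendix)


def utilization(used: int, limit: int) -> float:
    """Fraction of the connection limit that is in use."""
    return used / limit


share = utilization(OBSERVED_CONNECTIONS, MAX_CONNECTIONS)
print(f"{share:.1%} of max_connections in use")
```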

Image Added

Appendix

Infrastructure

Load generator 

Instance Type: t3.2xlarge (Ram memory in GB available per load generator - 30 GB needed for the test with high load)

PTF environment: ncp3

  • 9 m6i.2xlarge EC2 instances located in us-east-1 (US East, N. Virginia)
  • 2 database  instances, one reader, and one writer

    | Name | API Name | Memory GiB | vCPUs | max_connections |
    |---|---|---|---|---|
    | R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |


Records count :

  • mod_source_record_storage.marc_records_lb = 7300919
  • mod_source_record_storage.raw_records_lb = 7300919
  • mod_source_record_storage.records_lb = 7300919
  • mod_source_record_storage.marc_indexers = 245032159 (all records)
  • mod_source_record_storage.marc_indexers with field_no 010 = 1008129
  • mod_source_record_storage.marc_indexers with field_no 035 = 8968420
  • mod_inventory_storage.authority = 852215
  • mod_inventory_storage.holdings_record = 6091403
  • mod_inventory_storage.instance = 5581816
  • mod_inventory_storage.item = 5705915


  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Kafka topics partitioning: 2 partitions for DI topics

...

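| Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
|---|---|---|---|---|---|---|---|---|
| mod-inventory-storage | 26.0.0 | 1 | 2 | 1024 | 2208 | 1952 | 384 | 1440 |
| mod-inventory | 20.0.4 | 1 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
| mod-tags | 2.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-gobi | 2.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
| mod-remote-storage | 2.0.2 | 1 | 2 | 1024 | 4920 | 4472 | 512 | 3960 |
| mod-invoice-storage | 5.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
| edge-sip2 | 3.0.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-users-bl | 7.5.0 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
| edge-rtac | 2.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-feesfines | 18.2.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-rtac | 3.5.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-erm-usage-harvester | 4.3.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-search | 2.0.1 | 1 | 2 | 400 | 2592 | 2480 | 1024 | 1440 |
| mod-service-interaction | 2.2.2 | 1 | 2 | 256 | 2048 | 1844 | 512 | 1290 |
| edge-ncip | 1.8.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-authtoken | 2.13.0 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
| mod-permissions | 6.3.1 | 2 | 2 | 512 | 1684 | 1544 | 512 | 1024 |
| mod-circulation-storage | 16.0.0 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 896 |
| mod-ncip | 1.13.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-pubsub | 2.9.1 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 922 |
| edge-orders | 2.8.1 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 922 |
| mod-circulation | 23.5.4 | 1 | 2 | 1536 | 2880 | 2592 | 128 | 700 |
| edge-caiasoft | 2.0.0 | 1 | 2 | 128 | 1024 | 896 | - | - |
| mod-data-export | 4.7.1 | 1 | 1 | 1024 | 1024 | 896 | 128 | 768 |
| mod-organizations-storage | 4.5.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
| mod-source-record-storage | 5.6.5 | 1 | 2 | 2048 | 5600 | 5000 | 512 | 3600 |
| mod-copycat | 1.4.0 | 1 | 2 | 896 | 1024 | 896 | 128 | 768 |
| mod-bulk-operations | 1.0.5 | 1 | 2 | 1024 | 3072 | 2600 | 512 | 1536 |
| mod-quick-marc | 3.0.0 | 1 | 1 | 128 | 2288 | 2176 | 512 | 1664 |
| mod-audit | 2.7.0 | 1 | 2 | 1024 | 1024 | 896 | 128 | 768 |
| mod-oai-pmh | 3.11.3 | 1 | 2 | 1024 | 2248 | 2000 | 512 | 1440 |
| edge-connexion | 1.0.6 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-kb-ebsco-java | 3.13.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-patron | 5.5.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-email | 1.15.3 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-password-validator | 3.0.0 | 1 | 2 | 128 | 1440 | 1298 | 512 | 768 |
| mod-login | 7.9.0 | 1 | 2 | 1024 | 1440 | 1298 | 512 | 768 |
| mod-data-export-worker | 3.0.12 | 1 | 2 | 1024 | 3072 | 2600 | 512 | 2048 |
| mod-agreements | 5.5.2 | 1 | 2 | 128 | 3096 | 2580 | 512 | 2048 |
| edge-oai-pmh | 2.6.1 | 1 | 2 | 1024 | 1512 | 1360 | 512 | 1440 |
| mod-eusage-reports | 1.3.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-orders-storage | 13.5.0 | 1 | 2 | 512 | 1024 | 896 | 128 | 700 |
| mod-notify | 3.0.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-source-record-manager | 3.6.2 | 1 | 2 | 2048 | 5600 | 5000 | 512 | 3600 |
| mod-di-converter-storage | 2.0.2 | 2 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-template-engine | 1.18.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-user-import | 3.7.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-finance-storage | 8.4.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
| mod-users | 19.1.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-sender | 1.10.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-graphql | 1.11.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-licenses | 4.3.1 | 1 | 2 | 128 | 2480 | 2312 | 512 | 1792 |
| mod-invoice-b | 5.6.2 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
| mod-event-config | 2.5.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-calendar | 2.4.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-erm-usage | 4.5.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-patron-blocks | 1.8.0 | 1 | 2 | 1024 | 1024 | 896 | 128 | 768 |
| mod-data-import | 2.7.1 | 1 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
| mod-ebsconet | 2.0.0 | 1 | 2 | 128 | 1248 | 1024 | 256 | 700 |
| edge-dematic | 2.0.0 | 1 | 2 | 128 | 1024 | 896 | - | - |
| mod-task-list | 5.0.1 | 1 | 1 | 128 | 1024 | 896 | 128 | 768 |
| mod-courses | 1.4.7 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-inventory-update | 3.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-login-saml | 2.6.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-orders | 12.6.6 | 1 | 2 | 1024 | 2048 | 1440 (Recommended to change to 1544) | 512 | 1024 |
| mod-configuration | 5.9.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-organizations | 1.7.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
| mod-notes | 5.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 322 |
| mod-finance | 4.7.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
| mod-data-export-spring | 2.0.1 | 1 | 1 | 256 | 2048 | 1844 | 256 | 1292 |
| edge-patron | 4.11.0 | 1 | 2 | 256 | 1024 | 896 | 128 | 768 |
| okapi | 5.0.1 | 2 | 3 | 1024 | 1684 | 1440 | 512 | 922 |
| nginx-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | - |
| pub-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | 768 |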
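One quick way to sanity-check the container settings listed above is to compare the JVM limits (Xmx plus MaxMetaspaceSize) with the MemoryReservation of each task. The following sketch does this for a few rows copied from the table. The rule itself is an assumption made for illustration; it ignores thread stacks, direct buffers and other off-heap memory, and it is not a check described in this report.

```python
# (module, MemoryReservation, MaxMetaspaceSize, Xmx), copied from the table
ROWS = [
    ("mod-inventory-storage", 1952, 384, 1440),
    ("mod-permissions", 1544, 512, 1024),
    ("mod-source-record-storage", 5000, 512, 3600),
    ("mod-orders", 1440, 512, 1024),
    ("okapi", 1440, 512, 922),
]


def over_reserved(rows):
    """Return the modules whose Xmx + MaxMetaspaceSize exceeds MemoryReservation."""
    return [
        name
        for name, reservation, metaspace, xmx in rows
        if xmx + metaspace > reservation
    ]


print(over_reserved(ROWS))
```

With these rows, only mod-orders is flagged (1024 + 512 = 1536, which is above 1440). This is the same module for which the table notes a recommendation to raise MemoryReservation to 1544.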


Methodology/Approach

JMeter scripts were used to establish a baseline for normal stress testing of NLA library usage.

Tested with different DI delays:

...

without DI.

Data was gathered from 2 periods: with and without FYR.


  • DI - data import
  • FYR - Fiscal close - end of FY rollover

...