Skip to end of banner
Go to start of banner

PTF- MARC search

Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 16 Next »

Overview

This document contains the results of performance testing for searching SRS records based on the LDR05 value on the OCP3 environment for POST /source-storage/stream/marc-record-identifiers request with the following parameters PERF-703 - Getting issue details... STATUS :

Summary

  • The average time to respond to HTTP requests used in Test 1 and Test 2 for returning 66 records is about 6,4 seconds. The average time to respond to requests used in Test 3 for returning 20,000 records is about 6,9 seconds. The average time to respond to requests used in Test 4 for filtering using data range is about 6,7 seconds;

  • The high response time from the server to the HTTP request is caused by the duration of the SQL query to the database, which is about 6 seconds for all types of queries, ticket for optimization is created;

  • The memory utilization didn`t increase during Test 1- Test 4;

  • The maximal CPU utilization increases up to 3.5% on mod-source-record-storage;

  • RDS CPU utilization was about 80% during Test 1- Test 4, number on DB connection ~ 400;

Recommendations & Jiras

Test Runs 

Test 1. Request method POST, path source-storage/stream/marc-record-identifiers , body data

{"leaderSearchExpression": "p_05 = 'd'", } this request returns 66 records.

Test 2. Request method POST, path source-storage/stream/marc-record-identifiers , body data

{"leaderSearchExpression": "p_05 = 'd'", "deleted": true, "suppressFromDiscovery": true,} this request returns 66 records, generated by DELETE /source-storage/records/${id}.

Test 3. Request method POST, path source-storage/stream/marc-record-identifiers , body data

{"leaderSearchExpression": "p_05 = 'd'", "deleted": true, "suppressFromDiscovery": true,} this request returns 20,000 records, generated by DELETE /source-storage/records/${id}.

Test 4. Request method POST, path source-storage/stream/marc-record-identifiers , body data

{"leaderSearchExpression": "p_05 = 'd'", "deleted": true, "suppressFromDiscovery": true, "fieldsSearchExpression": "005.date in '20141106-20231108'"} this request returns 574 records.

Results

The table contains the main testing metrics. The duration of all of the tests was about 30 minutes.

NAME

TTL REQ, COUNT

ERRORS

MIN (sec)

MEDIAN (sec)

PCT95 (sec)

SQL Query Execution Time (sec)

Test 1

450

0

6,1

6,4

6,9

6,2

Test 2

450

0

6

6,4

6,9

6,2

Test 3

450

0

6,5

6,9

7,4

6,6

Test 4

450

0

6,3

6,7

7,2

6,58

Resource Utilizations

Services CPU Utilizations 

Test 1 Started at 12:15 and finished at about 12:45, the main services that were involved in request processing: mod-authtoken, mod-source-record-storage, okapi-b, nginx-okapi. Maximum CPU utilization was about 2.5%

Test 2 Started at 11:30 and finished at about 12:00, Test 3 Started at 13:50 and finished at about 14:20,
Test 4 Started at 15:50 and finished at about 15:20

Maximum CPU utilization was about 3.5% for mod-source-record-storage

image-20240308-162659.png

Services Memory Utilizations

Test 1. During testing, the memory usage does not increase

Test 2 - Test 3 - Test 4. During testing, the memory usage does not increase

image-20240308-162405.png

Test 2 - Test 3 - Test 4. The memory usage for main CPU Utilization services.

image-20240308-162344.png

RDS CPU Utilization 

During Test 1 RDS CPU usage increased up to 75%;

The maximum RDS CPU usage during Test 2, Test 3, and Test 4 was up to 80% ;

image-20240308-162747.png

RDS Database Connection

During Test 1- Test 4 the average number of database connections was about 400.

image-20240308-162811.png

Database Load

Database load during Test 1

 Click here to expand...

select "alias_94280021"."external_id", "alias_100265999".count from (select distinct "records_lb"."external_id" from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" where ((p_05 = 'd') and "records_lb"."state" = 'ACTUAL' and "records_lb"."suppress_discovery" = false and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_94280021" right outer join (select count(distinct "records_lb"."external_id") from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" where ((p_05 = 'd') and "records_lb"."state" = 'ACTUAL' and "records_lb"."suppress_discovery" = false and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_100265999" on true

Database load during Test 2

image-20240308-163050.pngimage-20240308-164125.pngimage-20240308-163224.png

 Click here to expand...

select "alias_121109780"."external_id", "alias_44493210".count from (select distinct "records_lb"."external_id" from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" where ((p_05 = 'd') and ("records_lb"."state" = 'ACTUAL' or "records_lb"."state" = 'DELETED' or ("records_lb"."state" = 'ACTUAL' and "records_lb"."leader_record_status" in ('d', 's', 'x'))) and "records_lb"."suppress_discovery" = true and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_121109780" right outer join (select count(distinct "records_lb"."external_id") from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" where ((p_05 = 'd') and ("records_lb"."state" = 'ACTUAL' or "records_lb"."state" = 'DELETED' or ("records_lb"."state" = 'ACTUAL' and "records_lb"."leader_record_status" in ('d', 's', 'x'))) and "records_lb"."suppress_discovery" = true and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_44493210" on true

Database load during Test 3

image-20240308-163343.pngimage-20240308-163859.pngimage-20240308-163421.png
 Click here to expand...

select "alias_121109780"."external_id", "alias_44493210".count from (select distinct "records_lb"."external_id" from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" where ((p_05 = 'd') and ("records_lb"."state" = 'ACTUAL' or "records_lb"."state" = 'DELETED' or ("records_lb"."state" = 'ACTUAL' and "records_lb"."leader_record_status" in ('d', 's', 'x'))) and "records_lb"."suppress_discovery" = true and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_121109780" right outer join (select count(distinct "records_lb"."external_id") from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" where ((p_05 = 'd') and ("records_lb"."state" = 'ACTUAL' or "records_lb"."state" = 'DELETED' or ("records_lb"."state" = 'ACTUAL' and "records_lb"."leader_record_status" in ('d', 's', 'x'))) and "records_lb"."suppress_discovery" = true and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_44493210" on true

Database load during Test 4

image-20240308-163552.pngimage-20240308-163756.pngimage-20240308-163653.png
 Click here to expand...

select "alias_5398203"."external_id", "alias_100273909".count from (select distinct "records_lb"."external_id" from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" join "marc_indexers_005" "i005" on "records_lb"."id" = "i005"."marc_id" join "marc_records_tracking" "mrt_005" on ("i005"."marc_id" = "mrt_005"."marc_id" and "i005"."version" = "mrt_005"."version") where ((p_05 = 'd') and (to_date(substring("i005"."value", 1, 8), 'yyyymmdd') between '20141106' and '20231108') and ("records_lb"."state" = 'ACTUAL' or "records_lb"."state" = 'DELETED' or ("records_lb"."state" = 'ACTUAL' and "records_lb"."leader_record_status" in ('d', 's', 'x'))) and "records_lb"."suppress_discovery" = true and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_5398203" right outer join (select count(distinct "records_lb"."external_id") from "records_lb" join "marc_indexers_leader" on "records_lb"."id" = "marc_indexers_leader"."marc_id" join "marc_indexers_005" "i005" on "records_lb"."id" = "i005"."marc_id" join "marc_records_tracking" "mrt_005" on ("i005"."marc_id" = "mrt_005"."marc_id" and "i005"."version" = "mrt_005"."version") where ((p_05 = 'd') and (to_date(substring("i005"."value", 1, 8), 'yyyymmdd') between '20141106' and '20231108') and ("records_lb"."state" = 'ACTUAL' or "records_lb"."state" = 'DELETED' or ("records_lb"."state" = 'ACTUAL' and "records_lb"."leader_record_status" in ('d', 's', 'x'))) and "records_lb"."suppress_discovery" = true and "records_lb"."record_type" = 'MARC_BIB' and "records_lb"."external_id" is not null)) "alias_100273909" on true

Additional information


Long SQL query analysis

 SQL query that take about 6 second

select "alias_94280021"."external_id", "alias_100265999".count
from (select distinct fs09000000_mod_source_record_storage.records_lb.external_id
from fs09000000_mod_source_record_storage.records_lb join fs09000000_mod_source_record_storage.marc_indexers_leader on fs09000000_mod_source_record_storage.records_lb.id = fs09000000_mod_source_record_storage.marc_indexers_leader.marc_id
where ((p_05 = 'd')
and fs09000000_mod_source_record_storage.records_lb.state = 'ACTUAL'
and fs09000000_mod_source_record_storage.records_lb.suppress_discovery = false
and fs09000000_mod_source_record_storage.records_lb.record_type = 'MARC_BIB'
and fs09000000_mod_source_record_storage.records_lb.external_id is not null)) "alias_94280021"
right outer join (select count(distinct fs09000000_mod_source_record_storage.records_lb.external_id)
from fs09000000_mod_source_record_storage.records_lb join fs09000000_mod_source_record_storage.marc_indexers_leader on fs09000000_mod_source_record_storage.records_lb.id = fs09000000_mod_source_record_storage.marc_indexers_leader.marc_id
where ((p_05 = 'd')
and state = 'ACTUAL'
and suppress_discovery = false
and record_type = 'MARC_BIB'
and external_id is not null)) "alias_100265999" on true

 Explain analysts results

"Nested Loop Left Join (cost=1020229.38..1020323.73 rows=3773 width=24) (actual time=5230.369..5231.050 rows=66 loops=1)" " -> Aggregate (cost=510007.34..510007.35 rows=1 width=8) (actual time=4192.762..4192.857 rows=1 loops=1)" " -> Gather (cost=1000.56..509997.90 rows=3773 width=16) (actual time=24.231..4192.687 rows=66 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Nested Loop (cost=0.56..508620.60 rows=1572 width=16) (actual time=151.580..4056.045 rows=22 loops=3)" " -> Parallel Seq Scan on marc_indexers_leader (cost=0.00..490961.97 rows=2074 width=16) (actual time=145.937..3841.444 rows=57 loops=3)" " Filter: ((p_05)::text = 'd'::text)" " Rows Removed by Filter: 9773039" " -> Index Scan using records_lb_pkey on records_lb (cost=0.56..8.51 rows=1 width=32) (actual time=3.739..3.739 rows=0 loops=172)" " Index Cond: (id = marc_indexers_leader.marc_id)" " Filter: ((NOT suppress_discovery) AND (external_id IS NOT NULL) AND (state = 'ACTUAL'::fs09000000_mod_source_record_storage.record_state) AND (record_type = 'MARC_BIB'::fs09000000_mod_source_record_storage.record_type))" " Rows Removed by Filter: 1" " -> Unique (cost=510222.05..510240.91 rows=3773 width=16) (actual time=1037.603..1038.181 rows=66 loops=1)" " -> Sort (cost=510222.05..510231.48 rows=3773 width=16) (actual time=1037.602..1038.168 rows=66 loops=1)" " Sort Key: records_lb_1.external_id" " Sort Method: quicksort Memory: 28kB" " -> Gather (cost=1000.56..509997.90 rows=3773 width=16) (actual time=1.604..1038.113 rows=66 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Nested Loop (cost=0.56..508620.60 rows=1572 width=16) (actual time=10.387..1033.099 rows=22 loops=3)" " -> Parallel Seq Scan on marc_indexers_leader marc_indexers_leader_1 (cost=0.00..490961.97 rows=2074 width=16) (actual time=10.348..1032.530 rows=57 loops=3)" " Filter: ((p_05)::text = 'd'::text)" " Rows Removed by Filter: 9773039" " -> Index Scan using records_lb_pkey on records_lb records_lb_1 (cost=0.56..8.51 rows=1 width=32) (actual time=0.008..0.008 rows=0 loops=172)" " Index Cond: (id = marc_indexers_leader_1.marc_id)" " Filter: ((NOT suppress_discovery) AND (external_id IS NOT NULL) AND (state = 'ACTUAL'::fs09000000_mod_source_record_storage.record_state) AND (record_type = 'MARC_BIB'::fs09000000_mod_source_record_storage.record_type))" " Rows Removed by Filter: 1" "Planning Time: 61.232 ms" "Execution Time: 5231.126 ms"

Line 5 shows that the duration of the p_05 filtering is ~ 4 seconds because it goes through all 29 million records in the table.

image-20240311-102915.png

Appendix

Infrastructure

PTF -environment ocp3

  • 9 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1

  • 1 instance writer of db.r6.xlarge database instances, [db.r6g.xlarge, vCPU 4 RAM 32 GB]

  • MSK ptf-kakfa-3 [ kafka configurations]
    4 m5.2xlarge brokers in 2 zones

    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

Modules memory and CPU parameters

Module

Task Def. Revision

Module Version

Task Count

Mem Hard Limit

Mem Soft limit

CPU units

Xmx

MetaspaceSize

MaxMetaspaceSize

R/W split enabled

pcp1-pvt

mod-authtoken

16

mod-authtoken:2.14.1

2

1440

1152

512

922

88

128

FALSE

mod-permissions

47

mod-permissions:6.5.0-SNAPSHOT.369

2

1684

1544

512

1024

384

512

FALSE

mod-inventory-storage

15

mod-inventory-storage:27.0.4

2

4096

3690

2048

3076

384

512

FALSE

mod-source-record-storage

18

mod-source-record-storage:5.7.5

2

5600

5000

2048

3500

384

512

FALSE

mod-inventory

14

mod-inventory:20.1.7

2

2880

2592

1024

1814

384

512

FALSE

mod-source-record-manager

17

mod-source-record-manager:3.7.8

2

5600

5000

2048

3500

384

512

FALSE

nginx-okapi

9

nginx-okapi:2023.06.14

2

1024

896

128

0

0

0

FALSE

okapi-b

11

okapi:5.1.2

3

1684

1440

1024

922

384

512

FALSE

Methodology/Approach

All tests were run using the JMeter script.

Thread properties: 2 Virtual users with a Ramp-Up period of 1 second were running 225 loops.

Request parameters for:

Test 1.

POST /source-storage/stream/marc-record-identifiers
{
"leaderSearchExpression": "p_05 = 'd'"
}

Test 2 and Test 3.

POST /source-storage/stream/marc-record-identifiers
{
"leaderSearchExpression": "p_05 = 'd'"
"suppressFromDiscovery": true,
"deleted": true
}

Test 4

POST /source-storage/stream/marc-record-identifiers
{
"leaderSearchExpression": "p_05 = 'd'",
"fieldsSearchExpression": "005.date in '20141106-20231108'",
"suppressFromDiscovery": true,
"deleted": true
}
  • No labels