Data Import Create MARC holdings records [Orchid]

Overview

This document contains the results of testing Data Import Create MARC holdings records in Orchid to detect performance trends.

Ticket: PERF-474 - Getting issue details... STATUS

Previous test report: Data Import Create MARC holdings records [Nolana]

Summary

  • Data import jobs duration significantly increased compared to the Nolana results, see TestResults
  • Maximum CPU utilization was observed for mod-inventory-storage module during 80k DI job - 88%, see ServiceCPUUtilization.
  • There is a significant memory utilization increase for some modules during 10k and 80k DI jobs, see MemoryUtilization
  • Maximum service memory utilization reached 97% for mod-inventory-storage during 80k DI job, see MemoryUtilization.
  • During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance, see DBCPUUtilization.

It should be noted that trigger process_marc_records_lb_insert_update_trigger was enabled for mod_source_record_storage.marc_records_lb table to make the environment production-like. It could influence the results in current release. At the same time tests showed that DI jobs duration with and without the trigger stays the same.

Recommendations & Jiras

Issue with failing 80K DI jobs was reproduced in 100% (tested several times). This issues requires further investigation.

Ticket: MODSOURCE-687 - Getting issue details... STATUS

Test Results

Profile used for testing - Default - Create Holdings and SRS MARC Holdings

Test

File

Duration: Morning_Glory

Duration: NolanaDuration: OrchidDiff_absoluteDiff_percentage

1

1k28s32s45s13s+40%
25k1m 48s4m 20s7m 47s3m 27s+79%
310k4m 4s3m 24s19m 46s16m 22s+481%
480k29m 6s21m 22s20m (error*)--

*65K items were discarded, see AdditionalInformation

Instance CPU Utilization

Service CPU Utilization

Top services by CPU utilization (during 80k DI):

  • mod-inventory-storage - 88%
  • mod-inn-reach - 42%
  • mod-inventory - 15%

Memory Utilization

Top services by used memory increase (during 10k and 80k DI):

  • mod-inventory-storage - from 11% to 79%
  • mod-source-record-manager - from 33% to 61%
  • mod-source-record-storage - from 37% to 63%
  • mod-data-import - from 22% to 35%


mod-inventory-storage memory utilizatoin (maximum) during 80k DI job:

DB CPU Utilization

During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance.

DB Connections

DB Load

SQL queries

Top-SQL statement:

SELECT jsonb,id FROM fs09000000_mod_inventory_storage.instance_holdings_item_view WHERE id='947de548-2c4b-443a-8b2d-0bfac571430b' LIMIT 1 OFFSET 0

1k and 5k:


10k and 80k:

Additional Information

Log messages during 80k DI job failure:

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR dpointFailureHandler An error occurred io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] INFO  LogUtil     127.0.0.1:50354 PUT /holdings-storage/holdings/77a4fa76-091d-4429-a8a0-803f75276bf7 null HTTP_1_1 500 60 30000 tid=fs09000000 Internal Server Error

09:48:56 [576440/holdings-sources] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR PgUtil        Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [295007/inventory-view] [fs09000000] [d28a1154-3939-46bc-a114-e03c62aeafe5] [mod_inventory_storage] ERROR PostgresClient   Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

Error messages retrieved from the database (during 80k DI job):

  • Connection is closed
  • io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: db.ncp5.folio-eis.us-east-1/10.23.34.172:5432
  • io.netty.channel.StacklessClosedChannelException
  • io.vertx.core.impl.NoStackTraceThrowable: Couldn't update JobExecution status, JobExecution already marked as ERROR
  • io.vertx.core.impl.NoStackTraceThrowable: Fail to read any response from the server, the underlying connection might get lost unexpectedly.
  • io.vertx.core.impl.NoStackTraceThrowable: Timeout
  • io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • io.vertx.pgclient.PgException: FATAL: the database system is in recovery mode (57P03)
  • java.io.IOException: Connection reset by peer
  • org.folio.processing.exceptions.EventProcessingException: 504 Gateway Time-out
  • org.folio.processing.exceptions.EventProcessingException: Invalid token: Unexpected user response code 500 for user id 9eb67301-6f6e-468f-9b1a-6134dc39a684
  • org.folio.processing.exceptions.EventProcessingException: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • org.folio.processing.exceptions.EventProcessingException: Unable to retrieve permissions for user with id '9eb67301-6f6e-468f-9b1a-6134dc39a684': Unable to retrieve permissions (code 400): Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • This connection has been closed.
  • Timeout

Infrastructure

PTF -environment ncp5 

  • 11 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1  
  • 2 database  instances, one reader, and one writer

    NameAPI NameMemory GIBvCPUsmax_connections
    R6G Extra Largedb.r6g.xlarge32 GiB4 vCPUs2731
  • number of connections for mod-source-record-manager and mod-source-record-storage - 30 connections.
  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Kafka topics partitioning: - 2 partitions for DI topics


Modules memory and CPU parameters

Modules

Version

Task Definition

Running Tasks 

CPU

Memory

MemoryReservation

MaxMetaspaceSize

Xmx

mod-inventory-storage26.0.0122204840963690
512
3076
mod-inventory20.0.61221024288025925121814
mod-source-record-storage5.6.72722048560050005123500
mod-quick-marc3.0.061128228821765121664
mod-source-record-manager3.6.41822048560050005123500
mod-di-converter-storage2.0.5621281024896128768
mod-data-import2.7.1101256204818445121292
okapi5.0.183102416841440512922
nginx-okapi2022.06.14721281024896--
pub-okapi2022.06.14721281024896-768