Skip to end of banner
Go to start of banner

Data Import Create MARC holdings records [Orchid]

Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 7 Next »

Overview

This document contains the results of testing Data Import Create MARC holdings records in Orchid to detect performance trends.

Ticket: PERF-474 - [Orchid] [Data Import] Marc HOLDINGS

Previous test report: Data Import Create MARC holdings records [Nolana]

Summary

  • Maximum CPU utilization was observed for mod-inventory-storage module during 80k DI job - 88%, see .
  • There is a significant memory utilization increase for some modules during 10k and 80k DI jobs, see. 
  • Maximum service memory utilization reached 97% for mod-inventory-storage during 80k DI job, see.
  • During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance, see.
  • Data import jobs duration significantly increased compared to the Nolana results, see. It can be caused by added index on the ncp5 environment on Orchid release to make it production like.


Test Results

Test

File

Duration: Morning_Glory

Duration: NolanaDuration: OrchidDiff_absoluteDiff_percentage

1

1k28s32s45s13s+40%
25k1m 48s4m 20s7m 47s3m 27s+79%
310k4m 4s3m 24s19m 46s16m 22s+481%
480k29m 6s21m 22s20m (error*)--

*65K items were discarded, see [link]

Instance CPU Utilization

Service CPU Utilization

Top services by CPU utilization (during 80k DI):

  • mod-inventory-storage - 88%
  • mod-inn-reach - 42%
  • mod-inventory - 15%

Memory Utilization

Top services by used memory increase (during 10k and 80k DI):

  • mod-inventory-storage - from 11% to 79%
  • mod-source-record-manager - from 33% to 61%
  • mod-source-record-storage - from 37% to 63%
  • mod-data-import - from 22% to 35%

DB CPU Utilization

During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance.

DB Connections

DB Load

SQL queries

Top-SQL statement:

SELECT jsonb,id FROM fs09000000_mod_inventory_storage.instance_holdings_item_view WHERE id='947de548-2c4b-443a-8b2d-0bfac571430b' LIMIT 1 OFFSET 0

1k and 5k:


10k and 80k:

Additional info

Log messages during 80k DI job failure:

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR dpointFailureHandler An error occurred io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] INFO  LogUtil     127.0.0.1:50354 PUT /holdings-storage/holdings/77a4fa76-091d-4429-a8a0-803f75276bf7 null HTTP_1_1 500 60 30000 tid=fs09000000 Internal Server Error

09:48:56 [576440/holdings-sources] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR PgUtil        Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [295007/inventory-view] [fs09000000] [d28a1154-3939-46bc-a114-e03c62aeafe5] [mod_inventory_storage] ERROR PostgresClient   Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

Error messages retrieved from the database (during 80k DI job):

  • Connection is closed
  • io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: db.ncp5.folio-eis.us-east-1/10.23.34.172:5432
  • io.netty.channel.StacklessClosedChannelException
  • io.vertx.core.impl.NoStackTraceThrowable: Couldn't update JobExecution status, JobExecution already marked as ERROR
  • io.vertx.core.impl.NoStackTraceThrowable: Fail to read any response from the server, the underlying connection might get lost unexpectedly.
  • io.vertx.core.impl.NoStackTraceThrowable: Timeout
  • io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • io.vertx.pgclient.PgException: FATAL: the database system is in recovery mode (57P03)
  • java.io.IOException: Connection reset by peer
  • org.folio.processing.exceptions.EventProcessingException: 504 Gateway Time-out
  • org.folio.processing.exceptions.EventProcessingException: Invalid token: Unexpected user response code 500 for user id 9eb67301-6f6e-468f-9b1a-6134dc39a684
  • org.folio.processing.exceptions.EventProcessingException: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • org.folio.processing.exceptions.EventProcessingException: Unable to retrieve permissions for user with id '9eb67301-6f6e-468f-9b1a-6134dc39a684': Unable to retrieve permissions (code 400): Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • This connection has been closed.
  • Timeout

Infrastructure

PTF -environment ncp5 

  • 11 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1  
  • 2 database  instances, one reader, and one writer

    NameAPI NameMemory GIBvCPUsmax_connections
    R6G Extra Largedb.r6g.xlarge32 GiB4 vCPUs2731
  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Kafka topics partitioning: - 2 partitions for DI topics


Modules memory and CPU parameters

Modules

Version

Task Definition

Running Tasks 

CPU

Memory

MemoryReservation

MaxMetaspaceSize

Xmx

mod-inventory-storage26.0.0122204840963690
512
3076
mod-inventory20.0.61221024288025925121814
mod-source-record-storage5.6.72722048560050005123500
mod-quick-marc3.0.061128228821765121664
mod-source-record-manager3.6.41822048560050005123500
mod-di-converter-storage2.0.5621281024896128768
mod-data-import2.7.1101256204818445121292
okapi5.0.183102416841440512922
nginx-okapi2022.06.14721281024896--
pub-okapi2022.06.14721281024896-768


  • No labels