Data Import Create MARC holdings records [Orchid]

Overview

This document contains the results of testing Data Import Create MARC holdings records in Orchid to detect performance trends.

Ticket: PERF-474 - Getting issue details... STATUS

Previous test report: Data Import Create MARC holdings records [Nolana]

Summary

  • Data import jobs duration significantly increased compared to the Nolana results, see TestResults
  • Maximum CPU utilization was observed for mod-inventory-storage module during 80k DI job - 88%, see ServiceCPUUtilization.
  • There is a significant memory utilization increase for some modules during 10k and 80k DI jobs, see MemoryUtilization
  • Maximum service memory utilization reached 97% for mod-inventory-storage during 80k DI job, see MemoryUtilization.
  • During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance, see DBCPUUtilization.

It should be noted that trigger process_marc_records_lb_insert_update_trigger was enabled for mod_source_record_storage.marc_records_lb table to make the environment production-like. It could influence the results in current release. At the same time tests showed that DI jobs duration with and without the trigger stays the same.

Recommendations & Jiras

Issue with failing 80K DI jobs was reproduced in 100% (tested several times). This issues requires further investigation.

Ticket: MODSOURCE-687 - Getting issue details... STATUS

Test Results

Profile used for testing - Default - Create Holdings and SRS MARC Holdings

Test

File

Duration: Morning_Glory

Duration: NolanaDuration: OrchidDiff_absoluteDiff_percentage

1

1k28s32s45s13s+40%
25k1m 48s4m 20s7m 47s3m 27s+79%
310k4m 4s3m 24s19m 46s16m 22s+481%
480k29m 6s21m 22s20m (error*)--

*65K items were discarded, see AdditionalInformation

Instance CPU Utilization

Service CPU Utilization

Top services by CPU utilization (during 80k DI):

  • mod-inventory-storage - 88%
  • mod-inn-reach - 42%
  • mod-inventory - 15%

Memory Utilization

Top services by used memory increase (during 10k and 80k DI):

  • mod-inventory-storage - from 11% to 79%
  • mod-source-record-manager - from 33% to 61%
  • mod-source-record-storage - from 37% to 63%
  • mod-data-import - from 22% to 35%


mod-inventory-storage memory utilizatoin (maximum) during 80k DI job:

DB CPU Utilization

During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance.

DB Connections

DB Load

SQL queries

Top-SQL statement:

SELECT jsonb,id FROM fs09000000_mod_inventory_storage.instance_holdings_item_view WHERE id='947de548-2c4b-443a-8b2d-0bfac571430b' LIMIT 1 OFFSET 0

1k and 5k:


10k and 80k:

Additional Information

Log messages during 80k DI job failure:

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR dpointFailureHandler An error occurred io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] INFO  LogUtil     127.0.0.1:50354 PUT /holdings-storage/holdings/77a4fa76-091d-4429-a8a0-803f75276bf7 null HTTP_1_1 500 60 30000 tid=fs09000000 Internal Server Error

09:48:56 [576440/holdings-sources] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR PgUtil        Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [295007/inventory-view] [fs09000000] [d28a1154-3939-46bc-a114-e03c62aeafe5] [mod_inventory_storage] ERROR PostgresClient   Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

Error messages retrieved from the database (during 80k DI job):

  • Connection is closed
  • io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: db.ncp5.folio-eis.us-east-1/10.23.34.172:5432
  • io.netty.channel.StacklessClosedChannelException
  • io.vertx.core.impl.NoStackTraceThrowable: Couldn't update JobExecution status, JobExecution already marked as ERROR
  • io.vertx.core.impl.NoStackTraceThrowable: Fail to read any response from the server, the underlying connection might get lost unexpectedly.
  • io.vertx.core.impl.NoStackTraceThrowable: Timeout
  • io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • io.vertx.pgclient.PgException: FATAL: the database system is in recovery mode (57P03)
  • java.io.IOException: Connection reset by peer
  • org.folio.processing.exceptions.EventProcessingException: 504 Gateway Time-out
  • org.folio.processing.exceptions.EventProcessingException: Invalid token: Unexpected user response code 500 for user id 9eb67301-6f6e-468f-9b1a-6134dc39a684
  • org.folio.processing.exceptions.EventProcessingException: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • org.folio.processing.exceptions.EventProcessingException: Unable to retrieve permissions for user with id '9eb67301-6f6e-468f-9b1a-6134dc39a684': Unable to retrieve permissions (code 400): Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
  • This connection has been closed.
  • Timeout

Infrastructure

PTF -environment ncp5