Data Import Create MARC holdings records [Orchid]

Data Import Create MARC holdings records [Orchid]

Overview

This document contains the results of testing Data Import Create MARC holdings records in Orchid to detect performance trends.

Ticket: PERF-474: [Orchid] [Data Import] Marc HOLDINGSClosed

Previous test report: Data Import Create MARC holdings records [Nolana]

Summary

  • Data import jobs duration significantly increased compared to the Nolana results, see TestResults

  • Maximum CPU utilization was observed for mod-inventory-storage module during 80k DI job - 88%, see ServiceCPUUtilization.

  • There is a significant memory utilization increase for some modules during 10k and 80k DI jobs, see MemoryUtilization

  • Maximum service memory utilization reached 97% for mod-inventory-storage during 80k DI job, see MemoryUtilization.

  • During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance, see DBCPUUtilization.

It should be noted that trigger process_marc_records_lb_insert_update_trigger was enabled for mod_source_record_storage.marc_records_lb table to make the environment production-like. It could influence the results in current release. At the same time tests showed that DI jobs duration with and without the trigger stays the same.

Recommendations & Jiras

Issue with failing 80K DI jobs was reproduced in 100% (tested several times). This issues requires further investigation.

Ticket: MODSOURCE-687: Investigate failing 80K Data Import MARC Holdings jobOpen

Test Results

Profile used for testing - Default - Create Holdings and SRS MARC Holdings

Test

File

Duration: Morning_Glory

Duration: Nolana

Duration: Orchid

Diff_absolute

Diff_percentage

Test

File

Duration: Morning_Glory

Duration: Nolana

Duration: Orchid

Diff_absolute

Diff_percentage

1

1k

28s

32s

45s

13s

+40%

2

5k

1m 48s

4m 20s

7m 47s

3m 27s

+79%

3

10k

4m 4s

3m 24s

19m 46s

16m 22s

+481%

4

80k

29m 6s

21m 22s

20m (error*)

-

-

*65K items were discarded, see AdditionalInformation

Instance CPU Utilization

This editor does not support displaying this content: heading

Service CPU Utilization

Top services by CPU utilization (during 80k DI):

  • mod-inventory-storage - 88%

  • mod-inn-reach - 42%

  • mod-inventory - 15%

Memory Utilization

Top services by used memory increase (during 10k and 80k DI):

  • mod-inventory-storage - from 11% to 79%

  • mod-source-record-manager - from 33% to 61%

  • mod-source-record-storage - from 37% to 63%

  • mod-data-import - from 22% to 35%

This editor does not support displaying this content: heading

 

mod-inventory-storage memory utilizatoin (maximum) during 80k DI job:

DB CPU Utilization

During 80k DI job there is a switch of instrances for reader and writer role. This might happen because of overload and restart of the writer instance.

DB Connections

DB Load

SQL queries

Top-SQL statement:

SELECT jsonb,id FROM fs09000000_mod_inventory_storage.instance_holdings_item_view WHERE id='947de548-2c4b-443a-8b2d-0bfac571430b' LIMIT 1 OFFSET 0

1k and 5k:

 

10k and 80k:

Additional Information

Log messages during 80k DI job failure:

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR dpointFailureHandler An error occurred io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] INFO  LogUtil     127.0.0.1:50354 PUT /holdings-storage/holdings/77a4fa76-091d-4429-a8a0-803f75276bf7 null HTTP_1_1 500 60 30000 tid=fs09000000 Internal Server Error

09:48:56 [576440/holdings-sources] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR PgUtil        Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

09:48:56 [295007/inventory-view] [fs09000000] [d28a1154-3939-46bc-a114-e03c62aeafe5] [mod_inventory_storage] ERROR PostgresClient   Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

Error messages retrieved from the database (during 80k DI job):

  • Connection is closed

  • io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: db.ncp5.folio-eis.us-east-1/10.23.34.172:5432

  • io.netty.channel.StacklessClosedChannelException

  • io.vertx.core.impl.NoStackTraceThrowable: Couldn't update JobExecution status, JobExecution already marked as ERROR

  • io.vertx.core.impl.NoStackTraceThrowable: Fail to read any response from the server, the underlying connection might get lost unexpectedly.

  • io.vertx.core.impl.NoStackTraceThrowable: Timeout

  • io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

  • io.vertx.pgclient.PgException: FATAL: the database system is in recovery mode (57P03)

  • java.io.IOException: Connection reset by peer

  • org.folio.processing.exceptions.EventProcessingException: 504 Gateway Time-out

  • org.folio.processing.exceptions.EventProcessingException: Invalid token: Unexpected user response code 500 for user id 9eb67301-6f6e-468f-9b1a-6134dc39a684

  • org.folio.processing.exceptions.EventProcessingException: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

  • org.folio.processing.exceptions.EventProcessingException: Unable to retrieve permissions for user with id '9eb67301-6f6e-468f-9b1a-6134dc39a684': Unable to retrieve permissions (code 400): Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432

  • This connection has been closed.

  • Timeout