Data Import Create MARC holdings records [Orchid]
Overview
This document contains the results of testing Data Import Create MARC holdings records in Orchid to detect performance trends.
Ticket: PERF-474
Previous test report: Data Import Create MARC holdings records [Nolana]
Summary
- Data import job durations increased significantly compared to the Nolana results, see Test Results.
- Maximum CPU utilization was observed for the mod-inventory-storage module during the 80k DI job - 88%, see Service CPU Utilization.
- There is a significant memory utilization increase for some modules during the 10k and 80k DI jobs, see Memory Utilization.
- Maximum service memory utilization reached 97% for mod-inventory-storage during the 80k DI job, see Memory Utilization.
- During the 80k DI job the reader and writer roles switched between DB instances. This might have happened because of overload and a restart of the writer instance, see DB CPU Utilization.
It should be noted that the trigger process_marc_records_lb_insert_update_trigger was enabled on the mod_source_record_storage.marc_records_lb table to make the environment production-like. This could influence the results in the current release. At the same time, tests showed that DI job durations with and without the trigger are the same.
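For reference, a trigger of this kind can be enabled with a statement like the following (a PostgreSQL sketch; the tenant-prefixed schema name `fs09000000_mod_source_record_storage` is an assumption based on the schema naming seen in the top-SQL statement below and may differ per environment):

```sql
-- Enable the production-like trigger on the SRS MARC records table
-- (schema name assumes tenant fs09000000; adjust to the target tenant).
ALTER TABLE fs09000000_mod_source_record_storage.marc_records_lb
  ENABLE TRIGGER process_marc_records_lb_insert_update_trigger;

-- Verify the trigger state ('O' = enabled, 'D' = disabled):
SELECT tgname, tgenabled
  FROM pg_trigger
 WHERE tgname = 'process_marc_records_lb_insert_update_trigger';
```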
Recommendations & Jiras
The issue with failing 80k DI jobs was reproduced 100% of the time (tested several times). This issue requires further investigation.
Ticket: MODSOURCE-687
Test Results
Profile used for testing: Default - Create Holdings and SRS MARC Holdings
| Test | File | Duration: Morning_Glory | Duration: Nolana | Duration: Orchid | Diff_absolute | Diff_percentage |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 1k | 28s | 32s | 45s | 13s | +40% |
| 2 | 5k | 1m 48s | 4m 20s | 7m 47s | 3m 27s | +79% |
| 3 | 10k | 4m 4s | 3m 24s | 19m 46s | 16m 22s | +481% |
| 4 | 80k | 29m 6s | 21m 22s | 20m (error*) | - | - |
*65K items were discarded, see Additional Information.
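The Diff columns follow directly from the raw durations; for example, for the 10k test (a sketch of the arithmetic only, values taken from the table above):

```sql
-- 10k job: Nolana 3m 24s = 204s, Orchid 19m 46s = 1186s
SELECT 1186 - 204                         AS diff_seconds,    -- 982s = 16m 22s
       round(100.0 * (1186 - 204) / 204)  AS diff_percentage; -- +481%
```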
Instance CPU Utilization
Service CPU Utilization
Top services by CPU utilization (during 80k DI):
- mod-inventory-storage - 88%
- mod-inn-reach - 42%
- mod-inventory - 15%
Memory Utilization
Top services by used memory increase (during 10k and 80k DI):
- mod-inventory-storage - from 11% to 79%
- mod-source-record-manager - from 33% to 61%
- mod-source-record-storage - from 37% to 63%
- mod-data-import - from 22% to 35%
mod-inventory-storage memory utilization (maximum) during the 80k DI job:
DB CPU Utilization
During the 80k DI job the reader and writer roles switched between DB instances. This might have happened because of overload and a restart of the writer instance.
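Which instance currently holds the writer role can be confirmed from SQL: the standard PostgreSQL function `pg_is_in_recovery()` returns false on the writer and true on a reader replica (this also applies to Aurora PostgreSQL clusters):

```sql
-- Run against each cluster endpoint:
-- false = writer instance, true = reader replica
SELECT pg_is_in_recovery();
```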
DB Connections
DB Load
SQL queries
Top-SQL statement:
SELECT jsonb,id FROM fs09000000_mod_inventory_storage.instance_holdings_item_view WHERE id='947de548-2c4b-443a-8b2d-0bfac571430b' LIMIT 1 OFFSET 0
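The access path of this top statement can be inspected with EXPLAIN; an expensive expansion of the instance_holdings_item_view here would help explain the DB load during large jobs (a sketch, to be run against the same tenant schema):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT jsonb, id
  FROM fs09000000_mod_inventory_storage.instance_holdings_item_view
 WHERE id = '947de548-2c4b-443a-8b2d-0bfac571430b'
 LIMIT 1 OFFSET 0;
```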
1k and 5k:
10k and 80k:
Additional Information
Log messages during the 80k DI job failure:
09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR dpointFailureHandler An error occurred io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
09:48:56 [937501/holdings-storage] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] INFO LogUtil 127.0.0.1:50354 PUT /holdings-storage/holdings/77a4fa76-091d-4429-a8a0-803f75276bf7 null HTTP_1_1 500 60 30000 tid=fs09000000 Internal Server Error
09:48:56 [576440/holdings-sources] [fs09000000] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_inventory_storage] ERROR PgUtil Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
09:48:56 [295007/inventory-view] [fs09000000] [d28a1154-3939-46bc-a114-e03c62aeafe5] [mod_inventory_storage] ERROR PostgresClient Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
Error messages retrieved from the database during the 80k DI job:
- Connection is closed
- io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: db.ncp5.folio-eis.us-east-1/10.23.34.172:5432
- io.netty.channel.StacklessClosedChannelException
- io.vertx.core.impl.NoStackTraceThrowable: Couldn't update JobExecution status, JobExecution already marked as ERROR
- io.vertx.core.impl.NoStackTraceThrowable: Fail to read any response from the server, the underlying connection might get lost unexpectedly.
- io.vertx.core.impl.NoStackTraceThrowable: Timeout
- io.vertx.core.impl.NoStackTraceThrowable: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
- io.vertx.pgclient.PgException: FATAL: the database system is in recovery mode (57P03)
- java.io.IOException: Connection reset by peer
- org.folio.processing.exceptions.EventProcessingException: 504 Gateway Time-out
- org.folio.processing.exceptions.EventProcessingException: Invalid token: Unexpected user response code 500 for user id 9eb67301-6f6e-468f-9b1a-6134dc39a684
- org.folio.processing.exceptions.EventProcessingException: Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
- org.folio.processing.exceptions.EventProcessingException: Unable to retrieve permissions for user with id '9eb67301-6f6e-468f-9b1a-6134dc39a684': Unable to retrieve permissions (code 400): Timeout for DB_HOST:DB_PORT=db.ncp5.folio-eis.us-east-1:5432
- This connection has been closed.
- Timeout
Infrastructure
PTF environment: ncp5