Data Import Update multi tenant (Poppy)

Overview

  • This document contains the results of testing Data Import for MARC Bibliographic records with an update job in the Poppy release. Ticket: PERF-779

Summary

Test set run №2 (updated configuration for mod-permission)

  • Three tests of DI with update jobs with 25k records were carried out on the pcp1 cluster.
  • DI with a new configuration for mod-permission was successful. DI jobs are processed one by one in test run #3.
  • Mod-permission version changed from 6.4.0 → 6.3.2 doesn't solve the issue with NullPointerException: "mappingParameters" is null for configuration parameter for mod-permission in Mem Hard Limit=1684, Mem Soft limit=1544, Xmx=1024m. 
  • Changed configuration parameter for mod-permission in Mem Hard Limit=1684→2384, Mem Soft limit=1544→2244, Xmx=1024m→1500m allowed to run tests successfully and avoided issue with "mappingParameters" is null.
  • Memory consumption for mod-data-import was 36%, mod-inventory - 94%. No memory leaks are suspected.
  • CPU utilization for mod-data-import spiked to 72%, mod-inventory - 171%. It was observed that CPU utilization for mod-inventory has a growing trend.
  • DB CPU utilization was 97%.
  • DB connections - 500 on average and 681 during spikes connected to mod-permission spikes on the CPU graph.
  • Slow queries found -  SELECT jsonb FROM fs07000002_mod_permissions.permissions

Test set run №1

  • Three tests of DI with update jobs with 25k records were carried out on the pcp1 cluster.
  • The first and second tests were successful. The third test of DI on three tenants running concurrently failed after a third data import started. Errors attached.
  • Slow queries found for SELECT jsonb FROM fs07000001_mod_permissions.permissions

  • Issue with a spike from mod-permission that led to test #3 problems were investigated. The module stopped after a timeout. Additional analysis with a method that consumed most of the resources attached.

Recommendations and Jiras

  • Created Jira ticket to Investigate "mappingParameters" is null error: PERF-801

Test Results

Test set run#2 (updated configuration for mod-permission)

ProfileTest #TenantMARC File

DI Duration Poppy (hh:mm:ss)

Results

DI MARC Bib Update (PTF - Updates Success - 1)1fso700000125K.mrc00:17:18Completed
2fso900000025K.mrc00:23:11Completed
fso700000225K.mrc00:24:16Completed
3fso900000025K.mrc00:26:36Completed
fso700000225K.mrc00:38:01Completed
fso700000125K.mrc00:49:24
Completed


Test set run#1

ProfileTest #TenantMARC File

DI Duration Poppy (hh:mm:ss)

Results

DI MARC Bib Update (PTF - Updates Success - 1)1fso900000025K.mrc00:20:23Completed
2fso900000025K.mrc00:24:48Completed
fso700000125K.mrc00:31:53Completed
3fso900000025K.mrc00:27:10Completed
fso700000125K.mrc00:36:09with errors*
fso700000225K.mrc00:30:29with errors*

Comparison

The following table compares previous results of Poppy release with current results.

ProfileMARC FileTest run #DI DurationDI Delta, (hh:mm:ss)
Poppy (previous)Poppy
DI MARC Bib Update (PTF - Updates Success - 1)

25K.mrc

100:14:5000:17:18+ 00:02:28
2not tested00:24:16-
3not tested00:49:24-


Errors for test set #1

 Errors test #3


15:23:10 [] [fs07000001] [] [] WARN  teModifyEventHandler handle:: Error while MARC record modifying

javax.ws.rs.BadRequestException: Date when processing started is not set, expected snapshot status is PARSING_IN_PROGRESS, actual - PROCESSING_IN_PROGRESS

at org.folio.services.RecordServiceImpl.lambda$3(RecordServiceImpl.java:112) ~[ms.jar:?]

at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38) ~[ms.jar:?]

at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60) ~[ms.jar:?]

at io.vertx.core.impl.future.FutureImpl.tryComplete(FutureImpl.java:211) ~[ms.jar:?]


15:11:11 [] [fs07000002] [90aad488-be59-4879-b63b-2f8f13b08e85] [] ERROR KafkaConsumerWrapper businessHandlerCompletionHandler:: Error while processing a record - id: 1 subscriptionPattern: SubscriptionDefinition(eventType=DI_RAW_RECORDS_CHUNK_READ, subscriptionPattern=pcp1\.Default\.\w{1,}\.DI_RAW_RECORDS_CHUNK_READ) offset: 9738

org.folio.services.exceptions.RawChunkRecordsParsingException: java.lang.NullPointerException: Cannot invoke "org.folio.processing.mapping.defaultmapper.processor.parameters.MappingParameters.getLinkingRules()" because "mappingParameters" is null

at org.folio.verticle.consumers.RawMarcChunksKafkaHandler.lambda$handle$1(RawMarcChunksKafkaHandler.java:98) ~[ms.jar:?]

at io.vertx.core.impl.future.Composition.onFailure(Composition.java:50) ~[ms.jar:?]

at io.vertx.core.impl.future.FutureBase.emitFailure(FutureBase.java:75) ~[ms.jar:?]


Caused by: java.lang.NullPointerException: Cannot invoke "org.folio.processing.mapping.defaultmapper.processor.parameters.MappingParameters.getLinkingRules()" because "mappingParameters" is null

at org.folio.services.afterprocessing.FieldModificationServiceImpl.lambda$remove9Subfields$2(FieldModificationServiceImpl.java:32) ~[ms.jar:?]

at io.vertx.core.impl.future.Mapping.onSuccess(Mapping.java:35) ~[ms.jar:?]

at io.vertx.core.impl.future.FutureBase.emitSuccess(FutureBase.java:60) ~[ms.jar:?]

at io.vertx.core.impl.future.FutureImpl.tryComplete(FutureImpl.java:211) ~[ms.jar:?]


15:11:10 [] [fs07000002] [90aad488-be59-4879-b63b-2f8f13b08e85] [] ERROR RestUtil             Response HTTP code is not equals 200, 201, 204. Response code: 504

15:11:10 [] [fs07000002] [90aad488-be59-4879-b63b-2f8f13b08e85] [] ERROR ngParametersProvider Something happened while initializing mapping parameters

javax.ws.rs.BadRequestException: HTTP 400 Bad Request

at org.folio.dataimport.util.RestUtil.validateAsyncResult(RestUtil.java:159) ~[ms.jar:?]

at org.folio.services.mappers.processor.MappingParametersProvider.lambda$getIdentifierTypes$2(MappingParametersProvider.java:243) ~[ms.jar:?]

java.lang.NullPointerException: Cannot invoke "org.folio.processing.mapping.defaultmapper.processor.parameters.MappingParameters.getLinkingRules()" because "mappingParameters" is null

java.lang.NullPointerException: Cannot invoke "org.folio.processing.mapping.defaultmapper.processor.parameters.MappingParameters.getLinkingRules()" because "mappingParameters" is null

Unable to retrieve permissions (code 400): Timeout for DB_HOST:DB_PORT=db.pcp1.folio-eis.us-east-1:5432

Timeout for DB_HOST:DB_PORT=db.pcp1.folio-eis.us-east-1:54
15:09:48 [037670/locations;130709/perms] [fs07000001] [] [mod_permissions] ERROR PermsAPI             Timeout for DB_HOST:DB_PORT=db.pcp1.folio-eis.us-east-1:5432

Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-959 [ACTIVE]
pcp1/mod-search/ff1a258966fc4c69b8f88d1655267aa4

15:16:47 [] [] [] [] WARN  KafkaMessageListener Failed to index resource event [eventType: CREATE, tenantId: fs09000000, id: cbc38005b4e939fe85401fc369bdcd3049c80b8f]
org.folio.search.exception.SearchOperationException: Failed to perform elasticsearch request [index=pcp1_contributor_fs09000000, type=bulkApi, message: 30,000 milliseconds timeout on connection http-outgoing-959 [ACTIVE]]
    at org.folio.search.utils.SearchUtils.performExceptionalOperation(SearchUtils.java:97) ~[classes!/:3.0.5]
    at org.folio.search.repository.AbstractResourceRepository.executeBulkRequest(AbstractResourceRepository.java:55) ~[classes!/:3.0.5]
    at java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]
    
15:25:15 [] [] [] [] WARN  KafkaMessageListener Failed to index resource event [eventType: CREATE, tenantId: fs07000002, id: 9d97f034-a06a-4bba-8895-a86ab367b87a]
feign.FeignException$Unauthorized: [401 Unauthorized] during [GET] to [http://user-tenants?tenantId=fs07000002] [UserTenantsClient#getUserTenants(String)]: [Invalid token]


pcp1/mod-source-record-manager
15:02:50 [] [fs07000001] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [] ERROR RestUtil             Response HTTP code is not equals 200, 201, 204. Response code: 500

mod-permission heap dump analysis

 Heap dump analysis

Problem Suspect

46 instances of io.vertx.pgclient.impl.codec.PgCod