...

Bulk Edits: establish a performance baseline for holdings bulk updates (PERF-408) in the Orchid release, which includes the architectural changes implemented in UXPROD-3842. The goal is to make sure that performance has not deteriorated compared to the Nolana release. The following questions help determine the performance and stability of the new Bulk Edits implementation:

  • How long does it take to export 100, 1000, 10k, and 100k records?
  • Can it be used with up to 5 concurrent users?
  • Run four consecutive jobs editing 10k holdings records
  • Run four simultaneous jobs editing 10k holdings records
  • Look for a memory trend and CPU usage
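
The consecutive and simultaneous runs listed above can be sketched as a small timing harness. This is only an illustration under stated assumptions: `fake_bulk_edit_job` is a hypothetical stand-in that sleeps instead of calling any real Bulk Edits endpoint, and the helper names are not taken from the actual test scripts.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def timed(job: Callable[[], None]) -> float:
    """Run one job and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    job()
    return time.perf_counter() - start


def run_consecutively(job: Callable[[], None], runs: int) -> List[float]:
    """Run the job `runs` times, one after another."""
    return [timed(job) for _ in range(runs)]


def run_simultaneously(job: Callable[[], None], users: int) -> List[float]:
    """Run the job once per simulated user, all at the same time."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(timed, job) for _ in range(users)]
        return [future.result() for future in futures]


def fake_bulk_edit_job() -> None:
    """Hypothetical placeholder for one bulk edit job; it only sleeps."""
    time.sleep(0.05)


if __name__ == "__main__":
    print("consecutive:", run_consecutively(fake_bulk_edit_job, runs=4))
    print("simultaneous:", run_simultaneously(fake_bulk_edit_job, users=4))
```

Comparing the per-job durations of the two lists shows whether running jobs in parallel slows each individual job down, which is the comparison made in the results.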

...

The Orchid release is about 40% slower than Nolana for holdings bulk editing. One possible root cause of the degradation is the long time needed to generate the preview of changes (MODBULKOPS-86).

Stability is approximately the same as in Nolana.

  • Job duration: with 1 concurrent job, 100 records can be edited in 1 min 9 s, which is 19 s slower than Nolana (50 s); 1000 records can be edited in 2 min 54 s, which is 44 s slower than Nolana (2 min 10 s); and bulk editing of 10k records is about 36% slower. Editing 100k records is not possible because of PERF-334.
  • A run of 5 concurrent jobs is successful: with 10k records per user and 5 simultaneous users (50k records in total), records can be uploaded and edited in about 22 minutes, which is about 9 min 30 s slower than Nolana (about 12 min 25 s). The slowdown could be a result of the changes in UXPROD-3842 and MODBULKOPS-86.
  • Four consecutive jobs editing 10k holdings records did not show any performance degradation.
  • Four simultaneous jobs editing 10k holdings records completed in 20 min 40 s, which is about 1 min (5.7%) slower than 1 concurrent job (19 min 33 s).
  • The memory utilization of mod-bulk-operations increased from 20% to 23%. The service was updated before the test, so it is probably still reaching a steady state; the memory trend will be investigated in further testing. For all other modules, no memory leaks are suspected.
  • CPU utilization for all modules did not exceed 56% in any of the tests. Compared to Nolana, mod-data-export-worker no longer shows spikes, and the average CPU utilization of the other modules is approximately the same, except for nginx-okapi, which is about 15% higher.
  • For all record counts (100, 1k, 10k) and for 5 concurrent jobs, RDS CPU utilization did not exceed 41%. This is better than Nolana, where it was up to 50%.


Recommendations & Jiras

More than 50% of jobs with 10k+ records failed after about 30 min to 1 hour with the error "Connection reset (SocketException)" (PERF-334).

...

Number of records | Orchid (Total Time) | Nolana (Total Time)
100 | 1 min 9 s | 50 s
1000 | 2 min 54 s | 2 min 10 s
10k | 19 min 33 s | 12 min 25 s
100k | Error in about 43 min to 1 hour: Connection reset (SocketException), PERF-334 | 2 hours 17 min
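
The differences between the two releases can be recomputed from the table with a few lines of arithmetic. The sketch below is illustrative only; the function names are made up for this example. It reports the slowdown both relative to the Nolana time and relative to the Orchid time, since the two conventions give noticeably different percentages (for 10k records, roughly 57% versus 36%).

```python
import re

# Seconds per unit for the duration formats used in the table.
UNIT_SECONDS = {"hour": 3600, "hours": 3600, "min": 60, "sec": 1, "s": 1}


def to_seconds(duration: str) -> int:
    """Convert a duration such as '2 min 54 s' or '2 hours 17 min' to seconds."""
    parts = re.findall(r"(\d+)\s*(hours?|min|sec|s)\b", duration)
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


def slowdown(orchid: str, nolana: str) -> tuple:
    """Return (difference in seconds, % of Nolana time, % of Orchid time)."""
    orchid_s, nolana_s = to_seconds(orchid), to_seconds(nolana)
    diff = orchid_s - nolana_s
    return diff, round(100 * diff / nolana_s, 1), round(100 * diff / orchid_s, 1)


if __name__ == "__main__":
    rows = {
        "100": ("1 min 9 s", "50 s"),
        "1000": ("2 min 54 s", "2 min 10 s"),
        "10k": ("19 min 33 s", "12 min 25 s"),
    }
    for records, (orchid, nolana) in rows.items():
        print(records, slowdown(orchid, nolana))
```

Stating which baseline a percentage refers to avoids ambiguity when the results are compared across releases.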

...

"BARCODE".

Number of records per user | Orchid (Total Time) | Nolana (Total Time)
100 | 1 min 9 s | 49 s
1000 | 3 min 4 s | 2 min 25 s
10k | 22 min 2 s | 12 min 25 s
100k | Results are not representative because of an error after about 28-33 min: Connection reset (SocketException), PERF-334 | -

...

Did not exceed 17%.


Service CPU utilization

CPU utilization for all modules did not exceed 56% in any of the tests.


RDS CPU utilization

Maximum RDS CPU utilization is 41% for 5 concurrent jobs with 10k holdings records.

The more concurrent jobs are running, the higher the RDS CPU usage. The maximum number of concurrent jobs will be investigated.


Errors in logs during testing

Code Block
2023-03-22T19:47:19.986Z
19:47:19 [] [] [] [] ERROR ? HTTP response code=404 msg=No suitable module found for path /holdings-sources/ for tenant fs09000000
ncp5/okapi-b/8dcac0276f1c46cba21d6e5814ec6cd0
Field				Value
@ingestionTime		1679514444708
@log				054267740449:ncp5-folio-eis
@logStream			ncp5/okapi-b/8dcac0276f1c46cba21d6e5814ec6cd0
@message			19:47:19 [] [] [] [] ERROR ?                    HTTP response code=404 msg=No suitable module found for path /holdings-sources/ for tenant fs09000000
@timestamp			1679514439986
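
When many such entries need to be reviewed, the relevant fields can be pulled out of the log message programmatically. The following is a minimal sketch, assuming the message layout shown in the entry above; the pattern and function names are illustrative and are not part of Okapi or any logging tool. It also converts the `@timestamp` value, which is in epoch milliseconds, to the ISO timestamp shown on the first line of the entry.

```python
import re
from datetime import datetime, timezone

# Message copied from the log entry above.
MESSAGE = (
    "19:47:19 [] [] [] [] ERROR ? HTTP response code=404 "
    "msg=No suitable module found for path /holdings-sources/ "
    "for tenant fs09000000"
)

# Illustrative pattern for this specific message layout only.
PATTERN = re.compile(
    r"HTTP response code=(?P<code>\d{3}) msg=No suitable module found "
    r"for path (?P<path>\S+) for tenant (?P<tenant>\S+)"
)


def parse_missing_module(message: str):
    """Return code, path, and tenant from the message, or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return {
        "code": int(match["code"]),
        "path": match["path"],
        "tenant": match["tenant"],
    }


def epoch_ms_to_iso(epoch_ms: int) -> str:
    """Convert epoch milliseconds to a UTC ISO 8601 string with a Z suffix."""
    moment = datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc)
    moment = moment.replace(microsecond=(epoch_ms % 1000) * 1000)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


if __name__ == "__main__":
    print(parse_missing_module(MESSAGE))
    print(epoch_ms_to_iso(1679514439986))
```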


Code Block
2023-03-22T19:47:19.985Z
19:47:19 [${FolioLoggingContext:requestid}] [${FolioLoggingContext:tenantid}] [${FolioLoggingContext:userid}] [${FolioLoggingContext:moduleid}] ERROR oldingsDataProcessor Holdings source was not found by id=null
ncp5/mod-bulk-operations/bfcbe6d984e1443bb3e2e49dbd14601e
Field				Value
@ingestionTime		1679514442775
@log				054267740449:ncp5-folio-eis
@logStream			ncp5/mod-bulk-operations/bfcbe6d984e1443bb3e2e49dbd14601e
@message			19:47:19 [${FolioLoggingContext:requestid}] [${FolioLoggingContext:tenantid}] [${FolioLoggingContext:userid}] [${FolioLoggingContext:moduleid}] ERROR oldingsDataProcessor Holdings source was not found by id=null
@timestamp			1679514439985


Appendix

Infrastructure

PTF environment: ncp5

...