Overview

  1. This workflow checks the performance of exporting MARC Bib records (with items and holdings) in the Kiwi release. Jira ticket: PERF-202

...

Backend:

  • mod-data-export-4.2.1 and mod-data-export-4.2.2
  • mod-source-record-storage-5.2.0
  • mod-source-record-manager-3.2.2
  • okapi-4.9.0
  • mod-authtoken-2.9.0

...

  • 7.2 million UChi SRS records
  • 7.2 million inventory records (7.3 million instances, 7.8 million holdings records, 8.9 million items)
  • 77 FOLIO back-end modules deployed in 151 ECS services
  • 3 okapi ECS services
  • 6 m5.xlarge EC2 instances
  • AWS RDS: db.r6g.xlarge writer instance and 1 db.r6g.xlarge reader instance
  • INFO logging level

High-Level Summary

  1. Data Export is relatively stable for 100K instance records.
  2. It becomes flaky as the number of instances increases to 500K. See the tickets listed under "JIRA ticket".

Test Runs

mod-data-export v4.2.1

With Items and Holdings

Test  Total instances  1 User - Avg total time to export instances
1.    1000             1 minute
2.    100,000          1 hour 8 minutes
3.    500,000          5 hours 48 minutes


mod-data-export v4.2.2

Test  Total instances  1 User - Avg total time to export instances
1.    1000             1 minute
2.    100,000          51 minutes
3.    200,000          1 hour 48 minutes
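The durations in the two test-run tables are easier to compare as average export rates. The sketch below is a minimal illustration that converts the reported times into instances per second. The 1000-instance runs are left out because a time rounded to one minute is too coarse for a meaningful rate, and the `RUNS` list and `throughput` helper are names chosen here for illustration only.

```python
# Record counts and durations are copied from the test-run tables above.
RUNS = [
    # (module version, instances exported, hours, minutes)
    ("4.2.1", 100_000, 1, 8),
    ("4.2.1", 500_000, 5, 48),
    ("4.2.2", 100_000, 0, 51),
    ("4.2.2", 200_000, 1, 48),
]


def throughput(instances: int, hours: int, minutes: int) -> float:
    """Return the average export rate in instances per second."""
    seconds = hours * 3600 + minutes * 60
    return instances / seconds


for version, instances, hours, minutes in RUNS:
    rate = throughput(instances, hours, minutes)
    print(f"v{version}: {instances:>7,} instances -> {rate:.1f} instances/s")
```

With the reported times, this works out to about 24 instances per second for v4.2.1 and about 31 to 33 instances per second for v4.2.2.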


CQL

Test  Total instances  CQL query                                                                                            1 User - Avg total time to export instances
1.    22025            (instanceTypeId=="7cb86491-cc57-4c77-a0a1-24ebfe925906" and source=="MARC") sortby title             13 minutes
2.    1582             (source=="MARC" and items.effectiveLocationId=="0a8c7b4e-04cd-42ac-a887-e7f2ee2ea6ec") sortby title  1 minute
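Before running a CQL-based export, it can help to confirm how many instances a query matches. The sketch below only builds a URL-encoded request URL for that check and sends nothing. It assumes the inventory storage endpoint `/instance-storage/instances` accepts `query` and `limit` parameters and returns a `totalRecords` count; the base URL `okapi.example.org` and the helper name `instance_count_url` are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical Okapi gateway URL; replace with the environment under test.
OKAPI_URL = "https://okapi.example.org"


def instance_count_url(cql: str) -> str:
    """Build a GET URL asking inventory storage which instances match `cql`."""
    # Assumption: limit=0 returns no records, only the totalRecords count.
    params = urlencode({"query": cql, "limit": 0})
    return f"{OKAPI_URL}/instance-storage/instances?{params}"


# First CQL query from the table above.
cql = (
    '(instanceTypeId=="7cb86491-cc57-4c77-a0a1-24ebfe925906" '
    'and source=="MARC") sortby title'
)
print(instance_count_url(cql))
```

The quotes, parentheses, and `==` operators in CQL must be percent-encoded, which `urlencode` handles. A real request would also need the tenant and token headers of the target environment.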

CPU and Memory resources allocated

Module                     CPU (units)  Hard/Soft memory limit (MB)
mod-data-export            128          512/360
mod-inventory-storage      128          864/536
mod-source-record-storage  128          1440/896

CPU Utilization

mod-data-export v4.2.1

CPU utilization is fairly stable at 100% for Data Exports of 1000, 100K, and 500K records with the items and holdings job profile.


mod-data-export v4.2.2

Performance improvement after fixing MDEXP-441

Service Memory Utilization

Memory utilization of mod-data-export is stable at 120% even as the number of instances increases gradually from 1000 to 500K records. For all other modules, such as mod-source-record-storage and okapi, it remains constant between 80% and 100%.
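A utilization figure above 100% makes sense if the percentage is measured against the soft memory limit, not the hard limit. The sketch below converts a utilization percentage into MB under that assumption, using the limits from the resource allocation table. Which limit the metric is actually relative to is not stated in this report, so treat the result as an estimate. The helper names are illustrative.

```python
# Hard/soft memory limits in MB, copied from the resource allocation table.
LIMITS_MB = {
    "mod-data-export": (512, 360),
    "mod-inventory-storage": (864, 536),
    "mod-source-record-storage": (1440, 896),
}


def used_mb(module: str, utilization_pct: float) -> float:
    """Convert a utilization percentage to MB, assuming it is relative to the soft limit."""
    _hard, soft = LIMITS_MB[module]
    return soft * utilization_pct / 100


def headroom_mb(module: str, utilization_pct: float) -> float:
    """Return the MB remaining before the hard limit is reached."""
    hard, _soft = LIMITS_MB[module]
    return hard - used_mb(module, utilization_pct)


print(used_mb("mod-data-export", 120))      # 432.0
print(headroom_mb("mod-data-export", 120))  # 80.0
```

Under this assumption, 120% corresponds to about 432 MB in use, which leaves roughly 80 MB before the 512 MB hard limit.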


Issues faced during Data Export

I was able to export 100K records successfully. However, when I tried to export 500K records, Data Export failed in the UI, and the Chrome console log showed:

...

{
  "id" : "e74bc816-57cc-4c19-a075-ceda913a5adb",
  "hrId" : 7635,
  "exportedFiles" : [ {
    "fileId" : "dcbce2b8-7e84-4ca4-866f-07a2760a2d98",
    "fileName" : "kcp1-DE-500k-7635.mrc"
  } ],
  "jobProfileId" : "937d6256-8532-442b-9286-cbc3396fa18d",
  "jobProfileName" : "holdings and items",
  "progress" : {
    "exported" : 500000,
    "failed" : 0,
    "total" : 500000
  },
  "completedDate" : "2021-10-29T22:57:24.532+00:00",
  "lastUpdatedDate" : "2021-10-29T22:57:24.474+00:00",
  "startedDate" : "2021-10-29T16:39:46.971+00:00",
  "runBy" : {
    "firstName" : "folio",
    "lastName" : "folio"
  },
  "status" : "COMPLETED"
}
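The timestamps in a job execution record like the one above give the wall-clock duration of the export directly. The sketch below is a minimal illustration that parses `startedDate` and `completedDate` and derives an average export rate. Only the fields it needs are copied from the record, and the helper name `job_duration_seconds` is chosen here for illustration.

```python
from datetime import datetime

# Fields copied from the job execution record above.
job = {
    "startedDate": "2021-10-29T16:39:46.971+00:00",
    "completedDate": "2021-10-29T22:57:24.532+00:00",
    "progress": {"exported": 500000, "failed": 0, "total": 500000},
}


def job_duration_seconds(job: dict) -> float:
    """Return the elapsed time between startedDate and completedDate in seconds."""
    started = datetime.fromisoformat(job["startedDate"])
    completed = datetime.fromisoformat(job["completedDate"])
    return (completed - started).total_seconds()


seconds = job_duration_seconds(job)
rate = job["progress"]["exported"] / seconds
print(f"duration: {seconds / 3600:.2f} h, rate: {rate:.1f} records/s")
```

For this record, the duration is about 6.3 hours, or roughly 22 records per second.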

JIRA ticket

  1. MDEXP-471
  2. MDEXP-473
  3. MDEXP-474
  4. MDEXP-441

Data Export CSV files used to run test

All instance records in the files below have source=MARC

...