PTF - Data Export Test Report (Kiwi)

Overview

  1. This report covers the performance of the MARC Bib records export workflow (with items and holdings) running in the Kiwi release (PERF-202).

We tested with 1 virtual user for 1000, 100K, and 500K records. 


Backend:

  • mod-data-export-4.2.1 and mod-data-export-4.2.2
  • mod-source-record-storage-5.2.0
  • mod-source-record-manager-3.2.2
  • okapi-4.9.0
  • mod-authtoken-2.9.0

Frontend:

  • folio_data-export-5.0.0

Environment:

  • 7.2 million UChi SRS records
  • 7.2 million inventory records (7.3 million instances, 7.8 million holdings records, 8.9 million items)
  • 77 FOLIO back-end modules deployed in 151 ECS services
  • 3 okapi ECS services
  • 6 m5.xlarge EC2 instances
  • 1 writer db.r6g.xlarge and 1 reader db.r6g.xlarge AWS RDS instances
  • INFO logging level

High-Level Summary

  1. Data Export is relatively stable for 100K instance records.
  2. It becomes flaky as the number of instances increases to 500K. See the JIRA ticket section.

Test Runs

mod-data-export v4.2.1

With Items and Holdings

Test  Total instances  1 User - Avg Total time to Export instances
1.    1000             1 minute
2.    100,000          1 hour 8 minutes
3.    500,000          5 hours 48 minutes


mod-data-export v4.2.2

Test  Total instances  1 User - Avg Total time to Export instances
1.    1000             1 minute
2.    100,000          51 minutes
3.    200,000          1 hour 48 minutes
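
To compare the two module versions on equal terms, the run times above can be converted into an approximate export rate. This is a minimal sketch using only the figures reported in the two tables; the helper name `records_per_minute` is introduced here for illustration, and the 1000-record runs are left out because "1 minute" is too coarse to give a meaningful rate.

```python
# (module version, total instances, total minutes), copied from the tables above.
RUNS = [
    ("4.2.1", 100_000, 68),    # 1 hour 8 minutes
    ("4.2.1", 500_000, 348),   # 5 hours 48 minutes
    ("4.2.2", 100_000, 51),    # 51 minutes
    ("4.2.2", 200_000, 108),   # 1 hour 48 minutes
]


def records_per_minute(instances: int, minutes: int) -> float:
    """Average export rate over the whole run (illustrative helper)."""
    return instances / minutes


for version, instances, minutes in RUNS:
    rate = records_per_minute(instances, minutes)
    print(f"v{version}: {instances:>7,} instances in {minutes:>3} min -> {rate:,.0f} records/min")
```

The rates are averages over the full run with 1 virtual user, so they say nothing about how the rate varies during a job.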


CQL

Test  Total instances  CQL query                                                                                            1 User - Avg Total time to Export instances
1.    22025            (instanceTypeId=="7cb86491-cc57-4c77-a0a1-24ebfe925906" and source=="MARC") sortby title             13 minutes
2.    1582             (source=="MARC" and items.effectiveLocationId=="0a8c7b4e-04cd-42ac-a887-e7f2ee2ea6ec") sortby title  1 minute
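
The CQL queries above contain quotes, spaces, and `==`, so they need URL encoding before they can be sent as a query-string parameter. The sketch below shows one way to do that with the Python standard library. The base path and the `query` parameter name are assumptions for illustration only; they are not taken from this report and should be checked against the deployed module's API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# ASSUMPTION: illustrative base path, not taken from this report.
BASE_PATH = "/instance-storage/instances"

# Second CQL query from the table above, copied verbatim.
CQL = (
    '(source=="MARC" and items.effectiveLocationId=='
    '"0a8c7b4e-04cd-42ac-a887-e7f2ee2ea6ec") sortby title'
)


def build_query_url(cql: str) -> str:
    """Return BASE_PATH with the CQL string URL-encoded as the `query` parameter."""
    return f"{BASE_PATH}?{urlencode({'query': cql})}"


url = build_query_url(CQL)
print(url)

# Decoding the URL gives back the original CQL string unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == CQL)
```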

CPU and Memory resources allocated

Module                     CPU (units)  Hard/Soft Memory limit (MB)
mod-data-export            128          512/360
mod-inventory-storage      128          864/536
mod-source-record-storage  128          1440/896

CPU Utilization

mod-data-export v4.2.1

CPU utilization is fairly stable at 100% for the 1000, 100K, and 500K record Data Exports with the items and holdings job profile.


mod-data-export v4.2.2

Performance improvement after fixing MDEXP-441

Service Memory Utilization

Memory utilization of mod-data-export is stable at 120% even as the number of instances increases gradually from 1000 to 500K records. For all other modules, such as mod-source-record-storage and okapi, it remains constant between 80% and 100%.
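
A utilization figure above 100% is easier to read once converted to megabytes. The sketch below does that conversion under one loudly stated assumption: that the reported percentage is relative to the soft memory limit of the service. The report does not say this explicitly, so treat the result as an estimate. The limits are copied from the allocation table; the helper names are introduced here for illustration.

```python
# (hard limit MB, soft limit MB), copied from the CPU and memory allocation table.
LIMITS_MB = {
    "mod-data-export": (512, 360),
    "mod-inventory-storage": (864, 536),
    "mod-source-record-storage": (1440, 896),
}


def used_mb(module: str, utilization_pct: float) -> float:
    """Memory in MB, ASSUMING utilization is measured against the soft limit."""
    _hard, soft = LIMITS_MB[module]
    return soft * utilization_pct / 100


def headroom_mb(module: str, utilization_pct: float) -> float:
    """Distance in MB between the estimated usage and the hard limit."""
    hard, _soft = LIMITS_MB[module]
    return hard - used_mb(module, utilization_pct)


print(used_mb("mod-data-export", 120))      # estimated usage at 120%
print(headroom_mb("mod-data-export", 120))  # remaining MB below the hard limit
```

Under that assumption, 120% for mod-data-export corresponds to about 432 MB, which is still below its 512 MB hard limit.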


Issues faced during Data Export

I was able to export 100K records successfully. However, when I tried to export 500K, Data Export failed in the UI, and the Chrome console log showed:

 {
"id" : "e74bc816-57cc-4c19-a075-ceda913a5adb",
"hrId" : 7635,
"exportedFiles" : [ {
"fileId" : "dcbce2b8-7e84-4ca4-866f-07a2760a2d98",
"fileName" : "kcp1-DE-500k-7635.mrc"
} ],
"jobProfileId" : "937d6256-8532-442b-9286-cbc3396fa18d",
"jobProfileName" : "holdings and items",
"progress" : {
"exported" : 236400,
"failed" : 0,
"total" : 500000
},
"completedDate" : "2021-10-29T19:11:10.626+00:00",
"lastUpdatedDate" : "2021-10-29T19:42:30.320+00:00",
"startedDate" : "2021-10-29T16:39:46.971+00:00",
"runBy" : {
"firstName" : "folio",
"lastName" : "folio"
},
"status" : "FAIL"
}
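
The payload above already contains enough to work out how far the job got before it was marked as failed. A minimal sketch, using only fields copied from that payload; the helper names `percent_exported` and `elapsed` are illustrative and not part of mod-data-export.

```python
from datetime import datetime, timedelta

# Fields copied from the failed job execution payload above.
job = {
    "status": "FAIL",
    "progress": {"exported": 236400, "failed": 0, "total": 500000},
    "startedDate": "2021-10-29T16:39:46.971+00:00",
    "completedDate": "2021-10-29T19:11:10.626+00:00",
}


def percent_exported(job: dict) -> float:
    """Share of the requested records that were exported, in percent."""
    progress = job["progress"]
    return 100 * progress["exported"] / progress["total"]


def elapsed(job: dict) -> timedelta:
    """Time between startedDate and completedDate of the payload."""
    started = datetime.fromisoformat(job["startedDate"])
    completed = datetime.fromisoformat(job["completedDate"])
    return completed - started


print(f"{job['status']}: {percent_exported(job):.2f}% exported after {elapsed(job)}")
```

These values are derived from the payload timestamps only and may differ from timings observed in the UI.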

According to the Chrome console log, mod-data-export failed at 10/29/2021 3:11 PM Eastern. However, CPU and memory utilization show that the export kept running: mod-inventory, mod-inventory-storage, and mod-source-record-storage continued to work even though the job had failed in the UI. There were no errors in the module logs.


When I clicked the .mrc file in the UI to see the error, the download from S3 failed. This happened because the DE job failed before it could upload the file to S3.



After the test failed in the UI, I did not explicitly stop the running backend modules. After a few hours, the DE job status changed from FAIL to COMPLETED, and the status in the UI changed with it. The job completed successfully. However, it took 6 hours 38 minutes, about 1 hour more than normal, to complete.

 {
"id" : "e74bc816-57cc-4c19-a075-ceda913a5adb",
"hrId" : 7635,
"exportedFiles" : [ {
"fileId" : "dcbce2b8-7e84-4ca4-866f-07a2760a2d98",
"fileName" : "kcp1-DE-500k-7635.mrc"
} ],
"jobProfileId" : "937d6256-8532-442b-9286-cbc3396fa18d",
"jobProfileName" : "holdings and items",
"progress" : {
"exported" : 500000,
"failed" : 0,
"total" : 500000
},
"completedDate" : "2021-10-29T22:57:24.532+00:00",
"lastUpdatedDate" : "2021-10-29T22:57:24.474+00:00",
"startedDate" : "2021-10-29T16:39:46.971+00:00",
"runBy" : {
"firstName" : "folio",
"lastName" : "folio"
},
"status" : "COMPLETED"
}
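
One signal is visible in the payloads themselves: in the FAIL snapshot, lastUpdatedDate is later than completedDate, which is consistent with the backend continuing to process the job after it was marked as failed. The sketch below checks for that condition. Using it as an indicator of a job that is still running is my assumption, not documented mod-data-export behavior, and the function name is illustrative.

```python
from datetime import datetime

# Timestamps copied from the two payloads above (same job id, two points in time).
SNAPSHOTS = [
    {
        "status": "FAIL",
        "completedDate": "2021-10-29T19:11:10.626+00:00",
        "lastUpdatedDate": "2021-10-29T19:42:30.320+00:00",
    },
    {
        "status": "COMPLETED",
        "completedDate": "2021-10-29T22:57:24.532+00:00",
        "lastUpdatedDate": "2021-10-29T22:57:24.474+00:00",
    },
]


def updated_after_completion(snapshot: dict) -> bool:
    """True if the job record was still being updated after its completedDate."""
    completed = datetime.fromisoformat(snapshot["completedDate"])
    updated = datetime.fromisoformat(snapshot["lastUpdatedDate"])
    return updated > completed


for snapshot in SNAPSHOTS:
    print(snapshot["status"], updated_after_completion(snapshot))
```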

JIRA ticket

  1. MDEXP-471
  2. MDEXP-473
  3. MDEXP-474
  4. MDEXP-441

Data Export CSV files used to run test

All instance records in the files below have source=MARC.

kcp1-1k-DE.csv

kcp1-DE-100k.csv

kcp1-DE-500k.csv