PTF - Data Export of Deleted Records via API [Ramsons ECS Eureka]
Overview
The purpose of this page is to present the test results of exporting deleted bib records via API on the Ramsons Eureka ECS environment with the LoC dataset, using the [POST] /data-export/export-deleted endpoint with the following parameters:
{"from": "2025-02-12","to": "2025-03-12"}
Context: In production, clients periodically export previously deleted records, typically those deleted within the last day, last week, or last month.
This test set therefore includes exports of records deleted within the last day, last week, and last month. It also includes the results of an API call without a time range (see test 4 in the test results).
Jira ticket: https://folio-org.atlassian.net/browse/PERF-935
Summary
All [POST] /data-export/export-deleted API calls with a “from“-“to“ time range failed with a “The upstream server is timing out” response. However, the tests with 1-day and 1-week time ranges (as well as the test with no time range in the parameters) were still able to start data exports.
The data exports for 1 day and 1 week took 47 and 21 seconds respectively, with no visible issues.
The test with a 1-month time range failed due to a timeout of the underlying mod-source-record-storage call [POST] /source-storage/stream/marc-record-identifiers, caused by a long-running, complex query that retrieves IDs from SRS (see Additional information).
The most CPU-loaded module is mod-data-export, with a CPU spike of up to 9%.
There is no visible memory usage activity in any of the related modules (taking into account that the export itself is fast).
As discussed with the Folijet team, the [POST] /source-storage/stream/marc-record-identifiers call (which retrieves instance IDs for data export) fails due to a timeout related to the Eureka design. A ticket for optimization has been created: MODSOURCE-878: Long running query causing timeout on [POST] /source-storage/stream/marc-record-identifiers (Closed)
Recommendations & Jiras
A long-running query prevents data export from starting due to a timeout of the underlying mod-source-record-storage call: MODSOURCE-878: Long running query causing timeout on [POST] /source-storage/stream/marc-record-identifiers (Closed)
Additional info: Exporting Deleted Records via APIs, https://folio-org.atlassian.net/browse/UXPROD-4306
Results
Test | Data size | Time range | Duration | Status |
---|---|---|---|---|
1 | 20K | 1 day | 47 s | success |
2 | 20K | 1 week | 21 s | success |
3 | 20K | 1 month | N/A | fail |
4 | 108 | none | 3 s | success |
Memory Utilization
For all of the tests, memory usage of mod-data-export, mod-source-record-storage, and mod-inventory-storage remained the same, with no visible spikes.
Module | Mem usage |
---|---|
mod-data-export | 72% |
mod-source-record-storage | 50% |
mod-inventory-storage | 35% |
CPU Utilization
For tests #1 and #2 there is a visible spike in CPU usage for mod-data-export up to ~9%. Test #3 is not visible here, as it failed due to a timeout of the underlying call to mod-source-record-storage. Test #4 is not visible on any charts, as it took only 3 seconds to complete.
RDS CPU Utilization
Database load is caused mostly by the CTE query issued from the mod-source-record-storage side to get the IDs of deleted records.
Note: no change in the number of used connections is visible, probably because the test durations are too short.
It is clearly visible that most of the DB load comes from the query that selects the IDs of deleted records to pass to data export.
Additional information from module and database logs
The goal of this query is to get the IDs of deleted instances from SRS. The query has high latency, which leads to timeouts when the time range is one month.
MODSOURCE-878: Long running query causing timeout on [POST] /source-storage/stream/marc-record-identifiers (Closed)
WITH "cte" AS
(SELECT DISTINCT MARC_INDEXERS.MARC_ID
FROM MARC_INDEXERS
JOIN "marc_records_tracking" ON "marc_records_tracking"."marc_id" = MARC_INDEXERS.MARC_ID
WHERE (("field_no" = '005'AND TO_DATE(SUBSTRING(VALUE,1,8), 'yyyymmdd') BETWEEN '20250212' AND '20250312'))
GROUP BY MARC_INDEXERS.MARC_ID HAVING
(COUNT(DISTINCT CASE WHEN (("field_no" = '005' AND TO_DATE(SUBSTRING(VALUE, 1, 8),'yyyymmdd') BETWEEN '20250212' AND '20250312')) THEN 1 END) = 1))
SELECT "alias_128462826"."external_id", "alias_41002224".COUNT FROM
(SELECT DISTINCT "records_lb"."external_id"
FROM "records_lb"
WHERE ((LEADER_RECORD_STATUS = 'd')
AND EXISTS
(SELECT *
FROM CTE
WHERE (RECORDS_LB.ID = CTE.MARC_ID))
AND "records_lb"."state" in ('ACTUAL', 'DELETED')
AND "records_lb"."record_type" = 'MARC_BIB'
AND "records_lb"."external_id" IS NOT NULL)) "alias_128462826"
RIGHT OUTER JOIN
(SELECT COUNT(DISTINCT "records_lb"."external_id")
FROM "records_lb"
WHERE ((LEADER_RECORD_STATUS = 'd')
AND EXISTS
(SELECT *
FROM CTE
WHERE (RECORDS_LB.ID = CTE.MARC_ID))
AND "records_lb"."state" in ('ACTUAL','DELETED')
AND "records_lb"."record_type" = 'MARC_BIB'
AND "records_lb"."external_id" IS NOT NULL)) "alias_41002224" ON TRUE
Appendix
Infrastructure
PTF environment: relc
12 r7g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
db.r6.xlarge database, writer
MSK fse-test
4 kafka.m7g.xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
R: Ramsons release
LoC Data set:
Data | Count |
---|---|
MARC records | 46 M |
Instances | 24 M |
Methodology/Approach
Prepare deleted instances by using the [DELETE] inventory/instances/${id}/mark-deleted endpoint with instance UUIDs (this test set requires 20K deleted instances), e.g. as sketched below.
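A minimal sketch of this preparation step, assuming the 20K instance UUIDs are already collected in a text file (the gateway URL, tenant id, and token are placeholders):

import requests

BASE_URL = "https://kong-example.folio.org"  # placeholder gateway URL
HEADERS = {
    "x-okapi-tenant": "fs09000000",  # hypothetical tenant id
    "x-okapi-token": "<token>",
}

# instance_ids.txt: one instance UUID per line (20K lines for this test set)
with open("instance_ids.txt") as f:
    instance_ids = [line.strip() for line in f if line.strip()]

for instance_id in instance_ids:
    resp = requests.delete(
        f"{BASE_URL}/inventory/instances/{instance_id}/mark-deleted",
        headers=HEADERS,
        timeout=60,
    )
    # 204 No Content is assumed to be the success response here
    if resp.status_code != 204:
        print(f"{instance_id}: HTTP {resp.status_code} {resp.text}")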
Using the [POST] /data-export/export-deleted endpoint with parameters {"from": "2025-02-12","to": "2025-03-12"}, trigger the data export of deleted records. The call responds with the UUID of the started data-export job (see the sketch below).
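A sketch of this trigger step that also captures the response for the duration check in the next step; the exact shape of the response body is an assumption to verify against the installed mod-data-export version:

import requests

BASE_URL = "https://kong-example.folio.org"  # placeholder gateway URL
HEADERS = {
    "x-okapi-tenant": "fs09000000",  # hypothetical tenant id
    "x-okapi-token": "<token>",
    "Content-Type": "application/json",
}

resp = requests.post(
    f"{BASE_URL}/data-export/export-deleted",
    headers=HEADERS,
    json={"from": "2025-02-12", "to": "2025-03-12"},
    timeout=600,
)
resp.raise_for_status()
# Assumption: the UUID of the started job is returned in the JSON body;
# it is then used to find the job in the job_executions table (next step).
print("data-export job started:", resp.json())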
Using the query below, check the duration of the data export job:
SELECT id, completed_date - started_date AS duration, status, job_profile_name, exported FROM cs00000int_mod_data_export.job_executions ORDER BY started_date DESC LIMIT 10;
Monitor the logs and metrics of mod-source-record-storage, mod-data-export, and mod-inventory-storage.