Metadata Record Export (UXPROD-652)

[UXPROD-4110] Investigate Data export performance improvements Created: 07/Mar/23  Updated: 30/Nov/23  Resolved: 05/Jun/23

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: Poppy (R2 2023)
Parent: Metadata Record Export

Type: New Feature Priority: P2
Reporter: Magda Zacharska Assignee: Magda Zacharska
Resolution: Done Votes: 0
Labels: LC-priority2, loc, metadatamanagement
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Defines
is defined by MDEXP-594 Spike - investigate options for impro... Closed
Gantt End to Start
has to be done before UXPROD-4127 Improve Data export performance In Progress
Relates
relates to UXPROD-3389 [NFR] - Migrate Data Export to Sprin... Closed
relates to UXPROD-3588 [NFR] - Migrate Data Export to Spring... Closed
Release: Poppy (R2 2023)
Epic Link: Metadata Record Export
Back End Estimate: Large < 10 days
Back End Estimator: Viachaslau Khandramai (Inactive)
Back-End Confidence factor: 100%
Estimation Notes and Assumptions: Taking into account that only analysis should be provided.
Development Team: Firebird
PO Rank: 0
Rank: Cornell (Full Sum 2021): R1

 Description   

Current situation or problem:
With the existing implementation, users can export up to 22M records with the default mapping profile, but significantly fewer when exporting with a custom mapping profile that includes data from holdings and item records. Performance deteriorates further when the export is triggered by a CQL query, for which the recommended number of records is 300K. However, these limits are not enforced programmatically, which creates additional work for librarians, who must manually create files containing the specified number of record UUIDs to trigger the export.
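The manual workaround described above can be illustrated with a minimal sketch that splits a list of record UUIDs into files of at most 300K entries each. The function names, the file naming scheme, and the one-UUID-per-line layout are assumptions made for illustration; they are not taken from the Data Export module.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List


def batched(uuids: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size UUIDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(uuids)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def write_uuid_files(
    uuids: Iterable[str],
    out_dir: str,
    batch_size: int = 300_000,  # recommended limit quoted in the description
    prefix: str = "uuids",      # hypothetical file name prefix
) -> List[Path]:
    """Write one UUID per line into numbered files and return their paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for index, batch in enumerate(batched(uuids, batch_size), start=1):
        target = out_path / f"{prefix}_{index:04d}.csv"
        target.write_text("\n".join(batch) + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Each resulting file could then be uploaded as a separate export trigger, which is the repetitive step this feature aims to remove.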

In scope

  • Performance improvements for exporting:
    • instances and SRS records
    • holdings
    • authority records
  • Export of 22M SRS records in 24 hours without significantly impacting the performance of the other FOLIO modules (cataloging, check in, check out, data import)
  • Support multiple concurrent exports:
    • daily (thousands):
    • monthly:
    • annually:
  • Establish a performance baseline to determine the areas that require improvement
  • Determine an easy way to trigger the export of all instances/SRS records without first having to list the UUIDs of matching records
  • Set the maximum number of records saved in a single .mrc file to 100k and split the entire export into files of that size
  • Compress all files generated in one export job into one directory and provide a link to it, as is currently done for single files
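
As a rough illustration of what the scope items above imply, the sketch below computes the sustained throughput needed to export 22M records in 24 hours, the number of output files at 100k records per file, and bundles a set of generated files into a single archive. The function names are hypothetical, and ZIP is an assumed archive format; the ticket does not specify one.

```python
import math
import zipfile
from pathlib import Path
from typing import Iterable

SECONDS_PER_DAY = 24 * 60 * 60


def required_throughput(total_records: int, window_seconds: int = SECONDS_PER_DAY) -> float:
    """Average records per second needed to finish within the window."""
    return total_records / window_seconds


def output_file_count(total_records: int, max_per_file: int = 100_000) -> int:
    """Number of files needed when each file holds at most max_per_file records."""
    return math.ceil(total_records / max_per_file)


def zip_export_files(files: Iterable[str], archive_path: str) -> Path:
    """Bundle all files from one export job into a single ZIP archive (assumed format)."""
    archive = Path(archive_path)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for file_name in files:
            # Store each file under its base name so the archive has a flat layout.
            zf.write(file_name, arcname=Path(file_name).name)
    return archive
```

Under these assumptions, 22M records in 24 hours works out to roughly 255 records per second sustained, and a full export would produce 220 files of 100k records each.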

Additional information:


Generated at Fri Feb 09 00:37:21 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.