Batch Importer (Bib/Acq) (UXPROD-47)

[UXPROD-3969] Improve solution that will refresh Instances when default mapping rules have changed Created: 18/Jan/23  Updated: 05/Feb/24

Status: Open
Project: UX Product
Components: None
Affects versions: None
Fix versions: Ramsons (R2 2024)
Parent: Batch Importer (Bib/Acq)

Type: New Feature Priority: P2
Reporter: Ann-Marie Breaux (Inactive) Assignee: Ryan Taylor
Resolution: Unresolved Votes: 0
Labels: LC1, data-import, epam-folijet, loc, needs-be-estimate, possible-di-help
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Defines
defines UXPROD-47 Batch Importer (Bib/Acq) Analysis Complete
Gantt End to Start
has to be done after UXPROD-4082 Long term solution for applying mappi... In Progress
Relates
relates to FOLIO-3659 Create solution that will update MARC... Closed
relates to FOLIO-3681 Extend solution that will update Inst... Closed
relates to MODSOURCE-718 [RRT] Script to Refresh Instances fails. Closed
Release: Ramsons (R2 2024)
Epic Link: Batch Importer (Bib/Acq)
Front End Estimate: Out of scope
Back End Estimate: XXL < 30 days
Back End Estimator: Olamide Kolawole
Back-End Confidence factor: 30%
Development Team: Folijet
PO Rank: 70
Rank: Cornell (Full Sum 2021): R3
Solution Architect: Olamide Kolawole

 Description   

Review the Spitfire app that batch updates MARC Authority records based on default mapping updates: FOLIO-3659 Closed

Start with spike FOLIO-3681 Closed to assess the development and testing work, and create the required stories

This would replace script 3 on https://folio-org.atlassian.net/wiki/display/FOLIOtips/Scripts+for+Inventory%2C+Source+Record+Storage%2C+and+Data+Import+Cleanup, which works but is too time-consuming

A tenant could run the script to adjust Instances whenever the default MARC Bib-to-Inventory Instance mapping rules are updated. We could test it on Bugfest, since many instances there have data that diverges from the current Bugfest default mappings.

Approach
Create a Java app that does the update work:
1. Create a data-import job profile that matches MARC-to-MARC by the 999ff$s subfield and updates the bibliographic record. The Job/Match/Action/Mapping profiles will be hidden

  • Job Profile: Release Upgrade - Migrate MARC bibliographic records 
  • Action Profile: Release Upgrade - Migrate MARC bibliographic records 
  • Match Profile: Release Upgrade - Migrate MARC bibliographic records 
  • Mapping Profile: Release Upgrade - Migrate MARC bibliographic records

2. Load bibliographic records in pages of 50k (page size should be configurable) from SRS (GET /source-storage/records?recordType=MARC_BIB&state=ACTUAL&offset=<P>&limit=<N>)

3. Verify that the createdDate and updatedDate of the records are older than the script launch time (if new records are detected, update totalRecords)
4. Prepare an .mrc file or JSON payload (README)
5. Initialize the data-import job
6. Wait until the data-import job has finished (GET /change-manager/jobExecutions/<id> and check the job status)
7. Load the next page and repeat steps 3-6 until there are no bibliographic records left.
8. Delete the job profile created in step 1
9. Log the number of the batch that is currently in progress.
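The paging/polling loop in the steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the class and method names are invented, the endpoint paths are taken from this issue, the job-execution status values are assumptions based on typical data-import behavior, and the HTTP calls themselves are abstracted behind function parameters.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

public class RefreshDriver {

    // Builds the SRS page query from step 2 (endpoint path taken from the issue).
    static String recordsUrl(String okapiBase, int offset, int limit) {
        return okapiBase + "/source-storage/records"
                + "?recordType=MARC_BIB&state=ACTUAL"
                + "&offset=" + offset + "&limit=" + limit;
    }

    // Terminal job-execution statuses for the step-6 polling loop
    // (status names are assumptions, not confirmed by the issue).
    static boolean isTerminal(String status) {
        return status.equals("COMMITTED")
                || status.equals("ERROR")
                || status.equals("CANCELLED");
    }

    // Steps 2-7: page through SRS, run one import job per page, and stop on a
    // short or empty page. fetchPage and runImportJob stand in for the real
    // HTTP calls (GET the page; POST it to data-import and poll
    // GET /change-manager/jobExecutions/<id> until a terminal status).
    // Returns the number of batches processed.
    static int processAllPages(int pageSize,
                               IntFunction<List<String>> fetchPage,
                               Consumer<List<String>> runImportJob) {
        int offset = 0;
        int batches = 0;
        while (true) {
            List<String> page = fetchPage.apply(offset);
            if (page.isEmpty()) {
                break;
            }
            batches++;
            // Step 9: log which batch is in progress.
            System.out.println("Processing batch " + batches
                    + " (" + page.size() + " records)");
            runImportJob.accept(page); // steps 4-6
            if (page.size() < pageSize) {
                break; // short page: nothing left to load
            }
            offset += pageSize; // step 7: next page
        }
        return batches;
    }
}
```

A real implementation would wire fetchPage to the SRS endpoint and runImportJob to the change-manager endpoints, and make pageSize configurable (per the Poppy planning notes below, small chunks are preferable in production).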

Validate that a user can view the job status

  • Job status is shown on the Data Import Logs UI
  • For each record update, the user can view the SRS/mod-inventory-storage output

Documentation 

  • Instructions must be provided to Hosting providers/System administrators for using the standalone application 
    • Must consider that some libraries have already upgraded to Morning Glory weeks/months before this implementation. 
    • Include a note that this should be run off-hours
  • Release notes for Morning Glory and Nolana should be updated to include a link to the instructions 

Testing - MORE details to discuss

  • Need a story for PTF
  • Need an environment(s) to test 
    • Upgrade Lotus > Morning Glory   
      • Work with FSE? 
    • Upgrade Morning Glory > Nolana
      • Work with FSE?   

Discussed 20 Feb 2023 in context of Poppy development planning:

  • Spitfire is testing the tool for updating MARC Authority records after mapping updates on different clients
  • All blame the performance of DI. The PTF environment was able to run in 50K chunks, but that is not performant in production and led to stuck imports and errors in production environments; chunks were reduced to 20K, and 5K chunks (run sequentially) are now recommended. It is better to run in smaller chunks
  • Spitfire will include Kate and Olamide in future meetings about the Authority refresh
  • Make sure there is good documentation from Spitfire
  • Don’t implement for Bibs until it’s more performant; consider for Quesnelia

 



 Comments   
Comment by Ann-Marie Breaux (Inactive) [ 16/Oct/23 ]

How can we guarantee this will be performant?
Need mapping rules, reference data, and SRS data - direct access to SRS and Inventory storage
Stand-alone, but part of the system, without any import business logic or processing
Could this be done by an external dev or dev team?
Kateryna Senchenko and Olamide Kolawole to discuss

Generated at Fri Feb 09 00:36:14 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.