Batch Importer (Bib/Acq) (UXPROD-47)

[MODDATAIMP-840] Data Import files stuck in INITIALIZATION Created: 06/Jun/23  Updated: 28/Jul/23  Resolved: 07/Jun/23

Status: Closed
Project: mod-data-import
Components: None
Affects versions: None
Fix versions: None
Parent: Batch Importer (Bib/Acq)

Type: Bug Priority: P2
Reporter: Alissa Hafele Assignee: Kateryna Senchenko
Resolution: Cannot Reproduce Votes: 3
Labels: data-import, di-configuration, epam-folijet
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified
Environment:

Nolana Hotfix 1


Issue links:
Defines
defines UXPROD-3838 NFR: Data Import & Inventory Support ... In Review
Sprint:
Story Points: 0
Development Team: Folijet
Release: Poppy (R2 2023)
Affected Institution:
OTHER
Tester Assignee: Khalilah Gambrell
UX Lead: Khalilah Gambrell
Epic Link: Batch Importer (Bib/Acq)
RCA Group: Not a bug
Affected releases:
Nolana (R3 2022)

 Description   

Overview: Stanford is observing that when approximately 4 or more data import jobs are sent, roughly half of them disappear. There is no error in the UI, and the pattern is that they get stuck in a state of:

"status": "FILE_UPLOADED",
"uiStatus": "INITIALIZATION",

Steps to Reproduce:

  1. We are on Nolana Hotfix 1. We have built an external tool to send files to Data Import. This application sent 8 MARC files to Data Import in short succession, ranging in size from 5 to 35 records. 4 files completed loading and show in the Data Import UI as Completed. 4 files never show up in the Data Import UI. If a GET is sent to metadata-provider/jobExecutions?uistatusAny=INITIALIZATION&sortBy=completed_date,desc, we find that the "missing" files all remain with "status": "FILE_UPLOADED", "uiStatus": "INITIALIZATION".
    The load profile is very simple: no matching; create an instance and add an Instance status term and an Admin note.
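The check in step 1 can be scripted. As a minimal sketch (only the "status"/"uiStatus" field values come from this report; the `find_stuck_jobs` helper and the sample records are illustrative assumptions, not FOLIO API code):

```python
# Illustrative sketch: filter jobExecutions-style records for jobs stuck in
# INITIALIZATION. The status values are taken from this report; the helper
# and the sample data are assumptions for demonstration.

def find_stuck_jobs(job_executions):
    """Return jobs whose file was uploaded but which never left INITIALIZATION."""
    return [
        job for job in job_executions
        if job.get("status") == "FILE_UPLOADED"
        and job.get("uiStatus") == "INITIALIZATION"
    ]

# Sample records mimicking a metadata-provider/jobExecutions response body.
sample = [
    {"id": "job-1", "status": "COMMITTED", "uiStatus": "RUNNING_COMPLETE"},
    {"id": "job-2", "status": "FILE_UPLOADED", "uiStatus": "INITIALIZATION"},
]

print([job["id"] for job in find_stuck_jobs(sample)])  # ['job-2']
```

Running this against the real endpoint would simply mean feeding it the parsed jobExecutions array instead of the sample list.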

Expected Results: All 8 files load with a UI status of completed or completed with errors. 

Actual Results: 4 files load, 4 go missing

Additional Information: Note: the number 4 is not consistent; it is an approximation based on our current observations. We have also observed success when sending more than 5 files and failures when sending fewer. The overall impact on Data Import when this happens is unclear: e.g., we are sometimes able to send other data import jobs through the UI during this time, but other jobs sent from our application seem to get stuck in this same state.
URL:

Interested parties: Stanford



 Comments   
Comment by Ann-Marie Breaux (Inactive) [ 07/Jun/23 ]

Hi Alissa Hafele, thank you for writing up this bug. Does it also reproduce for you in Nolana and Orchid Bugfest? If so, it would be great if you could supply links to the Bugfest job logs for the files that imported, failed, or partially imported.

Kateryna Senchenko This was on the SUP project, but I've moved it to MODDATAIMP. I imagine it may need to move again once a bit of analysis is done. Stanford is self-hosted, so for copies of any logs, or questions about any configs, the team will need to reach out to the library directly.

Comment by Alissa Hafele [ 07/Jun/23 ]

Hi Ann-Marie Breaux, thanks for looking at this! I have not reproduced it in Bugfest. Our team here swarmed on this yesterday, and I think they have some new insight. As soon as the west coast folks are up, I'll add details to the ticket.

Comment by Ann-Marie Breaux (Inactive) [ 07/Jun/23 ]

Thank you, Alissa Hafele. It would definitely be helpful to know if it's reproducing in non-Stanford environments. I routinely upload multiple files at once and start imports, and haven't experienced losing some files. Granted, they tend to be small files and fairly simple jobs. My first thought is that there's a resourcing problem, but we'll dig into it more after any other updates/details you can provide for Stanford's experience.

cc: Kateryna Senchenko Mariia Aloshyna

Comment by Alissa Hafele [ 07/Jun/23 ]

Thanks for the attention on this Ann-Marie Breaux. And apologies for raising an alert for something that turned out to be caused by our local setup. The short version is that it had to do with how many pods we were running with local storage. After investigation by the team yesterday, they identified what was happening, and we've now noted that this is documented in the README. There was some confusion here around the different ways storage is set up across different modules, e.g. mod-data-import vs. mod-data-export-worker vs. mod-agreements.
From our end, this can be closed. Thanks again!
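The failure mode described here (multiple pods, each with its own local file store) can be illustrated with a small simulation. This is a hedged sketch of the mechanism, not FOLIO code: the `Pod` class, the status strings, and the routing are assumptions chosen to mirror the symptoms in this ticket.

```python
# Illustrative sketch: several service pods, each with its own local file
# store. A file uploaded to one pod is invisible to another pod that is
# later routed the processing request, so the job never leaves
# INITIALIZATION. Pod names and statuses here are assumptions.

class Pod:
    def __init__(self, name):
        self.name = name
        self.local_store = {}  # filename -> bytes; NOT shared between pods

    def upload(self, filename, data):
        self.local_store[filename] = data

    def process(self, filename):
        if filename in self.local_store:
            return "COMMITTED"
        # File was uploaded to a different pod: job appears "stuck".
        return "FILE_UPLOADED/INITIALIZATION"

pods = [Pod("pod-a"), Pod("pod-b")]

# Both requests land on the same pod: the job completes.
pods[0].upload("ok.mrc", b"...")
ok = pods[0].process("ok.mrc")

# Upload and processing are routed to different pods: the job sticks.
pods[0].upload("stuck.mrc", b"...")
stuck = pods[1].process("stuck.mrc")

print(ok)     # COMMITTED
print(stuck)  # FILE_UPLOADED/INITIALIZATION
```

With a load balancer spreading requests across pods, whether a given job completes depends on which pod each request happens to hit, which matches the intermittent, roughly-half failure rate Stanford observed; shared storage (or a single pod) removes the mismatch.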

 

Comment by Ann-Marie Breaux (Inactive) [ 07/Jun/23 ]

Hi Alissa Hafele You made my day - thank you!! Closing this issue

Generated at Thu Feb 08 22:23:24 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.