Question/Assumption | Response
How do we validate that other systems (such as check-out) don't slow down during these larger batched loads? | The scope of this solution covers only the initial stage of the Data Import process. It will not affect the number of MARC records sent to DI Kafka topics, and it will not change how MARC records are processed and converted into FOLIO entities.
Will this make it quicker to process large files (versus customers breaking apart their large files and batching on their own)? | It will improve the user experience because users no longer have to split files manually beforehand. However, the time needed to parse MARC records and convert them to inventory instances, authorities, etc., will remain the same.
Will batching make it quicker to process all jobs? | The overall time needed to parse and convert all records from all simultaneously running DI jobs will remain almost the same, but processing will be more evenly distributed among the DI jobs.
Will there be less overall impact on the system, or will batching stress the system in other ways? | The impact will remain the same.
When something goes wrong with part of the batch, or a chunk, how will users manage the "mess" and resulting errors? | All DI jobs created for chunk files will be linked to the parent DI job. The chunk file name will consist of the original file name plus the chunk number, which makes it possible to identify the original file a particular chunk belongs to. It also makes sense to add an operation/button to download a chunk file. This way, the user can download a chunk file with errors, fix it, and restart processing of only that chunk instead of the complete source file.
How will we provide visibility into the progress of the overall job? Do we need to update the card that appears when a job is started? | Because all DI jobs for chunk files are linked to their parent DI job, we can sum the progress of all "child" DI jobs and represent it on the "parent" DI job card.
How will we provide visibility into where a job is in the queue? | The DI job queue will not use simple FIFO logic; selecting the next DI job will be fairly complex and will depend on a set of different factors. We can therefore discuss possible options for showing the positions of DI jobs in the queue.
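
The chunk naming and the parent/child progress roll-up described in the responses above could look roughly like the sketch below. It is only an illustration, not the actual Data Import implementation: the class names `ParentJob` and `ChunkJob`, the helper `chunk_file_name`, and the underscore separator are all assumptions. The responses only state that a chunk name is the original file name plus the chunk number and that the progress of child jobs can be summed.

```python
from dataclasses import dataclass, field


def chunk_file_name(original_name: str, chunk_number: int) -> str:
    # Assumption: the chunk number is appended after an underscore.
    # The source only says "original file name + chunk number".
    return f"{original_name}_{chunk_number}"


@dataclass
class ChunkJob:
    """Hypothetical child DI job created for one chunk file."""
    file_name: str
    total: int          # records in this chunk
    processed: int = 0  # records processed so far


@dataclass
class ParentJob:
    """Hypothetical parent DI job that all chunk jobs link back to."""
    file_name: str
    children: list = field(default_factory=list)

    def add_chunk(self, chunk_number: int, total: int) -> ChunkJob:
        # Link the new chunk job to this parent by keeping it in `children`.
        job = ChunkJob(chunk_file_name(self.file_name, chunk_number), total)
        self.children.append(job)
        return job

    def progress(self) -> tuple:
        # Sum the progress of all child jobs for display on the parent card.
        processed = sum(child.processed for child in self.children)
        total = sum(child.total for child in self.children)
        return processed, total


parent = ParentJob("records.mrc")
first = parent.add_chunk(1, total=1000)
second = parent.add_chunk(2, total=500)
first.processed = 1000
second.processed = 250
print(parent.progress())  # (1250, 1500)
```

Because each chunk job carries its own file name and counters, a failed chunk can be identified and restarted on its own while the parent card keeps showing the combined total.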