...
The existing approach, in which the user uploads a source file directly to the Data Import app, will be removed. Eliminating it will make the mod-data-import module stateless and allow the module to scale horizontally (Stateless, Horizontal scaling, and High Availability), making it HA-compliant.
The second improvement is to implement slicing logic for large data import files in the Data Import application as well.
...
- The max chunk file size or the max number of source records in the chunk file must be configurable.
- Records would need to be chunked and named based on the sequential order of the records in the original file, e.g. records 1-1000 in chunk file_1, records 1001-2000 in chunk file_2, etc.
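The chunking rules above can be sketched as follows. This is a minimal illustration only; the names (`chunk_records`, `MAX_RECORDS_PER_CHUNK`) are hypothetical and not part of the actual mod-data-import code base.

```python
# Illustrative sketch of the slicing requirement: split source records into
# sequentially numbered chunk files of a configurable size, preserving the
# original record order. Names here are hypothetical, not the real API.
from typing import Iterable, Iterator

MAX_RECORDS_PER_CHUNK = 1000  # must be configurable per the requirement

def chunk_records(records: Iterable[str],
                  max_per_chunk: int = MAX_RECORDS_PER_CHUNK) -> Iterator[tuple[str, list[str]]]:
    """Yield (chunk_file_name, records) pairs in original record order."""
    batch: list[str] = []
    chunk_no = 1
    for record in records:
        batch.append(record)
        if len(batch) == max_per_chunk:
            yield f"file_{chunk_no}", batch
            chunk_no += 1
            batch = []
    if batch:  # trailing partial chunk
        yield f"file_{chunk_no}", batch
```

With the default size of 1000, records 1-1000 land in `file_1`, records 1001-2000 in `file_2`, and so on, matching the naming requirement above.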
Non-functional requirements
- The implementation must be decoupled from the mod-data-import main code base and simple enough that backporting it to previous releases costs at most half (in person-days) of the original development effort. TBD: define the list of releases for backporting.
- The usage of the S3-like storage must not be vendor-locked and must support different types of storage (AWS S3, MinIO).
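One common way to satisfy the vendor-lock requirement is to hide the storage behind a minimal interface. The sketch below is purely illustrative (the `ObjectStorage` protocol and `InMemoryStorage` class are hypothetical names, not the module's API); in practice AWS S3 and MinIO both expose the S3 API, so a real implementation may only need a configurable endpoint URL.

```python
# Hypothetical storage abstraction: decouples the module from a concrete
# S3 vendor. An S3/MinIO-backed class would implement the same protocol;
# the in-memory class below stands in as a test double.
from typing import Protocol

class ObjectStorage(Protocol):
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...

class InMemoryStorage:
    """Test double standing in for an S3/MinIO-backed implementation."""
    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]
```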
Assumptions
- Garbage collection (removing already processed files and chunk files) is out of the scope of the feature. It can be done by configuring appropriate retention policies on S3-like storage.
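The retention approach assumed above can be expressed as an S3 lifecycle rule. The prefix and expiry period below are illustrative values, not settings prescribed by this design:

```json
{
  "Rules": [
    {
      "ID": "expire-processed-import-files",
      "Filter": { "Prefix": "data-import/" },
      "Status": "Enabled",
      "Expiration": { "Days": 7 }
    }
  ]
}
```

MinIO supports the same lifecycle configuration format (e.g. via `mc ilm`), so the approach works for either storage type.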
Implementation
The solution will be implemented as a part of the mod-data-import and ui-data-import modules.
...
Uploading to S3-like storage directly from a FOLIO UI application can be implemented following this guide: https://aws.amazon.com/blogs/compute/uploading-to-amazon-s3-directly-from-a-web-or-mobile-application/. The initial call to acquire the uploadURL must be made by the back-end mod-data-import module.
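The core idea from the guide above is that the back end signs the upload target with a secret the browser never sees, and the UI then PUTs the file straight to storage. The sketch below illustrates that shape with a simplified HMAC scheme; it is not real S3 SigV4 presigning (a real deployment would use the storage SDK's presigner), and the endpoint and function names are assumptions.

```python
# Simplified illustration of presigned-URL upload: the back end issues a
# signed, expiring URL; the storage side can verify it without any state.
# This HMAC scheme is illustrative only, not AWS SigV4.
import hashlib
import hmac
import time

SECRET_KEY = b"server-side-secret"                # never shipped to the UI
STORAGE_ENDPOINT = "https://storage.example.org"  # hypothetical endpoint

def generate_upload_url(bucket: str, key: str, expires_in: int = 900) -> str:
    """Return a URL the UI can PUT the source file to, valid until expiry."""
    expires_at = int(time.time()) + expires_in
    payload = f"PUT\n{bucket}\n{key}\n{expires_at}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return (f"{STORAGE_ENDPOINT}/{bucket}/{key}"
            f"?expires={expires_at}&signature={signature}")

def verify_upload_url(bucket: str, key: str, expires: int, signature: str) -> bool:
    """Storage-side check: signature matches and the URL has not expired."""
    payload = f"PUT\n{bucket}\n{key}\n{expires}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature) and time.time() < expires
```

Because the URL is self-describing and verifiable, the module issuing it keeps no upload state, which is what allows mod-data-import to stay stateless.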
The diagram below shows the direct upload flow in detail.
Simultaneous launch of a large number of Data Import Jobs
...