...
- The maximum chunk file size, or the maximum number of source records per chunk file, must be configurable at the tenant level.
- Records must be chunked, and the chunk files named, according to the sequential order of the records in the original file, e.g. records 1-1000 in chunk file_1, records 1001-2000 in chunk file_2, etc.
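The order-preserving naming scheme above can be sketched as follows. This is a minimal illustration, not the actual mod-data-import code; the `chunk_records` helper and the `file_N` naming are taken from the example in the requirement, with the chunk size passed in as the tenant-level setting.

```python
def chunk_records(records, chunk_size):
    """Split records into consecutively numbered chunks, preserving the
    original order: records 1..chunk_size go to file_1, the next
    chunk_size records to file_2, and so on."""
    chunks = {}
    for start in range(0, len(records), chunk_size):
        chunk_no = start // chunk_size + 1  # 1-based chunk numbering
        chunks[f"file_{chunk_no}"] = records[start:start + chunk_size]
    return chunks
```

With a chunk size of 1000, a 2500-record file yields `file_1` (records 1-1000), `file_2` (1001-2000), and `file_3` (2001-2500).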
...
- The implementation must be decoupled from the mod-data-import main code base and simple enough that backporting it to previous releases costs at most half (in man-days) of the original development effort. TBD: define the list of releases for backporting.
- An alternative option is to develop in the Nolana/Orchid codebase and forward-port the changes to the development branch.
- The use of S3-like storage must not be vendor-locked and must support multiple storage providers (e.g. AWS S3, MinIO).
...
- Garbage collection (removing already processed files and chunk files) is out of scope for this feature. It can be achieved by configuring appropriate retention policies on the S3-like storage.
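As an illustration of such a retention policy, the sketch below is an S3 lifecycle configuration that expires objects after a fixed period. The prefix and the 30-day retention period are illustrative assumptions, not values defined by this feature; MinIO supports the same lifecycle configuration format.

```json
{
  "Rules": [
    {
      "ID": "expire-processed-chunks",
      "Filter": { "Prefix": "data-import/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    }
  ]
}
```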
- Every tenant will have its own dedicated S3-like storage area.
Implementation
The solution will be implemented as a part of the mod-data-import and ui-data-import modules.
...
Uploading to S3-like storage directly from a FOLIO UI application can be implemented following this guide: https://aws.amazon.com/blogs/compute/uploading-to-amazon-s3-directly-from-a-web-or-mobile-application/. The initial call to acquire the uploadURL must be made by the back-end mod-data-import module.
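To illustrate what the back-end hands to the UI, here is a stdlib-only sketch of generating a Signature Version 4 presigned PUT URL, which works against any S3-compatible endpoint (AWS S3 or MinIO). The endpoint, bucket, and key names are hypothetical; in practice mod-data-import would use an SDK presigner rather than hand-rolled signing.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote

def _hmac_sha256(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def presigned_put_url(endpoint, bucket, key, access_key, secret_key,
                      region="us-east-1", expires=900):
    """Build a SigV4 presigned PUT URL the browser can upload to directly."""
    now = datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    host = endpoint.split("//", 1)[1]
    scope = f"{datestamp}/{region}/s3/aws4_request"

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    query = "&".join(f"{quote(k, safe='')}={quote(v, safe='')}"
                     for k, v in sorted(params.items()))

    # Canonical request: method, path, query, headers, signed headers, payload
    canonical_request = "\n".join([
        "PUT",
        f"/{bucket}/{quote(key, safe='/')}",
        query,
        f"host:{host}\n",
        "host",
        "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])
    # Derive the signing key: date -> region -> service -> "aws4_request"
    signing_key = _hmac_sha256(
        _hmac_sha256(
            _hmac_sha256(_hmac_sha256(("AWS4" + secret_key).encode(), datestamp),
                         region),
            "s3"),
        "aws4_request")
    signature = hmac.new(signing_key, string_to_sign.encode(),
                         hashlib.sha256).hexdigest()
    return f"{endpoint}/{bucket}/{key}?{query}&X-Amz-Signature={signature}"
```

Because the endpoint is a parameter rather than a hard-coded AWS hostname, the same flow satisfies the "no vendor lock" requirement above.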
The diagram below represents the direct upload flow in detail.
Simultaneous launch of a large number of Data Import Jobs (9)
...