Central File/Object Storage (WIP)
Overview
Several FOLIO apps need to store files. Rather than having each app implement its own solution, it probably makes sense to provide a centralized file storage facility. This page outlines the problem and possible designs.
Reader Beware
This page was originally created as a place to capture some ideas. Much of it was stream of consciousness. I never finished my thoughts here, but the page remains in case I ever revisit the idea. In short, this is half-baked and not completely thought through.
Requirements
- Must be able to support files of varied sizes, including those which are quite large
- Must be able to upload and persist text and binary files with any MIME type and/or extension
- Must allow files to be retrieved
- Must store file metadata (size, type, date uploaded, etc.)
- Must be able to list/search for files without actually retrieving file contents (return metadata only)
- Must segregate files by tenant
Nice-To-Haves
- Should support multiple underlying storage technologies
- Should be able to change some metadata (e.g. filename), but file content should be immutable
- Should allow for segregation of files by app/domain/timebox/etc.
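The "mutable metadata, immutable content" rule above could be enforced roughly as follows. This is only a sketch: the set of mutable fields (`name` and `domain`) and the function name `apply_update` are assumptions, since the page does not say which fields may change.

```python
# Assumption: only these fields may be changed after creation.
MUTABLE_FIELDS = {"name", "domain"}


def apply_update(existing: dict, changes: dict) -> dict:
    """Return a copy of `existing` with the permitted changes applied.

    Raises ValueError if `changes` tries to alter any other field.
    Fields that are present but unchanged are accepted, so a client can
    send back the full record as it would with a PUT.
    """
    rejected = sorted(
        key for key, value in changes.items()
        if key not in MUTABLE_FIELDS and existing.get(key) != value
    )
    if rejected:
        raise ValueError(f"fields are immutable: {', '.join(rejected)}")
    updated = dict(existing)
    updated.update({k: v for k, v in changes.items() if k in MUTABLE_FIELDS})
    return updated


stored = {"id": "1a220b67-7ddf-4b33-b9d4-5ce6157134e3",
          "name": "inv20190702-13.pdf", "type": "application/pdf", "size": 5123680}
renamed = apply_update(stored, {"name": "invoice-2019-07-02.pdf"})
print(renamed["name"])
```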
Schemas
file_metadata
Property | Type | Required | Default | Description | Notes/Example |
---|---|---|---|---|---|
id | string | No | <system generated> | UUID of the file metadata record | e.g. 1a220b67-7ddf-4b33-b9d4-5ce6157134e3 |
name | string | Yes | | Filename to associate with the file | e.g. inv20190702-13.pdf |
size | number | No | <system calculated> | Size of the file in bytes | e.g. 5123680 |
type | string | Yes | | MIME type of the file | e.g. application/pdf |
metadata | metadata | No | <system generated> | Standard record metadata | created by, creation date, updated by, updated date, etc. |
uri | string | No | <system generated> | URI pointing to the file | e.g. s3://diku.file-storage.us-east-1/invoices/1a220b67-7ddf-4b33-b9d4-5ce6157134e3 |
domain | string | No | | Optional domain used to group files | e.g. invoices |
storageType | string | No | TBD (Postgres?) | Optionally specify the type of storage to use | must be one of the enabled, supported storage types |
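To make the table concrete, here is a minimal sketch of a `file_metadata` record as a Python dataclass. The class name, the helper `to_json`, and the choice to generate the `id` client-side are illustrative assumptions; the standard `metadata` block (created by, creation date, etc.) is left out because it is filled in by the system.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class FileMetadata:
    # Required properties.
    name: str
    type: str  # MIME type, e.g. "application/pdf"
    # System-generated or optional properties.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    size: Optional[int] = None         # bytes; known only once content exists
    uri: Optional[str] = None          # assigned by the storage layer
    domain: Optional[str] = None       # optional grouping, e.g. "invoices"
    storageType: Optional[str] = None  # the default is still TBD in this design

    def to_json(self) -> str:
        # Omit unset properties so the payload carries only known values.
        return json.dumps({k: v for k, v in asdict(self).items() if v is not None})


record = FileMetadata(name="inv20190702-13.pdf", type="application/pdf", domain="invoices")
print(record.to_json())
```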
file_collection
Property | Type | Required | Default | Description | Notes/Example |
---|---|---|---|---|---|
files | array<file_metadata> | Yes | [ ] | Collection of file_metadata | |
storage_type
Property | Type | Required | Default | Description | Notes/Example |
---|---|---|---|---|---|
name | string | Yes | | Name for this type of storage | Must be unique |
TBD |
storage_type_collection
Property | Type | Required | Default | Description | Notes/Example |
---|---|---|---|---|---|
storageTypes | array<storage_type> | Yes | [ ] | Collection of storage_type | |
Storage Layer
A new module named mod-file-storage is introduced.
API
Method | Endpoint | Request | Response | Description | Notes |
---|---|---|---|---|---|
POST | /file-storage/files | file_metadata | file_metadata | Create a file metadata record | |
GET | /file-storage/files | CQL query | file_collection | Search/list file metadata records | |
GET | /file-storage/files/<id> | NA | file_metadata | Get a particular file metadata record | |
PUT | /file-storage/files/<id> | file_metadata | file_metadata | Update a file metadata record | Only certain fields would be allowed to be updated |
DELETE | /file-storage/files/<id> | NA | 204 | Delete file metadata and content | |
POST | /file-storage/files/<id>/contents | <MIME type from file_metadata> | 201 | Upload file content | Can be binary or text. Data is stored in the configured repository |
GET | /file-storage/files/<id>/contents | NA | <MIME type from file_metadata> | Get the contents of a particular file | Response can be binary or text. Data is retrieved from the configured repository |
GET | /file-storage/storage-types | NA | storage_type_collection | List enabled, supported storage types for your tenant | |
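The endpoints above imply a two-step upload: create the metadata record, then POST the content to its `/contents` sub-resource. The sketch below only builds the requests (nothing is sent) using Python's standard library. The base URL, the function names, and the `query` parameter name for CQL are assumptions; `X-Okapi-Tenant` and `X-Okapi-Token` are the usual Okapi headers.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9130"  # assumption: a local Okapi gateway


def okapi_headers(tenant: str, token: str) -> dict:
    return {"X-Okapi-Tenant": tenant, "X-Okapi-Token": token}


def create_metadata_request(tenant, token, name, mime_type, domain=None):
    """Step 1: POST /file-storage/files with a file_metadata body."""
    body = {"name": name, "type": mime_type}
    if domain is not None:
        body["domain"] = domain
    headers = okapi_headers(tenant, token)
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/file-storage/files",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def upload_contents_request(tenant, token, file_id, mime_type, content: bytes):
    """Step 2: POST the raw bytes, typed with the MIME type from file_metadata."""
    headers = okapi_headers(tenant, token)
    headers["Content-Type"] = mime_type
    return urllib.request.Request(
        f"{BASE_URL}/file-storage/files/{file_id}/contents",
        data=content,
        headers=headers,
        method="POST",
    )


def search_request(tenant, token, cql: str):
    """List metadata only; the CQL goes in a (hypothetical) `query` parameter."""
    query = urllib.parse.urlencode({"query": cql})
    return urllib.request.Request(
        f"{BASE_URL}/file-storage/files?{query}",
        headers=okapi_headers(tenant, token),
        method="GET",
    )


req = create_metadata_request("diku", "<token>", "inv20190702-13.pdf",
                              "application/pdf", domain="invoices")
print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send it; the `id` returned by step 1 is what step 2 needs as `file_id`.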
Storage
Multiple underlying storage technologies would be supported.
- Files could potentially be retrieved by the client directly via the URI in the file_metadata, but this is subject to any access controls imposed by the underlying storage.
- Files can also be retrieved via the storage module.
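One way to support several storage technologies is a small backend interface that the module dispatches to by `storageType`. The following is a sketch under assumptions: `StorageBackend`, `InMemoryBackend`, the `memory://` URI scheme, and the registry functions are all hypothetical names, and the in-memory backend stands in for a real one such as Postgres or S3.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical interface each storage technology would implement."""

    name: str

    @abstractmethod
    def put(self, tenant: str, file_id: str, content: bytes) -> str:
        """Persist content and return a URI for the file_metadata record."""

    @abstractmethod
    def get(self, tenant: str, file_id: str) -> bytes:
        """Return the stored content."""


class InMemoryBackend(StorageBackend):
    name = "memory"

    def __init__(self):
        self._blobs = {}

    def put(self, tenant, file_id, content):
        key = (tenant, file_id)  # keying by tenant keeps tenants segregated
        if key in self._blobs:
            # File content is immutable once written.
            raise FileExistsError(f"content already stored for {file_id}")
        self._blobs[key] = bytes(content)
        return f"memory://{tenant}/{file_id}"

    def get(self, tenant, file_id):
        return self._blobs[(tenant, file_id)]


BACKENDS = {}


def register(backend: StorageBackend) -> None:
    BACKENDS[backend.name] = backend


def backend_for(storage_type: str) -> StorageBackend:
    try:
        return BACKENDS[storage_type]
    except KeyError:
        raise ValueError(f"storage type not enabled: {storage_type}") from None


register(InMemoryBackend())
uri = backend_for("memory").put("diku", "1a220b67-7ddf-4b33-b9d4-5ce6157134e3", b"%PDF-1.7")
print(uri)
```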
Configuration
- Parameters passed to the _tenant API indicate which storage type to use and provide its configuration (e.g. connection details)
- How secrets are stored is TBD; perhaps we can leverage AWS Systems Manager Parameter Store or HashiCorp Vault
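As an illustration of the first point, the tenant API accepts a `parameters` array of key/value pairs, which the module could turn into a storage configuration. The parameter keys (`storageType`, `storage.*`), the supported set, and the function name below are assumptions, not a settled design; secrets are deliberately absent since their handling is TBD.

```python
# Assumption: the storage types this deployment has enabled.
SUPPORTED_STORAGE_TYPES = {"postgres", "s3"}


def parse_storage_config(tenant_attributes: dict) -> dict:
    """Extract storage settings from a tenant API request body."""
    params = {
        p["key"]: p.get("value")
        for p in tenant_attributes.get("parameters", [])
    }
    storage_type = params.get("storageType")
    if storage_type is None:
        # The default storage type is still TBD, so require it explicitly here.
        raise ValueError("storageType parameter is required")
    if storage_type not in SUPPORTED_STORAGE_TYPES:
        raise ValueError(f"unsupported storage type: {storage_type}")
    prefix = "storage."
    settings = {k[len(prefix):]: v for k, v in params.items() if k.startswith(prefix)}
    return {"storageType": storage_type, "settings": settings}


config = parse_storage_config({
    "module_to": "mod-file-storage-1.0.0",
    "parameters": [
        {"key": "storageType", "value": "s3"},
        {"key": "storage.bucket", "value": "diku.file-storage.us-east-1"},
    ],
})
print(config)
```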
JIRA
A convenient place for links to related JIRA features/stories/etc.
- TBD
Open Issues
- Need to add details of the API
- Need to add details of how the underlying storage is provisioned/configured