File storage options

The agreements and licenses apps allow files to be uploaded as part of their "Supplementary documents" and, for licenses only, "Core documents" functionality. A document does not require a file upload, as it can instead refer to a physical document or to an online document by URL. When a file is uploaded as part of a supplementary or core document record, it is automatically stored in one of two ways, depending on how the application has been configured. The settings that control where uploaded files are stored are under Settings → Agreements → App settings → File storage and Settings → Licenses → App settings → File storage. The agreements and licenses settings are separate, so the two modules can be configured to store files differently.

Viewing and editing the file storage settings requires the appropriate permission in the relevant application:

  • Settings (Agreements): Can view and edit app settings

  • Settings (Licenses): Can view and edit app settings

The primary setting, which determines the overall method used for file storage, is the "Storage engine". There are currently two valid values for the Storage engine setting:

  • LOB

  • S3

Using the LOB storage engine

If the Storage engine setting is "LOB" then any uploaded files will be stored as Binary Large Objects (BLOBs) in the database being used by the module for data storage. 

When the Storage engine setting is "LOB", the remaining File storage settings have no effect.

NB If the LOB storage engine is used with PostgreSQL in a multi-tenant environment, all files are stored in a single schema that is not differentiated by tenant. This means, for example, that it is not easy to back up and restore the files for a single tenant.

Using the S3 storage engine

If the Storage engine setting is "S3" then any uploaded files will be stored in S3 compatible storage as specified by the remaining File storage settings. The purpose of each setting is described in the following table. Default values for these settings can be provided from environment variables, and the environment variable names are also given in the table.

| Setting | Explanation | Environment variable for default value |
| --- | --- | --- |
| S3 access key | The access key and secret key are used to authenticate access to the relevant S3 storage. See the S3 Access key documentation for more information. | KIWT_FILESTORE_AWS_ACCESS_KEY_ID |
| S3 secret key | The access key and secret key are used to authenticate access to the relevant S3 storage. See the S3 Access key documentation for more information. | KIWT_FILESTORE_AWS_SECRET |
| S3 bucket name | A bucket is a container for objects stored in S3. The name of the bucket in which uploaded files should be stored must be entered here. See the S3 Bucket documentation for more information. | KIWT_FILESTORE_AWS_BUCKET |
| S3 bucket region | Each bucket has a region (the geographical location of the underlying storage), which must be entered here. The bucket named in the S3 bucket name setting must exist in this region. See the S3 Bucket documentation for more information. | KIWT_FILESTORE_AWS_REGION |
| S3 endpoint | The S3 endpoint is the API endpoint provided by the relevant S3 service. This will be available from the provider of the S3 compatible service being used. | KIWT_FILESTORE_AWS_URL |
| S3 object prefix | The S3 object prefix is a way of organizing file objects stored in a single S3 bucket. See the S3 object prefix documentation for more information. | Not defined by an environment variable; defaults to <tenant>/<module-name>/ |
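As an illustration of the environment variable defaults listed in the table, the following Python sketch reports which of those variables are unset in a given environment. The variable names come from the table. The function name, the dictionary name and the idea of checking them from a separate script are assumptions made for this example and are not part of the modules themselves.

```python
import os
from typing import Mapping

# Environment variable names as listed in the table of S3 settings.
# The S3 object prefix is not included because it has no environment variable.
S3_ENV_DEFAULTS = {
    "S3 access key": "KIWT_FILESTORE_AWS_ACCESS_KEY_ID",
    "S3 secret key": "KIWT_FILESTORE_AWS_SECRET",
    "S3 bucket name": "KIWT_FILESTORE_AWS_BUCKET",
    "S3 bucket region": "KIWT_FILESTORE_AWS_REGION",
    "S3 endpoint": "KIWT_FILESTORE_AWS_URL",
}


def missing_s3_defaults(env: Mapping[str, str]) -> list[str]:
    """Return the setting labels whose environment variable is unset or empty.

    Hypothetical helper for this example only.
    """
    return [label for label, var in S3_ENV_DEFAULTS.items() if not env.get(var)]


if __name__ == "__main__":
    # Report any setting that would have no default in the current environment.
    for label in missing_s3_defaults(os.environ):
        print(f"No default provided for: {label} ({S3_ENV_DEFAULTS[label]})")
```

A setting reported here is not necessarily misconfigured, since a value can still be entered directly in the File storage settings.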



Switching storage engine

Only one storage engine can be used at a time, but it is possible to switch from LOB to S3 or from S3 to LOB. Changing the storage engine in the settings only affects future uploads: files that have already been stored stay where they are, and any newly uploaded files are stored using the currently specified storage engine. When switching from the LOB storage engine (which was the default and only available option in earlier versions of the application) to S3, a migration job can be triggered to move all existing files from the database storage to the specified S3 storage. There is no reciprocal migration job from S3 to LOB, because it cannot be guaranteed that a file stored in S3 can be stored in a database as a BLOB (see the information on maximum file sizes below). So while it is possible to move from S3 to LOB, it is not recommended.

The migration job to move files from LOB to S3 must only be run once all the S3 settings have been correctly set and the Storage engine setting has been set to S3.

  • For agreements, the migration job is run by a GET request to the endpoint: /erm/admin/triggerDocMigration

  • For licenses, the migration job is run by a GET request to the endpoint: /licenses/admin/triggerDocMigration

In both cases the job checks for any files stored in LOB storage within the relevant module (agreements or licenses). For each file found, it uploads a copy to the configured S3 compatible service and removes the copy from the database storage.
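The GET requests described above could be issued from a short script such as the Python sketch below. Only the two endpoint paths are taken from this page. The base URL, the use of the X-Okapi-Tenant and X-Okapi-Token headers for tenant selection and authentication, and the function names are assumptions for illustration, and will depend on how the installation is deployed.

```python
import urllib.request

# Migration endpoint paths as documented on this page.
MIGRATION_PATHS = {
    "agreements": "/erm/admin/triggerDocMigration",
    "licenses": "/licenses/admin/triggerDocMigration",
}


def build_migration_request(base_url, module, tenant, token):
    """Build (but do not send) the GET request that triggers the migration.

    base_url, tenant and token are placeholders supplied by the caller;
    the header names are an assumption about the gateway in use.
    """
    url = base_url.rstrip("/") + MIGRATION_PATHS[module]
    headers = {"X-Okapi-Tenant": tenant, "X-Okapi-Token": token}
    return urllib.request.Request(url, headers=headers, method="GET")


def trigger_migration(base_url, module, tenant, token, timeout=60):
    """Send the request and return the HTTP status code of the response."""
    request = build_migration_request(base_url, module, tenant, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `trigger_migration("https://folio.example.org", "licenses", "diku", token)` would request the licenses migration, where the host name and tenant are made-up values. The request should only be sent after the S3 settings are in place and the Storage engine setting is S3.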

Maximum file sizes

Typically a BLOB is limited to 4 GB in most database engines (including PostgreSQL, which is commonly used for FOLIO installations). In theory, S3 has a file size limit of 50 TiB. In both cases the maximum storage available to the module is limited not by the module but by the amount of disk space made available.

However, in all cases the practical limits to file storage are likely to be dictated by other factors. Experience indicates that attempting to upload a file larger than 200 MB via the API is likely to fail because of the time the upload takes and the default timeouts for HTTP connections.
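A client that uploads files through the API could check the file size before attempting the upload. The Python sketch below shows one way to do this. It assumes the practical limit mentioned above means 200 × 10^6 bytes, and the constant and function names are hypothetical. The real threshold depends on the timeouts configured in a given installation.

```python
import os

# Assumption: the practical limit is read as 200 * 10^6 bytes.
# Adjust this to match the timeouts of the installation in question.
ASSUMED_PRACTICAL_LIMIT_BYTES = 200 * 1000 * 1000


def exceeds_practical_limit(size_bytes, limit_bytes=ASSUMED_PRACTICAL_LIMIT_BYTES):
    """Return True if a file of this size is larger than the assumed limit."""
    return size_bytes > limit_bytes


def file_exceeds_practical_limit(path):
    """Check a file on disk against the assumed practical upload limit."""
    return exceeds_practical_limit(os.path.getsize(path))
```

A caller could use `file_exceeds_practical_limit(path)` to warn the user before starting an upload that is likely to time out.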