Item/instance database performance suggestions
High-level suggestions
Partitioning
A table can be partitioned along one or more columns using a variety of strategies. A good rule of thumb is to target approximately 1-2M rows per partition, with rows distributed roughly evenly across partitions. For evenly partitioning a table with randomly distributed keys, the ideal strategy is hash partitioning.
Hash partitioning is implemented in terms of a modulus and a remainder. A simple scalable strategy for managing partitions is to base the modulus on a power of 2 and create a partition for each remainder value less than the modulus.
This allows reasonable scaling, potentially up to billions of rows, and simplifies the process of scaling the number of partitions up or down. Ideally, there would be an independent mechanism for configuring the exponent for each tenant and possibly each table.
For example, in the trivial case the exponent for an instance table is 0, giving a modulus of 1, which means a single partition with a remainder of 0.
Table creation
The DDL for the `instance` table will be consistent regardless of partition count and might look like:
    create table instance (
        id uuid not null primary key,
        jsonb jsonb not null,
        creation_date timestamp,
        created_by text,
        instancestatusid uuid constraint instancestatusid_instance_status_fkey references instance_status,
        modeofissuanceid uuid constraint modeofissuanceid_mode_of_issuance_fkey references mode_of_issuance,
        instancetypeid uuid constraint instancetypeid_instance_type_fkey references instance_type
    ) partition by hash (id);
As a convention, the name of a partition might be of the form `<parent_table_name>_<modulus>_<remainder>`.
For the trivial case of modulus 1, the partition would look like:

    create table instance_1_0 partition of instance for values with (modulus 1, remainder 0);
In the sample dataset, there are a number of tables that would benefit from partitioning. For example:
- item has 23,901,856 rows
- instance has 22,618,027 rows
- holdings_record has 23,901,856 rows
Based on a target of 1-2M rows per partition and a power-of-2 partitioning strategy, the ideal number of partitions for each table would be based on an exponent of 4, yielding 16 partitions. This would result in ~1.4M rows per partition for all three tables.
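Creating all 16 partitions is mechanical enough to script; a minimal sketch, assuming the `instance` DDL above:

```sql
-- create instance_16_0 .. instance_16_15 (exponent 4, modulus 16)
do $$
begin
    for r in 0..15 loop
        execute format(
            'create table instance_16_%1$s partition of instance '
            || 'for values with (modulus 16, remainder %1$s)', r);
    end loop;
end
$$;
```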
Presumably, these 3 tables are joined together frequently, so it makes sense to partition along the same value in order to allow the planner to pick the correct partitions on common joins. Unfortunately, there is no value common to all three tables.
As holdings_record and item both derive from instance, it may make sense to use instance.id as the partitioning key. For holdings_record, this means partitioning by holdings_record.instanceid. In order to partition item, it would make sense to create a new (redundant) foreign key column to the instance table and partition on that.
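A sketch of that layout, assuming holdings_record keeps its existing instanceid column; the new item column is hypothetical:

```sql
-- the partition key must be part of the primary key
create table holdings_record (
    id uuid not null,
    instanceid uuid not null references instance,
    jsonb jsonb not null,
    primary key (id, instanceid)
) partition by hash (instanceid);

-- hypothetical redundant column so item can be partitioned the same way
alter table item add column instanceid uuid references instance;
```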
Indexes and constraints
Any indexes and constraints created on the parent table will automatically be applied to the partitions, so they can be created as normal. Re-indexing or analyzing is handled automatically.
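For example, a single statement on the parent creates a matching index on every partition (the index name is taken from the existing index list; the expression is illustrative):

```sql
create index instance_metadata_updateddate_idx
    on instance ((jsonb -> 'metadata' ->> 'updatedDate'));
```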
Resizing
If a partition grows beyond a reasonable size, it is possible to resize it up to the next exponent. For example, going from modulus 1 to modulus 2:

    alter table instance detach partition instance_1_0;
    create table instance_2_0 partition of instance for values with (modulus 2, remainder 0);
    create table instance_2_1 partition of instance for values with (modulus 2, remainder 1);
    insert into instance select * from instance_1_0;
    drop table instance_1_0;
Resizing is potentially complicated by the existence of foreign keys pointing to the id. Ideally, those tables would also be partitioned by the same criteria. In that case, the corresponding partitions on the dependent table should be detached before those of the instance table. In the event that the dependent tables are not partitioned, the corresponding constraints would need to be dropped and re-created after the fact.
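For example, with holdings_record partitioned by instanceid as sketched above:

```sql
-- detach the dependent partition first, then the referenced one
alter table holdings_record detach partition holdings_record_1_0;
alter table instance detach partition instance_1_0;
```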
Ideally, tables with those dependent relationships would also be partitioned based on the same exponent, as the planner will only parallelize matching partition-wise joins in PostgreSQL 12. This is improved in PostgreSQL 13.
Partitions could theoretically be migrated one by one if each dependent table only has a single foreign key pointing at the parent table. In the case of multiple foreign keys, they would presumably point to different partitions.
Unfortunately, this appears to be the case with the instance_relationship and preceding_succeeding_title tables in relation to instance.
Partitions can themselves be further partitioned, so this might be solved by, for example, partitioning instance_relationship on superinstanceid then on subinstanceid. I'm not sure how the planner would behave in this case.
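A sketch of that nested layout (partition counts and columns illustrative):

```sql
-- outer level hashes on superinstanceid
create table instance_relationship (
    id uuid not null,
    superinstanceid uuid not null,
    subinstanceid uuid not null,
    jsonb jsonb not null
) partition by hash (superinstanceid);

-- each outer partition is itself partitioned on subinstanceid
create table instance_relationship_1_0
    partition of instance_relationship
    for values with (modulus 1, remainder 0)
    partition by hash (subinstanceid);

create table instance_relationship_1_0_1_0
    partition of instance_relationship_1_0
    for values with (modulus 1, remainder 0);
```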
Partition Formula
For partitioning schemes based on hashed UUIDs, the destination partition's remainder can be determined with:

    select mod(<modulus> + mod(uuid_hash(<uuid>), <modulus>), <modulus>);
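The outer mod guards against negative hash values. Note that the server's internal routing uses a seeded 64-bit extended hash, so this formula may not agree with the actual placement; the built-in satisfies_hash_partition function is the authoritative check. For example, with the 16-partition layout above and an illustrative UUID:

```sql
-- true if this uuid belongs in the partition with modulus 16, remainder 3
select satisfies_hash_partition('instance'::regclass, 16, 3,
                                '9f36f2c6-8a5e-4bcf-99f2-4f4c2f3a1b10'::uuid);
```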
Relational Refactor
Break the existing JSONB up into relational columns and tables. Any indexed property should presumably be broken out of the JSONB document. For example, metadata.updatedDate would presumably become a column of type timestamp, while identifiers would migrate to a join table and possibly a separate identifiers table.
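A sketch of the identifiers refactor; table and column names are hypothetical:

```sql
create table identifier_type (
    id uuid primary key,
    name text not null
);

-- join table replacing the identifiers array in the document
create table instance_identifier (
    instance_id uuid not null references instance,
    identifier_type_id uuid not null references identifier_type,
    value text not null
);

create index instance_identifier_value_idx on instance_identifier (value);
```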
Lookup tables
There are a number of lookup tables that index values embedded in the JSONB document. It would make sense to break these values out into columns on the table and create hash indexes on them, presuming that the values are looked up by equality. Lookup functions with the correct markers for query parallelism could then replace a join or subquery, as in the sketch below.
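A minimal sketch, assuming a source column has been broken out of the document (function and index names hypothetical):

```sql
create index instance_source_hash_idx on instance using hash (source);

-- stable + parallel safe lets the planner parallelize callers
create function instances_by_source(p_source text)
    returns setof instance
    language sql stable parallel safe
as $$
    select * from instance where source = p_source;
$$;
```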
Separate indexed JSONB values
There are a number of gin indexes on JSONB properties that presumably are intended to index an array. These would be candidates for breaking out into separate tables.
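For example, the contributors array behind instance_contributors_idx might be extracted along these lines (names hypothetical):

```sql
create table instance_contributor (
    instance_id uuid not null references instance,
    contributor jsonb not null
);

-- backfill from the existing array property
insert into instance_contributor (instance_id, contributor)
select id, elem
from instance, jsonb_array_elements(jsonb -> 'contributors') as elem;
```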
Functions
Consider adding functions to convert rows to and from JSONB.
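PostgreSQL already provides most of the machinery; a minimal sketch (the uuid value is illustrative):

```sql
-- row -> document
select to_jsonb(i) as doc from instance i limit 1;

-- document -> row, using the table's row type
select *
from jsonb_populate_record(
    null::instance,
    '{"id": "9f36f2c6-8a5e-4bcf-99f2-4f4c2f3a1b10"}'::jsonb);
```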
Partial indexes
Modify any existing indexes on JSONB values that do not apply to every row in the table to be partial indexes. This should reduce the overhead of those indexes.
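For example, a hypothetical reconstruction of instance_staffsuppress_idx, assuming staffSuppress is the indexed property:

```sql
drop index if exists instance_staffsuppress_idx;

-- only index the rows where the property is actually present
create index instance_staffsuppress_idx
    on instance ((jsonb ->> 'staffSuppress'))
    where jsonb ->> 'staffSuppress' is not null;
```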
The percentage of rows in a particular table that are included in each index for a particular schema can be determined with the following query (in this case for instance indexes in fs09000000_mod_inventory_storage):
    select tablename, 100 - (null_frac * 100) as indexed_percentage
    from pg_stats
    where schemaname = 'fs09000000_mod_inventory_storage'
      and tablename like 'instance%idx%'
    order by null_frac desc;
The following results come from a single tenant in a test database, so mileage will vary, but they should illustrate the potential for partial indexing.
instance
| tablename | indexed_percentage |
| --- | --- |
| instance_matchkey_idx_unique | 0 |
| instance_invalidisbn_idx_ft | 0 |
| instance_isbn_idx_ft | 0 |
| instance_staffsuppress_idx | 0.043332576751708984 |
| instance_discoverysuppress_idx | 99.80666666524485 |
| instance_indextitle_idx | 99.96333333256189 |
| instance_publication_idx | 100 |
| instance_contributors_idx | 100 |
| instance_title_idx | 100 |
| instance_identifiers_idx_ft | 100 |
| instance_hrid_idx_unique | 100 |
| instance_pmh_metadata_updateddate_idx | 100 |
| instance_identifiers_idx_gin | 100 |
| instance_source_idx | 100 |
| instance_statisticalcodeids_idx | 100 |
| instance_metadata_updateddate_idx | 100 |
- instance_matchkey_idx_unique should be a partial index or deleted
- instance_invalidisbn_idx_ft should be a partial index or deleted
- instance_isbn_idx_ft should be a partial index or deleted
- instance_staffsuppress_idx should either be broken out into a separate table or made a partial index
holdings_record
| tablename | indexed_percentage |
| --- | --- |
| holdings_record_discoverysuppress_idx | 0.11333227157593 |
| holdings_record_callnumber_idx | 34.40444604812622 |
| holdings_record_hrid_idx_unique | 100 |
| holdings_record_callnumberandsuffix_idx | 100 |
| holdings_record_fullcallnumber_idx | 100 |
| holdings_record_pmh_metadata_updateddate_idx | 100 |
- holdings_record_discoverysuppress_idx should either be made a partial index or broken out into a separate table
- holdings_record_callnumber_idx should be a partial index
item
| tablename | indexed_percentage |
| --- | --- |
| item_discoverysuppress_idx | 0 |
| item_purchaseorderlineidentifier_idx | 0.2966642379760742 |
| item_accessionnumber_idx | 8.78000259399414 |
| item_barcode_idx_unique | 38.066667318344116 |
| item_effectivecallnumbercomponents_callnumber_idx | 39.55000042915344 |
| item_pmh_metadata_updateddate_idx | 100 |
| item_hrid_idx_unique | 100 |
| item_fullcallnumber_idx | 100 |
| item_callnumberandsuffix_idx | 100 |
| item_status_name_idx | 100 |
- item_discoverysuppress_idx should be a partial index or deleted.
- item_purchaseorderlineidentifier_idx should either be made a partial index or broken out into a separate table
- item_accessionnumber_idx should be a partial index
- item_effectivecallnumbercomponents_callnumber_idx should be a partial index
General suggestions
Replace deletes with a tombstone status
Deletes are very expensive, so it may make sense to replace them with a mechanism for indicating whether a particular entity is active. Some approaches include:
- a boolean is_active flag on each table
- a status column that includes a tombstone as well as the statuses already set via the JSONB document. If the set of statuses is well defined, it could be represented by an enumerated type.
- a reference to an event log. This might consist of a sequential ID or a timestamp, with a minimum valid event tracked somewhere. This allows old entities to be expired on bulk import and also allows tombstoning by explicitly setting the value to null on delete.
There might be a view for a table implementing this strategy that filters out inactive entities. Actual deletes might occur in a periodic maintenance window.
Of the noted options, the status column option seems to fit this data model the best.
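A sketch of the status-column approach (type name and enum values illustrative):

```sql
create type record_status as enum ('active', 'suppressed', 'deleted');

alter table instance
    add column status record_status not null default 'active';

-- readers see only live rows; a delete becomes an update
create view live_instance as
    select * from instance where status <> 'deleted';
```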
Function Markers
In order to encourage query parallelism and improved planning, mark any user-defined functions that are called by slow queries with the appropriate parallel and volatility markers.
Any read-only functions could presumably be marked with parallel safe and stable or immutable.
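For example, re-marking an existing read-only helper in place (function name hypothetical):

```sql
alter function instances_by_source(text) stable parallel safe;
```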
Triggers
There are a large number of triggers on these large tables, and some of them appear to be functionally redundant. Consider consolidating or removing them.
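A starting point for that audit, reusing the sample tenant schema from above:

```sql
select event_object_table, trigger_name, action_timing, event_manipulation
from information_schema.triggers
where trigger_schema = 'fs09000000_mod_inventory_storage'
order by event_object_table, trigger_name;
```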