Script to add the "deleted" field to existing Instance records

Script to add the "deleted" field to existing Instance records

Related tickets and feature

Background

The deleted property for Instance records was introduced as part of the Set for Deletion feature.

However, existing Instance records created before this change may not have the deleted field populated. As a result, the index on the deleted field may not work properly, and the database planner may choose an inefficient execution plan.

This can cause performance issues and may lead to OAI-PMH harvest failures when the harvest record source is set to “Source record storage” or “Inventory”.

Purpose

This script updates existing Instance records where the deleted property is missing or set to null and explicitly sets it to false.

The script should be executed manually after the mod-inventory-storage upgrade.

Scripts

Replace ${myuniversity} with the tenant name before executing the script.

  1. Check how many records need to be updated

SELECT COUNT(*) FROM ${myuniversity}_mod_inventory_storage.instance WHERE jsonb ->> 'deleted' IS NULL;
  1. Create a procedure

CREATE OR REPLACE PROCEDURE ${myuniversity}_mod_inventory_storage.populate_deleted_field( p_batch_size INT DEFAULT 50000 ) LANGUAGE plpgsql AS $$ DECLARE rows_updated INT; BEGIN LOOP WITH batch AS ( SELECT ctid FROM ${myuniversity}_mod_inventory_storage.instance WHERE jsonb ->> 'deleted' IS NULL LIMIT p_batch_size FOR UPDATE SKIP LOCKED ) UPDATE ${myuniversity}_mod_inventory_storage.instance i SET jsonb = jsonb_set(i.jsonb, '{deleted}', 'false'::jsonb, true) FROM batch WHERE i.ctid = batch.ctid; GET DIAGNOSTICS rows_updated = ROW_COUNT; RAISE NOTICE 'Updated % rows', rows_updated; COMMIT; EXIT WHEN rows_updated = 0; PERFORM pg_sleep(0.05); END LOOP; END; $$;
  1. Disable user-defined triggers

ALTER TABLE ${myuniversity}_mod_inventory_storage.instance DISABLE TRIGGER USER;
  1. Call the procedure. The batch size can be adjusted if needed. This process may take several hours, depending on the number of records that need to be updated.

CALL ${myuniversity}_mod_inventory_storage.populate_deleted_field(50000);
  1. Verify that all records were updated. After the procedure finishes, check the number of records that still do not have the deleted field populated. The result should be 0.

SELECT COUNT(*) FROM ${myuniversity}_mod_inventory_storage.instance WHERE jsonb ->> 'deleted' IS NULL;
  1. Re-enable user-defined triggers

ALTER TABLE ${myuniversity}_mod_inventory_storage.instance ENABLE TRIGGER USER;
  1. Run VACUUM. After completion of all steps above, run the following query (or schedule autovacuum capacity) because this update will generate significant dead tuples/WAL.

VACUUM (ANALYZE) ${myuniversity}_mod_inventory_storage.instance;

Notes

  • The script should be executed manually by the environment administrator or DBA.

  • Execution time may vary depending on the size of the tenant’s Instance data set.

  • It is recommended to execute the script during a maintenance window for tenants with a large number of Instance records.