/
Inventory Audit log

Inventory Audit log

Summary

The technical design is aimed to address the need of catalogers to track the history of changes for different entities in FOLIO. The solution uses the common approach for auditable events in FOLIO similar to the audit of events in circulation and acquisition domains.

Requirements

Functional Requirements

NFR

Note: The data is collected from prod-like environments. The changes are tracked through metadata’s lastUpdatedDate field. This means that if the object was changed more than once or was deleted, the change will not be reflected in the data.

Tenant Id

Title

Total

Changed Last Year

Percent

Tenant Id

Title

Total

Changed Last Year

Percent

Tenant 1

holdings

11 996 598

1 991 794

16.60%

Tenant 1

instances

11 166 119

2 341 561

20.97%

Tenant 1

items

9 797 753

 549 548

5.61%

Tenant 2

holdings

7 552 493

 72 339

0.96%

Tenant 2

instances

6 751 475

6 313 212

93.51%

Tenant 2

items

8 977 780

 227 539

2.53%

Tenant 3

holdings

10 855 570

10 855 570

100.00%

Tenant 3

instances

10 396 208

10 396 208

100.00%

Tenant 3

items

10 121 740

10 121 740

100.00%

Tenant 4

holdings

6 719 620

 120 005

1.79%

Tenant 4

instances

4 845 531

 306 119

6.32%

Tenant 4

items

9 420 268

 455 542

4.84%

Total

holdings

37 124 281

13 039 708

35.12%

Total

instances

33 159 333

19 357 100

58.38%

Total

items

38 317 541

11 354 369

29.63%

Assumptions

  • For ECS environments: Shared entities' version history should be tracked only in the central tenant.

  • All changes in the system related to inventory entities (instances, items, holdings, bibs) generate Domain events.

  • Domain events related to update action have old and new versions of the entity

Baseline Architecture

In existing architecture, mod-inventory-storage is responsible for persisting such entities as instances, holdings, and items. mod-entities-links is responsible for authorities. Both modules produce domain events on create/update/delete actions from different sources.

 

Target Architecture

The existing architecture allows the reuse of the capabilities of the domain events approach to persist audit log events.

Audit Consumers Sequence Diagram

Audit Consumers with Outbox Sequence Diagram

The implementation can follow a Transactional outbox pattern. The approach allows enhanced guarantee for persisting the audit event but the trade-off is that this approach will negatively affect the performance of flows related to domain events.

 

Solution Summary

The process is split into two main parts

1. Persistence: There are storage options that can be implemented.

Option

Description

Pros & Cons

Option

Description

Pros & Cons

1

RDBMS

The audit database should persist a diff of the entity. The queries are made mostly by the entity's unique identifier. Thus partitioning by UUID and subpartitioning by date ranges can be applied to audit tables

Pros:

  • allows flexible access to versioning data

Cons:

  • limited scaling options

  • negative impact on Postgres that is used by all others modules

2

Object Storage

AWS S3-like storage can be used to persist diffs because audit events can be stored as plain-text (JSON) documents

Pros:

  • allows scaling almost indefinitely

Cons:

  • requires an additional solution for complex queries

  • might cause high costs for storage bc of a large amount of operations on small files

2. Version history display: This should be done on demand comparing each diff of the entity to the previous

3. Changes to be tracked:

  • Changes in entity fields

  • Changes related to an update of the parent entity, e.g. holding update cascades to updates in items’ effective shelving order

  • Changes in metadata should be ignored

Key implementation aspects:

  1. The Kafka default delivery semantics is “AT_LEAST_ONCE”. Ensure that domain events have their unique identifiers to be able to handle consuming messages in an idempotent manner

  2. Add new consumers in mod-audit to inventory domain events for instances, items, and holdings.

  3. Add new consumers in mod-audit to authority domain events for authorities.

  4. Add new consumers in mod-audit to source record domain events for marc-bib records.(see: https://folio-org.atlassian.net/wiki/spaces/DD/pages/62685275)

  5. mod-audit should support the following configurations on the tenant level:

    1. Retention period in years (with default value - 0 for indefinite retention)

    2. Feature flag to enable audit capability. In case when the audit is disabled no consumers and logs should be persisted.

    3. Anonymizing flag that indicates whether the records in the database should be anonymized before persistence to the database (To be confirmed).

  6. mod-audit should have the following scheduled jobs:

    1. Daily: to remove records that exceed the retention period

    2. Monthly: to create subpartitions for audit tables

  7. Persist audit events in an event storage. A table in DB per entity type with partitioning by UUID (hash) and subpartitioning by date range. E.g.: instance_audit table partitions:

-- main audit table CREATE TABLE instance_audit ( event_id uuid not null , event_date timestamp not null, entity_id uuid not null, origin varchar not null, action varchar not null , user_id uuid not null, diff jsonb, primary key (event_id, event_date, entity_id) ) PARTITION BY HASH (entity_id); -- main partitions CREATE TABLE instance_audit_p0 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 0) PARTITION BY RANGE (event_date); CREATE TABLE instance_audit_p1 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 1) PARTITION BY RANGE (event_date); CREATE TABLE instance_audit_p2 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 2) PARTITION BY RANGE (event_date); CREATE TABLE instance_audit_p3 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 3) PARTITION BY RANGE (event_date); CREATE TABLE instance_audit_p4 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 4) PARTITION BY RANGE (event_date); CREATE TABLE instance_audit_p5 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 5) PARTITION BY RANGE (event_date); CREATE TABLE instance_audit_p6 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 6) PARTITION BY RANGE (event_date); CREATE TABLE instance_audit_p7 PARTITION OF instance_audit FOR VALUES WITH (MODULUS 8, REMAINDER 7) PARTITION BY RANGE (event_date); -- sub-partitions CREATE TABLE instance_audit_p0_2025_q1 PARTITION OF instance_audit_p0 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p0_2025_q2 PARTITION OF instance_audit_p0 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p0_2025_q3 PARTITION OF instance_audit_p0 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p0_2025_q4 PARTITION OF instance_audit_p0 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31'); CREATE TABLE instance_audit_p1_2025_q1 PARTITION OF instance_audit_p1 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p1_2025_q2 PARTITION OF instance_audit_p1 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p1_2025_q3 PARTITION OF instance_audit_p1 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p1_2025_q4 PARTITION OF instance_audit_p1 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31'); CREATE TABLE instance_audit_p2_2025_q1 PARTITION OF instance_audit_p2 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p2_2025_q2 PARTITION OF instance_audit_p2 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p2_2025_q3 PARTITION OF instance_audit_p2 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p2_2025_q4 PARTITION OF instance_audit_p2 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31'); CREATE TABLE instance_audit_p3_2025_q1 PARTITION OF instance_audit_p3 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p3_2025_q2 PARTITION OF instance_audit_p3 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p3_2025_q3 PARTITION OF instance_audit_p3 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p3_2025_q4 PARTITION OF instance_audit_p3 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31'); CREATE TABLE instance_audit_p4_2025_q1 PARTITION OF instance_audit_p4 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p4_2025_q2 PARTITION OF instance_audit_p4 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p4_2025_q3 PARTITION OF instance_audit_p4 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p4_2025_q4 PARTITION OF instance_audit_p4 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31'); CREATE TABLE instance_audit_p5_2025_q1 PARTITION OF instance_audit_p5 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p5_2025_q2 PARTITION OF instance_audit_p5 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p5_2025_q3 PARTITION OF instance_audit_p5 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p5_2025_q4 PARTITION OF instance_audit_p5 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31'); CREATE TABLE instance_audit_p6_2025_q1 PARTITION OF instance_audit_p6 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p6_2025_q2 PARTITION OF instance_audit_p6 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p6_2025_q3 PARTITION OF instance_audit_p6 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p6_2025_q4 PARTITION OF instance_audit_p6 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31'); CREATE TABLE instance_audit_p7_2025_q1 PARTITION OF instance_audit_p7 FOR VALUES FROM ('2025-01-01') TO ('2025-03-31'); CREATE TABLE instance_audit_p7_2025_q2 PARTITION OF instance_audit_p7 FOR VALUES FROM ('2025-04-01') TO ('2025-06-30'); CREATE TABLE instance_audit_p7_2025_q3 PARTITION OF instance_audit_p7 FOR VALUES FROM ('2025-07-01') TO ('2025-09-30'); CREATE TABLE instance_audit_p7_2025_q4 PARTITION OF instance_audit_p7 FOR VALUES FROM ('2025-10-01') TO ('2025-12-31');
  1. Create REST API
    4. to provide information on a list of changes related to a particular entity
    5. to provide detailed information on the particular change - this API should use the Object diff library to return a verbose description of the changes related to the entity

  2. As an audit log may be enabled when the inventory records are already present in the system the existing records should always show the Created/Original Version card in the UI. To achieve this when the UI fetches all audit records related to the entity and the earliest record is not a Created event then UI should get the date of creation from the current record metadata and display it in the Created/Original Version card

  3. ECS specific details:

    • Instance records promoted from ‘Local’ to ‘Shared’ should be reflected in the audit log with the event Promoted instead of Created. To achieve this the audit consumers should listen to the topic CONSORTIUM_INSTANCE_SHARING_COMPLETE(link)

ERD

With data size implications, creating separate tables per each entity type is required. The default table structure is listed below:

Column

Type

required

unique