
...

  • For ECS environments: Shared entities' version history should be tracked only in the central tenant.

  • All changes in the system related to inventory entities (instances, items, holdings, bibs) generate Domain events.

  • Domain events for update actions contain both the old and the new version of the entity.
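
As an illustration of the last assumption, an update event could be modelled as below. This is only a sketch: the class name and the field names (`event_id`, `entity_type`, `action`, `old`, `new`) are hypothetical and are not taken from the actual FOLIO event schema.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class DomainEvent:
    """Hypothetical shape of an inventory domain event (names are assumptions)."""

    event_id: str
    entity_type: str  # e.g. "instance", "item", "holdings"
    action: str       # "CREATE", "UPDATE" or "DELETE"
    old: Optional[dict[str, Any]] = None
    new: Optional[dict[str, Any]] = None

    def __post_init__(self) -> None:
        # An update event is only useful for auditing if both versions are present.
        if self.action == "UPDATE" and (self.old is None or self.new is None):
            raise ValueError("UPDATE events must carry both old and new versions")


event = DomainEvent(
    event_id="e1",
    entity_type="item",
    action="UPDATE",
    old={"id": "i1", "barcode": "123"},
    new={"id": "i1", "barcode": "456"},
)
```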

Baseline Architecture

In the existing architecture, mod-inventory-storage is responsible for persisting instances, holdings, and items, and mod-entities-links is responsible for authorities. Both modules produce domain events on create/update/delete actions from different sources.

...

Audit Consumers Sequence Diagram

[Diagram: Spitfire-Inventory-Audit-target-sequence.drawio]

Audit Consumers with Outbox Sequence Diagram

[Diagram: Spitfire-InvAudit-sequence-outbox.drawio]

The implementation can follow the Transactional outbox pattern. This approach gives a stronger guarantee that the audit event is persisted, but the trade-off is that it will negatively affect the performance of flows related to domain events.
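
A minimal sketch of the outbox idea, assuming an in-memory SQLite database as a stand-in for the module's database and a plain callback as a stand-in for the Kafka producer. The table names (`item`, `outbox`) and function names are hypothetical. The entity write and the outbox row are committed in one transaction, and a separate relay step publishes pending rows afterwards.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox ("
    " event_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL,"
    " published INTEGER NOT NULL DEFAULT 0)"
)


def update_item(item_id: str, new_body: dict) -> None:
    """Write the entity and its event in ONE transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        row = conn.execute("SELECT body FROM item WHERE id = ?", (item_id,)).fetchone()
        old_body = json.loads(row[0]) if row else None
        conn.execute(
            "INSERT OR REPLACE INTO item (id, body) VALUES (?, ?)",
            (item_id, json.dumps(new_body)),
        )
        event = {"entityId": item_id, "old": old_body, "new": new_body}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_outbox(publish) -> int:
    """Publish pending outbox rows in order and mark them as published."""
    rows = conn.execute(
        "SELECT event_id, payload FROM outbox WHERE published = 0 ORDER BY event_id"
    ).fetchall()
    for event_id, payload in rows:
        publish(json.loads(payload))  # stand-in for sending to the message broker
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE event_id = ?", (event_id,))
    return len(rows)


sent = []
update_item("i1", {"barcode": "123"})
update_item("i1", {"barcode": "456"})
relayed = relay_outbox(sent.append)
```

The extra insert into `outbox` inside the entity transaction is where the performance cost mentioned above comes from. If the relay fails between publishing and marking a row, that row is published again on the next run, so consumers still need to be idempotent.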

...

| # | Option | Description | Pros & Cons |
|---|---|---|---|
| 1 | RDBMS | The audit database should persist a snapshot diff of the entity. Queries are made mostly by the entity's unique identifier, so partitioning by UUID and subpartitioning by date range can be applied to audit tables. | Pros: allows flexible access to versioning data. Cons: limited scaling options; negative impact on the Postgres instance that is used by all other modules. |
| 2 | Object Storage | AWS S3-like storage can be used to persist snapshot diffs because audit events can be stored as plain-text (JSON) documents. | Pros: allows scaling almost indefinitely. Cons: requires an additional solution for complex queries; might cause high storage costs because of a large number of operations on small files. |

2. Version history display: This should be done on demand by comparing each consecutive snapshot diff of the entity to the previous one.
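
The on-demand comparison could look roughly like the sketch below. It is an assumption-laden illustration: it compares only top-level fields of plain dictionaries and does not use the Object diff library mentioned elsewhere in this document. The function names `diff_bodies` and `history` are hypothetical.

```python
from typing import Any


def diff_bodies(old: dict[str, Any], new: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Compare two entity bodies on top-level fields only."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: {"old": old[k], "new": new[k]}
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def history(versions: list[dict[str, Any]]) -> list[dict[str, dict[str, Any]]]:
    """Diff each version against the one before it, oldest first."""
    return [diff_bodies(prev, curr) for prev, curr in zip(versions, versions[1:])]


versions = [
    {"id": "i1", "barcode": "123"},
    {"id": "i1", "barcode": "456"},
    {"id": "i1", "barcode": "456", "status": "Available"},
]
changes = history(versions)
```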

...

  1. The default Kafka delivery semantics are “AT_LEAST_ONCE”. Ensure that domain events carry unique identifiers so that messages can be consumed in an idempotent manner.

  2. Add new consumers in mod-audit to inventory domain events for instances, items, and holdings.

  3. Add new consumers in mod-audit to authority domain events for authorities.

  4. Add new consumers in mod-audit to source record domain events for marc-bib records (see: Source Record Domain Eventing).

  5. mod-audit should support the following configurations on the tenant level:

    1. Retention period in years (default value: 0, meaning indefinite retention)

    2. Feature flag to enable the audit capability. When audit is disabled, no consumers should run and no logs should be persisted.

    3. Anonymizing flag that indicates whether records should be anonymized before they are persisted to the database (To be confirmed).

  6. mod-audit should have the following scheduled jobs:

    1. Daily: to remove records that exceed the retention period

    2. Monthly: to create subpartitions for audit tables

  7. Persist audit events in an event storage: one database table per entity type, partitioned by UUID (hash) and subpartitioned by date range.

  8. Create REST API

    1. to provide information on a list of changes related to a particular entity

    2. to provide detailed information on the particular change - this API should use the Object diff library to return a verbose description of the difference between current and previous snapshots of changes related to the entity
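
Item 1 above can be sketched as follows. This assumes that uniqueness of the event identifier is enforced by the storage layer, and it uses an in-memory SQLite table in place of the real audit table. The table name `audit_event`, the event keys, and the `handle` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit_event ("
    " event_id TEXT PRIMARY KEY,"  # uniqueness is what makes redelivery harmless
    " entity_id TEXT NOT NULL,"
    " diff TEXT NOT NULL)"
)


def handle(event: dict) -> bool:
    """Persist the event once; return False if it was already stored."""
    with conn:
        cursor = conn.execute(
            "INSERT OR IGNORE INTO audit_event (event_id, entity_id, diff) VALUES (?, ?, ?)",
            (event["eventId"], event["entityId"], event["diff"]),
        )
    return cursor.rowcount == 1


event = {"eventId": "e1", "entityId": "i1", "diff": "{}"}
first = handle(event)   # stored
second = handle(event)  # redelivered by the broker, ignored
stored = conn.execute("SELECT COUNT(*) FROM audit_event").fetchone()[0]
```

With at-least-once delivery the same message may arrive more than once; relying on the unique key means a redelivered event is skipped without a separate lookup.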

ERD

Given the expected data size, a separate table per entity type is required. The default table structure is listed below:

| # | Column | Type | Required | Unique | Description |
|---|---|---|---|---|---|
| 1 | EventID | UUID | y | y | unique event identifier |
| 2 | EventDate | timestamp | y | n | date when the event appeared in the event log |
| 3 | Origin | varchar | y | n | Origin of the event: data-import, batch-update, user, etc. |
| 4 | Action | varchar | y | n | what action was performed |
| 5 | ActionDate | timestamp | y | n | when action was performed |
| 6 | EntityID | UUID | y | n | entity identifier |
| 7 | UserId | UUID | y | n | user who did the action, fixed UUID for anonymized user |
| 8 | Snapshot Diff | jsonb | y | n | Difference between “old” and “new” body of the entity |
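
The table structure and the partitioning scheme could translate into PostgreSQL DDL along the lines of the sketch below, which generates the statements as strings. Several things here are assumptions: the snake_case column names, the table name passed in, the choice of `event_date` as the range key, and the monthly sub-partition granularity. No unique constraint is declared on `event_id`, because PostgreSQL requires unique constraints on a partitioned table to include the partition key columns, so enforcing the "unique" column above would need a separate design decision.

```python
from datetime import date


def month_bounds(year: int, month: int) -> tuple[date, date]:
    """Return the first day of the month and the first day of the next month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def parent_ddl(table: str) -> str:
    """Parent table, hash-partitioned by entity identifier."""
    return (
        f"CREATE TABLE {table} (\n"
        "  event_id    uuid      NOT NULL,\n"
        "  event_date  timestamp NOT NULL,\n"
        "  origin      varchar   NOT NULL,\n"
        "  action      varchar   NOT NULL,\n"
        "  action_date timestamp NOT NULL,\n"
        "  entity_id   uuid      NOT NULL,\n"
        "  user_id     uuid      NOT NULL,\n"
        "  diff        jsonb     NOT NULL\n"
        ") PARTITION BY HASH (entity_id);"
    )


def hash_partition_ddl(table: str, modulus: int, remainder: int) -> str:
    """One hash partition, itself range-partitioned by date."""
    return (
        f"CREATE TABLE {table}_p{remainder} PARTITION OF {table} "
        f"FOR VALUES WITH (MODULUS {modulus}, REMAINDER {remainder}) "
        "PARTITION BY RANGE (event_date);"
    )


def monthly_subpartition_ddl(table: str, remainder: int, year: int, month: int) -> str:
    """One monthly sub-partition, as a monthly scheduled job might create it."""
    start, end = month_bounds(year, month)
    return (
        f"CREATE TABLE {table}_p{remainder}_{year}_{month:02d} "
        f"PARTITION OF {table}_p{remainder} "
        f"FOR VALUES FROM ('{start}') TO ('{end}');"
    )


statements = [parent_ddl("item_audit")]
statements += [hash_partition_ddl("item_audit", 4, r) for r in range(4)]
statements += [monthly_subpartition_ddl("item_audit", r, 2024, 12) for r in range(4)]
```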

WBS

| # | Story | Task | Entity | Description | Module |
|---|---|---|---|---|---|
| 1 | Persisting events | | | | |
| 2 | Persisting events | Extend domain event with source FOLIO | Instance (FOLIO) | [TBC] | mod-inventory-storage |
| 3 | Persisting events | Extend domain event with source FOLIO | Item | [TBC] | mod-inventory-storage |
| 4 | Persisting events | Extend domain event with source FOLIO | Holding | [TBC] | mod-inventory-storage |
| 5 | Persisting events | Extend domain event with source MARC | Instance (MARC) | Add origin header to align depending on DI profile | mod-source-record-storage |
| 6 | Persisting events | Extend domain event for Authority | Authority | Add origin header to align depending on DI profile | mod-source-record-storage |
| 7 | Persisting events | Consume domain event | Instance (FOLIO) | Create table with partitioning by UUID; create Kafka consumer for domain event; persist entity snapshot diff | mod-audit |
| 8 | Persisting events | Consume domain event | Instance (MARC) | Create table with partitioning by UUID; create Kafka consumer for domain event; persist entity snapshot diff | mod-audit |
| 9 | Persisting events | Consume domain event | Holding | Create table with partitioning by UUID; create Kafka consumer for domain event; persist entity snapshot diff | mod-audit |
| 10 | Persisting events | Consume domain event | Item | Create table with partitioning by UUID; create Kafka consumer for domain event; persist entity snapshot diff | mod-audit |
| 11 | Persisting events | Consume domain event | Authority | Create table with partitioning by UUID; create Kafka consumer for domain event; persist entity snapshot diff | mod-audit |
| 12 | Persisting events | Configuration | All | Provide configuration parameter to enable/disable audit log on tenant level | mod-audit |
| 13 | Persisting events | Configuration | All | Anonymize events | mod-audit |
| 14 | Display History | REST endpoint for history | Instance (FOLIO) | Query list of snapshot diffs from database; calculate diff messages; return list of diff records | mod-audit |
| 15 | Display History | REST endpoint for history | Instance (MARC) | Query list of snapshot diffs from database; calculate diff messages; return list of diff records | mod-audit |
| 16 | Display History | REST endpoint for history | Holding | Query list of snapshot diffs from database; calculate diff messages; return list of diff records | mod-audit |
| 17 | Display History | REST endpoint for history | Item | Query list of snapshot diffs from database; calculate diff messages; return list of diff records | mod-audit |
| 18 | Display History | REST endpoint for history | Authority | Query list of snapshot diffs from database; calculate diff messages; return list of diff records | mod-audit |
| 19 | Display History | Show history pane in inventory | | | ui-inventory |

...