Acquisition Event Log
Context
There is a need to track audit information for various entities within FOLIO. The Circulation Log is already in place. Another urgent need is in the Acquisition domain. Relevant here are the functional feature "Display a history of edits for the order record" and the plans for the Event Log as part of the domain stabilization plan.
FOLIO is likely interested in a platform-wide solution for this, because the App Interaction SIG is looking for one for other applications, e.g. Agreements, Inventory, Circulation, Organizations. At the same time, this document focuses specifically on the Acquisition Event Log, while also aiming for unification with the Circulation Log.
The proposed solution should describe the target / long-term vision while allowing phased implementation, and should clearly define the scope of the phases (at least the first one, in Orchid).
References:
- https://folio-org.atlassian.net/browse/UXPROD-3215
- Display a history of edits for the order record
- https://miro.com/app/board/uXjVObQrl6w=/?invite_link_id=742203032673
- Audit
- FOLIO Cross-Application Data Sync Solution
Stakeholders
Name | Role | Concerns |
---|---|---|
Dennis Bridges | Thunderjet Product Owner | Functional fit, alignment with team's roadmap, scope |
Serhii Nosko | Thunderjet Tech Lead | Design and implementation details, scope, implementation plan |
Siarhei Hrabko | Thunderjet Tech Lead | Design and implementation details, scope, implementation plan |
Mikita Siadykh | Thunderjet Scrum Master | Implementation plan, timeline |
Raman Auramau | Solution Architect | Requirements, solution design |
Requirements and Expectations
Functional Requirements
Troubleshooting by Dev Team
One of the key problems with troubleshooting is the inability to trace the history of order and order line changes. Therefore, the Thunderjet team would like to use the Event Log to get a clearer picture of changes. The basic scenarios include filtering data by Order Id or Order Line Id, with additional filtering by time range.
However, in many cases the dev team has no access to data storage and APIs in the Production environment (and it should not have such access).
Therefore, the scope of this feature should include an API that returns the information for the basic scenarios described above, as well as instructions for hosting providers on using the API to select the information needed for troubleshooting.
UXPROD-3215 support
Data schema - need to consider https://folio-org.atlassian.net/browse/UXPROD-3215 scope as well (see UI mock up).
The approach is to keep a snapshot of the entity for each event. This then allows any analysis of changes, incl. identifying differences, versioning, etc.
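As an illustration of how differences could be derived from two consecutive snapshots, here is a minimal sketch. The function names, the `MISSING` sentinel and the sample field names are assumptions made for this example only; lists are compared as whole values rather than element by element.

```python
from typing import Any, Dict, Tuple

MISSING = "<absent>"  # hypothetical sentinel for a field that exists on one side only


def flatten(snapshot: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into dotted paths, e.g. {"cost": {"currency": ...}} -> "cost.currency"."""
    flat: Dict[str, Any] = {}
    for key, value in snapshot.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value  # scalars and lists are treated as leaf values
    return flat


def diff_snapshots(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each changed path to an (old value, new value) pair."""
    old, new = flatten(before), flatten(after)
    return {
        path: (old.get(path, MISSING), new.get(path, MISSING))
        for path in sorted(old.keys() | new.keys())
        if old.get(path, MISSING) != new.get(path, MISSING)
    }


# Illustrative order line snapshots (field names are examples, not the real schema).
BEFORE = {"id": "line-1", "cost": {"listUnitPrice": 10.0, "currency": "USD"}}
AFTER = {"id": "line-1", "cost": {"listUnitPrice": 12.5, "currency": "USD"}, "note": "rush"}
```

Here `diff_snapshots(BEFORE, AFTER)` reports a change for `cost.listUnitPrice` and an added `note` field. Because full snapshots are stored, this kind of comparison can be computed later without having been planned at write time.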
Search&filtering capabilities -
- Range of date&time
- Source (it means who initiated the event)
- Category (Cost, Item details etc... not mandatory but really valuable)
- The categories are based on the accordions found on the record in the UI. For order lines this would be Item details, Purchase order line, Vendor, Cost details, Fund distribution, Location etc.
- Keyword (similar to how inventory search works) - it's not a critical requirement here, as long as they can search by individual field
- Full-text search (FTS) is required; it's assumed that FTS in Postgres would be enough to address the expectations
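The Category filter could be supported by mapping each changed field to the accordion it belongs to. The sketch below assumes changed fields are available as dotted paths; the prefix-to-category mapping and the field names are hypothetical, while the category labels are the ones listed above.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping from top-level field name to UI accordion (category).
CATEGORY_BY_FIELD_PREFIX: Dict[str, str] = {
    "details": "Item details",
    "cost": "Cost details",
    "vendorDetail": "Vendor",
    "fundDistribution": "Fund distribution",
    "locations": "Location",
}
# Assumption: fields without an explicit mapping belong to the main accordion.
DEFAULT_CATEGORY = "Purchase order line"


def categories_for(changed_paths: Iterable[str]) -> Set[str]:
    """Return the set of categories touched by the given changed field paths."""
    categories: Set[str] = set()
    for path in changed_paths:
        top_level = path.split(".", 1)[0]
        categories.add(CATEGORY_BY_FIELD_PREFIX.get(top_level, DEFAULT_CATEGORY))
    return categories
```

For instance, `categories_for(["cost.listUnitPrice", "paymentStatus"])` yields the categories "Cost details" and "Purchase order line". Storing the resulting categories with each event would let the Category filter work without re-reading the snapshots.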
Non-Functional Requirements
Priority | Requirement | Comments, Expectations, Metrics |
---|---|---|
MEDIUM | Conceptual Integrity | The solution must be aligned on the technology stack, tools and approaches with the FOLIO platform. This solution is not considered cross-platform, but should address Event Log needs for all components of the Acquisition domain. However, the experience of implementing the Event Log in the Acquisition domain may be of interest in other domains. |
HIGH | Extensibility | It should be possible in principle to add new types of events, new event sources or new log consumers to the Event Log. Note: No specific candidates are being discussed at this time (maybe inventory or agreements). It should be possible to track the source/event version and the Event Log version in each event. |
MEDIUM | Performance | Generation and publication of events should not introduce noticeable delays in platform performance. There appear to be no specific standards or metrics here. However, performance issues do sometimes appear (for example, during order loads), so this is worth testing. At least these three operations need to be tested - open order, create order line, edit order line (there are links between OL and approve invoice, pay invoice) - check with TL; create a story(-ies); the outcome is a set of baseline metrics. The architect and Product Owner have assumed that the delay added by posting an event must be no more than 100 ms to be considered acceptable. |
MEDIUM | Reliability | The proposed solution should ensure reliable storage of all Event Log data for at least 20 years (“Order numbers in FOLIO can live for multiple fiscal years”). Some orders can live for 10 or even 20 years; after closing, an order is archived, so closing can be used as a logical trigger for archiving (remember about re-opening!). Immutability - stored events must not be deliberately or accidentally updated. Durability - stored events must not be deliberately or accidentally deleted (except when deleted at expiration of the lifespan). |
HIGH | Scalability in terms of Data Volume | The solution must be efficient enough to process and store the estimated amount of data. Taking orders as an example, and keeping in mind the required retention period (20 years) and the estimated amount of data per year (2.7M events), the total is about 20 * 2.7 = 54M events over the entire period for one tenant. |
LOW | Availability | The solution must ensure availability at a level not lower than the SLA from the platform hosting provider. |
MEDIUM | Security | The proposed solution should provide secure data storage, as well as support for security requirements for multi-tenancy. |
MEDIUM | Traceability | There should be a clear way to link logged events to some common entity (e.g., order, or order line, etc.). |
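The data volume estimate from the Scalability row can be worked through as a short calculation. The retention period and the yearly event count come from the table above; the average snapshot size is purely an assumption for illustration and should be replaced with a measured value.

```python
RETENTION_YEARS = 20                     # from the Reliability requirement
EVENTS_PER_YEAR = 2_700_000              # estimated order-related events per tenant per year
ASSUMED_AVG_SNAPSHOT_BYTES = 5 * 1024    # assumption: 5 KiB per stored snapshot, not a measured figure

# Total number of events kept for one tenant over the whole retention period.
total_events = RETENTION_YEARS * EVENTS_PER_YEAR

# Rough raw storage size, ignoring indexes, compression and database overhead.
estimated_bytes = total_events * ASSUMED_AVG_SNAPSHOT_BYTES
estimated_gib = estimated_bytes / 1024**3

print(f"{total_events:,} events, about {estimated_gib:.0f} GiB per tenant")
```

Under this assumption, 54M events correspond to roughly 257 GiB of raw snapshot data per tenant, which is one reason data archiving is considered as part of the phased delivery.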
Timeline and Delivery model
The first phase, in terms of its scope and complexity, should be shaped in such a way as to allow its planning, development, implementation, testing and delivery in the Orchid release.
Solution Design
Component Diagram
In fact, the solution is based on the Event Driven approach, and largely correlates with FOLIO Cross-Application Data Sync Solution.
Event Sources and Event Types
The following modules are related to Acquisition domain - mod-orders/mod-orders-storage, mod-invoice/mod-invoice-storage, mod-finance/mod-finance-storage, mod-organizations, mod-gobi, mod-ebsconet, edge-orders. The scope of this design is focused on mod-orders/mod-orders-storage.
At the level of the module with business logic (in this case, mod-orders), there is a whole set of different business flows, for example - Create Order, Open Order, Edit Order, Approve Order, Close Order, Reopen Order, Delete Order, Re-export Order, Print Order, Create Order Line, Receive Order Line, Re-export Order Line, Cancel Order Line, Delete Order Line, Print Order Line. However, the general recommendation for the Domain Event pattern is to create and publish events as close to the data as possible (i.e. in mod-orders-storage). Preliminary analysis shows that at the level of the data access module (mod-orders-storage) there are approximately 3-5 places where the Order Line changes, and approximately 1-2 places where the Order changes.
Note: if keeping a business operation flag is important for auditing, then in the scope of this feature, one will need to implement the transfer of a flag (or label) of a business operation between mod-orders and mod-orders-storage.
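A minimal sketch of an event envelope that carries both the storage-level action and an optional business operation flag is shown below. The class, its field names and the sample values are hypothetical and are not the agreed data model; the version field reflects the extensibility requirement to track the event version in each event.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class DomainEvent:
    """Hypothetical envelope published by the storage module for each change."""
    entity_type: str                 # e.g. "ORDER" or "ORDER_LINE"
    entity_id: str
    action: str                      # storage-level action, e.g. "CREATE" or "UPDATE"
    snapshot: Dict[str, Any]         # full entity snapshot after the change
    business_operation: Optional[str] = None  # flag passed down from mod-orders, if any
    event_version: str = "1.0"       # assumed versioning scheme
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    event_date: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize the envelope for publication."""
        return json.dumps(asdict(self), sort_keys=True)


# Example: an update in storage that was triggered by the "Open Order" flow.
example_event = DomainEvent(
    entity_type="ORDER",
    entity_id="order-1",
    action="UPDATE",
    snapshot={"workflowStatus": "Open"},
    business_operation="OPEN_ORDER",
)
```

Keeping `business_operation` optional means storage-level events remain valid even when the calling module does not pass a flag.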
Event Log location
There are some alternatives regarding this topic.
Option | #1 In the Acquisition domain | #2 In mod-audit |
---|---|---|
Pros | | |
Cons | | |
Comments | It can be a table in an existing module, or a new module (e.g., mod-audit-acquisition) | |
Option #2 was chosen for implementation: the Event Log logic is placed in mod-audit, since this moves the event processing logic out of the Acquisition application modules into a module with a single responsibility, where it can be reused in the future if necessary.
Data Schema
Refer to Orders Event Log for Data Model.
Scope
- Every new Event Source should be able to connect to Kafka and post events to specified Kafka topics.
- How to achieve a transactional approach?
- Follow https://folio-org.atlassian.net/wiki/display/DD/Apache+Kafka+Messaging+System and https://folio-org.atlassian.net/wiki/display/~olamshin/Kafka+Practices+Proposal
- Kafka usage practices - topic per tenant and topic per event type
- Orchid phase - define some events that are to be logged in Orchid
- Event Log Storage
- OK - just a single flat table
- RDBMS
- Need to design a schema
- The intention is to avoid the approach with JSONB but how to design an extendable relational schema?
- RA: Maybe it could be reasonable to consider module -> entity -> event type hierarchy? Like, all events about mod-orders -> order go to one table, mod-orders -> something else go to another table?
- Actually, column-family or document-oriented (e.g., Mongo) NoSQL databases could fit well here.
- OK - the schema mod-audit currently has seems to be pretty extensible - it has a range of standard fields plus details in json body
- Event Log
- API for CRUD - just basic API for Orchid phase
- Event Log Showcase
- some correlation id(s) might be useful
- RA: What are the use cases? How will support/developers use the Event Log? Will full-text search be required?
- OK - Support / developers can execute SQL scripts or execute some API calls for retrieving data
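The "topic per tenant and topic per event type" practice can be sketched as a small naming helper. The `env.tenant.eventType` pattern, the function name and the sample values are assumptions for illustration; the actual convention should follow the Kafka practices pages referenced above. Kafka itself restricts topic names to ASCII letters, digits, `.`, `_` and `-`, with a maximum length of 249 characters.

```python
import re

# Each part is limited to characters Kafka allows in topic names; dots are excluded
# from the parts because the dot is used here as the separator.
_SAFE_PART = re.compile(r"[A-Za-z0-9_-]+")
MAX_TOPIC_LENGTH = 249  # Kafka's limit on topic name length


def topic_name(env: str, tenant: str, event_type: str) -> str:
    """Build a per-tenant, per-event-type topic name (hypothetical naming scheme)."""
    for part in (env, tenant, event_type):
        if not _SAFE_PART.fullmatch(part):
            raise ValueError(f"invalid topic name part: {part!r}")
    name = f"{env}.{tenant}.{event_type}"
    if len(name) > MAX_TOPIC_LENGTH:
        raise ValueError("topic name is too long")
    return name
```

With this scheme, `topic_name("folio", "diku", "order-created")` gives `folio.diku.order-created`, so each tenant and event type pair maps to exactly one topic.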
Phased Delivery approach
Taking into account the entire scope of work, as well as the fairly short development stage of the Orchid (R1 2023) release, a phased approach is proposed. In this case, the scope of work can be distributed as follows:
- Orchid (R1 2023) -
- tracking Order and Order Line events from mod-orders,
- publishing messages to Kafka (in form of snapshots) including Transactional Outbox approach,
- updating mod-audit (design and implementation of data schema, CRUD operations, API for access),
- a simple showcase for demonstration
- Poppy (R2 2023) -
- analysis and definition of changed fields,
- filtering and search capabilities by fields
- ability to add note(-s) to changed fields,
- adding more types of events,
- data archiving (or any other approach addressing data volume challenge).
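The Transactional Outbox approach mentioned for the Orchid phase can be sketched as follows. This is a minimal illustration using an in-memory SQLite database in place of PostgreSQL and a plain callback in place of a Kafka producer; the table names, column names and function names are hypothetical.

```python
import json
import sqlite3
from typing import Any, Callable, Dict


def init_db(conn: sqlite3.Connection) -> None:
    """Create the entity table and the outbox table (illustrative schema)."""
    conn.executescript("""
        CREATE TABLE po_line (id TEXT PRIMARY KEY, body TEXT NOT NULL);
        CREATE TABLE outbox (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL
        );
    """)


def save_line_with_event(conn: sqlite3.Connection, line: Dict[str, Any]) -> None:
    """Store the entity and its snapshot event in one transaction."""
    body = json.dumps(line, sort_keys=True)
    with conn:  # commits on success, rolls back both inserts on any exception
        conn.execute(
            "INSERT OR REPLACE INTO po_line (id, body) VALUES (?, ?)",
            (line["id"], body),
        )
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (body,))


def relay_outbox(conn: sqlite3.Connection, publish: Callable[[str], None]) -> int:
    """Publish pending outbox rows in order and return how many were found."""
    rows = conn.execute("SELECT seq, payload FROM outbox ORDER BY seq").fetchall()
    for seq, payload in rows:
        publish(payload)  # if this raises, the row stays and is retried later
        with conn:
            conn.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
    return len(rows)
```

Because the outbox row is removed only after a successful publish, delivery is at-least-once: a crash between publishing and deleting leads to a repeated message, so the consumer side (mod-audit in this design) would need to tolerate duplicates, for example by using a unique event id.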
Outstanding questions
- Need to organize performance testing of at least these three operations - open order, create order line, edit order line (there are links between OL and approve invoice, pay invoice); create a story(-ies); outcome is to get baseline metrics - discussed with Siarhei Hrabko, Jira ticket will be included
- Discuss troubleshooting scenarios with the team
- What event types will be in Orchid? (it seems that we agreed on 3 - Create Order, Edit Order, Create Order Line)
- Work on data schema with TL/PO
- Historical data archiving
- Anonymization of records in scope of GDPR