EUREKA-695 - async entitlement steps completion tracking
EUREKA-695: Spike - How to provide greater insight into asynchronous entitlement progress (Closed)
- 1 Problem overview
- 2 Proposed solutions
- 2.1 Preliminary considerations
- 2.2 Implementation approaches
- 2.2.1 Entitlement confirmation messages
- 2.2.1.1 Solution overview
- 2.2.1.2 Process sequence
- 2.2.1.3 Required changes
- 2.2.1.4 Pros and cons
- 2.2.2 Kafka entitlement messages consumer offsets tracking
- 2.2.2.1 Solution overview
- 2.2.2.2 Process sequence
- 2.2.2.3 Required changes
- 2.2.2.4 Pros and cons
Problem overview
While performing entitlement, ops personnel need to check whether the operation fully succeeded, including all the asynchronous steps of the entitlement process. Unfortunately, there is currently no easy way to do so: the Manager Tenant Entitlements service (MTE) reports the entitlement flow as completed once MTE finishes its own part of the work, which includes sending messages to dependent components via a messaging system (Kafka) - but it does not actually verify that those dependent components have received and processed the messages.
To ensure entitlement process completion, MTE needs to obtain the entitlement-processing status of the dependent components and present it in the flow status check response.
Proposed solutions
Preliminary considerations
We will not review options with obvious downsides, such as replacing the asynchronous messaging system with synchronous entitlement calls. That would make the entitlement process unreliable, since a restart of MTE while waiting for a response from a dependent component would break the flow - and it would tightly couple MTE to all the other components.
Also we should keep in mind that MTE sends a number of different message types via different Kafka topics during the entitlement process:
- an entitlement event for every module
- a capability event for every module
- a system user event
- a scheduled jobs event for every module
The general approach is to track processing of those events, and update flow status accordingly.
Note that some of the event types may not be mandatory to process before the entitlement flow is considered completed - namely scheduled jobs events, since a tenant will be usable even without them being processed.
The same goes for entitlement events - they trigger route updates in sidecars, and can also be used by mod-scheduler to enable timers. Note: sidecars do need to process those events in order to pass requests to the corresponding modules, but that processing is supposedly very fast and should not require MTE to wait for it. We can always revise this assumption and make MTE await confirmations for these events as well.
To keep the entitlement process reliable, we should keep using a messaging system - this ensures that individual tasks represented as messages can be independently retried, and even if individual retries take a long time, there will be no timeouts in the system and minimal chance of data loss (e.g. loss of tasks) or data inconsistency due to missing tasks.
Again, this also reduces coupling between components of the system - though some coupling will remain, because MTE sends specific event types and will therefore expect specific confirmation types for them.
Additionally, we should take error cases into consideration - e.g. an entitlement-related task failing to process in one of the downstream components.
With these details in mind, let’s consider the options.
Implementation approaches
Entitlement confirmation messages
Solution overview
The first approach we propose is to use confirmation messages sent by downstream components after they have finished processing the messages they received from MTE - be it a successful completion or an error that will not be retried.
Every downstream component that receives messages from MTE and asynchronously completes some part of the entitlement process will now report back to MTE via a Kafka message in a separate topic.
Process sequence
The process is simple and straightforward: for every event that triggers an asynchronous part of the entitlement process, MTE waits for a corresponding confirmation event, and marks the corresponding parts of the flow as completed or failed based on that confirmation event.
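The bookkeeping described above can be sketched as follows. This is a minimal illustration, not MTE's actual data model - all names here (`Flow`, `apply_confirmation`, `flow_finished`) and the stage key shape are hypothetical:

```python
# Hedged sketch of MTE-side bookkeeping for confirmation events.
# All names and data shapes here are hypothetical, for illustration only.
from dataclasses import dataclass, field

@dataclass
class Flow:
    # (message_type, module_id) -> "pending" | "completed" | "failed"
    stages: dict = field(default_factory=dict)

def apply_confirmation(flow, message_type, status, module_id=None):
    """Mark the flow stage a confirmation event refers to as completed or failed."""
    flow.stages[(message_type, module_id)] = (
        "completed" if status == "success" else "failed"
    )

def flow_finished(flow, expected_stages):
    """The flow is done only once every expected stage reached a terminal state."""
    return all(
        flow.stages.get(key) in ("completed", "failed") for key in expected_stages
    )
```

The key point is that MTE only declares the flow finished when every expected asynchronous stage has reported back, rather than when MTE itself has sent the messages.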
Required changes
This solution requires the following modifications to the code:
- MTE should create and subscribe to a Kafka topic that will receive confirmation messages from downstream components
- MTE should modify flow state based on messages received from that topic - the flow and some of its stages must get corresponding fields in the DB
- Downstream components should be modified to send confirmation messages (for now these are mod-roles-keycloak, mod-users-keycloak, and potentially also mod-scheduler - if we decide to wait for timer creation before considering the entitlement process complete)
The messages themselves would essentially contain the following fields:
- Tenant ID
- Message type - e.g. capability (handled by mod-roles-keycloak), system user creation (handled by mod-users-keycloak), and scheduled job (handled by mod-scheduler) - if we decide to track scheduled job creation
- Module ID (if applicable - essentially applicable to all message types except the system user creation message)
- Status - success or error
- Details - any additional details, e.g. error messages
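For illustration, a confirmation message could look like the payload below. The field names and values are assumptions for this sketch, not an agreed schema:

```python
import json

# Hypothetical confirmation message payload; field names are illustrative only.
confirmation = {
    "tenantId": "diku",                       # example tenant identifier
    "messageType": "capability",              # capability | systemUser | scheduledJob
    "moduleId": "mod-roles-keycloak-1.0.0",   # omitted for system user messages
    "status": "success",                      # success | error
    "details": None,                          # e.g. the error message on failure
}

payload = json.dumps(confirmation)
```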
Pros and cons
Pros:
- Straightforward to implement
- Handles both success and failure cases
- Independent of the message broker (Kafka can be replaced with, for example, AWS SQS/SNS or any other broker, and the solution will still work the same way)
- Future-proof - if any downstream component has its own downstream components, this will not cause any issues
Cons:
- Requires modifications not only to MTE but also to all downstream services - even though the modifications themselves are trivial
- Some loose coupling between MTE and downstream services - via the specific confirmation types
Kafka entitlement messages consumer offsets tracking
Solution overview
An alternative option allows us to modify only MTE, without changing downstream components. The downside, however, is that we won't be able to tell whether the downstream component actually succeeded or failed in processing the message. This solution also only works if Kafka is used as the message broker, as it is tied specifically to Kafka's way of processing messages.
The idea behind this solution is for MTE to use Kafka's own data to check whether message consumers have actually consumed the messages that MTE produced.
In Kafka, every message is written to a topic partition at a specific offset. The message producer - MTE in this case - may store the offset of the message it produced, and use the Kafka management API to check which consumers (in Kafka terms, "consumer groups") have read those topic partitions past those offsets. This allows MTE to tell whether a particular downstream service has read and processed a Kafka message.
However, it is not possible for MTE to tell whether the processing was successful. The downstream service may have retried processing the message several times and eventually skipped it as impossible to process - or not retried at all, if the message is considered non-retryable by the downstream service.
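The offset check itself reduces to a comparison: a consumer group's committed offset is the offset of the next record it will read, so a message is consumed once the committed offset moves past the offset it was produced at. A minimal Python sketch, where the dictionary shapes are assumptions for illustration:

```python
def consumed(produced_offset, committed_offset):
    """The committed offset is the *next* offset the group will read, so the
    message at produced_offset has been consumed once the committed offset
    exceeds it."""
    return committed_offset > produced_offset

def completed_stages(produced, group_offsets):
    """produced: {(topic, partition, stage_id): offset MTE wrote the message at}
    group_offsets: {(topic, partition): committed offset of the consumer group}
    Returns the stage IDs whose messages the group has read past."""
    return {
        stage
        for (topic, partition, stage), offset in produced.items()
        if consumed(offset, group_offsets.get((topic, partition), -1))
    }
```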
Also, changes to MTE become more complicated, because it would have to use the Kafka management API to check the offsets of specific consumers. It would also have to be configured with the identifiers of those consumers - which must match the ones used by the downstream services.
This may be challenging to configure, and is potentially error-prone.
Process sequence
It is debatable when exactly MTE should call Kafka to verify message processing: either on some sort of timer, which would require adding scheduling capability to MTE, or synchronously when the flow state API is called - which would potentially make that API significantly slower.
Scheduling in distributed systems may be somewhat of a challenge, but the operations we need to perform are idempotent, i.e. we can perform them twice and still get a correct result. In our case, the operation is to check in Kafka whether the consumer has moved past our message offset, and if it has - mark a stage in the MTE DB as passed. Since this operation can be repeated multiple times without negative side effects, we can employ a simple scheduling strategy that doesn't require "exactly once" guarantees.
Therefore I would suggest using the scheduling option.
As a result, MTE should store in its DB the offsets of the Kafka messages it produced, related to the flow and flow stages those messages were produced for. Then, on a schedule, it should call the Kafka management API to verify whether the specific Kafka consumers (consumer groups) it was configured to check have processed those messages (i.e. their consumer group offsets are past the offsets of the messages MTE posted to Kafka). Once they have, MTE should mark the corresponding stages of the flow as completed.
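One scheduled pass of this verification can be sketched as a function that is safe to run repeatedly. The in-memory store and the `fetch_group_offset` callback below are hypothetical stand-ins for MTE's DB and the Kafka management API:

```python
class InMemoryStageStore:
    """Stand-in for MTE's DB table of flow stages awaiting verification."""
    def __init__(self, stages):
        self.stages = {s["id"]: dict(s, status="pending") for s in stages}

    def pending(self):
        return [s for s in self.stages.values() if s["status"] == "pending"]

    def mark_completed(self, stage_id):
        self.stages[stage_id]["status"] = "completed"

def verify_pending_stages(store, fetch_group_offset):
    """One scheduled pass: complete stages whose message the consumer group has
    read past. Idempotent - re-running it never un-completes a stage, so
    overlapping or repeated runs are safe."""
    for stage in store.pending():
        committed = fetch_group_offset(
            stage["group"], stage["topic"], stage["partition"]
        )
        if committed > stage["offset"]:
            store.mark_completed(stage["id"])
```

Because a second run over the same data is a no-op, the scheduler does not need "exactly once" semantics - "at least once" is enough.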
Required changes
This solution requires the following modifications to the code:
- MTE should store in the DB the offsets of the Kafka messages it produced, related to the flow and flow stages those messages were produced for
- MTE should modify flow state based on those offset checks - the flow and some of its stages must get corresponding fields in the DB
- MTE should be configured with the precise names of the downstream Kafka message consumers that are supposed to process those messages
- MTE should call the Kafka management API - either on a schedule or on the get flow state API call - to determine whether the downstream consumers have processed the messages that MTE sent
Pros and cons
Pros:
- Requires modifications only to the MTE codebase
- No direct coupling between MTE and downstream services - except for the Kafka consumer group ID configuration
Cons:
- Specifically ties us to Kafka - this solution cannot be ported to any other message broker
- MTE modifications are somewhat complicated, and in general the architecture of this solution is not very straightforward and may be hard for people from the outside to understand
- Requires a scheduled job for updating flow state - or we risk having stale flow state in the DB until the get flow state API endpoint is called, and that endpoint will be significantly slower
- MTE will only know that the consumer has moved past the message, but not whether processing actually succeeded or failed
- May not work if a downstream component has its own asynchronous steps implemented by posting messages to second-level downstream components (e.g. MTE posts a message to mod-roles-keycloak, but mod-roles-keycloak also posts messages to mod-users-keycloak - so the entitlement is only finished when all of those messages are processed)