Support for transaction processing by two or more users in parallel
Author | Andrei Makaranka |
---|---|
JIRA task | UXPROD-3434 - Update finance transaction model to protect against parallel processing |
Business Requirements | |
Architects Review | |
PO review |
Revision History
Version | Date | Owner | Description | Reason |
---|---|---|---|---|
v1.0 | Initial version | |||
v1.1 | added more cases | |||
v1.2 | Andrei Makaranka | concerns added | ||
v1.3 | add details | |||
v1.4 | removed a case that is no longer relevant | |||
v1.5 | added 2 cases |
Overview
When 2 users or separate mods run operations in parallel, they can send both summaries before sending the operations. mod-finance-storage is not saving previous summaries with the same id (which in practice is the order id for encumbrances), so one of the summaries sent will have no effect, resulting in an error for the second one.
Example of an issue
The mod-orders integration test delete-opened-order-and-lines triggers the deletion of related encumbrances when the fund distribution is removed. There are 2 encumbrances to delete, and mod-orders starts the deletion of both in parallel (as would happen if 2 users were deleting the 2 transactions at the same time). It calls mod-finance, which releases the transactions before deletion. The transaction PUT requires the use of a summary, so mod-finance creates a summary before the update (for each transaction, in parallel).
This is mod-finance's log:
An error occurs because mod-finance-storage is not expecting 2 summaries+transactions to happen at the same time for the same order or invoice.
Case 1 : Two users Open order or Approve invoice at the same time and trying create transaction summary with same order ID or invoice ID
Source code : org.folio.service.finance.transaction.TransactionSummariesService#createOrUpdateOrderTransactionSummary
Case 2 : Can be a problem with moving transaction from temporary to permanent table. Also there is a chance that All Or Nothing mechanism never finish.
Looks like problem that we don't delete transaction summary:
private Future<Void> finishAllOrNothing(JsonObject summary, DBClient client) { return temporaryTransactionDAO.deleteTempTransactions(summary.getString(ID_FIELD_NAME), client) .compose(tr -> transactionSummaryService.setTransactionsSummariesProcessed(summary, client)) ; }
Analysis and Approaches
Quick Comparison
Option and Criterion | Pessimistic Locking | Composite ID for parallel operations |
---|---|---|
Support of parallel processing | No | Yes |
Preliminary complexity | Lower | Higher |
User experience impact | A user is forced to wait for the previous operation to finish processing | The user may not even notice the fact of parallel operation |
Performance impact | May be noticeable | Less noticeable |
Consistency impact | Strong consistency | Achieving consistency is harder |
Pessimistic locking
A few notes:
- the discussed case with business POV is very rare, it occurs under certain conditions of competition during the execution of manual processes; the case was identified only with the help of automatic tests,
- the logic of working with transactions itself is not very simple, so in case of conflicts, their automatic resolution is probably possible, but it can also be difficult.
With this in mind, the approach is based on the prohibition of parallel work with orders and invoices.
Following this approach, mod-finance-storage could return an error when a summary is posted while the previous one for the same order/invoice has not been fully completed. Note that a specific error can be returned so that UI side can handle the error code properly.
It would require changing mod-orders somehow to deal with the example issue given above.
It is necessary to address the concern associated with the fact that if a list of transactions planned with a summary never finishes, the transactions can never be updated again. Timeout mechanism can be considered as a solution for such a case.
Introducing a unique composite ID for parallel transactions
The API could be fixed to allow parallel processing for series of transaction creations and updates. Instead of using the order id or invoice id as a key for the list of edits, we could use the summary id. The caller would get the id as it is returned by the POST method for the summary, and use it for each edit afterwards. Since 2 callers would use 2 different ids, there would be no risk of collision. It would also simplify the API because the same calls could be done for orders and invoices. Internally, mod-finance-storage would need a way to save the summary id at the same time as a temporary transaction, perhaps by storing it simply in the transaction. Temporary transactions would be removed when executed, when the total from the summary is reached.
With that new API, it would be safer to check for parallel edits and deny them if we choose to, because we could just search the temporary transactions for the same order/invoice id as a new one with a different summary id, and return an error if any is found.
This new API would address issues such as the one in mod-orders' TransactionSummariesService, because a new id would always be used, and there would be no risk of conflict with an existing one. So we would no longer have errors in the logs for normal operations, and we would not have to check for 404 errors in calling modules.
Changing the API would require changes in other modules. The old API should be preserved until all dependent modules have been updated.
Architecturally Significant Requirements
As we a going to allow parallel transaction processing for one order, than we should be sure that REPEATABLE READ isolation level set for DB.
Summing up what needs to be done:
mod-finance-storage
- Create a new model "TempTransaction" to support the correlation/summary "id" from the table "order_transaction_summaries".
"TempTransaction" model should extends all field from "Transaction" with addition field: "summaryId". "TempTransaction" will be saved into "temporary_order_transactions" - Table "order_transaction_summaries":
- Add new field "orderId" (old field "id" was "orderId", now "id" will be correlation id and "orderId" will be the order id)
- Existing field "id" will be unique and was generated on POST and should be set in the TempTransaction -> summaryId
- Update logic of the org.folio.service.transactions.AllOrNothingTransactionService#processAllOrNothing
- Transaction POST/PUT use a new transactionSummaryId query parameter, replacing orderId/invoiceId as the summary id. Remove TransactionSummaryService$getSummaryId()
mod-orders
- Update logic in all the places where we call transaction summary API to support updated "order_transaction_summaries" with field "orderId" and "id" should be generated and unique
- The same for invoices
Seems like it will be enough to support concurrent transactions update.
mod-finance
- Update API to pass the summary id for all transaction POST and PUT.
More cases
Some of these cases illustrate some issues with the current implementation, and how using a unique summary id can resolve them. This may or may not be directly related to parallel processing, but solution 2 would resolve these issues at the same time as allowing parallel processing.
An unexpected issue occurs when processing a transaction
Examples: MODINVOICE-363, MODFISTO-281.
This happens as a module such as mod-invoice calls mod-finance to process transactions as part of some larger operation. If an error occurs in mod-finance (which may be caused by insufficient testing for user errors in the calling module), recovery is impossible. Mod-finance returns the "All expected transactions already processed" error. Additionally, debugging becomes very difficult because the original issue cannot be reproduced anymore (as we experienced with MODINVOICE-363).
Implementing solution 2 would resolve these issues.
Note: in most cases this particular issue is now probably avoided with MODFISTO-293.
Damien Is "solution 2" about "Composite ID for parallel operations"?
Andrei Makaranka Yes, if I understand correctly the "Composite ID for parallel operations" approach would include MODFISTO-268 (Option to provide a summary id parameter in the transaction API). By using a unique summary id every time we process several transactions, we avoid potential conflicts.
A user creates a piece while another one updates an order line
This is a case where actual parallel operation would help. It is more likely than 2 users working on the same order at the same time, because receiving teams are often different. Also both teams can work on the same order at the same when a related external event occurs (such as a receipt).
Andrei M: Need to understand which fields have been updated in order to allow or not change.
A module calls mod-finance as part of a larger operation
Modules often validate user data and then start a multi-step operation. Undoing changes if something bad happens in the middle of the multi-step operation is generally not implemented. With solution 1, we would have to rewrite many implementations in several modules to handle errors in case of a conflict, otherwise we would end up with inconsistent states like we do now. With solution 2, conflicts caused by parallel processing should not happen.
Andrei M: it's not a problem to add verification whether the record is already being modified by some user
Performance
A current workaround to avoid parallel processing issues when another module calls mod-finance is to run transactions sequentially. This has a negative impact on performance, as the calling module waits for the previous call to finish before initiating another call. The overhead of initiating a call and getting the results from mod-finance and mod-finance-storage is removed when parallel operations are possible, as the wait happens in mod-finance-storage.
If we implement solution 2, we would have to revert the workarounds, but this case be done progressively as the implementation would still work.
Andrei M : We need to proof it, because over-parallelism degrades performance (DB is bottleneck). Currently "Open order" operation is happened in sequential mode, because the was an issue with opening order with many lines.
API complexity
Currently there are 2 sets of APIs for transaction summaries: order and invoice. We could potentially simplify that with solution 2 by having a single API for transaction summaries, and reduce the implementation. Additionally, we would not need to allow to update a transaction summary, as they would only be created. So we could have 1 entry point instead of 4.
Reducing the API complexity makes the code more readable, and in this case it would be easier to understand.
The order id is unknown, or there are several orders, but we want to process related transactions
Example: MODINVOICE-360
When an invoice is cancelled and we want to unrelease all related encumbrances, we currently need to find the related orders for each encumbrance, sort the encumbrances by order id, and create a transaction summary for each order. This use of the finance API would be greatly simplified if we could use a single summary id for all transactions. It would also improve performance.
Andrei M: Not sure that it is our case, because we a not going to remove orderId or invoiceId from transaction summary table.
Avoid unnecessary calls related to transaction summaries
Example: TransactionSummariesService.java
In mod-orders, when using transaction summaries, we first try to see if one exists with an additional call and test for a 404 before choosing to update or create a transaction summary. Solution 2 would remove the need for that extra call.
A module calls another several times in parallel, which then calls mod-finance-storage with a transaction summary
Example: MODFISTO-259
In the given case, mod-orders was trying to delete several encumbrances at the same time. To do that, it called mod-finance for each encumbrance to release, in parallel. Mod-finance takes care of releasing the encumbrance first when it deletes one. The trouble was that several transaction summaries ended up being used at the same time for the same order, which caused budget errors without any warning. It was resolved by releasing the encumbrances first in mod-orders, with a transaction summary, before deleting them. The solution works but it adds complexity, and we could encounter the same problem in the future if we're not very careful with the use of the financial API.
Solution 2 would resolve the issue and allow simpler implementations. We would not have to worry about creating new similar issues in the future.
2 users update order lines linked to different opened orders using the same budget, at the same time
Example: parallel-update-order-lines-different-orders
in the MODFISTO-260 integration tests
→ This is fixed by MODFISTO-304 and previous stories.
GET resource, modification, PUT resource
Examples: updateEncumbrancesOrderStatus (mod-orders), unreleaseEncumbrancesForPoLines (mod-invoice), cancelInvoice (mod-invoice), payPoLines (mod-invoice)
2 apps or 2 end users can conflict when using GET+PUT on the same resource (transactions, order lines, anything with a PUT).
Optimistic locking would at least provide an error when it happens, and it would be simple to implement. It would give us an idea of how frequently this happens in practice.
To better deal with end user conflicts, when a put method implementation has some logic before trying to save the resource, we should do a get from storage first and check the version, and return a 409 if needed. A 409 could still happen afterwards, but it would be less likely because the time scale is different for end users and apps.
Also, we could avoid these issues by moving logic from the calling to the receiving app and adding new endpoints. For instance, instead of doing GET transaction, change encumbrance status to unreleased, PUT transaction, we could call a new /finance/encumbrances/{id}/unrelease
endpoint that would use a lock on the transaction while the change is taking place. Or we could use a PATCH method to avoid creating many endpoints.
Encumbrance restriction validation
Example: processEncumbrances
If 2 users try to encumber the same budget at the same time, validation can pass for both, but either there will be an error afterwards (and possibly an inconsistent state because other modifications happened in between and no rollback is implemented) or they will go beyond the encumbrance restrictions.
(there is an additional problem when a piece is created, as the inventory is created before the encumbrances are even checked)