[MODINVSTOR-910] Optimistic locking makes Inventory-Batch APIs upsert loads fail Created: 18/May/22 Updated: 21/Jun/22 Resolved: 21/Jun/22 |
|
| Status: | Closed |
| Project: | mod-inventory-storage |
| Components: | None |
| Affects versions: | None |
| Fix versions: | 24.0.0 |
| Type: | Bug | Priority: | P2 |
| Reporter: | Theodor Tolstoy (One-Group.se) | Assignee: | Julian Ladisch |
| Resolution: | Done | Votes: | 0 |
| Labels: | support | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
||||||||||||||||||||||||
| Sprint: | |||||||||||||||||||||||||
| Development Team: | Core: Platform | ||||||||||||||||||||||||
| Release: | Morning Glory (R2 2022) | ||||||||||||||||||||||||
| RCA Group: | Incomplete/missing requirements | ||||||||||||||||||||||||
| Description |
|
Overview: Inventory has a set of API:s allowing you to post records to FOLIO In batch. These API:s also allows for the same data to be overlaid (Upserted) by adding the parameter
upsert=true
to the request. Documentation is here: https://s3.amazonaws.com/foliodocs/api/mod-inventory-storage/p/item-sync.html Now it seems like this upsert functionality is broken, by the Optimistic Locking functionality. This has been verified in Lotus Bugfest. Steps to Reproduce:
Expected Results: Records are successfully posted twice. Actual Results: An HTTP 409 Optimistic locking conflict is being raised. Solution: The batch sync APIs work as designed. However, the API documentation had been quite incomplete. This has been fixed in scope of this Jira. |
| Comments |
| Comment by Charlotte Whitt [ 18/May/22 ] |
|
Khalilah Gambrell - I labelled this ticket with RCA: Incomplete/missing requirements |
| Comment by Charlotte Whitt [ 18/May/22 ] |
|
Charlotte Whitt: 1 minute ago Theodor Tolstoy < 1 minute ago |
| Comment by Ingolf Kuss [ 19/May/22 ] |
|
Why is the _version field not included in the initial POST ? |
| Comment by Marc Johnson [ 19/May/22 ] |
Are you asking about the representation sent by the client or the one sent back by the server? I believe you are asking about the representation sent by the client. In which case, it should not be (and if it is, it should be ignored) because it is a server controlled property. (As FOLIO tends to use the same schema for creation, update and read these expectations tend not to be expressed explicitly in the interface). |
| Comment by Marc Johnson [ 19/May/22 ] |
|
Charlotte Whitt Khalilah Gambrell The optimistic concurrency work was done by the Core Platform team. I think this work would be better picked up by that team. |
| Comment by Ingolf Kuss [ 19/May/22 ] |
|
> Are you asking about the representation sent by the client or the one sent back by the server? Yes, I was asking about the representation sent by the client. I now understand that, if I do an upsert, the record is read, enriched (by the API) by a _version field (if it is not present, yet), and then send back to item-storage with that version field. From then on, optimistic locking will work for that record. |
| Comment by Marc Johnson [ 19/May/22 ] |
What are you referring to as an upsert? An API request or a database operation? |
| Comment by Ingolf Kuss [ 19/May/22 ] |
|
This API request /item-storage/batch/synchronous?upsert=true |
| Comment by Marc Johnson [ 19/May/22 ] |
Oh yeah, I'm not very familiar with the batch APIs. I imagine that is a challenge because a create SHOULD not have a version at all and an update MUST not have a version. Thus the client needs to know which is which. I think that might be one for the folks who were involved in both of these areas to ponder. |
| Comment by Ingolf Kuss [ 19/May/22 ] |
|
I agree. I was not aware that a CREATE should not have a version when I wrote my initial comment. |
| Comment by Theodor Tolstoy (One-Group.se) [ 19/May/22 ] |
|
Does anyone know if there is any temporary workaround for this? Like putting a -1 in the version property or something similar |
| Comment by Julian Ladisch [ 19/May/22 ] |
|
Documentation about Optimistic Locking: https://folio-org.atlassian.net/wiki/display/DD/Optimistic+locking+proposal |
| Comment by Julian Ladisch [ 19/May/22 ] |
|
The POST /item-storage/batch/synchronous API works exactly as specified. When creating a new record the client sends the record without _version property. When updating a record the client needs to send the record with the _version property as stored in the DB:
In step 2. the _version property is missing. In step 2. the record is rejected to prevent overwriting the record that was created by someone else. I suggest to close this Jira as "Won't do" because it's not a bug. |
| Comment by Ingolf Kuss [ 19/May/22 ] |
|
> Does anyone know if there is any temporary workaround for this? Like putting a -1 in the version property or something similar Reading Julians comment carefully I conclude that
I further conclude that there are two alternative ways to avoid the error:
|
| Comment by Marc Johnson [ 19/May/22 ] |
I disagree with this. As I understand it, the intention of an upsert operation is that the client is ignorant of whether a record already exists:
The clients expects that the resulting state is that a record exists with those properties in I think expecting the client to know whether a record exists or not (to decide whether to provide certain properties or not) rather contradicts the intention of an API that states it is performing an upsert. |
| Comment by Marc Johnson [ 19/May/22 ] |
Enabling / disabling optimistic concurrency can only be done by releasing a new version of the module. I think that makes this not a practical approach for general continued use of the API. |
| Comment by Julian Ladisch [ 19/May/22 ] |
Upsert is about concurrency. In some use cases the client is ignorant about the record existence, in other use cases the client isn't. Here we have the latter. Instead of the /item-storage/batch/synchronous?upsert=true API we could have provided these two APIs:
The implementation for them is almost the same, therefore they have been combined so that clients can mix inserts and updates in a single request. Note that /item-storage/batch/synchronous-update requires a correct _version property. |
| Comment by Marc Johnson [ 19/May/22 ] |
If the client knows whether a record exists, what is the purpose of using upsert instead of a dedicated insert or update? |
| Comment by Marc Johnson [ 19/May/22 ] |
I don't understand what this means. It seems to directly contradict the statement in the PostgreSQL documentation I quoted. Please can you explain what you mean by this. |
| Comment by Brooks Travis [ 19/May/22 ] |
|
The use case here is: I have a bunch of records I’ve just transformed from a previous system, with deterministic UUIDs, and I need to post them to FOLIO. If they don’t exist, create them, but if they do, replace that old record with the one I’m providing, and don’t bother me about it, I’m right, the database is wrong, just do what I say. 😁 |
| Comment by Julian Ladisch [ 19/May/22 ] |
|
Marc Johnson From the PostgreSQL wiki:
From the PostgreSQL documenation:
Upsert is an atomic operation. This avoids difficulties arising from concurrency. Upsert is usually easier than executing an INSERT and an UPDATE operation in sequence. Example: Insert 2 records and update 2 record, if the insert succeeds but the update fails because of missing foreign key one needs to manually roll back the insert. An "independent error" mentioned in the documentation can be a foreign key violation or an optimistic locking violation. |
| Comment by Julian Ladisch [ 19/May/22 ] |
|
Brooks Travis: This issue is not a bug report but a feature request. You request a bulk API with optimistic locking disabled. Please create a Jira with detailed spec. Providing an API with optimistic locking disabled has been discussed in length and detail (part of that in
|
| Comment by Marc Johnson [ 19/May/22 ] |
I agree that an atomic upsert operation reduces the impact of concurrency compared to using a combination of statements to (try to) achieve the same outcome. I disagree with your assertion the idea of an upsert is fundamentally only about concurrency. I don't think we are going to agree. I think this conversation is distracting from the (what I personally think is the) more important one about the clients expectations (that the request should be the same regardless of whether the record exists or not). |
| Comment by Jakub Skoczen [ 20/May/22 ] |
|
Brooks Travis Theodor Tolstoy (One-Group.se) Julian Ladisch I agree with Julian that this is not a bug but a new feature. The process of rolling out OL took around 2 years which included communication with POs and individual teams. According to https://folio-org.atlassian.net/wiki/display/DD/Optimistic+locking+proposal the "logOnConflict" setting has been introduced on 2021-02-01 which provided warning in the logfiles. In the existing implementation, Batch APIs are subject to the same OL mechanism as regular APIs. What this means in practice is that the usage of existing batch APIs remain the same for initial load but require "version" handling for subsequent updates. My understanding is that subsequent updates should be handled for Data Import. If this is not the case, I agree that functionality to explicitly disable OL for batch APIs should be defined and implemented. |
| Comment by Marc Johnson [ 24/May/22 ] |
|
This has been added to Prokopovych's sprint. What are your expectations for the team when they pick up this issue? As I understand it, both Julian Ladisch and Jakub Skoczen have stated that this isn't a bug and the API is working as intended. |
| Comment by Ian Walls [ 24/May/22 ] |
|
From the perspective of a repeat-implementer, losing the ability to upsert records to gain optimistic locking is a regression. When doing an implementation, iterative loads are the norm, as there can be numerous cases of record loading failure. Having to wipe out everything in Inventory between each load is wasteful, and may be outright untenable if the incoming record load is a partial list (say, in the case of building a union catalog from multiple data sources). I feel like we've made a bad trade off: protecting against a theoretic possibility of record update conflicts (exceedingly rare in all my practical experience) at the cost of being able to do effective data migrations. |
| Comment by Marc Johnson [ 24/May/22 ] |
|
Charlotte Whitt I don't think the Prokopovych team can pick this up until folks can agree what change should be. I also think we could consider the Core Platform team picking this issue up, as I believe they implemented both these APIs and the optimistic concurrency controls. I've put the issue into draft and removed it from the sprint. |
| Comment by Julian Ladisch [ 24/May/22 ] |
May I draw your attention to the "Use cases collected from community" listed in the description section of https://folio-org.atlassian.net/browse/UXPROD-3058 "Optimistic Locking"? Practical experience of many libraries showed a high demand for optimistic locking lifting it to the 3rd place in the Kiwi pointing process: https://folio-org.atlassian.net/wiki/display/COMMUNITY/Pointing+Results+for+Kiwi+Release |
| Comment by Julian Ladisch [ 24/May/22 ] |
|
Your use case is supported by the DELETE by CQL API that allows you to delete records in batches of 100: https://folio-org.atlassian.net/browse/MODINVSTOR-576 Therefore there is no need to disable optimistic locking in the POST bulk APIs. |
| Comment by Theodor Tolstoy (One-Group.se) [ 08/Jun/22 ] |
|
Late to the game, but to me this is a regression bug that needs to get fixed. There is nothing in the schemas, or documentation, or release notes indicating that the version property is required, and it seems like a oversight from both the implementing parties and the parties that use this enpoint to communicate the matter (guilty 🙋♂️). Pointing to library use cases here is not relevant since this is used for initial data migration and performing large-scale corrections by service providers. Pointing to DELETE APIs is not a valid argument either, because FOLIO is terrible at deleting things at scale, both at the database and the API level, often leading to unresponsiveness.
|
| Comment by Theodor Tolstoy (One-Group.se) [ 08/Jun/22 ] |
|
An unobtrusive propasal for a fix is to add a new parameter, like sidestepOptimisticLocking, to do just that, and not risk breaking the code for someone that been waiting and worked on implementing the optimisticLocking in their calls to the Batch API for the last two years...😉 |
| Comment by Marc Johnson [ 08/Jun/22 ] |
|
Theodor Tolstoy (One-Group.se)
How do you think it should be fixed?
It is unfortunate that this isn't explicit. This is a general limitation with how FOLIO uses JSON schema. We use a single schema for create and update, meaning that there is no explicit way to describe when a property becomes required for updates. Charlotte Whitt Khalilah Gambrell Jakub Skoczen I'm going to move this to Core Platform, as both the batch APIs and optimistic concurrency control were implemented by that team. |
| Comment by Theodor Tolstoy (One-Group.se) [ 08/Jun/22 ] |
|
Marc Johnson i added a propsal in a subsequent comment above. If you are asking me for the inner implementation, I must admit I do not have a clue....
|
| Comment by Marc Johnson [ 08/Jun/22 ] |
|
Theodor Tolstoy (One-Group.se)
Thank you for expanding the description of your expectations. |
| Comment by Erin Nettifee [ 13/Jun/22 ] |
|
Duke reported something that I believe is related - https://folio-org.atlassian.net/browse/MODINVSTOR-846 - though that jira still shows as blocked. |
| Comment by Julian Ladisch [ 14/Jun/22 ] |
|
| Comment by Julian Ladisch [ 14/Jun/22 ] |
Please create new Jira explaining in more detail why the delete-by-cql API https://folio-org.atlassian.net/browse/MODINVSTOR-576 is terrible. |
| Comment by Julian Ladisch [ 14/Jun/22 ] |
|
The batch synchronous implementations got the same optimistic locking implementation as the single record APIs. There were long discussions that changing an existing record always requires the correct _version property. It was intended that introducing optimistic locking is a breaking change for both batch and single record APIs. The incomplete API documentation https://s3.amazonaws.com/foliodocs/api/mod-inventory-storage/p/item-sync.html has been expanded:
Do you have more suggestions how to improve this API documentation? Otherwise I suggest to close this documentation bug report as done. There is a new story
|
| Comment by Ian Walls [ 14/Jun/22 ] |
|
The fundamental problem here is that optimistic locking has negatively impacted data migration. We're prioritizing the prevention of theoretical data loss, based on the worries of librarians in response to their previous systems, over the necessity of data migrators to be able to iteratively load large amounts of inventory data in order to bring FOLIO live within a timeframe. The very nature of 'upsert' is that we don't know whether a record we're POSTing is already in the system or not. It's current _version, according to FOLIO, is a mystery. Looking up each record's FOLIO _version is tantamount to updating each record individually, thereby negating the value of the batch API. This is not a documentation issue; this is an instance of one feature being implemented at the cost of another. I believe the correct solution is to make optimistic locking optional, either at the API level, or universally. |
| Comment by Theodor Tolstoy (One-Group.se) [ 14/Jun/22 ] |
I have no reason for doing so, since I was referring to current and past experiences deleting inventory records at scale. If the much-awaited
|
| Comment by Marc Johnson [ 14/Jun/22 ] |
Unfortunately, that conflicts (no pun intended) with the design of some of these records, where the expectations around some fields are different depending upon whether the record is new or not. I believe there is another issue around this topic relating to HRIDs. There may well be other fields that are affected by this. This is a challenge that may well come up repeatedly for this style of API, given the mismatch in design considerations. |
| Comment by Julian Ladisch [ 21/Jun/22 ] |
|
The feature request for disabling optimistic locking has been split off to https://folio-org.atlassian.net/browse/MODINVSTOR-924 "Batch update with optimistic locking disabled" The documentation for the existing batch sync APIs has been extended to explain in detail how the APIs were designed and implemented:
Therefore this Jira is done and closed. |