[MODINVSTOR-910] Optimistic locking makes Inventory-Batch APIs upsert loads fail Created: 18/May/22  Updated: 21/Jun/22  Resolved: 21/Jun/22

Status: Closed
Project: mod-inventory-storage
Components: None
Affects versions: None
Fix versions: 24.0.0

Type: Bug Priority: P2
Reporter: Theodor Tolstoy (One-Group.se) Assignee: Julian Ladisch
Resolution: Done Votes: 0
Labels: support
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Defines
defines UXPROD-3691 R3 2022 | Prokopovych Enhancements/Bu... Closed
Duplicate
is duplicated by SUP-74 Regression: Inventory Storage Batch S... Closed
Relates
relates to MODINVSTOR-924 Batch update with optimistic locking ... Closed
Sprint:
Development Team: Core: Platform
Release: Morning Glory (R2 2022)
RCA Group: Incomplete/missing requirements

 Description   

Overview:

Inventory has a set of API:s allowing you to post records to FOLIO In batch. These API:s also allows for the same data to be overlaid (Upserted) by adding the parameter

upsert=true

to the request. Documentation is here: https://s3.amazonaws.com/foliodocs/api/mod-inventory-storage/p/item-sync.html

Now it seems like this upsert functionality is broken, by the Optimistic Locking functionality. 

This has been verified in Lotus Bugfest. 

Steps to Reproduce:

  1. POST a batch of records to /item-storage/batch/synchronous according to the documentation. The records does not have the _version property.
  2. Post the same batch again to /item-storage/batch/synchronous?upsert=true

Expected Results:

Records are successfully posted twice. 
**

Actual Results:

An HTTP 409 Optimistic locking conflict is being raised.

Solution:

The batch sync APIs work as designed. However, the API documentation had been quite incomplete. This has been fixed in scope of this Jira.



 Comments   
Comment by Charlotte Whitt [ 18/May/22 ]

Khalilah Gambrell - I labelled this ticket with RCA: Incomplete/missing requirements

Comment by Charlotte Whitt [ 18/May/22 ]

Charlotte Whitt: 1 minute ago
Which libraries are affected by this @Theodor?

Theodor Tolstoy < 1 minute ago
All that are not live yet....

Comment by Ingolf Kuss [ 19/May/22 ]

Why is the _version field not included in the initial POST ?

Comment by Marc Johnson [ 19/May/22 ]

Ingolf Kuss

Why is the _version field not included in the initial POST ?

Are you asking about the representation sent by the client or the one sent back by the server?

I believe you are asking about the representation sent by the client. In which case, it should not be (and if it is, it should be ignored) because it is a server controlled property.

(As FOLIO tends to use the same schema for creation, update and read these expectations tend not to be expressed explicitly in the interface).

Comment by Marc Johnson [ 19/May/22 ]

Charlotte Whitt Khalilah Gambrell

The optimistic concurrency work was done by the Core Platform team. I think this work would be better picked up by that team.

Comment by Ingolf Kuss [ 19/May/22 ]

> Are you asking about the representation sent by the client or the one sent back by the server?

Yes, I was asking about the representation sent by the client. I now understand that, if I do an upsert, the record is read, enriched (by the API) by a _version field (if it is not present, yet), and then send back to item-storage with that version field. From then on, optimistic locking will work for that record.

Comment by Marc Johnson [ 19/May/22 ]

Ingolf Kuss

I now understand that, if I do an upsert, the record is read, enriched (by the API) by a _version field (if it is not present, yet), and then send back to item-storage with that version field. From then on, optimistic locking will work for that record.

What are you referring to as an upsert? An API request or a database operation?

Comment by Ingolf Kuss [ 19/May/22 ]

This API request /item-storage/batch/synchronous?upsert=true

Comment by Marc Johnson [ 19/May/22 ]

Ingolf Kuss

This API request /item-storage/batch/synchronous?upsert=true

Oh yeah, I'm not very familiar with the batch APIs.

I imagine that is a challenge because a create SHOULD not have a version at all and an update MUST not have a version. Thus the client needs to know which is which.

I think that might be one for the folks who were involved in both of these areas to ponder.

cc: Jakub Skoczen Ian Walls

Comment by Ingolf Kuss [ 19/May/22 ]

I agree. I was not aware that a CREATE should not have a version when I wrote my initial comment.

Comment by Theodor Tolstoy (One-Group.se) [ 19/May/22 ]

Does anyone know if there is any temporary workaround for this? Like putting a -1 in the version property or something similar

Comment by Julian Ladisch [ 19/May/22 ]

Documentation about Optimistic Locking: https://folio-org.atlassian.net/wiki/display/DD/Optimistic+locking+proposal

Comment by Julian Ladisch [ 19/May/22 ]

The POST /item-storage/batch/synchronous API works exactly as specified.

When creating a new record the client sends the record without _version property.

When updating a record the client needs to send the record with the _version property as stored in the DB:

  • If no _version property is stored in the DB the new record must not have a _version.
  • If a _version property is stored in the DB because the record has been created or updated after optimistic locking was enabled then the new record must have the _version value as stored. Otherwise it is an optimistic locking conflict and the record to be rejected.

In step 2. the _version property is missing.

In step 2. the record is rejected to prevent overwriting the record that was created by someone else.

I suggest to close this Jira as "Won't do" because it's not a bug.

Comment by Ingolf Kuss [ 19/May/22 ]

> Does anyone know if there is any temporary workaround for this? Like putting a -1 in the version property or something similar

Reading Julians comment carefully I conclude that 

  1. in step 1. a _version property has been stored in the DB because the record has been created after optimistic locking was enabled.
  2. in step 2. you must provide the same _version property as stored in the DB in order to avoid the locking conflict.

I further conclude that there are two alternative ways to avoid the error:

  1. enable optimistic locking only after you have performed step 1.
  2. use a _version property in all record representations sent in step 2. by the client. The question is just which value to use here. In other words, what is the initial value that the DB assigns when it creates a new record that is subject to optimistic locking. Probably "1" ?
Comment by Marc Johnson [ 19/May/22 ]

Julian Ladisch

The POST /item-storage/batch/synchronous API works exactly as specified.

When creating a new record the client sends the record without _version property.

When updating a record the client needs to send the record with the _version property as stored in the DB.

I disagree with this.

As I understand it, the intention of an upsert operation is that the client is ignorant of whether a record already exists:

"UPSERT" is a DBMS feature that allows a DML statement's author to atomically either insert a row, or on the basis of the row already existing, UPDATE that existing row instead, while safely giving little to no further thought to concurrency

The clients expects that the resulting state is that a record exists with those properties in

I think expecting the client to know whether a record exists or not (to decide whether to provide certain properties or not) rather contradicts the intention of an API that states it is performing an upsert.

Comment by Marc Johnson [ 19/May/22 ]

Ingolf Kuss

enable optimistic locking only after you have performed step 1.

Enabling / disabling optimistic concurrency can only be done by releasing a new version of the module. I think that makes this not a practical approach for general continued use of the API.

Comment by Julian Ladisch [ 19/May/22 ]

the client is ignorant of whether a record already exists

Upsert is about concurrency.

In some use cases the client is ignorant about the record existence, in other use cases the client isn't. Here we have the latter.

Instead of the /item-storage/batch/synchronous?upsert=true API we could have provided these two APIs:

  • /item-storage/batch/synchronous-insert
  • /item-storage/batch/synchronous-update

The implementation for them is almost the same, therefore they have been combined so that clients can mix inserts and updates in a single request.

Note that /item-storage/batch/synchronous-update requires a correct _version property.

Comment by Marc Johnson [ 19/May/22 ]

Julian Ladisch

In some use cases the client is ignorant about the record existence, in other use cases the client isn't. Here we have the latter.

If the client knows whether a record exists, what is the purpose of using upsert instead of a dedicated insert or update?

Comment by Marc Johnson [ 19/May/22 ]

Julian Ladisch

Upsert is about concurrency.

I don't understand what this means. It seems to directly contradict the statement in the PostgreSQL documentation I quoted. Please can you explain what you mean by this.

Comment by Brooks Travis [ 19/May/22 ]

The use case here is: I have a bunch of records I’ve just transformed from a previous system, with deterministic UUIDs, and I need to post them to FOLIO. If they don’t exist, create them, but if they do, replace that old record with the one I’m providing, and don’t bother me about it, I’m right, the database is wrong, just do what I say. 😁

Comment by Julian Ladisch [ 19/May/22 ]

Marc Johnson
Upsert is about concurrency.

From the PostgreSQL wiki:

"UPSERT" is a DBMS feature that allows a DML statement's author to atomically either insert a row, or on the basis of the row already existing, UPDATE that existing row instead, while safely giving little to no further thought to concurrency

From the PostgreSQL documenation:

ON CONFLICT DO UPDATE guarantees an atomic INSERT or UPDATE outcome; provided there is no independent error, one of those two outcomes is guaranteed, even under high concurrency. This is also known as UPSERT — “UPDATE or INSERT”.

Upsert is an atomic operation. This avoids difficulties arising from concurrency.

Upsert is usually easier than executing an INSERT and an UPDATE operation in sequence. Example: Insert 2 records and update 2 record, if the insert succeeds but the update fails because of missing foreign key one needs to manually roll back the insert.

An "independent error" mentioned in the documentation can be a foreign key violation or an optimistic locking violation.

Comment by Julian Ladisch [ 19/May/22 ]

Brooks Travis: This issue is not a bug report but a feature request. You request a bulk API with optimistic locking disabled. Please create a Jira with detailed spec.

Providing an API with optimistic locking disabled has been discussed in length and detail (part of that in RMB-727 Closed , RMB-719 Closed ) and rejected. Instead the existing record should be fetched to get the _version property – then the bulk API can be invoked with correct _version property.

Comment by Marc Johnson [ 19/May/22 ]

Julian Ladisch

Upsert is about concurrency.

I agree that an atomic upsert operation reduces the impact of concurrency compared to using a combination of statements to (try to) achieve the same outcome.

I disagree with your assertion the idea of an upsert is fundamentally only about concurrency.

I don't think we are going to agree.

I think this conversation is distracting from the (what I personally think is the) more important one about the clients expectations (that the request should be the same regardless of whether the record exists or not).

Comment by Jakub Skoczen [ 20/May/22 ]

Brooks Travis Theodor Tolstoy (One-Group.se) Julian Ladisch

I agree with Julian that this is not a bug but a new feature. The process of rolling out OL took around 2 years which included communication with POs and individual teams. According to https://folio-org.atlassian.net/wiki/display/DD/Optimistic+locking+proposal the "logOnConflict" setting has been introduced on 2021-02-01 which provided warning in the logfiles.

In the existing implementation, Batch APIs are subject to the same OL mechanism as regular APIs. What this means in practice is that the usage of existing batch APIs remain the same for initial load but require "version" handling for subsequent updates. My understanding is that subsequent updates should be handled for Data Import. If this is not the case, I agree that functionality to explicitly disable OL for batch APIs should be defined and implemented.

Comment by Marc Johnson [ 24/May/22 ]

Charlotte Whitt

This has been added to Prokopovych's sprint. What are your expectations for the team when they pick up this issue?

As I understand it, both Julian Ladisch and Jakub Skoczen have stated that this isn't a bug and the API is working as intended.

Comment by Ian Walls [ 24/May/22 ]

From the perspective of a repeat-implementer, losing the ability to upsert records to gain optimistic locking is a regression.  When doing an implementation, iterative loads are the norm, as there can be numerous cases of record loading failure.  Having to wipe out everything in Inventory between each load is wasteful, and may be outright untenable if the incoming record load is a partial list (say, in the case of building a union catalog from multiple data sources).

I feel like we've made a bad trade off: protecting against a theoretic possibility of record update conflicts (exceedingly rare in all my practical experience) at the cost of being able to do effective data migrations.

Comment by Marc Johnson [ 24/May/22 ]

Charlotte Whitt I don't think the Prokopovych team can pick this up until folks can agree what change should be.

I also think we could consider the Core Platform team picking this issue up, as I believe they implemented both these APIs and the optimistic concurrency controls.

I've put the issue into draft and removed it from the sprint.

Comment by Julian Ladisch [ 24/May/22 ]

theoretic possibility of record update conflicts

May I draw your attention to the "Use cases collected from community" listed in the description section of https://folio-org.atlassian.net/browse/UXPROD-3058 "Optimistic Locking"?

Practical experience of many libraries showed a high demand for optimistic locking lifting it to the 3rd place in the Kiwi pointing process: https://folio-org.atlassian.net/wiki/display/COMMUNITY/Pointing+Results+for+Kiwi+Release

Comment by Julian Ladisch [ 24/May/22 ]

Your use case is supported by the DELETE by CQL API that allows you to delete records in batches of 100: https://folio-org.atlassian.net/browse/MODINVSTOR-576

Therefore there is no need to disable optimistic locking in the POST bulk APIs.

Comment by Theodor Tolstoy (One-Group.se) [ 08/Jun/22 ]

Late to the game, but to me this is a regression bug that needs to get fixed. There is nothing in the schemas, or documentation, or release notes indicating that the version property is required, and it seems like a oversight from both the implementing parties and the parties that use this enpoint to communicate the matter (guilty 🙋‍♂️).

Pointing to library use cases here is not relevant since this is used for initial data migration and performing large-scale corrections by service providers.

Pointing to DELETE APIs is not a valid argument either, because FOLIO is terrible at deleting things at scale, both at the database and the API level, often leading to unresponsiveness. 

 

Comment by Theodor Tolstoy (One-Group.se) [ 08/Jun/22 ]

An unobtrusive propasal for a fix is to add a new parameter, like sidestepOptimisticLocking, to do just that, and not risk breaking the code for someone that been waiting and worked on implementing the optimisticLocking in their calls to the Batch API for the last two years...😉

Comment by Marc Johnson [ 08/Jun/22 ]

Theodor Tolstoy (One-Group.se)

to me this is a regression bug that needs to get fixed

How do you think it should be fixed?

There is nothing in the schemas

It is unfortunate that this isn't explicit. This is a general limitation with how FOLIO uses JSON schema. We use a single schema for create and update, meaning that there is no explicit way to describe when a property becomes required for updates.

Charlotte Whitt Khalilah Gambrell Jakub Skoczen

I'm going to move this to Core Platform, as both the batch APIs and optimistic concurrency control were implemented by that team.

Comment by Theodor Tolstoy (One-Group.se) [ 08/Jun/22 ]

Marc Johnson i added a propsal in a subsequent comment above. If you are asking me for the inner implementation, I must admit I do not have a clue....

 

Comment by Marc Johnson [ 08/Jun/22 ]

Theodor Tolstoy (One-Group.se)

i added a propsal in a subsequent comment above.

Thank you for expanding the description of your expectations.

Comment by Erin Nettifee [ 13/Jun/22 ]

Duke reported something that I believe is related - https://folio-org.atlassian.net/browse/MODINVSTOR-846 - though that jira still shows as blocked.

Stewart Engart Jeff Fleming

Comment by Julian Ladisch [ 14/Jun/22 ]

There is nothing in the schemas, or documentation, or release notes indicating that the version property is required

https://folio-org.atlassian.net/wiki/display/REL/Lotus+%28R1+2022%29+Release+Notes#Lotus(R12022)ReleaseNotes-Notesonfunctionality says:

Optimistic locking has been enabled for inventory instances, holdings records, and items (Optimistic locking proposal, UXPROD-1752 Closed , MODINVSTOR-845 Closed ).

Comment by Julian Ladisch [ 14/Jun/22 ]

FOLIO is terrible at deleting things at scale, both at the database and the API level, often leading to unresponsiveness. 

Please create new Jira explaining in more detail why the delete-by-cql API https://folio-org.atlassian.net/browse/MODINVSTOR-576 is terrible.

Comment by Julian Ladisch [ 14/Jun/22 ]

The batch synchronous implementations got the same optimistic locking implementation as the single record APIs. There were long discussions that changing an existing record always requires the correct _version property. It was intended that introducing optimistic locking is a breaking change for both batch and single record APIs.

The incomplete API documentation https://s3.amazonaws.com/foliodocs/api/mod-inventory-storage/p/item-sync.html has been expanded:

POST /item-storage/batch/synchronous

Create or update a collection of items in a single synchronous request; if any item fails the complete batch will be rejected (all or nothing)

POST /item-storage/batch/synchronous

Query Parameters

upsert: (boolean - default: false)

When a record with the same id already exists upsert=true will update it, upsert=false will fail the complete batch. The _version property of each item to be updated must match the stored _version property (optimistic locking).

Do you have more suggestions how to improve this API documentation? Otherwise I suggest to close this documentation bug report as done. There is a new story MODINVSTOR-924 Closed "Batch update with optimistic locking disabled" to discuss the feature request.

Comment by Ian Walls [ 14/Jun/22 ]

The fundamental problem here is that optimistic locking has negatively impacted data migration.  We're prioritizing the prevention of theoretical data loss, based on the worries of librarians in response to their previous systems, over the necessity of data migrators to be able to iteratively load large amounts of inventory data in order to bring FOLIO live within a timeframe.

The very nature of 'upsert' is that we don't know whether a record we're POSTing is already in the system or not.  It's current _version, according to FOLIO, is a mystery.  Looking up each record's FOLIO _version is tantamount to updating each record individually, thereby negating the value of the batch API.

This is not a documentation issue; this is an instance of one feature being implemented at the cost of another.

I believe the correct solution is to make optimistic locking optional, either at the API level, or universally.

Comment by Theodor Tolstoy (One-Group.se) [ 14/Jun/22 ]

Please create new Jira explaining in more detail why the delete-by-cql API https://folio-org.atlassian.net/browse/MODINVSTOR-576 is terrible

I have no reason for doing so, since I was referring to current and past experiences deleting inventory records at scale. If the much-awaited MODINVSTOR-576 Closed passes code review and turns out to be terrible, I will of course file Jiras. 

Comment by Marc Johnson [ 14/Jun/22 ]

Ian Walls

The very nature of 'upsert' is that we don't know whether a record we're POSTing is already in the system or not.

Unfortunately, that conflicts (no pun intended) with the design of some of these records, where the expectations around some fields are different depending upon whether the record is new or not.

I believe there is another issue around this topic relating to HRIDs. There may well be other fields that are affected by this.

This is a challenge that may well come up repeatedly for this style of API, given the mismatch in design considerations.

Comment by Julian Ladisch [ 21/Jun/22 ]

The feature request for disabling optimistic locking has been split off to https://folio-org.atlassian.net/browse/MODINVSTOR-924 "Batch update with optimistic locking disabled"

The documentation for the existing batch sync APIs has been extended to explain in detail how the APIs were designed and implemented:

Therefore this Jira is done and closed.

Generated at Thu Feb 08 22:22:28 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.