Circ Rules and Loan Policies (UXPROD-789)

[UXPROD-3515] Concurrent Check-outs via Self-check do not check for patron item limits Created: 15/Nov/21  Updated: 28/Sep/23  Resolved: 28/Sep/23

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: Poppy (R2 2023)
Parent: Circ Rules and Loan Policies

Type: New Feature Priority: P3
Reporter: Cheryl Malmborg Assignee: Tim Auger
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Continues
is continued by CIRC-1821 End to End testing between Mod circul... Closed
is continued by CIRCSTORE-423 Implement Get End Points for checkout... Closed
is continued by FAT-6575 Adding karate test case to check the ... Closed
Relates
relates to CIRC-1822 In continuation of CIRC-1756 Closed
relates to CIRC-1380 SPIKE: Investigate possibility of usi... Closed
relates to CIRC-1405 Create a design document for Check Ou... Closed
relates to CIRC-1756 Implement Checkout Lock feature Closed
relates to CIRC-1786 Spike: Analyze of design document for... Closed
relates to CIRC-1805 Determine performance impact for chec... Closed
relates to CIRCSTORE-416 Implement Post and Delete End Points ... Closed
relates to PERF-627 [Poppy] Checkout by barcode API by en... Closed
relates to CIRC-1378 SPIKE: Check-outs via Self-check do n... Closed
relates to CIRC-1740 SPIKE: Investigate possibility of res... Closed
Requires
requires CIRC-1377 SPIKE: Check-outs via Self-check do n... Closed
Release: Poppy (R2 2023)
Epic Link: Circ Rules and Loan Policies
Front End Estimate: Out of scope
Back End Estimate: XL < 15 days
Development Team: Volaris
PO Rank: 0
Rank: Cornell (Full Sum 2021): R4
Rank: Duke (Full Sum 2021): R3

 Description   

Possible bug in circulation: https://github.com/folio-org/mod-circulation/blob/v22.1.3/src/main/java/org/folio/circulation/resources/CheckOutByBarcodeResource.java#L125-L140

The selfcheck machine connects to FOLIO through SIP2 and checks out multiple items to the same patron at the same time (each checkout message does not wait for the processing result, and multiple checkout messages are sent at the same time).
It might exceed the maximum number of items that the patron is allowed to loan. For example, the maximum number of books a patron can checkout is 15 items, and the self-check machine sends 20 checkout requests almost simultaneously, it turns out that all the 20 books are successfully checked out.

It is normal to valid the circulation rules when checkout item one by one in CheckOutByBarcodeResource.java.
Once the concurrency occurs, folio will only check the circulation rules of current checkout, instead of checking all the concurrent checkout, which may exceed the limit.

This is caused by a race condition: After refuseWhenItemLimitIsReached is called (line 125) many other operations take place before createLoan is called (line 140): https://github.com/folio-org/mod-circulation/blob/v22.1.3/src/main/java/org/folio/circulation/resources/CheckOutByBarcodeResource.java#L125-L140



 Comments   
Comment by Cheryl Malmborg [ 15/Nov/21 ]

I've asked Tiewei Liu to add specifics about the self-check software they are using. Several libraries using self-check have not seen this problem.

Comment by Marc Johnson [ 15/Nov/21 ]

Cheryl Malmborg

What is the intention behind raising this bug? Is it intended that this be prioritised and worked on?

Comment by twliu [ 15/Nov/21 ]

Marc, this is a go-live blocker for Shanghai Library. So it needs to be solved as soon as possible (by December). 

Cheryl, thank you for creating the ticket! Gang Zhou will add the specifics about the self-check software.

Comment by Cheryl Malmborg [ 15/Nov/21 ]

Marc Johnson I am not sure how to characterize this issue. There seems to be an incompatibility between the self-checkout system Shanghai is using and folio processing. I know nothing about their self-check software.

I think the first thing would be a clarification of how folio is keeping track of how many loans a patron has checked out. Are the patron's loans re-queried with each item checked out or does the module increment a session counter?

Comment by Gang Zhou [ 16/Nov/21 ]

Slefcheck machines are from Shanghai RFID, Seaever, Bibliotheca

Comment by Marc Johnson [ 16/Nov/21 ]

twliu Gang Zhou Cheryl Malmborg

Marc, this is a go-live blocker for Shanghai Library. So it needs to be solved as soon as possible (by December).

I understand this is urgent and important.

Khalilah Gambrell Stephanie Buck Hkaplanian Holly Mistlebauer Who needs to be involved in prioritising this, is it the POs / Capacity Planning / the PC?

I think the first thing would be a clarification of how folio is keeping track of how many loans a patron has checked out. Are the patron's loans re-queried with each item checked out or does the module increment a session counter?

I've done a little research into this area. It seems the item limit behaviour was implemented in CIRC-558 Closed as part of the automated patron blocks work by Vega. I suggest we involve the Vega team (Stephanie Buck, Alexander Kurash) in this conversation.

I believe the Vega team also did the work on sessions.

A quick check of the code suggests to me:

  • the item limit checks are performed each check out, based upon open loans, the item status, material type and loan type.
  • check out / check in sessions are primarily related to notices and don't affect other behaviour.

Alexander Kurash How wrong is my understanding?

cc: Holly Mistlebauer

Comment by Julian Ladisch [ 22/Nov/21 ]

The item limit check (refuseWhenItemLimitIsReached) is special because its outcome can change when other loans are created in parallel.

The use case of a machine sending multiple check-out requests for a single patron in parallel is reasonable and should be supported for performance reasons.

A race condition where parallel check-outs exceed the item limit can be completely avoided if PostgreSQL features are used:

  • Move the item limit check (refuseWhenItemLimitIsReached Java code) from mod-circulation into a mod-circulation-storage SQL code.
  • The limit check and creating the loan are merged into a single database transaction.
  • If the limit is reached an error is returned, otherwise the loan is created.
  • The database transaction starts with "SELECT id FROM loans WHERE ... FOR UPDATE" of the patron's open loans. This makes all other transactions of that patron to wait until the transaction has finished and therefore avoids ROLLBACK that might happen when they were running concurrently. The "SELECT ... FOR UPDATE" should select only the loans needed for the item limit count (ROW_COUNT).
  • mod-circulation-storage's transaction SQL code requires some fields to exist in the loan record to do the item limit check, they need to be copied from the mod-inventory-storage item record into the mod-circulation-storage loan record when creating or renewing the loan: isClaimedReturned, materialTypeId, loanTypeId (similar to the existing itemEffectiveLocationIdAtCheckOut or  overdueFinePolicyId)
  • For best performance of the item limit check some database indexes should be created so that the SELECT query can be calculated using only the indexes without fetching the complete loan records.
Comment by Stephanie Buck [ 19/Jan/22 ]

Hi all. This has been moved from a bug to a feature. Raman Auramau and Alexander Kurash have been working on architecture for a solution. There will likely be more stories associated as we move forward. 

Comment by Tim Auger [ 18/Sep/22 ]

So, in a nutshell, for any item being checked out via SIP2, confirm the patron has not exceeded total number of items charged to their account according to circulation rules. It sounds like this check is not happening on each request?

Gurleen Kaur1 Giorgi Ninua I think this can go to a Volaris developer to confirm existing behavior and, if the behavior is not correct, make corresponding changes to enforce the check for each checkout request. 

Comment by Gurleen Kaur1 [ 19/Sep/22 ]

Tim Auger
In edge-sip2 code there is no specific check added for the checkout. As far as I know when perfromCheckout will be invoked then within the mod-circulation module this check should be added or implemented. As Sip2 generally does not contain the business logic and based on the commands its corresponding backend module is invoked within which the validations, business logic, etc is performed and the response is sent back to SIP2 for the end users console.
CC: Giorgi Ninua 

Comment by Marc Johnson [ 21/Sep/22 ]

Tim Auger Gurleen Kaur1

It sounds like this check is not happening on each request?

The check is performed on every check out request.

The concern that folks have is that when a SIP2 based self-checkout machine (or other client) makes that check out requests in parallel, then the checks could overlap with other check outs.

As checks tend to be performed near the beginning of the process, the checks could falsely succeed because a check out that would lead to the item limit being reached had not completed before the check happens.

Does that make sense?

Comment by Julian Ladisch [ 21/Sep/22 ]

This is caused by a race condition: After refuseWhenItemLimitIsReached is called (line 125) many other operations take place before createLoan is called (line 140): https://github.com/folio-org/mod-circulation/blob/v22.1.3/src/main/java/org/folio/circulation/resources/CheckOutByBarcodeResource.java#L125-L140

Assuming 9 items have been loaned and the item limit is 10.

When running in sequence:

item10 refuseWhenItemLimitIsReached <- this finds 9 loaned items and continues
item10 other operations
item10 createLoan
item11 refuseWhenItemLimitIsReached <- this finds 10 loaned items and exits with item limit error
item11 other operations
item11 createLoan

When running in parallel:

item10 refuseWhenItemLimitIsReached <- this finds 9 loaned items and continues
item11 refuseWhenItemLimitIsReached <- this finds 9 loaned items and continues
item10 other operations
item11 other operations
item10 createLoan
item11 createLoan 

item11 is incorrectly loaned because the item11 limit check runs between the item10 limit check and the item10 createLoan (race condition).

Comment by Tim Auger [ 21/Sep/22 ]

Thanks, Marc Johnson Julian Ladisch for the additional info. This makes sense. And, given the speed of batch-like checkout via RFID enabled self-check systems and slow network speeds, could see this happen more frequently than anyone would like. Do either of you believe a refactor could resolve this or is it more than that?

Comment by Marc Johnson [ 21/Sep/22 ]

Tim Auger

Do either of you believe a refactor could resolve this or is it more than that?

I think that is better asked of the relevant development team. I believe the Vega team put together a proposal for resolving this.

Personally, I don't think I would consider the likely solution to this to be a refactoring. I would expect it to involve significant changes. That this was originally a single issue for mod-circulation and is now a feature, suggests other folks also think there is considerably work involved.

Roughly speaking, the system needs to be changed so that check outs for the same patron cannot occur concurrently. This can either happen at the whole process level or (as I think was proposed) that there would be some kind of locking mechanism around the item limit check so that couldn't be performed if any other check out was ongoing.

Comment by Tim Auger [ 21/Sep/22 ]

Marc Johnson Cool. Based on the description and discussions I had with Vince, I thought it might be more than just restructuring. Thanks for the additional information. 

Comment by Khalilah Gambrell [ 05/Oct/22 ]

Gang Zhou  and twliu - how has Shanghai addressed this issue? 

 

cc: Tim Auger and Cheryl Malmborg 

Comment by Tim Auger [ 24/Apr/23 ]

Irina Pokhylets we should move this forward.

Comment by Tim Auger [ 02/May/23 ]

Irina Pokhylets we can move forward with this. Just talked with Steph. Alex K. is who we can help with arch. Cheryl is the community PO

Comment by Irina Pokhylets [ 03/May/23 ]

Tim Auger 

Based on the result of investigation CIRC-1740 Closed , there is no clean way to solve the issue on the SIP-2 level and the issue needs to be solved in mod-circulation. The CIRC-1786 Closed spike has been created for analyzing the proposed technical design.

Comment by Tim Auger [ 03/May/23 ]

Irina Pokhylets thanks! Much appreciated. I changed the team for CIRC-1786 Closed to Volaris. Vega is happy to be consulted (and should be consulted) on any proposed changes. Thanks!

Comment by Irina Pokhylets [ 28/Sep/23 ]

Closing the feature as all related tickets are closed.

Generated at Fri Feb 09 00:32:37 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.