slow connections cause /authn/refresh request to be replayed, causing session termination

Description

Technical Summary: As part of the refresh token rotation (RTR) lifecycle, we use a flag in local storage to synchronize refresh requests across multiple tabs. To avoid the problem of a cancelled request leaving cruft in storage, we inspect the timestamp on the flag and consider a request "stale" if it is too old. That was the problem here: our "too old" timeout of two seconds was too short for a busy server, a slow connection, or a client far from its host (say, in New Zealand). The rotation request would still be in flight when stripes considered it "stale", allowing a second request to go through. Because the first request was only slow, not dead, the backend treated the second one as a token-replay attack and immediately terminated all active sessions for that user account.

Thus, waiting longer (20 seconds instead of 2 seconds) is a quick fix.
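
The staleness check described above can be sketched roughly as follows. This is a minimal illustration, not the stripes-core implementation: the key name `rtr-is-rotating`, the `FlagStore` interface, and every function name are hypothetical, and an in-memory store stands in for local storage so the sketch runs outside a browser. Only the 2-second and 20-second thresholds come from this ticket.

```typescript
// Hypothetical sketch of a timestamped rotation flag shared across tabs.
// None of these names are taken from stripes-core.

/** Minimal key-value interface; window.localStorage satisfies it. */
interface FlagStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const ROTATION_FLAG_KEY = "rtr-is-rotating"; // assumed key name

/** True when a rotation flag exists and is younger than staleAfterMs. */
function isRotationInFlight(store: FlagStore, now: number, staleAfterMs: number): boolean {
  const raw = store.getItem(ROTATION_FLAG_KEY);
  if (raw === null) return false;
  const startedAt = Number(raw);
  if (!Number.isFinite(startedAt)) return false; // malformed flag: treat as cruft
  return now - startedAt < staleAfterMs;
}

/** Returns true if this tab may send the refresh request, recording the flag if so. */
function tryBeginRotation(store: FlagStore, now: number, staleAfterMs: number): boolean {
  if (isRotationInFlight(store, now, staleAfterMs)) return false;
  store.setItem(ROTATION_FLAG_KEY, String(now));
  return true;
}

/** Clears the flag once the refresh request has settled. */
function endRotation(store: FlagStore): void {
  store.removeItem(ROTATION_FLAG_KEY);
}

/** In-memory stand-in for local storage. */
class MemoryStore implements FlagStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null { return this.items.get(key) ?? null; }
  setItem(key: string, value: string): void { this.items.set(key, value); }
  removeItem(key: string): void { this.items.delete(key); }
}

/**
 * Tab A starts rotating at t=0 and is still waiting on the server when
 * tab B checks the flag at t=checkAtMs. Returns true if tab B would send
 * a second (replayed) refresh request.
 */
function secondTabWouldReplay(staleAfterMs: number, checkAtMs: number): boolean {
  const store = new MemoryStore();
  tryBeginRotation(store, 0, staleAfterMs);
  return tryBeginRotation(store, checkAtMs, staleAfterMs);
}
```

Under these assumptions, a refresh request that takes 3 seconds is replayed with a 2-second threshold (`secondTabWouldReplay(2000, 3000)` is `true`) but not with a 20-second one (`secondTabWouldReplay(20000, 3000)` is `false`). A longer threshold only narrows the window; a request slower than the threshold would still be considered stale.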

Original Overview: In the Bulk Edit application, when the user downloads a file with a user barcode and reloads the page with 'Preview of record matched,' the page does not fully render, or sometimes the application displays a 'Something went wrong' page.
This issue may occur after the API call POST 'authn/refresh'.

Note: This typically happens when the user logs into the application for the first time or opens the application in the browser's Incognito mode.
This issue particularly affects automated tests.

IMPORTANT NOTE: After conducting further investigations, it was found that the issue occurs when the user experiences slow or unstable internet connectivity. The issue is reproducible when emulating a slow internet connection using browser-based network throttling set to 3G.

Precondition:
Authorized user with permissions:

"Bulk Edit: In app - Edit inventory records"
"Bulk Edit: In app - View inventory records"
"Bulk edit: In app - Edit user records"
"Users: Can edit user profile"

User has a .csv file containing a user barcode that exists on the environment.


Steps to reproduce:

  1. Navigate to the Bulk edit app => Select the "Users" radio button in the "Record types" accordion => Select the "User barcodes" option from the "Record identifier" dropdown

  2. Upload a .csv file with user barcodes by dragging it onto the "Drag & drop" area - The "Preview of record matched" accordion shows the table populated with matched Users records

  3. Reload the page


Expected result: The "Preview of record matched" accordion shows the table populated with matched Users records.
Actual result: The page does not fully render, or sometimes the application displays a 'Something went wrong' page.

Please see the attached screencast and screenshot:

Log in - FOLIO - Google Chrome 2024-10-02 14-24-38.mp4
image-20241002-100206.png

CSP Request Details

Describe issue impact on business: UI sessions for a given user are unexpectedly terminated. For shared accounts, this can have a large impact.

What institutions are affected? Trinity College Cambridge, National Library of Australia, Javeriana, University of Canterbury

What is the workaround if exists? None

What areas will be impacted by fix? The UI's background process of token rotation. This is an NFR; there are no user-visible changes.

Brief explanation of technical implementation and the level of effort (in workdays) and technical risk (low/medium/high): There is no code change; we are merely setting a longer timeout (20 seconds instead of 2 seconds). Low-effort, low-risk since there are no code changes. Less than 1 workday.

Brief explanation of testing required and level of effort (in workdays). Provide test plan agreed with by QA Manager and PO: This was an intermittent problem that we do not have a specific test for. We could add manual testing with deliberately throttled network requests.

What is the roll back plan in case the fix does not work? Publish another patch release restoring the original value, a similarly low-effort low-risk change.

CSP Rejection Details

None

Potential Workaround

None



Activity


Romy Forrer January 28, 2025 at 2:23 AM

Thanks, will DM you in Slack. I have 2 files: one from an SSO session and one from a username/password session for the same account.

Zak Burke January 28, 2025 at 2:08 AM

I’m so sorry you continue to be plagued by RTR issues. Please attach a .har to this ticket or DM it to me in Slack if you prefer.

Romy Forrer January 28, 2025 at 1:58 AM

Reproduced the issue on the first try. I have a .har ready. We get a 422 on https://okapi-uofcanterbury.folio.ebsco.com/authn/refresh: token.refresh.unprocessable

Romy Forrer January 28, 2025 at 12:02 AM

We now have CSP 6 (on stripes-core version 10.1.4) and staff are still getting signed out of their generic desk logins. I don’t think it’s as frequent as previously, but the problem doesn’t seem to be entirely resolved by this fix. I’ll try to reproduce and get a new .har file.

Zak Burke October 14, 2024 at 9:13 PM

CC:

Done

Details

Assignee

Reporter

Priority

Story Points

Sprint

Development Team: Stripes Force

Release: Quesnelia (R1 2024) Service Patch #6

RCA Group: Data related (ex. Can be detected with large dataset only)

CSP Approved: Yes

Affected releases: Quesnelia (R1 2024)

Created October 2, 2024 at 9:55 AM
Updated February 25, 2025 at 2:24 AM
Resolved October 21, 2024 at 3:16 PM