Investigate blocked transaction and streaming get

Description

When standing up a production-ready system with a fresh Postgres v10 or v11 database, or when upgrading a system from Q1 to Q2, we at TAMU cannot load some sample data, bootstrap a superuser, or (on an upgraded system) log in with an existing user. The same behavior has been seen in another environment with a fresh system.

I have stood up a Folio system in the hosted Rancher dev environment under the Tamu project and namespace, and reproduced the issue there as well, which rules out Texas A&M's infrastructure as the cause of the problem.

These blocked transaction issues have not been seen in our previous environments, and our underlying infrastructure has not changed. This seems to have been introduced with newer versions of RMB, and needs to be investigated further.

A colleague and I diagnosed the issue further and came up with a workaround in RMB for the storage modules in which we witnessed the problem. He states:

https://vertx.io/docs/vertx-pg-client/java/#_using_transactions

"e.g the infamous current transaction is aborted, commands ignored until end of transaction block"

After investigating the login process through [mod-users-bl](https://github.com/folio-org/mod-users-bl/blob/master/src/main/java/org/folio/rest/impl/BLUsersAPI.java#L704), it appeared that one of the requests made to other modules during login was causing the failure. The first request, to `/authn/login`, appears to succeed, because the process goes on to request the user from mod-users. What stood out about that user lookup is that it is a stream get with `query=username==admin`, which I don't think is appropriate for a single-resource lookup. Looking into the stream get in RMB, I noticed that it creates a transaction if the SQL connection wrapper does not already have one. I do not see why a stream get needs to be atomic, and it can potentially be long-running anyhow. If the connection is already in a transaction, so be it, but always starting a transaction for a stream may be a bad idea. If streaming results does require a transaction for some reason I am not aware of, further investigation will be needed.
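To make the quoted error concrete, here is a minimal toy model of how a Postgres session behaves once a statement fails inside an explicit transaction block. It is only a sketch: `ToyPgSession`, its `State` enum, and the `fails` flag are hypothetical names invented for illustration, not part of RMB, Vert.x, or the Postgres driver, and the ticket does not confirm that this is the exact mechanism behind the blocked logins.

```java
// Toy model only (hypothetical class, not a real driver API): mimics the
// session-level rule behind "current transaction is aborted".
final class ToyPgSession {
    enum State { IDLE, IN_TRANSACTION, ABORTED }

    private State state = State.IDLE;

    State state() { return state; }

    // BEGIN: opens a transaction block if none is open.
    void begin() {
        if (state == State.IDLE) {
            state = State.IN_TRANSACTION;
        }
    }

    // Runs one statement; `fails` stands in for any SQL error.
    String execute(String sql, boolean fails) {
        if (state == State.ABORTED) {
            // Every further command is rejected until the block is ended.
            return "ERROR: current transaction is aborted, "
                 + "commands ignored until end of transaction block";
        }
        if (fails) {
            if (state == State.IN_TRANSACTION) {
                state = State.ABORTED; // the whole block is now poisoned
            }
            return "ERROR: " + sql;
        }
        return "OK: " + sql;
    }

    // ROLLBACK: ends the block and makes the session usable again.
    void rollback() { state = State.IDLE; }
}

final class AbortedTransactionDemo {
    public static void main(String[] args) {
        ToyPgSession s = new ToyPgSession();
        s.begin();
        System.out.println(s.execute("SELECT broken", true));
        System.out.println(s.execute("SELECT 1", false)); // rejected: block is aborted
        s.rollback();
        System.out.println(s.execute("SELECT 1", false)); // accepted again
    }
}
```

In this model a failed statement outside a transaction block leaves the session usable, while the same failure inside a block rejects every later command until a rollback. That is one reason an unconditionally started transaction around a read could turn a single error into a blocked connection.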

Removing the transaction begin from the stream get resolves the issue, and it fixed every module we upgraded to the patched RMB. Unfortunately, the vertx-pg-client documentation states that a transaction is required to stream results.

https://vertx.io/docs/vertx-pg-client/java/#_cursors_and_streaming

// Streams require to run within a transaction

What is strange is that the API calls to the stream get still worked, which leads me to believe a transaction has already been started.

The following draft PRs are the changes made to get our deployment past this issue:

https://github.com/folio-org/raml-module-builder/pull/773
https://github.com/folio-org/mod-users/pull/183
https://github.com/folio-org/mod-permissions/pull/95
https://github.com/folio-org/mod-configuration/pull/93

When bootstrapping in the superuser, I see these errors for the request:

And these are the mod-user logs:

CSP Request Details

None

CSP Rejection Details

None

Potential Workaround

None

Activity

Jakub Skoczen October 29, 2020 at 1:18 PM

All storage modules should be upgraded to the RMB fixVersion for this ticket, but we will hold off on this upgrade until a related issue is fixed, so that the fixes can be bundled.

Oleksii Popov October 29, 2020 at 1:17 PM

released in two versions

Julian Ladisch October 28, 2020 at 7:34 AM

Current status:
We test

  • mod-inventory-storage-19.5.0-SNAPSHOT.489 (= Upgrade to RMB 31.1.3, Vert.x 3.9.4) and

  • mod-inventory-storage-19.5.0-SNAPSHOT.490 (= Upgrade to RMB 31.1.3, Vert.x 3.9.4, and "FOR UPDATE" fix)

If the first fixes the issue, all affected modules should upgrade to RMB 30.2.9 (Goldenrod) or RMB 31.1.3 (Honeysuckle) and Vert.x 3.9.4.
If it doesn't but the second does, we need to release RMB 30.2.10 (Goldenrod) and RMB 31.1.4 (Honeysuckle) with the "FOR UPDATE" fix, and upgrade all affected modules to RMB 30.2.10/31.1.4 and Vert.x 3.9.4.

Then we will be ready for the Goldenrod hotfix 5 release.
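The decision described in this status update can be sketched as a small function. This is only an illustration: the class and method names are hypothetical, and the final branch (neither snapshot helps) is an assumption, since the comment does not say what happens in that case.

```java
// Hypothetical helper that restates the two-snapshot decision in code.
final class HotfixPlan {
    // snapshot489Fixes: SNAPSHOT.489 (RMB 31.1.3, Vert.x 3.9.4) resolves the issue.
    // snapshot490Fixes: SNAPSHOT.490 (same, plus the "FOR UPDATE" fix) resolves it.
    static String requiredRmb(boolean snapshot489Fixes, boolean snapshot490Fixes) {
        if (snapshot489Fixes) {
            // The existing RMB releases are enough.
            return "RMB 30.2.9 (Goldenrod) or RMB 31.1.3 (Honeysuckle), Vert.x 3.9.4";
        }
        if (snapshot490Fixes) {
            // New RMB releases carrying the "FOR UPDATE" fix are needed first.
            return "RMB 30.2.10 (Goldenrod) or RMB 31.1.4 (Honeysuckle), Vert.x 3.9.4";
        }
        // Assumption: not covered by the status update.
        return "undetermined";
    }

    public static void main(String[] args) {
        System.out.println(requiredRmb(true, true));
        System.out.println(requiredRmb(false, true));
        System.out.println(requiredRmb(false, false));
    }
}
```

The first snapshot is checked first because, if it already fixes the issue, no additional RMB release is needed.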

jroot October 27, 2020 at 10:17 PM
Edited

mod-users v17.1.2 is the last piece of the puzzle that solves the transaction block issue with pg_audit extension!

Tested upgrading a clone of our previous Q1-2020 instance to Q2-2020 hotfix 4 plus mentioned releases (mod-configuration v5.4.4, mod-oai-pmh v3.1.3, mod-inventory-storage v19.3.6, mod-users v17.1.2).

Upgrade was successful, and I can log in.

Now I will test upgrading this Q2-2020 instance to Q3-2020 with the suggested mod-inventory-storage v19.5.0-SNAPSHOT.489 fix.

Julian Ladisch October 27, 2020 at 8:21 PM

Yes.
Before we release hotfix 5, we should check whether we need another RMB fix for the revoke issue, so that it can be included in hotfix 5.

Done

Details

Development Team

Core: Platform

Release

Q2 2020 Hot Fix #5

Affected Institution

TAMU

Created September 11, 2020 at 2:42 PM
Updated November 24, 2020 at 5:04 PM
Resolved October 29, 2020 at 1:17 PM