[FOLIO-2808] Reference environment builds broken 2020-09-29 -- upgrading to okapi-4 Created: 29/Sep/20  Updated: 05/Oct/20  Resolved: 02/Oct/20

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Bug Priority: P1
Reporter: David Crossley Assignee: John Malconian
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: Text File mod-authtoken.txt     Text File mod-login.txt    
Issue links:
Blocks
blocks ERM-940 Declare optional okapi interface depe... Closed
is blocked by OKAPI-907 CORS Rejected - Invalid origin Closed
is blocked by OKAPI-906 Preflight requests to Okapi 4 via Str... Closed
Relates
relates to OKAPI-835 secure Okapi internal APIs Closed
relates to OKAPI-904 K8s compatible liveness and readiness... Closed
relates to OKAPI-905 Remove permissions for proxy health s... Closed
relates to STRIPES-703 Software versions shows dependencies ... Closed
Sprint: DevOps: Sprint 97
Development Team: FOLIO DevOps

 Description   

Reference environment Jenkins builds are broken today.

This was the first day after upgrading to okapi-4 ( FOLIO-2787 Closed ).

Done some basic inspection via ssh to folio-snapshot server and look to various module docker logs:

mod-authtoken log starts like this:

...
INFO: Succeeded in deploying verticle
Sep 29, 2020 1:50:28 AM mod-auth-authtoken-module
SEVERE: [](user permissions) nor [](module permissions) do not contain okapi.proxy.health.get
Sep 29, 2020 1:50:28 AM mod-auth-authtoken-module
SEVERE: Access requires permission: okapi.proxy.health.get
...

with that repeated many times.

Glanced at a few other modules, e.g. mod-login
which start like this, and have not much else:

...
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
ERROR StatusLogger Unrecognized format specifier [d]
ERROR StatusLogger Unrecognized conversion specifier [d] starting at position 16 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [thread]
ERROR StatusLogger Unrecognized conversion specifier [thread] starting at position 25 in conversion pattern.
...

See attached logs for those.



 Comments   
Comment by Adam Dickmeiss [ 29/Sep/20 ]

All end points require permissions .. that's the major change for Okapi 4.0.0 OKAPI-835 Closed

Comment by Adam Dickmeiss [ 29/Sep/20 ]

The warnings "getCallerClass" is harmless and is fixed in OKAPI-885 Closed but should be fixed for many/all other modules as well.

Comment by John Malconian [ 30/Sep/20 ]

Issues detected so far:

2020-09-30T19:05:47,257 INFO  ProxyContext         635529/proxy REQ 10.36.1.27:39908 supertenant GET /_/proxy/health  mod-authtoken-2.6.0-SNAPSHOT.77 okapi-4.1.3
2020-09-30T19:05:47,259 INFO  ProxyContext         635529/proxy RES 403 1499us mod-authtoken-2.6.0-SNAPSHOT.77 http://10.36.1.51:9174/_/proxy/health
2020-09-30T19:05:47,259 INFO  DockerModuleHandle   mod-authtoken-2.6.0-SNAPSHOT.77 Sep 30, 2020 7:05:47 PM mod-auth-authtoken-module
2020-09-30T19:05:47,260 INFO  DockerModuleHandle   mod-authtoken-2.6.0-SNAPSHOT.77 SEVERE: [](user permissions) nor [](module permissions) do not contain okapi.proxy.health.get
2020-09-30T19:05:47,260 INFO  DockerModuleHandle   mod-authtoken-2.6.0-SNAPSHOT.77 Sep 30, 2020 7:05:47 PM mod-auth-authtoken-module
2020-09-30T19:05:47,260 INFO  DockerModuleHandle   mod-authtoken-2.6.0-SNAPSHOT.77 SEVERE: Access requires permission: okapi.proxy.health.get

The two issues above occur before any authentication to the system is made.

When logging in as the default tenant admin, a 403 is generated when POSTing to '/bl-users/login?expandPermissions=true&fullPermissions=true' and I'm unable to login.

I don't see any additional information in module logs or okapi logs.

Comment by John Malconian [ 30/Sep/20 ]

Test build here: https://folio-snapshot-test.dev.folio.org/

Comment by John Malconian [ 01/Oct/20 ]

The errors related to the /_/proxy/health endpoint are, as Adam pointed out, generated by health checks from the AWS load balancer. He's released Okapi 4.1.4 that exposes this endpoint unauthenticated. Either way, I think this is still a red herring.

Requests to '/saml/check' and '/bl-users/login?expandPermissions=true' both result in a 403. The issue appears to be CORS related. Consider the following requests to folio-snapshot (Okapi 3) and folio-snapshot-test (Okapi 4):

folio-snapshot:

curl -w '\n' -D - -X OPTIONS \
 -H 'Host: folio-snapshot-okapi.dev.folio.org' \
 -H 'Accept: */*' \
 -H 'Accept-Language: en-US,en;q=0.5' \
 -H 'Accept-Encoding: gzip, deflate, br' \
 -H 'Access-Control-Request-Method: POST' \
 -H 'Access-Control-Request-Headers: content-type,x-okapi-tenant' \
 -H 'Referer: https://folio-snapshot.dev.folio.org/' \
 -H 'Origin: https://folio-snapshot.dev.folio.org' \
 -H 'DNT: 1' \
 -H 'Connection: keep-alive' \
 -H 'Pragma: no-cache' \
 -H 'Cache-Control: no-cache' \
 -H 'TE: Trailers' \
 "https://folio-snapshot-okapi.dev.folio.org/bl-users/login?expandPermissions=true"

HTTP/2 200 
date: Thu, 01 Oct 2020 16:12:02 GMT
content-length: 0
access-control-allow-origin: *
access-control-allow-methods: PUT,PATCH,DELETE,GET,POST
access-control-allow-headers: content-type,X-Okapi-Tenant,X-Okapi-Token,Authorization,X-Okapi-Request-Id,X-Okapi-Module-Id

folio-snapshot-test:

curl -w '\n' -D - -X OPTIONS \
 -H 'Host: folio-snapshot-test-okapi.dev.folio.org' \
 -H 'Accept: */*' \
 -H 'Accept-Language: en-US,en;q=0.5' \
 -H 'Accept-Encoding: gzip, deflate, br' \
 -H 'Access-Control-Request-Method: POST' \
 -H 'Access-Control-Request-Headers: content-type,x-okapi-tenant' \
 -H 'Referer: https://folio-snapshot-test.dev.folio.org/' \
 -H 'Origin: https://folio-snapshot-test.dev.folio.org' \
 -H 'DNT: 1' \
 -H 'Connection: keep-alive' \
 -H 'Pragma: no-cache' \
 -H 'Cache-Control: no-cache' \
 -H 'TE: Trailers' \
 "https://folio-snapshot-test-okapi.dev.folio.org/bl-users/login?expandPermissions=true"

HTTP/2 403 
date: Thu, 01 Oct 2020 16:12:47 GMT
content-length: 30

CORS Rejected - Invalid origin

Comment by John Malconian [ 01/Oct/20 ]

There is nothing in any log file that indicates what is going on here.

Additionally, Adam Dickmeiss, folio-snapshot-test is running Okapi 4.1.4, however, the following error occurs on the /_/proxy/health endpoint:

curl -w '\n' -D - https://folio-snapshot-test-okapi.dev.folio.org/_/proxy/health
HTTP/2 400 
date: Thu, 01 Oct 2020 16:37:26 GMT
content-type: text/plain
vary: origin
x-okapi-trace: GET mod-authtoken-2.7.0-SNAPSHOT.78 http://10.36.1.33:9146/_/proxy/health : 400 3375us

Invalid URL path requested /_/proxy/health
Comment by John Malconian [ 01/Oct/20 ]

The above may be indicative of an issue only after restarting Okapi. Verifying with a new build....

Comment by John Malconian [ 01/Oct/20 ]

Indeed, there are no issues with the health endpoint. The issue seems to be restarting Okapi (where Okapi deploys the module) made a mess of things. I don't know if that's a whole other issue, but I'm not going to worry about it right now. The CORS issue still stands, however. I've asked Jason Root if he sees anything similar. Waiting to get a response back from him although I know his setup is a bit different from ours. I don't think AWS ALB does anything special with CORS except let the backend application handle it, but I'm not ruling anything out.

Comment by Adam Dickmeiss [ 01/Oct/20 ]

FWIW, the local Ansible box (testing) with Okapi 4.1.4 seems to work fine. I guess CORS is not really in use there, which explain what that seems to work. Will continue to see why CORS would break. While we did not change something ourselves, we did upgrade to a newer version of Vert.x for Okapi 4.0.0.

Comment by Adam Dickmeiss [ 02/Oct/20 ]

Hopefully with release of Okapi 4.1.5 the CORS issue will be fixed. CORS is handled by Vert.x before "normal" proxy operation of Okapi is even started. That's why the log is silent about this. Created OKAPI-908 Open for that work.

Comment by Jakub Skoczen [ 02/Oct/20 ]

John Malconian will you a get a chance to try Okapi 4.1.5 today and see if it fixes this problem?

Comment by John Malconian [ 02/Oct/20 ]

With the resolution of OKAPI-907 Closed and the release of okapi 4.1.5, I have unpinned Okapi. The next set of reference builds will include Okapi 4.1.5. A test build of folio-snapshot is currently available at https://folio-snapshot-test.dev.folio.org.

Comment by David Crossley [ 03/Oct/20 ]

With today's build we can now login as diku_admin, but the "Settings" page for "UI modules" is total red. Also says "Version Unknown" for Okapi.

Also:

ERROR: in module @folio/core, operation GET on resource 'enabledModules' failed, saying: Access requires permission: okapi.proxy.tenants.modules.list

Users search and view does seems to be working properly.

So it seems like progress, and maybe just a new permission needed by the Stripes facility that presents the Settings.

Looking via the backend, docker logs for mod-authtoken is full of "SEVERE: Unsupported JWT format" (not sure if relevant).

Comment by Zak Burke [ 05/Oct/20 ]

FYI Adam Dickmeiss, David Crossley's comment that the UI functions notwithstanding, in fact the UI is severely crippled by lack of access to /_/proxy/tenants/$tenant/modules. Marc Johnson suggested this could be the case last week and Anna Melnyk confirmed it today.

In other words, although the rest of the UI isn't showing error messages, it is wrongly assuming that all data behind an optional-okapi-interface is unavailable.

Generated at Thu Feb 08 23:23:24 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.