Note: This Page is a Work In Progress


...

  • Currently the requestId is included in these tokens, so pursuing this optimization would mean giving that up.  I think it's only there for convenience; it's unclear whether anyone actually uses this information.
  • Should the cache be shared / synchronized between nodes in the OKAPI cluster?
  • Is it necessary to provide a mechanism for manually purging and/or pruning the cache?
  • Must these tokens be tenant-specific, or can the same token be used by multiple tenants?

...

  • Performance gains would not be seen immediately, as the cache would need to warm up before we could benefit from saving on calls to mod-authtoken.
  • Once the cache is warmed, the time savings could be substantial: in the example below, the theoretical total request time is about half of the measured baseline.
  • Example: 
    • Current (from baseline measurements):

      Code Block
      POST circulation/check-out-by-barcode?testId=co_test02_7
      
      # of auth calls:                      33
      Total request time:                   384.93 ms
      Total time on calls to mod-authtoken: 198.59 ms (51.59%)
      Total auth module time:               159.00 ms (80.06%)
      Total auth network time:              39.59 ms (19.94%)
      
      # Stats for initial auth call (used to calculate theoretical below)
      Auth Request Time: 5.92 ms
      Auth Module Time:  5.00 ms (84.46%)
      Auth Network Time: 0.92 ms (15.54%)


    • Theoretical

      Code Block
      POST circulation/check-out-by-barcode?testId=co_test02_7
      
      # of auth calls:                      1
      Total request time:                   192.26 ms
      Total time on calls to mod-authtoken: 5.92 ms (3.08%)
      Total auth module time:               5.00 ms (84.46%)
      Total auth network time:              0.92 ms (15.54%)

      The 1 auth call here is for the original user's call to check-out-by-barcode.  The rest of the auth calls are module-to-module.  Those tokens could be obtained from the cache, removing the need to call mod-authtoken at all.
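The theoretical figures above can be reproduced from the baseline measurements with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes that every auth call except the first is served from the cache at zero cost, which is the same simplification the theoretical numbers rely on.

```python
# Figures copied from the baseline measurements above (milliseconds).
total_request_ms = 384.93    # total request time
total_auth_ms = 198.59       # total time on calls to mod-authtoken (33 calls)
first_auth_call_ms = 5.92    # the initial auth call, which cannot be skipped

# Assumption: the other 32 auth calls are answered from the cache at no cost.
theoretical_total_ms = total_request_ms - total_auth_ms + first_auth_call_ms
auth_share = first_auth_call_ms / theoretical_total_ms
reduction = 1 - theoretical_total_ms / total_request_ms

print(f"Theoretical total request time: {theoretical_total_ms:.2f} ms")
print(f"Share spent in mod-authtoken:   {auth_share:.2%}")
print(f"Reduction vs. baseline:         {reduction:.2%}")
```

In practice a cache lookup is not free, so the real saving would be somewhat smaller than this upper bound.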

PoC Implementation Notes

During implementation of this PoC, I realized that mod-authtoken plays a bigger role than initially expected.  More specifically, it sets the X-Okapi-User-Id and X-Okapi-Permissions headers, which are then relayed on the proxy calls made by OKAPI to the actual module.  This means that these values also need to be cached along with the token, and when using a cached token, they need to be set explicitly, since the call to mod-authtoken would be skipped.  It also means that cache entries are now user-specific.  In order to keep the system responsive to changes in user status (active/not-active, permissions granted/revoked, etc.), the cache entries must have a relatively short TTL.

After a few iterations we settled on the following:

Cache keys are compound, consisting of 4 parts: <Method>|<PathPattern (or 'Path' if not defined)>|<X-Okapi-User-Id>|<X-Okapi-Token>

Cache values include the X-Okapi-Token that mod-authtoken would have returned, as well as the values of X-Okapi-User-Id and X-Okapi-Permissions that mod-authtoken would have set.
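The key and value layout described above could be sketched roughly as follows. This is not the PoC code (Okapi itself is written in Java); the names `make_cache_key`, `CacheEntry`, and `TokenCache` are hypothetical, and the 60-second default TTL is an arbitrary placeholder for the "relatively short TTL" mentioned earlier.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


def make_cache_key(method: str, path_pattern: Optional[str], path: str,
                   user_id: str, token: str) -> str:
    """Build '<Method>|<PathPattern or Path>|<X-Okapi-User-Id>|<X-Okapi-Token>'."""
    return "|".join([method, path_pattern or path, user_id, token])


@dataclass(frozen=True)
class CacheEntry:
    token: str         # X-Okapi-Token that mod-authtoken would have returned
    user_id: str       # X-Okapi-User-Id that mod-authtoken would have set
    permissions: str   # X-Okapi-Permissions that mod-authtoken would have set
    expires_at: float  # absolute expiry time on the cache's clock, in seconds


class TokenCache:
    """In-memory cache whose entries expire after a fixed TTL (illustrative only)."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds   # hypothetical default, not taken from the PoC
        self._clock = clock       # injectable so expiry can be tested
        self._entries: Dict[str, CacheEntry] = {}

    def put(self, key: str, token: str, user_id: str, permissions: str) -> None:
        expires_at = self._clock() + self._ttl
        self._entries[key] = CacheEntry(token, user_id, permissions, expires_at)

    def get(self, key: str) -> Optional[CacheEntry]:
        entry = self._entries.get(key)
        if entry is None:
            return None           # miss: the caller must call mod-authtoken
        if self._clock() >= entry.expires_at:
            del self._entries[key]
            return None           # expired: treat as a miss
        return entry              # hit: the caller sets the three headers from the entry
```

On a hit, the caller would skip mod-authtoken and copy `token`, `user_id`, and `permissions` from the entry onto the proxied request. On a miss, it would call mod-authtoken as before and store the result with `put`.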

Actual Benefits

A PoC was evaluated by the Performance Task Force (PTF).  That effort was tracked in PERF-113.  In a test which simulated 8 concurrent users, the benefits seen during check-in/check-out were as follows:

Existing Okapi (v3.1.2) and mod-authtoken (v2.5.1)
Average Check in time: 1.024 sec
Average Check out time: 1.812 sec

PoC Okapi and mod-authtoken
Average Check in time: 0.729 sec
Average Check out time: 1.364 sec

% Differences:
Check in: 41% better in PoC version
Check out: 33% better in PoC version
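For reference, the percentages can be approximately recomputed from the published averages. The sketch below shows two common conventions, since "% better" is ambiguous. Taking the difference relative to the PoC time gives roughly 40% and 33%, close to the figures reported above; the small gap for check-in presumably comes from the averages being rounded, which is an assumption. Taking it relative to the baseline time gives the reduction in average response time.

```python
# Average response times in seconds, copied from the results above.
baseline = {"check-in": 1.024, "check-out": 1.812}  # Okapi v3.1.2, mod-authtoken v2.5.1
poc = {"check-in": 0.729, "check-out": 1.364}       # PoC Okapi and mod-authtoken

results = {}
for name in baseline:
    saved = baseline[name] - poc[name]
    relative_to_poc = saved / poc[name]             # how much slower the baseline is
    relative_to_baseline = saved / baseline[name]   # how much time the PoC removes
    results[name] = (relative_to_poc, relative_to_baseline)
    print(f"{name}: {relative_to_poc:.1%} relative to PoC, "
          f"{relative_to_baseline:.1%} reduction from baseline")
```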

With caching, mod-authtoken CPU utilization improves by a factor of 4.  See the attached screenshot and Excel file for more details.

Next Steps

  •  Port the token cache PoC to Okapi 4 - https://github.com/folio-org/okapi/tree/tokenCache
  •  Add metrics for tracking cache events (hits/misses/etc.) - https://github.com/folio-org/okapi/tree/tokenCache
  •  Re-test with this version of the PoC
  •  Create stories (OKAPI/MODAT) for the implementation of this optimization - MODAT-86, OKAPI-890
  •  Sort out other implementation details - See Considerations above.