Skip to end of banner
Go to start of banner

MODAT-80/83 - Auth Performance Improvements

Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 7 Next »

This Page is a Work In Progress


Overview

It seems that the current auth flow consumes more time than we had originally hoped/expected.  This investigation is to better understand where the bottlenecks are and to evaluate potential optimizations.

Baseline Measurements

Before evaluating possible performance improvements, I wanted to get a better understanding of where the pain points are.  These measurements will also serve as a baseline for comparison purposes as various enhancements are trialed.

Test Environment

  • AWS ECS
  • Private - nobody else using this except me
  • NGINX reverse proxy running in each container
  • Single instance of each module
  • Goldenrod platform-complete
  • Standard sample/reference data loaded

Methodology

  • Make a check-out-by-barcode request - add a query argument to help easily identify the request in logs
  • Make a check-in-by-barcode request (same loan that was just created) - also add a query arg here to help identify the request in logs
  • Analyze logs
    • Find the checkout/in request using the testId query argument, from here we will get the requestId and find all related/internal requests
      • From this log entry we obtain the total request time from okapi's perspective, including all required internal calls to auth, other modules, etc. 
      • In order to isolate the amount of time spent on network vs time spent in mod-authtoken, we dig into the mod-authtoken reverse proxy log.
      • Correlate each auth request with a log entry in the mod-authtoken nginx transaction log. 
        • requestId isn't explicitly logged, but the okapi token is, which contains the requestId
        • The transaction time logged here constitutes the time spent in the module
        • Derive network time for the auth call from the total request time and the auth module time.
    • Aggregate numbers and report a summary for each check-in/out call.

Results

test01

  • 10 interleaved calls to check-out-by-barcode and check-in-by-barcode 

check-out-by-barcode:

POST circulation/check-out-by-barcode?testId=co_test02_1

# of auth calls:                      47
Total request time:                   1055.05 ms
Total time on calls to mod-authtoken: 975.63 ms (92.47%)
Total auth module time:               843.00 ms (86.41%)
Total autn network time:              132.63 ms (13.59%)

POST circulation/check-out-by-barcode?testId=co_test02_2

# of auth calls:                      33
Total request time:                   504.08 ms
Total time on calls to mod-authtoken: 241.96 ms (48.00%)
Total auth module time:               191.98 ms (79.34%)
Total autn network time:              49.98 ms (20.66%)

POST circulation/check-out-by-barcode?testId=co_test02_3

# of auth calls:                      33
Total request time:                   471.79 ms
Total time on calls to mod-authtoken: 264.62 ms (56.09%)
Total auth module time:               227.00 ms (85.78%)
Total autn network time:              37.62 ms (14.22%)

POST circulation/check-out-by-barcode?testId=co_test02_4

# of auth calls:                      34
Total request time:                   482.83 ms
Total time on calls to mod-authtoken: 231.83 ms (48.01%)
Total auth module time:               198.00 ms (85.41%)
Total autn network time:              33.83 ms (14.59%)

POST circulation/check-out-by-barcode?testId=co_test02_5

# of auth calls:                      34
Total request time:                   415.46 ms
Total time on calls to mod-authtoken: 211.03 ms (50.79%)
Total auth module time:               163.00 ms (77.24%)
Total autn network time:              48.03 ms (22.76%)

POST circulation/check-out-by-barcode?testId=co_test02_6

# of auth calls:                      33
Total request time:                   445.91 ms
Total time on calls to mod-authtoken: 207.60 ms (46.56%)
Total auth module time:               171.00 ms (82.37%)
Total autn network time:              36.60 ms (17.63%)

POST circulation/check-out-by-barcode?testId=co_test02_7

# of auth calls:                      33
Total request time:                   384.93 ms
Total time on calls to mod-authtoken: 198.59 ms (51.59%)
Total auth module time:               159.00 ms (80.06%)
Total autn network time:              39.59 ms (19.94%)

POST circulation/check-out-by-barcode?testId=co_test02_8

# of auth calls:                      34
Total request time:                   421.00 ms
Total time on calls to mod-authtoken: 219.96 ms (52.25%)
Total auth module time:               182.84 ms (83.12%)
Total autn network time:              37.12 ms (16.88%)

POST circulation/check-out-by-barcode?testId=co_test02_9

# of auth calls:                      33
Total request time:                   391.64 ms
Total time on calls to mod-authtoken: 199.07 ms (50.83%)
Total auth module time:               160.00 ms (80.37%)
Total autn network time:              39.07 ms (19.63%)

POST circulation/check-out-by-barcode?testId=co_test02_10

# of auth calls:                      33
Total request time:                   419.94 ms
Total time on calls to mod-authtoken: 203.11 ms (48.37%)
Total auth module time:               164.00 ms (80.74%)
Total autn network time:              39.11 ms (19.26%)

check-in-by-barcode

POST circulation/check-in-by-barcode?testId=ci_test02_1

# of auth calls:                      22
Total request time:                   423.86 ms
Total time on calls to mod-authtoken: 198.28 ms (46.78%)
Total auth module time:               172.00 ms (86.75%)
Total autn network time:              26.28 ms (13.25%)

POST circulation/check-in-by-barcode?testId=ci_test02_2

# of auth calls:                      21
Total request time:                   310.25 ms
Total time on calls to mod-authtoken: 137.93 ms (44.46%)
Total auth module time:               114.00 ms (82.65%)
Total autn network time:              23.93 ms (17.35%)

POST circulation/check-in-by-barcode?testId=ci_test02_3

# of auth calls:                      21
Total request time:                   330.54 ms
Total time on calls to mod-authtoken: 138.82 ms (42.00%)
Total auth module time:               115.00 ms (82.84%)
Total autn network time:              23.82 ms (17.16%)

POST circulation/check-in-by-barcode?testId=ci_test02_4

# of auth calls:                      21
Total request time:                   301.42 ms
Total time on calls to mod-authtoken: 133.59 ms (44.32%)
Total auth module time:               111.00 ms (83.09%)
Total autn network time:              22.59 ms (16.91%)

POST circulation/check-in-by-barcode?testId=ci_test02_5

# of auth calls:                      21
Total request time:                   309.43 ms
Total time on calls to mod-authtoken: 126.37 ms (40.84%)
Total auth module time:               103.00 ms (81.51%)
Total autn network time:              23.37 ms (18.49%)

POST circulation/check-in-by-barcode?testId=ci_test02_6

# of auth calls:                      21
Total request time:                   317.40 ms
Total time on calls to mod-authtoken: 127.88 ms (40.29%)
Total auth module time:               100.00 ms (78.20%)
Total autn network time:              27.88 ms (21.80%)

POST circulation/check-in-by-barcode?testId=ci_test02_7

# of auth calls:                      21
Total request time:                   365.12 ms
Total time on calls to mod-authtoken: 136.78 ms (37.46%)
Total auth module time:               110.00 ms (80.42%)
Total autn network time:              26.78 ms (19.58%)

POST circulation/check-in-by-barcode?testId=ci_test02_8

# of auth calls:                      21
Total request time:                   305.61 ms
Total time on calls to mod-authtoken: 130.92 ms (42.84%)
Total auth module time:               104.00 ms (79.44%)
Total autn network time:              26.92 ms (20.56%)

POST circulation/check-in-by-barcode?testId=ci_test02_9

# of auth calls:                      21
Total request time:                   323.88 ms
Total time on calls to mod-authtoken: 131.24 ms (40.52%)
Total auth module time:               113.00 ms (86.10%)
Total autn network time:              18.24 ms (13.90%)

POST circulation/check-in-by-barcode?testId=ci_test02_10

# of auth calls:                      21
Total request time:                   291.32 ms
Total time on calls to mod-authtoken: 117.92 ms (40.48%)
Total auth module time:               96.00 ms (81.41%)
Total autn network time:              21.92 ms (18.59%)

Observations

  • You can see that some caching is being done in the check-out-by-barcode flow.  This is evident by the number of auth calls.  The first has 47, then subsequent are much lower 33/34
    • I'm not yet sure why it's not always 33 or 34.  
    • A similar pattern can be seen in the check-in-by-barcode flow, though much less so, presumably because the requests are interleaved.  The checkout call happens first, values are cached, so when the check-in call is made, the cached values are used.
  • Network traffic consistently takes up a fair amount of the time spent on auth (~15-20% on average)
  • A large portion of the request time is spent on auth (~40-50% on average)
  • TBD

Experiments

Description of various experiments and PoCs conducted.

Caching module-to-module tokens

In theory these don't really change very frequently, so caching them makes sense. 

Considerations

  • Currently the requestId is included in these tokens, so if we want to pursue this optimization, it will be at the cost of that.  I think it's only there for convenience - it's unclear if anyone is actually using this information or not.
  • The cache should be shared / synchronized between nodes in the OKAPI cluster
  • Is it necessary to provide a mechanism for manually purging and/or pruning the cache?
  • Must these tokens be tenant-specific, or can the same token be used by multiple tenants?

Challenges

  • Figuring out where to put this in OKAPI is a little tricky, but clearly doing it in OKAPI would give us the maximum benefit as it would mean skipping the call to mod-authtoken altogether, saving network time as well.
    • For my PoC I settled on checking for a cached token when determining the set of API calls needed, and populating the cache when handling the auth response.  ##TODO:  provide reference to PoC branch

Theoretical Benefits

  • Performance gains would not be seen immediately, as the cache would need to warm up before we could benefit from saving on calls to mod-authtoken
  • Once the cache is warmed, the time savings here would potentially be enormous.  
  • Example: 
    • Current (from baseline measurements):

      POST circulation/check-out-by-barcode?testId=co_test02_7
      
      # of auth calls:                      33
      Total request time:                   384.93 ms
      Total time on calls to mod-authtoken: 198.59 ms (51.59%)
      Total auth module time:               159.00 ms (80.06%)
      Total autn network time:              39.59 ms (19.94%)
      
      # Stats for initial auth call (used to calculate theoretical below)
      Auth Request Time: 5.92 ms
      Auth Module Time:  5.00 ms (84.46%)
      Auth Network Time: 0.92 ms (15.54%)
    • Theoretical

      POST circulation/check-out-by-barcode?testId=co_test02_7
      
      # of auth calls:                      1
      Total request time:                   192.26 ms
      Total time on calls to mod-authtoken: 5.92 ms (3.08%)
      Total auth module time:               5.00 ms (84.46%)
      Total autn network time:              0.92 ms (15.54%)

      The 1 auth call here is the original users call for check-out-by-barcode.  The rest of the auth calls are module-to-module.  Those tokens could be obtained from cache, preventing the need to call mod-authtoken at all. 

Actual Benefits

A PoC for this is still a work in progress.  No actual results have been obtained yet.


  • No labels