Poppy 2023 R2 - Implement refresh token rotation (RTR) in all affected modules
(FOLIO-3627)
|
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None | Parent: | Poppy 2023 R2 - Implement refresh token rotation (RTR) in all affected modules |
| Type: | New Feature | Priority: | P3 |
| Reporter: | Matthew Jones | Assignee: | Steve Ellis |
| Resolution: | Done | Votes: | 0 |
| Labels: | refresh-tokens | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | CP: R3 2022 roadmap | ||||||||||||||||||||||||||||||||||||||||||||||||
| Story Points: | 2 | ||||||||||||||||||||||||||||||||||||||||||||||||
| Development Team: | Core: Platform | ||||||||||||||||||||||||||||||||||||||||||||||||
| Epic Link: | Poppy 2023 R2 - Implement refresh token rotation (RTR) in all affected modules | ||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
While long lived access tokens may be convenient for early development, we really need to implement refresh tokens in FOLIO so our access tokens can be short lived. We should get to this sooner rather than later. As I didn't see an existing JIRA filed for refresh tokens, this has been created to ensure we keep it visible on our backlog. We also have some decisions to make regarding refresh token implementation, such as validation, expiration, revocation, and rotation. There are likely suitable third-party libraries worth considering as well, so we're not re-inventing the wheel here. https://tools.ietf.org/html/rfc6749#section-1.5 https://tools.ietf.org/html/draft-ietf-oauth-v2-threatmodel-08#section-4.1.2 |
| Comments |
| Comment by Kurt Nordstrom [ 08/Aug/18 ] |
|
I think one big question to answer here would be how exactly does the refresh token get used? The examples I've seen seem to indicate that the client makes use of the refresh token in order to get a new access token when the current one expires, but how much burden does that put on the front end as it currently exists? |
| Comment by Kurt Nordstrom [ 14/Aug/18 ] |
|
I'm tagging Mike Taylor and Jason Skomorowski here as I need to get an idea of the impact that implementation of refresh tokens would make on the operations of the front end. If there are others who might be more suited to comment here, please tag them in. Here is the run-down of how I envision it working:
|
| Comment by Matthew Jones [ 14/Aug/18 ] |
|
Kurt Nordstrom, Overall this looks like a good summary.
Yes. This is how it should be done. The UI can manage this orchestration at a low level. Such logic would exist both prior to and after API requests are made. For example, if the access token's expiration is visible, the UI can request a new access token, when expired, prior to issuing an API request for some data. In the event the UI does receive a 403 from using an expired token, it can request another access token and upon receipt, replay the original API request for data. Reporting the TTL as an expiration timestamp may be easier (and more accurate) for UI calculations, rather than having to work back to when the token was issued (which can be troublesome when factoring browser refreshes or network latency). 5 minutes may be a little short for the access token (things could get very chatty), but certainly it would be measured in minutes, maybe 15, 30, or 60. As long as its configurable we should be good. In situations where we need to elevate a user's scope for a given operation, that token may have a life of only 5 minutes or less. The refresh token should be treated like a password and not be persisted anywhere visible on the client. We should be able to monitor usage and support the ability to invalidate any refresh token that has been issued. I've also seen implementations that issue new refresh tokens with each new access token request, automatically invalidating the prior refresh token effectively making them single-use. From a resource perspective, the data requested is an access token, not a refresh token, so IMO the endpoint might simply be /token, but that's trivial. |
| Comment by Matthew Jones [ 14/Aug/18 ] |
|
It should also be noted whether we are passing tokens by value or by reference. I believe we are passing by value today. |
| Comment by Kurt Nordstrom [ 15/Aug/18 ] |
|
Tokens are currently stateless, so they're entirely self-contained. I assume that is what we mean by 'passing by value'? If the refresh token is required to be attached to outgoing requests by the client, how do we keep it from showing up in the traffic logs? |
| Comment by Mike Taylor [ 15/Aug/18 ] |
|
Thanks for including me in this. Before we get into Stripes details, a couple of points:
I know I've mentioned this before on another thread, but I never got a meaningful answer: how is this better than just logging in again? It's the same amount of network traffic (one additional call to get the new access token, plus one re-issued call to repeat the application's request with the new token). And I'm not seeing how it's any more secure. In either case, if you can spy on the network traffic, you can steal the necessary secret (either user+password pair, or refresh token), so what exactly is this additional step buying us?
I'm not really worried about that. Stripes is already very chatty. Two more requests every five minutes are not going to make any discernible difference. But as you say, this is going to be configurable anyway (and maybe to different values for different users), so there's no real problem here. Anyway – on to the impact on Stripes. First, it's obviously a non-starter to require application-level code to deal with this. So it needs to be handled in the network abstraction layer. We presently have two of these: stripes-connect, which is very widely used but in the process of becoming deprecated; and Apollo GraphQL, which is not ready to use outside of inventory, but moving fast and based on standard, well-supported tools. We could of course implement all this in stripes-connect, but it strikes me as the kind of code that's hard to get right, and that tends to harbour lots of edge-cases and mysterious bugs. It would be a pretty major undertaking to do this in a way that we were confident about. By contrast, this seems (on the basis of a very cursory search) to be essentially a solved problem for Apollo GraphQL – in fact, it's going to be one of those things where the difficulty is not finding a solution, but _choosing_one. For example:
Rather than have all applications use the same refresh-token code in their invocation of Apollo, we would likely create a wrapper library (stripes-apollo) that knows how to do this. We've been aware for some time that we're likely to need such a library, so this would probably be the point to introduce it. So my conclusion is that it would be a costly waste of time to implement refresh tokens in stripes-connect, and that the effort it requires would be much better invested in getting our GraphQL implementation up to scratch. (This kind of situation is precisely the reason, or one of them, that Jason pushed for GraphQL for so long.) Of course, even if my first point turns out to have some weight, and we choose simply to have the client login again after tokens expire rather than introducing the additional step of a refresh token, it will still be necessary to add session-expiry code to stripes-connect or our use of Apollo, So that conceptual simplification, if we adopted it, would not actually save us much work. |
| Comment by Matthew Jones [ 15/Aug/18 ] |
In terms of network traffic, there may not be much difference. However, the refresh token offers a far better user experience over the user having to repeatedly login We shouldn't be holding on to the user's password past the initial login in order to do this automatically. Further, refresh tokens can be revoked invalidating open sessions (once the last active access token expires) without having to reset the user's password.
Agreed and what meant by "orchestration at a low level". Thanks for being more specific in your assessment. I also agree that this effort should largely involve choosing (and configuring) an existing solution. This sort of problem has been solved many times over. |
| Comment by Mike Taylor [ 15/Aug/18 ] |
I didn't mean the user logging in again! I meant the software.
Why not? (To be clear, I'm not disagreeing with you here, just trying to understand.)
OK, that makes sense.
I never know what people mean by "orchestration". And when they start explaining the difference between "orchestration" and "choreography" my frontal lobe shuts down Anyway, we seem to be much on the same page, |
| Comment by Jason Skomorowski [ 16/Aug/18 ] |
|
I think I largely echo Mike's thoughts on implementation minus the part where we need a stripes-apollo as I think it's pretty reasonable to do this in stripes-core for now as that's where we configure the Apollo provider. Abstracting it out would be best planned alongside work to modularise auth. I also think we should have the current token available in redux (and via the "stripes context" or whatever we wind up using for accessing global state) so that we continue to afford apps the option of directly making a REST request without intervening abstraction. A token with the ability to regenerate a session for 24hrs sounds a bit scary. |
| Comment by Mike Taylor [ 17/Aug/18 ] |
If you're right that we can do all this in the Apollo provider, then yes – putting the code in stripes-core is fine. (I'd assumed that we'd need to have refresh-handling code at the point of using Apollo, which of course will be in many places.)
That seems to me like it's an unreasonably high bar for client code: there's a need to handle the possibility of an expired access token, to refresh it, and to retry the operation every time someone access the WSAPI not via the tools we offer. I mean, sure, let's not actively prevent that; but it's hard to imagine anyone will want to ... And much, much, easier to imagine people carelessly doing a one-off fetch without even thinking about any of the token-handling stuff. |
| Comment by Jakub Skoczen [ 21/Aug/18 ] |
|
Kurt Nordstrom Mike Taylor I am trying to distill this thread into a short API spec. Corrections welcome. Refresh tokens is an approach to avoid long-lived (thus insecure) access tokens. This approach combines the benefit of a stateless (self-verifiable) access token and stateful (revokable) refresh token. The API is as follows: 1. Client authenticates by calling the "login" endpoint and providing credentials (this rarely happens directly, usually done through users-bl)
POST authn/login
{"username":"YYYY","password":"XXXX"}
In the current implementation the response contains the X-Okapi-Response-Token header which is the access token. This needs to change to include both the initial access and refresh token. Kurt Nordstrom the current scheme puts access token in the response header rather than in the body and would require proliferation of headers. I think we should include all token metadata (access, refresh, ttl) in the body. Any problem with this approach? 2. Client performs request against FOLIO with the token attached in the request header. When the token expires, FOLIO responds with 401 and payload indicating the expired token:
GET /some/endpoint
401 Unauthorized
{
"message": "Access token has expired",
"code": "TOKEN_EXPIRED",
"type": "auth"
}
This bit here is essential as it allows the client to understand when to refresh the token without tracking TTL. Looks like the popular web APIs use 401 or 400 and specific payload to indicate expired tokens. We may also include info about where to refresh the token. 3. Client refreshes the access token:
POST authn/refresh
{"refreshToken": "XYZ"}
|
| Comment by Jakub Skoczen [ 21/Aug/18 ] |
|
Heikki Levanto Kurt Nordstrom Guys, one more thing I am not clear about is the impact on intermodule communication, for instance when multiple modules are involved in request processing. What will happen when the access token expires while processing a request: 1. entire request fails with 401 IMHO 1) will not be acceptable, e.g I could imagine a scenerion where the access token is set to a very short TTL (say 1 minute) and the request (comprised of multiple module calls) takes long than that and can never complete. 2) is putting an implementation burden on all modules. |
| Comment by Heikki Levanto [ 21/Aug/18 ] |
|
We have most of #3 in place already. Mod-authtoken can generate specific tokens for the duration of the request, even specific tokens for each module that participates in the processing. Mostly so that we can handle module-specific permissions (f.ex. mod-login has a permission to create a valid token for users it has authenticated). I don't remember if we already create a specific token for each request, but it would not be hard to do so. At the same time, we could include the request-id in the token, to make sure that it gets passed on to further requests, and our logs will have the full "stack trace" of the nested requests. |
| Comment by Kurt Nordstrom [ 21/Aug/18 ] |
|
I don't know that the access tokens granted to modules based on their module permissions need to be set to short-lived. If these tokens never get exposed to the user, there is no longer the worry that they can be persisted for mischief. |
| Comment by Kurt Nordstrom [ 21/Aug/18 ] |
I don't think that this should be a problem. |
| Comment by Heikki Levanto [ 21/Aug/18 ] |
|
As a general principle, we should use short-lived tokens where ever possible. A web service is not likely to run many minutes, because the clients tend to time out at some point. So a 20-minute token should be quite sufficient for inter-module calls. Even a 3-hour token would be acceptable, but I would like to see some kind of expiry time in it, just in case someone manages to sniff network traffic in what is supposed to be our private network, but may not always be so. Cloud service providers have before managed to misconfigure their networks, and admins may leave a SSH session open when they go for lunch... |
| Comment by Mike Taylor [ 22/Aug/18 ] |
|
Jakub Skoczen's overview looks good to me. Just two minor comments:
I'm not fond of this. At present, every HTTP response includes an x-okapi-token. I wouldn't like the token to communicated one way (in the body) in the authentication response, but a different way (in a header) in every other response. So I say just shove the refresh token in an X-Okapi-Refresh header.
An alternative is to use code 498 Invalid Token, which is mentioned on the Wikipedia page as unofficial, but evidently has some precedent. (I'm not arguing for doing it this way, just noting that it's an option.) |
| Comment by Kurt Nordstrom [ 22/Aug/18 ] |
|
I actually like the idea of putting the token in the body as a reply to the login request, since this feels more "explicit" to me. Post credentials, get tokens. Then we continue to use headers for the actual passing of tokens. |
| Comment by Craig McNally [ 26/Sep/19 ] |
|
Jakub Skoczen Kurt Nordstrom Where does this stand? This is currently marked as "in progress" but there haven't been any updates in a long time... |
| Comment by Ann-Marie Breaux (Inactive) [ 24/Mar/21 ] |
|
Craig McNally This came up in the folio-implementers Slack channel today. Since you're familiar with the GOBI order API integration, I have a question - if tokens would have to be refreshed on some regular basis, would that affect the GOBI API order creation in FOLIO? Or is that using a different structure to validate and be allowed to make changes in FOLIO? cc: Dennis Bridges |
| Comment by Jakub Skoczen [ 10/Nov/21 ] |
|
Steve Ellis this is a pretty old ticket and we know have more detailed tickets and a spike which also acts as an umbrella for other tickets, would it be okay for me to close this? |
| Comment by Steve Ellis [ 11/Nov/21 ] |
|
Jakub Skoczen Closing this one is fine with me if it improves the organization of the tickets. |
| Comment by Khalilah Gambrell [ 16/May/22 ] |
|
Steve Ellis ans Jakub Skoczen - has implement refresh tokens been implemented. |
| Comment by Steve Ellis [ 16/May/22 ] |
|
Khalilah Gambrell I think this ticket can be closed and am closing it. Please feel free to reopen it if it is useful to you. We have more specific tickets which are available if you click on the refresh tokens tag. Where this stands currently: We have implemented and merged most of what is needed in mod-authtoken to support this (
|