Performance
(UXPROD-746)
| Status: | Closed |
| Project: | UX Product |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Parent: | Performance |
| Type: | New Feature |
| Priority: | P1 |
| Reporter: | Holly Mistlebauer |
| Assignee: | Holly Mistlebauer |
| Resolution: | Won't Do |
| Votes: | 0 |
| Labels: | NFR, performance |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original estimate: | Not Specified |
| Attachments: | |
| Issue links: | |
| Epic Link: | Performance |
| Front End Estimate: | Out of scope |
| Front-End Confidence factor: | Low |
| Back End Estimate: | Jumbo: > 45 days |
| PO Rank: | 0 |
| Description |
|
Overview: Marc Johnson has created a proposal at https://folio-org.atlassian.net/wiki/x/DgJU. After reviewing the proposal, the Capacity Planning Team has determined that we should proceed with the caching approach. Marc is in the process of creating a document outlining the technical aspects for the devs.

Steps: The "is defined by" stories for this feature should be worked on in this order...

Recommended approach to take...
|
| Comments |
| Comment by Holly Mistlebauer [ 21/Sep/21 ] |
|
Marc Johnson: Which record types should we cache first? I would like to create tickets for at least 5. |
| Comment by Marc Johnson [ 21/Sep/21 ] |
I don't think it would be appropriate for me to make that decision. My preference would be for the RA SIG or relevant POs to decide. Cheryl Malmborg do you have a preference? Given the conversation in Capacity Planning, maybe Khalilah Gambrell or Hkaplanian have some thoughts on which should be chosen? |
| Comment by Hkaplanian [ 21/Sep/21 ] |
|
The path forward here is to take the list of requests the circ apps make that take the most time and, from a technical perspective, look at which database lookups would give us the most time savings. We already have that data at the call level (I believe). In many ways this is a POC, and I assume multiple questions will arise both during and after it where the SIG might be able to help us clarify things, but at this stage we have the data needed to make that decision. We already have the list I refer to. Marc, it's up to you at this stage.
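The ranking approach described above (take the slowest circ-app requests and pick the database lookups with the biggest potential savings) can be sketched roughly as follows. This is an illustration only: the intent names and timings are hypothetical, not the actual call-level data referred to in the comment.

```python
# Illustrative sketch only: rank lookups by potential saving per check out.
def rank_candidates(samples):
    """samples: list of (intent, avg_response_ms, calls_per_checkout)."""
    # Total cost per check out = average response time x call frequency.
    return sorted(samples, key=lambda s: s[1] * s[2], reverse=True)

# Hypothetical sample timings (not real FOLIO measurements).
timings = [
    ("Fetch loan policy", 30, 1),
    ("Fetch lost items fee policy", 11, 1),
    ("Fetch service point", 25, 2),
]

for intent, avg_ms, calls in rank_candidates(timings):
    print(intent, avg_ms * calls)
```

The highest-ranked entries would then be the first candidates for caching.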
|
| Comment by Holly Mistlebauer [ 24/Sep/21 ] |
|
Marc Johnson: I have looked at the response time of each "Intent" using the data available at https://folio-org.atlassian.net/wiki/x/DgJU. The "Intents" with the highest response times are...
I am assuming that we can only cache the "Fetch" "Intents", so "Update item status" is out. Two of the "Fetches" have already had an improvement made ("Fetch automated blocks" and "Fetch item barcode"), and one has a separate issue for the Vega team to address ("Fetch manual patron blocks"). Marc Johnson: Should we include any of these 4 in the group of 5 we start with? If not, I am thinking we could do these 5...
Thoughts? |
| Comment by Hkaplanian [ 24/Sep/21 ] |
|
Marc can correct me, but I think a good set of criteria could be:
|
| Comment by Hkaplanian [ 28/Sep/21 ] |
|
Marc, looking at this list:
During a loan, are these called multiple times for the same data? Once per item? Twice per item? Is fetching the lost items fee policy worth it if in total it only saves 11 ms? Just wondering... |
| Comment by Marc Johnson [ 28/Sep/21 ] |
They should only be requested once per check out (which is for a single item).

Not really, not on their own. It's also worth remembering that this sample is likely misleading (though it's all we've got to go on). I think this is where evaluating whether this approach is improving performance is going to be challenging. The current performance is due to the cumulative effect of many requests, which likely means we will only get a fairly small (maybe not really any) improvement from stopping any one request. As we don't know how each of these requests degrades under load (we only have a sample under no load and the overall check out API response times), it is challenging to know which ones put the system under pressure and which ones become more significant constraints under load. What this all means is that we aren't likely to know how well we've done until multiple record types have been cached and a full load test has been conducted. This makes getting timely feedback on whether we've chosen the right approach and record types challenging. |
| Comment by Marc Johnson [ 28/Sep/21 ] |
Most of them will be fetched for every request; these are the ones I'm confident of:
Loans, requests and items are poor candidates for caching (though not for other forms of derived data; this is why my preference was for persistent derived data), as I imagine check outs for the same item in a short time frame are rare. I don't know what the impact of title level requests will be on this area; I cannot answer that authoritatively without much more analysis of all of the code paths in the system.
Can you help me understand why you are asking this question? The approach we've chosen (on demand, partial caching) means that we likely won't be loading the entire set of records for any record type into memory, and if we do, it will be one record at a time.
I believe the unique barcode changes have been aborted for 2021 R2. |
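The "on demand, partial caching" approach described above (load records into memory one at a time, on first use, rather than caching whole record sets) can be sketched as below. This is an assumption-laden illustration: the class name, loader function, and TTL value are invented for the example, not taken from FOLIO code.

```python
import time

class RecordCache:
    """Sketch of an on-demand, partial cache: records are fetched one at
    a time on first use and expire after a TTL."""

    def __init__(self, loader, ttl_seconds=60.0):
        self.loader = loader          # function: record_id -> record
        self.ttl = ttl_seconds
        self._entries = {}            # record_id -> (record, fetched_at)

    def get(self, record_id, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(record_id)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]           # fresh hit: no storage-module call
        record = self.loader(record_id)   # miss or expired: fetch one record
        self._entries[record_id] = (record, now)
        return record
```

With a counting loader you can see that repeated reads within the TTL hit the cache, while an expired entry triggers a fresh fetch.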
| Comment by Marc Johnson [ 29/Sep/21 ] |
|
I think I've answered the current questions and provided my thoughts on implementation. Please let me know if you need anything else at the moment.

Yes, state changes cannot be avoided (at least not with the current process design).

I think we should, for the moment, exclude from the caching any of the operations to which we have decided to dedicate separate work, in order to understand the impact of those improvements separately (ish, depending upon the frequency of performance testing) from the caching changes.

My proposed ordering: Given that we aren't going to work with the RA SIG or other stakeholders to understand the tolerances they might accept, and that the response times for most of the record fetches are of a similar magnitude, I think it makes sense to start with the ones that (I think) are less likely to change and/or where the impact of changes will likely be low. The policies are in a slightly strange place in this list; I've put them a little higher than the potential negative impact might suggest, because we might want to get some of that feedback sooner rather than later. Here is my proposed ordering (my reasoning in brackets):

Cache policies: I suggest we start with
Both of these should be runtime configurable so the PTF team (and other operational folks) can tweak them. |
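One way "runtime configurable" could look is sketched below, assuming (hypothetically, since the two settings are not named in the comment) that the cache's expiry time and maximum size are meant, read from environment variables. The variable names here are invented for illustration, not actual FOLIO configuration keys.

```python
import os

# Hypothetical setting names; real FOLIO configuration keys may differ.
def cache_settings():
    """Read cache tuning knobs from the environment, with defaults,
    so operators can tweak them without a redeploy."""
    return {
        "ttl_seconds": float(os.environ.get("RECORD_CACHE_TTL_SECONDS", "60")),
        "max_entries": int(os.environ.get("RECORD_CACHE_MAX_ENTRIES", "1000")),
    }
```

The PTF team could then adjust the values between load-test runs by setting the variables on the service.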
| Comment by Holly Mistlebauer [ 29/Sep/21 ] |
|
Khalilah Gambrell and Hkaplanian: Hi! I have created the stories for this feature. Should I assign this to Vega? Thanks... |
| Comment by Julian Ladisch [ 05/Oct/21 ] |
|
If optimistic locking is enabled for a table, fetching an outdated record and using it for a PUT will result in an optimistic-locking failure that persists when reloading the record until the cache expiration time has been reached. This can be avoided by invalidating the cache entry for that record when doing a PUT. |
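The invalidation described above can be illustrated with a small sketch: without the invalidation on PUT, the cache would keep serving the stale copy (old version), so subsequent PUTs would keep failing the optimistic-locking check until the TTL expired. All names here are hypothetical, not FOLIO APIs.

```python
class CachingClient:
    """Sketch: a read-through cache that invalidates an entry on PUT so
    the next read sees the record's new version."""

    def __init__(self, storage, cache):
        self.storage = storage        # has get(record_id) / put(record_id, record)
        self.cache = cache            # simple dict-like cache

    def get(self, record_id):
        if record_id not in self.cache:
            self.cache[record_id] = self.storage.get(record_id)
        return self.cache[record_id]

    def put(self, record_id, record):
        self.storage.put(record_id, record)   # storage may bump the record's version
        # Invalidate so later reads fetch the new version rather than a
        # stale copy that would fail the optimistic-locking check on PUT.
        self.cache.pop(record_id, None)
```

Against a fake versioned storage, a read after a PUT returns the updated version instead of the cached stale one.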
| Comment by Holly Mistlebauer [ 09/Nov/21 ] |
|
It was decided that caching data would not give us the level of performance improvement we need. |