Improve checkout performance by caching data

Description

Overview:
Cornell reports that check-ins and check-outs take from 1 to 5 seconds (and sometimes up to 11 seconds). Missouri State reports similar times. We need to improve the processing time for check-ins and check-outs.

A proposal has been created at https://folio-org.atlassian.net/wiki/x/DgJU. After reviewing the proposal, the Capacity Planning Team determined that we should proceed with the caching approach. Marc is in the process of creating a document outlining the technical aspects for the devs.

Steps:

  1. Ask the assigned team to come up with an approach within 2 weeks.

  2. Try out this caching approach on one record type, choosing the one with the biggest impact.

  3. After we are satisfied with the process, implement caching for as many other record types as we are able to during the release. Prioritize the remaining record types so that we address the heavy hitters first.

  4. Have the PTF team analyze the results of caching work completed.

  5. Discuss the impact of caching with the Resource Access SIG (e.g. if a cached record is more than X minutes old, refresh it). We are waiting until we know the impact of caching on response time so that we can present the process with as much information as possible.

  6. Determine next steps based on new PTF team analysis.

The "is defined by" stories for this feature should be worked on in the order given below.

Recommended approach:

  • cache expiration of 5 seconds for all record types

  • maximum cache size of 1000 records (this is pure speculation, as we don't know what impact the caching will have on memory usage)

Priority

Fix versions

None

Development Team

None

Assignee

Solution Architect

Parent Field Value

None

Parent Status

None

Attachments

1


Activity


Holly Mistlebauer, November 9, 2021 at 5:57 PM

It was decided that caching data would not give us the level of performance improvement we need.

Julian Ladisch, October 5, 2021 at 6:17 PM

If optimistic locking is enabled for a table, getting an outdated record from the cache and using it for a PUT will result in an optimistic locking failure that persists when reloading the record, until the cache expiration time has been reached.

This can be avoided by invalidating the cache for that record when doing a PUT.
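The invalidate-on-PUT idea can be sketched with a minimal TTL cache. This is an illustrative, language-agnostic sketch in Python (FOLIO's backend modules are Java); the names `RecordCache`, `put_record`, and `loader` are hypothetical, not FOLIO APIs. The PUT writes through to storage and evicts the cached copy, so the next read fetches the record with its new version instead of repeatedly failing optimistic-locking checks until the TTL expires.

```python
import time


class RecordCache:
    """Minimal TTL cache illustrating invalidate-on-PUT (hypothetical sketch)."""

    def __init__(self, ttl_seconds=5.0):
        self.ttl = ttl_seconds
        self._entries = {}  # record id -> (record, cached_at)

    def get(self, record_id, loader):
        entry = self._entries.get(record_id)
        if entry is not None and time.monotonic() - entry[1] < self.ttl:
            return entry[0]  # fresh enough: serve from cache
        record = loader(record_id)  # miss or expired: reload from storage
        self._entries[record_id] = (record, time.monotonic())
        return record

    def invalidate(self, record_id):
        self._entries.pop(record_id, None)


def put_record(storage, cache, record_id, updated):
    # Update storage, then evict the now-stale cached copy so the next
    # read sees the new version rather than a persistent 409 until the
    # TTL elapses.
    storage[record_id] = updated
    cache.invalidate(record_id)
```

Without the `invalidate` call, any reader holding the cached copy would keep sending the old version number on PUT and failing until the entry expired.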

Holly Mistlebauer, September 29, 2021 at 5:47 PM
Edited

Hi! I have created the stories for this feature. Should I assign this to Vega? Thanks...

Marc Johnson, September 29, 2021 at 11:30 AM

I think I've answered the current questions asked and provided my thoughts on implementation. Please let me know if you need anything else at the moment.

I am assuming that we can only cache the "Fetch" "Intents", so "Update item status" is out.

Yes, state changes cannot be avoided (at least not with the current process design).

Two of the "Fetches" have had an improvement made already ("Fetch automated blocks" and "Fetch item barcode"), and one has a separate issue for the Vega team to address ("Fetch manual patron blocks"). Marc Johnson: Should we include any of these 4 in the group of 5 we start with?

I think we should, for the moment, exclude any of the operations to which we have decided to dedicate separate work, so that we can understand the impact of those improvements separately from the caching changes (ish, depending upon the frequency of performance testing).

My Proposed Ordering

Given that we aren't going to work with the RA SIG or other stakeholders to understand the tolerances they might accept, and that the response times for most of the record fetches are of a similar magnitude, I think it makes sense to start with the ones that (I think) are less likely to change and/or where the impact of changes will likely be low.

The policies are in a slightly strange place in this list: I've put them a little higher than their potential negative impact might suggest, because we might want to get some of that feedback sooner rather than later.

Here is my proposed ordering (my reasoning in brackets):

  • tenant locale (singular, is common to all check outs, should change very rarely)

  • loan type (probably small set, likely common to some check outs, low impact if inconsistent)

  • patron group (probably small set, likely common to some check outs, low impact if inconsistent)

  • material type (unsure of set size, likely common to some check outs, low impact if inconsistent)

  • location (unsure of set size, likely common to some check outs, low impact if inconsistent)

  • service point (unsure of set size, likely common to some check outs, low impact if inconsistent)

  • loan policy (small set, likely common to many check outs, possible high impact if inconsistent)

  • lost item policy (small set, likely common to many check outs, possible high impact if inconsistent)

  • overdue fine policy (small set, likely common to many check outs, possible high impact if inconsistent)

  • institution (unsure of set size, likely common to some check outs, low impact if inconsistent)

  • campus (unsure of set size, likely common to some check outs, low impact if inconsistent)

  • library (unsure of set size, likely common to some check outs, low impact if inconsistent)

  • circulation rules (already cached, we may want to replace this with a similar cache to what we implement in other places)

  • instance (large set, unlikely to be common to many check outs)

  • holdings record (large set, unlikely to be common to many check outs)

  • user (large set, unlikely to be common to many check outs)

  • item (large set, not common to any successful check outs within the time frame)

Cache Policies

I suggest we start with:

  • a cache expiration of 5 seconds for all record types

  • a maximum cache size of 1000 records (this is pure speculation, as we don't know what impact the caching will have on memory usage)

Both of these should be runtime configurable so the PTF team (and other operational folks) can tweak them.
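As an illustration only, the proposed starting policies (5-second expiration, a 1000-record cap, both adjustable at runtime) could look like the following sketch. This is Python for brevity; `BoundedTtlCache` and its parameter names are hypothetical and not part of any FOLIO module.

```python
import time
from collections import OrderedDict


class BoundedTtlCache:
    """Illustrative cache with the proposed starting policies:
    a short expiration and a record cap, both adjustable at runtime."""

    def __init__(self, ttl_seconds=5.0, max_size=1000):
        self.ttl = ttl_seconds        # runtime-configurable expiration
        self.max_size = max_size      # runtime-configurable size cap
        self._entries = OrderedDict()  # key -> (value, cached_at)

    def get(self, key, loader):
        entry = self._entries.get(key)
        if entry is not None:
            value, cached_at = entry
            if time.monotonic() - cached_at < self.ttl:
                self._entries.move_to_end(key)  # mark as recently used
                return value
            del self._entries[key]  # expired
        value = loader(key)  # load on miss, one record at a time
        self._entries[key] = (value, time.monotonic())
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used
        return value
```

In the Java-based FOLIO modules, a caching library such as Caffeine expresses the same policies declaratively (roughly `expireAfterWrite(5, TimeUnit.SECONDS)` and `maximumSize(1000)`), which would make them straightforward to expose as runtime configuration.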

Marc Johnson, September 28, 2021 at 4:24 PM
Edited

Which of the items listed are run for each and every checkout?

Most of them will be fetched for every request; these are the ones I'm confident of:

  • item

  • holdings record

  • instance

  • location

  • library

  • campus

  • institution

  • material type

  • loans

  • requests

  • user

  • user group

  • patron blocks (both manual and automatic)

  • service point

  • loan policy

  • circulation rules (although there is already caching here)

  • tenant locale

Loans, requests, and items are poor candidates for caching (though not for other forms of derived data; this is why my preference was for persistent derived data), as I imagine check-outs for the same item in a short time frame are rare. I don't know what the impact of title-level requests will be on this area.

I cannot answer that authoritatively without much more analysis of all of the code paths in the system.

Of those, which are relatively small tables that can be loaded into memory easily? Circ rules can be reused with each and every scan, don't change often, and could be a good candidate. Same for fine policies.

Can you help me understand why you are asking this question?

The approach that we've chosen (on-demand, partial caching) means that we likely won't be loading the entire set of records for any record type into memory, and if we do, it will be one record at a time.

Ex: Barcode file is probably too large and unique for every scan.

I believe the unique barcode changes have been aborted for 2021 R2 and maybe 2021 R3 due to some organisations not being ready for this change.

Won't Do

Details

Reporter

PO Rank

0

Front End Estimate

Out of scope

Front-End Confidence factor

Low

Back End Estimate

Jumbo: > 45 days


Created September 21, 2021 at 2:58 PM
Updated January 4, 2022 at 9:50 AM
Resolved November 9, 2021 at 5:57 PM