GET item-storage/items?query=barcode=="" is very slow - takes 58 seconds
Description
CSP Request Details
CSP Rejection Details
Potential Workaround
Attachments
- 28 Sep 2021, 01:59 PM
- 28 Sep 2021, 01:55 PM
- 27 Sep 2021, 05:40 PM
- 27 Sep 2021, 05:38 PM
- 27 Sep 2021, 05:35 PM
defines
relates to
Checklist
hideTestRail: Results
Activity
Ann-Marie BreauxSeptember 29, 2021 at 7:28 AMEdited
Closing this issue based on @Julian Ladisch's comments.
@Aliaksandr Fedasiuk has identified a separate issue where Data Import is sending an empty barcode request. That will be fixed as a separate issue (MODINV-508), and may need to be considered for a Juniper hotfix
Julian LadischSeptember 28, 2021 at 10:35 PM
https://folio-org.atlassian.net/browse/MODINVSTOR-523#icft=MODINVSTOR-523 entirely resolves this performance issue for Kiwi.
Julian LadischSeptember 28, 2021 at 10:33 PM
The fix https://github.com/folio-org/mod-inventory-storage/pull/648 = https://folio-org.atlassian.net/browse/MODINVSTOR-523#icft=MODINVSTOR-523 will ship with Kiwi.
The fix has been back-ported to the b21.4 branch of Juniper (mod-inventory-storage-21.0.4) but it was decided that the fix should NOT be released as a Juniper hotfix because people are not ready for item barcode uniqueness. Institutions that are affected by the performance (Cornell) apply a fix manually. The back-port is no longer on the b21.4. branch.
Ann-Marie BreauxSeptember 28, 2021 at 8:09 PM
Hi @Julian Ladisch - that's great to hear! A couple quick questions:
Do you mean that the PR resolves it entirely (and we can close this Jira)? Or do we need to copy what you've described into a separate fix?
Will the PR you referenced be released as a Juniper hotfix, or not until Kiwi?
Julian LadischSeptember 28, 2021 at 7:58 PM
This issue has been fixed by https://github.com/folio-org/mod-inventory-storage/pull/648
CQL
barcode==""
results in SQL
WHERE lower(item.jsonb->>'barcode') LIKE lower('')
with this query plan based on a very fast index scan:
Index Scan using item_barcode_idx_unique on item (cost=0.15..8.17 rows=1 width=200) (actual time=0.050..0.050 rows=0 loops=1)
Index Cond: (lower((jsonb ->> 'barcode'::text)) = ''::text)
Filter: (lower((jsonb ->> 'barcode'::text)) ~~ ''::text)
Overview:
GET item-storage/items?query=barcode=="" is used multiple times in Data Import workflow and we have seen significant slowness because of this query. Why is Data Import making a query for empty barcode?
Steps to Reproduce:
Log into https://pre-kiwi-cap1.int.aws.folio.org/ as folio user
Open Data Import App
Run Create job profile job with 100 records. See attached mrc file(kcp1-DE-100-7492.mrc) for reference.
Job completes successfully but it takes 30+ minutes
While the job is running, the request is made to mod-inventory-storage, item-storage API as below which consumes 100% RDS Postgresql CPU
Request
curl --request GET 'https://okapi-pre-kiwi-cap1.int.aws.folio.org/item-storage/items?query=barcode==""' \ --header 'x-okapi-tenant: fs09000000' \ --header 'x-okapi-token: ${OKAPITOKEN}' \ --header 'Content-Type: application/json'
Response
Status code 200 57.21 seconds
Expected Results:
Should return response in less than 1 second
Actual Results:
taking 58 seconds
Additional Information:
In the database following query is run:
SELECT jsonb,id FROM fs09000000_mod_inventory_storage.item WHERE lower(fs09000000_mod_inventory_storage.f_unaccent(item.jsonb->>'barcode')) LIKE lower(fs09000000_mod_inventory_storage.f_unaccent(''))
Explain Analyze of a simpler version of above query:
EXPLAIN ANALYZE SELECT jsonb,id FROM fs09000000_mod_inventory_storage.item WHERE item.jsonb->>'barcode'=''; Query Plan: "Gather (cost=1000.00..7050169.75 rows=44489 width=782) (actual time=80977.741..80981.200 rows=0 loops=1)" " Workers Planned: 2" " Workers Launched: 2" " -> Parallel Seq Scan on item (cost=0.00..7044720.85 rows=18537 width=782) (actual time=80973.762..80973.762 rows=0 loops=3)" " Filter: ((jsonb ->> 'barcode'::text) = ''::text)" " Rows Removed by Filter: 2979217" "Planning Time: 0.121 ms" "Execution Time: 80981.225 ms"
Before running Explain Analyze, we vaccumed table
vacuum fs09000000_mod_inventory_storage.item
In mod-inventory-storage log, we see lot of calls to get empty barcode. See attached mod-inventory-storage mod-inventory-storage-item-storage.log
13:58:49 [] [] [] [] INFO CQLWrapper CQL >>> SQL: barcode=="" >>>WHERE lower(f_unaccent(item.jsonb->>'barcode')) LIKE lower(f_unaccent('')) LIMIT 10 OFFSET 0
URL:
Interested parties:
All consumers of mod-inventory-storage
CC @Ann-Marie Breaux