Search Enhancements (UXPROD-1705)

[UXPROD-1941] Wait for POC of Elastic Search. Front-end query pre-processor: support words, phrases and booleans Created: 04/Jun/19  Updated: 29/Jun/23

Status: Draft
Project: UX Product
Components: None
Affects versions: None
Fix versions: None
Parent: Search Enhancements

Type: New Feature Priority: P3
Reporter: Jakub Skoczen Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: elastic-search, inventory, platform-backlog, po-mvp, round_iv, search, search_enhancements
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
blocks RMB-428 cql-java: AND multiple tokens Closed
blocks UXPROD-869 Advanced Search (within apps) Closed
is blocked by RMB-385 add 'queryIndexName' to schema.json a... Closed
Relates
relates to UIIN-868 Inventory search. Holdings segment. S... Closed
relates to UIIN-869 Inventory search. Item segment. Searc... Closed
relates to UIU-1068 expand the list of query terms into A... Closed
relates to STCOM-1180 CLONE - Implement Advanced search modal Closed
relates to STSMACOM-767 CLONE - Implement Advanced search modal Closed
relates to UIIN-1920 Implement Advanced search modal Closed
relates to UXPROD-140 Q4 2019 Timebox for Priority Inventor... Closed
relates to UXPROD-1820 RMB 27 release features and core modu... Closed
relates to UXPROD-2119 RMB 28 release features and core modu... Closed
relates to UXPROD-2180 Q1 2020 Timebox for Priority Inventor... Closed
relates to UXPROD-2298 Q2 2020 Timebox for Priority Inventor... Closed
relates to UXPROD-2443 Q3 2020 Timebox for Priority Inventor... Closed
relates to UXPROD-2712 Inventory Elastic Search (Lotus): Tim... Closed
relates to UXPROD-3513 Inventory Elastic Search (Morning Glo... Closed
relates to UXPROD-1015 Boolean/Query Search for Users Draft
relates to UIIN-724 Inventory search. Instance segment. S... Closed
relates to MSEARCH-305 BE- Inventory. Keyword search should ... Closed
relates to UIIN-602 no records found when searching title... Closed
relates to MSEARCH-362 BE: Inventory. Keyword search should ... Draft
Epic Link: Search Enhancements
Front End Estimate: XL < 15 days
Back End Estimate: Medium < 5 days
Development Team: Core: Platform
Kiwi Planning Points (DO NOT CHANGE): 1
Rank: Chalmers (Impl Aut 2019): R4
Rank: Chicago (MVP Sum 2020): R1
Rank: Cornell (Full Sum 2021): R2
Rank: Duke (Full Sum 2021): R2
Rank: 5Colleges (Full Jul 2021): R2
Rank: GBV (MVP Sum 2020): R2
Rank: Lehigh (MVP Summer 2020): R2
Rank: TAMU (MVP Jan 2021): R3
Rank: U of AL (MVP Oct 2020): R4

 Description   

Problem statement

Core search apps in FOLIO (ui-users, ui-inventory) accept only a simple search input string from users. This input cannot include any special characters (* wildcard being an exception), quotes (to represent phrases) or booleans.

From this input the UI generates CQL (boolean) queries according to hard-coded recipes: either simple ones like index="{userInput}" or more complicated search expressions across several search indexes. E.g., given the user input:

john smith

ui-users will generate:

firstName="john smith" OR lastName="john smith" OR username="john smith"
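The recipe above amounts to plain template substitution. A minimal sketch of that behavior follows; the template constant, the function name and the quote escaping are illustrative assumptions, not the actual ui-users code:

```javascript
// Hypothetical recipe for ui-users; {userInput} marks where the raw input goes.
const USERS_TEMPLATE =
  'firstName="{userInput}" OR lastName="{userInput}" OR username="{userInput}"';

// Substitute the whole user input, unparsed, into every placeholder.
function buildQueryFromTemplate(template, userInput) {
  // Assumption: escape backslashes and double quotes so the input
  // cannot terminate the surrounding CQL string early.
  const escaped = userInput.replace(/["\\]/g, '\\$&');
  // split/join avoids the special meaning of "$" in String.replace.
  return template.split('{userInput}').join(escaped);
}

console.log(buildQueryFromTemplate(USERS_TEMPLATE, 'john smith'));
// firstName="john smith" OR lastName="john smith" OR username="john smith"
```

Because the input is never tokenized, "john smith" is matched as a single value against each index.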

This approach is generally problematic and results in inadequate search behavior (see e.g. UIU-939 Closed, UIIN-435 Closed, UIIN-564 Closed). The workaround provided through UIU-1068 Closed addresses some of these problems, but others remain (like UIIN-602 Closed), and the workaround is an impediment to providing more sophisticated search functions in the future (boolean search, ranking, etc.). We would like to redesign how the UI (apps) handles user search input and include support for both simple term searches and boolean expressions (with quotes, boolean operators and parentheses).

Proposed solution

The proposed solution consists of two parts: changes in the UI (SearchAndSort component) and in the backend query converter (which resides in RMB).

UI changes: Design and implement (in JavaScript) a front-end query pre-processor that will provide a simple-to-use front-end query syntax. The pre-processor must support simple tokens (words), quoted tokens (phrases) and boolean operators (AND, OR, NOT) and convert the user query into CQL that can then be handled by the back-end. See UXPROD-1015 Draft for the eHoldings boolean search screencast.

Example (inventory):

Given the user input for the all search drop-down: "the c programming language" kernighan OR knuth, the UI generates the following CQL:

a) assuming we have a back-end search index for all: all="the c programming language" AND all="kernighan" OR all="knuth"

b) assuming the UI needs to expand all into two indexes, title and author: (title="the c programming language" OR author="the c programming language") AND (title="kernighan" OR author="kernighan") OR (title="knuth" OR author="knuth")
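The conversion in both options can be sketched as a small tokenizer plus a renderer. This is a minimal sketch under stated assumptions: the function names are hypothetical, adjacent terms are joined with an implicit AND, parentheses in user input are not handled, and the operator sequence is not validated.

```javascript
const OPERATORS = new Set(['AND', 'OR', 'NOT']);

// Split user input into quoted phrases, boolean operators and plain words.
function tokenize(input) {
  const tokens = [];
  const pattern = /"([^"]*)"|(\S+)/g;
  let match;
  while ((match = pattern.exec(input)) !== null) {
    if (match[1] !== undefined) {
      tokens.push({ type: 'term', value: match[1] }); // quoted phrase
    } else if (OPERATORS.has(match[2])) {
      tokens.push({ type: 'operator', value: match[2] });
    } else {
      tokens.push({ type: 'term', value: match[2] }); // single word
    }
  }
  return tokens;
}

// Escape backslashes and double quotes inside a CQL string value.
function escapeCql(value) {
  return value.replace(/["\\]/g, '\\$&');
}

// Render one term against the given indexes; several indexes are
// OR-ed together and wrapped in parentheses.
function renderTerm(value, indexes) {
  const clauses = indexes.map((index) => `${index}="${escapeCql(value)}"`);
  return clauses.length === 1 ? clauses[0] : `(${clauses.join(' OR ')})`;
}

// Convert user input to CQL for a list of back-end indexes.
function toCql(input, indexes) {
  const parts = [];
  let previousWasTerm = false;
  for (const token of tokenize(input)) {
    if (token.type === 'operator') {
      parts.push(token.value);
      previousWasTerm = false;
    } else {
      if (previousWasTerm) parts.push('AND'); // implicit AND
      parts.push(renderTerm(token.value, indexes));
      previousWasTerm = true;
    }
  }
  return parts.join(' ');
}

const input = '"the c programming language" kernighan OR knuth';
console.log(toCql(input, ['all']));             // shape of option a)
console.log(toCql(input, ['title', 'author'])); // shape of option b)
```

With a single index the output has the shape of option a); with two indexes every term expands into a parenthesized OR group, as in option b).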

Obviously, option b) leads to a longer and more complex CQL query. Hence a backend extension is proposed to address this.

Backend changes: Add a new feature that allows creating "compound indexes" for a list of JSON properties (or for a specific subset of JSON), see RMB-385 Closed. This will allow module authors to create "virtual" search fields, e.g. fullName that includes both userName and firstName.

Alternative solution considered but rejected

The alternative is to add extensions to the back-end CQL Java parser that would not conform to the spec but would allow us to use the language verbatim in the UI and support quoting phrases and using booleans (see RMB-428 Closed).



 Comments   
Comment by Jakub Skoczen [ 06/Aug/19 ]

Zak Burke Adam Dickmeiss Guys, I'd like to get your feedback on this issue.

Comment by Zak Burke [ 06/Aug/19 ]

I think defining a front-end query language that we translate into CQL is a good idea. I share the concerns expressed by Julian Ladisch and Mike Taylor on RMB-428 Closed , namely that it couples cql-java too tightly to whatever particular query language we invent, in addition to diverging from the CQL spec there.

There are many good examples of javascript parsers we could use for inspiration or even leverage directly.

Comment by Jakub Skoczen [ 03/Sep/19 ]

Zak Burke Adam Dickmeiss Ok, seems like we are on the same page then. One option is to write our own parser, e.g take inspiration from https://github.com/indexdata/cql-js

Comment by Zak Burke [ 05/Nov/19 ]

Cate Boerema, Charlotte Whitt, Marc Johnson, in our discussion about UIIN-724 Closed yesterday we decided to implement "Search by query language" by adding an element to the search-indexes dropdown and providing the user's input as-is to the backend. We briefly discussed whether to add a query-parser on the front-end to validate the syntax but dismissed it as extra work at this time. This is a good approach for providing search-by-CQL as soon as possible.

In the long term, however, I think we should add front-end parsing. This would enable us to use a single search box in apps like ui-users where there is no search-index dropdown like we have in ui-inventory. This makes the search box much more powerful without requiring any changes to the UI. For example, given a search term like foo bar we would attempt to parse it, find no fields and no operators, and decide to substitute the value into a default query string (this is what we currently do). Given a search term like lastname:foo and firstname:bar we would attempt to parse it, find some fields (lastname, firstname), the and operator, and the values foo and bar, and we could construct a specific CQL query.
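The two cases described above could be sketched as follows. This is only an illustration: the default template, the function names and the field:value pattern are assumptions, field names are passed through to CQL unchanged, and a user who literally searches for the word "and" would be misread as typing an operator.

```javascript
// Hypothetical default recipe, used when no fields or operators are found.
const DEFAULT_TEMPLATE =
  'firstName="{userInput}" OR lastName="{userInput}" OR username="{userInput}"';

const BOOLEANS = new Set(['and', 'or', 'not']);

// Escape backslashes and double quotes inside a CQL string value.
function escapeCql(value) {
  return value.replace(/["\\]/g, '\\$&');
}

function fillTemplate(value) {
  return DEFAULT_TEMPLATE.split('{userInput}').join(escapeCql(value));
}

// Classify each whitespace-separated word as an operator, a
// field:value clause, or a bare value (field === null).
function parseSearchTerm(input) {
  return input.trim().split(/\s+/).filter(Boolean).map((word) => {
    if (BOOLEANS.has(word.toLowerCase())) {
      return { type: 'operator', value: word.toLowerCase() };
    }
    const match = /^([A-Za-z]\w*):(.+)$/.exec(word);
    return match
      ? { type: 'clause', field: match[1], value: match[2] }
      : { type: 'clause', field: null, value: word };
  });
}

function buildQuery(input) {
  const tokens = parseSearchTerm(input);
  const hasStructure = tokens.some(
    (t) => t.type === 'operator' || Boolean(t.field)
  );
  // No fields and no operators: fall back to the default recipe.
  if (!hasStructure) return fillTemplate(input.trim());

  const parts = [];
  let previousWasClause = false;
  for (const token of tokens) {
    if (token.type === 'operator') {
      parts.push(token.value);
      previousWasClause = false;
      continue;
    }
    if (previousWasClause) parts.push('and'); // implicit operator
    parts.push(token.field
      ? `${token.field}="${escapeCql(token.value)}"`
      : `(${fillTemplate(token.value)})`); // bare word in a structured query
    previousWasClause = true;
  }
  return parts.join(' ');
}

console.log(buildQuery('foo bar'));
console.log(buildQuery('lastname:foo and firstname:bar'));
```

The first call falls back to the default recipe with "foo bar" as a single value; the second yields lastname="foo" and firstname="bar".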

Comment by Charlotte Whitt [ 05/Nov/19 ]

Felix Hemme and Julian Ladisch - see Zak Burke's comment.
I think this is exactly what you also suggested as the long-term solution last week.

Comment by Holly Mistlebauer [ 17/Jun/20 ]

Chicago comment from Round IV Outliers spreadsheet: We have a collection of 8 million bibs. We need to be able to produce queries with precision to find items. General keyword is not sufficient. -Kristin Martin

Generated at Fri Feb 09 00:19:48 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.