Users App (UXPROD-784)

[UXPROD-1015] Boolean/Query Search for Users Created: 03/Aug/18  Updated: 03/Jan/24

Status: Draft
Project: UX Product
Components: None
Affects versions: None
Fix versions: None
Parent: Users App

Type: New Feature Priority: P3
Reporter: Charlotte Whitt Assignee: Amelia Sutton
Resolution: Unresolved Votes: 0
Labels: PO-mvp, inventory, round_iv, search, search_enhancements, usermanagement
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: PNG File Skærmbillede 2018-07-16 kl. 16.45.56.png     PNG File Skærmbillede 2019-02-14 kl. 17.01.57.png     PNG File Skærmbillede 2019-02-14 kl. 17.02.40.png     PNG File screenshot-1.png    
Issue links:
Defines
is defined by UIU-1713 Add Query search to Users Open
Relates
relates to UXPROD-2760 Improved searching in User app with E... Open
relates to FOLIO-1473 Expose full CQL in the Inventory app ... Closed
relates to UIEH-429 Add parameter advancedsearch=true to ... Closed
relates to MSEARCH-305 BE- Inventory. Keyword search should ... Closed
relates to UXPROD-907 Advanced Search (Within Apps) Closed
relates to UXPROD-1045 Fulltext Search Closed
relates to MSEARCH-362 BE: Inventory. Keyword search should ... Draft
relates to UXPROD-1941 Wait for POC of Elastic Search. Front... Draft
relates to UXPROD-2458 Boolean/Query Search for Requests Draft
Potential Workaround: CW: Create all common search queries, like the combined title, contributor, identifier search we have implemented in Inventory. For all combined searches that are not spec'ed, searching in discovery would be the alternative.
Epic Link: Users App
Front End Estimate: XL < 15 days
Front End Estimator: Hkaplanian
Front-End Confidence factor: Low
Back End Estimate: XL < 15 days
Back End Estimator: Hkaplanian
Estimation Notes and Assumptions: KG: When getting estimates, I think we should get two estimates 1.) Boolean Search across all apps AND 2.) Boolean search for Inventory, Codex, MARCcat only. Based on estimates, we may need to split this feature into multiple features.
Development Team: Prokopovych
Kiwi Planning Points (DO NOT CHANGE): 3
PO Rank: 116
PO Ranking Note: CB: Giving same rank as institutional rank.
Rank: Chalmers (Impl Aut 2019): R4
Rank: Chicago (MVP Sum 2020): R1
Rank: Cornell (Full Sum 2021): R2
Rank: Duke (Full Sum 2021): R1
Rank: 5Colleges (Full Jul 2021): R2
Rank: FLO (MVP Sum 2020): R4
Rank: GBV (MVP Sum 2020): R2
Rank: Grand Valley (Full Sum 2021): R4
Rank: hbz (TBD): R4
Rank: Hungary (MVP End 2020): R1
Rank: Lehigh (MVP Summer 2020): R4
Rank: Leipzig (Full TBD): R1
Rank: Leipzig (ERM Aut 2019): R4
Rank: Mainz (Full TBD): R3
Rank: MO State (MVP June 2020): R4
Rank: TAMU (MVP Jan 2021): R1
Rank: U of AL (MVP Oct 2020): R2

 Description   

Ability to do boolean/query search in apps such as Users. NOTE: We already have boolean/query search in Inventory and eHoldings apps (though they differ in how they are implemented).

---------------------------------------

In Inventory, the "Query search" option allows users to use CQL to search Inventory.

In the eHoldings app a query language is implemented, and the following search functionality is supported:

  • boolean operators - Using AND, OR, NOT in the simple search box
  • exact phrase - using quotation marks around the phrase
  • nested search - using brackets
  • right truncation - using asterisk at end of word

Documentation:

NOTE:

  • MARC field searching is out of scope for this epic, as it's covered in the MARCcat epic (the MARCcat app will supply MARC searching)
  • Advanced Search (within Apps) - UXPROD-907 Closed


 Comments   
Comment by Khalilah Gambrell [ 06/Aug/18 ]

Charlotte Whitt, should we focus on applying this feature to the following apps: Inventory and Codex?

Comment by Filip Jakobsen [ 06/Aug/18 ]

Khalilah Gambrell, I think users would (justifiably) expect to be able to do boolean searching in all apps if they can do it in some apps. Is that unrealistic for v1?

Comment by Filip Jakobsen [ 06/Aug/18 ]

Khalilah Gambrell, alternatively, I would advocate for some subtle distinction in the user interface, to make it clear in which contexts boolean searching is supported

Comment by Charlotte Whitt [ 06/Aug/18 ]

Filip Jakobsen - in an ideal world, it should be implemented by all apps, but I agree with Khalilah Gambrell: if, due to lack of development resources, we can't implement it for all apps at the same time, then it makes perfect sense for our users to start out implementing boolean search in the apps holding bibliographic data and millions of records - e.g. Inventory, Codex, MARCcat, eHoldings etc. - btw the latter has it already

Comment by Filip Jakobsen [ 06/Aug/18 ]

Charlotte Whitt, Khalilah Gambrell, if there are no resources, I think we should make sure to follow my latter suggestion above.

Comment by Mike Taylor [ 19/Sep/18 ]

See comments on FOLIO-1473 Closed for how to do this. The great bulk of the work outlined there is to do with providing FOLIO-wide facilities which all of the UI modules will then pretty easily be able to take advantage of.

Comment by Anya [ 29/Mar/19 ]

Comment from March meeting : also needs "begins with"

Comment by Lisa Sjögren [ 13/Sep/19 ]

Downranked this to a year, since we do not really care how the search functionality is built (BOOLEAN or otherwise), as long as it's good and flexible. When it comes to Inventory, UIIN-564 Closed represents the most typical combined search for Chalmers so that has a higher priority for us than other searches.

Comment by Cate Boerema (Inactive) [ 29/Apr/20 ]

Charlotte Whitt does the "query search" in inventory do this already?

Comment by Charlotte Whitt [ 29/Apr/20 ]

Cate Boerema - yes you can do boolean searches in the Inventory, using the Query search option:

E.g.:

  1. publication = "MIT Press" and publication = "c2004"
  2. title="the" not source ="MARC"
  3. subjects = "history" or identifiers = "OCoLC" not publication = "2017"
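The boolean queries above follow a simple pattern: clauses of the form index = "term" joined left to right by and, or, or not. A minimal sketch of composing such a query programmatically is shown below. The names BoolOp, Clause, quoteTerm and buildBooleanQuery are hypothetical and not part of any FOLIO module, and the escaping rule (backslash before embedded quotes and backslashes) is an assumption about how terms should be quoted.

```typescript
// Hypothetical helper types for illustration; not part of any FOLIO module.
type BoolOp = "and" | "or" | "not";

interface Clause {
  index: string; // CQL index name, e.g. "publication"
  term: string;  // search term; quoted when rendered
}

// Quote a term, escaping backslashes and embedded double quotes (assumed rule).
function quoteTerm(term: string): string {
  return '"' + term.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

// Render one clause as: index = "term"
function renderClause(clause: Clause): string {
  return `${clause.index} = ${quoteTerm(clause.term)}`;
}

// Join clauses left to right with the given boolean operators.
function buildBooleanQuery(first: Clause, rest: Array<[BoolOp, Clause]>): string {
  return rest.reduce(
    (query, [op, clause]) => `${query} ${op} ${renderClause(clause)}`,
    renderClause(first)
  );
}

// Reproduces the first example query from the list above.
const firstExample = buildBooleanQuery(
  { index: "publication", term: "MIT Press" },
  [["and", { index: "publication", term: "c2004" }]]
);
```

The sketch does no grouping with parentheses, so operator precedence is whatever the CQL backend applies to a flat left-to-right expression.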
Comment by Cate Boerema (Inactive) [ 29/Apr/20 ]

Thanks! So I guess this feature is about general boolean search capabilities for all apps. I was confused by the description which mentions Inventory as an example.

Comment by Mike Taylor [ 29/Apr/20 ]

(Can we call this something other than "query search"?)

Comment by Charlotte Whitt [ 29/Apr/20 ]

Hi Mike Taylor - we had to name it something which we could fit into the search option box, and made sense for the SMEs.
Query language search was an alternative, but that's a bit long - but I admit formally more correct.
CQL query - is also a possibility, but more technical.

Do you have better suggestions?

Comment by Mike Taylor [ 29/Apr/20 ]

It's difficult, I know!

But to my mind, every search is driven by a query, so "query search" is like "oxygen breathing" or "round circle". Almost anything would be better.

Maybe something like "Expert search"?

On the other hand, the technicality of "CQL search" may be a good thing — a signal that this isn't a field to just try typing random things into as you would with Google, but which you need to learn a specific technical thing in order to use.

Comment by Charlotte Whitt [ 29/Apr/20 ]

I kind of like "CQL search" too, but the tricky thing here is that the developers (suggested by Julian, Zak, Michail) would one day like to improve the usability of the 'Query search' and do some lightweight manipulation behind the scenes, and then it would no longer be core CQL.

Let me air possibilities with the MM-SIG, and also see if this has come up in usability test at Cornell, FLO etc.

But thanks for input Mike Taylor

Comment by Cate Boerema (Inactive) [ 08/Jun/20 ]

Marc Johnson how difficult would it be to add a query search option to Users and Requests (analogous to the query search in Inventory)? I am thinking we should create separate UXPRODs, one for each app to which we want to add it. Does that make sense?

Comment by Marc Johnson [ 08/Jun/20 ]

Cate Boerema

> how difficult would it be to add a query search option to Users and Requests (analogous to the query search in Inventory)?

Providing a query search is entirely front-end only work. I think Michal Kuklis or Zak Burke might be better placed to advise the effort involved.

The same performance limitations apply to these areas as to inventory; however, it is likely there are fewer database indexes in place for these areas of the system.

> I am thinking we should create separate UXPRODs, one for each app to which we want to add it. Does that make sense?

Yes, I think they seem like separate features to me.

Comment by Cate Boerema (Inactive) [ 10/Jun/20 ]

Zak Burke can you please provide an estimate for this feature assuming we implement a Query search in Users similar to what we have in Inventory? We'd need to modify the search and filter pane a bit to allow for this. I'm thinking something like this:

The only other option in the search type menu will be "Query search"

> The same performance limitations apply to these areas as to inventory; however, it is likely there are fewer database indexes in place for these areas of the system.

Marc Johnson, is there a way to add indexes preemptively as part of this feature? Do you have thoughts on which indexes we would add, or would we need more info from the SMEs on the kinds of searches they intend to conduct?

Am I correct to assume that, if there are no indexes, we might see slow performance? What about inaccurate result counts?

Comment by Cate Boerema (Inactive) [ 10/Jun/20 ]

Hi patty.wanninger. Could you please discuss the idea of adding a Query search to Users similar to the one in Inventory? I think that's what this feature was about, but I would like to double check. I've cobbled together a mockup (see above comment) that you can look at with them. It would also be useful to ask the SMEs what types of searches they would anticipate doing with a feature like this.

BTW, I have created a separate UXPROD for adding query search to Requests ( UXPROD-2458 Draft ) and have put this as a topic on the RA SIG agenda backlog so we can gather feedback from the RA SMEs (for both this feature and the Requests one).

Thanks much!

Comment by Julian Ladisch [ 10/Jun/20 ]

> What about inaccurate result counts?

The result counts/hit counts are estimates using the same algorithm as in inventory.

Comment by Marc Johnson [ 15/Jun/20 ]

Cate Boerema

> is there a way to add indexes preemptively as part of this feature? Do you have thoughts on which indexes we would add, or would we need more info from the SMEs on the kinds of searches they intend to conduct?

We would need more information about what kinds of searches that folks are likely to perform.

My understanding is that the intent of a CQL query search is to allow folks to make ad-hoc queries of their own choosing; this makes it hard to predict what searches they are going to do.

If there are strong patterns of what searches folks want to do, it might be worth considering making them specific search options rather than opening up a CQL query search option.

> Am I correct to assume that, if there are no indexes, we might see slow performance? What about inaccurate result counts?

Yes, a query that does not utilise an index will likely be slow and increase load on the system.

As Julian Ladisch says, the current estimation technique uses explain plans and statistics which are likely to be inaccurate if there isn't an appropriate index for the query.

This is a general challenge with the current search approach and with CQL query searches in particular as folks can perform any search they want.

Comment by Zak Burke [ 20/Jul/20 ]

I was playing around with this over the weekend and have a few thoughts:

  1. Providing direct access to CQL with no changes to the page is easy if we add logic like "search values that start with open-paren will be parsed as CQL and everything else uses the built-in query". That would mean a search-value like smith would be parsed into a query like
    ((username="smith*" or personal.firstName="smith*" or personal.lastName="smith*" or personal.email="smith*" or barcode="smith*" or id="smith*" or externalSystemId="smith*")) sortby personal.lastName personal.firstName
    

    as it currently does, but search values like (username==smith) or (personal.firstName==ezra and personal.lastName=cornell) would be fed straight to the API. TBH, there's no significant downside I can think of here.

  2. CQL is not especially user-friendly, and getting at the data you want requires knowing details about the shape of that data that are not exposed in the UI at present. e.g. to query users by exact-first-name, you would have to use the query (personal.firstName==smith). OK, easy enough, except you have to know about the personal. prefix, and firstName is case-sensitive; a search for (personal.firstname==smith) will fail. This problem is not unique to CQL; for any kind of data, e.g. a Jira query, you need to know what the field-names are. But with case-sensitive field-names like personal.lastName, we have this problem worse than most. In inventory, this is especially obvious for fields like ISSN, where search for the ISSN 6316800312 requires a query that nobody is going to figure out on their own:
    (identifiers =/@value/@identifierTypeId="8261054f-be78-422d-bd51-4ed9f33c3422" "6316800312") sortby title
    
  3. There are JS tools for lucene query syntax that will take a query like lastname:smith and parse it to an abstract syntax tree, which we could then map and reassemble into CQL. In English, that means we could map queries like firstname:smith to personal.firstName==smith or issn:6316800312 to identifiers =/@nasty CQL with very little effort because the hard part (defining the query grammar and writing the parser) is already done. The advantage of this is simpler queries. The disadvantage of this is the introduction of yet another query language.

I wrote POCs for #1 and #3. Personally, I see no reason not to implement #1 right away; there's no downside I can think of and the work is already done. #3 I think is at least worth discussing.
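Option #1 can be illustrated with a short routing function. This is a minimal sketch under stated assumptions, not the POC mentioned above: the function name buildUsersQuery is hypothetical, the index list and sort clause are copied from the expanded query quoted in item 1, and escaping only embedded double quotes in the default branch is an assumption.

```typescript
// Indexes and sort clause copied from the expanded default query quoted in item 1.
const DEFAULT_INDEXES: string[] = [
  "username",
  "personal.firstName",
  "personal.lastName",
  "personal.email",
  "barcode",
  "id",
  "externalSystemId",
];
const DEFAULT_SORT = "sortby personal.lastName personal.firstName";

// Hypothetical router: values starting with "(" are passed through as CQL,
// everything else is expanded into the built-in multi-index query.
// Empty input is not special-cased here.
function buildUsersQuery(searchValue: string): string {
  const value = searchValue.trim();

  if (value.charAt(0) === "(") {
    // Treated as raw CQL and handed to the API unchanged.
    return value;
  }

  // Default behaviour: right-truncated match against every default index.
  const escaped = value.replace(/"/g, '\\"');
  const clauses = DEFAULT_INDEXES.map((index) => `${index}="${escaped}*"`);
  return `((${clauses.join(" or ")})) ${DEFAULT_SORT}`;
}
```

With this rule, buildUsersQuery("smith") yields the expanded query from item 1, while buildUsersQuery("(username==smith)") is returned unchanged. Any pasted text that happens to begin with an open parenthesis would also be routed to the CQL branch.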

Comment by Marc Johnson [ 20/Jul/20 ]

Zak Burke

> That would mean a search-value like smith would be parsed into [an extended query like it currently does], but search values like (username==smith) or (personal.firstName==ezra and personal.lastName=cornell) would be fed straight to the API. TBH, there's no significant downside I can think of here.

If I'm understanding this correctly, from my perspective, some potential considerations of this approach are an increase in complexity of both the user experience and the system. This requires that the user understands how an entered search phrase is interpreted by the system e.g. they need to know that using parentheses short circuits the typical interpretation. They would need this tacit knowledge in order to use the system effectively.

I don't understand how this is preferable to providing an entirely separate option for searching using CQL, where users have explicitly chosen to embrace that complexity.

The standard considerations of exposing CQL directly to users apply to either of these equally. Primarily that it increases the likelihood of queries being used that do not align with database indexes and hence perform slower, and likely the desire to introduce additional database indexes.

> CQL is not especially user-friendly

As I understand it, it was chosen in order to allow it to be used directly by users.

> and getting at the data you want requires knowing details about the shape of that data that are not exposed in the UI at present

That is true, and is a part of the complexity of exposing a query language directly to users, especially one that is implicitly coupled to the structure of the underlying data.

> In inventory, this is especially obvious for fields like ISSN, where search for the ISSN 6316800312 requires a query that nobody is going to figure out on their own

This is a really good example of an impact of exposing a query language based upon the underlying data structure directly to the user.

> There are JS tools for lucene query syntax that will take a query like lastname:smith and parse it to an abstract syntax tree, which we could then map and reassemble into CQL. In English, that means we could map queries like firstname:smith to personal.firstName==smith or issn:6316800312 to identifiers =/@nasty CQL with very little effort because the hard part (defining the query grammar and writing the parser) is already done. The advantage of this is simpler queries. The disadvantage of this is the introduction of yet another query language.

Introducing a second query language specific to the reference UI in order to address perceived complexity in the existing query language would be a significant decision. Some of that depends upon if the users would find lucene syntax more or less familiar than CQL syntax.
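To make the trade-off concrete, the field-alias mapping quoted above (firstname:smith to personal.firstName==smith) can be sketched in a few lines. This is a deliberately simplified stand-in: it is a flat token translator, not an AST-based Lucene parser and not the POC referred to in this thread. The names FIELD_ALIASES and translateAliasQuery are hypothetical, only the two aliases that appear in the discussion are included, and values containing spaces or quotes are not supported.

```typescript
// Hypothetical alias table; only the aliases mentioned in the discussion.
const FIELD_ALIASES: { [alias: string]: string } = {
  firstname: "personal.firstName",
  lastname: "personal.lastName",
};

// Translate whitespace-separated field:value terms joined by AND / OR / NOT
// into a CQL expression using exact-match (==) relations.
function translateAliasQuery(query: string): string {
  const tokens = query.trim().split(/\s+/);

  const parts = tokens.map((token) => {
    const upper = token.toUpperCase();
    if (upper === "AND" || upper === "OR" || upper === "NOT") {
      // CQL boolean operators are written in lower case here.
      return upper.toLowerCase();
    }

    const colon = token.indexOf(":");
    if (colon <= 0 || colon === token.length - 1) {
      throw new Error(`Expected field:value, got "${token}"`);
    }

    // Aliases are matched case-insensitively, so users need not know
    // the case-sensitive CQL index names.
    const alias = token.slice(0, colon).toLowerCase();
    const value = token.slice(colon + 1);
    if (!Object.prototype.hasOwnProperty.call(FIELD_ALIASES, alias)) {
      throw new Error(`Unknown field "${alias}"`);
    }
    return `${FIELD_ALIASES[alias]}==${value}`;
  });

  return `(${parts.join(" ")})`;
}
```

For example, translateAliasQuery("firstname:ezra AND lastname:cornell") produces (personal.firstName==ezra and personal.lastName==cornell). Even this small version introduces a second syntax that has to be documented and kept in step with the CQL indexes, which is the cost raised above.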

If this were to be explored further, I think it would be worth at least considering support for custom indexes for CQL (which might allow for queries like issn==6316800312) or moving to a different single query language as alternatives to this.

Given that some folks are already exploring ElasticSearch, which I assume has its own query language, we should take that into account as well. A significant aspect of that work will be deciding whether to try to hide the presence of ElasticSearch behind the existing query language.

Comment by Marc Johnson [ 20/Jul/20 ]

Charlotte Whitt

> In the eHoldings app a query language is implemented, and the following search functionality is supported

Is this query language separate to CQL? If so, please could a reference to some details be provided.

Comment by Khalilah Gambrell [ 20/Jul/20 ]

Marc Johnson, eholdings app is powered by an EBSCO api

Comment by Marc Johnson [ 20/Jul/20 ]

Khalilah Gambrell Thanks

> eholdings app is powered by an EBSCO api

To make sure I followed that correctly, does that mean that the e-holdings app does not provide CQL support and instead directly exposes the EBSCO KB query language to users?

Comment by Mike Taylor [ 20/Jul/20 ]

IIRC, Julian Ladisch said recently that cql2pgjson already supports an ISBN index.

Comment by Khalilah Gambrell [ 20/Jul/20 ]

Marc Johnson, correct.

Comment by Marc Johnson [ 20/Jul/20 ]

Khalilah Gambrell Charlotte Whitt

Does that mean that this issue is to create a query language similar to the EBSCO API query language, separate to CQL, for the users UI?

Comment by Zak Burke [ 21/Jul/20 ]

The Lucene syntax example was meant to be no more than an example. IndexData wrote a JS CQL parser that we make use of in some unit tests and we could use that just as easily to parse the value from the search-box. I had forgotten about the CQL parser at the time.

To Marc Johnson's question, is this story "to create a query language ... separate to CQL, for the users UI?" I think it's either that or it's "expose CQL for the users UI". The latter is very easy to do.

  1. If we do it as in (1) above with some short-cut hint in the search field like starting open-parenthesis or cql:, then my estimation is "1 point, POC is already done". Marc raised the concern that this requires "tacit knowledge in order to use the system effectively" which is technically true, but which I think is easily mitigated with a link to a "how to use the search box effectively" page/popup/etc. This is how Google works, though they don't even link to their search syntax, which seems rude.
  2. If we dress up the search form like that for inventory with menu-options for "default search" and "CQL search" then we have to faff about with the form a bit but there's little additional work.
  3. If we want a syntax separate to CQL, on the one hand that's a whole different story; on the other, again, there's already a POC with Lucene syntax. Incidentally, Elasticsearch supports Lucene syntax as well. Surely a decision like this would need approval from some sort of committee and/or SIG.

Comment by patty.wanninger [ 22/Jul/20 ]

The User Management SIG has prioritized https://folio-org.atlassian.net/browse/UXPROD-2092 over this feature.

Comment by Zak Burke [ 23/Jul/20 ]

If we provide direct access to CQL, patty.wanninger, this and UXPROD-2092 Closed are nearly one and the same.

Comment by Charlotte Whitt [ 24/Jul/20 ]

Marc Johnson

> Does that mean that this issue is to create a query language similar to the EBSCO API query language, separate to CQL, for the users UI?

Short answer is: No. The query language we use in the FOLIO apps should be consistent. It would be very confusing for staff users if we implemented a different syntax in the Users app compared with the Inventory app.

Comment by Julian Ladisch [ 24/Jul/20 ]

> ISSN 6316800312 requires a query that nobody is going to figure out on their own:
>
> (identifiers =/@value/@identifierTypeId="8261054f-be78-422d-bd51-4ed9f33c3422" "6316800312") sortby title

It is trivial to provide issn=6316800312 syntax by adding an additional index to schema.json similar to the existing isbn index: https://github.com/folio-org/mod-inventory-storage/blob/v19.2.4/src/main/resources/templates/db_scripts/schema.json#L460-L469
We have https://folio-org.atlassian.net/browse/MODINVSTOR-475 for issn and similar fields.
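Independently of a backend index, the verbose identifier query quoted above can be generated by the UI rather than typed by hand. The sketch below shows only that string assembly. The names EXAMPLE_TYPE_ID, quoteCqlTerm and identifierQuery are hypothetical, the UUID is copied verbatim from the quoted query and should be treated as an example value, and the escaping rule is an assumption.

```typescript
// Identifier type UUID copied verbatim from the query quoted above;
// treat it as an example value, not as authoritative reference data.
const EXAMPLE_TYPE_ID = "8261054f-be78-422d-bd51-4ed9f33c3422";

// Quote a term, escaping backslashes and embedded double quotes (assumed rule).
function quoteCqlTerm(term: string): string {
  return '"' + term.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

// Assemble the identifier query in the same shape as the quoted example:
// (identifiers =/@value/@identifierTypeId="<type id>" "<value>") sortby title
function identifierQuery(typeId: string, value: string): string {
  return (
    "(identifiers =/@value/@identifierTypeId=" +
    quoteCqlTerm(typeId) +
    " " +
    quoteCqlTerm(value) +
    ") sortby title"
  );
}

const generated = identifierQuery(EXAMPLE_TYPE_ID, "6316800312");
```

A helper like this hides the syntax from the user but does nothing for performance; whether the query is served efficiently still depends on the indexes discussed above.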

Comment by Marc Johnson [ 24/Jul/20 ]

Julian Ladisch

> It is trivial to provide issn=6316800312 syntax by adding an additional index to schema.json similar to the existing isbn index

Thanks. I suggest moving the conversation about searching by identifier type to UIIN-1214 Open which appears to specifically cover that situation.

Comment by Julian Ladisch [ 24/Jul/20 ]

> search values that start with open-paren will be parsed as CQL

I disagree. Open parenthesis should not be a magic character that triggers a CQL search.

Given that I read about some instance in an email, PDF or on a website,
and I want to find the instance/holding/item information.
When I copy and paste the information into the search box,
then I expect that punctuation is ignored, for example parentheses.
Note: This applies to the simple search; there may be other search boxes for advanced search where punctuation is relevant.
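The expectation in the scenario above, that pasted punctuation is ignored by the simple search, could be met by normalising the input before the default query is built. A minimal sketch follows; the function name normalizeSimpleSearch is hypothetical and the exact set of characters treated as punctuation is an assumption.

```typescript
// Replace common punctuation (including parentheses) with spaces, then
// collapse runs of whitespace. The character set is an assumption.
function normalizeSimpleSearch(input: string): string {
  return input
    .replace(/[()\[\]{}"'.,;:!?]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Text copied from an email or web page keeps its words but loses punctuation.
const pasted = normalizeSimpleSearch("(Smith, John)");
```

Under this approach an opening parenthesis can never act as a trigger for CQL in the simple search box, so raw CQL would need its own explicit search option.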

Library of Congress has 4 different search boxes:

These different search options provide better usability than combining them into one.

Comment by Charlotte Whitt [ 24/Jul/20 ]

Hi Khalilah Gambrell and Filip Jakobsen - has something similar been considered?

> Library of Congress has 4 different search boxes:
>
> These different search options provide better usability than combining them into one.

Comment by patty.wanninger [ 17/Sep/20 ]

The user management SIG has ascertained that the current searching capabilities of the user app meet the requirements, so we are not pursuing stories for this ticket. 9/16/2020

Boolean searching IS available currently.

Comment by Debra Howell [ 29/Sep/20 ]

patty.wanninger and Charlotte Whitt - Requesting status of this feature. There seems to be a comment in the Workaround field rather than an approved workaround. Perhaps that comment could be moved to this Comment section and this feature could be re-evaluated?

Comment by Julian Ladisch [ 30/Sep/20 ]

Debra Howell The workaround field is labeled "potential workaround". 9 Institutions have ranked this feature as R1 or R2 and therefore indicate that the potential workaround is not sufficient for them.
People are asked to put their institution name in front of a workaround they are going to use. The workaround is not used.
The ranking shows that this feature is needed, no re-evaluation is needed.
We should keep the workaround section as it might be useful for other (new) libraries.

Comment by Anya [ 30/Sep/20 ]

Julian Ladisch, it has been said that if comments appear in the "potential workaround" field, the feature is not pulled by the capacity team.

Generated at Fri Feb 09 00:12:15 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.