Codex search results are taking Nonfiling characters into account when sorting
Description
Activity
Holly Mistlebauer December 21, 2021 at 6:53 PM
This ticket has been closed because it is over 3 years old and has a very low priority.

Mike Taylor August 21, 2018 at 2:09 PM
Strongly agree. These conversations got into a lot of unnecessary complexity by trying to solve the difficult case of the Codex before having solved the (relatively!) easy case of the local inventory.

Jakub Skoczen August 21, 2018 at 9:25 AM (edited)
Guys, I'd like to make sure we are clear about the scope of what can (and will) be done vs. what is outside of Core Team control. I suggest addressing this particular issue in two stages:
1. Stage 1: address sort and search issues in Inventory (and other modules that index data locally in FOLIO). Relevant issues here are FOLIO-1246 (which is an umbrella for more powerful search functionality including ranking, stopwords etc.) and (which is about ensuring that tenant locale is used for driving the DB collation setting and will address locale-specific sorting issues).
2. Stage 2: address sort and search issues in the Codex Search app. Here we are generally limited by the quality of results from the upstream sources, one of which we control directly (Inventory), while for the other (EBSCO KB) we can request certain tuning.
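To illustrate what "locale-specific sorting" means for Stage 1, here is a minimal Python sketch. It is not how FOLIO or the database does it; the `ALPHABETS` table and the `collation_key` function are hypothetical stand-ins for a real collation (a database collation setting or ICU). It only shows that the same titles sort differently depending on the tenant locale: in Swedish, å, ä and ö are separate letters after z, while in English they are treated as accented variants of a and o.

```python
import unicodedata

# Hypothetical alphabet tables, for illustration only.
# A real system would rely on the database collation or a library such as ICU.
ALPHABETS = {
    "en": "abcdefghijklmnopqrstuvwxyz",
    "sv": "abcdefghijklmnopqrstuvwxyzåäö",
}


def collation_key(text, locale_code):
    """Return a sort key that ranks characters by the locale's alphabet."""
    rank = {ch: i for i, ch in enumerate(ALPHABETS[locale_code])}
    key = []
    for ch in unicodedata.normalize("NFC", text).casefold():
        if ch in rank:
            key.append(rank[ch])
        else:
            # Not a letter of this alphabet: fall back to the base letter
            # (for example å -> a in English); anything else sorts first.
            base = unicodedata.normalize("NFD", ch)[0]
            key.append(rank.get(base, -1))
    return key


words = ["ärlig", "zebra", "åska", "apa"]
swedish_order = sorted(words, key=lambda w: collation_key(w, "sv"))
english_order = sorted(words, key=lambda w: collation_key(w, "en"))
```

With the Swedish table, "zebra" sorts before "åska" and "ärlig"; with the English table, it sorts last.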

Mike Taylor February 21, 2018 at 10:31 PM
There are basically two approaches to searching multiple sources at once.
1. Harvest everything into one big database and search that.
2. Search in real time and merge the results.
There are advantages and disadvantages to each approach. #1 needs more up-front effort and more sysadmin, but yields faster and more consistent results. This is what Summon does. #2 is more lightweight, but slower and dependent on the capability of the sources.
The Codex is a type-2 solution.
We would perhaps like to do a type-1 solution, but the fundamental problem is that we can't in general harvest all the things we want. For example, the EBSCO KB is proprietary and not available for harvesting. So for now at least, this is a non-starter.
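As a rough illustration of why a type-2 solution depends on the capability of the sources, here is a small Python sketch of merging already-sorted result lists. The function name and the sample titles are made up for the example, and this is not the Codex implementation. The merge only produces a correctly sorted list if every source sorted its own results by the same rule.

```python
import heapq


def merge_results(result_lists, sort_key):
    """Merge result lists that are each assumed to be sorted by sort_key."""
    return list(heapq.merge(*result_lists, key=sort_key))


# Hypothetical titles standing in for hits from two sources.
inventory_hits = ["alpha", "gamma"]
kb_hits = ["beta", "delta"]

# Both sources sorted by the same rule: the merged list is sorted too.
merged = merge_results([inventory_hits, kb_hits], str.casefold)

# One source sorted by a different rule (here: ignoring a leading "The "):
# the merge cannot repair that, so the combined order is inconsistent.
kb_hits_other_rule = ["The beta", "delta"]
inconsistent = merge_results([inventory_hits, kb_hits_other_rule], str.casefold)
```

In the second case "delta" ends up after "The beta", which is wrong under either sorting rule.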

Theodor Tolstoy (One-Group.se) February 21, 2018 at 10:25 PM
That is true, but I think you can get a long way using automated approaches.
Maybe this is not the best place to ask this question, but why is there not a Search engine in Codex?
Overview: When conducting title-level searches in Codex, the sort algorithm seems to take definite articles and other nonfiling characters into consideration. This seems to be true for both Swedish and English.
Steps to Reproduce:
Create a couple of records in Inventory with titles starting with a, å, ä or similar
For example:
"Den aktansvärda"
"Den äkta varan"
"Den åländska skärgården"
"The Åland archipelago"
"Ålöndska skärgården"
"The Aland archipelago"
Go to Codex and conduct a title search for åland
Sort the results on title in ascending order (arrow pointing up)
Expected Results:
Search results sorted in the following order:
The Aland archipelago ("The " should be disregarded)
Northern Territories, Asia-Pacific Regional Conflicts and the Åland...
A User’s Guide to the Nestle-Aland 28 Greek New Testament ("A " should be disregarded)
Ware Conterfeyhung eines abscheulichen Aland Fisches...
The Åland archipelago ("The " should be disregarded)
Den åländska skärgården ("Den " should be disregarded since it is a Swedish definite article)
Note: Not all of these results (the result items themselves) are expected to appear. Setting that aside, the point is that the nonfiling characters have been taken into account in the sort.
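A minimal Python sketch of what "disregarding" the nonfiling characters could look like when building a sort key. The article list and the `filing_title` function are assumptions made for this example, not something taken from Codex or Inventory. MARC records can also carry an explicit nonfiling character count (second indicator of field 245), which would be more reliable than guessing from a word list.

```python
# Hypothetical list of leading articles per language, for illustration only.
LEADING_ARTICLES = {
    "en": ("the ", "a ", "an "),
    "sv": ("den ", "det ", "en ", "ett "),
}


def filing_title(title, language):
    """Return the title without a leading article for the given language."""
    lowered = title.casefold()
    for article in LEADING_ARTICLES.get(language, ()):
        if lowered.startswith(article):
            return title[len(article):]
    return title


# Titles from the steps to reproduce, with an assumed language for each.
titles = [
    ("The Åland archipelago", "en"),
    ("Den äkta varan", "sv"),
    ("Ålöndska skärgården", "sv"),
    ("The Aland archipelago", "en"),
]
filing_forms = [filing_title(title, language) for title, language in titles]
```

The resulting filing forms would then be passed to whatever locale-aware collation is in use; this sketch only removes the article and does not decide how å and ä are ordered.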
Actual Results:
See attached image