Search result suggestions
The completion suggester provides auto-complete/search-as-you-type functionality. This is a navigational feature to guide users to relevant results as they are typing, improving search precision. It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters.
MSEARCH-13
For SPIKE MSEARCH-13, a suggestion endpoint has been implemented in the branch feature/msearch-13. The controller performs suggest requests against Elasticsearch using two query parameters: query (the suggestion prefix to analyze) and limit (default value is 5).
-XGET .../search/instances/suggestions?query=book&limit=5
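A minimal sketch of how a client might assemble this request URL in Java. The class name, method name, and the base URL passed in are hypothetical; the host and path prefix are elided in the example above, so they are left to the caller here. Only the path suffix and the two query parameters come from the description.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SuggestionRequest {
    // Default limit as stated in the endpoint description.
    static final int DEFAULT_LIMIT = 5;

    // baseUrl is an assumption: it stands in for the elided ".../" prefix.
    static URI suggestionsUri(String baseUrl, String query, int limit) {
        // URL-encode the prefix so spaces and special characters are safe in the query string.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/search/instances/suggestions?query=" + encoded + "&limit=" + limit);
    }

    public static void main(String[] args) {
        // Hypothetical local address, for illustration only.
        System.out.println(suggestionsUri("http://localhost:8081", "book", DEFAULT_LIMIT));
    }
}
```

Encoding the prefix matters for search-as-you-type input, because users will routinely type spaces and punctuation.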
Required Elasticsearch index mappings for suggestion field
{
"suggest": {
"type": "completion",
"analyzer": "simple",
"max_input_length": "50" # terms longer than 50 characters will be truncated to reduce memory consumption
}
}
Other fields can be copied to this field using copy_to functionality in resource metadata description:
{
...
"title": {
"searchTypes": "sort",
"inventorySearchTypes": [ "title", "keyword" ],
"index": "multilang",
"showInResponse": true,
"mappings": {
"copy_to": [ "sort_title", "suggest" ]
}
},
...
}
Elasticsearch suggest query:
{
"from": 0,
"size": 0,
"_source": false,
"suggest": {
"completion": {
"prefix": "book", # suggestion query prefix
"completion": { # type of the suggestion
"field": "suggest", # field, that will be used as source of suggestions (required)
"size": 5, # number of suggest terms to return
"skip_duplicates": true # removes duplicates from result
}
}
}
}
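The query body above could be assembled from the endpoint parameters roughly as follows. This is a sketch only: the class and method names are hypothetical, and it builds the JSON by hand to stay dependency-free, whereas a real service would more likely use its Elasticsearch client or a JSON library.

```java
public class CompletionQuery {
    // Minimal JSON string escaping for the user-supplied prefix
    // (quotes, backslashes, and control characters).
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // prefix maps to the "query" parameter, size to the "limit" parameter.
    static String build(String prefix, int size) {
        return "{\"from\":0,\"size\":0,\"_source\":false,"
            + "\"suggest\":{\"completion\":{"
            + "\"prefix\":\"" + escape(prefix) + "\","
            + "\"completion\":{\"field\":\"suggest\",\"size\":" + size + ",\"skip_duplicates\":true}"
            + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(build("book", 5));
    }
}
```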
Performance results of completion query:
- Indexed 2.5 million instances
- Elasticsearch requires 2500 MB of Java heap to store completion data
- Response time ~8-10ms from Elasticsearch
MSEARCH-119
SPIKE MSEARCH-119 assumes that there is a way to return suggest results using a wildcard or prefix query.
Elasticsearch field mapping
"keyword_suggest": {
"type": "keyword",
"normalizer": "keyword_lowercase",
"store": true
}
Elasticsearch query
{
"from": 0,
"size": 0,
"query": {
"prefix": {
"keyword_suggest": {
"value": "wit"
}
}
},
"_source": false,
"stored_fields": [ "keyword_suggest" ]
}
The response contains the stored values of the keyword_suggest field for each search hit. Java code then retrieves the relevant suggest terms from these values using the startsWith() method.
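A minimal sketch of that post-filtering step. The class name, method name, and sample values are hypothetical. It assumes the stored values are already lowercase (because of the keyword_lowercase normalizer) and that a hit can carry several keyword_suggest values, only some of which start with the prefix.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class SuggestTermFilter {
    // Keeps the values that start with the prefix, drops duplicates,
    // and stops once `limit` terms have been collected.
    static List<String> filter(List<String> storedValues, String prefix, int limit) {
        // Assumption: stored values are lowercase, so the prefix is lowercased to match.
        String normalized = prefix.toLowerCase(Locale.ROOT);
        Set<String> terms = new LinkedHashSet<>(); // preserves the order of the hits
        for (String value : storedValues) {
            if (terms.size() >= limit) {
                break;
            }
            if (value != null && value.startsWith(normalized)) {
                terms.add(value);
            }
        }
        return new ArrayList<>(terms);
    }

    public static void main(String[] args) {
        List<String> values = List.of("witness", "the witcher", "witness", "with love");
        System.out.println(filter(values, "Wit", 5));
    }
}
```

The returned order simply follows the order of the hits, because the prefix query assigns the same score to every document.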
Disadvantages of this approach:
- It returns N arbitrary documents from the Elasticsearch index with no relevance ranking (score=1 for all search hits)
- With the copy_to functionality, all values are returned in lowercase
- Fuzzy search is not possible; suggestions can be provided only by exact prefix match
Performance results:
- Indexed 2.5 million instances
- Response time ~20-50ms from Elasticsearch
- The reindexing process is slightly faster and does not require a lot of Java heap
Performance test results: