The completion suggester provides auto-complete/search-as-you-type functionality. This is a navigational feature to guide users to relevant results as they are typing, improving search precision. It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters.
MSEARCH-13
For SPIKE MSEARCH-13, a suggestion endpoint has been implemented in the branch feature/msearch-13. The controller sends a suggest request to Elasticsearch and accepts 2 query parameters: query (the suggestion prefix to analyze) and limit (the maximum number of suggestions, default value is 5).
-XGET .../search/instances/suggestions?query=book&limit=5
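As a minimal sketch of how a client might assemble this request: the helper below only builds the URL from the two documented parameters. The function name and the base URL (http://localhost:8081) are hypothetical placeholders, since the host part of the endpoint is elided above.

```python
from urllib.parse import urlencode


def build_suggestions_url(base_url, query, limit=5):
    """Build the suggestions URL for a given prefix.

    base_url is a hypothetical placeholder for the elided host part;
    query and limit are the two parameters described above.
    """
    params = urlencode({"query": query, "limit": limit})
    return f"{base_url.rstrip('/')}/search/instances/suggestions?{params}"


# Hypothetical host, used only for illustration.
print(build_suggestions_url("http://localhost:8081", "book"))
```

Leaving out limit falls back to the documented default of 5.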
Required Elasticsearch index mappings for suggestion field
{
  "suggest": {
    "type": "completion",
    "analyzer": "simple",
    "max_input_length": "50"  # terms longer than 50 characters will be truncated to reduce memory consumption
  }
}
Other fields can be copied to this field using the copy_to functionality in the resource metadata description:
{
  ...
  "title": {
    "searchTypes": "sort",
    "inventorySearchTypes": [ "title", "keyword" ],
    "index": "multilang",
    "showInResponse": true,
    "mappings": {
      "copy_to": [ "sort_title", "suggest" ]
    }
  },
  ...
}
Elasticsearch suggest query:
{
  "from": 0,
  "size": 0,
  "_source": false,
  "suggest": {
    "completion": {
      "prefix": "book",            # suggestion query prefix
      "completion": {              # type of the suggestion
        "field": "suggest",        # field that will be used as the source of suggestions (required)
        "size": 5,                 # number of suggest terms to return
        "skip_duplicates": true    # removes duplicates from the result
      }
    }
  }
}
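The query above can be assembled from the two endpoint parameters, and the suggested terms read back from the response. The sketch below is illustrative only: the function names are hypothetical, the sample response is hand-written rather than captured from a real cluster, and it assumes the usual completion suggester response layout (one entry per suggestion name, each with a list of options carrying a text value).

```python
def build_completion_query(prefix, size=5):
    """Return the suggest request body for the given prefix."""
    return {
        "from": 0,
        "size": 0,          # no regular search hits are needed
        "_source": False,   # source documents are not needed either
        "suggest": {
            "completion": {                    # name of the suggestion
                "prefix": prefix,
                "completion": {                # type of the suggestion
                    "field": "suggest",
                    "size": size,
                    "skip_duplicates": True,
                },
            }
        },
    }


def extract_suggestions(response):
    """Collect the text of every option returned under suggest.completion."""
    entries = response.get("suggest", {}).get("completion", [])
    return [opt["text"] for entry in entries for opt in entry.get("options", [])]


# Hand-written sample response, for illustration only.
sample_response = {
    "suggest": {
        "completion": [
            {
                "text": "book",
                "offset": 0,
                "length": 4,
                "options": [{"text": "book of hours"}, {"text": "bookbinding"}],
            }
        ]
    }
}
print(extract_suggestions(sample_response))
```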
Performance results of completion query:
- 2.5 million instances indexed
- Elasticsearch requires 2500 MB of Java heap to store the completion data
- Response time from Elasticsearch: ~8-10 ms
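To put the heap figure in perspective, the numbers above work out to roughly 1 KB of heap per indexed instance. The small calculation below is only a back-of-the-envelope check; it assumes "mb" means 1024 * 1024 bytes, and the function name is hypothetical.

```python
def heap_bytes_per_instance(heap_mb, instance_count):
    """Average heap used per indexed instance, in bytes.

    Assumes 1 MB = 1024 * 1024 bytes.
    """
    return heap_mb * 1024 * 1024 / instance_count


per_instance = heap_bytes_per_instance(2500, 2_500_000)
print(f"{per_instance:.0f} bytes per instance")
```

This is an average over the measured data set only; it does not show how heap usage would scale for other index sizes.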
MSEARCH-119
SPIKE MSEARCH-119 assumes that there is a way to return suggest results using a wildcard or prefix query.
Elasticsearch field mapping
"suggest": {
  "type": "keyword",
  "normalizer": "keyword_lowercase",
  "store": true
}
Elasticsearch query
{
  "from": 0,
  "size": 0,
  "query": {
    "prefix": {
      "keyword_suggest": { "value": "wit" }
    }
  },
  "_source": false,
  "stored_fields": [ "keyword_suggest" ]
}
The response contains the stored values of the keyword_suggest field for each hit. Java code then goes through these values and uses the startsWith() method to retrieve the relevant suggest terms.
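The filtering step can be sketched as follows. The original implementation is Java and uses startsWith(); this Python version only illustrates the same idea and is not the code from the spike. It assumes that stored fields come back under hits.hits[].fields as lists of values, and that a hit may carry several values of which only some begin with the prefix. The function name and the sample response are hypothetical.

```python
def pick_suggestions(response, prefix, limit=5):
    """Keep stored keyword_suggest values that start with the prefix.

    The prefix is lowercased to match the lowercased stored values;
    duplicates are dropped and at most `limit` terms are returned.
    """
    prefix = prefix.lower()
    seen = set()
    result = []
    for hit in response.get("hits", {}).get("hits", []):
        for value in hit.get("fields", {}).get("keyword_suggest", []):
            if value.startswith(prefix) and value not in seen:
                seen.add(value)
                result.append(value)
                if len(result) == limit:
                    return result
    return result


# Hand-written sample response, for illustration only.
sample_hits = {
    "hits": {
        "hits": [
            {"fields": {"keyword_suggest": ["witness for the prosecution", "agatha christie"]}},
            {"fields": {"keyword_suggest": ["the witcher", "witcher saga"]}},
            {"fields": {"keyword_suggest": ["witness for the prosecution"]}},
        ]
    }
}
print(pick_suggestions(sample_hits, "Wit"))
```

Values such as "the witcher" are skipped because they do not begin with the prefix, even though another value of the same document matched the query.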
Disadvantages of this approach:
- It returns N arbitrary documents from the Elasticsearch index with no relevance ranking (score=1 for all search hits)
- With the copy_to functionality, all values are returned in lowercase
Performance results:
- 2.5 million instances indexed
- Response time from Elasticsearch: ~20-50 ms
- Reindexing is slightly faster and does not require much Java heap
Performance test results: