Dictionary
- Token (or term) - a single entry in the index, produced by tokenizing and analyzing a string.
- For text fields (e.g. a title or sentences), each word or word stem is a token.
- For keyword fields (e.g. a barcode or other identifiers), the whole value is a single token.
- Field - a JSON field in an instance (or holding or item) that is indexed as a field in the Elastic document. Every field can have no value, a single value, or an array of values.
- Elastic document - a JSON string with fields that is stored in Elastic. Each field in Elastic has a mapping.
- Mapping - metadata describing how to index the value of a field. It contains information about the tokenizer, analyzer, etc.
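
The difference between text and keyword tokens can be sketched with a toy analyzer. This is only an illustration under stated assumptions: the stop-word list, the `naive_stem` rule, and all function names are hypothetical, and Elastic's real analyzers are far more sophisticated.

```python
import re

# Hypothetical stop-word list; real analyzers use language-specific lists.
STOP_WORDS = {"a", "an", "the", "of", "and", "or"}


def naive_stem(word):
    """Very rough stemmer (assumption): strips a trailing plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def analyze_text(value):
    """Text field: lowercase, split into words, drop stop words, stem."""
    words = re.findall(r"[a-z0-9]+", value.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]


def analyze_keyword(value):
    """Keyword field: the whole value is a single token."""
    return [value]


def tokens_for_field(value, analyzer):
    """A field may have no value, a single value, or an array of values."""
    if value is None:
        return []
    values = value if isinstance(value, list) else [value]
    return [token for v in values for token in analyzer(v)]
```

With this sketch, `analyze_text("The Lord of the Rings")` and `analyze_text("Lords of the Ring")` produce the same tokens, which is why a stemmed full-text search can match one against the other, while `analyze_keyword("12345678")` keeps the barcode intact.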
Efficiency level
Rank | Search complexity | Description |
---|---|---|
1 | O(1) in memory | Extremely fast (typically < 100 ms) |
2 | O(1) | Fast (typically < 500 ms) |
3 | O(log(n)) | Fast (typically < 1000 ms) |
4 | xtables | Fast enough (typically < 1000 ms) |
5 | O(n) | Slow (up to minutes on big datasets) |
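
To make the complexity classes in the table more concrete, here is a toy in-memory term dictionary. It is a sketch, not a description of how Elastic stores terms: the `postings` data and the function names are hypothetical, and the real index structures are different. It only shows why an exact lookup, a prefix lookup over sorted terms, and a full scan fall into different ranks.

```python
import bisect

# Toy term dictionary (hypothetical data): term -> ids of documents containing it.
postings = {
    "12345678": [1],
    "12345999": [2],
    "87654321": [3],
}
sorted_terms = sorted(postings)


def exact_match(term):
    """O(1): a hash lookup, independent of the number of terms."""
    return postings.get(term, [])


def prefix_match(prefix):
    """O(log(n)) to find the first candidate, then only matching terms are visited."""
    i = bisect.bisect_left(sorted_terms, prefix)
    hits = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        hits.extend(postings[sorted_terms[i]])
        i += 1
    return hits


def suffix_match(suffix):
    """O(n): term ordering does not help, so every term has to be inspected."""
    hits = []
    for term in sorted_terms:
        if term.endswith(suffix):
            hits.extend(postings[term])
    return hits
```

For example, `prefix_match("12345")` only touches the two terms that share the prefix, whereas `suffix_match("5678")` walks the whole dictionary.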
Use cases that will not be effective in Elastic
Function | Efficiency level | Indexed instance data | Search input text | Documentation | Postgres |
---|---|---|---|---|---|
Full text search for terms with stemming and stop-word filtering | 2 | { "title": "The Lord of the Rings", ...} | title = Lords of the Ring | ||
Keyword search (aka exact match). | 2 | {"barcode" : "12345678", ... } | barcode = 12345678 | ||
Full-text search for terms over all text fields and exact match over all keyword fields. Note: stemming and analysis are supported for various languages, but we need to use them only for a required, predefined list of languages. | 2 | { "title": "The Lord of the Rings", "publicNote" : "silver covering", "barcode" : "12345678" ... } | Lord of the Ring silver covers or Lord of the Ring 12345678 | ||
Range filter | 2 | { "createdDate" : "12-12-2020", ...} | |||
Autocomplete | 1 | ||||
Facets | 3 | ||||
Wildcard search with * on left and right: terms that start with, or terms that contain, the input | 3 in most cases; can be 2 if index_prefixes is on | {"hrid" : "12345678", ... } | barcode = 12345* | https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-prefix-query.html | |
Wildcard search with * on left | 5, but optimizations can make it 2 in certain cases | {"hrid" : "12345678", ... } | barcode = 12345678 | https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html or maybe https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html |
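
The use cases in the table above roughly correspond to the following Elasticsearch Query DSL request bodies, written here as Python dictionaries. This is a hedged sketch: the helper function names are hypothetical, the date bounds, the `dd-MM-yyyy` format, and the `*5678` pattern are assumptions made for illustration, and `TITLE_MAPPING` assumes that `index_prefixes` is enabled on a text field named `title`.

```python
import json


def match_query(field, text):
    """Full-text search: the input text is analyzed before matching."""
    return {"query": {"match": {field: text}}}


def term_query(field, value):
    """Keyword search (exact match)."""
    return {"query": {"term": {field: value}}}


def range_query(field, gte, lte, date_format):
    """Range filter; the bounds and format are supplied by the caller."""
    return {"query": {"range": {field: {"gte": gte, "lte": lte, "format": date_format}}}}


def prefix_query(field, prefix):
    """Wildcard on the right, e.g. 12345*."""
    return {"query": {"prefix": {field: {"value": prefix}}}}


def wildcard_query(field, pattern):
    """General wildcard, including a leading *, which is the slow case."""
    return {"query": {"wildcard": {field: {"value": pattern}}}}


# Mapping fragment (assumption): index_prefixes is a mapping option of text fields.
TITLE_MAPPING = {
    "properties": {
        "title": {"type": "text", "index_prefixes": {"min_chars": 2, "max_chars": 5}}
    }
}

if __name__ == "__main__":
    examples = [
        match_query("title", "Lords of the Ring"),
        term_query("barcode", "12345678"),
        range_query("createdDate", "01-01-2020", "31-12-2020", "dd-MM-yyyy"),
        prefix_query("barcode", "12345"),
        wildcard_query("barcode", "*5678"),
    ]
    for body in examples:
        print(json.dumps(body))
```

Building the bodies as plain dictionaries keeps the sketch independent of any particular client library; each body could be sent to the `_search` endpoint of an index with whatever HTTP client is in use.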