Dictionary
- Token (or term) - a single entry in the index, produced by tokenizing and analyzing a string.
  - For text fields (e.g. a title or sentences), each word or word stem is a token.
  - For keyword fields (e.g. a barcode or other identifiers), the whole value is a single token.
- Field - a JSON field in an instance (or holding or item), which is indexed as a field in the Elastic document. Every field can have no value, a single value, or an array of values.
- Elastic document - a JSON string with fields, which is stored in Elastic. For each field in Elastic there is a mapping.
- Mapping - metadata describing how to index the value of a field. Contains information about the tokenizer, analyzer, etc.
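The difference between text and keyword fields can be illustrated with a toy analyzer. This is only a sketch: the mapping below uses the real Elasticsearch field types `text` and `keyword` and the built-in `english` analyzer, but the field choices, the stop-word list, and the `naive_stem` helper are assumptions made for illustration and are far simpler than what Elasticsearch actually does.

```python
import re
from typing import List

# Hypothetical mapping: field names and analyzer choice are assumptions.
EXAMPLE_MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "english"},
            "barcode": {"type": "keyword"},
        }
    }
}

# Tiny illustrative stop-word list (not the list Elasticsearch uses).
STOP_WORDS = {"the", "of", "a", "an", "and"}


def naive_stem(word: str) -> str:
    """Very rough stemmer: strips a trailing 's' from longer words."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def analyze_text(value: str) -> List[str]:
    """Toy text analysis: lowercase, split into words, drop stop words, stem."""
    words = re.findall(r"[a-z0-9]+", value.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]


def analyze_keyword(value: str) -> List[str]:
    """Keyword fields are not analyzed: the whole value is one token."""
    return [value]


def tokens_for(field: str, value: str) -> List[str]:
    """Pick the analysis path based on the field type in the mapping."""
    field_type = EXAMPLE_MAPPING["mappings"]["properties"][field]["type"]
    return analyze_text(value) if field_type == "text" else analyze_keyword(value)
```

With this sketch, `tokens_for("title", "The Lord of the Rings")` and `tokens_for("title", "Lords of the Ring")` yield the same tokens, which is why a full-text query can match despite different word forms, while `tokens_for("barcode", "12345678")` keeps the value as a single token.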
Efficiency level
Rank | Search complexity | Description |
---|---|---|
1 | O(1) in memory | Extremely fast (e.g. typically < 100 ms) |
2 | O(1) | Fast (e.g. typically < 500 ms) |
3 | O(log(n)) | Fast (e.g. typically < 1000 ms) |
4 | xtables | Fast enough (e.g. typically < 1000 ms) |
5 | O(n) | Slow (up to minutes on big datasets) |
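To make the gap between the O(1) and O(n) ranks concrete, here is a minimal sketch comparing a term lookup in an inverted index with a full scan. The documents and function names are hypothetical, and the real term dictionaries used by Elasticsearch (via Lucene) are more elaborate than a Python dict.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical documents: id -> already analyzed tokens.
DOCS: Dict[int, List[str]] = {
    1: ["lord", "ring"],
    2: ["silver", "covering"],
    3: ["ring", "silver"],
}


def build_inverted_index(docs: Dict[int, List[str]]) -> Dict[str, Set[int]]:
    """Map every token to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, tokens in docs.items():
        for token in tokens:
            index[token].add(doc_id)
    return dict(index)


def lookup(index: Dict[str, Set[int]], term: str) -> Set[int]:
    """Exact term match: one hash lookup, O(1) on average."""
    return index.get(term, set())


def scan(docs: Dict[int, List[str]], substring: str) -> Set[int]:
    """Substring match: every token of every document is inspected, O(n)."""
    return {
        doc_id
        for doc_id, tokens in docs.items()
        if any(substring in token for token in tokens)
    }


INDEX = build_inverted_index(DOCS)
```

The exact lookup only works for whole tokens; anything that cannot be expressed as a token lookup (such as an arbitrary substring) falls back to the scan, which is what makes it slow on big datasets.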
Use cases that will be effective in Elastic
Function | Efficiency level | Indexed instance data | Search input text | Documentation |
---|---|---|---|---|
Full-text search for terms with stemming and stop-word filtering | 2 | { "title": "The Lord of the Rings", ...} | title = Lords of the Ring | |
Keyword search (aka exact match) | 2 | {"barcode" : "12345678", ... } | barcode = 12345678 | |
Full-text search for terms across all text fields, with exact match on all keyword fields | 2 | { "title": "The Lord of the Rings", "publicNote" : "silver covering", "barcode" : "12345678" ... } | Lord of the Ring silver covers or Lord of the Ring 12345678 | |
Range filter | 2 | { "date" : "s" | | |
Autocomplete | 1 | | | |
Facets | 3 | | | |
Result count | 2 | | | |
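As a rough sketch of how several of these functions could be expressed, the helpers below build request bodies using standard Elasticsearch Query DSL clauses (`match`, `term`, `multi_match`, `range`, and a `terms` aggregation). The helper names, the choice of clause per function, and any field names or values passed in are assumptions for illustration; the source does not state which queries are actually used.

```python
from typing import Any, Dict, List


def full_text_query(field: str, text: str) -> Dict[str, Any]:
    """Full-text search: the input is analyzed the same way as the text field."""
    return {"query": {"match": {field: text}}}


def keyword_query(field: str, value: str) -> Dict[str, Any]:
    """Exact match on a keyword field: the value is compared as one token."""
    return {"query": {"term": {field: value}}}


def all_fields_query(text: str, fields: List[str]) -> Dict[str, Any]:
    """Search the same input across several text and keyword fields."""
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


def range_filter(field: str, gte: str, lte: str) -> Dict[str, Any]:
    """Range filter placed in filter context, so it does not affect scoring."""
    return {
        "query": {
            "bool": {"filter": [{"range": {field: {"gte": gte, "lte": lte}}}]}
        }
    }


def facet_query(field: str, size: int = 10) -> Dict[str, Any]:
    """Facets as a terms aggregation; size 0 skips returning the hits."""
    return {
        "size": 0,
        "aggs": {"by_" + field: {"terms": {"field": field, "size": size}}},
    }
```

For example, `full_text_query("title", "Lords of the Ring")` corresponds to the first row of the table and `keyword_query("barcode", "12345678")` to the second; the resulting dictionaries can be serialized to JSON and sent as a search request body.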
Use cases that will not be effective in Elastic
Function | Efficiency level | Indexed instance data | Search input text | Documentation |
---|---|---|---|---|
Wildcard search with * on the right (terms that start with the given text) | In most cases 3; can be 2 if index_prefixes is on | {"hrid" : "12345678", ... } | barcode = 12345* | https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-prefix-query.html |
Wildcard search with * on the left | 5, but there can be optimizations that make it 2 for certain cases | {"hrid" : "12345678", ... } | barcode = 12345678 | https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html or maybe https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html |
...
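The ranking of the wildcard cases above can be sketched with a sorted list of terms. A trailing wildcard can use binary search to jump to the first candidate, while a leading wildcard has to inspect every term. The last function shows one possible optimization, searching a second list of reversed terms; whether this is the optimization the table refers to is an assumption, and all names and sample values here are hypothetical.

```python
from bisect import bisect_left
from typing import List

# Hypothetical term dictionary, kept sorted.
TERMS: List[str] = sorted(["12345678", "12349999", "22345678", "99912345"])

# Second dictionary holding each term reversed, also sorted.
REVERSED_TERMS: List[str] = sorted(term[::-1] for term in TERMS)


def prefix_search(sorted_terms: List[str], prefix: str) -> List[str]:
    """Trailing wildcard (prefix*): binary search, then read adjacent matches."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order guarantees no later term can match
        matches.append(term)
    return matches


def suffix_search_scan(terms: List[str], suffix: str) -> List[str]:
    """Leading wildcard (*suffix): every term must be checked, O(n)."""
    return [term for term in terms if term.endswith(suffix)]


def suffix_search_reversed(sorted_reversed_terms: List[str], suffix: str) -> List[str]:
    """Leading wildcard turned into a prefix search on reversed terms."""
    hits = prefix_search(sorted_reversed_terms, suffix[::-1])
    return sorted(term[::-1] for term in hits)
```

The reversed-term variant trades extra index size for speed, which matches the table's note that a leading wildcard is rank 5 by default but can be made faster in certain cases.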