#
Elasticsearch
The Elasticsearch plugin provides Rosetta with functionality relating to the Elasticsearch search engine.
#
Dependency Information
<dependency>
    <groupId>com.k-int.rosetta</groupId>
    <artifactId>rosetta-elasticsearch</artifactId>
    <version>3.0.0</version>
</dependency>
#
Providers
#
elasticsearch
Entity Kind: Provider
Type: elasticsearch
The elasticsearch Provider allows Rosetta to communicate with Elasticsearch clusters. Specifically, it interprets the object contained in the request field of a DataServiceRequest as an Elasticsearch search request, which is then executed against the _search endpoint for a specific index, or for all indices if none is specified.
The response is processed such that:
- The _source field of each hit is registered as a provider result.
- The hits.total.value field is used as the total value within the provider statistics.
- Aggregations are processed into a custom provider field, aggregations, which is a list of Rosetta Aggregation objects. Currently only terms, date range and filters aggregations are supported.
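The response handling described above can be sketched in Python as follows. This is a minimal illustration, not the plugin's implementation: the function name process_search_response and the name/entries/key/count shape used for aggregations are hypothetical stand-ins for Rosetta's own result and Aggregation types.

```python
from typing import Any


def process_search_response(response: dict[str, Any]) -> dict[str, Any]:
    """Split a raw Elasticsearch search response into results, total and aggregations."""
    hits = response.get("hits", {})

    # The _source of each hit becomes one provider result.
    results = [hit["_source"] for hit in hits.get("hits", []) if "_source" in hit]

    # hits.total.value is used as the "total" statistic.
    total = hits.get("total", {}).get("value", 0)

    # Flatten bucketed aggregations into a list. The name/entries/key/count
    # shape is hypothetical, not Rosetta's actual Aggregation class.
    aggregations = []
    for name, agg in response.get("aggregations", {}).items():
        buckets = agg.get("buckets", [])
        if isinstance(buckets, dict):
            # Named filters aggregations return buckets keyed by filter name.
            buckets = [{"key": key, **bucket} for key, bucket in buckets.items()]
        entries = [{"key": b.get("key"), "count": b.get("doc_count", 0)} for b in buckets]
        aggregations.append({"name": name, "entries": entries})

    return {"results": results, "total": total, "aggregations": aggregations}
```

Terms and date range aggregations return their buckets as a list, while a filters aggregation with named filters returns them as an object keyed by filter name, which is why the sketch normalises both forms.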
#
Properties
#
Example
name: my-provider
type: elasticsearch
properties:
  protocol: http
  host: localhost
  port: 9200
  path_prefix: /es/base/path/
  index: my-index
  skip_ping: true
  username: rosetta
  password: password
  client_timeout: 60000
  aggregations_key: aggs
  date_range_format: '{from} to {to}'
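To illustrate how the connection properties relate to the _search endpoint, the following Python sketch assembles the URL the Provider would plausibly call. How the plugin actually joins path_prefix and index is an assumption here, and build_search_url is a hypothetical helper, not part of Rosetta.

```python
def build_search_url(properties: dict) -> str:
    """Assemble a _search URL from provider properties (illustrative only)."""
    base = f"{properties['protocol']}://{properties['host']}:{properties['port']}"

    # Assumption: path_prefix is optional and surrounding slashes are ignored.
    prefix = properties.get("path_prefix", "").strip("/")

    # With no index configured, the search runs against all indices.
    index = properties.get("index")

    segments = [segment for segment in (prefix, index, "_search") if segment]
    return base + "/" + "/".join(segments)


example = {
    "protocol": "http",
    "host": "localhost",
    "port": 9200,
    "path_prefix": "/es/base/path/",
    "index": "my-index",
}
print(build_search_url(example))
# http://localhost:9200/es/base/path/my-index/_search
```

Omitting index from the properties yields a URL ending in /es/base/path/_search, which Elasticsearch treats as a search across all indices.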
#
Glyphs
#
elastic-generic-search
Entity Kind: Glyph
Type: elastic-generic-search
The elastic-generic-search Glyph transforms a GenericSearchRequest into an Elasticsearch search request. The top-level query generated by this Glyph is a boolean Elasticsearch query. Each GenericSearchRequest parameter is converted as follows:
- The from and size parameters are interpreted literally.
- Each value in queries is used to generate one query for each of the fields configured in the search_config.search_fields property. A query_string query is generated if query string syntax elements are detected (e.g. "OR" or "AND") in the value, or a match query is generated otherwise. All queries generated from the queries list are combined in a should clause of a boolean query, which is in turn combined with queries resulting from other parameters using a must clause of the top-level boolean query.
- Each value in ids is used to generate one query for each of the fields configured in the search_config.id_fields property. A match query is generated in each case. All queries generated from the ids list are combined in a should clause of a boolean query, which is in turn combined with queries resulting from other parameters using a must clause of the top-level boolean query.
- Each entry in the filters map is treated as a separate filter whose key is interpreted as either a field alias or a literal Elasticsearch field (see the Properties table) and whose value is a list of values against which that field should match. Each filter is added to the filter list of the top-level boolean query (AND-like logic), and the values of each filter are combined in the should clause of a nested boolean query (OR-like logic).
- Each entry in the fields map generates a separate query, where the key is interpreted as either a field alias or a literal Elasticsearch field and the value generates either a query_string query or a match query, depending on the presence of query string syntax (as with queries). The queries generated by each entry contribute to the must clause of the top-level boolean query.
- Each item in the aggregations list generates either a date_range aggregation, if the value matches a configured value of date_aggregations[].agg_name, or a terms aggregation on the specified field otherwise. The value is interpreted as either a field alias or a literal Elasticsearch field.
- Each entry in the autocomplete map generates a specialized terms aggregation, using the key as the field/alias and the value to produce a regular expression in the include parameter of the aggregation, such that only terms containing words prefixed by the value are included in the entry list for that aggregation (note that the regular expression produced is case-insensitive). This parameter can be used to produce "autocomplete" style suggestions for a given text input and field, which can be useful for faceting on a field with many distinct values, for instance.
- Each item in the sort_clauses list generates an item in the Elasticsearch sort request parameter, using path as the sort field/alias and direction as the sort direction.
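The conversion rules above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the Glyph's actual code: build_search_body and its arguments are hypothetical, the query string detection is reduced to a check for upper-case AND/OR/NOT, filter values are matched with term queries (the query type used by the Glyph is not stated here), and the aggregations and autocomplete parameters are left out.

```python
import re
from typing import Any

# Hypothetical heuristic for "query string syntax elements"; the Glyph's
# real detection rules may cover more syntax than this.
_QUERY_STRING_HINT = re.compile(r"\b(AND|OR|NOT)\b")


def _text_query(field: str, value: str) -> dict[str, Any]:
    """Return a query_string query if the value looks like query string syntax, else a match query."""
    if _QUERY_STRING_HINT.search(value):
        return {"query_string": {"query": value, "fields": [field]}}
    return {"match": {field: value}}


def build_search_body(request: dict[str, Any], search_fields: list[str],
                      id_fields: list[str], paths: dict[str, str]) -> dict[str, Any]:
    """Build an Elasticsearch search body from a GenericSearchRequest-like dict."""

    def resolve(key: str) -> str:
        # A key is either a configured alias or a literal Elasticsearch field.
        return paths.get(key, key)

    must: list[dict[str, Any]] = []
    filters: list[dict[str, Any]] = []

    # queries: one query per value per search field, OR-ed together.
    queries = request.get("queries") or []
    if queries:
        should = [_text_query(f, q) for q in queries for f in search_fields]
        must.append({"bool": {"should": should}})

    # ids: one match query per value per id field, OR-ed together.
    ids = request.get("ids") or []
    if ids:
        should = [{"match": {f: i}} for i in ids for f in id_fields]
        must.append({"bool": {"should": should}})

    # fields: one query per entry, each added to the top-level must clause.
    for key, value in (request.get("fields") or {}).items():
        must.append(_text_query(resolve(key), value))

    # filters: AND across entries, OR across the values of one entry.
    # Assumption: term queries are used for the individual values.
    for key, values in (request.get("filters") or {}).items():
        field = resolve(key)
        filters.append({"bool": {"should": [{"term": {field: v}} for v in values]}})

    body: dict[str, Any] = {"query": {"bool": {"must": must, "filter": filters}}}

    # from and size are copied through unchanged.
    for name in ("from", "size"):
        if name in request:
            body[name] = request[name]

    # sort_clauses: path is the field/alias, direction the sort order.
    sort = [{resolve(c["path"]): {"order": c["direction"]}}
            for c in request.get("sort_clauses") or []]
    if sort:
        body["sort"] = sort
    return body
```

For example, a request with queries set to ["cats OR dogs", "birds"] and a single search field produces one nested boolean query whose should clause holds a query_string query for the first value and a match query for the second.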
#
Properties
#
Field
#
OptionalField
As
#
RequirementLevel
#
Default Search Fields
- field: _generic_all_std
  boost: 0.0
- field: _all
  boost: 0.0
#
Default Id Fields
- field: "@admin.id"
  boost: 0.0
- field: admin.id
  boost: 0.0
#
Example
name: my-glyph
type: elastic-generic-search
properties:
  search_config:
    search_fields:
      - field: my_search_field
        boost: 0.0
    id_fields:
      - field: my_id_field
        boost: 0.0
    fields:
      - field: my_field
        boost: 0.0
    exists_fields:
      - field: my_exists_field
        boost: 0.0
        requirement: none
  path_config:
    paths:
      alias_1: field_1
      alias_2: field_2
      alias_3: field_3
    require_alias: false
  aggregation_configs:
    - name: alias_1
      size: 20
  auto_complete_size: 100
  default_aggregation_size: 10
  date_aggregations:
    - agg_name: date_field_alias
      path: date_elasticsearch_field_1
      additional_path: date_elasticsearch_field_2
      format: dd/MM/yyyy
      date_ranges:
        - from: 1850
          to: 1900
        - from: 1900
          to: 1950
        - from: 1950
          to: 2000
  date_range_format: '{from} - {to}'
  global_filter: my_field:(value_1 OR value_2 OR value_3)
  match_operator: and
  default_operator: and
  request_timeout: 3000
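As a rough illustration of how these properties might drive aggregation generation, the Python sketch below picks between a date_range and a terms aggregation for a requested name. Several points are assumptions rather than documented behaviour: that date_ranges is nested under each date_aggregations entry, that date_range_format is applied by substituting {from} and {to} to label each range, and that a size in aggregation_configs overrides default_aggregation_size. The helper build_aggregation is hypothetical, and the format and additional_path properties are ignored.

```python
from typing import Any


def build_aggregation(name: str, properties: dict[str, Any]) -> dict[str, Any]:
    """Return an Elasticsearch aggregation body for one requested aggregation name."""
    # A configured date aggregation wins if its agg_name matches the request.
    for date_agg in properties.get("date_aggregations", []):
        if date_agg["agg_name"] == name:
            label_format = properties.get("date_range_format", "{from} - {to}")
            ranges = []
            for r in date_agg.get("date_ranges", []):
                # Assumption: the format string labels each range bucket.
                label = (label_format
                         .replace("{from}", str(r["from"]))
                         .replace("{to}", str(r["to"])))
                ranges.append({"key": label, "from": r["from"], "to": r["to"]})
            return {"date_range": {"field": date_agg["path"], "ranges": ranges}}

    # Otherwise fall back to a terms aggregation on the alias or literal field.
    paths = properties.get("path_config", {}).get("paths", {})
    size = properties.get("default_aggregation_size", 10)
    for config in properties.get("aggregation_configs", []):
        if config["name"] == name:
            # Assumption: a per-aggregation size overrides the default.
            size = config.get("size", size)
    return {"terms": {"field": paths.get(name, name), "size": size}}
```

Under these assumptions, requesting alias_1 with the example configuration would yield a terms aggregation on field_1 with a size of 20, while requesting date_field_alias would yield a date_range aggregation on date_elasticsearch_field_1 with buckets labelled "1850 - 1900", "1900 - 1950" and "1950 - 2000".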