Common operations of Elasticsearch: mapping section

2025-01-17 Update From: SLTechnology News&Howtos

Mappings come in two kinds, depending on whether es detects a field's type automatically or we specify it ourselves: dynamic mapping and static mapping.

1 dynamic mapping

1.1 Mapping rules

| JSON data | automatically inferred field type |
|---|---|
| null | no field is added |
| true or false | boolean type |
| floating-point number | float type |
| integer | long type |
| JSON object | object type |
| array | determined by the first non-null value in the array |
| string | may be date type (if date detection is enabled), double or long type, text type, keyword type |

1.2 date detection

Date detection is enabled by default (es5.4). The test case is as follows:

    PUT myblog

    GET myblog/_mapping

    PUT myblog/article/1
    {"id": 1, "postdate": "2018-10-27"}

    GET myblog/_mapping
    # returns:
    {
      "myblog": {
        "mappings": {
          "article": {
            "properties": {
              "id": {"type": "long"},
              "postdate": {"type": "date"}
            }
          }
        }
      }
    }
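The mapping rules and the effect of date detection can be sketched in a few lines of Python. This is only an illustration of the rules listed above, not how es implements them: the function name `infer_type` is hypothetical, the two date patterns are a simplified stand-in for es's real dynamic date formats, and numeric detection of strings (off by default) is left out.

```python
from datetime import datetime

# Assumption: a simplified stand-in for the date formats es checks dynamically.
_DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%SZ")


def _looks_like_date(value):
    """Return True if the string parses with one of the assumed date formats."""
    for fmt in _DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_type(value, date_detection=True):
    """Return the type a dynamic mapping would pick, or None if no field is added."""
    if value is None:
        return None
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # The first non-null element decides the type of the whole array.
        for item in value:
            if item is not None:
                return infer_type(item, date_detection)
        return None
    if isinstance(value, str):
        if date_detection and _looks_like_date(value):
            return "date"
        return "text"  # es additionally creates a keyword sub-field
    raise TypeError(f"not a JSON value: {value!r}")


print(infer_type("2018-10-27"))                        # date
print(infer_type("2018-10-27", date_detection=False))  # text
```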

When date detection is turned off, it will not be detected as a date, as follows:

    PUT myblog
    {
      "mappings": {
        "article": {
          "date_detection": false
        }
      }
    }

    GET myblog/_mapping

    PUT myblog/article/1
    {"id": 1, "postdate": "2018-10-27"}

    GET myblog/_mapping
    # returns:
    {
      "myblog": {
        "mappings": {
          "article": {
            "date_detection": false,
            "properties": {
              "id": {"type": "long"},
              "postdate": {
                "type": "text",
                "fields": {
                  "keyword": {"type": "keyword", "ignore_above": 256}
                }
              }
            }
          }
        }
      }
    }

2 static mapping

2.1 basic case

    PUT myblog
    {
      "mappings": {
        "article": {
          "properties": {
            "id": {"type": "long"},
            "title": {"type": "text"},
            "postdate": {"type": "date"}
          }
        }
      }
    }

    GET myblog/_mapping

    PUT myblog/article/1
    {"id": 1, "title": "elasticsearch is wonderful!", "postdate": "2018-10-27"}

    GET myblog/_mapping
    # returns:
    {
      "myblog": {
        "mappings": {
          "article": {
            "properties": {
              "id": {"type": "long"},
              "postdate": {"type": "date"},
              "title": {"type": "text"}
            }
          }
        }
      }
    }

2.2 dynamic attribute

By default, when you add a document that contains a new field, es adds that field to the mapping as well. This behaviour can be controlled through the dynamic setting:

| dynamic value | meaning |
|---|---|
| true | the default; new fields are added automatically |
| false | new fields are ignored |
| strict | strict mode; a document with a new field is rejected, as the example below shows |

    PUT myblog
    {
      "mappings": {
        "article": {
          "dynamic": "strict",
          "properties": {
            "id": {"type": "long"},
            "title": {"type": "text"},
            "postdate": {"type": "date"}
          }
        }
      }
    }

    GET myblog/_mapping

    PUT myblog/article/1
    {"id": 1, "title": "elasticsearch is wonderful!", "content": "a long text", "postdate": "2018-10-27"}
    # returns:
    {
      "error": {
        "root_cause": [
          {
            "type": "strict_dynamic_mapping_exception",
            "reason": "mapping set to strict, dynamic introduction of [content] within [article] is not allowed"
          }
        ],
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic introduction of [content] within [article] is not allowed"
      },
      "status": 400
    }

3 Field types

3.1 General field types

| first-level classification | second-level classification | specific types |
|---|---|---|
| core types | string types | string, text, keyword |
| core types | numeric types | long, integer, short, byte, double, float, half_float, scaled_float |
| core types | date type | date |
| core types | boolean type | boolean |
| core types | binary type | binary |
| core types | range type | range |
| compound types | array type | array |
| compound types | object type | object |
| compound types | nested type | nested |
| geographic types | geographic coordinates | geo_point |
| geographic types | geographic shapes | geo_shape |
| special types | IP type | ip |
| special types | completion type | completion |
| special types | token count type | token_count |
| special types | attachment type | attachment |
| special types | extraction type | percolator |

The following covers only the types I commonly use in my own work; for details, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html.

3.1.1 string

The string type is no longer supported as of es 5.x (it can still be added) and has been replaced by text and keyword.

3.1.2 text

The text type is for fields that need full-text search. The field content is analyzed: before the inverted index is built, an analyzer splits the string into terms.

In practice, text is typically used for long text fields, such as an article's content. Sorting and aggregating on such fields rarely makes sense.

3.1.3 keyword

Unlike the text type, a keyword field can only be searched by exact value.

The indexed term is the field content itself, so in practice keyword is used for comparison, sorting, aggregation and similar operations.
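The difference between text and keyword can be illustrated with a toy inverted index. This is a minimal sketch under stated assumptions: `analyze` is a crude, ASCII-only stand-in for a real es analyzer (lowercase, then split on anything that is not a letter or digit), and `build_index` is a hypothetical helper, not an es API.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase, then split on non-alphanumeric characters."""
    return [term for term in re.split(r"[^0-9a-z]+", text.lower()) if term]


def build_index(docs, field_type):
    """Build an inverted index {term: set of doc ids} for a text or keyword field."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        # text: one entry per analyzed term; keyword: the whole value is the term
        terms = analyze(value) if field_type == "text" else [value]
        for term in terms:
            index[term].add(doc_id)
    return index


docs = {1: "elasticsearch is wonderful!"}
text_index = build_index(docs, "text")
keyword_index = build_index(docs, "keyword")

print(sorted(text_index))     # ['elasticsearch', 'is', 'wonderful']
print(sorted(keyword_index))  # ['elasticsearch is wonderful!']
```

A lookup for "wonderful" finds the document in the text index but not in the keyword index, where only the complete original string is a term.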

3.1.4 numeric type

For the details that need attention, consult the official documentation; ordinary usage is enough for most needs.

3.1.5 date

JSON has no date type, so dates in es can take the following forms:

1. A string such as "yyyy-MM-dd" or "yyyy-MM-ddTHH:mm:ssZ"; that is, "yyyy-MM-dd HH:mm:ss" has to be written as "2018-10-22T23:12:22Z", which adds a time zone;
2. A long integer representing a millisecond timestamp;
3. An integer representing a timestamp in seconds (with the default format a number is read as milliseconds, so this form needs the epoch_second format in the mapping).

Internally, es stores a date as a long integer holding the millisecond timestamp.

Of course, the above is only the default. When setting the type of the field, we can also define our own time format:

    PUT myblog
    {
      "mappings": {
        "article": {
          "properties": {
            "postdate": {
              "type": "date",
              "format": "yyyy-MM-dd HH:mm:ss"
            }
          }
        }
      }
    }

The format parameter can also list several date formats, separated by "||":

    "format": "yyyy-MM-dd HH:mm:ss||yyyy/MM/dd HH:mm:ss"

You can then write data in the defined time format:

    PUT myblog/article/1
    {"postdate": "2017-09-23 23:12:22"}
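Trying several formats in turn, as the "||" separator does, can be sketched on the client side as follows. The strptime patterns are my assumed Python equivalents of the two patterns above, and `parse_with_formats` is a hypothetical helper, not part of any es client.

```python
from datetime import datetime

# Assumed Python equivalents of "yyyy-MM-dd HH:mm:ss" and "yyyy/MM/dd HH:mm:ss".
FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M:%S"]


def parse_with_formats(value, formats=FORMATS):
    """Return the datetime from the first format that matches, else raise ValueError."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"{value!r} matches none of {formats}")


print(parse_with_formats("2017-09-23 23:12:22"))  # 2017-09-23 23:12:22
print(parse_with_formats("2017/09/23 23:12:22"))  # 2017-09-23 23:12:22
```

Validating a value this way before indexing gives an early, local error instead of a rejected request.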

In my own work, when a time needs to be saved, I usually convert it to a millisecond timestamp before storing it in es, and convert it back to a time string when it is read out for display.
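That round trip might look like the sketch below. It assumes the time strings are in UTC and use the "yyyy-MM-dd HH:mm:ss" layout; both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"  # assumed layout of the time strings
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(time_string):
    """Convert a time string (assumed to be UTC) to a millisecond timestamp."""
    dt = datetime.strptime(time_string, FMT).replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis):
    """Convert a millisecond timestamp back to a UTC time string for display."""
    return (EPOCH + timedelta(milliseconds=millis)).strftime(FMT)


millis = to_epoch_millis("2017-09-23 23:12:22")  # value to store in es
print(from_epoch_millis(millis))                 # 2017-09-23 23:12:22
```

If the strings are in local time rather than UTC, the offset has to be applied before the conversion, otherwise the stored values will be shifted.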

3.1.6 boolean

After setting the field type to boolean, the values you can fill in are: true, false, "true", "false".

3.1.7 binary

The binary type accepts a string encoded by base64.
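Preparing a value for a binary field can be sketched with Python's standard base64 module. The field name "blob" and the helper names are made up for illustration.

```python
import base64


def encode_binary_field(raw_bytes):
    """Return the base64 string that a binary field expects."""
    return base64.b64encode(raw_bytes).decode("ascii")


def decode_binary_field(encoded):
    """Recover the original bytes from the stored base64 string."""
    return base64.b64decode(encoded)


raw = b"\x00\x01binary payload"
document = {"blob": encode_binary_field(raw)}  # "blob" is a hypothetical field name
print(decode_binary_field(document["blob"]) == raw)  # True
```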

3.1.8 array

Es does not have a dedicated array type, and by default any field can contain one or more values, but the values in an array must be of the same type. When adding data dynamically, the type of the first value of the array determines the type of the entire array (that is, the type of this field), and mixed arrays are not supported. An array can contain null values, and an empty array [] is treated as a missing field. In addition, the use of the array type in the document does not need to be configured in advance and is supported by default.

For example, index a document with the following array field:

    DELETE my_index

    PUT my_index/my_type/1
    {"lists": [{"name": "xpleaf", "job": "es"}]}

The sub-fields of the objects in the array are dynamically mapped to text:

    GET my_index/my_type/_mapping
    # returns:
    {
      "my_index": {
        "mappings": {
          "my_type": {
            "properties": {
              "lists": {
                "properties": {
                  "job": {
                    "type": "text",
                    "fields": {
                      "keyword": {"type": "keyword", "ignore_above": 256}
                    }
                  },
                  "name": {
                    "type": "text",
                    "fields": {
                      "keyword": {"type": "keyword", "ignore_above": 256}
                    }
                  }
                }
              }
            }
          }
        }
      }
    }

Direct search is also supported:

    GET my_index/my_type/_search
    {
      "query": {
        "term": {
          "lists.name": {"value": "xpleaf"}
        }
      }
    }

Return the result:

    {
      "took": 0,
      "timed_out": false,
      "_shards": {"total": 5, "successful": 5, "failed": 0},
      "hits": {
        "total": 1,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "1",
            "_score": 0.2876821,
            "_source": {
              "lists": [{"name": "xpleaf", "job": "es"}]
            }
          }
        ]
      }
    }

3.1.9 object

You can write a json object directly to es, as follows:

    DELETE my_index

    PUT my_index/my_type/1
    {"object": {"name": "xpleaf", "job": "es"}}

Here too, the sub-fields of the object are dynamically mapped to text:

    GET my_index/my_type/_mapping
    # returns:
    {
      "my_index": {
        "mappings": {
          "my_type": {
            "properties": {
              "object": {
                "properties": {
                  "job": {
                    "type": "text",
                    "fields": {
                      "keyword": {"type": "keyword", "ignore_above": 256}
                    }
                  },
                  "name": {
                    "type": "text",
                    "fields": {
                      "keyword": {"type": "keyword", "ignore_above": 256}
                    }
                  }
                }
              }
            }
          }
        }
      }
    }

Direct search is also possible:

    GET my_index/my_type/_search
    {
      "query": {
        "term": {
          "object.name": {"value": "xpleaf"}
        }
      }
    }

Return the result:

    {
      "took": 0,
      "timed_out": false,
      "_shards": {"total": 5, "successful": 5, "failed": 0},
      "hits": {
        "total": 1,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "my_index",
            "_type": "my_type",
            "_id": "1",
            "_score": 0.2876821,
            "_source": {
              "object": {"name": "xpleaf", "job": "es"}
            }
          }
        ]
      }
    }

An object is actually flattened inside es. The document above is stored internally as:

{"object.name": "xpleaf", "object.job": "es"}
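The flattening can be reproduced with a short recursive function. This is a sketch of the idea only: `flatten` is a hypothetical helper, and it leaves arrays untouched.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into a single dict with dotted field paths."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Descend into inner objects, extending the dotted path.
            flat.update(flatten(value, prefix=path + "."))
        else:
            flat[path] = value
    return flat


print(flatten({"object": {"name": "xpleaf", "job": "es"}}))
# {'object.name': 'xpleaf', 'object.job': 'es'}
```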

3.1.10 nested

The nested type is a special case of the object type, which allows an array of objects to be indexed and queried independently. Lucene has no concept of internal objects, so es flattens the object hierarchy into a simple list of field names and values.

Although nested is a special case of the object type, the type of the field is fixed as nested (it has to be set in the mapping), which is the biggest difference from object.

So why use the nested type? Can't you just use object? Here is an official example to illustrate (https://www.elastic.co/guide/en/elasticsearch/reference/5.6/nested.html):

Arrays of inner object fields do not work the way you may expect. Lucene has no concept of inner objects, so Elasticsearch flattens object hierarchies into a simple list of field names and values. For instance, the following document:

    PUT my_index/my_type/1
    {
      "group": "fans",
      "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"}
      ]
    }

Would be transformed internally into a document that looks more like this:

{"group": "fans", "user.first": ["alice", "john"], "user.last": ["smith", "white"]}

The user.first and user.last fields are flattened into multi-value fields, and the association between alice and white is lost. This document would incorrectly match a query for alice AND smith:

    GET my_index/_search
    {
      "query": {
        "bool": {
          "must": [
            {"match": {"user.first": "Alice"}},
            {"match": {"user.last": "Smith"}}
          ]
        }
      }
    }

The above is the problem caused by the direct use of object, that is to say, when actually doing the above search, the document should not be matched, but it does. Using the nested object type allows you to maintain the independence of each object in the array, and the nested type indexes each object in the array as a separate hidden document, which means that each nested object can be searched independently.
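The difference can be simulated in plain Python. This sketch only imitates the matching logic on the example document; the two function names are hypothetical, and lowercasing stands in for the analyzer.

```python
doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}


def matches_as_object(doc, first, last):
    """object type: values are flattened into one multi-value list per field."""
    firsts = {user["first"].lower() for user in doc["user"]}
    lasts = {user["last"].lower() for user in doc["user"]}
    return first.lower() in firsts and last.lower() in lasts


def matches_as_nested(doc, first, last):
    """nested type: each object is checked on its own, like a hidden document."""
    return any(
        user["first"].lower() == first.lower() and user["last"].lower() == last.lower()
        for user in doc["user"]
    )


print(matches_as_object(doc, "Alice", "Smith"))  # True  (the incorrect match)
print(matches_as_nested(doc, "Alice", "Smith"))  # False
print(matches_as_nested(doc, "Alice", "White"))  # True
```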

If you need to index arrays of objects and to maintain the independence of each object in the array, you should use the nested datatype instead of the object datatype. Internally, nested objects index each object in the array as a separate hidden document, meaning that each nested object can be queried independently of the others, with the nested query:

    PUT my_index
    {
      "mappings": {
        "my_type": {
          "properties": {
            "user": {"type": "nested"}
          }
        }
      }
    }

    PUT my_index/my_type/1
    {
      "group": "fans",
      "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"}
      ]
    }

    # No match: Alice and Smith are not in the same nested object
    GET my_index/_search
    {
      "query": {
        "nested": {
          "path": "user",
          "query": {
            "bool": {
              "must": [
                {"match": {"user.first": "Alice"}},
                {"match": {"user.last": "Smith"}}
              ]
            }
          }
        }
      }
    }

    # Match: Alice and White are in the same nested object
    GET my_index/_search
    {
      "query": {
        "nested": {
          "path": "user",
          "query": {
            "bool": {
              "must": [
                {"match": {"user.first": "Alice"}},
                {"match": {"user.last": "White"}}
              ]
            }
          },
          "inner_hits": {
            "highlight": {
              "fields": {
                "user.first": {}
              }
            }
          }
        }
      }
    }

Indexing a document with 100 nested fields actually indexes 101 documents, because each nested document is indexed as a separate document. To guard against ill-defined mappings, the number of nested fields that can be defined per index is limited to 50.

3.1.11 range

The range type and its value range are as follows:

| type | range |
|---|---|
| integer_range | -2^31 ~ 2^31-1 |
| float_range | 32-bit IEEE 754 |
| long_range | -2^63 ~ 2^63-1 |
| double_range | 64-bit IEEE 754 |
| date_range | 64-bit integer, millisecond timestamp |

3.2 meta fields

A meta field is a field that describes the document itself. The meta fields are classified as follows:

| meta-field classification | specific field | purpose |
|---|---|---|
| meta-fields of document identity | _index | the index the document belongs to |
| meta-fields of document identity | _uid | compound field of _type and _id (value is {type}#{id}) |
| meta-fields of document identity | _type | the type of the document |
| meta-fields of document identity | _id | the id of the document |
| meta-fields of the source document | _source | the original JSON string of the document |
| meta-fields of the source document | _size | the size of the _source field |
| meta-fields of the index | _all | a super field containing all fields of the index |
| meta-fields of the index | _field_names | all fields in the document that contain non-empty values |
| meta-fields of routing | _parent | specifies the parent-child relationship between documents |
| meta-fields of routing | _routing | a custom routing value that routes the document to a specific shard |
| custom meta-field | _meta | used for custom metadata |

For more information on each field, please see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-fields.html.

4 mapping parameters

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-params.html.
