
How to remove mapping types in Elasticsearch

Updated 2025-01-31. From: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31.

This article explains why Elasticsearch is removing mapping types, what to use in their place, and how to migrate existing multi-type indices to a single type.

1. Preface

Indices created in Elasticsearch 6.0.0 or later may contain only a single mapping type. Indices created in 5.x with multiple mapping types continue to work in Elasticsearch 6.x as before. Types are deprecated in the APIs in Elasticsearch 7.0.0 and removed completely in 8.0.0.

2. What is a mapping type?

Since the first release of Elasticsearch, each document has been stored in a single index and assigned a single mapping type. A mapping type represents the kind of document or entity being indexed. For example, a twitter index might have a user type and a tweet type.

Each mapping type can have its own fields, so the user type might have full_name, user_name, and email fields, while the tweet type might have content and tweet_at fields plus, like the user type, a user_name field. Each document has a _type meta field containing the type name, and a search can be limited to one or more types by specifying the type names in the URL:

    GET twitter/user,tweet/_search
    {
      "query": {
        "match": {
          "user_name": "kimchy"
        }
      }
    }

The _type field was combined with the document's _id to generate the _uid field, so documents of different types with the same _id could exist in a single index.
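The way a type name and an id combine into one key can be sketched in a few lines of Python. This is only an illustration: it assumes the `type#id` layout that older versions used for `_uid`, and the `make_uid` helper is a hypothetical name, not an Elasticsearch API.

```python
def make_uid(doc_type: str, doc_id: str) -> str:
    """Mimic the legacy _uid value: the type name and the _id joined by '#' (assumed layout)."""
    return f"{doc_type}#{doc_id}"


# Two documents share the _id "1" but belong to different types.
uids = {make_uid("user", "1"), make_uid("tweet", "1")}

# The combined keys differ, so both documents can live in the same index.
print(sorted(uids))
```

Because the type is part of the key, the same `_id` under two types never collides, which is the behaviour described above.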

Mapping types were also used to establish parent-child relationships between documents, so a document of type question could be the parent of documents of type answer.

So far, all of this sounds fine. Then why remove mapping types?

3. Why remove mapping types?

Initially (and often still today), people explained how Elasticsearch organizes data by comparing it with a relational database: an index is like a database in SQL, and a type is like a table.

This is a bad analogy, and it leads to wrong assumptions. In an SQL database, tables are independent of each other: a column in one table has no bearing on a column with the same name in another table. Fields in mapping types do not work that way.

Within an Elasticsearch index, fields that have the same name in different mapping types are backed by the same Lucene field internally. In the example above, the user_name field in the user type is stored in exactly the same field as the user_name field in the tweet type, so both user_name fields must have the same mapping (definition).

This becomes frustrating when, for example, you want a field named deleted to be a date field in one type and a boolean field in another type of the same index, because the two definitions conflict.
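The shared-field constraint can be made concrete with a small Python check. It is a sketch under stated assumptions: the mappings are reduced to a hypothetical `{type: {field: data_type}}` dictionary, and `find_field_conflicts` is an illustrative helper, not part of any Elasticsearch client.

```python
from typing import Dict, List, Tuple


def find_field_conflicts(
    mappings: Dict[str, Dict[str, str]]
) -> List[Tuple[str, Dict[str, str]]]:
    """Return every field that is declared with different data types across mapping types."""
    seen: Dict[str, Dict[str, str]] = {}
    for type_name, fields in mappings.items():
        for field, data_type in fields.items():
            seen.setdefault(field, {})[type_name] = data_type
    return [
        (field, by_type)
        for field, by_type in sorted(seen.items())
        if len(set(by_type.values())) > 1
    ]


# Simplified, hypothetical mappings: user_name agrees, the "deleted" field does not.
mappings = {
    "user": {"user_name": "keyword", "deleted": "date"},
    "tweet": {"user_name": "keyword", "deleted": "boolean"},
}

conflicts = find_field_conflicts(mappings)
print(conflicts)
```

Only the field that is declared as a date in one type and a boolean in the other is reported; user_name passes because both types define it identically.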

On top of that, storing different entities that have few or no fields in common in the same index leads to sparse data and interferes with Lucene's ability to compress documents efficiently.

For these reasons, the concept of mapping types is being removed from Elasticsearch.

4. Alternatives to mapping types

4.1 One index per document type

The first alternative is to have one index per document type. Instead of storing tweets and users in a single twitter index, store tweets in a tweets index and users in a users index. The two indices are completely independent of each other, so field types cannot conflict between them.

This approach has two benefits:

Data is more likely to be dense, so it benefits from the compression techniques used in Lucene.

The term statistics used for scoring in full-text search are more likely to be accurate, because all documents in the same index represent a single entity.

In addition, each index can be sized for the number of documents it will contain: for example, fewer primary shards for users and more primary shards for tweets.
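As a rough sketch of per-index sizing, the Python below builds two create-index bodies with different shard counts. The `index_body` helper and the shard counts (1 and 5) are assumptions chosen for illustration, and the bodies use the typeless form of 7.x; on 6.x the properties would sit under a type name.

```python
import json


def index_body(properties: dict, primary_shards: int) -> dict:
    """Build a create-index request body with its own primary shard count."""
    return {
        "settings": {"number_of_shards": primary_shards},
        "mappings": {"properties": properties},
    }


# A small index for users, a larger one for tweets (shard counts are illustrative).
users = index_body(
    {
        "name": {"type": "text"},
        "user_name": {"type": "keyword"},
        "email": {"type": "keyword"},
    },
    primary_shards=1,
)
tweets = index_body(
    {
        "content": {"type": "text"},
        "user_name": {"type": "keyword"},
        "tweet_at": {"type": "date"},
    },
    primary_shards=5,
)

print(json.dumps(tweets, indent=2))
```

Each body would be sent with its own `PUT users` or `PUT tweets` request, so the two indices can be tuned independently.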

4.2 Custom type field

Of course, there is a limit to how many primary shards a cluster can hold, and you may not want to spend an entire shard on a collection of only a few thousand documents. In that case, you can implement your own custom type field, which works much like the old _type.

Take the twitter example again. Originally, the mapping types looked like this:

    PUT twitter
    {
      "mappings": {
        "user": {
          "properties": {
            "name": { "type": "text" },
            "user_name": { "type": "keyword" },
            "email": { "type": "keyword" }
          }
        },
        "tweet": {
          "properties": {
            "content": { "type": "text" },
            "user_name": { "type": "keyword" },
            "tweet_at": { "type": "date" }
          }
        }
      }
    }

    PUT twitter/user/kimchy
    {
      "name": "Gouzi",
      "user_name": "Ergouzi",
      "email": "dog@twodog.com"
    }

    PUT twitter/tweet/1
    {
      "user_name": "kimchy",
      "tweet_at": "2019-04-30T10:26:20Z",
      "content": "single dog"
    }

    GET twitter/tweet/_search
    {
      "query": {
        "match": {
          "user_name": "kimchy"
        }
      }
    }

Like the earlier search example, this one must be tested on version 5.x or earlier, where an index can still hold more than one type.

The same result can be achieved with a single type by adding a custom type field:

    PUT twitter
    {
      "mappings": {
        "doc": {
          "properties": {
            "type": { "type": "keyword" },
            "name": { "type": "text" },
            "user_name": { "type": "keyword" },
            "email": { "type": "text" },
            "content": { "type": "text" },
            "tweet_at": { "type": "date" }
          }
        }
      }
    }

    PUT twitter/doc/user-kimchy
    {
      "type": "user",
      "name": "Gouzi",
      "user_name": "Ergouzi",
      "email": "dog@twodog.com"
    }

    PUT twitter/doc/tweet-1
    {
      "type": "tweet",
      "user_name": "kimchy",
      "tweet_at": "2019-04-30T10:26:20Z",
      "content": "single dog asking for care"
    }

    GET twitter/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "user_name": "kimchy" } }
          ],
          "filter": {
            "match": { "type": "tweet" }
          }
        }
      }
    }

The example above runs correctly on version 6.5.4.
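Since every search against such an index needs the same filter on the custom field, it can help to build the query in one place. The following Python is a minimal sketch: `search_by_type` is a hypothetical helper, and it assumes the custom field is named `type` as in the mapping above.

```python
import json


def search_by_type(doc_type: str, field: str, text: str) -> dict:
    """Build a bool query that matches on a field and is restricted to one custom type."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: text}}],
                # The filter on the custom "type" field replaces the type in the URL.
                "filter": {"match": {"type": doc_type}},
            }
        }
    }


body = search_by_type("tweet", "user_name", "kimchy")
print(json.dumps(body, indent=2))
```

The resulting body has the same shape as the `GET twitter/_search` request shown above and could be sent with any HTTP client.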

5. Parent/child without mapping types

Previously, a parent-child relationship was expressed by making one mapping type the parent and one or more other mapping types the children. Without types, this syntax is no longer available. The parent-child feature continues to work as before, except that the relationship between documents is now expressed with the new join field.
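To give an idea of what the join field looks like, here is a hedged Python sketch that builds a mapping and two document bodies for the question/answer example. The field name `my_join_field` and the helper functions are assumptions for illustration, and the mapping uses the typeless form of 7.x (on 6.x it would be nested under a type name).

```python
import json

# Hypothetical field name; "question" is the parent relation and "answer" its child.
JOIN_FIELD = "my_join_field"

mapping = {
    "mappings": {
        "properties": {
            JOIN_FIELD: {"type": "join", "relations": {"question": "answer"}}
        }
    }
}


def question_doc(text: str) -> dict:
    """Body of a parent document."""
    return {"text": text, JOIN_FIELD: {"name": "question"}}


def answer_doc(text: str, parent_id: str) -> dict:
    """Body of a child document; it names the _id of its parent."""
    return {"text": text, JOIN_FIELD: {"name": "answer", "parent": parent_id}}


print(json.dumps(answer_doc("Use the join field.", "1"), indent=2))
```

When indexing a child document, it also has to be routed to the shard of its parent, typically by passing the parent id as the routing value on the request.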

6. Schedule for removing mapping types

This is a big change for users, so it is rolled out gradually to keep it as painless as possible. The schedule is as follows:

In Elasticsearch 5.6.0:

Setting index.mapping.single_type: true on an index enables the single-type-per-index behaviour that is enforced in 6.0.

The join field replacement for parent-child is available on indices created in 5.6.

In Elasticsearch 6.x:

Indices created in 5.x continue to work in 6.x as they did in 5.x.

Indices created in 6.x allow only a single type per index. Any name can be used for the type, but there can be only one.

The _type name can no longer be combined with the _id to form the _uid field. The _uid field has become an alias for the _id field.

New indices no longer support the old-style parent/child relationship and should use the join field instead.

The _default_ mapping type is deprecated.

The index creation, index template, and mapping APIs support a query string parameter (include_type_name) that indicates whether requests and responses should include a type name. It defaults to true and should be set to an explicit value to prepare for the upgrade to 7.0. Not setting include_type_name results in a deprecation warning. Indices without an explicit type use the default type name _doc.

In Elasticsearch 7.x:

Specifying a type in a request is deprecated. For example, indexing a document no longer requires a document type. The new index APIs are PUT {index_name}/_doc/{id} for explicit ids and POST {index_name}/_doc for automatically generated ids.

The include_type_name parameter in the index creation, index template, and mapping APIs defaults to false. Setting the parameter at all results in a deprecation warning.

The _default_ mapping type is removed.
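The two typeless index endpoints listed for 7.x can be captured in a small Python helper. This is a sketch only: `index_request` is a hypothetical function that returns the HTTP method and path, leaving the actual request to whichever client is in use.

```python
from typing import Optional, Tuple
from urllib.parse import quote


def index_request(index: str, doc_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (HTTP method, path) for indexing a document without naming a type."""
    if doc_id is None:
        # No id given: POST lets Elasticsearch generate one.
        return ("POST", f"/{index}/_doc")
    # Explicit id: PUT to the document's own path.
    return ("PUT", f"/{index}/_doc/{quote(doc_id, safe='')}")


print(index_request("twitter", "1"))
print(index_request("twitter"))
```

In both cases `_doc` is a fixed part of the endpoint path rather than a type chosen by the user.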

In Elasticsearch 8.x:

Specifying a type in a request is no longer supported.

The include_type_name parameter is removed.

7. Migrating multi-type indices to a single type

The Reindex API can be used to convert multi-type indices to single-type indices. The following examples can be used in Elasticsearch 5.6 or Elasticsearch 6.x. In 6.x, there is no need to specify index.mapping.single_type, because a single type is the default.

7.1 One index per document type

The first example splits the twitter index into a tweets index and a users index:

    PUT users
    {
      "mappings": {
        "user": {
          "properties": {
            "name": { "type": "text" },
            "user_name": { "type": "keyword" },
            "email": { "type": "keyword" }
          }
        }
      }
    }

    PUT tweets
    {
      "mappings": {
        "tweet": {
          "properties": {
            "content": { "type": "text" },
            "user_name": { "type": "keyword" },
            "tweet_at": { "type": "date" }
          }
        }
      }
    }

    POST _reindex
    {
      "source": {
        "index": "twitter",
        "type": "user"
      },
      "dest": {
        "index": "users"
      }
    }

    POST _reindex
    {
      "source": {
        "index": "twitter",
        "type": "tweet"
      },
      "dest": {
        "index": "tweets"
      }
    }

The code above runs correctly on version 6.5.4.

In this example, the twitter index originally held two types, tweet and user, and each type is moved into an index of its own. First create the two target indices (users and tweets), then complete the migration with two POST _reindex requests, one per type.
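When an index holds more than a couple of types, the per-type reindex bodies can be generated instead of written by hand. The Python below is a minimal sketch; `split_reindex_bodies` is a hypothetical helper, and the resulting bodies still have to be sent to `POST _reindex` one at a time.

```python
import json
from typing import Dict, List


def split_reindex_bodies(source_index: str, type_to_index: Dict[str, str]) -> List[dict]:
    """Build one reindex body per mapping type, each copying a type into its own index."""
    return [
        {
            "source": {"index": source_index, "type": doc_type},
            "dest": {"index": dest_index},
        }
        for doc_type, dest_index in type_to_index.items()
    ]


bodies = split_reindex_bodies("twitter", {"user": "users", "tweet": "tweets"})
for body in bodies:
    print(json.dumps(body))
```

The target indices must already exist with the desired mappings, as in the example above, before the reindex requests are issued.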

7.2 Custom type field

The second example adds a custom type field and sets it to the value of the original _type.

It also prefixes the _id with the type, in case documents of different types have conflicting ids:

    PUT new_twitter
    {
      "mappings": {
        "doc": {
          "properties": {
            "type": { "type": "keyword" },
            "name": { "type": "text" },
            "user_name": { "type": "keyword" },
            "email": { "type": "keyword" },
            "content": { "type": "text" },
            "tweet_at": { "type": "date" }
          }
        }
      }
    }

    POST _reindex
    {
      "source": {
        "index": "twitter"
      },
      "dest": {
        "index": "new_twitter"
      },
      "script": {
        "source": """
          ctx._source.type = ctx._type;
          ctx._id = ctx._type + '-' + ctx._id;
          ctx._type = 'doc';
        """
      }
    }

The code above runs correctly on version 6.5.4.
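To see what the reindex script does to each document, its three assignments can be replayed in plain Python. This is a simulation under assumptions, not the Painless script itself: `retag` is a hypothetical function and the sample documents are made up for illustration.

```python
def retag(ctx: dict) -> dict:
    """Apply the script's three steps to a copy of one document context."""
    out = {"_id": ctx["_id"], "_type": ctx["_type"], "_source": dict(ctx["_source"])}
    out["_source"]["type"] = out["_type"]          # keep the old type as a regular field
    out["_id"] = out["_type"] + "-" + out["_id"]   # prefix the id to avoid collisions
    out["_type"] = "doc"                           # every document ends up in the single type
    return out


# Two sample documents that share the _id "1" under different types.
docs = [
    {"_id": "1", "_type": "user", "_source": {"user_name": "Ergouzi"}},
    {"_id": "1", "_type": "tweet", "_source": {"user_name": "kimchy"}},
]

migrated = [retag(doc) for doc in docs]
print([doc["_id"] for doc in migrated])
```

After the rewrite both documents carry the single type doc, their ids no longer clash, and the original type remains searchable through the custom type field.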

