
Operation of Elasticsearch sharding

2025-07-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article introduces how sharding works in Elasticsearch (ES). Many people run into difficulties with shards in real deployments, so the sections below cover what shards are, how to choose their number, and how to target specific shards in a query.

The importance of sharding

All data in ES is distributed across the shards held by the nodes of the cluster. How shards are configured affects the performance, safety, and stability of ES, so it is worth understanding.

What is a shard?

Simply put, a shard is a block of files holding part of the data in ES, and it is the smallest unit of data storage. The core job of an ES cluster is to distribute, index, load-balance, and route all of its shards at high speed.

Example scenario:

Suppose that IndexA has 2 shards and we insert 10 documents into it. The 10 documents are divided between the shards as evenly as possible: about 5 are stored in the first shard and the remaining 5 in the other shard.

If you are familiar with relational databases, this is similar to the concept of table partitioning in mainstream relational databases.
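
The distribution in the scenario above can be sketched in a few lines of Java. Elasticsearch documents its routing as roughly shard = hash(routing) % number_of_primary_shards, where the routing value defaults to the document ID. The sketch below is not Elasticsearch code: String.hashCode() is an assumed stand-in for the real hash function, and the class name and document IDs are hypothetical.

```java
import java.util.Arrays;

// Minimal sketch of hash-based document routing (not Elasticsearch's actual code).
class ShardRoutingSketch {

    // Map a routing key (by default the document ID) to a shard number.
    // ASSUMPTION: String.hashCode() stands in for the hash Elasticsearch really uses.
    static int shardFor(String routingKey, int numberOfShards) {
        return Math.floorMod(routingKey.hashCode(), numberOfShards);
    }

    // Count how many of the given document IDs land on each shard.
    static int[] distribute(String[] docIds, int numberOfShards) {
        int[] counts = new int[numberOfShards];
        for (String id : docIds) {
            counts[shardFor(id, numberOfShards)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] docIds = new String[10];
        for (int i = 0; i < docIds.length; i++) {
            docIds[i] = "doc-" + i; // hypothetical document IDs
        }
        // IndexA from the example: 2 primary shards, 10 documents.
        System.out.println(Arrays.toString(distribute(docIds, 2)));
    }
}
```

Because the shard number depends on the shard count, the same document would map to a different shard if the count changed. The split is only approximately even, since it depends on how the keys hash.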

Setting the number of shards

When creating an index, you can set the number of shards in the index settings as follows (indexname is a placeholder; Elasticsearch index names must be lowercase):

PUT indexname {"settings": {"number_of_shards": 5}}
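
The same request can be built from Java. This is a minimal sketch using the JDK's java.net.http API (Java 11 or later), not an official Elasticsearch client. The base URL http://localhost:9200, the index name, and the class and method names are assumptions for illustration. The request is only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Minimal sketch: build (but do not send) the index-creation request.
class CreateIndexRequestSketch {

    // JSON body that sets the number of primary shards.
    static String settingsBody(int numberOfShards) {
        return "{\"settings\": {\"number_of_shards\": " + numberOfShards + "}}";
    }

    // ASSUMPTION: baseUrl points at a reachable cluster, e.g. http://localhost:9200.
    static HttpRequest createIndexRequest(String baseUrl, String indexName, int numberOfShards) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + indexName))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(numberOfShards)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = createIndexRequest("http://localhost:9200", "indexname", 5);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody(5));
    }
}
```

To actually send the request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a running cluster.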

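Note

After the index has been created, its number of primary shards cannot be changed in place. Changing it means creating a new index and reindexing the data. The number of replicas, by contrast, can be changed at any time.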

Number of shards (data node calculation)

Is it better to have as many shards as possible, or as few as possible? The answer depends on the amount of data in the entire index.

Example scenario:

If all the data files in IndexA total 300 GB, how should the layout be planned?

Suggestions (for reference only):

1. Keep the data files of each shard smaller than 30 GB

2. Each shard of an index corresponds to one data node

3. The number of nodes is greater than or equal to the number of shards (not counting replica shards)

According to these suggestions, at least 14 nodes are required: 11 data nodes and 3 master nodes.

Result: build 11 data nodes (Node) and set the number of shards to 10 in the index settings. This gives one shard per node, with each shard holding about 30 GB of data. The one extra data node is there for robustness and room to expand.

SN (number of shards) = IS (index size in GB) / 30

NN (number of nodes) = SN (number of shards) + MNN (number of master nodes [holding no data]) + NNN (number of load-balancing nodes)
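
The two formulas above can be checked with a small calculation. This is a minimal sketch, assuming that the division is rounded up to a whole shard and that the extra data node from the example is counted in the last term. The class and method names are hypothetical.

```java
// Minimal sketch of the sizing formulas SN = IS / 30 and NN = SN + MNN + NNN.
class ShardSizingSketch {

    // Suggested upper bound for one shard's data, in GB (from the article).
    static final int MAX_SHARD_SIZE_GB = 30;

    // SN: number of shards, rounded up so that no shard exceeds the bound.
    static int shardCount(int indexSizeGb) {
        return (indexSizeGb + MAX_SHARD_SIZE_GB - 1) / MAX_SHARD_SIZE_GB;
    }

    // NN: one data node per shard, plus master nodes, plus any extra nodes.
    static int nodeCount(int indexSizeGb, int masterNodes, int extraNodes) {
        return shardCount(indexSizeGb) + masterNodes + extraNodes;
    }

    public static void main(String[] args) {
        // The 300 GB example: 3 master nodes and 1 spare node.
        System.out.println("SN = " + shardCount(300));      // prints "SN = 10"
        System.out.println("NN = " + nodeCount(300, 3, 1)); // prints "NN = 14"
    }
}
```

For the 300 GB index this gives 10 shards and 14 nodes in total, matching the figures in the example.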

Querying specific shards

We can direct a query to specific shards or nodes to make ES queries even faster. This is controlled by the preference setting, which has the following values.

1: randomize across shards

Shards are selected at random to serve the query. This is the default mode of ES.

2:_local

The query prefers shards on the local node and then falls back to shards on other nodes. Querying local shards avoids network IO but may cause load imbalance. The results are complete.

3:_primary

The query runs only on primary shards and never on replicas. The results are generally complete.

4:_primary_first

The query prefers primary shards and falls back to replicas if a primary shard is unavailable. The results are generally complete.

5:_only_node

The query runs only on shards held by the node with the specified ID. The results may be incomplete.

6:_prefer_node

The query prefers shards on the specified node. The results are generally complete.

7:_shards

The query runs only on the specified shards. The results may be incomplete.

8:_only_nodes

This value would let you specify multiple nodes to query. ES does not provide it out of the box in the version discussed here, so supporting it requires changing the source code.
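
The preference values listed above are plain strings, some of which take an argument after a colon. The following minimal sketch shows how the parameterized ones could be assembled. The class and method names are hypothetical helpers, not part of the Elasticsearch client, and the node ID is just an example value.

```java
import java.util.StringJoiner;

// Minimal sketch: helpers that build preference strings.
// These helpers are illustrative only and are not part of any Elasticsearch API.
class PreferenceSketch {

    // Builds "_shards:0,1,2" from the given shard numbers.
    static String shards(int... shardIds) {
        if (shardIds.length == 0) {
            throw new IllegalArgumentException("at least one shard number is required");
        }
        StringJoiner joiner = new StringJoiner(",", "_shards:", "");
        for (int id : shardIds) {
            joiner.add(Integer.toString(id));
        }
        return joiner.toString();
    }

    // Builds "_only_node:<nodeId>".
    static String onlyNode(String nodeId) {
        return "_only_node:" + nodeId;
    }

    // Builds "_prefer_node:<nodeId>".
    static String preferNode(String nodeId) {
        return "_prefer_node:" + nodeId;
    }

    public static void main(String[] args) {
        System.out.println(shards(0, 1, 2));
        System.out.println(onlyNode("ZYYWXGZCSkSL7QD0bDVxYA"));
        System.out.println(preferNode("ZYYWXGZCSkSL7QD0bDVxYA"));
    }
}
```

The resulting string is what would be passed to setPreference on a search request.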

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.junit.Test;

/* Specify a shard preference for a query. */
/* transportClient and index are fields of the enclosing test class. */
@Test
public void testPreference() {
    SearchResponse searchResponse = transportClient.prepareSearch(index)
            .setTypes("add")
            // Alternative preference values; enable one at a time:
            // .setPreference("_local")
            // .setPreference("_primary")
            // .setPreference("_primary_first")
            // .setPreference("_only_node:ZYYWXGZCSkSL7QD0bDVxYA")
            // .setPreference("_prefer_node:ZYYWXGZCSkSL7QD0bDVxYA")
            .setPreference("_shards:0,1,2")
            .setQuery(QueryBuilders.matchAllQuery())
            .setExplain(true)
            .get();

    // Print the total hit count, then the source of each returned document.
    SearchHits hits = searchResponse.getHits();
    System.out.println(hits.getTotalHits());
    SearchHit[] hits2 = hits.getHits();
    for (SearchHit h : hits2) {
        System.out.println(h.getSourceAsString());
    }
}

This concludes "Operation of Elasticsearch sharding". Thank you for reading.
