What is the storage route for Elasticsearch documents?

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

How does Elasticsearch decide where a document is stored? This article analyzes the question and explains the approach Elasticsearch takes.

The discussion is based on Elasticsearch 7.9.2.

Problem

When you create a document in an index, how is the shard that stores it determined?

There are three possible approaches:

1. Random: put the document on a randomly chosen shard.

2. Stored mapping: save the correspondence between each document id and its shard number in a database.

3. Computation instead of storage: calculate the shard number from the document with a fixed algorithm.

With the first approach:

Creating a document is easy (just pick a random number), but a query for a document has to search multiple shards to find it.

With the second approach:

The implementation is simple, direct and reliable, but with a large amount of data the mapping table becomes very large and lookups become relatively slow.

With the third approach:

A calculation is needed both when creating and when querying a document, but there is no mapping to maintain.
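The trade-off between the first and third approaches can be sketched with a toy in-memory model. This is not Elasticsearch code: shards are plain dicts, `NUM_SHARDS` is an assumed value, and `toy_hash` (CRC32 from Python's `zlib`) is a hypothetical stand-in for whatever hash a real system would use. The second approach (a stored id-to-shard table) is left out for brevity.

```python
import random
import zlib

NUM_SHARDS = 4  # assumed shard count for this illustration


def toy_hash(key: str) -> int:
    """Stand-in hash (CRC32); not the hash Elasticsearch uses."""
    return zlib.crc32(key.encode("utf-8"))


def put_random(shards, doc_id, doc, rng):
    """Approach 1: place the document on a randomly chosen shard."""
    shards[rng.randrange(len(shards))][doc_id] = doc


def get_random(shards, doc_id):
    """Approach 1 lookup: shards are checked one by one until a hit."""
    checked = 0
    for shard in shards:
        checked += 1
        if doc_id in shard:
            return shard[doc_id], checked
    return None, checked


def put_computed(shards, doc_id, doc):
    """Approach 3: the shard number is computed from the id."""
    shards[toy_hash(doc_id) % len(shards)][doc_id] = doc


def get_computed(shards, doc_id):
    """Approach 3 lookup: recompute the shard number, check one shard."""
    return shards[toy_hash(doc_id) % len(shards)].get(doc_id), 1


if __name__ == "__main__":
    rng = random.Random(0)
    random_shards = [{} for _ in range(NUM_SHARDS)]
    computed_shards = [{} for _ in range(NUM_SHARDS)]
    for i in range(100):
        put_random(random_shards, f"doc-{i}", {"n": i}, rng)
        put_computed(computed_shards, f"doc-{i}", {"n": i})
    # Each call returns (document, number of shards checked).
    print(get_random(random_shards, "doc-42"))
    print(get_computed(computed_shards, "doc-42"))
```

In this model a random-placement lookup may touch every shard, and a missing id always does, while the computed lookup touches exactly one.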

The implementation of ES

ES takes the third approach.

PUT /<index>/_doc/<id>?routing=<routing>

With a request like the one above, which specifies both an id and a routing value, the document is placed on the shard whose number is:

es_hash(routing) % number_of_shards

If routing is not specified, the document id is used as the routing value in the calculation.
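The rule above can be written as a small function. This is a minimal sketch, not Elasticsearch's implementation: `stand_in_hash` (CRC32) is a hypothetical replacement for `es_hash`, so the shard numbers it produces will not match a real cluster. Only the shape of the calculation, including the fallback from routing to id, follows the text.

```python
import zlib
from typing import Optional


def stand_in_hash(value: str) -> int:
    """Hypothetical stand-in for es_hash; Elasticsearch uses its own hash."""
    return zlib.crc32(value.encode("utf-8"))


def shard_for(doc_id: str, num_shards: int, routing: Optional[str] = None) -> int:
    """Return hash(routing) % num_shards, with routing defaulting to the id."""
    effective_routing = routing if routing is not None else doc_id
    return stand_in_hash(effective_routing) % num_shards


if __name__ == "__main__":
    # Without routing, the id decides the shard.
    print(shard_for("aaa", num_shards=5))
    # With routing, only the routing value matters; the id is ignored.
    print(shard_for("aaa", num_shards=5, routing="myrk"))
```

Two documents with different ids but the same routing value always land on the same shard in this model, because the id never enters the calculation once routing is given.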

A document created with an explicit routing value carries a _routing field:

{
  "_index": "myindex",
  "_id": "aaa",
  "_routing": "myrk",
  "_source": // other fields
}

Potential problems with routing and how to avoid them

Assume that:

An index has n >= 2 shards.

es_hash("a") % n == 0

es_hash("r") % n == 1

Execute the following requests in order:

PUT /the_index/_doc/a
PUT /the_index/_doc/a?routing=r

Two documents with id a now exist in ES, one on shard 0 and one on shard 1. This is certainly not what users expect.

The reason is that an ES shard is actually a Lucene index. Uniqueness of a document id is guaranteed within a single Lucene index, but not across multiple Lucene indexes.
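The duplicate-id effect can be reproduced with a toy model in which every shard is a dict keyed by document id. Everything here is hypothetical illustration: `ToyIndex`, `routing_on_other_shard`, and the CRC32 stand-in hash are not part of Elasticsearch. Because the stand-in hash differs from `es_hash`, the routing value is searched for rather than assumed to be "r".

```python
import zlib


def stand_in_hash(value):
    """Hypothetical stand-in for es_hash (CRC32)."""
    return zlib.crc32(value.encode("utf-8"))


class ToyIndex:
    """In-memory model: each shard is a dict keyed by document id."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, doc_id, routing=None):
        key = doc_id if routing is None else routing
        return stand_in_hash(key) % len(self.shards)

    def put(self, doc_id, source, routing=None):
        # Uniqueness of doc_id is only enforced inside the chosen shard.
        self.shards[self.shard_for(doc_id, routing)][doc_id] = source

    def count(self, doc_id):
        """Number of shards that hold a document with this id."""
        return sum(1 for shard in self.shards if doc_id in shard)


def routing_on_other_shard(index, doc_id):
    """Find a routing value that maps to a different shard than the id does."""
    home = index.shard_for(doc_id)
    for i in range(1000):
        candidate = f"r{i}"
        if index.shard_for(doc_id, candidate) != home:
            return candidate
    raise RuntimeError("no suitable routing value found")


if __name__ == "__main__":
    index = ToyIndex(num_shards=2)
    routing = routing_on_other_shard(index, "a")
    index.put("a", {"v": 1})                   # routed by id
    index.put("a", {"v": 2}, routing=routing)  # routed by routing value
    print(index.count("a"))  # 2: the same id now lives on two shards
```

If both writes use the same routing value, the second write overwrites the first inside a single shard and the count stays at 1.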

How to avoid it:

For a given document id, be consistent: either always use routing (with the same routing value) or never use it.
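One way to keep this consistent on the client side is to derive the routing value from a single rule per index and build every request path through it. The sketch below is an assumption about how such a helper might look, not an Elasticsearch API: `doc_path` and `RoutingPolicy` are hypothetical names, and the rule "routing is the part of the id before the colon" is only an example.

```python
from urllib.parse import quote, urlencode


def doc_path(index, doc_id, routing=None):
    """Build the /<index>/_doc/<id> path, adding ?routing=... when given."""
    path = f"/{quote(index, safe='')}/_doc/{quote(doc_id, safe='')}"
    if routing is not None:
        path += "?" + urlencode({"routing": routing})
    return path


class RoutingPolicy:
    """Hypothetical helper: one rule per index decides the routing for an id."""

    def __init__(self, index, routing_for=None):
        self.index = index
        self.routing_for = routing_for  # None means "never use routing"

    def path(self, doc_id):
        """Return the same path for every request that targets doc_id."""
        routing = self.routing_for(doc_id) if self.routing_for else None
        return doc_path(self.index, doc_id, routing)


if __name__ == "__main__":
    # Example rule (an assumption): route by the prefix before ":".
    routed = RoutingPolicy("the_index", routing_for=lambda d: d.split(":")[0])
    print(routed.path("user1:order9"))

    # An index that never uses routing.
    unrouted = RoutingPolicy("the_index")
    print(unrouted.path("a"))
```

Since the routing value is a pure function of the id, a write and a later read of the same id cannot disagree about whether routing is used or which value it has.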

That answers the question of how Elasticsearch stores and routes documents.
