How to realize Cluster Distribution in ElasticSearch 07/01 Update SLTechnology News&Howtos

How to realize Cluster Distribution in ElasticSearch

2025-07-01 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article shows you how to achieve cluster distribution in ElasticSearch, the content is concise and easy to understand, it will definitely brighten your eyes. I hope you can get something through the detailed introduction of this article.

Index (index)

The word "index" has multiple meanings in the context of ElasticSearch: index (noun): compared to the traditional relational database field, an index is equivalent to a database in SQL. The index is identified by its name, which must be in all lowercase characters, and the document is created, searched, updated, and deleted by referencing this name.

Index (verb): to index a document is to store a document in an index (noun) so that it can be retrieved and queried. This is very similar to the INSERT keyword in the SQL statement, except that the new document replaces the old document when the document already exists.

Inverted index: relational databases increase the speed of data retrieval by adding an "index" such as a B-tree (B-tree) index to a specified column. ElasticSearch and Lucene use a structure called "inverted index" to achieve the same goal.

For example, the relationship between a document and an entry is shown in the following figure:

Figure 1: relationship between document and entry

After the field value is analyzed, it is stored in the inverted index, which stores the relationship between the participle (Term) and the document (Doc). The simplified inverted index is shown below:

Figure 2: inverted index

Type (Type)

The type is a logical category/partition within the index, but its meaning depends entirely on the user's needs. Therefore, one or more types (type) can be defined within an index. In general, types are predefined for documents that have the same domain. Compared with the traditional field of relational database, type is equivalent to "table".

Document (Document)

A document is similar to a complete line of data. In ElasticSearch, the document is represented based on JSON format. The document is the atomic unit of index and search, and it is a container containing one or more fields (Field). Each document can store a different set of domains, but documents of the same type (Type) should have at least some degree of similarity.

Node (Node)

A running ElasticSearch instance is called a node, while a cluster is made up of one or more nodes with the same cluster.name configuration that share the pressure of data and load.

There are three different types of nodes in an ES cluster:

Master node: responsible for managing all cluster-wide changes, such as adding or deleting indexes, or adding or deleting nodes, etc. The master node does not need to involve document-level changes and searches. It can be set through the property node.master.

Data node: stores data and its corresponding inverted index. By default, each node is a data node (including the master node), which can be set through the node.data attribute.

Orchestration node: if both the node.master and node.data attributes are false, this node is called an orchestration node, which is used to respond to customer requests and balance the load of each node.

Fragmentation (Shard)

The data in an index is stored in multiple shards, which is equivalent to a horizontal table. A shard is an example of Lucene, which itself is a complete search engine. Our documents are stored and indexed into shards, but the application interacts directly with the index rather than with the shards.

A shard can be a master shard or a replica shard. Any document in the index belongs to a main shard, so the number of main shards determines the maximum amount of data that the index can store. A replica fragment is just a copy of a master slice. Replica fragmentation serves as a redundant backup to protect data from loss in the event of hardware failure, and provides services for reading operations such as searching and returning documents.

Cluster distributed low-level implementation

Now that we have a preliminary understanding of the basic concepts of ElasticSearch, let's delve into these internal details to help you better understand how data is stored and queried in distributed systems.

ES actually uses sharding to achieve distribution. Sharding is the container of data, the document is stored in the shard, and the shard is assigned to each node in the cluster. When your cluster size expands or shrinks, ES automatically migrates shards in each node, so that the data is still evenly distributed in the cluster.

The number of primary shards has been determined when the index is established, but the number of replica shards can be modified at any time. By default, an index has 5 primary shards, and its copies can have any number.

The state of the main shard and the replica shard determines the health of the cluster. Only the main shard or its corresponding replica shard will be saved on each node, and the same replica shard will not exist in the same node. If there is only one node in the cluster, replica shards will not be allocated. In this case, the health status of the cluster is yellow, and there is a risk of data loss.

Distributed document CRUD

Index new document (Create)

When a user submits a request to a node to index a new document, the node calculates which shard the new document should be added to. Each node stores information about which node each shard is stored in, so the coordinator node sends the request to the corresponding node. Note that this request will be sent to the master shard, and when the master shard finishes indexing, the request will be sent to all its replica shards in parallel to ensure that each shard holds the latest data.

Each time a new document is written, it is first written to memory and the operation is written to a translog file (transaction log). At this time, if a search operation is performed, the new document cannot be indexed.

Figure 3: new documents are written to memory and operations are written to translog

ES performs a refresh operation (refresh) every 1 second (this time can be modified), during which time new documents written to memory are written to a file system cache (filesystem cache) and form a segment. At this time, the documents in this segment can be searched, but have not been written to the hard disk, that is, if there is a power outage, these documents may be lost.

Figure 4: emptying memory after performing a refresh and writing new documents to the file system cache

If new documents are constantly written, this process will be repeated over and over again. A new segment will be generated every second, and the translog file will become larger and larger.

Figure 5:translog continues to add new document records

Every 30 minutes or the translog file becomes very large, a fsync operation is performed. At this point, all segment in the file system cache will be written to disk, and the translog will be deleted (a new translog will be generated later).

Figure 6: segment writes to disk after fsync, emptying memory and translog

As can be seen from the above process, documents stored in memory and file system cache between two fsync operations are insecure and will be lost in the event of a power outage. So ES introduces translog to record all operations between the two fsync, so that the machine recovers from the failure or restarts, and the ES can be restored according to the translog.

Of course, translog itself is also a file, which exists in memory and will be lost in the event of a power outage. As a result, ES writes translog to disk every 5 seconds or after a write request completes. It can be considered that an operation on a document is safe and recoverable once it is written to disk, so only when the current operation record is written to disk will ES return the result of the successful operation to the client that sent the operation request.

In addition, since a new segment is generated every second, there will soon be a large number of segment. For a query request for a shard, all segment in the shard will be queried in turn, which will reduce the efficiency of the search. So ES automatically starts the work of merging segment, merging a portion of segment of similar size into a new large segment. The process of merging is actually creating a new segment, and when the new segment is written to disk, all the merged old segment is cleared.

Figure 7: merge segment

Figure 8: delete the old segment after the merge is complete, and the new segment is available for search

Update (Update) and delete (Delete) documents

The index of ES cannot be modified, so update and delete operations are not performed directly on the original index.

The segment on each disk maintains a del file that records the deleted files. Whenever the user makes a delete request, the document is not actually deleted and the index is not changed, but marks the document as deleted in the del file. Therefore, the deleted document can still be retrieved, but it is filtered out when the search result is returned. Documents that are marked for deletion are actually deleted each time the segment merge is started.

Updating the document will first find the original document and get the version number of the document. The modified document is then written to memory, the same process as writing a new document. At the same time, the old version of the document is marked for deletion, and in the same way, the document can be searched but eventually filtered out.

Read operation (Read): query process

The process of query is generally divided into two stages: query (query) and retrieval (fetch). The task of this node is to broadcast query requests to all relevant shards and integrate their responses into a globally sorted result set, which is returned to the client.

Query phase

When a node receives a search request, the node becomes a coordinator node.

Query process distributed search

Figure 9: query process distributed search

The first step is to broadcast the fragmented copy of the request to each node in the index. The query request can be processed by a master shard or a replica shard, and the coordinator node will poll all shard copies in subsequent requests to share the load.

Each shard will build a priority queue locally. If the client requires that the number of result sets starting from the from name in the result sort is size, then each node needs to generate a result set of the size of from+size, so the size of the priority queue is also from+size. The shard returns only a lightweight result to the orchestrating node, containing the ID of each document in the result set and the information needed to sort.

The coordinating node will summarize the results of all the fragments and sort them globally to get the final query sorting result. The query phase ends.

Retrieval phase

The query process gets a sort result that marks which documents meet the search requirements, and these documents still need to be obtained and returned to the client.

The coordinator node determines the document that actually needs to be returned and sends a get request to the shard containing the document; the shard acquires the document and returns it to the coordinator node; the coordinator node returns the result to the client

The above is how to achieve cluster distribution in ElasticSearch. Have you learned any knowledge or skills? If you want to learn more skills or enrich your knowledge reserve, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.