Analysis of distributed examples in Elasticsearch

Updated: 2025-07-11. Source: SLTechnology News&Howtos (Shulou.com), 06/01 report.

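This article analyzes how Elasticsearch behaves as a distributed system, with examples. It covers availability and scalability, nodes, shards and replicas, cluster status, document routing, the split-brain problem, and the refresh, translog, flush, and segment merge mechanisms behind near-real-time search.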

High availability and scalability

Service availability: the cluster keeps serving requests even when some nodes stop.

Data availability: no data is lost even when some nodes are lost.

Scalability

As the number of requests or the volume of data keeps growing, the cluster can scale horizontally by adding nodes.

Node

A node is a running instance of Elasticsearch.

It is essentially a Java process.

Shards and replicas

Shard characteristics

A shard stores part of an index's data and can be placed on any node.

The number of primary shards is specified when the index is created and cannot be changed afterwards. The default was 5 up to 6.x and is 1 from 7.0 onward.

Shards are either primary or replica; replica shards are what make the data highly available.

Replica shard data is synchronized from the primary shard, and replicas also improve read throughput.

Worked example

Cluster setup: three nodes, node1, node2, and node3, with node1 as the master node.

Create the index with 3 primary shards and 1 replica of each:

    PUT test_index
    {
      "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1
      }
    }

Resulting distribution of primary (P) and replica (R) shards:

    node1: P0, R1
    node2: P1, R2
    node3: P2, R0
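As a rough illustration of the example above, the following Python sketch builds the same create-index request and counts how many shard copies the cluster has to place. The helper names `create_index_request` and `total_shard_copies` are hypothetical, and nothing is sent to a cluster; the sketch only shows the shape of the request.

```python
import json


def create_index_request(index, shards, replicas):
    """Build (method, path, body) for a create-index REST call.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    body = {
        "settings": {
            "number_of_shards": shards,      # primary shards, fixed at creation
            "number_of_replicas": replicas,  # replica copies per primary shard
        }
    }
    return "PUT", "/" + index, json.dumps(body, indent=2)


def total_shard_copies(shards, replicas):
    """Each primary has `replicas` copies, so P * (1 + R) shards are placed."""
    return shards * (1 + replicas)


method, path, body = create_index_request("test_index", 3, 1)
print(method, path)
print(body)
print(total_shard_copies(3, 1))
```

With 3 primaries and 1 replica there are 6 shard copies in total, which is why each of the three nodes in the example holds one primary and one replica.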

Questions

1. In the distributed environment above, does adding nodes increase the data capacity of test_index? No. The index has only 3 primary shards and they are already spread across the 3 nodes, so new nodes cannot take on any of its data.

2. Does increasing the number of replicas improve the read throughput of test_index? No. The new replicas are still placed on the same three nodes and use the same resources.

3. Suggestion: choose the shard count with care. With too few shards, the index cannot scale horizontally by adding nodes; with too many, several shards end up on one node and resources are wasted.

Cluster status

Green: healthy. All primary and replica shards are allocated.

Yellow: all primary shards are allocated, but at least one replica shard is not.

Red: at least one primary shard is unassigned.
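The three rules can be written down as a small decision function. This is a minimal sketch, assuming a hypothetical `Shard` record with a `primary` flag and an `assigned` flag; it mirrors the rules as stated here and is not how Elasticsearch itself computes health.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    number: int
    primary: bool   # True for a primary shard, False for a replica
    assigned: bool  # True if the shard is allocated to some node


def cluster_health(shards):
    """Return 'red', 'yellow' or 'green' for a list of Shard records."""
    if any(s.primary and not s.assigned for s in shards):
        return "red"      # some primary shard is unassigned
    if any(not s.primary and not s.assigned for s in shards):
        return "yellow"   # primaries are fine, some replica is unassigned
    return "green"        # every primary and replica is allocated


healthy = [Shard(n, primary, True) for n in range(3) for primary in (True, False)]
lost_replica = healthy[:-1] + [Shard(2, False, False)]
lost_primary = [Shard(0, True, False)] + healthy[1:]

print(cluster_health(healthy))
print(cluster_health(lost_replica))
print(cluster_health(lost_primary))
```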

Fail-over

Distributed document storage

Mapping from document to shard

Documents should be distributed evenly across the shards so that the resources of every node are fully used.
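In simplified form, Elasticsearch picks the shard by hashing a routing value (the document `_id` by default) and taking the remainder modulo the number of primary shards. The sketch below illustrates that idea only: `route_to_shard` is a hypothetical name, and `zlib.crc32` is a stand-in for the Murmur3 hash Elasticsearch actually uses, chosen here because it is deterministic and in the standard library.

```python
import zlib
from collections import Counter


def route_to_shard(routing, number_of_primary_shards):
    """Map a routing value to a shard number (simplified routing formula)."""
    # Stand-in hash: crc32 instead of Murmur3, for illustration only.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = [f"doc-{i}" for i in range(3000)]

# How many documents land on each of 3 primary shards.
counts = Counter(route_to_shard(doc_id, 3) for doc_id in doc_ids)
print(sorted(counts.items()))

# How many documents would map to a different shard with 4 primaries.
moved = sum(route_to_shard(d, 3) != route_to_shard(d, 4) for d in doc_ids)
print(moved, "of", len(doc_ids), "documents would change shard")
```

Because the result depends on the number of primary shards, changing that number would send existing documents to different shards, which is one way to see why the primary shard count is fixed at index creation.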

Split-brain problem

Split brain

Two master nodes in the same cluster each maintain a different cluster state, and a correct master cannot be elected after the network recovers.

Worked example
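One widely used safeguard against split brain is to allow a master to be elected only by a majority (quorum) of the master-eligible nodes. Before 7.0 this was configured through `discovery.zen.minimum_master_nodes`; from 7.0 onward the cluster manages the quorum itself. The sketch below shows only the arithmetic, with hypothetical function names, for a three-node cluster that a network failure splits into groups of 1 and 2.

```python
def quorum(master_eligible_nodes):
    """Smallest majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


def can_elect_master(partition_size, master_eligible_nodes):
    """A network partition may elect a master only if it holds a quorum."""
    return partition_size >= quorum(master_eligible_nodes)


total = 3
print(quorum(total))
print(can_elect_master(1, total))  # the isolated node
print(can_elect_master(2, total))  # the two nodes that still see each other
```

Since two disjoint groups cannot both hold a majority, at most one side can elect a master, so two masters with diverging cluster states cannot coexist.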

Real-time document search

Refresh

Writing a segment to disk takes time. With the help of the file system cache, a segment is first written to the cache and opened for search right away; this step is called refresh.

Before a refresh, new documents are held in an in-memory index buffer. The refresh empties that buffer and turns the documents in it into a new segment.

Translog

The translog solves the problem of a node going down while segments in memory have not yet been written to disk.

When a document is written to the buffer, the operation is written to the translog at the same time. In 6.x, the translog is by default written to disk on every request.

When Elasticsearch starts, it checks the translog file and recovers data from it.

Flush

Flush is responsible for writing the segments held in memory to disk. A flush performs the following steps:

Empties the index buffer and generates a new segment from its documents, which is equivalent to a refresh

Updates the commit point and writes it to disk

Performs an fsync to write the in-memory segments to disk

Deletes the old translog
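The interplay of buffer, refresh, translog, and flush can be followed in a toy model. The class below is a deliberately simplified sketch, not Elasticsearch code: `ToyShard` and its attributes are hypothetical names, segments are plain tuples, and the commit point is left out.

```python
class ToyShard:
    """Toy model of one shard's write path."""

    def __init__(self):
        self.buffer = []           # in-memory index buffer, not searchable
        self.cached_segments = []  # segments in file system cache, searchable
        self.disk_segments = []    # fsynced segments, survive a crash
        self.translog = []         # operation log, assumed durable on disk

    def index(self, doc):
        # A write goes to the buffer and to the translog at the same time.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Empty the buffer into a new searchable segment.
        if self.buffer:
            self.cached_segments.append(tuple(self.buffer))
            self.buffer = []

    def flush(self):
        # Refresh, persist the cached segments, then drop the old translog.
        self.refresh()
        self.disk_segments.extend(self.cached_segments)
        self.cached_segments = []
        self.translog = []

    def search(self):
        segments = self.disk_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]

    def crash_and_recover(self):
        # Memory is lost; operations are replayed from the translog.
        pending = list(self.translog)
        self.buffer, self.cached_segments, self.translog = [], [], []
        for doc in pending:
            self.index(doc)
        self.refresh()


shard = ToyShard()
shard.index("a")
before_refresh = shard.search()   # "a" is only in the buffer
shard.refresh()
after_refresh = shard.search()    # "a" is now in a searchable segment
shard.index("b")
shard.flush()                     # segments persisted, translog cleared
shard.index("c")
shard.crash_and_recover()         # "c" comes back from the translog
print(before_refresh, after_refresh, shard.search())
```

In this model a document becomes searchable at refresh and durable as a segment at flush; between the two, the translog is what lets it survive a crash.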

Delete and update documents

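Once a segment has been created, it cannot be changed. So how are documents deleted and updated? A delete only marks the document as deleted: it is filtered out of search results and physically removed later, when segments are merged. An update is a delete of the old version followed by indexing the new version into a new segment.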

Segment merge

Elasticsearch regularly merges segments in the background to reduce the number of segments.

A merge can also be forced manually through the _forcemerge API.
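Because segments are immutable, a delete is recorded as a marker and an update is a delete followed by a new version in a later segment; a merge is the point at which marked documents actually disappear. The sketch below illustrates this under simplifying assumptions: segments are plain dicts from document id to content, delete markers are `(segment number, document id)` pairs, and `merge_segments` is a hypothetical name.

```python
def merge_segments(segments, deleted):
    """Merge segments (oldest first) into one, skipping marked documents.

    segments: list of dicts mapping document id -> content.
    deleted:  set of (segment_number, document_id) delete markers.
    """
    merged = {}
    for segment_number, segment in enumerate(segments):
        for doc_id, content in segment.items():
            if (segment_number, doc_id) not in deleted:
                merged[doc_id] = content
    return merged


segments = [
    {"1": "v1", "2": "x"},  # oldest segment
    {"1": "v2"},            # document 1 was updated: new version written here
    {"3": "y"},
]
# Old version of document 1 and document 2 are marked as deleted.
deleted = {(0, "1"), (0, "2")}

merged = merge_segments(segments, deleted)
print(merged)
```

Three segments become one, and the space held by the two marked documents is reclaimed.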
