What is the configuration of the ElasticSearch production environment


2025-07-03 Update. From: SLTechnology News&Howtos


This article explains how to configure an ElasticSearch production environment: the core concepts, the cluster node types, how data is distributed, and the hardware and memory settings to plan for.

ElasticSearch concepts compared with a relational database:

Index: can be thought of as a schema in a database.

Type: a collection of documents that logically share the same format, similar to a table in a database.

Document: a concrete instance of the entity being described, corresponding to a row in a database.

Field: documents are made up of fields organized as JSON key-value pairs. A field can be an object type, an array type, or a core data type, and corresponds to a database column.
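To make the analogy above concrete, here is a minimal sketch in Python. The index name, type name, document ID, and field names are all hypothetical, and the /index/type/id path layout is assumed from the older type-based ElasticSearch releases this article describes.

```python
import json

# Hypothetical example document: every name and value here is made up.
document = {
    "title": "ElasticSearch in production",  # core data type (string)
    "tags": ["search", "ops"],               # array type
    "author": {"name": "example"},           # object type
}

# Rough relational analogy taken from the list above.
ANALOGY = {
    "index": "schema",
    "type": "table",
    "document": "row",
    "field": "column",
}


def describe(index: str, doc_type: str, doc_id: str, doc: dict) -> str:
    """Describe where a document lives, in both vocabularies."""
    return (
        f"/{index}/{doc_type}/{doc_id} -> "
        f"{ANALOGY['index']}={index}, {ANALOGY['type']}={doc_type}, "
        f"{ANALOGY['document']}={doc_id}, "
        f"{ANALOGY['field']}s={sorted(doc)}"
    )


print(describe("blog", "article", "1", document))
print(json.dumps(document, indent=2))
```

The nested "author" object and the "tags" array show that a field is not limited to a flat scalar value, which is where the column analogy stops being exact.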

Cluster node types:

Master node: responsible for coordinating cluster-wide changes across nodes. These changes include creating and deleting indexes, managing mappings, adding and removing nodes, and reallocating shards.

Data node: stores the Lucene indexes. It handles document insertion in ElasticSearch and serves users' query requests.

Client node: the equivalent of a load balancer. It parses the HTTP request and forwards it to the appropriate data node, which takes request parsing and forwarding off the master and data nodes. The client node also merges the intermediate results returned by each node and returns the final result to the user. Client nodes are not required in an ES cluster, but if you use them, HTTP should be disabled on the other nodes so that they communicate only over ES's internal transport protocol.

Tribe node: a tribe node can bridge multiple clusters. It can act as a load balancer between two clusters and gives clients a single point of access to several back-end clusters.
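The master, data, and client roles above can be sketched as combinations of two boolean settings. This is only an illustration: it assumes the `node.master` and `node.data` setting names used by older ElasticSearch releases (newer versions configure roles differently), the helper names are hypothetical, and tribe nodes are left out because they are configured separately.

```python
def node_settings(role: str) -> dict:
    """Return role-related settings for one node.

    Assumption: older-style elasticsearch.yml keys (node.master / node.data).
    """
    roles = {
        "master": {"node.master": True, "node.data": False},
        "data": {"node.master": False, "node.data": True},
        # A client node neither holds data nor is master-eligible.
        "client": {"node.master": False, "node.data": False},
    }
    if role not in roles:
        raise ValueError(f"unknown role: {role!r}")
    return dict(roles[role])


def to_yaml_lines(settings: dict) -> list:
    """Format settings as flat 'key: value' lines for a YAML config file."""
    return [f"{key}: {str(value).lower()}" for key, value in settings.items()]


for role in ("master", "data", "client"):
    print(f"# {role} node")
    print("\n".join(to_yaml_lines(node_settings(role))))
```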

Data distribution:

Shard: an ES index can be split into several data subsets that are stored on different nodes. Each subset is a shard. A shard is an independent storage unit on a single ES data node, and each shard is in fact a complete Lucene index.

Replica: replicas are the failover mechanism that ES provides. Besides failover, replicas can also serve queries. If your application is under heavy traffic, adding hardware to scale out to more nodes and giving each shard and replica a node of its own improves query throughput through greater parallelism. Note: the more replicas you have, the slower it is to insert documents into the index.
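The relationship between shards, replicas, and nodes is simple arithmetic, sketched below. The function names and the example numbers are illustrative only and are not taken from any ElasticSearch API.

```python
import math


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Each primary shard exists once, plus once per configured replica."""
    return primaries * (1 + replicas)


def max_copies_per_node(primaries: int, replicas: int, data_nodes: int) -> int:
    """Largest number of shard copies one node holds if spread evenly."""
    return math.ceil(total_shard_copies(primaries, replicas) / data_nodes)


# Example: 5 primary shards with 1 replica each.
print(total_shard_copies(5, 1))        # 10 shard copies in total
print(max_copies_per_node(5, 1, 10))   # 1: every copy gets its own node
print(max_copies_per_node(5, 1, 4))    # 3: some nodes serve several copies
```

With 10 data nodes each copy gets an exclusive node, which is the layout the paragraph above recommends for query-heavy workloads. Every extra replica also means one more copy to write on each insert, which is why indexing slows down as replicas are added.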

Production environment configuration (hardware):

Memory: ES consumes a lot of memory. To avoid OutOfMemory errors and the problems that follow from them, and to keep queries efficient, memory is the first thing to consider when planning hardware. Between 16 GB and 64 GB of RAM is recommended.

CPU: the CPU clock rate does not have much impact on ES performance. Document insertion and search performance depend on the number of concurrent threads, and that in turn depends on the number of CPU cores. A production environment should have 4 to 8 CPU cores. For ES, more cores matter more than faster cores.

Disk: ES is extremely sensitive to I/O when inserting or querying documents, and I/O has an even greater effect when bulk loading data. SSDs with high IOPS are the best choice. (Note: you may think that frequent updates will shorten the life of an SSD, but a Lucene index is immutable. An update is a delete followed by an insert, so ES never rewrites existing index data in place.)

Network: it is best not to spread a cluster across data centers. Note: cluster query performance is determined by the worst-performing host in the cluster. Consider using spare low-spec machines as client nodes, or as combined client and master nodes.
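The hardware guidance above can be turned into a quick sanity check for a candidate host. This is a minimal sketch: the thresholds come straight from the recommendations in this section, and the function name and parameters are hypothetical.

```python
def check_host(ram_gb: int, cpu_cores: int, has_ssd: bool) -> list:
    """Return warnings for a host that falls outside the recommended ranges."""
    warnings = []
    if ram_gb < 16:
        warnings.append("RAM below the recommended 16 GB minimum")
    elif ram_gb > 64:
        warnings.append("RAM above the recommended 64 GB maximum")
    if cpu_cores < 4:
        warnings.append("fewer than 4 CPU cores")
    if not has_ssd:
        warnings.append("no SSD: expect lower IOPS for indexing and queries")
    return warnings


print(check_host(ram_gb=32, cpu_cores=8, has_ssd=True))   # no warnings
print(check_host(ram_gb=8, cpu_cores=2, has_ssd=False))   # three warnings
```

A host that fails these checks is not necessarily useless. As noted above, a low-spec machine may still work as a client node or a combined client and master node.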

Other parameter configuration

Memory configuration: by default, an ES node's heap is 1 GB. Because caching data in memory greatly speeds up filtering, sorting, and faceting, this parameter must be set explicitly. As a rule, give ES at most half of the physical memory as heap and leave the other half to Lucene, which caches index and field data through the operating system's file cache rather than the ES heap.
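The half-of-physical-memory rule can be sketched as a small calculation. The 31 GB cap is an assumption added here and does not come from this article: it reflects the commonly cited guidance to keep the JVM heap below roughly 32 GB. `-Xms` and `-Xmx` are the standard JVM flags for minimum and maximum heap size, and the helper names are hypothetical.

```python
def recommended_heap_gb(physical_ram_gb: int, cap_gb: int = 31) -> int:
    """Half of physical RAM for the heap, capped (the cap is an assumption)."""
    if physical_ram_gb < 2:
        raise ValueError("need at least 2 GB of physical RAM")
    # The other half is left to the operating system's file cache for Lucene.
    return min(physical_ram_gb // 2, cap_gb)


def jvm_heap_flags(heap_gb: int) -> str:
    """Set minimum and maximum heap to the same value to avoid resizing."""
    return f"-Xms{heap_gb}g -Xmx{heap_gb}g"


for ram in (16, 32, 64):
    heap = recommended_heap_gb(ram)
    print(f"{ram} GB RAM -> {jvm_heap_flags(heap)}")
```

On a 64 GB host the cap applies, so the sketch suggests a 31 GB heap and leaves the remainder to the file cache.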
