How to configure files for Elasticsearch

2026-02-07 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article summarizes how to configure the Elasticsearch configuration files, explaining the main settings one by one so that it can serve as a reference.

1 Preface

In the elasticsearch\config directory, there are three core configuration files:

elasticsearch.yml: Elasticsearch-related configuration.

jvm.options: configuration of the Java JVM parameters.

log4j2.properties: logging configuration, since Elasticsearch uses the Log4j 2 logging framework.

This article uses Elasticsearch 7.5.4 as an example. The configuration differs between versions, so treat what follows as a reference only.

2 elasticsearch.yml

2.1 Cluster

Configure the cluster name. A cluster consists of multiple Elasticsearch instances that share a common name.

cluster.name: my-application

Set the port used for communication between cluster nodes.

transport.tcp.port: 9300

Prevent multiple copies of the same shard (a primary and its replicas) from being allocated on the same physical machine.

cluster.routing.allocation.same_shard.host: true

Set the number of concurrent recovery threads used during initial data recovery. The default is 4.

cluster.routing.allocation.node_initial_primaries_recoveries: 4

Set the number of concurrent recovery threads used when nodes are added or removed, or when shards are rebalanced. The default is 2; the example here raises it to 4.

cluster.routing.allocation.node_concurrent_recoveries: 4

2.2 Node

Configure the node name. An Elasticsearch instance is a single Elasticsearch process, which is called a node in the cluster. If several nodes of a cluster run on the same server, each node must have a unique name.

node.name: node-1

Add custom attributes to the node.

node.attr.rack: r1

Set whether the node is eligible to be elected master node. The default is true.

node.master: true

Set whether the node stores data.

node.data: true

Set the default number of primary shards. The default is 5 (1 in Elasticsearch 7.0 and later). Note that once an index's primary shards have been allocated, their number cannot be changed.

index.number_of_shards: 5

Set the default number of replica shards. By default, each primary shard has one replica. Note that the number of replicas can be adjusted manually afterwards.

index.number_of_replicas: 1
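As a quick arithmetic check of how these two settings combine, the following minimal sketch counts the shard copies a single index ends up with. The function name total_shard_copies is made up for this illustration and is not part of Elasticsearch.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Return the number of shard copies (primaries plus replicas) for one index.

    Every primary shard gets `number_of_replicas` replica copies, so the
    total is primaries * (1 + replicas).
    """
    if number_of_shards < 1 or number_of_replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    return number_of_shards * (1 + number_of_replicas)


if __name__ == "__main__":
    # Values from the example settings: 5 primary shards, 1 replica each.
    print(total_shard_copies(5, 1))  # 10
```

With 5 primary shards and 1 replica, the cluster has to place 10 shard copies for each index, which is worth keeping in mind when choosing the number of data nodes.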

Limit the bandwidth used for data recovery. The default is 0, which means no limit.

indices.recovery.max_size_per_sec: 0

Limit the maximum number of concurrent streams that can be open at the same time when recovering data from other shards. The default is 5.

indices.recovery.concurrent_streams: 5

2.3 Paths

Set the data storage path. Multiple paths can be given, separated by ASCII commas. The default is the data directory under the Elasticsearch root directory.

path.data: /path/to/data
# path.data: /path/to/data1,/path/to/data2

Set the temporary file storage path. The default is the work directory under the Elasticsearch directory.

path.work: /path/to/work

Set the log file path. The default is the logs directory under the Elasticsearch root directory.

path.logs: /path/to/logs

Set the plugin storage path. The default is the plugins directory under the Elasticsearch directory.

path.plugins: /path/to/plugins

2.4 Network

Bind the Elasticsearch instance to a specific IP address.

network.host: 192.168.0.1

The setting above can be split into two parameters.

network.bind_host: 192.168.0.1 # the IP address to bind to; either IPv4 or IPv6
network.publish_host: 192.168.0.1 # the IP address other nodes use to reach this node; detected automatically if not set, and it must be a real IP address

Set the HTTP port for the Elasticsearch instance. The default is 9200.

http.port: 9200

2.5 Discovery

Set whether to use multicast to discover nodes. The default is true.

discovery.zen.ping.multicast.enabled: true

Configure the unicast discovery list, which Elasticsearch uses at startup to find other instances and join the cluster. Entries can be plain host names, or addresses with a port or a port range.

discovery.zen.ping.unicast.hosts: ["host1", "host2"]
# discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.3:9300", "10.0.0.6[9300-9400]"]
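To make the entry formats in the unicast host list concrete, here is a small sketch that splits an entry into a host and the ports it names. It only illustrates the three shapes (bare host, host:port, host[low-high]) and makes no claim about how Elasticsearch itself parses them. The function name parse_unicast_host and the fallback port 9300 (taken from the transport.tcp.port example in this article) are assumptions, and IPv6 addresses are not handled.

```python
import re
from typing import Tuple

# Assumption: fall back to the transport port used in this article's example.
DEFAULT_TRANSPORT_PORT = 9300

# Accepted shapes (illustrative only):
#   host              -> default port
#   host:9300         -> a single port
#   host[9300-9400]   -> an inclusive port range
_ENTRY = re.compile(
    r"^(?P<host>[^:\[\]\s]+)"
    r"(?::(?P<port>\d+)|\[(?P<low>\d+)-(?P<high>\d+)\])?$"
)


def parse_unicast_host(entry: str) -> Tuple[str, range]:
    """Split one host entry into (host, range of candidate ports)."""
    match = _ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognised host entry: {entry!r}")
    host = match.group("host")
    if match.group("port") is not None:
        port = int(match.group("port"))
        return host, range(port, port + 1)
    if match.group("low") is not None:
        low, high = int(match.group("low")), int(match.group("high"))
        if low > high:
            raise ValueError(f"empty port range in entry: {entry!r}")
        return host, range(low, high + 1)
    return host, range(DEFAULT_TRANSPORT_PORT, DEFAULT_TRANSPORT_PORT + 1)


if __name__ == "__main__":
    for item in ["10.0.0.1", "10.0.0.3:9300", "10.0.0.6[9300-9400]"]:
        host, ports = parse_unicast_host(item)
        print(host, ports.start, ports.stop - 1)
```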

The discovery.zen.minimum_master_nodes setting tells the cluster how many master-eligible nodes must be visible before a master can be elected. The general rule is the number of master-eligible nodes divided by 2 (rounded down), plus one. For example, a 3-node cluster should use 2. The purpose is to prevent split brain.
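The rule above (divide by 2, round down, add one) can be written as a one-line calculation. This is only a sketch of the arithmetic; the function name minimum_master_nodes is chosen here for illustration.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Return the quorum size: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, minimum_master_nodes(n))
```

For 3 nodes this gives 2, matching the example in the text. Because the result is always more than half of the nodes, two separate halves of a partitioned cluster cannot both reach it at the same time.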

Set the ping timeout used when automatically discovering other nodes in the cluster. The default is 3 seconds. On poor networks, a higher value can prevent errors during automatic discovery.

discovery.zen.ping.timeout: 3s

2.6 Memory

Set this to true to lock memory at startup. Elasticsearch becomes much less efficient once the JVM starts swapping, so to make sure it is never swapped out, set the two environment variables ES_MIN_MEM and ES_MAX_MEM to the same value and make sure the machine has enough memory to allocate to Elasticsearch. The Elasticsearch process must also be allowed to lock memory; on Linux, use the ulimit -l unlimited command.

bootstrap.memory_lock: true
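Before enabling memory locking, it can help to check what locked-memory limit a process actually receives. The sketch below reads that limit with Python's standard resource module, which is the same value ulimit -l reports. It works on Unix-like systems only, and it inspects the limit of the Python process that runs it, so it is meaningful only when started from the same user and shell environment as Elasticsearch. The helper names are made up for this example.

```python
import resource


def current_memlock_limits():
    """Return the (soft, hard) locked-memory limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def memlock_is_unlimited(limit: int) -> bool:
    """True when the given limit equals the 'unlimited' marker."""
    return limit == resource.RLIM_INFINITY


if __name__ == "__main__":
    soft, hard = current_memlock_limits()
    if memlock_is_unlimited(soft):
        print("locked memory: unlimited")
    else:
        print(f"locked memory is limited to {soft} bytes (hard limit: {hard})")
```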

Prevent swapping. (bootstrap.mlockall is the name that bootstrap.memory_lock had in older versions.)

bootstrap.mlockall: true

2.7 Gateway

Set whether to compress data transmitted over TCP. The default is false (no compression).

transport.tcp.compress: true

Set the maximum size of HTTP request content. The default is 100mb.

http.max_content_length: 100mb

Set whether to provide services over HTTP. The default is true.

http.enabled: false

Set the gateway type. The default is the local file system; a distributed file system, Hadoop's HDFS, or an AWS service can also be used.

gateway.type: local

Block the initial recovery after a full cluster restart until N nodes have started. See Recovery for details.

gateway.recover_after_nodes: 3

Set the timeout for starting the initial data recovery process. The default is 5 minutes.

gateway.recover_after_time: 5m
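The gateway recovery settings work together: gateway.recover_after_nodes and gateway.recover_after_time, plus gateway.expected_nodes, which is described right after this. The following is a simplified model of that decision, not Elasticsearch's actual implementation. The function name, the idea of passing the waiting time in as an argument, and the sample values in the demo are all assumptions made for illustration.

```python
from datetime import timedelta


def should_start_recovery(
    joined_nodes: int,
    waited: timedelta,
    recover_after_nodes: int,
    recover_after_time: timedelta,
    expected_nodes: int,
) -> bool:
    """Simplified model of when initial recovery may begin.

    `waited` is assumed to be the time spent waiting since the
    recover_after_nodes threshold was first reached.
    """
    # Too few nodes have joined: recovery stays blocked.
    if joined_nodes < recover_after_nodes:
        return False
    # All expected nodes are present: recover immediately.
    if joined_nodes >= expected_nodes:
        return True
    # Otherwise recover only once the timeout has passed.
    return waited >= recover_after_time


if __name__ == "__main__":
    # Hypothetical values: block below 3 nodes, expect 5, wait up to 5 minutes.
    settings = dict(
        recover_after_nodes=3,
        recover_after_time=timedelta(minutes=5),
        expected_nodes=5,
    )
    print(should_start_recovery(2, timedelta(minutes=10), **settings))
    print(should_start_recovery(3, timedelta(minutes=1), **settings))
    print(should_start_recovery(3, timedelta(minutes=5), **settings))
    print(should_start_recovery(5, timedelta(0), **settings))
```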

Set the number of nodes expected in the cluster (2 in this example). As soon as that many nodes have started, data recovery begins immediately.

gateway.expected_nodes: 2

2.8 Various

Require an explicit index name when deleting an index.

action.destructive_requires_name: true

3 jvm.options

Set the JVM heap size here. The minimum and maximum heap sizes should be set to the same value, and that value should be chosen according to your physical memory.
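One commonly cited rule of thumb, which does not come from this article, is to give the heap no more than half of the physical memory and to keep it below roughly 32 GB. The sketch below applies that rule and prints matching -Xms and -Xmx lines. The function names, the 31 GB cap, and the 1 GB floor are assumptions for illustration, so treat the result as a starting point rather than a recommendation.

```python
from typing import List


def suggest_heap_gb(physical_memory_gb: float, cap_gb: float = 31.0) -> int:
    """Half of physical memory, capped at cap_gb, and never below 1 GB."""
    half = physical_memory_gb / 2
    return max(1, int(min(half, cap_gb)))


def jvm_heap_options(heap_gb: int) -> List[str]:
    """Return -Xms/-Xmx lines with identical minimum and maximum heap."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


if __name__ == "__main__":
    for memory_gb in (2, 16, 128):
        print(memory_gb, jvm_heap_options(suggest_heap_gb(memory_gb)))
```

Using the same value for both options follows the advice in the text that the minimum and maximum heap sizes should be consistent.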

-Xms1g # set the minimum heap to 1g
-Xmx1g # set the maximum heap to 1g

4 log4j2.properties

This configuration file generally does not need to be modified.

That concludes this overview of how to configure the Elasticsearch configuration files.
