

A Humble Opinion on Elasticsearch Optimization




Elasticsearch is commonly used for log storage and analysis and appears in many enterprise applications. It provides powerful search and analysis capabilities and has become an indispensable part of the back-end technology stack.

While maintaining our Elasticsearch cluster I have done some tuning and analysis, and I offer my humble opinions here. If anything seems unreasonable, you are welcome to point it out and discuss it. The Elasticsearch version I use is 5.x.

Elasticsearch handles a large number of query and insert requests, which requires a large number of file handles, while the default limit on a CentOS system is 1024. If file handles run out, the operating system will reject new connections and data may be lost, a catastrophic and unacceptable outcome. Log in as the user that starts Elasticsearch and check the limit with the following command:

ulimit -a

The output looks like this:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127673
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 2056474
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The open files value is 1024. With the large number of requests Elasticsearch handles, this is not enough; it can be raised to 655360.

A temporary change takes effect immediately but is lost after the machine reboots:

ulimit -n 655360

To make it permanent, modify /etc/security/limits.conf (a reboot of the machine is required for it to take effect):

u_es - nofile 655360

In the line above, u_es is the user that starts Elasticsearch; that user's file handle limit is set to 655360.
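As a quick check (not from the original article, but a standard Elasticsearch API), each node reports the file descriptor limit it actually sees; the host and port below are assumptions:

curl -s "http://localhost:9200/_nodes/stats/process?filter_path=**.max_file_descriptors&pretty"
# After the change, every node should report "max_file_descriptors": 655360.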

JVM parameter optimization

Elasticsearch runs on the JVM, so tuning its JVM parameters is very important, and the most common tuning is the allocation of Java memory. The JVM memory model itself and the role of each region are not described here.

How should heap memory be divided between the young generation and the old generation?

JVM heap memory is divided into a young (new) generation and an old generation.

Young generation (including Eden)

The space where newly instantiated objects are allocated. The young generation is usually quite small, typically 100 MB-500 MB, and it also contains two survivor spaces.

Old generation

The space where long-lived objects are stored. These objects are expected to stay around for a long time. The old generation is usually much larger than the young generation.

Garbage collection in both the young generation and the old generation has a "stop the world" phase. During this time, the JVM pauses the program to analyze object reachability and collect dead objects. While it is stopped, nothing happens: requests are not served, pings are not answered, and shards are not allocated. The whole world really does stop.

For the young generation this is not a big problem; such a small space means GC finishes quickly. But the old generation is much larger, and a slow GC there can mean a 1-second or even 15-second pause, which is unacceptable for server software.

So how much memory should we usually allocate to the young generation and the old generation, and in what ratio?

Generally speaking, a 2:1 ratio of old generation to young generation is appropriate. For example, with a 3 GB heap, allocate 1 GB to the young generation and the rest to the old generation. This is configured in the jvm.options file in the Elasticsearch config directory:

# initial heap size
-Xms3g
# maximum heap size
-Xmx3g
# young generation size
-Xmn1g

How much memory should be allocated to Elasticsearch?

In our use of Elasticsearch, we upgraded the machines from 8 GB of memory, to 16 GB, and then to the current 32 GB, with one Elasticsearch node per machine. How should the machine's memory be divided up?

The official recommendation is to give half (or less) of the memory to Lucene and the rest to the Elasticsearch heap.

Memory is absolutely critical for Elasticsearch: many in-memory data structures rely on it to provide fast operations. But there is another major memory consumer off the heap: Lucene.

Lucene is designed to leverage the underlying operating system to cache its data structures in memory. Lucene segments are stored in individual files, and because segments are immutable these files never change, which is cache-friendly: the operating system keeps the segments cached for faster access.

The performance of Lucene depends on its interaction with the operating system. If you allocate all the memory to Elasticsearch's heap memory, there will be no remaining memory for Lucene. This will seriously affect the performance of full-text retrieval.

The standard recommendation is to use 50% of the available memory as Elasticsearch heap memory and retain the remaining 50%. Of course, it won't be wasted, and Lucene will be happy to use the rest of the memory.

Our practical approach is to allocate half of the machine's memory to the Elasticsearch heap, with stack memory, the method area, the constant pool, and other non-heap memory taking up the other half.

The maximum memory allocated to the heap should be less than 32766 MB (~31.99 GB).

The JVM uses an object pointer compression technique when the heap is smaller than 32 GB.

For a 32-bit system, this means that the heap size is limited to 4 GB. A 64-bit system can use more memory, but 64-bit pointers mean more waste because the pointers themselves are larger. To make matters worse, larger pointers take up more bandwidth when moving data between main memory and the various levels of cache (such as the LLC, L1, etc.).

Java uses a technique called compressed object pointers (compressed oops) to solve this problem. Pointers no longer represent the exact byte location of an object in memory but an offset, which means a 32-bit pointer can reference 4 billion objects rather than 4 billion bytes. In the end, a heap of up to about 32 GB can still be addressed with 32-bit pointers.

Once you cross that magic 32 GB boundary, the JVM switches back to ordinary object pointers. The longer pointers for every object consume more CPU and memory bandwidth, which means you effectively lose memory. In fact, a heap of 40-50 GB with ordinary pointers holds roughly as much usable data as a 32 GB heap with compressed pointers.

In short: even if you have plenty of memory, try not to exceed a 32 GB heap. Going beyond it wastes memory, degrades CPU performance, and forces the GC to manage a very large heap.
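A minimal way to confirm whether a given heap size still benefits from compressed oops, assuming the java binary used by Elasticsearch is on the PATH:

# Ask the JVM whether compressed object pointers are in effect for a 31 GB heap.
java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops
# Repeating with -Xmx32g (or more) should show the flag as false, i.e. the
# heap has crossed the ~32 GB boundary and ordinary 64-bit pointers are used.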

Turn off swap

Swapping memory to disk is fatal to server performance.

If memory is swapped out to disk, an operation that takes 100 microseconds may take 10 milliseconds. Now think about how many such 10-millisecond delays add up, and it's not hard to see how terrible swapping is for performance.

Turn off swap with the following command:

sudo swapoff -a
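Two related settings, not mentioned in the original text but commonly used with Elasticsearch 5.x when swap cannot be removed entirely:

# Tell the kernel to avoid swapping as much as possible; add "vm.swappiness = 1"
# to /etc/sysctl.conf to make it persistent across reboots.
sudo sysctl -w vm.swappiness=1

# Alternatively, let Elasticsearch lock its heap in RAM by adding this line to
# elasticsearch.yml and restarting the node:
#   bootstrap.memory_lock: true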

Do not touch the following configuration

These knobs all look like optimization opportunities, but you really should leave them alone, because they are frequently misused, resulting in system instability or poor performance, or both.

Thread pool configuration

Many people like to adjust the thread pools. Whatever the reason, people can't resist increasing the number of threads. Too much indexing? Add threads! Too many searches? Add threads! Node idle rate below 95%? Add threads!

The default threading settings for Elasticsearch are already reasonable. For all thread pools (except search), the number of threads is set based on the number of CPU cores. If you have eight cores, you can run only eight threads at the same time, and it makes sense to assign only eight threads to any particular thread pool.

The search thread pool is larger, configured as int((number of cores * 3) / 2) + 1.

Garbage collector

The default garbage collector (GC) for Elasticsearch is CMS. It runs concurrently with the application so as to minimize pauses. However, it still has two stop-the-world phases, and it struggles with very large heaps.

Despite these shortcomings, it is currently the best garbage collector for software with low latency requirements such as Elasticsearch. The official recommendation is to use CMS.

Set the minimum master nodes appropriately

The minimum_master_nodes setting is extremely important: to prevent the cluster from splitting (split-brain), this parameter should be set to a quorum of (number of master-eligible nodes / 2) + 1.
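For example, with three master-eligible nodes the quorum is (3 / 2) + 1 = 2. In 5.x this setting is dynamic, so it can also be applied to a running cluster; the host and port below are assumptions:

curl -XPUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'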

Even shard distribution, disk optimization, and pulling high-load nodes out of master election

In our production environment, we ran into a situation where the load on one node was several times that of the others. According to the virtual machine monitoring, the QPS of all nodes was similar and the machine configurations were identical, so why such a large difference in load?

We first suspected that the data was unevenly distributed, but a check showed this was not the case.

Monitoring then showed that disk IO on the high-load node was very high, often at 100%, so we suspected poor virtual machine disk performance. But we had no better disk available at the time.

A reasonable compromise was to remove the high-load node from master election: set node.master to false in elasticsearch.yml and restart the node. After that, its load dropped somewhat.
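A minimal sketch of that change, assuming a standard package installation where the config lives in /etc/elasticsearch:

# Exclude this node from master election while keeping it as a data node,
# then restart the Elasticsearch service for the change to take effect.
cat >> /etc/elasticsearch/elasticsearch.yml <<'EOF'
node.master: false
node.data: true
EOF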

Optimization of data storage days

How many days of data to retain depends on the actual business. Here is a script to delete expired indices, which comes from https://stackoverflow.com/questions/33430055/removing-old-indices-in-elasticsearch#answer-39746705.

#!/bin/bash

searchIndex=logstash-monitor
elastic_url=logging.core.k94.kvk.nl
elastic_port=9200

date2stamp () {
    date --utc --date "$1" +%s
}

dateDiff () {
    case $1 in
        -s)   sec=1;     shift;;
        -m)   sec=60;    shift;;
        -h)   sec=3600;  shift;;
        -d)   sec=86400; shift;;
        *)    sec=86400;;
    esac
    dte1=$(date2stamp $1)
    dte2=$(date2stamp $2)
    diffSec=$((dte2-dte1))
    if ((diffSec < 0)); then abs=-1; else abs=1; fi
    echo $((diffSec/sec*abs))
}

for index in $(curl -s "${elastic_url}:${elastic_port}/_cat/indices?v" | grep -E "${searchIndex}-20[0-9][0-9]\.[0-1][0-9]\.[0-3][0-9]" | awk '{print $3}'); do
    # extract the date suffix (yyyy.mm.dd) and convert dots to dashes
    date=$(echo ${index: -10} | sed 's/\./-/g')
    cond=$(date +%Y-%m-%d)
    diff=$(dateDiff -d $date $cond)
    echo -n "${index} (${diff})"
    if [ $diff -gt 1 ]; then
        echo " / DELETE"
        # curl -XDELETE "${elastic_url}:${elastic_port}/${index}?pretty"
    else
        echo ""
    fi
done

Then use crontab to run the script regularly, once a day.
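A possible crontab entry for the daily run; the script path and log file are assumptions:

# Run the cleanup script every day at 01:30 and keep a log of what it deleted.
30 1 * * * /opt/scripts/clean_old_indices.sh >> /var/log/es-index-cleanup.log 2>&1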

Cluster sharding setting

Once an index has been created, ES cannot change its number of shards. In ES a shard corresponds to a Lucene index, and reading and writing a Lucene index consumes significant system resources, so the number of shards must not be set too high. It is therefore very important to configure the shard count sensibly when creating the index. Generally speaking, we follow a few principles (an example of setting the shard count follows the list):

Keep the disk space occupied by each shard below the maximum JVM heap size of ES (generally no more than 32 GB, per the JVM guidance above). So if the total size of an index is around 500 GB, roughly 16 shards are needed; of course, principle 2 below should be considered at the same time.

Consider the number of nodes. A node is usually a single physical machine. If the number of shards is too large, far exceeding the number of nodes, multiple shards will end up on one node; if that node fails, even with more than one replica, data may be lost and the cluster may be unable to recover. The number of shards is therefore generally kept to no more than 3 times the number of nodes.
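A sketch of fixing the shard and replica counts at index-creation time; the index name, counts, host, and port are illustrative:

curl -XPUT "http://localhost:9200/app-logs" -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 16,
    "number_of_replicas": 1
  }
}'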

Index optimization

1. Adjust index_buffer_size, which can be set either as a percentage or as a specific size; the right value can be determined by testing against your cluster size.

indices.memory.index_buffer_size: 10% (default)
indices.memory.min_index_buffer_size: 48mb (default)
indices.memory.max_index_buffer_size

2. For the _id field, avoid supplying a custom _id whenever possible, so that ES does not have to do version management for the ID. Use ES's default ID generation, or a numeric ID, as the primary key.

3. For the _all and _source fields, consider the actual scenario and needs. The _all field concatenates all indexed fields for full-text retrieval; if this is not required, it can be disabled. _source stores the original document content; if the full original document is not needed, you can control which fields go into _source with the includes and excludes attributes.

4. Configure the index attribute of fields sensibly: analyzed versus not_analyzed controls whether a field is tokenized, and should follow business requirements. Setting only the fields needed for group-by to not_analyzed improves query and aggregation efficiency (a mapping sketch follows).
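In 5.x the analyzed/not_analyzed distinction corresponds to the text and keyword field types; a minimal mapping sketch with illustrative index, type, and field names:

# "message" is analyzed full text; "category" is not analyzed (keyword) and is
# suitable for exact-match filters and terms aggregations (group-by).
curl -XPUT "http://localhost:9200/app-events" -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "event": {
      "properties": {
        "message":  { "type": "text" },
        "category": { "type": "keyword" }
      }
    }
  }
}'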

Query optimization

Adjusting the order of filter conditions

If the conditions placed first do not filter much out, a large amount of unneeded data gets scanned, making the query slow.

Move the conditions with the strongest filtering effect to the front, ordering the filter conditions by how much they narrow the result set.
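A sketch of the idea with made-up field names: the highly selective term filter comes first, the broader conditions after it.

curl -XGET "http://localhost:9200/app-logs/_search" -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "order_id": "20190602123456" } },
        { "term":  { "status": "paid" } },
        { "range": { "timestamp": { "gte": "now-7d" } } }
      ]
    }
  }
}'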

Optimization of time precision

From how filters work, each filter effectively walks the whole index, so the coarser the time granularity, the faster the comparison and the shorter the search time. Where it does not affect functionality, lower time precision is better; sometimes it is worth sacrificing a little precision, and the best case, of course, is having no time condition at all.

When building the index, ES adds a redundant time field that is accurate only to the day, and queries with time ranges filter on this field.
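A sketch of that approach with illustrative names: each document carries a day-precision field alongside the exact timestamp, and range filters use the coarse field.

# At index time, store a redundant day-rounded field next to the timestamp.
curl -XPUT "http://localhost:9200/app-logs/log/1" -H 'Content-Type: application/json' -d '
{
  "timestamp": "2019-06-02T10:15:30Z",
  "day": "2019-06-02"
}'

# Time-range queries then filter on the coarse "day" field.
curl -XGET "http://localhost:9200/app-logs/_search" -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "day": { "gte": "2019-06-01", "lte": "2019-06-02" } } }
      ]
    }
  }
}'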

Query Fetch Source optimization

When a business query returns a relatively large result set and fetches unnecessary fields from _source, the query becomes slow.

Example: only the id field is needed from ES, but the query fetches all fields.
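A sketch of source filtering for that case, with illustrative index and field names; only the id field is pulled out of _source:

curl -XGET "http://localhost:9200/app-logs/_search" -H 'Content-Type: application/json' -d '
{
  "_source": ["id"],
  "query": { "match_all": {} }
}'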

Pre-indexed data

Taking advantage of patterns in your queries at indexing time pays off. For example, if all documents have a price field and most queries run range aggregations over a fixed list of ranges, the aggregation can be made faster by pre-indexing the range into each document and using a terms aggregation.

For example, something like this:

PUT index/type/1
{
  "designation": "spoon",
  "price": 13
}

and queries like this:

GET index/_search
{
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 10 },
          { "from": 10, "to": 100 },
          { "from": 100 }
        ]
      }
    }
  }
}

then documents should be enriched at index time with a price_range field, mapped as a keyword:

PUT index
{
  "mappings": {
    "type": {
      "properties": {
        "price_range": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT index/type/1
{
  "designation": "spoon",
  "price": 13,
  "price_range": "10-100"
}

Then the search request can aggregate on the new field directly instead of running a range aggregation on the price field:

GET index/_search
{
  "aggs": {
    "price_ranges": {
      "terms": {
        "field": "price_range"
      }
    }
  }
}

Summary

Generally speaking, the optimization of ElasticSearch can be considered from the following aspects:

Hardware optimization: machine allocation, machine configuration, memory, CPU, network, and disk performance.

Operating system optimization: file handle limits and disabling swap.

Node allocation: distribute Elasticsearch nodes sensibly and choose which nodes are allowed to run as master.

Storage optimization: number of replicas, number of indices, and number of shards.

Usage optimization: index optimization and query optimization.

