This article explains how to tune Elasticsearch. The explanation is straightforward and easy to follow; read on to work through how to tune Elasticsearch step by step.
1. Amount of data
A considerable amount of news and social posts is generated every day; at peak times we index about 3 million editorial articles and nearly 100 million social posts in a day. Editorial data is stored for long-term retrieval (dating back to 2009), while social post data is kept for roughly 15 months. The primary shard data currently uses about 200 TB of disk space and the replica data about 600 TB.
Our business receives 3,000 requests per minute. All requests go through a service called search-service, which in turn handles all interaction with the Elasticsearch cluster. Most searches are complex rules describing what should be included in dashboards and news feeds. For example, a customer might be interested in Tesla and Elon Musk but want to exclude everything about SpaceX or PayPal. Users can express this with a flexible syntax similar to the Lucene query syntax, for example:
Tesla AND "Elon Musk" NOT (SpaceX OR PayPal)
Our longest such query runs to more than 60 pages. The point is: out of those 3,000 requests per minute, none is a simple "Barack Obama"-style Google query; they are terrible beasts, and the ES nodes must work hard to find every matching document set.
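For illustration, here is roughly how such a boolean expression could be sent to Elasticsearch 1.x as a standard query_string query. This is only a sketch: the index name and field are hypothetical, and in our setup the search-service compiles our own syntax into queries rather than passing it through directly.

curl -s 'localhost:9200/news-2018.02.01/_search' -d '
{
  "query": {
    "query_string": {
      "default_field": "body",
      "query": "Tesla AND \"Elon Musk\" NOT (SpaceX OR PayPal)"
    }
  }
}'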
2. Version
We run a custom version of Elasticsearch 1.7.6. The only difference from the stock 1.7.6 release is that we backported roaring bitmaps/bitsets as a cache. This feature was backported from Lucene 5 to Lucene 4 and then ported into ES 1.X. Elasticsearch 1.X uses plain bitsets as its cache by default, which is expensive for sparse result sets; this was optimized in Elasticsearch 2.X.
Why not use a newer version of Elasticsearch? The main reason is the difficulty of upgrading. Rolling upgrades across major versions are only supported from ES 5 to 6 (rolling upgrades from ES 2 to 5 are supposed to work as well, but we have not tried them). So we can only upgrade by restarting the entire cluster. Downtime is almost unacceptable for us; we could perhaps cope with roughly 30-60 minutes of downtime for a restart, but what really worries us is that there is no real way to roll back once something fails.
So far we have chosen not to upgrade the cluster. We would of course like to, but there are more urgent tasks at the moment. How the upgrade would actually be carried out has not yet been decided; it is likely that we would build a new cluster rather than upgrade the existing one.
3. Node configuration
We have been running the primary cluster on AWS since June 2017, using i3.2xlarge instances as data nodes. We previously ran the cluster in a COLO (co-located data center) but migrated to the AWS cloud to reduce the turnaround time when machines fail and to give us more flexibility when scaling up and down.
We run 3 dedicated master-eligible nodes in different availability zones and set discovery.zen.minimum_master_nodes to 2. This is a very common strategy for avoiding the split-brain problem.
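As a minimal sketch, the value can be applied dynamically through the 1.x cluster settings API (it can equally live in elasticsearch.yml on each node):

curl -XPUT 'localhost:9200/_cluster/settings' -d '
{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'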
In terms of storage, our dataset, with its requirement to stay below 80% disk usage and to keep 3 or more replicas, leads us to run 430 data nodes. The original intention was to tier the data, storing older data on slower disks, but since the only data older than 15 months is the comparatively small editorial set (we discard older social data), this did not pay off. The monthly hardware cost is much higher than running in a COLO, but the cloud lets us scale the cluster up to roughly twice its size in almost no time.
You may ask why we chose to manage and maintain the ES cluster ourselves. We did consider hosted solutions, but decided to run it ourselves for two reasons: the AWS Elasticsearch Service exposes too little control to its users, and Elastic Cloud costs 2-3 times more than running the cluster directly on EC2.
To protect ourselves against an availability-zone outage, the nodes are spread across all three availability zones of eu-west-1. We use the AWS plugin for this configuration: it provides a node attribute called aws_availability_zone, and we set cluster.routing.allocation.awareness.attributes to aws_availability_zone. This ensures that replicas are stored in different availability zones from their primaries wherever possible, and that queries are routed to nodes in the same availability zone whenever possible.
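A minimal sketch of that configuration, assuming the aws_availability_zone attribute is already populated by the AWS plugin; the same setting can also be placed in elasticsearch.yml:

curl -XPUT 'localhost:9200/_cluster/settings' -d '
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "aws_availability_zone"
  }
}'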
These instances run Amazon Linux with the ephemeral drives mounted as ext4, and have about 64 GB of memory. We allocate 26 GB of heap to the ES nodes and leave the rest to the disk cache. Why 26 GB? Because the JVM is built on black magic (most likely this keeps the heap safely below the roughly 32 GB threshold beyond which compressed object pointers are disabled).
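A small sketch of that heap setup, assuming the stock 1.x startup scripts that honor the ES_HEAP_SIZE environment variable; the second command is simply a generic way to confirm that a given heap size still allows compressed object pointers:

# Give the ES JVM a 26 GB heap, leaving the rest of the 64 GB for the OS disk cache
export ES_HEAP_SIZE=26g

# Verify that compressed oops are still enabled at this heap size
java -Xmx26g -XX:+PrintFlagsFinal -version | grep UseCompressedOops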
We use Terraform auto-scaling groups to provision the instances and Puppet for all installation and configuration.
4. Index structure
Because our data and queries are time-series based, we use time-based indexing, similar to the ELK (Elasticsearch, Logstash, Kibana) stack. It also lets us store different types of data in different indices, so editorial documents and social documents end up in separate daily indices. This allows us to discard only the social indices when needed and to add some query optimizations. Each daily index runs with one or two shards.
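For example, with hypothetical daily index names such as social-YYYY.MM.DD and editorial-YYYY.MM.DD, retiring old social data is just an index deletion:

# Drop one day of social data once it falls outside the 15-month retention window
curl -XDELETE 'localhost:9200/social-2016.11.01'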
This setup produces a large number of shards (close to 40k). With so many shards and nodes, some cluster operations become unwieldy. For example, deleting indices appears to be a bottleneck on the cluster master, which has to push the cluster state to all nodes. Our cluster state is about 100 MB, but transport compression reduces it to about 3 MB (you can inspect your own cluster state with curl localhost:9200/_cluster/state/_all). The master still has to push 1.3 GB per change (430 nodes x 3 MB state size), and of that 1.3 GB roughly 860 MB has to cross availability-zone boundaries (essentially over the public internet). This takes time, especially when deleting hundreds of indices. We hope newer versions of Elasticsearch improve this, starting with ES 2.0's support for sending only the diff of the cluster state.
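A quick way to get a feel for these numbers on your own cluster (piping through gzip only approximates what transport-level compression achieves):

# Raw size of the full cluster state, in bytes
curl -s 'localhost:9200/_cluster/state/_all' | wc -c

# Rough estimate of the compressed size that has to travel per node
curl -s 'localhost:9200/_cluster/state/_all' | gzip | wc -c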
5. Performance
As mentioned earlier, our ES cluster needs to handle some very complex queries in order to meet customer retrieval needs.
We have done a lot of performance work over the past few years to cope with the query load, and we have had our fair share of trouble performance-testing the ES cluster, as the following quote shows.
Unfortunately, fewer than one-third of the queries had completed successfully when the cluster went down. We believe the test itself caused the cluster outage.
-- Excerpt from the first performance test with real queries on the new ES cluster platform
To control query execution, we developed a plugin that implements a number of custom query types. These query types provide features and performance optimizations that are not available in the official Elasticsearch release. For example, we implemented wildcards inside phrases, executed as SpanNear queries; another optimization makes a bare "*" behave like a match-all query; and there are a number of other features.
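Our plugin is not public, but the stock span queries illustrate the underlying idea: a wildcard can be wrapped in span_multi and combined with other terms in a span_near to approximate a phrase that contains a wildcard. The index and field names below are hypothetical:

curl -s 'localhost:9200/news-2018.02.01/_search' -d '
{
  "query": {
    "span_near": {
      "clauses": [
        { "span_term": { "body": "elon" } },
        { "span_multi": { "match": { "wildcard": { "body": "mus*" } } } }
      ],
      "slop": 0,
      "in_order": true
    }
  }
}'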
The performance of Elasticsearch and Lucene depends heavily on the specific queries and data; there is no silver bullet. Even so, here are some pointers, from basic to advanced:
Limit your search to the relevant data only. For example, with daily indices, query only the relevant date range, and avoid putting range queries/filters on the indices that lie entirely in the middle of the queried range.
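A sketch of the idea, with hypothetical daily indices and a hypothetical published field: only the indices the date range touches are queried, and only the boundary days actually need the range filter.

# Query only the two relevant daily indices; indices fully inside the
# range would not need the range filter at all
curl -s 'localhost:9200/social-2018.02.01,social-2018.02.02/_search' -d '
{
  "query": {
    "filtered": {
      "query": { "query_string": { "query": "Tesla" } },
      "filter": { "range": { "published": { "gte": "2018-02-01T12:00:00", "lt": "2018-02-02T12:00:00" } } }
    }
  }
}'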
Avoid leading wildcards when using wildcard queries, unless you can index the terms reversed as well. Double-ended wildcards are hard to optimize for.
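"Indexing terms reversed" means adding a sub-field analyzed with Lucene's reverse token filter, so that a leading wildcard such as *musk becomes a trailing wildcard ksum* on the reversed field. A minimal sketch with hypothetical index, field, and analyzer names:

curl -XPUT 'localhost:9200/news-demo' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "reversed_text": { "tokenizer": "standard", "filter": ["lowercase", "reverse"] }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "body": {
          "type": "string",
          "fields": { "reverse": { "type": "string", "analyzer": "reversed_text" } }
        }
      }
    }
  }
}'

# A leading wildcard on body becomes a trailing wildcard on body.reverse:
#   wildcard body:*musk   ->   wildcard body.reverse:ksum*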
Watch for signs of resource exhaustion. Is CPU usage on the data nodes spiking continuously? Is IO wait climbing? Look at the GC statistics, which you can get from profiling tools or through a JMX agent. If ParNew GC takes more than 15% of the time, check the memory logs. If there are any Serial GC pauses, you may have a real problem. Not too familiar with this? Don't worry, a good series of blog posts on JVM performance will get you up to speed. Keep in mind that ES and the G1 garbage collector together are not optimal.
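Two generic ways to get at those GC numbers, assuming a JDK 8 install and the standard ES_JAVA_OPTS hook in the 1.x startup scripts (where <es-pid> stands for the Elasticsearch process id):

# Sample heap-space and GC utilization from the running ES JVM once per second
jstat -gcutil <es-pid> 1000

# Or write a GC log that can be inspected after the fact
export ES_JAVA_OPTS="-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/elasticsearch/gc.log"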
If you run into garbage collection problems, do not start by tweaking the GC settings; the defaults are usually reasonable. Instead, focus on reducing memory allocations. How exactly? See below.
If you have memory problems that you do not have time to fix, consider looking into Azul Zing. It is an expensive product, but simply running on their JVM can double throughput. In the end we did not use it because we could not justify the cost.
Consider using caches, both outside Elasticsearch and at the Lucene level. In Elasticsearch 1.X you can control caching through filters. Later versions make this harder, but it seems possible to implement your own query type for caching. We may do something similar when we upgrade to 2.X.
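In 1.X, "controlling caching through filters" looks roughly like this: the _cache flag on the filter asks Elasticsearch to keep its bitset in the filter cache. The index and field names are hypothetical:

curl -s 'localhost:9200/news-2018.02.01/_search' -d '
{
  "query": {
    "filtered": {
      "query": { "query_string": { "query": "Tesla" } },
      "filter": { "terms": { "source_type": ["editorial"], "_cache": true } }
    }
  }
}'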
Check whether you have hot spots (for example, a node that receives all the load). You can try to balance the load with shard allocation filtering, or try migrating shards yourself with cluster reroute. We use linear optimization for automatic rerouting, but even simple automation strategies help.
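Both mechanisms are standard APIs; a sketch with hypothetical index and node names:

# Shard allocation filtering: keep this index off an overloaded node
curl -XPUT 'localhost:9200/social-2018.02.01/_settings' -d '
{
  "index.routing.allocation.exclude._name": "data-node-042"
}'

# Cluster reroute: move one shard by hand
curl -XPOST 'localhost:9200/_cluster/reroute' -d '
{
  "commands": [
    { "move": { "index": "social-2018.02.01", "shard": 0,
                "from_node": "data-node-042", "to_node": "data-node-117" } }
  ]
}'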
Set up a test environment (I prefer my laptop) and load some representative data from production (at least one shard is recommended). Put it under load by replaying production queries (this is the harder part). Use this local setup to test the resource consumption of requests.
In combination with the previous point, enable a profiler on the Elasticsearch process. This is the most important item on this list.
We use Flight Recorder via Java Mission Control, as well as VisualVM. People who try to guess their way through performance problems (including paid consultants and tech support) are wasting their time and yours. Investigate which parts of the JVM consume time and memory, then explore the Elasticsearch/Lucene source code to see which code is executing or allocating that memory.
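For reference, enabling Flight Recorder on an Oracle JDK 8 JVM and grabbing a recording from a running node looks roughly like this (<es-pid> is the Elasticsearch process id):

# Required at JVM startup for Flight Recorder on Oracle JDK 8
export ES_JAVA_OPTS="-XX:+UnlockCommercialFeatures -XX:+FlightRecorder"

# Record 60 seconds of profiling data and open the result in Java Mission Control
jcmd <es-pid> JFR.start duration=60s filename=/tmp/es-profile.jfr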
Once you know which part of the request is causing the slow response, you can optimize it by modifying the request (for example, changing the execution hint for terms aggregations, or switching query types). Changing the query type or the order of queries can have a large impact. If that does not help, you can try optimizing the ES/Lucene code itself. This may sound extreme, but it cut our CPU consumption by a factor of 3 to 4 and memory usage by a factor of 4 to 8. Some of the modifications were minor (such as the indices query), but others required us to rewrite query execution entirely. The resulting code depends heavily on our query patterns, so it may or may not be useful to others. So far we have not open-sourced it, but it might be good material for a future post.
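As an example of the "execution hint" knob, a terms aggregation can be told to aggregate via a simple map instead of ordinals, which is sometimes cheaper when only a small subset of documents matches. The index and field names are hypothetical:

curl -s 'localhost:9200/news-2018.02.01/_search' -d '
{
  "size": 0,
  "aggs": {
    "top_sources": {
      "terms": { "field": "source", "execution_hint": "map" }
    }
  }
}'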
Chart description: response time of Lucene query execution with and without the rewrite. The rewrite also meant that we no longer had nodes running out of memory several times a day.
Thank you for reading. That concludes "how to tune Elasticsearch"; we hope it has given you a deeper understanding of the problem, though the specifics still need to be verified in practice.