How to optimize search performance in ElasticSearch

2025-02-24 Update From: SLTechnology News&Howtos

Many inexperienced engineers are unsure how to approach search performance tuning in Elasticsearch. This article summarizes the causes of slow searches and their solutions; I hope it helps you solve the problem.

Filesystem cache

The data you write to es is ultimately stored in disk files. When you query, the operating system automatically caches the contents of those disk files in the filesystem cache.

es's search engine relies heavily on the underlying filesystem cache. If you give the filesystem cache enough memory to hold all of the index segment files, searches will be served almost entirely from memory and performance will be very high.

How big can the gap be? In many of our earlier tests and stress tests, a search that has to go to disk takes on the order of seconds: 1 second, 5 seconds, even 10 seconds. A search served from the filesystem cache, i.e. pure memory, is generally an order of magnitude faster, on the order of milliseconds: anywhere from a few milliseconds to a few hundred milliseconds.

Here's a real case. One company's es cluster had three machines, each with what seems like plenty of memory: 64 GB, for a total of 64 * 3 = 192 GB. Each machine gave 32 GB to the es JVM heap, leaving only 32 GB per machine for the filesystem cache, or 32 * 3 = 96 GB across the cluster. Meanwhile the index data files occupied a total of 1 TB of disk across the three machines, about 300 GB per machine. How does that work out? With only ~100 GB of filesystem cache, just 1/10 of the data fits in memory; the rest stays on disk, so most search operations hit disk and performance is necessarily poor. The bottom line: for es to perform well, your machines' memory should ideally hold at least half of your total data.
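The sizing arithmetic in this case can be sketched as follows (the numbers are the hypothetical ones from the example above):

```python
# Illustrative arithmetic for the cluster described above (hypothetical numbers).
machines = 3
ram_per_machine_gb = 64
jvm_heap_gb = 32                    # given to the es JVM heap per machine
total_data_gb = 1024                # ~1 TB of index data across the cluster

cache_per_machine_gb = ram_per_machine_gb - jvm_heap_gb  # left for the OS filesystem cache
total_cache_gb = cache_per_machine_gb * machines         # cluster-wide filesystem cache

cached_fraction = total_cache_gb / total_data_gb
print(f"filesystem cache: {total_cache_gb} GB, cached fraction: {cached_fraction:.0%}")
# → filesystem cache: 96 GB, cached fraction: 9%
```

With only ~9% of the data fitting in the cache, most searches inevitably hit disk.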

According to our own production experience, the best case is to store only a small amount of data in es: just the indexes you actually search. If the memory left for the filesystem cache is 100 GB, keep the index data within 100 GB. Then almost every search runs against memory, and performance is very high, generally within 1 second.

For example, say a row of data has 30 fields: id, name, age, and so on. But your searches only ever filter on the three fields id, name, and age. If you blindly write all 30 fields into es, then 90% of the data is never used for search, yet it still occupies filesystem cache space on the es machines. The larger each document, the less data the filesystem cache can hold. Instead, write only the few fields used for retrieval into es, for example just id, name, and age, and store the other field data in mysql/hbase. We generally recommend an architecture like es + hbase.
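The idea of writing only the searchable fields into es can be sketched like this (field names are illustrative):

```python
# Sketch: keep only the searchable fields in the es document and store the
# full row elsewhere (e.g. mysql/hbase). Field names are illustrative.
SEARCH_FIELDS = ("id", "name", "age")

def to_es_doc(row: dict) -> dict:
    """Project a full 30-field row down to the fields we actually search on."""
    return {k: row[k] for k in SEARCH_FIELDS}

row = {"id": 1, "name": "alice", "age": 30, "address": "...", "bio": "..."}
print(to_es_doc(row))
# → {'id': 1, 'name': 'alice', 'age': 30}
```

Only the projected document goes to es; the full row stays in the primary store.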

hbase is well suited to online storage of massive data: it can absorb huge write volumes, but you should not run complex searches against it, only very simple queries by id or by range. So: search es by name and age, get back perhaps 20 doc ids, then look up the complete row for each doc id in hbase and return the results to the front end.
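The es + hbase read path described above can be sketched as follows; the two callables stand in for real es and hbase clients (hypothetical interfaces, used here so the sketch runs without live servers):

```python
# Sketch of the es + hbase read path: search es for matching doc ids, then
# fetch the full rows from hbase by id.
def search_then_fetch(es_search, hbase_get, query):
    doc_ids = es_search(query)                       # es returns matching doc ids
    return [hbase_get(doc_id) for doc_id in doc_ids] # full rows come from hbase

# Fake backends for illustration:
fake_es = lambda q: [1, 2]
fake_hbase = lambda i: {"id": i, "name": f"user{i}", "bio": "full row"}
print(search_then_fetch(fake_es, fake_hbase, {"name": "user"}))
```

In production the first callable would run the es query and the second a hbase get by row key.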

The amount of data written to es should ideally be less than or equal to, or only slightly larger than, the capacity of es's filesystem cache. Retrieving from es may then take 20 ms, and fetching the 20 full rows from hbase by the returned ids perhaps another 30 ms. Compared with putting the full 1 TB into es, where each query might take 5-10 s, each query may now complete in about 50 ms.

Data preheating

Even if you follow the plan above, each machine in the es cluster may still hold twice as much data as its filesystem cache. For example, a machine with a 30 GB filesystem cache might store 60 GB of data, leaving 30 GB on disk. In that case you can warm up the data:

Take Weibo as an example: for big V accounts and other data that many people view, you can build a background system in advance that periodically searches for this hot data, pulling it into the filesystem cache. When users later view the same hot data, it is served straight from memory, very quickly.

Or take e-commerce: for the most-viewed products, such as the iPhone 8, build a background program in advance that proactively queries them every minute or so to keep them in the filesystem cache.

In short, for data you know is hot and frequently accessed, it is best to build a dedicated cache warm-up subsystem that accesses the hot data at regular intervals so it enters the filesystem cache. The next time someone actually visits it, performance will be much better.
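A warm-up subsystem can be sketched as a simple loop over known-hot queries; `run_search` stands in for a real es client call (a hypothetical interface, so the sketch runs standalone):

```python
import time

# Sketch of a cache warm-up subsystem: periodically re-run the queries for
# known-hot data so their segments stay in the filesystem cache.
HOT_QUERIES = [{"match": {"name": "iphone 8"}}, {"match": {"author": "big_v"}}]

def warm_up(run_search, queries, rounds=1, interval_s=0.0):
    warmed = 0
    for _ in range(rounds):
        for q in queries:
            run_search(q)          # the read itself pulls the data into cache
            warmed += 1
        time.sleep(interval_s)     # e.g. 60 s between rounds in production
    return warmed

print(warm_up(lambda q: None, HOT_QUERIES))   # → 2
```

In production this would run as a scheduled background job rather than an in-process loop.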

Hot and cold separation

es can do horizontal splitting similar to mysql: write the large volume of rarely accessed cold data into one index, and the frequently accessed hot data into a separate index. With cold data in one index and hot data in another, the warmed-up hot data tends to stay in the filesystem (OS) cache instead of being evicted by cold data.

Suppose you have six machines and two indexes, one for cold data and one for hot data, each with 3 shards. Three machines host the hot index and the other three host the cold index. Most of the traffic then goes to the hot index, which may hold only 10% of the total data; that small volume fits almost entirely in the filesystem cache, so hot-data access stays very fast. The cold data lives in a different index on different machines, with no overlap between the two. When someone does access cold data, much of it will be on disk and performance will be poor, but if only 10% of requests touch cold data while 90% hit hot data, that is acceptable.
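Hot/cold routing at write time can be sketched as picking the target index by document age; the 30-day cutoff and index names are assumptions for illustration:

```python
import datetime

# Sketch: route documents to a hot or cold index by age, so cold data never
# evicts hot data from the filesystem cache. Cutoff and names are illustrative.
HOT_DAYS = 30

def target_index(doc_date: datetime.date, today: datetime.date) -> str:
    return "orders_hot" if (today - doc_date).days <= HOT_DAYS else "orders_cold"

today = datetime.date(2024, 6, 30)
print(target_index(datetime.date(2024, 6, 20), today))   # → orders_hot
print(target_index(datetime.date(2024, 1, 1), today))    # → orders_cold
```

A background job would periodically move documents that age out of the hot index into the cold one.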

Document model design

With MySQL we often run complex relational queries. In es, try not to use complex associated (join-style) queries; once you use them, performance is generally poor.

It is best to perform the association in your Java application first and write the already-joined data directly into es. At search time you then have no need for es syntax that performs join-like associated searches.

Document model design is very important. Don't plan on performing all kinds of complex operations at search time; es supports only a limited set of operations, so don't try to make it do things it handles badly. If such an operation is needed, try to complete it when the document model is designed and the data is written. In particular, avoid operations such as join, nested, and parent-child searches as far as possible; their performance is very poor.
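Doing the "join" in application code at write time can be sketched like this (field names are illustrative):

```python
# Sketch: denormalize at write time, so the es document already contains the
# associated fields and no join is needed at search time.
def build_order_doc(order: dict, user: dict) -> dict:
    doc = dict(order)
    doc["user_name"] = user["name"]    # denormalized from the user record
    doc["user_city"] = user["city"]
    return doc

order = {"order_id": 7, "amount": 99}
user = {"id": 1, "name": "alice", "city": "Beijing"}
print(build_order_doc(order, user))
# → {'order_id': 7, 'amount': 99, 'user_name': 'alice', 'user_city': 'Beijing'}
```

The flattened document can then be filtered on `user_name` directly, with no join at query time; the trade-off is that a user rename requires reindexing their orders.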

Paging performance optimization

Paging in es is treacherous. Why? Suppose you show 10 items per page and want to query page 100. es will actually fetch the first 1000 documents from every shard to a coordinating node. With 5 shards that is 5000 documents, which the coordinating node then merges and processes before extracting the 10 documents of page 100.

Because es is distributed, you cannot get the 10 documents of page 100 by taking 2 documents from each of 5 shards and merging them on the coordinating node. Each shard must return its own top 1000 documents, which the coordinating node sorts, filters, and re-paginates according to your request to produce page 100. The deeper you page, the more documents each shard returns and the longer the coordinating node takes, which is very painful. So with es paging you will find that the further back you go, the slower it gets.
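The cost described above can be made concrete with a little arithmetic: each shard must return the top (from + size) documents, and the coordinating node merges all of them.

```python
# Sketch of why deep paging is expensive: each shard returns its top
# (from + size) documents, and the coordinating node merges them all.
def deep_paging_cost(page: int, page_size: int, shards: int):
    frm = (page - 1) * page_size          # offset of the requested page
    per_shard = frm + page_size           # docs each shard must return
    return per_shard, per_shard * shards  # (per shard, total merged)

per_shard, total = deep_paging_cost(page=100, page_size=10, shards=5)
print(per_shard, total)   # → 1000 5000
```

Page 1 merges only 50 documents; page 100 merges 5000, and the cost keeps growing linearly with page depth.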

We ran into this ourselves: using es for paging, the first few pages took tens of milliseconds, but by page 10 or a few dozen pages in, it basically took 5 to 10 seconds to return a single page.

Is there a solution?

Disallow deep paging (deep paging performance is poor by default)

Tell the product manager that the system does not allow paging that deep: by default, the deeper the page, the worse the performance.

Use infinite scrolling instead, like the product recommendations in an app that keep loading page after page as you pull down.

For Weibo-style pull-down browsing, loading one page at a time, you can use the scroll api; search online for the details of how to use it.

scroll generates a snapshot of all the matching data once, up front; each time you pull down to the next page, it advances a cursor via scroll_id to fetch that page. Its performance is much higher than the paging described above, basically milliseconds.

The one caveat is that scroll only suits Weibo-style pull-down paging, where you cannot jump to an arbitrary page. In other words, you cannot go to page 10, then page 120, then back to page 58; arbitrary jumps are impossible. That is why many products, both apps and some websites, no longer let you jump to an arbitrary page: you can only pull down to load the next one.

The scroll parameter must be specified at initialization to tell es how long to keep the context of this search alive. Make sure users don't keep paging for hours on end, or requests may fail because the context times out.
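The scroll flow can be sketched as the two request bodies involved: the initial search (sent with `?scroll=1m` in the URL) and the follow-up requests to `_search/scroll`. The query and the 1-minute keep-alive are illustrative choices:

```python
# Sketch of the scroll flow as raw request bodies.
def initial_scroll_body(query: dict, size: int = 100) -> dict:
    # POST /my_index/_search?scroll=1m  (keep-alive set in the URL)
    return {"size": size, "query": query}

def next_scroll_body(scroll_id: str) -> dict:
    # POST /_search/scroll — each call extends the context by another 1m
    return {"scroll": "1m", "scroll_id": scroll_id}

print(initial_scroll_body({"match_all": {}}))
print(next_scroll_body("DXF1ZXJ5..."))   # scroll_id comes from the previous response
```

Each response carries a `scroll_id` that you feed into the next request until no more hits are returned.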

Besides the scroll api, you can also use search_after. The idea of search_after is to use the results of the previous page to help retrieve the data of the next page. It likewise does not allow jumping to an arbitrary page; you can only page forward one page at a time. When initializing, you need to sort on a field whose values are unique.
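search_after paging can be sketched as follows; the sort on a timestamp plus a unique id field as tiebreaker, and the field names themselves, are illustrative assumptions:

```python
# Sketch: search_after paging. Sort on a field combination with unique values
# and pass the last hit's sort values as `search_after` for the next page.
def first_page(query: dict, size: int = 10) -> dict:
    return {"size": size, "query": query,
            "sort": [{"timestamp": "desc"}, {"id": "asc"}]}

def next_page(query: dict, last_sort_values: list, size: int = 10) -> dict:
    body = first_page(query, size)
    body["search_after"] = last_sort_values  # sort values of the previous page's last hit
    return body

page2 = next_page({"match_all": {}}, [1718000000000, "doc_42"])
print("search_after" in page2)   # → True
```

Unlike scroll, search_after keeps no server-side context, so it suits live user-facing paging; but like scroll, it only moves forward.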

After reading the above, have you mastered how to tune search performance in Elasticsearch? If you want to learn more, you are welcome to follow the industry information channel. Thank you for reading!
