How to Optimize Aggregations in Elasticsearch

Updated 2025-02-23. Source: SLTechnology News&Howtos (Shulou.com), Internet Technology.

This article explains how to optimize aggregations in Elasticsearch: why aggregations become slow, which settings can help, and how to set up a small experiment to compare them.

1. Why is aggregation slow?

Most of the time, an aggregation on a single field is very fast. When several fields are aggregated at the same time, however, the number of bucket combinations can become very large. The aggregation then consumes a lot of memory in Elasticsearch, which can end in an out-of-memory (OOM) error. In practice, aggregations tend to be slow in the following situations:

1) a large number of documents must be aggregated (10 million, 100 million, 1 billion or more);

2) the aggregation conditions are complex (aggregation over multiple conditions);

3) full aggregation is required (for paging scenarios).
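
To make the "large number of groups" concrete, here is a minimal Python sketch of the worst-case bucket count for nested terms aggregations. The field cardinalities are hypothetical, and the figure is only an upper bound: real data rarely contains every combination.

```python
def worst_case_buckets(cardinalities):
    """Upper bound on the buckets created by nested terms aggregations.

    cardinalities: distinct-value counts per nesting level, outermost first
    (hypothetical numbers, not taken from the article's data set).
    """
    total = 0
    combinations = 1
    for cardinality in cardinalities:
        # Every bucket of the previous level can contain every value of this level.
        combinations *= cardinality
        total += combinations
    return total


if __name__ == "__main__":
    print(worst_case_buckets([1000]))            # one field: 1000
    print(worst_case_buckets([1000, 500, 100]))  # three nested fields: 50501000
```

A single field stays at its own cardinality, while three nested fields can reach tens of millions of buckets, which is where the memory pressure comes from.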

2. Aggregation optimization options

Option 1: change depth-first collection to breadth-first collection by setting "collect_mode": "breadth_first".

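depth_first computes the sub-aggregations directly, building the buckets of every level in one pass and pruning afterwards.

breadth_first first computes the result of the current aggregation level, and then computes the sub-aggregations only for the buckets that remain in that result.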
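
The difference between the two modes can be illustrated with a toy Python model. This is only a sketch of the idea, not Elasticsearch's implementation: documents are assumed to be hypothetical (parent, child) value pairs, and the function names are made up for illustration.

```python
from collections import Counter


def depth_first(docs, top_n):
    """Build every (parent, child) bucket in one pass, then prune to the top parents."""
    parents = Counter()
    children = Counter()  # one entry per (parent, child) combination
    for parent, child in docs:
        parents[parent] += 1
        children[(parent, child)] += 1
    buckets_held = len(parents) + len(children)
    top = {p for p, _ in parents.most_common(top_n)}
    result = {key: n for key, n in children.items() if key[0] in top}
    return result, buckets_held


def breadth_first(docs, top_n):
    """Count parents first, keep the top ones, then build child buckets only for them."""
    parents = Counter(parent for parent, _ in docs)
    top = {p for p, _ in parents.most_common(top_n)}
    # Second pass over the documents, restricted to the surviving parents.
    children = Counter((p, c) for p, c in docs if p in top)
    buckets_held = len(parents) + len(children)
    return dict(children), buckets_held


if __name__ == "__main__":
    docs = [("a", i) for i in range(5)] + [("b", i) for i in range(3)] + [("c", 0)]
    df_result, df_buckets = depth_first(docs, top_n=1)
    bf_result, bf_buckets = breadth_first(docs, top_n=1)
    print(df_result == bf_result)  # same answer either way
    print(df_buckets, bf_buckets)  # 12 buckets held vs. 8
```

Both functions return the same final buckets, but the breadth-first version never materializes child buckets for parents that are pruned. The price is a second pass over the documents, which is why it pays off mainly when most parent buckets are discarded.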

Option 2: add "execution_hint": "map" inside each level of the terms aggregation.

The most detailed explanation available domestically (in Chinese) comes from "Uncle Wood". His conclusions about map mode can be summarized as follows: 1) the query results are put directly into memory to build a map, which is extremely fast when the query result set is small; 2) if the result set is large, map mode may not be faster.
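
Adding the hint "inside each level" by hand is error-prone for deeply nested requests. The following Python sketch walks an aggregation body and sets the hint on every terms aggregation it finds. The helper name add_map_hint and the field user_id are hypothetical; only "execution_hint": "map" and sin_id come from the article.

```python
import copy


def add_map_hint(aggs):
    """Return a copy of an "aggs" body with "execution_hint": "map" on every terms level."""
    result = copy.deepcopy(aggs)  # leave the caller's dictionary untouched

    def visit(level):
        for agg in level.values():
            if "terms" in agg:
                agg["terms"]["execution_hint"] = "map"
            # Sub-aggregations may be declared under either key.
            for key in ("aggs", "aggregations"):
                if key in agg:
                    visit(agg[key])

    visit(result)
    return result


if __name__ == "__main__":
    body = {
        "by_sin": {
            "terms": {"field": "sin_id", "size": 1000},
            "aggs": {
                # "user_id" is a made-up field used only for illustration.
                "by_user": {"terms": {"field": "user_id", "size": 10}}
            },
        }
    }
    print(add_map_hint(body))
```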

Option N: further options still need to be explored in practice.

3. An experiment

Where is the break-even point for aggregation?

3.1 Experimental scenario

Scenario 1: among nearly 100 million documents, retrieve the data that meets the given conditions and aggregate the results.

Scenario 2: full aggregation over documents at the million level.

Scenario 3: full aggregation over nearly 100 million documents.

3.2 Aggregation operation

POST index_*/_search
{
  "sort": [
    {
      "nrply": "desc"
    }
  ],
  "aggs": {
    "count_over_sin": {
      "terms": {
        "field": "sin_id",
        "execution_hint": "map",
        "size": 1000,
        "collect_mode": "breadth_first"
      }
    }
  },
  "size": 0
}

1) Modify the index name to aggregate over more documents.

2) Adding "execution_hint": "map" selects map mode; the default is global_ordinals mode.

3) "size": 1000 sets the number of aggregation buckets to return.
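
To compare the settings systematically, the request body can be generated for each combination of execution hint and collect mode. The Python sketch below only builds the JSON bodies; sending them to the cluster and timing the responses is left to whichever client is in use. The helper names are hypothetical, and the "sort" clause is omitted here because "size": 0 returns no hits.

```python
import itertools
import json


def build_request(execution_hint, collect_mode, size=1000):
    """Build the aggregation request from section 3.2 for one combination of settings."""
    return {
        "size": 0,  # return aggregation buckets only, no hits
        "aggs": {
            "count_over_sin": {
                "terms": {
                    "field": "sin_id",
                    "size": size,
                    "execution_hint": execution_hint,
                    "collect_mode": collect_mode,
                }
            }
        },
    }


def all_variants():
    """One request body per (execution_hint, collect_mode) pair."""
    hints = ["map", "global_ordinals"]
    modes = ["breadth_first", "depth_first"]
    return {
        (hint, mode): build_request(hint, mode)
        for hint, mode in itertools.product(hints, modes)
    }


if __name__ == "__main__":
    for (hint, mode), body in all_variants().items():
        print(hint, mode)
        print(json.dumps(body, indent=2))
```

Running each body against the three scenarios and recording the "took" value of each response would give one timing per combination.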

3.3 Aggregation result
