Key status metrics of an ES cluster

2025-01-18 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

ES monitoring metrics fall into three levels:

1: cluster level: cluster-level monitoring covers the ES cluster as a whole, including its health status and cluster-wide statistics.

2: node level: node-level monitoring covers each ES instance, including its query and indexing metrics and its physical resource usage.

3: index level: index-level monitoring covers each individual index, including its performance metrics.

1: cluster level

How to view:

Via the API: request http://ip:9200/_cluster/health?pretty, or run the following in Dev Tools, the developer console of Kibana:

View cluster health status

GET _cluster/health

{
  "cluster_name": "jp-pte-es-hot",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 46,
  "number_of_data_nodes": 30,
  "active_primary_shards": 4857,
  "active_shards": 12674,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}

Metric descriptions:

status: cluster status, one of green, yellow, or red.
number_of_nodes / number_of_data_nodes: the number of nodes and the number of data nodes in the cluster.
active_primary_shards: the number of active primary shards in the cluster.
active_shards: the number of all active shards in the cluster, primaries and replicas together.
relocating_shards: the number of shards currently being moved from one node to another. Usually 0; it rises when a node joins or leaves the cluster.
initializing_shards: the number of shards being initialized.
unassigned_shards: the number of unallocated shards. Usually 0; it rises when a node holding replica shards is lost.
number_of_pending_tasks: the number of cluster-level tasks waiting to run, such as creating an index or assigning shards to nodes. Only the master node can process these tasks. If the value does not go down, something in the cluster is unstable.
active_shards_percent_as_number: a measure of shard health, the percentage of active shards out of the total number of shards.

View cluster status information
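As a rough sketch of how the _cluster/health fields described above might be checked from a script, the Python function below turns a parsed response into a list of warnings. The function name evaluate_health and the max_pending_tasks threshold are illustrative assumptions, not part of Elasticsearch; the sample payload is trimmed from the response shown in this article. It works on already-parsed JSON, so it runs without a cluster.

```python
# Sample payload copied from the _cluster/health response in this article
# (trimmed to the fields the check reads).
SAMPLE_HEALTH = {
    "status": "green",
    "unassigned_shards": 0,
    "initializing_shards": 0,
    "relocating_shards": 0,
    "number_of_pending_tasks": 0,
    "active_shards_percent_as_number": 100,
}


def evaluate_health(health, max_pending_tasks=0):
    """Return a list of warnings for a parsed _cluster/health response.

    max_pending_tasks is a hypothetical alert threshold chosen for this
    sketch; it is not an Elasticsearch setting.
    """
    warnings = []

    status = health.get("status")
    if status == "red":
        warnings.append("status is red: at least one primary shard is unassigned")
    elif status == "yellow":
        warnings.append("status is yellow: at least one replica shard is unassigned")
    elif status != "green":
        warnings.append(f"unexpected status: {status!r}")

    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        warnings.append(f"{unassigned} unassigned shard(s)")

    pending = health.get("number_of_pending_tasks", 0)
    if pending > max_pending_tasks:
        warnings.append(f"{pending} pending task(s) on the master node")

    percent = health.get("active_shards_percent_as_number", 100)
    if percent < 100:
        warnings.append(f"only {percent}% of shards are active")

    return warnings


if __name__ == "__main__":
    print(evaluate_health(SAMPLE_HEALTH))
```

A single pending task is not necessarily a problem; the article's point is that a value that stays high is the warning sign, so in practice the threshold would be compared across several samples.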

# Cluster stats: statistics for the entire cluster, such as the number of documents and shards and resource usage. All of the cluster's key metrics can be obtained from this API:

GET _cluster/stats?pretty

The output is large and is omitted here. Key metric descriptions:

indices.count: total number of indices.
indices.shards.total: total number of shards.
indices.shards.primaries: number of primary shards.
indices.docs.count: total number of documents.
indices.store.size_in_bytes: total size of the stored data.
indices.segments.count: total number of segments.
nodes.count.total: total number of nodes.
nodes.count.data: number of data nodes.
nodes.process.cpu.percent: node CPU utilization.
nodes.fs.total_in_bytes: total capacity of the file system.
nodes.fs.free_in_bytes: total remaining capacity of the file system.

2: node level
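The key metrics listed above for _cluster/stats can be pulled out of the large response by dotted path. The following is a minimal sketch under stated assumptions: the helper names dig and summarize_cluster_stats and the derived fs_used_percent value are introduced here for illustration, and the numbers in SAMPLE_STATS are made up rather than taken from a real cluster.

```python
# Dotted paths of the key metrics in a parsed _cluster/stats response.
KEY_PATHS = [
    "indices.count",
    "indices.shards.total",
    "indices.shards.primaries",
    "indices.docs.count",
    "indices.store.size_in_bytes",
    "indices.segments.count",
    "nodes.count.total",
    "nodes.count.data",
    "nodes.process.cpu.percent",
    "nodes.fs.total_in_bytes",
    "nodes.fs.free_in_bytes",
]

# Hypothetical values, shaped like the relevant part of the response.
SAMPLE_STATS = {
    "indices": {
        "count": 12,
        "shards": {"total": 60, "primaries": 30},
        "docs": {"count": 1000000},
        "store": {"size_in_bytes": 5000000000},
        "segments": {"count": 480},
    },
    "nodes": {
        "count": {"total": 3, "data": 3},
        "process": {"cpu": {"percent": 7}},
        "fs": {"total_in_bytes": 1000, "free_in_bytes": 250},
    },
}


def dig(payload, dotted_path):
    """Follow a dotted path such as 'nodes.fs.free_in_bytes' through nested dicts."""
    node = payload
    for key in dotted_path.split("."):
        node = node[key]
    return node


def summarize_cluster_stats(stats):
    """Collect the key metrics plus a derived file-system usage percentage."""
    summary = {path: dig(stats, path) for path in KEY_PATHS}
    total = summary["nodes.fs.total_in_bytes"]
    free = summary["nodes.fs.free_in_bytes"]
    # Derived value (an addition of this sketch, not a field of the API).
    summary["fs_used_percent"] = round(100.0 * (total - free) / total, 1) if total else None
    return summary


if __name__ == "__main__":
    for name, value in summarize_cluster_stats(SAMPLE_STATS).items():
        print(f"{name}: {value}")
```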

GET _nodes/stats?pretty

Node thread pool status

GET _nodes/stats/thread_pool?pretty

Most of the output is omitted.

"indices": {
  "docs": {
    "count": 8111612,    # number of documents on the node
    "deleted": 16604     # deleted documents not yet purged from their segments
  },
  "store": {
    "size_in_bytes": 2959876263    # physical storage consumed by the node
  },
  "indexing": {
    # Number of documents indexed. This is a cumulative counter: it does not
    # decrease when documents are deleted, and it is incremented on every
    # internal index operation, including updates.
    "index_total": 17703152,
    "is_throttled": false,
    "throttle_time_in_millis": 0    # a high value indicates that the disk throttling limit is set too low
  },
  "search": {
    "open_contexts": 0,               # number of active searches
    "query_total": 495447,            # total number of queries
    "query_time_in_millis": 298344,   # total time spent on queries since the node started
    # The ratio query_time_in_millis / query_total is a rough indicator of query
    # efficiency. The higher the ratio, the more time each query takes, and the
    # more reason to consider tuning or optimization.
    "query_current": 0,
    # The fetch statistics describe the second phase of a query (the fetch in
    # query_then_fetch). If fetch takes more time than query, the disks may be
    # slow, too many documents may be fetched, or the page size of the queries
    # may be too large (for example, size equal to 10,000).
    "fetch_total": 130194,
    "suggest_current": 0
  },
  "merges": {
    # Lucene segment merge information: the number of merges in progress, the
    # number of documents involved, the total size of the segments being merged,
    # and the total time spent on merges. These statistics matter most on
    # write-heavy clusters. Merging consumes a large amount of disk I/O and CPU,
    # and heavy indexing produces many merge operations.
  },
  "fielddata": {
    # Memory used by fielddata, which serves aggregations, sorting, and so on.
    # There is also an eviction count. Unlike filter_cache, the eviction count
    # here is very useful: it should be 0 or close to 0. Fielddata is not a
    # cache, so every eviction is expensive and should be avoided. If you see
    # evictions, re-evaluate your memory situation, your fielddata limits, your
    # queries, or all three.
  },
  "segments": {
    # Number of Lucene segments on the current node, which can be a very
    # important number. Most indices should have about 50 to 150 segments, even
    # at a few terabytes and billions of documents. A large number of segments
    # points to a merging problem (for example, merging cannot keep up with
    # segment creation). This statistic is the total for all indices on the node.
    # The memory statistic shows how much memory the Lucene segments themselves
    # need, covering low-level data structures such as posting lists,
    # dictionaries, and bloom filters. A large number of segments increases the
    # overhead of holding these structures, and this memory figure is a measure
    # of that overhead.
  }
}

Key metric descriptions:

indices.docs.count: number of documents.
indices.segments.count: total number of segments.
jvm.mem.heap_used_percent: percentage of the JVM heap in use.
thread_pool.{bulk, index, get, search}.{active, queue, rejected}: thread pool information for the bulk, index, get, and search pools. The main metrics are the number of active threads, the number of queued requests, and the number of rejected requests.

The following metrics are cumulative values that are reset when the node restarts.

indices.indexing.index_total: number of documents indexed.
indices.indexing.index_time_in_millis: total time spent indexing.
indices.get.total: number of get requests.
indices.get.time_in_millis: total time spent on get requests.
indices.search.query_total: total number of search requests.
indices.search.query_time_in_millis: total time spent on search requests.
indices.search.fetch_total: total number of fetch operations.
indices.search.fetch_time_in_millis: total time spent on fetch operations.
jvm.gc.collectors.young.collection_count: number of young-generation garbage collections.
jvm.gc.collectors.young.collection_time_in_millis: total time spent on young-generation garbage collection.
jvm.gc.collectors.old.collection_count: number of old-generation garbage collections.
jvm.gc.collectors.old.collection_time_in_millis: total time spent on old-generation garbage collection.
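The node-level signals discussed above (the query_time_in_millis / query_total ratio, fetch time, fielddata evictions, and rejected thread pool requests) can be derived from one node's entry in the _nodes/stats response. This is a sketch only: node_report and average_ms are hypothetical helper names, the query counters are the ones shown in this article, and the fetch_time_in_millis, evictions, and thread_pool values are invented for the example.

```python
# One node's entry from a parsed _nodes/stats response. The query counters
# come from the output in this article; the other values are hypothetical.
SAMPLE_NODE = {
    "indices": {
        "search": {
            "query_total": 495447,
            "query_time_in_millis": 298344,
            "fetch_total": 130194,
            "fetch_time_in_millis": 65097,  # hypothetical
        },
        "fielddata": {"evictions": 0},  # hypothetical
    },
    "thread_pool": {  # hypothetical
        "search": {"active": 1, "queue": 0, "rejected": 0},
        "bulk": {"active": 4, "queue": 37, "rejected": 12},
    },
}


def average_ms(total_time_ms, total_ops):
    """Average time per operation; 0.0 when nothing has run yet."""
    return total_time_ms / total_ops if total_ops else 0.0


def node_report(node_stats):
    """Derive a few health signals from one node's stats entry."""
    indices = node_stats.get("indices", {})
    search = indices.get("search", {})
    pools = node_stats.get("thread_pool", {})
    return {
        "avg_query_ms": average_ms(
            search.get("query_time_in_millis", 0), search.get("query_total", 0)
        ),
        "avg_fetch_ms": average_ms(
            search.get("fetch_time_in_millis", 0), search.get("fetch_total", 0)
        ),
        "fielddata_evictions": indices.get("fielddata", {}).get("evictions", 0),
        # Names of thread pools that have rejected at least one request.
        "rejected_pools": sorted(
            name for name, pool in pools.items() if pool.get("rejected", 0) > 0
        ),
    }


if __name__ == "__main__":
    print(node_report(SAMPLE_NODE))
```

Because these counters are cumulative since node start, the averages describe the node's whole uptime; comparing two samples taken some time apart gives a more current picture.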

Reference document: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html

3: index level

You can view information about all indices as follows:

GET _stats

Most of the output is omitted.

indexname.primaries.docs.count: number of documents in the index.

The following metrics are cumulative values that are reset when the node restarts.

indexname.primaries.indexing.index_total: number of documents indexed.
indexname.primaries.indexing.index_time_in_millis: total time spent indexing.
indexname.primaries.get.total: number of get requests.
indexname.primaries.get.time_in_millis: total time spent on get requests.
indexname.primaries.search.query_total: total number of search requests.
indexname.primaries.search.query_time_in_millis: total time spent on search requests.
indexname.primaries.search.fetch_total: total number of fetch operations.
indexname.primaries.search.fetch_time_in_millis: total time spent on fetch operations.
indexname.primaries.refresh.total: total number of refresh requests.
indexname.primaries.refresh.total_time_in_millis: total time spent on refresh requests.
indexname.primaries.flush.total: total number of flush requests.
indexname.primaries.flush.total_time_in_millis: total time spent on flush requests.

Reference Information:

https://blog.csdn.net/joez/article/details/52171219

https://blog.csdn.net/u010824591/article/details/78614505
