How to parse and optimize MySQL 8.0 PFS histogram 07/19 Update SLTechnology News&Howtos

How to parse and optimize MySQL 8.0 PFS histogram

2025-07-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly introduces "how to parse and optimize MySQL 8.0 PFS histogram". In daily operation, I believe many people have doubts about how to analyze and optimize MySQL 8.0 PFS histogram. Xiaobian consulted all kinds of materials and sorted out simple and easy-to-use methods of operation. I hope it will be helpful for you to answer the doubts of "how to analyze and optimize MySQL 8.0 PFS histogram". Next, please follow the editor to study!

Introduction to Part Ⅰ

The operation and maintenance of online database often can not avoid the monitoring of statement execution time. If there are obvious and frequent slow queries or slow writes in the actual business operation, we need to pay special attention to locate the cause of the problem in time.

At this time, if the database itself can provide statistics on the execution time of statements on the instance, so that it can be macro (can observe the overall distribution of execution time) and micro (can locate slow statements), it can naturally get twice the result with half the effort.

As early as before MySQL 8.0, the performance_schema table already had multiple Statement Summary tables to record current or recent statement execution events.

Mysql > show tables from performance_schema like 'events_statements_summary%' +-+ | Tables_in_performance_schema (events_statements_summary%) | + -+ | events_statements_summary_by_account_by_event_name | | events_statements_summary_by_digest | | events_statements_summary_by_host_by_event_name | | events_statements_summary_by_program | | events_statements_summary_by_thread_by_event_name | | events_ Statements_summary_by_user_by_event_name | | events_statements_summary_global_by_event_name | +-+ 7 rows in set (0.00 sec)

According to the different grouping policies for events, these statement event summary tables are divided into seven. Events_statements_summary_by_digest stores data in groups according to the digest value of the statement and the name of the schema the statement operates on. Each row of data in the table summarizes the execution information of a set of statements of the same nature (specific values can be different) executed on a schema.

Among them, the digest of a statement refers to a statement that removes some specific values from the statement (such as the specific values inserted into the table), templates the statement, and then obtains the unique value through the hashing algorithm. For example, a statement INSERT INTO d1.t1 VALUES (1024, "hello world") is templated to get the template statement INSERT INTO d1.t1 VALUES (...), and then hashed to get a unique digest value.

For example, I inserted four rows of data into the texts table in the test library:

Mysql > use test;Database changedmysql > insert into texts values ("hello"); Query OK, 1 row affected (0.00 sec) mysql > insert into texts values ("hi"); Query OK, 1 row affected (0.01 sec) mysql > insert into texts values ("how are you"); Query OK, 1 row affected (0.01 sec) mysql > insert into texts values ("goodbye"); Query OK, 1 row affected (0.01 sec)

Then you can find the statistics performed by the above four INSERT in the table:

Mysql > select * from performance_schema.events_statements_summary_by_digest where schema_name='test'\ G.. row * * SCHEMA_NAME: test DIGEST: 894869beecac725bf46aa9c43778d476252a5b1c85ecd0139287ab15b2bd3c0b DIGEST _ TEXT: INSERT INTO `texts` VALUES (?) # templated statement COUNT_STAR: 4 # actual number of statement execution belonging to this digest SUM_TIMER_WAIT: 29030259000 # Total statement execution time MIN_TIMER_WAIT: 6432990000 # minimum statement execution time AVG_TIMER_WAIT: 7257564000 # Average statement execution time MAX_TIMER_WAIT: 8168797000 # maximum statement execution time SUM_LOCK_TIME: 495000000 SUM_ERRORS: 0SUM _ WARNINGS: 0SUM _ ROWS_AFFECTED: 4 SUM_ROWS_SENT: 0SUM _ ROWS_EXAMINED: 0SUM_CREATED_TMP_DISK_TABLES: 0 SUM_CREATED_TMP_TABLES: 0 SUM_SELECT_FULL_JOIN: 0 SUM_SELECT_FULL_RANGE_JOIN: 0 SUM_SELECT_RANGE: 0 SUM_SELECT_RANGE_CHECK: 0 SUM_SELECT_SCAN: 0 SUM_SORT_MERGE_PASSES: 0 SUM_SORT_RANGE: 0 SUM_SORT_ROWS: 0 SUM_SORT_SCAN: 0 SUM_NO_INDEX_USED: 0 SUM_NO_GOOD_INDEX_USED: 0 FIRST_SEEN: 2020-07-09 16 FIRST_SEEN 08 FIRST_SEEN 33.329338 # in the executed statements belonging to the digest Time of execution of the first article LAST_SEEN: 2020-07-09 16 08 47.193867 # in the executed statements belonging to the digest The last execution time QUANTILE_95: 8317637711 # 95% of the same digest statements execution time is less than this value QUANTILE_99: 8317637711 # 99% of the same digest statements execution time is less than this value QUANTILE_999: 8317637711 # 99.9% of the same digest statements execution time is less than this value QUERY_SAMPLE_TEXT: insert into texts values (" Hi ") # belongs to a sample of the executed statements of the digest QUERY_SAMPLE_SEEN: 2020-07-09 16 QUERY_SAMPLE_SEEN 08QUERY_SAMPLE_SEEN 37.642837 # time of sample execution QUERY_SAMPLE_TIMER_WAIT: 8168797000 # sample execution time 4 rows in set (0.00 sec)

Here the COUNT_STAR column indicates that there are four statements of the same digest executed, which is consistent with the actual insert operation above. This is followed by several statistics that we are more concerned about. SUM_TIMER_WAIT represents a total of 29030259000 ps for the four inserts, that is, about 29.03 ms;MIN_TIMER_WAIT for the shortest of the four inserts, 6.43 ms;AVG_TIMER_WAIT for the average 7.25 ms;MAX_TIMER_WAIT of the four inserts, and 8.16 ms for the longest of the four inserts.

The three QUANTILE_xx fields are also one of the data we are concerned about. In the above example, the value of QUANTILE_ 95 is 8317637711, which means that 95% of the same digest statements take less than 8.31 ms to execute.

And so on, the following QUANTILE_99 and QUANTILE_999 represent 99% and 99.9%, respectively.

It is important to note that the last three columns are the three new fields added to the table by MySQL 8.0, which gives a sample (sample) of the four inserts performed above, and provides two information about the (last) execution time and time consuming of the sample.

Introduction to Part Ⅱ MySQL 8.0 histogram

The summary table mentioned above generally records the information such as the execution time of the statement, including the maximum value, the minimum value, the average value, etc., but these information may not be enough to directly see the time distribution of the same statement.

In MySQL 8.0, two new histogram tables have been added to performance_schema to enrich the statistical information of statement execution time, namely events_statements_histogram_by_diges and events_statements_histogram_global

Histogram means "histogram" in Chinese, and as the name implies, these two tables provide statistical records in the form of a histogram of the execution time of statement execution events.

2.1 events_statements_histogram_by_digest introduction

In events_statements_histogram_by_digest, the row is identified by the schema name and digest value operated by the statement, and the histogram statistics are found in events_statements_histogram_by_digest using the four inserted digest values of the above example You can get: mysql > select * from performance_schema.events_statements_histogram_by_digest-> where DIGEST='894869beecac725bf46aa9c43778d476252a5b1c85ecd0139287ab15b2bd3c0b'-> and SCHEMA_NAME='test' and COUNT_BUCKET > 0\ gateway * 1. Row * * SCHEMA_NAME: test DIGEST: 894869beecac725bf46aa9c43778d476252a5b1c85ecd0139287ab15b2bd3c0b BUCKET_NUMBER: 141BUCKET_TIMER_LOW: 6309573444 BUCKET_TIMER_HIGH: 6606934480 COUNT_BUCKET: 1COUNT_BUCKET_AND_LOWER: 1 BUCKET_QUANTILE: 0.250000mm * 2. Row * * SCHEMA_NAME: test DIGEST: 894869beecac725bf46aa9c43778d476252a5b1c85ecd0139287ab15b2bd3c0b BUCKET_NUMBER: 143BUCKET_TIMER_LOW: 6918309709 BUCKET_TIMER_HIGH: 7244359600 COUNT_BUCKET: 1COUNT_BUCKET_AND_LOWER: 2 BUCKET_QUANTILE: 0.500000cycles * 3. Row * * SCHEMA_NAME: test DIGEST: 894869beecac725bf46aa9c43778d476252a5b1c85ecd0139287ab15b2bd3c0b BUCKET_NUMBER: 144BUCKET_TIMER_LOW: 7244359600 BUCKET_TIMER_HIGH: 7585775750 COUNT_BUCKET: 1COUNT_BUCKET_AND_LOWER: 3 BUCKET_QUANTILE: 0.750000miles * 4. Row * * SCHEMA_NAME: test DIGEST: 894869beecac725bf46aa9c43778d476252a5b1c85ecd0139287ab15b2bd3c0b BUCKET_NUMBER: 146BUCKET_TIMER_LOW: 7943282347 BUCKET_TIMER_HIGH: 8317637711 COUNT_BUCKET: 1COUNT_BUCKET_AND_LOWER: 4 BUCKET_QUANTILE: 1.0000004 rows in set (0.00 sec)

In events_statements_histogram_by_digest, statement events with the same digest value are included in different bucket according to the size of execution time. The BUCKET_TIMER_LOW field represents the lower limit of execution time of statement events in the bucket, and the BUCKET_TIMER_HIGH field represents the upper limit, such as the four inserts I executed above. The execution time of one of them is between 6309573444 ps = 6.30ms and 6606934480 ps = 6.60ms. Therefore, the statement event is included in the bucket with BUCKET_NUMBER (the identification number of bucket) of141.

The numerical value of the COUNT_BUCKET field indicates how many statement events are included in the bucket. In this example, the execution time gap of the four inserts is relatively large, so it is included in four different bucket, and I specified the filter results with a COUNT_BUCKET greater than 0 when querying, so the data of "empty bucket" is not shown. COUNT_BUCKET_AND_LOWER indicates how many statement events have execution time less than the upper limit of the bucket, while BUCKET_QUANTILE indicates how many percent of the statement events have execution time less than the upper limit of the bucket. In this example, the execution time limit of the 144th bucket (in line 3) is 7.58 ms, and 3 of the 4 inserts have execution time less than this value, that is, 75% of the execution time is less than this value.

You can also visually show the execution time distribution of the digest statement through the following query: mysql > SELECT DIGEST_TEXT,-> CONCAT ('

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.