Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to deal with MySQL alarm

2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly explains "how to handle MySQL alarms". Interested friends may wish to take a look. The method introduced in this paper is simple, fast and practical. Let's let Xiaobian take you to learn "How to handle MySQL alarm"!

Recently, there was an alarm on a service that had reached the point where I couldn't bear it. The alarm message was as follows:

Metric:mysql.innodb_row_lock_waits Tags:port=4306,service=xxxx diff(#1): 996>900

There is a database monitoring metric innodb_row_lock_waits that currently exceeds the threshold of 900

However, the embarrassing thing is that every time I go to the environment to check after the alarm, the information I get is very limited. There is no sufficient information to analyze in the slow log and error log. After one to two, I began to calm down and analyze the cause of this problem.

First of all, the time point of this alarm message seems to be some regular, I took the alarm time of the last few days to do a comparison, found that it is still relatively regular, then what tasks may be triggered at the system level, I looked up the relevant task configuration, found that there is a timed task will be executed every 1 minute, but here the question comes, if executed every 1 minute, why will there be a large difference in the processing results at a specific time? Of course, the explanation for this phenomenon is the beginning.

In fact, it is quite easy to prove this point. Today, I adopted the mode of waiting for rabbits. I opened the general log before and after the time of approaching the alarm. From the log output, the frequency of operation is relatively limited.

I quickly got regular alarms, so I started to grab the relevant general log records, such as 11:18 minutes, we can use the following pattern to get the relevant log, first get a temporary general log file, all kinds of DML and execution operations are captured.

cat general.log|grep -E "insert|delete|update|select|exec" > general_tmp.log

We take 11:18 as an example, we can compare it in 1 minute before and after, and the result is as follows:

# less general_tmp.log |grep "11:18"|wc -l 400 # less general_tmp.log |grep "11:17"|wc -l 666 # less general_tmp.log |grep "11:16"|wc -l 15

It was found that the number was comparable one minute before and after the alarm.

The table contains more than 2 million data and is structured as follows:

CREATE TABLE `task_queue` ( `AccID` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'self incrementing ID', `TaskStepID` bigint(20) DEFAULT NULL COMMENT 'task step ID task_step_conf', `QOrder` int(11) DEFAULT NULL COMMENT 'queue sort task_step_confi.Step_ID', `QState` tinyint(4) DEFAULT '1' COMMENT 'Queue Status 1: Pending Execution2: Executing 3: Execution Succeeded 4: Execution Failed', `QExcCount` int(11) DEFAULT '1' COMMENT 'EXECUTIONS', `CrtTime` datetime DEFAULT NULL COMMENT 'creation time', `ModTime` datetime DEFAULT NULL COMMENT 'modification time', PRIMARY KEY (`AccID`), KEY `idx_taskstepid` (`TaskStepID`), KEY `idx_qstate` (`QState`) ) ENGINE=InnoDB AUTO_INCREMENT=3398341 DEFAULT CHARSET=utf8

According to the analysis and comparison in the log, SQL can basically be locked in a class of Update operations. The execution plan of SQL is as follows:

>>explain update task_queue set QState=1,QExcCount=QExcCount+1,modtime=now() where QState=0 and taskstepid =411\G *************************** 1. row *************************** id: 1 select_type: UPDATE table: task_queue partitions: NULL type: index_merge possible_keys: idx_taskstepid,idx_qstate key: idx_qstate,idx_taskstepid key_len: 2,9 ref: NULL rows: 11 filtered: 100.00 Extra: Using intersect(idx_qstate,idx_taskstepid); Using where; Using temporary

The key_len in this execution result is 2, 9, which is different from the previous ken_len calculation method. The Extra column has given explicit hints that this is an intersect process, specifically that it is based on secondary index level processing, and that there is an associated parameter index_merge_intersection at the optimizer level.

We know that in MySQL the primary key is a first-class citizen, and the secondary index will eventually be mapped to the primary key level processing, and the index level intersect is actually a bit of our left and right hand, the left hand corresponds to some data results mapped to a batch of primary key id, the right hand corresponds to some data results mapped to another batch of primary key id, the two primary key id values for intersect intersection calculation, so in the current scenario, the index level intersect in the end is not good?

Here I imagine three contrast scenarios for analysis. First, this is an update statement. In order to ensure the repeatability of subsequent tests, we can convert it into a select statement.

select * from task_queue where QState=0 and taskstepid =411;

So our comparison tests are based on query statements.

Scenario 1: The optimizer keeps the default index_merge_intersection on, extracts execution details based on profile>explain select * from task_queue where QState=0 and taskstepid =411\G *************************** row *************************** id: 1 select_type: SIMPLE table: task_queue partitions: NULL type: index_merge possible_keys: idx_qstate,idx_taskstepid key: idx_qstate,idx_taskstepid key_len: 2,9 ref: NULL rows: 11 filtered: 100.00 Extra: Using intersect(idx_qstate,idx_taskstepid); Using where 1 row in set, 1 warning (0.00 sec)

The profile information is:

Scenario 2: The optimizer turns off index_merge_intersection and compares based on profile>set session optimizer_switch ='index_merge_intersection= off';>explain select * from task_queue where QState=0 and taskstepid =411\G *************************** row *************************** id: 1 select_type: SIMPLE table: task_queue partitions: NULL type: ref possible_keys: idx_qstate,idx_taskstepid key: idx_qstate key_len: 2 ref: const rows: 1451 filtered: 0.82 Extra: Using where 1 row in set, 1 warning (0.00 sec)

The profile information is:

Scenario 3: Reconstruct the index and perform comparison analysis

According to the business logic, if you create a composite index, you can greatly reduce the magnitude of the result set, while still retaining the idx_qstate index, so that some businesses can still use it.

>alter table task_queue drop key idx_taskstepid; >alter table task_queue add key `idx_taskstepid` (`TaskStepID`,QState); explain select * from task_queue where QState=0 and taskstepid =411\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: task_queue partitions: NULL type: ref possible_keys: idx_qstate,idx_taskstepid key: idx_taskstepid key_len: 11 ref: const,const rows: 1 filtered: 100.00 Extra: NULL 1 row in set, 1 warning (0.00 sec)

The profile information is:

You can clearly see that with index refactoring, there are two orders of magnitude less "Sending data"

Therefore, the next thing is to further analyze and verify, reasonable and well-founded, waiting process is no longer hesitant, one day has passed, no more received an alarm, once again explain not to underestimate these alarms in the work.

At this point, I believe that everyone has a deeper understanding of "how to handle MySQL alarms". Let's actually operate it! Here is the website, more related content can enter the relevant channels for inquiry, pay attention to us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report