2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains why SQL statements that look simple in MySQL, including a query that returns a single row, can be surprisingly slow.
In MySQL, many SQL statements look logically equivalent but differ greatly in performance. Using them carelessly can put extra load on the entire database without your noticing.
Case 1: function operations on a condition field
Suppose you maintain a trading system whose transaction record table, tradelog, contains fields such as the transaction serial number (tradeid), the operator id (operator), and the transaction time (t_modified). For ease of description, let's ignore the other fields. The table is created as follows:
mysql> CREATE TABLE `tradelog` (
  `id` int(11) NOT NULL,
  `tradeid` varchar(32) DEFAULT NULL,
  `operator` int(11) DEFAULT NULL,
  `t_modified` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `tradeid` (`tradeid`),
  KEY `t_modified` (`t_modified`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Assume the table holds all data from the beginning of 2016 to the end of 2018, and operations asks for the total number of transactions that took place in July across all years. The logic doesn't look complicated, and your SQL statement might be:
mysql> select count(*) from tradelog where month(t_modified)=7;
Because there is an index on t_modified, you confidently run this statement on the production database, only to find that it takes a long time to return.
If you ask a DBA colleague why, he will probably tell you that once you apply a function to a field, the index can no longer be used, and that this is simply a rule of MySQL.
Now that you have learned the index structure of InnoDB, you can dig further. Why can the index be used when the condition is where t_modified='2018-7-1', but not when it is where month(t_modified)=7?
Below is a schematic diagram of the t_modified index. The number above the box is the value corresponding to the month () function.
Fig. 1 schematic diagram of t_modified index
If your condition is where t_modified='2018-7-1', the engine follows the route of the green arrows in the figure and quickly locates the rows with t_modified='2018-7-1'.
In fact, the fast positioning ability of a B+ tree comes from the ordering of sibling nodes on the same level.
If the condition is computed with month(), however, then when 7 is passed in, the engine does not know which way to go even at the first level of the tree.
In other words, a function operation on an indexed field may destroy the ordering of the index values, so the optimizer gives up the tree search.
Note that the optimizer does not give up the index altogether.
In this example, with tree search unavailable, the optimizer can traverse either the primary key index or the index t_modified. It compares their sizes, finds that t_modified is smaller and therefore faster to traverse, and ends up choosing the index t_modified.
Next, let's use the explain command to look at how this SQL statement is executed.
Figure 2 explain results
key="t_modified" means that the index t_modified is used. I inserted 100,000 rows into the test table, and rows=100335 indicates that the statement scans every value in the index. Using index in the Extra field indicates that a covering index is used.
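To compare the two access paths yourself, the two conditions discussed above can be run through explain side by side. This is only a minimal sketch against the tradelog table defined earlier; the row estimates and plan details you see depend on your data and MySQL version.

```sql
-- Function applied to the indexed column: the tree search is unavailable,
-- so the plan is expected to scan the whole t_modified index.
explain select count(*) from tradelog where month(t_modified)=7;

-- Condition on the column itself: the B+ tree can be searched directly,
-- so the plan is expected to touch only the matching index entries.
explain select count(*) from tradelog where t_modified='2018-7-1';
```

In the first plan, look for a rows estimate close to the table size; in the second, the estimate should be close to the number of matching rows.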
In other words, applying month() to the t_modified field results in a full index scan. To take advantage of the index's fast positioning ability, we need to change the SQL statement into a range query on the field itself. Written as follows, the statement lets the optimizer use the fast positioning ability of the t_modified index as expected.
mysql> select count(*) from tradelog where
    -> (t_modified >= '2016-7-1' and t_modified < '2016-8-1') or
    -> (t_modified >= '2017-7-1' and t_modified < '2017-8-1') or
    -> (t_modified >= '2018-7-1' and t_modified < '2018-8-1');
Case 2: implicit type conversion
Now look at this SQL statement:
mysql> select * from tradelog where tradeid=110717;
There is already an index on the tradeid field, but the explain result shows that this statement needs a full table scan. You may also notice that tradeid is of type varchar(32) while the input parameter is an integer, so a type conversion is required.
So now there are two questions:
What are the rules for data type conversion?
Why does a data type conversion lead to a full index scan?
Take the first question first. You might say that the database has so many types and so many conversion rules that you cannot remember them all. What should you do?
There is a simple way: look at the result of select "10" > 9.
If the rule is "convert the string to a number", a numeric comparison is made and the result should be 1.
If the rule is "convert the number to a string", a string comparison is made and the result should be 0.
The verification result is shown in figure 3.
Fig. 3 schematic diagram of the effect of string and number conversion in MySQL
As the figure shows, select "10" > 9 returns 1, which confirms the rule: when MySQL compares a string with a number, it converts the string to a number.
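The check above can be made a little more explicit by running both kinds of comparison next to each other. This sketch uses single quotes instead of the double quotes in the text, on the assumption that your session might have ANSI_QUOTES enabled, in which case double quotes denote identifiers.

```sql
-- String compared with a number: the string is converted to a number,
-- so this is the numeric comparison 10 > 9.
select '10' > 9;    -- returns 1

-- String compared with a string: no conversion, characters are compared
-- left to right, and '1' sorts before '9'.
select '10' > '9';  -- returns 0
```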
Now look again at the statement that caused the full table scan:
mysql> select * from tradelog where tradeid=110717;
You now know that, to the optimizer, this statement is equivalent to:
mysql> select * from tradelog where CAST(tradeid AS signed int) = 110717;
In other words, the statement triggers the rule described above: a function operation is applied to the indexed field, so the optimizer gives up the tree search.
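One way to avoid the implicit cast on the column, sketched here as a suggestion rather than something the explain figures in this article show, is to pass the parameter as a string so that both sides of the comparison are already strings.

```sql
-- Integer literal: tradeid has to be cast row by row, so the tradeid index
-- cannot be used for a tree search.
explain select * from tradelog where tradeid=110717;

-- String literal: the column is compared as-is, so the tradeid index
-- can be searched directly.
explain select * from tradelog where tradeid='110717';
```

In application code the same effect usually comes from binding the parameter with a string type instead of an integer type.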
Now, a small question for you. The type of id is int. If you execute the following statement, will it cause a full table scan?
mysql> select * from tradelog where id="83126";
You can first analyze it yourself, and then go to the database to verify and confirm.
Next, let's look at a slightly more complicated example.
Case 3: implicit character set conversion
Suppose there is another table in the system, trade_detail, that records the operation details of each transaction. To make the analysis and reproduction easier, I insert some data into both the transaction log table tradelog and the transaction detail table trade_detail.
mysql> CREATE TABLE `trade_detail` (
  `id` int(11) NOT NULL,
  `tradeid` varchar(32) DEFAULT NULL,
  `trade_step` int(11) DEFAULT NULL, /* operation step */
  `step_info` varchar(32) DEFAULT NULL, /* step information */
  PRIMARY KEY (`id`),
  KEY `tradeid` (`tradeid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
insert into tradelog values(1, 'aaaaaaaa', 1000, now());
insert into tradelog values(2, 'aaaaaaab', 1000, now());
insert into tradelog values(3, 'aaaaaaac', 1000, now());
insert into trade_detail values(1, 'aaaaaaaa', 1, 'add');
insert into trade_detail values(2, 'aaaaaaaa', 2, 'update');
insert into trade_detail values(3, 'aaaaaaaa', 3, 'commit');
insert into trade_detail values(4, 'aaaaaaab', 1, 'add');
insert into trade_detail values(5, 'aaaaaaab', 2, 'update');
insert into trade_detail values(6, 'aaaaaaab', 3, 'update again');
insert into trade_detail values(7, 'aaaaaaab', 4, 'commit');
insert into trade_detail values(8, 'aaaaaaac', 1, 'add');
insert into trade_detail values(9, 'aaaaaaac', 2, 'update');
insert into trade_detail values(10, 'aaaaaaac', 3, 'update again');
insert into trade_detail values(11, 'aaaaaaac', 4, 'commit');
At this point, if you want to query all the operation steps of the transaction with id=2, the SQL statement can be written as:
mysql> select d.* from tradelog l, trade_detail d where d.tradeid=l.tradeid and l.id=2; /* statement Q1 */
Figure 5 the execution process of statement Q1
In the figure:
Step 1: find row L2 in the tradelog table by id.
Step 2: take the value of the tradeid field from L2.
Step 3: look up the matching rows in the trade_detail table by that tradeid value. key=NULL in the second row of the explain result indicates that this step traverses the primary key index and checks the tradeid of each row one by one.
Step 3 is where things do not meet our expectations. Because there is an index on the tradeid field of trade_detail, we had hoped to locate the matching rows quickly through that index. That is not what happens.
If you ask a DBA, you may be told that the two tables use different character sets, one utf8 and the other utf8mb4, so the index on the join field cannot be used in the join. This is also the answer you usually find when you search for this problem.
But you should ask further: why does a difference in character sets prevent the index from being used?
The problem lies in step 3 of the execution process. Written as a standalone SQL statement, that step is:
mysql> select * from trade_detail where tradeid=$L2.tradeid.value;
Here the character set of $L2.tradeid.value is utf8mb4.
Recalling the previous two cases, you have probably guessed it: utf8mb4 is a superset of utf8, so when strings of the two types are compared, MySQL first converts the utf8 string to utf8mb4 and then compares.
This design is easy to understand. Since utf8mb4 is a superset of utf8, the conversion goes "in the direction of increasing data length", just as automatic type conversion does in programming languages, to avoid data errors caused by truncation.
Therefore, when the statement above is executed, the field in the driven table must be converted to utf8mb4 row by row and then compared with the value from L2.
In other words, the statement is actually equivalent to:
mysql> select * from trade_detail where CONVERT(tradeid USING utf8mb4)=$L2.tradeid.value;
Here the CONVERT() function converts the input string to the utf8mb4 character set.
This again triggers the rule described above: a function operation is applied to the indexed field, so the optimizer gives up the tree search.
Now the picture is clear. The difference in character sets is only one of the conditions. What directly causes the full table scan of the driven table is that the join requires a function operation on the indexed field of the driven table.
As a comparison, here is another requirement: "find the operation with id=4 in the trade_detail table and its corresponding operator". Let's look at this statement and its execution plan.
mysql> select l.operator from tradelog l, trade_detail d where d.tradeid=l.tradeid and d.id=4;
Figure 7 explain result after SQL statement optimization
In the optimized version of statement Q1, I explicitly convert l.tradeid to utf8, which avoids a character set conversion on the driven table. As the explain result shows, the index is used correctly this time.
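The rewritten statement itself does not appear in the text above, so the following is a reconstruction based on the description (converting l.tradeid to utf8), not a quotation. The second statement is an alternative I am assuming is acceptable in your environment: aligning the character set of the column, which is a DDL change that rebuilds the table and should be planned accordingly.

```sql
-- Option 1: convert the value coming from the driving table, so that no
-- function is applied to the indexed column d.tradeid.
select d.* from tradelog l, trade_detail d
 where d.tradeid = CONVERT(l.tradeid USING utf8) and l.id=2;

-- Option 2 (assumes a table rebuild is acceptable): make both columns
-- utf8mb4 so that no conversion is needed at all.
alter table trade_detail
  modify tradeid varchar(32) CHARACTER SET utf8mb4 default null;
```

With either option, explain on the join should show the tradeid index being used for the trade_detail lookup.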
Summary
Today I gave you three examples that all make the same point: a function operation on an indexed field may destroy the ordering of the index values, so the optimizer gives up the tree search.
The second example is an implicit type conversion and the third is an implicit character set conversion. Like the first example, both require a function on the indexed field, which leads to a full index scan.
The MySQL optimizer can fairly be accused of being "lazy" here. Even for a condition like where id+1=1000, which could use the index for fast lookup if rewritten as where id=1000-1, the optimizer will not rewrite the statement for you.
Therefore, whenever your business code is upgraded, it is a good habit to run explain on every new SQL statement it may introduce.
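The id+1 example can be checked in the same way. This is a small sketch using the tradelog table from Case 1, whose primary key is id; the plans you get may differ in detail between MySQL versions.

```sql
-- Expression on the indexed column: the primary key cannot be searched,
-- so every row has to be examined.
explain select * from tradelog where id + 1 = 1000;

-- Expression moved to the constant side: a direct primary key lookup.
explain select * from tradelog where id = 1000 - 1;
```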
Why is a query that returns only one row so slow?
For ease of description, I will construct a table and use it to illustrate today's problem. The table has two fields, id and c, and I insert 100,000 rows into it.
mysql> CREATE TABLE `t` (
  `id` int(11) NOT NULL,
  `c` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;
delimiter ;;
create procedure idata()
begin
  declare i int;
  set i=1;
  while(i<=100000) do
    insert into t values(i,i);
    set i=i+1;
  end while;
end;;
delimiter ;
call idata();
The first category: the query does not return for a long time
Execute the following SQL statement on table t:
mysql> select * from t where id=1;
The query result is not returned for a long time.
Fig. 2 schematic diagram of Waiting for table metadata lock status
This state indicates that some thread is requesting or holding an MDL write lock on table t, which blocks the select statement.
The scenario is easy to reproduce in MySQL 5.7. Figure 3 shows a simple set of reproduction steps.
Figure 4 finding the table-locked thread id
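The figures are not reproduced here, so the following is a sketch of one way to trigger and diagnose this state, not necessarily the exact steps in the figures. It assumes MySQL 5.7 or later with performance_schema enabled and the metadata lock instrument (wait/lock/metadata/sql/mdl) turned on, which the sys.schema_table_lock_waits view relies on.

```sql
-- Session A: take an MDL write lock on table t and keep it.
lock table t write;

-- Session B: this query now waits in "Waiting for table metadata lock".
select * from t where id=1;

-- Session C: find the connection that is holding the lock.
select blocking_pid from sys.schema_table_lock_waits;

-- Then disconnect it, replacing <blocking_pid> with the value returned above:
-- kill <blocking_pid>;
```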
Waiting for flush
Next, here is another case where the query is blocked.
I run the following statement to check the state of the thread:
mysql> select * from information_schema.processlist where id=1;
I'll keep you in suspense for a moment.
Take a look at figure 5. The status of this thread is Waiting for table flush. Can you guess why?
Fig. 5 schematic diagram of Waiting for table flush status
This state indicates that some thread is about to flush table t. Generally, there are two ways to flush tables in MySQL:
flush tables t with read lock;
flush tables with read lock;
If table t is specified, only table t is closed; if no table name is specified, all open tables in MySQL are closed.
Normally, both statements execute quickly unless they themselves are blocked by other threads.
So a likely cause of the Waiting for table flush state is that a flush tables command is blocked by another statement and in turn blocks our select statement.
Now, let's reproduce this situation, as shown in figure 6:
Fig. 6 steps to reproduce Waiting for table flush
In session A, I deliberately call sleep(1) once per row, so by default the statement runs for 100,000 seconds, and during this time table t stays "opened" by session A. Session B's flush tables t command then has to wait for session A's query to finish before it can close table t. As a result, when session C issues a query, it is blocked by the flush command.
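Written out as statements, the three sessions described above look roughly like this. The statements for sessions A and B follow the description in the text; the query in session C is assumed to be the same single-row select used throughout this article.

```sql
-- Session A: sleep one second per row, keeping table t open for a long time.
select sleep(1) from t;

-- Session B: has to wait until session A's query finishes.
flush tables t;

-- Session C: blocked behind the pending flush, state "Waiting for table flush".
select * from t where id=1;
```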
Figure 7 shows the show processlist result for this reproduction. Troubleshooting this case is also simple: once you see this show processlist output, you will know what to do.
Fig. 8 Row lock reproduction
Figure 10 look up row locks through sys.innodb_lock_waits
As you can see, the information is very complete: thread 4 is the culprit causing the blockage, and the way to get rid of it is KILL QUERY 4 or KILL 4.
However, KILL QUERY 4 is not the right command here. It stops the statement currently executing on thread 4, which does not help: the row lock is held by an update statement that has already finished executing, so KILL QUERY cannot make the transaction release the row lock on id=1.
KILL 4 is what actually works, because it disconnects the connection directly. The implicit logic is that when a connection is closed, the transaction running on it is rolled back automatically, which releases the row lock on id=1.
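As a sketch of the whole row lock scenario: the reproduction steps below are my assumption of what the figures show, pieced together from the text (an update that has already executed inside an open transaction, and a locking read that waits for it). The schema name test is also an assumption, and \G is the mysql command-line client's vertical output terminator.

```sql
-- Session A (thread 4 in the text): update the row and leave the
-- transaction open, so the row lock on id=1 is still held.
begin;
update t set c=c+1 where id=1;

-- Session B: the locking read waits for the row lock.
select * from t where id=1 lock in share mode;

-- Session C: find out who is blocking (MySQL 5.7 or later, sys schema).
select * from sys.innodb_lock_waits where locked_table='`test`.`t`'\G

-- Disconnect the blocking connection; its transaction is rolled back
-- and the row lock is released.
kill 4;
```

The sys.innodb_lock_waits output also contains a sql_kill_blocking_connection column with the kill statement already filled in for the blocking connection.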
The second category: slow query
After all these "locks", let's look at some examples of slow queries.
Let's start with a SQL statement whose slowness you can surely explain:
mysql> select * from t where c=50000 limit 1;
Since there is no index on field c, this statement can only scan in primary key (id) order, so 50,000 rows need to be scanned.
To confirm, you can look at the slow query log. Note that to record every statement in the slow log, I first execute set long_query_time=0 after connecting, which sets the slow query log threshold to 0.
Figure 12 scanning one row but executing very slowly
Isn't it strange? Where is all this time being spent?
If I scroll the slow log screenshot down a little, you can see the next statement, select * from t where id=1 lock in share mode. It also scans 1 row, and its execution time is 0.2 ms.
Figure 14 output of the two statements
The first statement returns c=1, while the statement with lock in share mode returns c=1000001. At this point, more of you probably know the reason. If you still have no clue, don't worry. Let me describe the reproduction steps first and then analyze the cause.
Figure 16 data status of id=1
Session B executes the update 1 million times, generating 1 million undo log records.
The statement with lock in share mode is a current read, so it reads the latest value, 1000001, directly and is very fast. The statement select * from t where id=1 is a consistent read, so it has to start from 1000001 and apply the undo log records one by one; only after 1 million steps does it return the result 1.
Note that what the undo log actually records is operations such as "change 2 to 1" and "change 3 to 2". They are drawn as "minus 1" in the figure only to make it easier to read.
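The reproduction can be sketched as follows. The way session A opens its snapshot and the exact update statement are assumptions consistent with the values quoted above (c goes from 1 to 1000001), and the sketch assumes the default REPEATABLE READ isolation level.

```sql
-- Session A: open a transaction with a consistent snapshot taken while c=1.
start transaction with consistent snapshot;

-- Session B (autocommit): run this statement 1,000,000 times, for example
-- from a loop in a script or stored procedure.
update t set c=c+1 where id=1;

-- Session A: consistent read. It has to walk back through 1,000,000 undo
-- log records to rebuild the version visible to its snapshot.
select * from t where id=1;                     -- slow, returns c=1

-- Session A: current read. It reads the latest committed version directly.
select * from t where id=1 lock in share mode;  -- fast, returns c=1000001
```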
Summary
Today I used a simple table to show examples of a "query one row" statement being blocked by locks or executing slowly. They involve the concepts of table locks, row locks, and consistent reads.
In practice, the scenarios you run into will be more complex. But the principles stay the same, and you can use the troubleshooting methods introduced in this article to locate and solve the problem.
Finally, let me leave you a question.
In the locking read example, we used the statement select * from t where id=1 lock in share mode. Because there is an index on id, the row with id=1 can be located directly, so the read lock is placed only on that row.
But what about the following sequence of SQL statements?
mysql> begin;
mysql> select * from t where c=5 for update;
mysql> commit;
How does this statement sequence acquire locks, and when are the locks released?
Question from the previous issue
The table structure is as follows:
mysql> CREATE TABLE `table_a` (
  `id` int(11) NOT NULL,
  `b` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `b` (`b`)
) ENGINE=InnoDB;
Suppose the table contains 1 million rows, of which 100,000 rows have b='1234567890'. Now suppose the following statement is executed:
mysql> select * from table_a where b='1234567890abcd';
How will MySQL execute it?
Ideally, MySQL would notice that field b is defined as varchar(10) and return an empty result directly. Unfortunately, MySQL does not do that.
Failing that, it could take '1234567890abcd' to the index for matching, quickly determine that the value does not exist in index tree b, and return an empty result almost as fast.
But MySQL does not do that either.
This SQL statement executes slowly, and the process goes like this:
When the value is passed to the engine, it is truncated. Because the column is defined in the engine with a length of 10, only the first 10 bytes, '1234567890', are used for matching.
100,000 rows satisfy this condition.
Because the statement is select *, it has to go back to the table 100,000 times.
Each time the full row is fetched from the table, the server layer checks it and finds that the value of b is not '1234567890abcd'.
The returned result is empty.
Thank you for reading. That concludes "why a simple one-line query in MySQL is slow". I hope you now have a deeper understanding of the topic; how it applies to your own system needs to be verified in practice.
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
© 2024 shulou.com SLNews company. All rights reserved.