
An SQL optimization problem from Ali for MySQL

2025-03-31 Update From: SLTechnology News&Howtos


This is original work and my expertise is limited; please point out any mistakes.

Today, after a day spent studying InnoDB, my colleague Tian Xingchun told me about an optimization question published by Ali. He gave me the tables and the statement and mentioned that the statement contains an implicit conversion.

Someone in the group chat had also mentioned this question yesterday, but there was no time to read it then, so I took a look now. The whole script is as follows:

Data preparation:

create table a (id int auto_increment,seller_id bigint,seller_name varchar(100) collate utf8_bin,gmt_create varchar(30),primary key (id));
insert into a (seller_id,seller_name,gmt_create) values (100000,'uniqla','2017-01-01');
insert into a (seller_id,seller_name,gmt_create) values (100001,'uniqlb','2017-02-01');
insert into a (seller_id,seller_name,gmt_create) values (100002,'uniqlc','2017-03-01');
insert into a (seller_id,seller_name,gmt_create) values (100003,'uniqld','2017-04-01');
insert into a (seller_id,seller_name,gmt_create) values (100004,'uniqle','2017-05-01');
insert into a (seller_id,seller_name,gmt_create) values (100005,'uniqlf','2017-06-01');
insert into a (seller_id,seller_name,gmt_create) values (100006,'uniqlg','2017-07-01');
insert into a (seller_id,seller_name,gmt_create) values (100007,'uniqlh','2017-08-01');
insert into a (seller_id,seller_name,gmt_create) values (100008,'uniqli','2017-09-01');
insert into a (seller_id,seller_name,gmt_create) values (100009,'uniqlj','2017-10-01');
insert into a (seller_id,seller_name,gmt_create) values (100010,'uniqlk','2017-11-01');
insert into a (seller_id,seller_name,gmt_create) values (100011,'uniqll','2017-12-01');
insert into a (seller_id,seller_name,gmt_create) values (100012,'uniqlm','2018-01-01');
insert into a (seller_id,seller_name,gmt_create) values (100013,'uniqln','2018-02-01');
insert into a (seller_id,seller_name,gmt_create) values (100014,'uniqlo','2018-03-01');
insert into a (seller_id,seller_name,gmt_create) values (100015,'uniqlp','2018-04-01');
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) select seller_id,seller_name,gmt_create from a;
insert into a (seller_id,seller_name,gmt_create) values (100016,'uniqlq',now());

create table b (id int auto_increment,seller_name varchar(100),user_id varchar(50),user_name varchar(100),sales bigint,gmt_create varchar(30),primary key (id));
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqla','1','a',1,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlb','2','b',3,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlc','3','c',1,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqld','4','d',4,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqle','5','e',5,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlf','6','f',1,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlg','7','g',7,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlh','8','h',1,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqli','9','i',1,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlj','10','j',15,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlk','11','k',61,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqll','12','l',31,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlm','13','m',134,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqln','14','n',1455,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlo','15','o',166,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('niqlp','16','p',15,now());
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) select seller_name,user_id,user_name,sales,gmt_create from b;
insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('uniqlq','17','s',109,now());

create table c (id int auto_increment,user_id varchar(50),order_id varchar(100),state bigint,gmt_create varchar(30),primary key (id));
insert into c (user_id,order_id,state,gmt_create) values ('21','1',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('22','2',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('33','3',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('43','4',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('54','5',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('65','6',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('75','7',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('85','8',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('95','8',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('100','8',0,now());
insert into c (user_id,order_id,state,gmt_create) values ('150','8',0,now());
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) select user_id,order_id,state,gmt_create from c;
insert into c (user_id,order_id,state,gmt_create) values ('17','8',0,now());

SQL to be optimized:

select a.seller_id,a.seller_name,b.user_name,c.state
from a,b,c
where a.seller_name=b.seller_name
and b.user_id=c.user_id
and c.user_id=17
and a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE)
AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)
order by a.gmt_create;

First, a note: this optimization question mainly tests the following five points:

1. The difference between BNL and NLJ

2. The implementation of NLJ

3. The DBA's observation of data distribution

4. Implicit conversion prevents index use

5. Different collations prevent index use

I. The five points one by one

1. The difference between BNL and NLJ

For this difference, refer to my article:

http://blog.itpub.net/7728585/viewspace-2129502/

(MYSQL MRR NLJ BNL BKA discussed from the principle of sequential and random I/O)

To put it simply, BNL (Block Nested Loop) is generally used when the driven table is accessed with type=index or type=ALL, because the join condition of the driven table has no index. A join buffer is required: rows from the driving table are read (physical/logical reads) and placed in the join buffer. The main purpose is to reduce the number of times the driven table is scanned and so improve efficiency, because without an index it is far too slow to scan the driven table once per driving row. The B stands for Block.

NLJ (Nested Loop Join) is generally used when the join condition of the driven table has an index. Access goes through ref or eq_ref on the index (depending on whether the index is unique), which is naturally much faster. The join buffer is not used in this case. Each row read from the driving table (physical/logical read) probes the driven table once, and because the join column of the driven table is indexed, the probe is fast (locate the entry in the index, then go back to the table).
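
A quick way to see which algorithm is chosen is to compare the Extra column of EXPLAIN before and after indexing the join column of the driven table. The following is only a sketch: it assumes the tables b and c from the data preparation script and MySQL 5.7-style plan output, and the index name idx_demo_user_id is made up for illustration. The plan actually chosen depends on version and statistics; from MySQL 8.0.18 onward a hash join is often used where BNL would have appeared.

```sql
-- No index on c.user_id yet: on MySQL 5.7 the Extra column for the driven
-- table would typically show "Using join buffer (Block Nested Loop)".
explain select b.user_name, c.state
from b join c on b.user_id = c.user_id
where b.user_name = 's';

-- Hypothetical demo index on the join column of the driven table.
alter table c add index idx_demo_user_id (user_id);

-- Same statement again: the driven table is now likely to be read with
-- type=ref through the index (NLJ), with no join buffer in Extra.
explain select b.user_name, c.state
from b join c on b.user_id = c.user_id
where b.user_name = 's';

-- Remove the demo index so it does not interfere with the rest of the article.
alter table c drop index idx_demo_user_id;
```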

2. The implementation of NLJ

See the article above as well. It covers the outline, so I will not repeat it here.

3. The DBA's observation of data distribution

This is something only a person can do. To put it simply: if a table has 100 rows, 99 with no=1 and one with no=2, we need to be aware of that. If this table is used as the driving table, filtering on no=2 works far better than filtering on no=1. This question contains the same factor:

obviously, c.user_id='17' matches only one row.
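
One way to check such a distribution is a simple grouped count. This is a minimal sketch against table c from the data preparation script; the alias cnt and the limit of 5 are arbitrary choices.

```sql
-- Rows per user_id in c, rarest values first. A value that appears only
-- once is a good candidate for the driving-side filter.
select user_id, count(*) as cnt
from c
group by user_id
order by cnt
limit 5;
```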

4. Implicit conversion prevents index use

This is a problem in both MySQL and Oracle.

Oracle shows something like to_char(id)='1' in the plan.

In MySQL there will be warnings similar to the following:

| Warning | 1739 | Cannot use ref access on index 'user_id' due to type or collation conversion on field 'user_id' |

| Warning | 1739 | Cannot use range access on index 'user_id' due to type or collation conversion on field 'user_id' |

As here:

c.user_id=17

where user_id is a varchar type, not an int type.

Or as here:

a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE)
AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)

where gmt_create varchar(30) is also a varchar!
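
To reproduce the warning, compare a numeric literal with a string literal against the varchar column. The sketch assumes an index named user_id already exists on c(user_id), which the original script does not create by itself.

```sql
-- Number compared with a varchar column: the index cannot be used for
-- ref access, and EXPLAIN is expected to raise warning 1739.
explain select * from c where user_id = 17;
show warnings;

-- String compared with the same column: no conversion, so the index
-- on user_id can be used.
explain select * from c where user_id = '17';
show warnings;
```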

5. Different collations prevent index use

I have already described string comparison in this article:

http://blog.itpub.net/7728585/viewspace-2141914/

To put it simply, in

a.seller_name=b.seller_name

the collation of a.seller_name is utf8_bin, which is case-sensitive,

while

b.seller_name uses the default collation, which is case-insensitive.

A join between them inevitably prevents the driven table from using the index normally (InnoDB can still apply ICP).

There will also be a warning similar to the following:

Cannot use ref access on index 'seller_name' due to type or collation conversion on field 'seller_name'
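
The collations of the two join columns can be compared directly from the data dictionary. A minimal sketch, assuming tables a and b live in the currently selected schema:

```sql
-- Show the character set and collation of both seller_name columns.
select table_name, column_name, character_set_name, collation_name
from information_schema.columns
where table_schema = database()
  and table_name in ('a', 'b')
  and column_name = 'seller_name';
```

If the two collation_name values differ, the join condition involves a collation conversion.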

II. Optimization principles

We know that almost all statement execution logic lives at the MySQL server layer. InnoDB is only responsible for scanning the data out using one of several search modes

(PAGE_CUR_G, PAGE_CUR_GE, PAGE_CUR_L, PAGE_CUR_LE) and handing it to the MySQL layer for processing. In between, every scanned row goes through

the conversion innodb record --> innodb tuple --> mysql record, and this work is mostly what shows up as the Sending data state

(updating for update/delete). So we need to reduce the intermediate result sets in order to reduce the total amount of data passed from InnoDB to the MySQL layer.

This is explained here in terms of NLJ optimization principles, because that is what this question is about.

1. Reduce the size of the NLJ driving result set

Obviously, fewer driving rows naturally means less data transferred between InnoDB and MySQL.

2. The index on the driven table should be as selective as possible

This is a little hard to grasp at first, but it makes sense once you think it through: the closer to unique the driven table's index is, the fewer rows have to be fetched back from the table through the index.

A rough judgment can be made from rows and filtered in the plan. It is only rough, because those values are not exact in the first place.

My colleague Tian Xingchun and I once looked at an example of this. A driven table had two join conditions: the column with poor selectivity had an index,

and the column with good selectivity had none. We built an index on the selective column, and performance improved immediately.

III. On this question

Let's first avoid the implicit conversion in c.user_id=17 by changing 17 to '17'. Strictly speaking there is no need to change it; the reason is given later.

select a.seller_id,a.seller_name,b.user_name,c.state
from a,b,c
where a.seller_name=b.seller_name
and b.user_id=c.user_id
and c.user_id='17'
and a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE)
AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)
order by a.gmt_create;

Then let's take a look at the execution plan.

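+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  16108 |    11.11 | Using where; Using temporary; Using filesort       |
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |  16173 |          | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | c     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 359382 |          | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
3 rows in set, 1 warning (0.02 sec)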

BNL is used here because none of the join conditions has an index. We then start looking at the data and notice the following.

The last row inserted into table c is

insert into c (user_id,order_id,state,gmt_create) values ('17','8',0,now());

The last row inserted into table b is

insert into b (seller_name,user_id,user_name,sales,gmt_create) values ('uniqlq','17','s',109,now());

The last row inserted into table a is

insert into a (seller_id,seller_name,gmt_create) values (100016,'uniqlq',now());

We can see that the whole statement returns only one row, no matter how large table c is. This is the point about the DBA observing data distribution.

Following the optimization approach: filtering table c with c.user_id='17' yields a driving result set of only one row (in fact MySQL automatically derives the same condition for table b).

Then joining table b (b.user_id=c.user_id) naturally leaves only one row in the intermediate driving result set. Finally, joining table a through (a.seller_name=b.seller_name)

naturally yields only one row as well.

As for

and a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE)
AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)
order by a.gmt_create

neither of them needs attention.

Following this idea,

we first build indexes on c.user_id and b.user_id. The index on c.user_id is meant to filter c.user_id='17' through the index. The index on b.user_id is there so that the driven table is accessed by NLJ through the index instead of BNL with a full table scan.
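
The article does not show the DDL for this step. A minimal sketch could look like the following; the index name user_id is taken from the key column of the execution plans, everything else is an assumption.

```sql
-- Index used to locate c.user_id='17' directly.
alter table c add index user_id (user_id);

-- Index on the join column so that b can be reached through ref access.
alter table b add index user_id (user_id);
```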

The execution plan is changed to:

mysql> desc select a.seller_id,a.seller_name,b.user_name,c.state from a,b,c where a.seller_name=b.seller_name and b.user_id=c.user_id and c.user_id='17' and a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE) AND DATE_ADD(NOW(), INTERVAL 600 MINUTE) order by a.gmt_create;
+----+-------------+-------+------------+------+---------------+---------+---------+-------+-------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref   | rows  | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+-------+----------+----------------------------------------------------+
|  1 | SIMPLE      | b     | NULL       | ref  | user_id       | user_id | 153     | const |     1 |   100.00 | Using temporary; Using filesort                    |
|  1 | SIMPLE      | c     | NULL       | ref  | user_id       | user_id | 153     | const |     1 |   100.00 | Using index condition                              |
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL    | NULL    | NULL  | 16108 |     1.11 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+-------+----------+----------------------------------------------------+

Obviously a transformation happens here for b.user_id=c.user_id and c.user_id='17', which we can see through the optimizer trace.

"resulting_condition": "((`a`.`seller_name` = `b`.`seller_name`) and (`c`.`user_id` = '17') and (`a`.`gmt_create` between (now() + interval -(600) minute) and (now() + interval 600 minute)) and multiple equal(`b`.`user_id`, `c`.`user_id`))"

Pay attention to this part:

multiple equal(`b`.`user_id`, `c`.`user_id`)

This is the actual transformation: because of it, b.user_id='17' obviously holds as well, which is why b can be read through its index with ref=const.
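
The resulting_condition fragment comes from the optimizer trace. A sketch of how such a trace can be captured in a session follows; the statement being traced is the one from this article, and the trace text may be truncated if it exceeds optimizer_trace_max_mem_size.

```sql
-- Turn the optimizer trace on for this session.
set optimizer_trace = 'enabled=on';

-- Run the statement to be traced.
select a.seller_id,a.seller_name,b.user_name,c.state
from a,b,c
where a.seller_name=b.seller_name
and b.user_id=c.user_id
and c.user_id='17'
and a.gmt_create between date_add(now(), interval -600 minute)
and date_add(now(), interval 600 minute)
order by a.gmt_create;

-- Read the trace of the last statement, then switch tracing off again.
select trace from information_schema.optimizer_trace;
set optimizer_trace = 'enabled=off';
```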

All that remains is to solve the BNL problem on table a. We cannot let table a be scanned with type=ALL if we want the statement to be fast.

After we build indexes on a.seller_name and b.seller_name, the execution plan becomes:

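mysql> desc select a.seller_id,a.seller_name,b.user_name,c.state from a,b,c where a.seller_name=b.seller_name and b.user_id=c.user_id and c.user_id='17' and a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE) AND DATE_ADD(NOW(), INTERVAL 600 MINUTE) order by a.gmt_create;
+----+-------------+-------+------------+------+---------------------+-------------+---------+--------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys       | key         | key_len | ref                | rows | filtered | Extra                                        |
+----+-------------+-------+------------+------+---------------------+-------------+---------+--------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | b     | NULL       | ref  | user_id,seller_name | user_id     | 153     | const              |    1 |   100.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | c     | NULL       | ref  | user_id             | user_id     | 153     | const              |    1 |   100.00 | Using index condition                        |
|  1 | SIMPLE      | a     | NULL       | ref  | seller_name         | seller_name | 303     | test.b.seller_name |  947 |    11.11 | Using index condition; Using where           |
+----+-------------+-------+------------+------+---------------------+-------------+---------+--------------------+------+----------+----------------------------------------------+
3 rows in set, 2 warnings (0.00 sec)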
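
The two indexes behind the plan above might have been created as follows. This is a sketch only: the index name seller_name is taken from the key column of the plan, and the original DDL is not shown in the article.

```sql
-- Index on the join column of the driven table a.
alter table a add index seller_name (seller_name);

-- Index on the same column in b (listed under possible_keys in the plan).
alter table b add index seller_name (seller_name);
```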

We appear to be using the index at this point, but that is only thanks to ICP. Look at the warning:

Cannot use ref access on index 'seller_name' due to type or collation conversion on field 'seller_name'

To eliminate this problem, we have to change the collation of a.seller_name so that it matches b.seller_name.
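
A possible form of that change is sketched below. Two things are assumptions here: the column length of 100 (inferred from key_len 303 = 100 * 3 bytes + 2 length bytes + 1 NULL flag) and utf8_general_ci as the collation of b.seller_name. It is worth checking the actual collation of b.seller_name first, for example with show full columns from b like 'seller_name'.

```sql
-- Assumed target: utf8 / utf8_general_ci, matching b.seller_name.
-- The index on a.seller_name is rebuilt with the new collation.
alter table a
  modify seller_name varchar(100) character set utf8 collate utf8_general_ci;
```

Note that this makes comparisons on a.seller_name case-insensitive, which changes matching semantics compared with utf8_bin.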

Finally, we get this execution plan:

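+----+-------------+-------+------------+------+---------------------+-------------+---------+--------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys       | key         | key_len | ref                | rows | filtered | Extra                                        |
+----+-------------+-------+------------+------+---------------------+-------------+---------+--------------------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | b     | NULL       | ref  | user_id,seller_name | user_id     | 153     | const              |    1 |   100.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | c     | NULL       | ref  | user_id             | user_id     | 153     | const              |    1 |   100.00 | Using index condition                        |
|  1 | SIMPLE      | a     | NULL       | ref  | seller_name         | seller_name | 303     | test.b.seller_name |    1 |    11.11 | Using where                                  |
+----+-------------+-------+------------+------+---------------------+-------------+---------+--------------------+------+----------+----------------------------------------------+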

Now everything is normal. Using index condition (ICP) is gone from table a and only a where remains, and this where is obviously

a.gmt_create BETWEEN DATE_ADD(NOW(), INTERVAL -600 MINUTE)
AND DATE_ADD(NOW(), INTERVAL 600 MINUTE)

As for

Using temporary; Using filesort

we can ignore it, since only one row is involved.

This is the end of the optimization.

Profile after optimization:

+----------------------+----------+----------+------------+
| Status               | Duration | CPU_user | CPU_system |
+----------------------+----------+----------+------------+
| starting             | 0.000169 | 0.000000 |   0.000000 |
| checking permissions | 0.000005 | 0.000000 |   0.000000 |
| checking permissions | 0.000006 | 0.000000 |   0.000000 |
| checking permissions | 0.000005 | 0.000000 |   0.000000 |
| Opening tables       | 0.000026 | 0.000000 |   0.000000 |
| init                 | 0.000055 | 0.000000 |   0.000000 |
| System lock          | 0.000013 | 0.000000 |   0.000000 |
| optimizing           | 0.000018 | 0.000000 |   0.000000 |
| statistics           | 0.000118 | 0.000000 |   0.000000 |
| preparing            | 0.000022 | 0.000000 |   0.000000 |
| Creating tmp table   | 0.000030 | 0.000000 |   0.000000 |
| Sorting result       | 0.000007 | 0.000000 |   0.000000 |
| executing            | 0.000003 | 0.000000 |   0.000000 |
| Sending data         | 0.000101 | 0.000000 |   0.000000 |
| Creating sort index  | 0.000027 | 0.000000 |   0.000000 |
| end                  | 0.000004 | 0.000000 |   0.000000 |
| query end            | 0.000059 | 0.001000 |   0.000000 |
| removing tmp table   | 0.000096 | 0.000000 |   0.000000 |
| query end            | 0.000004 | 0.000000 |   0.000000 |
| closing tables       | 0.000008 | 0.000000 |   0.000000 |
| freeing items        | 0.000018 | 0.000000 |   0.000000 |
| cleaning up          | 0.000022 | 0.000000 |   0.000000 |
+----------------------+----------+----------+------------+

Profile before optimization:

+----------------------+----------+----------+------------+
| Status               | Duration | CPU_user | CPU_system |
+----------------------+----------+----------+------------+
| starting             | 0.000226 | 0.000000 |   0.000000 |
| checking permissions | 0.000011 | 0.000000 |   0.000000 |
| checking permissions | 0.000006 | 0.000000 |   0.000000 |
| checking permissions | 0.000014 | 0.000000 |   0.000000 |
| checking permissions | 0.000006 | 0.000000 |   0.000000 |
| checking permissions | 0.000004 | 0.000000 |   0.000000 |
| checking permissions | 0.000007 | 0.000000 |   0.000000 |
| Opening tables       | 0.000039 | 0.000000 |   0.000000 |
| init                 | 0.000238 | 0.001000 |   0.000000 |
| System lock          | 0.000029 | 0.000000 |   0.000000 |
| optimizing           | 0.000118 | 0.000000 |   0.000000 |
| statistics           | 0.000176 | 0.000000 |   0.000000 |
| preparing            | 0.000112 | 0.000000 |   0.000000 |
| Creating tmp table   | 0.000052 | 0.000000 |   0.000000 |
| Sorting result       | 0.000019 | 0.000000 |   0.000000 |
| executing            | 0.000005 | 0.000000 |   0.000000 |
| Sending data         | 0.231418 | 0.230965 |   0.000000 |
| Creating sort index  | 0.000055 | 0.000000 |   0.000000 |
| end                  | 0.000006 | 0.000000 |   0.000000 |
| query end            | 0.000012 | 0.000000 |   0.000000 |
| removing tmp table   | 0.000005 | 0.000000 |   0.000000 |
| query end            | 0.000004 | 0.000000 |   0.000000 |
| closing tables       | 0.000011 | 0.000000 |   0.000000 |
| freeing items        | 0.000347 | 0.000000 |   0.000000 |
| cleaning up          | 0.000015 | 0.000000 |   0.000000 |
+----------------------+----------+----------+------------+

Obviously Sending data takes far too long before optimization (0.231418 s against 0.000101 s afterwards), which is the problem of data exchange between InnoDB and the MySQL layer that I mentioned earlier.
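
Profiles with these columns can be collected with session profiling. A minimal sketch, assuming a MySQL version where the (deprecated) profiling feature is still available; the query number 1 is a placeholder for the Query_ID reported by show profiles.

```sql
-- Enable profiling for the current session.
set profiling = 1;

-- Run the statement to be measured.
select a.seller_id,a.seller_name,b.user_name,c.state
from a,b,c
where a.seller_name=b.seller_name
and b.user_id=c.user_id
and c.user_id='17'
and a.gmt_create between date_add(now(), interval -600 minute)
and date_add(now(), interval 600 minute)
order by a.gmt_create;

-- List profiled statements, then show stage durations with CPU columns.
show profiles;
show profile cpu for query 1;
```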
