What is the use of disconnection in MySQL 07/12 Update SLTechnology News&Howtos

What is the use of disconnection in MySQL

2025-07-12 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article is to share with you about the usefulness of anti-connections in MySQL. The editor thinks it is very practical, so share it with you as a reference and follow the editor to have a look.

In the join of the table, the semi-join and anti-join itself is very common, but the lack of rich statistical information may lead to a big difference in the evaluation of the execution plan, which is likely to magnify the difference in the implementation of semi-join, anti-join and execution path, resulting in the deterioration of SQL performance, while the gap between in and exists in MySQL is also decreasing.

I'll just simplify my description and illustrate some of the differences in MySQL version 5.6. It's a connecting link between 5.5 and 5.7.

We create a table t_fund_info with a data volume of 2 million, and another table with the same amount of data as t_fund_info. T_fund_info has primary key fields account,t_user_login_record does not have an index.

The SQL statement is as follows:

Select account

From t_fund_info

Where money > = 300

And account not in (select distinct (account))

From t_user_login_record

Where add_time > = '2016-06-01'). The execution plan is as follows:

The column select_type PRIMARY inside represents the outermost query in the subquery, which is not the primary key query. SUBQUERY represents the first SELECT of the inner query of the subquery, and the result does not depend on the result set of the external query.

From type to ALL represents a full table scan, so in such a query, both tables are full table scan, how is it decomposed when parsing inside MySQL. We use explain extended to get more detailed information.

/ * select#1 * /

Select test. T_fund_info. Account AS account

From test. T_fund_info

Where ((test. T_fund_info. Money > = 300) and

(not (

< in_optimizer >

(test. T_fund_info. Account, test. T_fund_info.

Account in

(

< materialize >

(/ * select#2 * /

Select test. T_user_login_record. Account

From test. T_user_login_record

Where (test. T_user_login_record. Add_time > = '2016-06-01')

< primary_index_lookup >

(test. T_fund_info. Account in

< temporary table >

< auto_key >

Where ((test. T_fund_info. Account = materialized-subquery.

Account) you can see that temporary tables are enabled and the data of the subquery is fetched as subsequent cache processing data.

How much does this kind of processing improve the performance, in fact, it is not big, and the performance improvement is also very limited.

Let's change our thinking, and that is to use not exists.

Explain extended select t1.account from t_fund_info T1 where t1.money > = 300and not exists (select distinct (t2.account) from t_user_login_record T2 where t1.account=t2.account and t2.add_time > = '2016-06-01'); how does this approach decompose in MySQL.

Select test. T1. Account AS account

From test. T_fund_info t1

Where ((test. T1. Money > = 300) and

(not

(exists (/ * select#2 * /)

Select test. T2. Account

From test. T_user_login_record t2

Where ((test. T1. Account = test. T2. Account) and

(test. T2. Add_time > = '2016-06-01') you can see that few special changes have been made.

This is a very similar way of thinking in 5. 5. 6. 6. 7.

Of course, the performance improvement in this way is relatively small. One limitation is that the statistical information is not rich enough, so there is a big gap in automatic assessment.

Let's put this place down for a while, and we'll take a look at it after we add an index.

Create index ind_account_id2 on t_user_login_record (account)

Then use not in to view the details of the parsing.

Select test. T_fund_info. Account AS account

From test. T_fund_info

Where ((test. T_fund_info. Money > = 300) and

(not (

< in_optimizer >

(test. T_fund_info.

Account

< exists >

(

< index_lookup >

(

< cache >

(test. T_fund_info. Account) in t_user_login_record on

Ind_account_id2

Where ((test. T_user_login_record. Add_time > = '2016-06-01') and

(

< cache >

(test. T_fund_info. Account) = test.

T_user_login_record. Account)

You can see that with indexes in this way, not in and not exits parse in a very similar way. One difference is the way it is handled outside of subqueries.

Let's take a look at the difference between the two. In the same step, with the index, the estimated key_len (the length of the index used) is 182 and the estimated number of rows is 1.

-+-

Key | key_len | ref | rows

-+-

NULL | NULL | NULL | 1875524

Ind_account_id2 | 182 | func | 1, but when there is no index before, this result is very different, which is more than 1.9 million.

-+-

Key | key_len | ref | rows

-+-

NULL | NULL | NULL | 1875524

NULL | NULL | NULL | 1945902 and see if there is any change in the way not exists works with the index.

/ * select#1 * /

Select test. T1. Account AS account

From test. T_fund_info t1

Where ((test. T1. Money > = 300) and

(not

(exists (/ * select#2 * /)

Select test. T2. Account

From test. T_user_login_record t2

Where ((test. T1. Account = test. T2. Account) and

(test. T2. Add_time > = '2016-06-01')

As you can see above, there is no difference between parsing without adding an index. Where will there be a difference, that is, there is a big difference in the estimated number of rows to be executed.

So through such a small example of anti-join, we can see that when there is an index, not in will be internally converted to not exists, while the way of not exists exists and does not exist, the two can see a big difference by executing the plan, and one of the bottlenecks is the estimated number of rows.

Thank you for reading! This is the end of this article on "what is the use of anti-connections in MySQL?". I hope the above content can be of some help to you, so that you can learn more knowledge. if you think the article is good, you can share it for more people to see!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.