In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
The subquery of MYSQL has been optimized since 5.6. only EXISTS strategy is included in 5.5. the following is included in 5.7:
IN (= ANY)
-- Semi-join
-- table pullout (fastest, the subquery condition is a unique key)
-- first match
-- semi-join materialization
-- loosescan
-- duplicateweedout
-- Materialization
-- EXISTS strategy (slowest)
NOT IN (ALL)
-- Materialization
-- EXISTS strategy (slowest)
However, (not) exist does not have any way to optimize or correlate subqueries, which is different from ORACLE, where in and exists are in ORACLE.
All can be optimized using semi-join (semi). So try to use in instead of exists in MYSQL. Not in cannot use semi-join
Use it carefully, let alone use not exists. For each of the above meanings, please refer to the official manual and the mariadb manual.
Let's simply look at a Liezi.
Using semi-join materialization optimized
Mysql > explain select * from testde1 where testde1.id in (select id from testde2)
+- -+
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+- -+
| | 1 | SIMPLE | | NULL | ALL | NULL | NULL | NULL | NULL | NULL | 100.00 | NULL |
| | 1 | SIMPLE | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 15 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
| | 2 | MATERIALIZED | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL |
+- -+
3 rows in set, 1 warning (0.00 sec)
Note (Code 1003): / * select#1 * / select `test`.`testde1`.`id`AS `id`from `test`.`testde1` semi join (`test`.`testde2`) where (`test`.`testde1`.`id` = ``.`id`)
Semi join (`test`.`testde2`) illustrates the problem
Disable semi join to use Materialization optimization
Mysql > set optimizer_switch='semijoin=off'
Query OK, 0 rows affected (0.00 sec)
Mysql > explain select * from testde1 where testde1.id in (select id from testde2)
+-- +
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+-- +
| | 1 | PRIMARY | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 15 | 100.00 | Using where |
| | 2 | SUBQUERY | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL |
+-- +
2 rows in set, 1 warning (0.00 sec)
Note (Code 1003): / * select#1 * / select `test`.`testde1`.`id`AS `from `test`.`testde1`where (`test`.`testde1`.`id`, `test`.`testde1`.`id` in ((/ * select#2 * / select `test`.`testde2`.`id`from `test`.`testde2`testde1), (`test`.`testde1`.id` in on where (`test`.`testde1`.`id` = `test`.`id`)
Materialized- subquery`.`id`) illustrates the problem
Disable join using Materialization
Ysql > set optimizer_switch='materialization=off'
Query OK, 0 rows affected (0.00 sec)
Mysql > explain select * from testde1 where testde1.id in (select id from testde2)
+-- +
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+-- +
| | 1 | PRIMARY | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 15 | 100.00 | Using where |
| | 2 | DEPENDENT SUBQUERY | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
+-- +
2 rows in set, 1 warning (0.00 sec)
Note (Code 1003): / * select#1 * / select `test`.`testde1`.`id`AS `test`.`testde1`where (`test`.`testde1`.`id`, / * select#2 * / select 1 from `test`.`testde2` where ((`test`.`testde1`.`id`) = `test`.`testde2`.`id`)
Use DEPENDENT SUBQUERY associated subquery optimization, which is also the slowest. This has something to do with
The execution plan of select * from testde1 where exists (select * from testde2 where testde1.id=testde2.id); is exactly the same.
Testde1 large table must be used as driver table
Mysql > explain select * from testde1 where exists (select * from testde2 where testde1.id=testde2.id)
+-- +
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+-- +
| | 1 | PRIMARY | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 15 | 100.00 | Using where |
| | 2 | DEPENDENT SUBQUERY | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
+-- +
2 rows in set, 2 warnings (0.00 sec)
Note (Code 1276): Field or reference 'test.testde1.id' of SELECT # 2 was resolved in SELECT # 1
Note (Code 1003): / * select#1 * / select `test`.`testde1`.`id`AS `test`.`testde1`where exists (/ * select#2 * / select 1 from `test`.`testde2` where (`test`.`testde1`.`id` = `test`.`testde2`.`id`))
At the same time, it is also stated in the official documents that the subqueries in DML do not need SEMI optimization and Materialization optimization, and can only use the implication of exists, that is, only related subqueries can be used to convert to exists format.
So the speed can be imagined, this method obviously takes out a row from the outer table, drives the inner table once, and the order is fixed, while the jion generally chooses a small table as the driver table for better performance. So it is suggested that we use join.
To delete the
The original text is as follows:
A limitation on UPDATE and DELETE statements that use a subquery to modify a
Single table is that the optimizer does not use semi-join or materialization subquery
Optimizations. As a workaround, try rewriting them as multiple-table UPDATEand
DELETEstatements that use a join rather than a subquery.
It is actually the following execution plan:
Mysql > explain delete from testde1 where id in (select id from testde2)
+-- +
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+-- +
| | 1 | DELETE | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 15 | 100.00 | Using where |
| | 2 | DEPENDENT SUBQUERY | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
+-- +
2 rows in set (0.00 sec)
Convert to:
Mysql > explain delete from testde1 where exists (select * from testde2 where testde1.id=testde2.id)
+-- +
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+-- +
| | 1 | DELETE | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 15 | 100.00 | Using where |
| | 2 | DEPENDENT SUBQUERY | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
+-- +
2 rows in set, 1 warning (0.00 sec)
It can be seen exactly the same.
You should use:
Mysql > explain delete testde1 from testde1,testde2 where testde1.id=testde2.id
+-- +
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+-- +
| | 1 | SIMPLE | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL |
| | 1 | DELETE | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 15 | 10.00 | Using where |
+-- +
2 rows in set (0.00 sec)
Here we see that the small watch testde2 has made a driver table.
Finally, let's explain this error:
Mysql > delete from testde1 where id in (select testde1.id from testde1,testde2 where testde1.id=testde2.id)
ERROR 1093 (HY000): You can't specify target table 'testde1' for update in FROM clause
Regardless of whether it is meaningful or not, this error report is called ER_UPDATE_TABLE_USED in the manual. Let's first analyze this error report.
Such a delete will expand the exists, then the testde1 is both the source of the modification condition and the modified object, which is not allowed. So how to modify it?
In fact, you need to save the results of select testde1.id from testde1,testde2 where testde1.id=testde2.id in a temporary table.
Do not expand exists, the method given in the manual is
Method 1. Create a view of algorithm=temptable
Method 2. Create a general view and modify SET optimizer_switch = 'derived_merge=off'
The purpose is not to expand and choose the second way to test:
Mysql > create view myt1
-> as
-> select testde1.id from testde1,testde2 where testde1.id=testde2.id
Query OK, 0 rows affected (0.02 sec)
Mysql > delete from testde1 where id in (select * from myt1)
ERROR 1443 (HY000): The definition of table 'myt1' prevents operation DELETE on table' testde1'.
Mysql > SET optimizer_switch = 'derived_merge=off'
Query OK, 0 rows affected (0.00 sec)
Mysql > delete from testde1 where id in (select * from myt1)
Query OK, 2 rows affected (0.00 sec)
Mysql > flush status
Query OK, 0 rows affected (0.02 sec)
Mysql > delete from testde1 where id in (select * from myt1)
Query OK, 2 rows affected (0.03 sec)
Mysql > show status like'% tmp%'
+-+ +
| | Variable_name | Value |
+-+ +
| | Created_tmp_disk_tables | 0 | |
| | Created_tmp_files | 0 | |
| | Created_tmp_tables | 2 | |
+-+ +
3 rows in set (0.01sec)
Take a look at the implementation plan:
Mysql > explain delete from testde1 where id in (select * from myt1)
+- -+-+
| | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+- -+-+
| | 1 | DELETE | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 13 | 100.00 | Using where |
| | 2 | DEPENDENT SUBQUERY | | NULL | index_subquery | 5 | func | 2 | 100.00 | Using index |
| | 3 | DERIVED | testde2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL |
| | 3 | DERIVED | testde1 | NULL | ALL | NULL | NULL | NULL | NULL | 13 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+- -+-+
4 rows in set (0.00 sec)
You can see that the subquery as a whole, and from both the status and the execution plan dervied, you can see that temporary tables are used, which is feasible, but certainly not good in performance.
There is no such problem in ORACLE, and the execution plan is as follows:
SQL > delete from testde1 where id in (select testde1.id from testde1,testde2 where testde1.id=testde2.id)
0 rows deleted.
Execution Plan
Plan hash value: 2653154564
| | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |
| | 0 | DELETE STATEMENT | | 1 | 26 | 7 (15) | 00:00:01 |
| | 1 | DELETE | TESTDE1 |
| | * 2 | HASH JOIN SEMI | | 1 | 26 | 7 (15) | 00:00:01 |
| | 3 | TABLE ACCESS FULL | TESTDE1 | 5 | 65 | 2 (0) | 00:00:01 |
| | 4 | VIEW | VW_NSO_1 | 1 | 13 | 5 (20) | 00:00:01 |
| | * 5 | HASH JOIN | | 1 | 26 | 5 (20) | 00:00:01 |
| | 6 | TABLE ACCESS FULL | TESTDE2 | 1 | 13 | 2 (0) | 00:00:01 |
| | 7 | TABLE ACCESS FULL | TESTDE1 | 5 | 65 | 2 (0) | 00:00:01 |
First use hash join to build TESTDE2 and TESTDE1 into a view VW_NSO_1, and then use HASH JOIN SEMI optimization, obviously using to semi-join optimization
This is why ORACLE is still a small example of more powerful than today's MYSQL, although it is all as a whole, but MYSQL no longer uses SEMI optimization, ORACLE.
It's still OK, but MYSQL will definitely support it in the near future.
Finally, let's sum up:
1 、. So try to use in instead of exists in MYSQL. Not in cannot use semi-join. Be careful with it, let alone use not exists.
2. The subquery DML should modify the associated DML (update delete)
3. The reason for the ERROR 1093 (HY000) error is that a table is both a modified object and an object of information source. Need to use algorithm=temptable or
The way optimizer_switch = 'derived_merge=off'.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.