In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Foreword:
Always want to know how a SQL statement is executed, the order in which it is executed, and then check and summarize the information of all parties, you have the following blog post.
This article will explore the knowledge in terms of the overall architecture of MySQL-> query execution process-> statement execution order.
1. Overview of MySQL architecture:
It is best to look at the picture and match it with the necessary description.
The following picture is based on a picture in the reference book, and then adds your own understanding to it.
As we can see from the figure above, the whole architecture is divided into two tiers. The upper layer is MySQLD's so-called 'SQL Layer',' and the lower layer is a variety of storage engines that provide interfaces to the upper layer, known as' Storage Engine Layer'. The functions of other modules and components can be easily understood from their names, so I won't repeat them here.
II. Query execution process
Let me go a little further, and let me talk about the process of query execution according to my own understanding:
1. Connect
1.1 the client initiates a Query request and listens to the client's "connection management module" to receive the request
1.2 forward the request to the 'Connect in / Thread Module'
1.3 call the "user module" to check authorization
1.4 after passing the check, the connection / thread module removes the free cached connection thread from the thread connection pool and docks the client request, and if it fails, create a new connection request
two。 Deal with
2.1 query the cache first, check whether the Query statements match exactly, and then check whether you have permissions. If all are successful, you will directly take the data and return it.
2.2 if there is a failure in the previous step, it will be transferred to the "command parser". After lexical analysis and syntax analysis, the parsing tree will be generated.
2.3 then comes the preprocessing stage, which deals with semantics that cannot be solved by the parser, checks permissions, and generates a new parsing tree.
2.4 and then handed over to the corresponding module for processing
2.5 if it is a SELECT query, it will also do a lot of optimization through the "query optimizer" to generate an execution plan
2.6 after receiving the request, the module checks whether the connected user has access to the target table and target fields through the access control module.
2.7. call the 'table management module'. First, check whether it exists in table cache, then directly correspond to the table and acquire the lock, otherwise reopen the table file.
2.8.According to the meta data of the table, the information such as the type of storage engine of the table is obtained, and the corresponding storage engine is called through the API to process.
2.9 when data changes occur in the above process, if the log function is turned on, it will be recorded in the corresponding binary log file
3. Result
After the 3.1Query request is completed, the result set is returned to the 'connect in / thread module'
3.2 it can also return the corresponding status identification, such as success or failure, etc.
3.3 'Connect in / Thread Module' for subsequent cleanup and continue to wait for a request or disconnect from the client
A small summary of a picture
Third, SQL parsing order
Next, let's take a look at the past life and this life of a SQL sentence.
First, take a look at the sample statement.
SELECT DISTINCT
< select_list >FROM
< left_table > < join_type >JOIN
< right_table >ON
< join_condition >WHERE
< where_condition >GROUP BY
< group_by_list >HAVING
< having_condition >ORDER BY
< order_by_condition >LIMIT
< limit_number >However, the order in which it is executed is as follows
FROM ON JOIN WHERE GROUP BY HAVING SELECT DISTINCT ORDER BY LIMIT
Although I did not expect this, but it is still very natural and harmonious, where to obtain, constantly filter conditions, to choose the same or different, in order, then know to take the first few.
In that case, let's take a look at the details step by step.
Preparatory work
1. Create a test database
Create database testQuery
two。 Create a test table
CREATE TABLE table1 (uid VARCHAR (10) NOT NULL, name VARCHAR (10) NOT NULL, PRIMARY KEY (uid)) ENGINE=INNODB DEFAULT CHARSET=UTF8;CREATE TABLE table2 (oid INT NOT NULL auto_increment, uid VARCHAR (10), PRIMARY KEY (oid)) ENGINE=INNODB DEFAULT CHARSET=UTF8
3. Insert data
INSERT INTO table1 (uid,name) VALUES ('aaa','mike'), (' bbb','jack'), ('ccc','mike'), (' ddd','mike'); INSERT INTO table2 (uid) VALUES ('aaa'), (' aaa'), ('bbb'), (' ccc'), (NULL)
4. The last result you want
SELECT a.uid, count (b.oid) AS totalFROM table1 AS aLEFT JOIN table2 AS b ON a.uid = b.uidWHERE a. NAME = 'mike'GROUP BY a.uidHAVING count (b.oid)
< 2ORDER BY total DESCLIMIT 1; !现在开始SQL解析之旅吧! 1. FROM 当涉及多个表的时候,左边表的输出会作为右边表的输入,之后会生成一个虚拟表VT1。 (1-J1)笛卡尔积 计算两个相关联表的笛卡尔积(CROSS JOIN) ,生成虚拟表VT1-J1。 mysql>Select * from table1,table2 +-+ | uid | name | uid | +-+ | aaa | mike | 1 | aaa | | bbb | jack | 1 | aaa | | ccc | mike | 1 | aaa | ddd | mike | 1 | aaa | aaa | mike | 2 | aaa | bbb | jack | 2 | aaa | ccc | mike | 2 | aaa | | ddd | mike | | 2 | aaa | | aaa | mike | 3 | bbb | | bbb | jack | 3 | bbb | | ccc | mike | 3 | bbb | | ddd | mike | 3 | bbb | | aaa | mike | 4 | bbb | bbb | jack | 4 | bbb | ccc | mike | 4 | bbb | ddd | mike | 4 | bbb | mike | 5 | bbb | bbb | jack | 5 | bbb | ccc | mike | 5 | bbb | ddd | 5 | bbb | bbb | 6 | bbb | aaa | 6 | aaa | aaa | 6 | aaa | 6 | aaa | | | | 6 | ccc | | ddd | mike | 6 | ccc | | aaa | mike | 7 | NULL | | bbb | jack | 7 | NULL | | ccc | mike | 7 | NULL | | ddd | mike | 7 | NULL | +-+ rows in set (0.00 sec) |
(1-J2) ON filtering
Based on the virtual table VT1-J1, we filter out all the columns that meet the predicate conditions of ON, and generate the virtual table VT1-J2.
Note: because of grammatical limitations, 'WHERE'' is used instead, from which the reader can also feel the subtle relationship between the two.
Mysql > SELECT-> *-> FROM-> table1,-> table2-> WHERE-> table1.uid = table2.uid-> +-+ | uid | name | uid | +-+ | aaa | mike | 1 | aaa | | aaa | mike | 2 | aaa | | bbb | jack | 3 | bbb | | bbb | jack | 4 | bbb | bbb | jack | 5 | bbb | ccc | mike | 6 | ccc | +-- -+ rows in set (0.00 sec)
(1-J3) add external column
If an external join (LEFT,RIGHT,FULL) is used, columns in the main table (reserved table) that do not meet the ON criteria are also added to the VT1-J2 as external rows to generate the virtual table VT1-J3.
Mysql > SELECT-> *-> FROM-> table1 AS a-> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid +-+ | uid | name | uid | +-+ | aaa | mike | 1 | aaa | | aaa | mike | 2 | aaa | | bbb | jack | 3 | bbb | | bbb | jack | 4 | bbb | bbb | jack | 5 | bbb | ccc | mike | 6 | ccc | ddd | mike | NULL | NULL | +- -+ rows in set (0.00 sec)
Below from the Internet to find a very vivid explanation about 'SQL JOINS', if it infringes on your rights and interests, please let me know to delete, thank you.
2. WHERE
The temporary tables generated during the VT1 process are filtered, and the columns that satisfy the WHERE clause are inserted into the VT2 table.
Note:
Aggregate operations cannot be used at this time because of grouping, nor can aliases created in SELECT
The difference from ON:
If there are external columns, ON filters the associated table, and the main table (reserved table) returns all the columns
If no external columns are added, the effect of the two is the same
Application:
The filtering of the main table should be placed in WHERE
For related tables, ON is used for conditional query before join, and WHERE is used for conditional query after join.
Mysql > SELECT-> *-> FROM-> table1 AS a-> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid-> WHERE-> a. NAME = 'mike' +-+ | uid | name | uid | +-+ | aaa | mike | 1 | aaa | | aaa | mike | 2 | aaa | | ccc | mike | 6 | ccc | | ddd | mike | NULL | NULL | +-+ rows in set (0.00 sec)
3. GROUP BY
This clause groups the tables generated in VT2 by the columns in GROUP BY. Generate the VT3 table.
Note:
Subsequent statements that process procedures, such as SELECT,HAVING, use columns that must be included in the GROUP BY, and aggregate functions for those that do not appear
Reason:
GROUP BY changes the reference to the table to a new reference, reducing the number of columns that can be logically operated on at the next level
My understanding is:
According to the grouping field, records with the same grouping field are merged into a single record, because each grouping can only return one record unless it is filtered out, and fields that are not in the grouping field may have multiple values. Multiple values cannot be put into a single record, so these columns with multiple values must be converted into a single value through the aggregate function.
Mysql > SELECT-> *-> FROM-> table1 AS a-> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid-> WHERE-> a. NAME = 'mike'-> GROUP BY-> a.uid +-+ | uid | name | uid | +-+ | aaa | mike | 1 | aaa | | ccc | mike | 6 | ccc | | ddd | mike | NULL | NULL | +-+ rows in set (0.00 sec)
4. HAVING
This clause filters the different groups in the VT3 table, only acts on the grouped data, and the clauses that meet the HAVING condition are added to the VT4 table.
Mysql > SELECT-> *-> FROM-> table1 AS a-> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid-> WHERE-> a. NAME = 'mike'-> GROUP BY-> a.uid-> HAVING-> count (b.oid)
< 2;+-----+------+------+------+| uid | name | oid | uid |+-----+------+------+------+| ccc | mike | 6 | ccc || ddd | mike | NULL | NULL |+-----+------+------+------+rows in set (0.00 sec) 5. SELECT 这个子句对SELECT子句中的元素进行处理,生成VT5表。 (5-J1)计算表达式 计算SELECT 子句中的表达式,生成VT5-J1 (5-J2)DISTINCT 寻找VT5-1中的重复列,并删掉,生成VT5-J2 如果在查询中指定了DISTINCT子句,则会创建一张内存临时表(如果内存放不下,就需要存放在硬盘了)。这张临时表的表结构和上一步产生的虚拟表VT5是一样的,不同的是对进行DISTINCT操作的列增加了一个唯一索引,以此来除重复数据。 mysql>SELECT-> a.uid,-> count (b.oid) AS total-> FROM-> table1 AS a-> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid-> WHERE-> a. NAME = 'mike'-> GROUP BY-> a.uid-> HAVING-> count (b.oid)
< 2;+-----+-------+| uid | total |+-----+-------+| ccc | 1 || ddd | 0 |+-----+-------+rows in set (0.00 sec) 6.ORDER BY 从VT5-J2中的表中,根据ORDER BY 子句的条件对结果进行排序,生成VT6表。 注意: 唯一可使用SELECT中别名的地方; mysql>SELECT-> a.uid,-> count (b.oid) AS total-> FROM-> table1 AS a-> LEFT OUTER JOIN table2 AS b ON a.uid = b.uid-> WHERE-> a. NAME = 'mike'-> GROUP BY-> a.uid-> HAVING-> count (b.oid)
< 2 ->ORDER BY-> total DESC;+-+-+ | uid | total | +-+-+ | ccc | 1 | | ddd | 0 | +-+-+ rows in set (0.00 sec)
7.LIMIT
The LIMIT clause selects the specified row data starting from the specified location from the VT6 virtual table obtained in the previous step.
Note:
The positive and negative effects of offset and rows
When the offset is large, the efficiency is very low, you can do this:
The optimization method of subquery is adopted, in which the maximum id is obtained from the index, then in reverse order, and then the N-row result set is taken.
Using INNER JOIN optimization, the ID list is first obtained from the index in the JOIN clause, and then the final result is obtained by directly associating the query.
Mysql > SELECT-> a.uid,-> count (b.oid) AS total-> FROM-> table1 AS a-> LEFT JOIN table2 AS b ON a.uid = b.uid-> WHERE-> a. NAME = 'mike'-> GROUP BY-> a.uid-> HAVING-> count (b.oid)
< 2 ->ORDER BY-> total DESC-> LIMIT 1 + | uid | total | +-+-+ | ccc | 1 | +-+-+ row in set (0.00 sec)
At this point, the parsing journey of SQL is over. The figure above summarizes:
Reference books:
"MySQL performance tuning and Architecture practice", "MySQL Technology Insider: SQL programming"
At the end:
Well, the journey of in-depth understanding here is almost over, although it is not very in-depth, just some things put it together, referring to some books I have read before, the master's pen is really different. Moreover, get has achieved a lot of things in this process, and the most important thing is to further realize the grandeur of the computer software world.
In addition, due to my lack of talent and learning, it is inevitable that there are mistakes. If you find trouble to let me know, thank you.
Summary
The above is the whole content of this article, I hope that the content of this article has a certain reference and learning value for your study or work, if you have any questions, you can leave a message and exchange, thank you for your support.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.