Today I will talk about how a Join is executed in MySQL, which many people may not understand very well. I have summarized the following for you and hope you get something out of this article.
How a Join is executed in MySQL
Join can be described as a set operation: left join, right join, inner join, full join, outer join, cross join, and so on. The relations these compute correspond to the intersections, unions, complements, and universal sets of high-school set theory. In actual code, however, the join operation is basically implemented with nested loops.
To give an example, suppose there are two tables, T1 and T2, with the following structure:
CREATE TABLE t1 (
  id INT NOT NULL AUTO_INCREMENT,
  username VARCHAR(20) NOT NULL DEFAULT '',
  age INT NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`)
) ENGINE=INNODB DEFAULT CHARSET=UTF8MB4;
CREATE TABLE t2 (
  id INT NOT NULL AUTO_INCREMENT,
  username VARCHAR(20) NOT NULL DEFAULT '',
  score INT NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`)
) ENGINE=INNODB DEFAULT CHARSET=UTF8MB4;
Suppose t1 has 100 rows of data and t2 has 200 rows.
The query SQL is:
SELECT t1.*, t2.* FROM t1 LEFT JOIN t2 ON (t1.username = t2.username);
Then the steps to execute this SQL are as follows:
Fetch a row of data R1 from table t1.
Take the username field from R1 and use it to look up rows in table t2.
Take the qualifying rows from t2, combine each of them with R1, and add the combined rows to the result set.
Repeat steps 1, 2 and 3 until all rows of t1 have been looped over.
Basically, we first iterate through table t1, and then, for each row of t1, go to table t2 to find the records whose username matches. It is essentially a two-layer loop.
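To make the inner step concrete, here is a minimal sketch of the lookup the server effectively performs for each driving row; the literal 'alice' is only a hypothetical username standing in for the value in the current t1 row:
-- Hypothetical inner lookup, executed once per row R1 fetched from t1.
-- 'alice' stands in for R1.username.
SELECT t2.* FROM t2 WHERE t2.username = 'alice';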
How to optimize a join query
As you can see from the above, a join is essentially a loop, and its overhead breaks down as follows:
Traversing t1: every row of t1 is read, so if t1 has N rows this part has complexity N.
Looking up the matching rows in t2 by the username of each t1 row: MySQL (InnoDB) stores data in B+ tree indexes, so in this simplified model a single index lookup costs about log2(M), where M is the total number of rows in t2; counting the extra lookup back through the primary key, each t1 row costs about 2 * log2(M).
The total complexity is therefore roughly N * 2 * log2(M). For example, with N = 100 and M = 200 this is about 100 * 2 * 7.6 ≈ 1,500 row accesses, compared with the 100 * 200 = 20,000 comparisons a join without any index would need.
As you can see from the steps above, the direction of optimization is:
Reduce the cost of querying t1, which is mainly disk I/O overhead: avoid a full table scan and use an index.
Reduce the overhead of the t2 lookups, again by using an index (a concrete example follows after this list).
Use the table with the larger amount of data as the driven table and the small table as the driving table: M only appears inside a logarithm, so a large driven table does not increase the complexity linearly, whereas the driving table's row count N does.
Cache t1 rows in batches instead of loading them from disk one row at a time, for example 100 rows per batch; this significantly reduces the number of disk reads, and each t2 row can then be compared against the whole batch of cached t1 rows in memory.
Turn random disk reads, which hurt disk performance, into sequential reads: because of the B+ tree storage structure, every lookup through a secondary index involves going back to the table, that is, fetching the required row again by its primary key, and those back-to-table reads are random I/O.
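As a concrete sketch of the "use an index" points above, and assuming the t1/t2 tables defined earlier, a secondary index on the join column could be added like this (the index names are made up for illustration):
-- Index the join column so the per-row lookups into t2 (and any filter on t1.username) can use an index.
ALTER TABLE t1 ADD INDEX idx_username (username);
ALTER TABLE t2 ADD INDEX idx_username (username);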
The basic methods of optimization are therefore:
Reduce the number of loop iterations, reduce the number of disk I/Os, and change random I/O into sequential I/O.
In fact, MySQL has a corresponding algorithm for the above optimization method:
Simple Nested Loop Join: the plain double loop described above, with no index and no buffering; this is what should be avoided.
Block Nested Loop Join: mainly targets the case where there is no usable index on t2. Rows from the driving table are read into the join buffer, and in step 2 each row of t2 is compared against the data in the join buffer, so disk reads are replaced by in-memory comparisons. However, if the driven table's data is relatively large, performance still suffers, mainly because the buffer pool fills up and overall MySQL performance declines. (The relevant buffer setting is sketched after this list.)
Index Nested Loop Join: the lookups and the association all go through an index, ideally the primary key, and this performs much better.
Batched Key Access Join: an optimization on top of Index Nested Loop Join. Because of the back-to-table step, the random I/O still costs performance. The core of this algorithm is to collect the primary keys found through the secondary index, sort them, and then fetch the rows in increasing primary-key order, so the disk is read roughly sequentially, which improves performance.
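As a rough sketch, assuming MySQL 5.6 or later, the join buffer used by Block Nested Loop Join and the switch that enables Batched Key Access can be adjusted per session; the size shown is illustrative, not a recommendation:
-- Enlarge the join buffer used by Block Nested Loop Join (value is illustrative).
SET SESSION join_buffer_size = 256 * 1024;
-- Batched Key Access relies on Multi-Range Read, so enable MRR and BKA together.
SET SESSION optimizer_switch = 'mrr=on,mrr_cost_based=off,batched_key_access=on';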
Should we use Join or not?
From the above analysis, we can see that using Join is feasible: as long as the performance is controllable and within an acceptable range, it reduces code complexity. What must be avoided is joining tables without indexes on the join columns; that kind of SQL is catastrophic in production.
Summary
Join can still be used boldly, as long as a few principles are grasped:
1. Try to make the join columns indexed columns, preferably of the same type, and ideally use the primary key index.
2. Try to use the small table as the driving table (from MySQL 5.6 onward this can be done automatically).
3. Form the good habit of running EXPLAIN on the SQL you write and observing how it will be executed; a minimal example follows below.
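For reference, here is a minimal sketch of that habit, reusing the example query from earlier:
-- Inspect the join order, the access type of each table, and whether indexes are used.
EXPLAIN SELECT t1.*, t2.* FROM t1 LEFT JOIN t2 ON (t1.username = t2.username);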