Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What are the differences between MySQL and PostgreSQL in multi-table join algorithms?

2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly introduces the differences between MySQL and PostgreSQL in the multi-table join algorithm, the article is very detailed, has a certain reference value, interested friends must read it!

We know that mysql has neither hash join nor merge join, so there is only one algorithm nest loop join,nl join that uses the result set of the driven table as the exterior to find each record in the inner table. If there is an index, it will scan the index, and if there is no index, it will scan the whole table.

Nl join is not suitable for all scenarios, such as the equivalent join of two tables with large tables, which is what hash join is good at and is the most common scenario in a production environment. Mysql seems inadequate at this time, so when using mysql, we may make the following specification: prohibit the use of large table joins. This is also the eternal pain of mysql. However, it is said that version 8.0 has included hash join as a requirement, so let's wait and see.

By comparison, postgresql's optimizer is very strong. Support hash join, nest loop, sort merge join, scan algorithm support seq scan, index scan, index only scan, but also support in-heap tuple technology (HOT). Parallel scanning is also added in the postgresql11 version to test more than 3 million join result sets on two large tables (one piece of 160 million and one piece of 2.56 million data, both without index). Pg runs the results within about 20 seconds, which is better than other databases.

The algorithm of two-table join is discussed above. Let's take a look at how mysql and pg deal with multi-table join. Multi-table join actually involves a problem: how to find the optimal path with the least cost. Why is there such a problem? Because in the case of multi-table join, the join between each two tables has an algebraic value, the optimizer will adjust the order of different table join according to cost estimation, and finally calculate an optimal or near-optimal cost, which is used to generate an execution plan, which involves the shortest path problem in graph theory. Different join order combinations represent the traversal of graphs, and the optimal cost is actually the shortest path problem of passive graphs. We know that the two mainstream shortest path algorithms are Dijkstra algorithm and floyd algorithm, which are also classical algorithms in dynamic programming.

Greedy algorithm is used to calculate the optimal cost in mysql, while dynamic programming is used in pg.

Mysql:

Mysql connections use the greedy algorithm, and the following figure shows the process of the greedy algorithm:

Greedy algorithm is the premise of determining the source, the idea of the algorithm is also very similar to the name, only to find the optimal solution of the current step, is a depth-first solution, algorithm complexity is O (n ²) to find the next level, until the end point. For example, from A to G in the picture above, the path using the greedy algorithm is A-> B-> D-> G algorithm, and the cost is 1-2-6-9. Obviously, this is not the optimal solution. The optimal solution can be seen with the naked eye as A-> C-> F-> G, and the cost is 2-3-1-6. So we see that the greedy algorithm is not globally optimal, but the advantage is that the complexity of the algorithm is low. Mysql may also use the greedy algorithm based on this consideration, and do not want to waste all the time on the computational cost, because if there are a large number of associated tables, then the calculation of the cost is exponential growth, so the greedy algorithm is not the optimal solution, but it has a certain advantage in the case of a large number of join tables.

Postgresql:

Let's take a look at the dynamic planning used by pg. Dynamic planning solves the problem of passive shortest path. Let's imagine that multi-table join itself is a passive shortest path problem, but mysql randomly chooses one as the starting point when it joins.

The idea of dynamic programming is to decompose the problem into sub-problems and recursively solve them. Take floyd algorithm as an example. The algorithm uses the adjacency matrix to represent the distance between each point, and if there is no connection, it represents infinity. For example, the following picture:

Freudian algorithm uses matrix to record the direct distance of nodes. Its power is that it gets the shortest distance of any two nodes directly after several calculations. It is a passive shortest path algorithm in the real sense, but its algorithm complexity is relatively high, which is O (n ³). The core idea of the algorithm is that if a [ij] > a [ik] + a [kj], then a [ij] = a [ik] + a [kj]. For the distance between every two nodes ab, if there is a third intermediate node c to make the distance of the acb shorter, then the distance of the ab is replaced by acb and updated to the matrix. In this traversal process, we roughly understand that a three-layer loop is needed. The two-layer loop is used to calculate whether the distance of each intermediate node (outermost loop) is smaller for ab, ac, and ad...de (nMul 1) * (nMul 1) options (you don't have to calculate your own distance). The matrix calculation process is as follows:

For the first row, calculate whether the distance of ab,ac,ad,ae is replaced by a third node. For ab, it is found that ab

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report