PostgreSQL DBA (13)-A top-down approach to reading the implementation plan 07/06 Update SLTechnology News&Howtos

PostgreSQL DBA (13)-A top-down approach to reading the implementation plan

2025-07-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Generally speaking, reading the implementation plan usually adopts the bottom-up method, which is like starting from a certain leaf of the tree and then going to the branch and then to the trunk and root of the tree. The problem with this method is that if the tree is very large, it may be difficult to grasp the whole situation by "seeing the leaves but not the trunk." At this time, you can combine the top-down method to read, so as to grasp the entire implementation plan as a whole.

I. Basic knowledge

In order to understand and use top-down reading methods, you need to master some basic knowledge.

Plan Node Type

In PostgreSQL, planning nodes are divided into four categories: Control Node, ScanNode, Materialization Node, and Join Node.

1. Control node: A node used to handle special situations and implement special execution processes. For example, the Result node can be used to represent the tuple to insert specified by the VALUES clause in an INSERT statement.

2. Scan nodes: These nodes are used to scan objects such as tables for tuples. For example, the SeqScan node is used to scan a table sequentially. Scan one tuple at a time.

3. Materialized nodes: These nodes are complex, but they have one common feature, that is, they can cache execution results to auxiliary storage. Materialized nodes generate all the result tuples when they are first executed, and then cache these result tuples until they are fetched by the upper nodes; non-materialized nodes generate a result tuple each time they are executed and return it to the upper nodes. For example, the Sort node can retrieve all tuples returned by the lower nodes and sort them according to the specified attributes, and cache all the sorted results. Each time the upper nodes retrieve tuples from the Sort node, they return the next tuple from the cache in order.

4. Connection node: This type of node corresponds to the connection operation in relational algebra, and can realize multiple connection modes (conditional connection, left connection, right connection, full connection, natural connection, etc.). Each node realizes a connection algorithm. For example, HashJoin implements a Hash-based join algorithm.

For convenience, the rule is set: if the child node of control node/materialized node is connected node, it is regarded as connected node, otherwise it is regarded as non-connected node.

According to this rule, all nodes can be classified into two categories, connected nodes and unconnected nodes.

II. Top-down approach

The top down approach, as the name suggests, reads from the top/outermost of the execution plan.

1. Identify node type (unconnected vs connected)

2. If it is a non-connected node, identify the specific type of node (data table scan...), The branch ends

3. If it is a connection node, identify the outer and inner ends of the connection

3.1 Apply 1/2/3 steps recursively to the outer end

3.2 Apply 1/2/3 steps recursively to the inner end

The SQL script is as follows:

testdb=# explain verbose select dw.*, grjf.grbh,grjf.xm,grjf.ny,grjf.je testdb-# from t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je testdb(# from t_grxx gr inner join t_jfxx jf testdb(# on gr.dwbh = dw.dwbh testdb(# and gr.grbh = jf.grbh) grjftestdb-# where dw.dwbh in ('1001','1002')testdb-# order by dw.dwbh; QUERY PLAN ------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.87.. 61.01 rows=20 width=47) Output: dw.dwmc, dw.dwbh, dw.dwdz, gr.grbh, gr.xm, jf.ny, jf.je -> Nested Loop (cost=0.58.. 53.88 rows=20 width=32) Output: dw.dwmc, dw.dwbh, dw.dwdz, gr.grbh, gr.xm -> Index Scan using t_dwxx_pkey on public.t_dwxx dw (cost=0.29.. 13.92 rows=2 width=20) Output: dw.dwmc, dw.dwbh, dw.dwdz Index Cond: ((dw.dwbh)::text = ANY ('{1001,1002}'::text[])) -> Index Scan using idx_t_grxx_dwbh on public.t_grxx gr (cost=0.29.. 19.88 rows=10 width=16) Output: gr.dwbh, gr.grbh, gr.xm, gr.xb, gr.nl Index Cond: ((gr.dwbh)::text = (dw.dwbh)::text) -> Index Scan using idx_t_jfxx_grbh on public.t_jfxx jf (cost=0.29.. 0.35 rows=1 width=20) Output: jf.grbh, jf.ny, jf.je Index Cond: ((jf.grbh)::text = (gr.grbh)::text)(13 rows)

1. Identify node type: Nested Loop -> Connect node

3. Connect nodes: Identify the outer end, commonly known as the drive list (here Nested Loop) and the inner end (Index Scan).

3.1 The outer end is a connection node, and the type is Nested Loop.

Recursively apply 1/2/3 steps to resolve the Nested Loop

3.1.1 The outer end is Index Scan on t_dwxx

3.1.2 Index Scan on t_grxx

3.2 Inner end, recursively apply 1/2/3 steps, i.e. Index Scan on t_jfxx

Using the top-down approach, you can grasp the implementation plan from the "big picture" and avoid entering complicated details at the beginning.

III. References

PgSQL ·Best Practices· EXPLAIN Usage Analysis

Read postgresql source code with me (9)

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.