Background
PostgreSQL estimates the cost of LIMIT N from a plan's startup cost, its total cost A, and its estimated total number of output rows B, using a roughly proportional method:

cost(LIMIT N) ≈ startup_cost + (N / B) * (A - startup_cost)

That is, the startup cost is paid in full and the remaining run cost is scaled by the fraction of rows actually fetched.
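As a minimal C sketch of this proportional rule (modeled loosely on the planner's LIMIT costing; simplified and illustrative, not the actual kernel code in src/backend/optimizer/):

#include <stdio.h>

/* Proportional LIMIT costing as described above: pay the startup cost
 * in full, then scale the remaining run cost by N / B. */
static double
limit_total_cost(double startup_cost, double total_cost,
                 double input_rows, double limit_rows)
{
    if (limit_rows > input_rows)
        limit_rows = input_rows;
    return startup_cost +
        (total_cost - startup_cost) * limit_rows / input_rows;
}

int
main(void)
{
    /* e.g. a plan with startup cost 0.73, total cost 7929.35 and an
     * estimated 498340 output rows (the merge join example below): */
    printf("%.2f\n", limit_total_cost(0.73, 7929.35, 498340, 10)); /* 0.89 */
    return 0;
}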
This method generally works well for single-table queries, but it can break down when the data distribution is skewed, as in the following two cases:
1. The qualifying rows are 50% of the table, but all of them sit at the end of the table. Is LIMIT 10000 faster via an index or via a full table scan? (A worked example of this case follows the list.)
2. The qualifying rows are 50% of the table and all of them sit at the head of the table. Then a full table scan is certainly fast for LIMIT 10000.
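A quick worked example of case 1, using assumed round numbers: take a table of 1,000,000 rows of which 500,000 qualify. Under the optimizer's uniformity assumption a sequential scan expects a match every 2 rows, so LIMIT 10000 should stop after roughly

10000 / (500000 / 1000000) = 20000 rows scanned.

If, however, every qualifying row sits at the tail of the table, the scan must first pass 500,000 non-matching rows, about 25 times the estimated work, and an index scan would likely have won.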
A similar problem exists for JOINs:
For example, with a join condition, a WHERE filter, and LIMIT N, is a nested loop faster, or a MERGE JOIN, or a HASH JOIN?
1. For nested loop + WHERE + LIMIT, computing the cost as the LIMIT count's proportion of the total estimated rows is already quite reasonable.
2. For MERGE JOIN + WHERE + LIMIT, the startup cost must be taken into account. For example, if the WHERE condition is on table A (whose index can be scanned directly from the qualifying position), table B must still be scanned from the beginning of its index to reach entries matching table A's rows; this may touch a great many index entries and can be very expensive. A reasonable cost is this startup cost plus the LIMIT proportion of all remaining cost.
3. For HASH JOIN + WHERE + LIMIT, the cost is computed as the startup cost plus the LIMIT proportion of the remaining estimated cost. This is what the optimizer already does today, and it is reasonable.
However, for MERGE JOIN, PostgreSQL currently does not charge this positioning work as startup cost when LIMIT is used, so the resulting execution plan can be wrong.
The suggested improvement is to add it to the merge join's startup cost, as sketched below.
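A rough C sketch of the two costing shapes above (illustrative only, not PostgreSQL source; the function names and toy numbers are made up for this example):

#include <stdio.h>

/* A = estimated total cost, B = estimated output rows, N = LIMIT count,
 * S = startup cost: the work done before the first row can be returned
 * (building the hash table, or, per the proposal above, advancing the
 * merge join's inner index scan to the first possible match). */

/* 1. nested loop + LIMIT: a pure proportion is already reasonable */
static double nestloop_limit(double A, double B, double N)
{
    return A * N / B;
}

/* 2. and 3. merge join / hash join + LIMIT: pay S in full, then scale
 * only the remaining run cost, so LIMIT cannot dilute S */
static double startup_aware_limit(double S, double A, double B, double N)
{
    return S + (A - S) * N / B;
}

int main(void)
{
    /* toy numbers: total cost 10000, 500000 rows, LIMIT 10, startup 2000 */
    printf("%.2f\n", nestloop_limit(10000, 500000, 10));            /* 0.20 */
    printf("%.2f\n", startup_aware_limit(2000, 10000, 500000, 10)); /* 2000.16 */
    return 0;
}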
PostgreSQL example
1. The test tables are as follows:
postgres=# create table test1 (a int, b text, primary key (a));
CREATE TABLE
postgres=# create table test2 (a int, b text, primary key (a));
CREATE TABLE
postgres=# alter table test1 add constraint testcheck foreign key (a) references test2 (a);
ALTER TABLE
postgres=# insert into test2 select generate_series(1, 1000000), 'abcdefg';
INSERT 0 1000000
postgres=# insert into test1 select generate_series(1, 1000000, 2), 'abcdefg';
INSERT 0 500000
postgres=# analyze test1;
ANALYZE
postgres=# analyze test2;
ANALYZE
2. The query SQL is as follows:
explain (analyze,verbose,timing,costs,buffers) select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 limit 10;
The table structures in this statement are special: the two join columns are both primary keys, and a foreign key constraint links them. The query plan is as follows:
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.73..0.89 rows=10 width=24) (actual time=54.729..54.739 rows=10 loops=1)
   Output: test2.a, test2.b, test1.a, test1.b
   Buffers: shared hit=2042
   ->  Merge Left Join  (cost=0.73..7929.35 rows=498340 width=24) (actual time=54.728..54.735 rows=10 loops=1)
         Output: test2.a, test2.b, test1.a, test1.b
         Inner Unique: true
         Merge Cond: (test2.a = test1.a)
         Buffers: shared hit=2042
         ->  Index Scan using test2_pkey on public.test2  (cost=0.37..3395.42 rows=498340 width=12) (actual time=0.017..0.020 rows=10 loops=1)
               Output: test2.a, test2.b
               Index Cond: (test2.a > 500000)
               Buffers: shared hit=4
         ->  Index Scan using test1_pkey on public.test1  (cost=0.37..2322.99 rows=500000 width=12) (actual time=0.006..34.120 rows=250006 loops=1)
               Output: test1.a, test1.b
               Buffers: shared hit=2038
 Planning Time: 0.216 ms
 Execution Time: 54.765 ms
(17 rows)
As the execution plan shows, 10 qualifying rows are fetched from test2 first and then merge-joined against test1. Because the LIMIT's influence is not reflected in the join's own estimate, the merge join's estimated row count stays very large: 498,340 rows.
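Plugging the plan's numbers into the proportional rule reproduces the Limit node's cost:

0.73 + (7929.35 - 0.73) * 10 / 498340 ≈ 0.89

Note where the danger hides: the inner index scan of test1 costs 2322.99 to walk (and in reality it visited 250,006 index entries to produce just 10 join rows), but that cost sits in the run cost and is scaled down by the tiny factor 10 / 498340.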
In reality a nested loop does better (turn off seqscan and mergejoin to force it):
postgres=# set enable_seqscan = off;
SET
postgres=# set enable_mergejoin = off;
SET
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 limit 10;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.73..4.53 rows=10 width=24) (actual time=0.040..0.060 rows=10 loops=1)
   Output: test2.a, test2.b, test1.a, test1.b
   Buffers: shared hit=39
   ->  Nested Loop Left Join  (cost=0.73..189339.64 rows=498340 width=24) (actual time=0.039..0.057 rows=10 loops=1)
         Output: test2.a, test2.b, test1.a, test1.b
         Inner Unique: true
         Buffers: shared hit=39
         ->  Index Scan using test2_pkey on public.test2  (cost=0.37..3395.42 rows=498340 width=12) (actual time=0.025..0.027 rows=10 loops=1)
               Output: test2.a, test2.b
               Index Cond: (test2.a > 500000)
               Buffers: shared hit=4
         ->  Index Scan using test1_pkey on public.test1  (cost=0.37..0.37 rows=1 width=12) (actual time=0.002..0.002 rows=0 loops=10)
               Output: test1.a, test1.b
               Index Cond: (test2.a = test1.a)
               Buffers: shared hit=35
 Planning Time: 0.112 ms
 Execution Time: 0.078 ms
(17 rows)
However, in terms of estimated cost, merge join + LIMIT comes out lower than nested loop + LIMIT, because the nested loop's total cost is much higher (189339 vs 7929). Under the proportional algorithm, which does not charge the merge join any positioning startup cost, the optimizer therefore concludes that the merge join is also cheaper with LIMIT.
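Both Limit costs follow from the same proportional rule, as a check against the two plans:

merge join:  0.73 + (7929.35 - 0.73) * 10 / 498340 ≈ 0.89
nested loop: 0.73 + (189339.64 - 0.73) * 10 / 498340 ≈ 4.53

So on paper the merge join (0.89) beats the nested loop (4.53), even though the measured times are 54.765 ms versus 0.078 ms.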
The reality is different: the merge join's inner table test1 has no filter condition of its own, so its index must be scanned from the head rather than from a known position, and the merge join is actually the slower plan.
At present, when the optimizer costs a hash join, it already includes the startup cost (3536.51 in the example below):
postgres=# set enable_mergejoin = off;
SET
postgres=# set enable_seqscan = off;
SET
postgres=# set enable_nestloop = off;
SET
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 limit 10;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3536.51..3536.61 rows=10 width=24) (actual time=158.148..158.219 rows=10 loops=1)
   Output: test2.a, test2.b, test1.a, test1.b
   Buffers: shared hit=4079, temp written=1464
   ->  Hash Left Join  (cost=3536.51..8135.83 rows=494590 width=24) (actual time=158.147..158.215 rows=10 loops=1)
         Output: test2.a, test2.b, test1.a, test1.b
         Inner Unique: true
         Hash Cond: (test2.a = test1.a)
         Buffers: shared hit=4079, temp written=1464
         ->  Index Scan using test2_pkey on public.test2  (cost=0.37..3369.86 rows=494590 width=12) (actual time=0.023..0.027 rows=26 loops=1)
               Output: test2.a, test2.b
               Index Cond: (test2.a > 500000)
               Buffers: shared hit=4
         ->  Hash  (cost=2322.99..2322.99 rows=500000 width=12) (actual time=156.848..156.849 rows=500000 loops=1)
               Output: test1.a, test1.b
               Buckets: 262144  Batches: 4  Memory Usage: 7418kB
               Buffers: shared hit=4072, temp written=1464
               ->  Index Scan using test1_pkey on public.test1  (cost=0.37..2322.99 rows=500000 width=12) (actual time=0.011..72.506 rows=500000 loops=1)
                     Output: test1.a, test1.b
                     Buffers: shared hit=4072
 Planning Time: 0.141 ms
 Execution Time: 162.086 ms
(21 rows)
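Because the hash table build is charged as startup cost, LIMIT cannot scale it away; the Limit node's cost again matches the rule:

3536.51 + (8135.83 - 3536.51) * 10 / 494590 ≈ 3536.61

This is exactly the treatment the article proposes for the merge join's inner-scan positioning cost.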
Improvement proposal
For the test1 side of the merge join, the optimizer should estimate the cost of walking test1's index up to the first entry that can match the outer condition (a > 500000) and charge it as startup cost, so that the proportional LIMIT rule cannot scale it away.
Oracle example
Run the equivalent query in Oracle, expressing LIMIT 10 via rownum:
select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 and rownum <= 10;

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         25  consistent gets
          0  physical reads
          0  redo size
        937  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed
Oracle chose a nested loop join.
Using a hint, force Oracle to use a merge join, to see what it costs and whether that cost is close to the startup cost that the proposed PostgreSQL merge join fix would charge:
select /*+ USE_MERGE(test2 test1) */ * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 and rownum <= 10;

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1099  consistent gets
          0  physical reads
          0  redo size
        937  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         10  rows processed
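A worked ratio from the statistics of the two Oracle runs makes the difference concrete:

1099 consistent gets / 25 consistent gets ≈ 44x

The hinted merge join touched about 44 times as many blocks to return the same 10 rows, consistent with its inner scan having to walk test1's index from the head.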
Summary
1. When PostgreSQL computes the cost of merge join + LIMIT, the optimizer has room for improvement: taking the startup cost into account would make it more likely to choose the correct join method for SQL with LIMIT output.
2. In the inner join case, the merge join can also be optimized through query rewriting, skipping the leading index entries that cannot satisfy the condition. An example:
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 join test1 on test2.a = test1.a where test2.a > 500000 limit 10;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.77..1.09 rows=10 width=24) (actual time=54.626..54.638 rows=10 loops=1)
   Output: test2.a, test2.b, test1.a, test1.b
   Buffers: shared hit=2042
   ->  Merge Join  (cost=0.77..7895.19 rows=247295 width=24) (actual time=54.625..54.635 rows=10 loops=1)
         Output: test2.a, test2.b, test1.a, test1.b
         Inner Unique: true
         Merge Cond: (test2.a = test1.a)
         Buffers: shared hit=2042
         ->  Index Scan using test2_pkey on public.test2  (cost=0.37..3369.86 rows=494590 width=12) (actual time=0.017..0.020 rows=19 loops=1)
               Output: test2.a, test2.b
               Index Cond: (test2.a > 500000)
               Buffers: shared hit=4
         ->  Index Scan using test1_pkey on public.test1  (cost=0.37..2322.99 rows=500000 width=12) (actual time=0.008..34.009 rows=250010 loops=1)
               Output: test1.a, test1.b
               Buffers: shared hit=2038
 Planning Time: 0.244 ms
 Execution Time: 54.669 ms
(17 rows)

The rewrite can be done in the kernel, so there is no need to change the SQL by hand: because test2.a = test1.a, the condition test1.a > 500000 can be derived and pushed down to test1. The effect is excellent:

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 join test1 on test2.a = test1.a where test2.a > 500000 and test1.a > 500000 limit 10;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.75..1.30 rows=10 width=24) (actual time=0.035..0.048 rows=10 loops=1)
   Output: test2.a, test2.b, test1.a, test1.b
   Buffers: shared hit=8
   ->  Merge Join  (cost=0.75..6711.51 rows=123700 width=24) (actual time=0.034..0.044 rows=10 loops=1)
         Output: test2.a, test2.b, test1.a, test1.b
         Inner Unique: true
         Merge Cond: (test2.a = test1.a)
         Buffers: shared hit=8
         ->  Index Scan using test2_pkey on public.test2  (cost=0.37..3369.86 rows=494590 width=12) (actual time=0.015..0.019 rows=19 loops=1)
               Output: test2.a, test2.b
               Index Cond: (test2.a > 500000)
               Buffers: shared hit=4
         ->  Index Scan using test1_pkey on public.test1  (cost=0.37..1704.30 rows=250106 width=12) (actual time=0.015..0.017 rows=10 loops=1)
               Output: test1.a, test1.b
               Index Cond: (test1.a > 500000)
               Buffers: shared hit=4
 Planning Time: 0.276 ms
 Execution Time: 0.074 ms
(18 rows)
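The two plans above quantify the gain from the rewrite:

shared buffers hit: 2042 -> 8      (about 255x fewer)
execution time:     54.669 ms -> 0.074 ms

because the derived condition test1.a > 500000 lets the inner index scan start near the first possible match instead of at the head of the index.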
"the case of PostgreSQL Optimizer-order by limit Index selection problem"
Src/backend/optimizer/path/costsize.c
Original address: https://github.com/digoal/blog/blob/master/201810/20181004_03.md