Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to use OR in PostgreSQL correctly

2025-01-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article introduces the relevant knowledge of "how to correctly use OR in PostgreSQL". Many people will encounter such a dilemma in the operation of actual cases, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!

In SQL statements, improper use of OR can lead to poor query efficiency. This does not mean that you cannot use OR but that you need to consider possible performance problems when using OR.

Test data:

DROP TABLE a create TABLE a (id integer NOT NULL, a_val text NOT NULL); INSERT INTO a SELECT i, md5 (i::text) FROM generate_series (1, 1000000) I id integer NOT NULL TABLE b; CREATE TABLE b (id integer NOT NULL, b_val text NOT NULL); INSERT INTO b SELECT i, md5 (i::text) FROM generate_series (1, 1000000) I politics alter TABLE an ADD PRIMARY KEY (id); ALTER TABLE b ADD PRIMARY KEY (id); ALTER TABLE b ADD FOREIGN KEY (id) REFERENCES a (ANALYZE) a VACUUM (ANALYZE) b

OR vs IN

Conditional statement p1 OR p2, if you can consider using IN to rewrite, for example:

[local:/data/pg12]: 5432 pg12@testdb=# EXPLAIN verboseSELECT id FROM aWHERE id = 42 OR id = 4711 QUERY PLAN-Bitmap Heap Scan on public.a (cost=8. 87.. 16.80 rows=2 width=4) Output: id Recheck Cond: (a.id = 42) OR (a.id = 4711)-> BitmapOr (cost=8.87..8.87 rows=2 width=0)-> Bitmap Index Scan on a_pkey (cost=0.00..4.43 rows=1 width=0) Index Cond: (a.id = 42)-> Bitmap Index Scan on a_pkey (cost=0.00..4) .43 rows=1 width=0) Index Cond: (a.id = 4711) (8 rows) [local:/data/pg12]: 5432 pg12@testdb=# [local:/data/pg12]: 5432 pg12@testdb=# EXPLAIN verboseSELECT id FROM aWHERE id in QUERY PLAN-Index Only Scan using a_pkey on public. A (cost=0.42..8.88 rows=2 width=4) Output: id Index Cond: (a.id = ANY ('{42 rows 4711}':: Integer [])) (3 rows) [local:/data/pg12]: 5432 pg12@testdb=#

Using the OR operator, the PG optimizer takes Bitmap Index Scan, using IN, and the path chosen by the optimizer is Index Only Scan, which is much cheaper than Bitmap Index Scan without the establishment of Bitmap.

OR and Join

In the Join scenario, if there are query conditions on all the tables participating in join and then the OR association is applied in the where clause, the optimizer will select an and b joins and then use Filter filtering. Because join is performed first instead of predicate push-down, it costs a lot to filter 999999 rows in order to get one row.

[local:/data/pg12]: 5432 pg12@testdb=# EXPLAIN verbose SELECT id, a.a_val, b.b_valFROM a JOIN b USING (id) WHERE a.id = 42 OR b.id = 42 QUERY PLAN -Gather (cost=21965.00..45327.62 rows=2 width=70) Output: a.id A.a_val, b.b_val Workers Planned: 2-> Parallel Hash Join (cost=20965.00..44327.42 rows=1 width=70) Output: a.id, a.a_val B.b_val Inner Unique: true Hash Cond: (a.id = b.id) Join Filter: (a.id = 42) OR (b.id = 42))-> Parallel Seq Scan on public.a (cost=0.00..12500.67 rows=416667 width=37) Output: a.id A.a_val-> Parallel Hash (cost=12500.67..12500.67 rows=416667 width=37) Output: b.b_val, b.id-> Parallel Seq Scan on public.b (cost=0.00..12500.67 rows=416667 width=37) Output: b.b_val, b.id (14 rows)

In this case, you can correlate two JOIN to improve performance by using UNION

[local:/data/pg12]: 5432 pg12@testdb=# EXPLAIN verbosepg12@testdb-# SELECT id, a.a_val, b.b_valpg12@testdb-# FROM a JOIN b USING (id) pg12@testdb-# WHERE a.id = 42pg12@testdb-# UNIONpg12@testdb-# SELECT id, a.a_val, b.b_valpg12@testdb-# FROM a JOIN b USING (id) pg12@testdb-# WHERE b.id = 42pg12@testdb-# QUERY PLAN -Unique (cost=33.83..33.85 rows=2 width=68) Output: a.id A.a_val, b.b_val-> Sort (cost=33.83..33.84 rows=2 width=68) Output: a.id, a.a_val, b.b_val Sort Key: a.id, a.a_val, b.b_val-> Append (cost=0.85..33.82 rows=2 width=68)-> Nested Loop (cost=0.85..16.90 rows=1 width=70) Output: a.id A.a_val, b.b_val-> Index Scan using a_pkey on public.a (cost=0.42..8.44 rows=1 width=37) Output: a.id A.a_val Index Cond: (a.id = 42)-> Index Scan using b_pkey on public.b (cost=0.42..8.44 rows=1 width=37) Output: b.id B.b_val Index Cond: (b.id = 42)-> Nested Loop (cost=0.85..16.90 rows=1 width=70) Output: a_1.id, a_1.a_val B_1.b_val-> Index Scan using a_pkey on public.an axi1 (cost=0.42..8.44 rows=1 width=37) Output: a_1.id A_1.a_val Index Cond: (a_1.id = 42)-> Index Scan using b_pkey on public.b bread1 (cost=0.42..8.44 rows=1 width=37) Output: b_1.id B_1.b_val Index Cond: (b_1.id = 42) (22 rows) [local:/data/pg12]: 5432 pg12@testdb=#

The two sub-connections choose the lowest-cost NL join, and the total cost is less than 0.1% of the original SQL statement cost, a difference of three orders of magnitude.

This is the end of the content of "how to use OR in PostgreSQL correctly". Thank you for reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report