In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly explains "how to solve the limitation of PostgreSQL window function call". The content of the article is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought to study and learn "how to solve the limitation of PostgreSQL window function call".
Background
Window function is commonly used in analyzing scenarios. At present, citus 7.5 only supports two scenarios using window function.
1. Partition by must be a distribution key.
2. The equivalent filtering condition with distribution key in where condition.
In essence: the current (citus 7.5) window function does not support cross-shard operations, or no redistribution during the process.
And Greenplum has done a good job in this respect, is a complete MPP database.
Citus window function supports postgres=#\ set VERBOSITY verbose postgres=# select row_number () over (partition by bid order by aid) rn,* from pgbench_accounts; ERROR: 0A000: could not run distributed query because the window function that is used cannot be pushed down HINT: Window functions are supported in two ways. Either add an equality filter on the distributed tables' partition column or use the window functions with a PARTITION BY clause containing the distribution column LOCATION: DeferErrorIfQueryNotSupported, multi_logical_planner.c:938
It can be supported if the following conditions are met
1. Partition by must be a distribution key.
2. The equivalent filtering condition with distribution key in where condition.
Postgres=# select row_number () over (partition by bid order by aid) rn,* from pgbench_accounts where aid=1 Rn | aid | bid | abalance | filler-+- -1 | 1 | 1 | 0 | (1 row) postgres=# select row_number () over (partition by aid order by bid) rn * from pgbench_accounts limit 1 Rn | aid | bid | abalance | filler-+- -1 | 298 | 1 | 0 | (1 row)
Carry out the plan
Postgres=# explain verbose select row_number () over (partition by aid order by bid) rn,* from pgbench_accounts limit 1 QUERY PLAN- -Limit (cost=0.00..0.00 Rows=0 width=0) Output: remote_scan.rn Remote_scan.aid, remote_scan.bid, remote_scan.abalance, remote_scan.filler-> Custom Scan (Citus Real-Time) (cost=0.00..0.00 rows=0 width=0) Output: remote_scan.rn, remote_scan.aid, remote_scan.bid, remote_scan.abalance Remote_scan.filler Task Count: 128 Tasks Shown: One of 128-> Task Node: host=172.24.211.224 port=1921 dbname=postgres-> Limit (cost=705.99..706.01 rows=1 width=105) Output: (row_number () OVER (?)), pgbench_accounts.aid, pgbench_accounts.bid, pgbench_accounts.abalance Pgbench_accounts.filler-> WindowAgg (cost=705.99..860.95 rows=7748 width=105) Output: row_number () OVER (?), pgbench_accounts.aid, pgbench_accounts.bid, pgbench_accounts.abalance Pgbench_accounts.filler-> Sort (cost=705.99..725.36 rows=7748 width=97) Output: pgbench_accounts.aid, pgbench_accounts.bid, pgbench_accounts.abalance, pgbench_accounts.filler Sort Key: pgbench_accounts.aid Pgbench_accounts.bid-> Seq Scan on public.pgbench_accounts_106812 pgbench_accounts (cost=0.00..205.48 rows=7748 width=97) Output: pgbench_accounts.aid, pgbench_accounts.bid, pgbench_accounts.abalance, pgbench_accounts.filler (17 rows) postgres=# explain verbose select row_number () over (partition by bid order by aid) rn,* from pgbench_accounts where aid=1 QUERY PLAN -Custom Scan (Citus Router ) (cost=0.00..0.00 rows=0 width=0) Output: remote_scan.rn Remote_scan.aid, remote_scan.bid, remote_scan.abalance, remote_scan.filler Task Count: 1 Tasks Shown: All-> Task Node: host=172.24.211.232 port=1921 dbname=postgres-> WindowAgg (cost=2.51..2.53 rows=1 width=105) Output: row_number () OVER, aid, bid, abalance Filler-> Sort (cost=2.51..2.51 rows=1 width=97) Output: aid, bid, abalance, filler Sort Key: pgbench_accounts.bid-> Index Scan using pgbench_accounts_pkey_106819 on public.pgbench_accounts_106819 pgbench_accounts (cost=0.28..2.50 rows=1 width=97) Output: aid, bid Abalance, filler Index Cond: (pgbench_accounts.aid = 1) (14 rows)
Citus does not support redistributed procedures in window calls.
Support for greenplum window function
Window calls with any pose are supported
Postgres=# create table t (id int, C1 int, c2 int); NOTICE: Table doesn't have 'DISTRIBUTED BY' clause-- Using column named' id' as the Greenplum Database data distribution key for this table. HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column (s) chosen are the optimal data distribution key to minimize skew. CREATE TABLE postgres=# insert into t select random () * 100000, random () * 10, random () * 100from generate_series (1Min 10000000); INSERT 0 10000000 postgres=# explain select row_number () over (partition by C1 order by id) rn,* from t QUERY PLAN -Gather Motion 33:1 (slice2 Segments: 33) (cost=1477974.88..1553064.94 rows=10012008 width=12)-> Window (cost=1477974.88..1553064.94 rows=303395 width=12) Partition By: C1 Order By: id-> Sort (cost=1477974.88..1503004.90 rows=303395 width=12) Sort Key: C1, id / / replace the following with a temporary table in citus-> Redistribute Motion 33:33 (slice1 Segments: 33) (cost=0.00..313817.24 rows=303395 width=12) Hash Key: C1-> Seq Scan on t (cost=0.00..113577.08 rows=303395 width=12) Optimizer status: legacy query optimizer (10 rows)
Even partition with multiple different dimensions is supported in one SQL
Postgres=# explain select row_number () over (partition by C1 order by id) rn1, row_number () over (partition by c2 order by C1) rn2, * from t QUERY PLAN -Gather Motion 33:1 (slice3 Segments: 33) (cost=3017582.83..3192792.97 rows=10012008 width=12)-> Subquery Scan coplan (cost=3017582.83..3192792.97 rows=303395 width=12)-> Window (cost=3017582.83..3092672.89 rows=303395 width=12) Partition By: coplan.c1 Order By: coplan.id-> Sort (cost=3017582.83..3042612.85 rows=303395 width=12) Sort Key: coplan.c1 Coplan.id / / the following is replaced by a temporary table in citus-> Redistribute Motion 33:33 (slice2 Segments: 33) (cost=1477974.88..1853425.18 rows=303395 width=12) Hash Key: coplan.c1-> Subquery Scan coplan (cost=1477974.88..1653185.02 rows=303395 width=12)-> Window (cost=1477974.88..1553064.94 rows=303395 width=12) Partition By: t.c2 Order By: t.c1-> Sort (cost=1477974.88..1503004.90 rows=303395 width=12) Sort Key: t.c2 T.c1 / / the following is replaced by a temporary table in citus-> Redistribute Motion 33:33 (slice1 Segments: 33) (cost=0.00..313817.24 rows=303395 width=12) Hash Key: t.c2-> Seq Scan on t (cost=0.00..113577.08 rows=303395 width=12) Optimizer status: legacy query optimizer (19 rows) Thank you for your reading The above is the content of "how to solve the restrictions of PostgreSQL window function calls". After the study of this article, I believe you have a deeper understanding of how to solve the restrictions of PostgreSQL window function calls, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.