In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly explains the "PostgreSQL statistics in the calculation of selection rate application analysis", the article explains the content is simple and clear, easy to learn and understand, the following please follow the editor's ideas slowly in-depth, together to study and learn "PostgreSQL statistics in the calculation of selection rate application analysis"!
First, calculate the selection rate
Single conditional equivalence query
The test data generation script is as follows:
Insert into t_grxx (dwbh,grbh,xm,xb,nl) select generate_series (1Power100, 000) / 10 | |', generate_series (1 else 100, 000), 'XM' | | generate_series (1 random 100, 000), (case when (floor (random () * 2) = 0) then' male 'else' female 'end), floor (random () * 100 + 1):: int
SQL scripts and execution plans:
Testdb=# explain verbose select * from t_grxx where dwbh = '6323' QUERY PLAN -Index Scan using idx_t_grxx_dwbh on public.t_grxx (cost=0.29..46.90 rows=30 width=24) Output: dwbh Grbh, xm, xb, nl Index Cond: (t_grxx.dwbh):: text = '6323'::text) (3 rows) testdb=# explain verbose select * from t_grxx where dwbh =' 24' QUERY PLAN -Index Scan using idx_t_grxx_dwbh on public.t_grxx (cost=0.29..20.29 rows=10 width=24) Output: dwbh Grbh, xm, xb, nl Index Cond: (t_grxx.dwbh):: text = '24'::text) (3 rows)
Although they are both equivalent queries, the number of rows (rows) returned by dwbh='6323' and dwbh='24' in the execution plan is different, one is rows=30 and the other is rows=10. From the script that generated the data, the rows of '6323' and'24 'should be the same, but the results displayed in the execution plan are different because' 6323 'appears in the high frequency value when calculating the selection rate, so it is different from other values.
Analysis of calculation process
Query the statistics for this column:
Testdb=#\ xExpanded display is on.testdb=# select starelid,staattnum,stainherit,stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 16742 and staattnum = 1bot-[RECORD 1]-starelid | 16742staattnum | 1stainherit | fstanullfrac | 0stawidth | 4stadistinct |-0.10015testdb=# select staattnum,stakind1,staop1,stanumbers1,stavalues1, stakind2,staop2,stanumbers2,stavalues2, stakind3,staop3,stanumbers3,stavalues3from pg_statistic where starelid = 16742 and staattnum = 1 -[RECORD 1]- - - - ---------------------staattnum | 1stakind1 | 1staop1 | 98stanumbers1 | {0.0003}stavalues1 | {6323}stakind2 | 2staop2 | 664stanumbers2 | stavalues2 | {0,1092,1181,1265,1350,1443,1529,1619,171,1797,1887,1972,2058,2151,2240,2334,2423,2520,2618,271,2798,2892,2987,3076,3162,3246,3332,3421,3510,3597,3685,3777,3860,3956,4051,4136,4227,4317,4408,45,4590,4671,4760,4850,4933,5025,5120,5210,5300,5396,548,5570,5656,5747,5835,5931,6017,6109,6190,6281,6374,6465,6566,6649,6735,6830,6921,7012,7101,7192,7278,737,7455,7544,7630,7711,7801,7895,7988,8081,8167,8260,8344,8430,8520,8615,8707,8809,8901,8997,9083,918,9272,9367,9451,9538,9630,9729,982,9904,9999}stakind3 | 3staop3 | 664stanumbers3 | {0.819578}stavalues3 |
The conditional statement is an equivalent expression, using the operator "=" (string equivalent comparison, texteq/eqsel/eqjoinsel), so the statistics used are high-frequency value MCV (note: staop1=98, this is string equivalent comparison). '6323' appears in high-frequency values with a selection rate of 0.0003, so rows=100000x0.0003=30. However, '24' does not appear in the high frequency value, and the selection rate = (1-0.0003) / abs (stadistinct) / Tuples= (1-0.0003) / abs (- 0.10015) / 100000099820269595606590000m rows = (1-0.0003) / abs (stadistinct) = 10 (rounding).
Single condition comparison query
Test script:
Testdb=# create table t_int (C1 int,c2 varchar (20)); CREATE TABLEtestdb=# testdb=# insert into t_int select generate_series (1m 100000) / 10 row C2' | | generate_series (1m 100000) / 100th insert 0 100000testdb=# ANALYZEt _ int;ANALYZEtestdb=# select oid from pg_class where relname='t_int'; oid-16755 (1 row)
Query statistics for C1 column
Testdb=#\ xExpanded display is on.testdb=# select staattnum,stakind1,staop1,stanumbers1,stavalues1,testdb-# stakind2,staop2,stanumbers2,stavalues2,testdb-# stakind3,staop3,stanumbers3,stavalues3testdb-# from pg_statistic testdb-# where starelid = 16755 testdb-# and staattnum = 1 -[RECORD 1]- - - - --------------------staattnum | 1stakind1 | 1staop1 | 96stanumbers1 | {0.0003}stavalues1 | {8306}stakind2 | 2staop2 | 97stanumbers2 | stavalues2 | {0,108,215,318,416,514,611,704,809,912,1015,1111,1217,1312,1410,1511,1607,1705,1805,1903,2002,2094,2189,2287,2388,2487,2592,2695,2795,2896,2998,3112,3213,3304,3408,3507,3606,3707,3798,3908,4004,4106,4205,4312,4413,4505,4606,4714,4821,4910,5014,5118,5220,5321,5418,5516,5613,5709,5807,5916,6014,6127,6235,6341,6447,6548,6648,6741,6840,6931,7032,7131,7234,7330,7433,7532,7626,7727,7827,7925,8020,8120,8217,8322,8420,8525,8630,8730,8831,8934,9032,9128,9223,9323,9425,9527,9612,9706,9804,9904,9999}stakind3 | 3staop3 | 97stanumbers3 | {1}stavalues3 |
Query statement:
Testdb=# explain verbose select * from t_int where C1 < 2312 QUERY PLAN-Seq Scan on public.t_int (cost=0.00..1790.00 rows=23231 width=9) Output: C1 C2 Filter: (t_int.c1 < 2312) (3 rows)
SQL uses non-equivalent queries (
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.