In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article focuses on "what is the impact of different data types on query performance in PostgreSQL". Interested friends may wish to take a look. The method introduced in this paper is simple, fast and practical. Let's let the editor take you to learn "what are the effects of different data types on query performance in PostgreSQL?"
Capacity
Size of space occupied by data column
[local]: 5432 pg12@testdb=# SELECT pg_column_size (SMALLINT'1'), pg_column_size (INT4'1'), pg_column_size (NUMERIC (6memo)'1'), pg_column_size (FLOAT'1') Pg_column_size | pg_column_size | pg_column_size | pg_column_size-+-2 | 4 | 8 | 8
Create a data table, insert 100w rows of data values of 0 and 1, and view the space occupied by the data table.
Numeric
[local]: 5432 pg12@testdb=# create table t_numeric (id numeric); CREATE TABLE [local]: 5432 pg12@testdb=# [local]: 5432 pg12@testdb=# insert into t_numeric select 0 from generate_series (1Magazine 1000000); INSERT 0 1000000 [local]: 5432 pg12@testdb=# insert into t_numeric select 1 from generate_series (1Magazine 1000000); INSERT 01000000 [local]: 5432 pg12@testdb=# select pg_size_pretty (pg_relation_size ('tactile numeric')) Pg_size_pretty-69 MB (1 row)
Float
[local]: 5432 pg12@testdb=# create table t_float (id int); CREATE TABLE [local]: 5432 pg12@testdb=# [local]: 5432 pg12@testdb=# insert into t_float select 0 from generate_series (1Magazine 1000000); INSERT 0 1000000 [local]: 5432 pg12@testdb=# insert into t_float select 1 from generate_series (1Magazine 1000000); INSERT 01000000 [local]: 5432 pg12@testdb=# [local]: 5432 pg12@testdb=# select pg_size_pretty (pg_relation_size ('tweak float')) Pg_size_pretty-69 MB (1 row) [local]: 5432 pg12@testdb=#
Int
[local]: 5432 pg12@testdb=# create table t_int (id int); CREATE TABLE [local]: 5432 pg12@testdb=# [local]: 5432 pg12@testdb=# insert into t_int select 0 from generate_series (1Magazine 1000000); INSERT 0 1000000 [local]: 5432 pg12@testdb=# insert into t_int select 1 from generate_series (1Magazine 1000000); INSERT 01000000 [local]: 5432 pg12@testdb=# select pg_size_pretty (pg_relation_size) Pg_size_pretty-69 MB (1 row)
Smallint
[local]: 5432 pg12@testdb=# create table t_smallint (id smallint); CREATE TABLE [local]: 5432 pg12@testdb=# [local]: 5432 pg12@testdb=# insert into t_smallint select 0 from generate_series (1Magazine 1000000); INSERT 0 1000000 [local]: 5432 pg12@testdb=# insert into t_smallint select 1 from generate_series (1Magazine 1000000); INSERT 01000000 [local]: 5432 pg12@testdb=# [local]: 5432 pg12@testdb=# select pg_size_pretty (pg_relation_size Pg_size_pretty-69 MB (1 row)
Boolean
[local]: 5432 pg12@testdb=# create table t_bool (id boolean); CREATE TABLE [local]: 5432 pg12@testdb=# insert into t_bool select 0::boolean from generate_series (1Magne 1000000); INSERT 0 1000000 [local]: 5432 pg12@testdb=# insert into t_bool select 1::boolean from generate_series (1m 1000000); INSERT 0 1000000 [local]: 5432 pg12@testdb=# [local]: 5432 pg12@testdb=# select pg_size_pretty (pg_relation_size ('tactibool')) Pg_size_pretty-69 MB (1 row)
As you can see, the space occupied by the four data types is 69 MB.
Query performance
No conditions, full table scan
-- disable parallelism [local]: 5432 pg12@testdb=# SET max_parallel_workers_per_gather = 0 * * set [local]: 5432 pg12@testdb=# explain (analyze,verbose,buffers) select count (*) from t_numeric QUERY PLAN -Aggregate (cost=33850.00..33850.01 rows=1 width=8) (actual time=478.196..478.196 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public. T_numeric (cost=0.00..28850.00 rows=2000000 width=0) (actual time=0.053..255.949 rows=2000000 loops=1) Output: id Buffers: shared hit=8850 Planning Time: 0.716 ms Execution Time: 478.280 ms (8 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_float QUERY PLAN -Aggregate (cost=33850.00..33850.01 rows=1 width=8) (actual time=421.919..421.919 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public.t_float (cost=0.00..28850.00 rows=2000000 width=0) (actual time=0.010..222.624 rows=2000000 loops=1) Output: id Buffers: shared hit=8850 Planning Time: 0.231 ms Execution Time: 421.948 ms (8 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_int QUERY PLAN -Aggregate (cost=33850.00..33850.01 rows=1 width=8) (actual time=440.328..440.328 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public.t_int (cost=0) .00.. 28850.00 rows=2000000 width=0) (actual time=0.011..236.078 rows=2000000 loops=1) Output: id Buffers: shared hit=8850 Planning Time: 0.208 ms Execution Time: 440.359 ms (8 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_smallint QUERY PLAN -Aggregate (cost=33850.00..33850.01 rows=1 width=8) (actual time=439.007..439.007 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public .t _ smallint (cost=0.00..28850.00 rows=2000000 width=0) (actual time=0.043..232.069 rows=2000000 loops=1) Output: id Buffers: shared hit=8850 Planning Time: 0.553 ms Execution Time: 439.081 ms (8 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_bool QUERY PLAN -Aggregate (cost=33850.00..33850.01 rows=1 width=8) (actual time=430.800..430.800 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public.t_bool ( Cost=0.00..28850.00 rows=2000000 width=0) (actual time=0.010..230.333 rows=2000000 loops=1) Output: id Buffers: shared hit=8850 Planning Time: 0.224 ms Execution Time: 430.831 ms (8 rows) [local]: 5432 pg12@testdb=#
Unconditional full table scan, the time is not much different, the longest execution time is the numeric type.
Add query conditions, full table scan
[local]: 5432 pg12@testdb=# explain (analyze,verbose,buffers) select count (*) from t_numeric where id = '0There is a Velcro rig of numericistlain (analyze,verbose,buffers) select count (*) from t_bool where id = 0::boolean QUERY PLAN -Aggregate (cost=36358.67..36358.68 rows=1 width=8) (actual time=723.356..723.357 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public. T_numeric (cost=0.00..33850.00 rows=1003467 width=0) (actual time=0.057..610.907 rows=1000000 loops=1) Output: id Filter: (t_numeric.id = '0'::numeric) Rows Removed by Filter: 1000000 Buffers: shared hit=8850 Planning Time: 1.901 ms Execution Time: 723.449 ms (10 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_float where id = '0'::numeric QUERY PLAN -Aggregate (cost=38875.00..38875.01 rows=1 width=8) (actual time=827.686..827.687 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public.t_float (cost=0) .00.. 38850.00 rows=10000 width=0) (actual time=0.015..725.737 rows=1000000 loops=1) Output: id Filter: (t_float.id):: numeric = '0'::numeric) Rows Removed by Filter: 1000000 Buffers: shared hit=8850 Planning Time: 0.234 ms Execution Time: 827.720 ms (10 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_int where id = 0 QUERY PLAN -Aggregate (cost=36329.50..36329.51 rows=1 width=8) (actual time=434.067..434.067 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public.t_int (cost=0.00) .. 33850.00 rows=991800 width=0) (actual time=0.014..333.883 rows=1000000 loops=1) Output: id Filter: (t_int.id = 0) Rows Removed by Filter: 1000000 Buffers: shared hit=8850 Planning Time: 0.295 ms Execution Time: 434.101 ms (10 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_smallint where id = 0 QUERY PLAN -Aggregate (cost=36354.50..36354.51 rows=1 width=8) (actual time=486.466..486.466 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public .t _ smallint (cost=0.00..33850.00 rows=1001800 width=0) (actual time=0.053..368.184 rows=1000000 loops=1) Output: id Filter: (t_smallint.id = 0) Rows Removed by Filter: 1000000 Buffers: shared hit=8850 Planning Time: 1.396 ms Execution Time: 486.554 ms (10 rows) [local]: 5432 pg12@testdb=# explain (analyze Verbose,buffers) select count (*) from t_bool where id = 0::boolean QUERY PLAN -Aggregate (cost=31356.67..31356.68 rows=1 width=8) (actual time=416.510..416.510 rows=1 loops=1) Output: count (*) Buffers: shared hit=8850-> Seq Scan on public.t_bool ( Cost=0.00..28850.00 rows=1002667 width=0) (actual time=0.014..316.188 rows=1000000 loops=1) Output: id Filter: (NOT t_bool.id) Rows Removed by Filter: 1000000 Buffers: shared hit=8850 Planning Time: 0.261 ms Execution Time: 416.551 ms (10 rows) [local]: 5432 pg12@testdb=#
In the case of query conditions, the cost of parsing the expression is different (bool
< int < numeric < float),因此时间相差较大,时长最大的是float类型,时间接近bool类型的2倍。 创建索引,全索引扫描 禁用全表扫描,使用全索引扫描 [local]:5432 pg12@testdb=# explain (analyze,verbose,buffers) select count(*) from t_numeric where id = '0'::numeric; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=35541.77..35541.78 rows=1 width=8) (actual time=594.984..594.984 rows=1 loops=1) Output: count(*) Buffers: shared hit=7160 ->Index Only Scan using idx_t_numeric_id on public.t_numeric (cost=0.43..33033.10 rows=1003467 width=0) (actual time=0.269..482.525 rows=1000000 loops=1) Output: id Index Cond: (t_numeric.id = '0'::numeric) Heap Fetches: 1000000 Buffers: shared hit=7160 Planning Time: 1.392 ms Execution Time: 595.253 ms (10 rows) [local]: 5432 pg12@testdb=# explain (analyze,verbose) Buffers) select count (*) from t_float where id = '0'::numeric QUERY PLAN- -Aggregate (cost=70854.43..70854.44 Rows=1 width=8) (actual time=1337.093..1337.094 rows=1 loops=1) Output: count (*) Buffers: shared hit=14317-> Index Only Scan using idx_t_float_id on public.t_float (cost=0.43..70829.43 rows=10000 width=0) (actual time=0.037..1233.730 rows=1000000 loops=1) Output: id Filter: (t_float.id):: numeric = '0'::numeric) Rows Removed by Filter: 1000000 Heap Fetches: 2000000 Buffers: shared hit=14317 Planning Time: 0.293 ms Execution Time: 1337.168 ms (11 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_int where id = 0 QUERY PLAN -Aggregate (cost=35128.43..35128.44 rows=1 width=8) (actual Time=526.942..526.943 rows=1 loops=1) Output: count (*) Buffers: shared hit=7160-> Index Only Scan using idx_t_int_id on public.t_int (cost=0.43..32648.93 rows=991800 width=0) (actual time=0.035..414.797 rows=1000000 loops=1) Output: id Index Cond: (t_int.id = 0) Heap Fetches: 1000000 Buffers: shared hit=7160 Planning Time: 0.245 ms Execution Time: 526.979 ms (10 rows) [local]: 5432 pg12@testdb=# explain (analyze Verbose,buffers) select count (*) from t_smallint where id = 0 QUERY PLAN-- - Aggregate (cost=35480.43..35480.44 rows=1 width=8) (actual time=551.394..551.394 rows=1 loops=1) Output: count (*) Buffers: shared hit=4428 read=2735-> Index Only Scan using idx_t_smallint_id on public.t_smallint (cost=0.43..32975.93 rows=1001800 width=0) (actual time=0.459..438.992 rows=1000000 loops=1) Output: id Index Cond: (t_smallint.id = 0) Heap Fetches: 1000000 Buffers: shared hit=4428 read=2735 Planning Time: 1.889 ms Execution Time: 551.499 ms (10 rows) [local]: 5432 pg12@testdb=# explain (analyze) Verbose,buffers) select count (*) from t_bool where id = 0::boolean QUERY PLAN- -Aggregate (cost=35513.77..35513.78 rows=1 Width=8) (actual time=497.886..497.886 rows=1 loops=1) Output: count (*) Buffers: shared hit=7160-> Index Only Scan using idx_t_bool_id on public.t_bool (cost=0.43..33007.10 rows=1002667 width=0) (actual time=0.035..393.653 rows=1000000 loops=1) Output: id Index Cond: (t_bool.id = false) Heap Fetches: 1000000 Buffers: shared hit=7160 Planning Time: 0. 250 ms Execution Time: 497.922 ms (10 rows) [local]: 5432 pg12@testdb=#
For full index scan, the longest execution time is still float type, while the other three types are not much different. The performance of numeric is significantly improved compared with full table scan (595ms vs 723ms).
Pressure testing
Use pgbench for stress testing, numeric/float/int three types, each insert 100w data
Drop table tincture numeracy create table t_big_numeric (id numeric); insert into t_big_numeric select 0 from generate_series (1Magne 1000000); drop table tweak bigger numbers float create table t_big_float (id int); insert into t_big_float select 0 from generate_series (1Magazine 1000000); drop table tincture bigger than intact create table t_big_int (id int); insert into t_big_int select 0 from generate_series (1Magol 1000000)
Test result
[pg12@localhost test] $pgbench-C-f. / select_numeric.sql-- time=120-- client=8-- jobs=2-d testdb...transaction type:. / select_numeric.sqlscaling factor: 1query mode: 8number of threads: 2duration: 120 snumber of transactions actually processed: 1254latency average = 768.659 mstps = 10.407739 (including connections establishing) tps = 10.906626 (excluding connections establishing) [pg12@localhost test] $[pg12@localhost test] $pgbench-C-f. / select_float.sql- -time=120-- client=8-- jobs=2-d testdb...transaction type:. / select_float.sqlscaling factor: 1query mode: simplenumber of clients: 8number of threads: 2duration: 120 snumber of transactions actually processed: 2167latency average = 444.006 mstps = 18.017778 (including connections establishing) tps = 19.461350 (excluding connections establishing) [pg12@localhost test] $cat select_float.sql\ set id random (1,500,000) select * from t_big_float where id =: id [pg12@localhost test] $[pg12@localhost test] $pgbench-C-f. / select_int.sql-- time=120-- client=8-- jobs=2-d testdb...transaction type:. / select_int.sqlscaling factor: 1query mode: simplenumber of clients: 8number of threads: 2duration: 120 snumber of transactions actually processed: 2184latency average = 440.271 mstps = 18.170626 (including connections establishing) tps = 19.658996 (excluding connections establishing) [pg12@localhost test] $ I believe that you have a deeper understanding of the impact of different data types on query performance in PostgreSQL, so you might as well do it in practice. Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.