In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
This section describes the three scan types of data tables in PostgreSQL, namely, sequential scan SeqScan, index scan IndexScan, and bitmap heap scan BitmapHeapScan.
I. brief introduction
Selection rate = number of tuples after conditional filtering / number of tuples before conditional filtering
Sequential scan SeqScan
Direct sequential scanning of table heap data (Heap Data) is suitable for scenarios with high selection rates.
Index scan IndexScan
Access the heap data after accessing the tuple location pointer by accessing the index, which is suitable for scenarios with low selection rate.
Bitmap heap scan BitmapHeapScan
Bitmap heap scanning needs to first store the Page (Block) ID of the qualified tuple in Bitmap through BitmapIndexScan, and then access the heap data through Bitmap, which is suitable for scenarios with neither high selection rate nor low selection rate, between the above two scanning methods.
Revised in 2018.10.01, the index is suitable for cases with low selection rate, and sequential scanning is suitable for cases with high selection rate.
It is worth noting:
1. "higher selection rate" is a qualitative expression. In fact, PG determines which scanning method to use according to Cost calculation. In general, the main operation of index scanning is to randomly access the storage device. In the initialization parameter configuration of PG, the Cost of random access is 4, while the Cost of sequential access is 1. It is roughly estimated that if the Index Blocks + Heap Blocks accessed through the index exceeds the 1max 4 of the sequential access Heap Blocks, then the PG will choose to use sequential scanning instead of index scanning.
The scanning mode of 2.IndexScan is to access the index, and if it meets the conditions, it immediately accesses the heap data according to the tuple position pointer in the index to obtain the tuple, while BitmapIndexScan (bitmap index scan) accesses the index and stores the qualified Block ID in Bitmap. At this time, it does not involve scanning the heap data, and finally the operation of obtaining the tuple is completed by BitmapHeapScan scanning.
The difference between the two is summed up very well in the following paragraph:
A plain Index Scan fetches one tuple-pointer at a time from the index, and immediately visits that tuple in the table. A bitmap scan fetches all the tuple-pointers from the index in one go, sorts them using an in-memory "bitmap" data structure, and then visits the table tuples in physical tuple-location order.
The following is a visual experience of the differences in these ways through the sample script.
Test datasheet, tactidwxpl 10000 rows of data, create competition on dwbh
Testdb=# select count (*) from tweedwxx; count-10000 (1 row) II, SeqScan
Test script:
Testdb=# explain verbose select t1.* from t_dwxx T1 where dwbh > '1000' QUERY PLAN-Seq Scan on public.t_dwxx T1 (cost=0.00..189.00 rows=9999 width=20) Output: dwmc, dwbh Dwdz Filter: (t1.dwbh):: text > '1000'::text) (3 rows)
The query condition is dwbh > '1000mm, the selection rate is low, PG chooses sequential scan SeqScan, the cost is 189.00, how to calculate the cost, interested can refer to the source code interpretation (53), through gdb tracking analysis.
III. IndexScan
Test script:
Testdb=# explain verbose select T1. * from t_dwxx T1 where dwbh = '10000' QUERY PLAN -Index Scan using t_dwxx_pkey on public.t_dwxx T1 (cost=0.29..8.30 rows=1 width=20) Output: dwmc Dwbh, dwdz Index Cond: (t1.dwbh):: text = '10000'::text) (3 rows)
The query condition is dwbh = '10000 records, the selection rate is very high, there is only one record, select index scan.
Total cost 8.30 = startup cost + 1 Index Block access + 1 Heap Block access = 0.29 + 4 + 4 ≈ 8.30
IV. BitmapHeapScan
Test script:
Testdb=# explain verbose select t1.* from t_dwxx T1 where dwbh > '1000' and dwbh
< '3000'; QUERY PLAN --------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.t_dwxx t1 (cost=51.07..148.42 rows=2223 width=20) Output: dwmc, dwbh, dwdz Recheck Cond: (((t1.dwbh)::text >'1000'::text) AND ((t1.dwbh):: text
< '3000'::text)) ->Bitmap Index Scan on t_dwxx_pkey (cost=0.00..50.52 rows=2223 width=0) Index Cond: ((t1.dwbh):: text > '1000'::text) AND ((t1.dwbh):: text
< '3000'::text))(5 rows) 查询条件为dwbh >The selection rate is not high or low for '1000' and dwbh < '3000'. PG chooses BitmapHeapScan. The start-up cost is 51.07 and the total cost is 148.42. How to calculate this cost will be tracked and analyzed by subsequent source code interpretation.
It is worth noting that there is a step after BitmapIndexScan: Recheck, this is because the bitmap index scan only finds the Heap Block ID, not the qualified tuples, so there is the Recheck step.
5. Reference materials
PostgreSQL indexing: Index scan vs. Bitmap scan vs. Sequential scan
Bitmap indexes
What is a "Bitmap heap scan" in a query plan?
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.