Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Analysis of create_index_paths->choose_bitmap_and function in PostgreSQL physical Optimization

2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly explains "create_index_paths- > choose_bitmap_and function Analysis in PostgreSQL physical Optimization". The explanation in this article is simple and clear and easy to learn and understand. Please follow the editor's train of thought to study and learn "create_index_paths- > choose_bitmap_and function Analysis in PostgreSQL physical Optimization".

This function creates a bitmap index scan access path (BitmapAndPath) node after performing the Bitmap AND operation.

The following is an example of an BitmapAnd access path:

Testdb=# explain verbose select t1.* testdb-# from t_dwxx T1 testdb-# where (dwbh > '10000' and dwbh

< '15000') AND (dwdz between 'DWDZ10000' and 'DWDZ15000');QUERY PLAN ---------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.t_dwxx t1 (cost=32.33..88.38 rows=33 width=20) Output: dwmc, dwbh, dwdz Recheck Cond: (((t1.dwbh)::text >

'10000'::text) AND ((t1.dwbh):: text

< '15000'::text) AND ((t1.dwdz)::text >

= 'DWDZ10000'::text) AND ((t1.dwdz):: text BitmapAnd (cost=32.33..32.33 rows=33 width=0)-> BitmapAnd-> Bitmap Index Scan on t_dwxx_pkey (cost=0.00..13.86 rows=557 width=0) Index Cond: ((t1.dwbh):: text >' 10000'::text) AND ((t1.dwbh):: text

< '15000'::text)) ->

Bitmap Index Scan on idx_dwxx_dwdz (cost=0.00..18.21 rows=592 width=0) Index Cond: ((t1.dwdz):: text > = 'DWDZ10000'::text) AND ((t1.dwdz):: text path; / * Sort the surviving paths by index access cost * / qsort (pathinfoarray, npaths, sizeof (PathClauseUsage *), path_usage_comparator) / / sort existing paths by index access cost / * * For each surviving index, consider it as an "AND group leader", and see * whether adding on any of the later indexes results in an AND path with * cheaper total cost than before. Then take the cheapest AND group. * for the existing index, think of it as "AND group leader", * and see if you add a later index, resulting in an AND path with a lower total cost than before. * choose the AND group with the lowest cost. * * / for (I = 0; I

< npaths; i++)//遍历这些路径 { Cost costsofar; List *qualsofar; Bitmapset *clauseidsofar; ListCell *lastcell; pathinfo = pathinfoarray[i];//PathClauseUsage结构体 paths = list_make1(pathinfo->

Path); / / path linked list costsofar = bitmap_scan_cost_est (root, rel, pathinfo- > path); / / current cost qualsofar = list_concat (list_copy (pathinfo- > quals), list_copy (pathinfo- > preds)); clauseidsofar = bms_copy (pathinfo- > clauseids); lastcell = list_head (paths) / * for quick deletion, for quick deletions * / for (j = I + 1; j

< npaths; j++)//扫描后续的路径 { Cost newcost; pathinfo = pathinfoarray[j]; /* Check for redundancy */ if (bms_overlap(pathinfo->

Clauseids, clauseidsofar) continue; / * extra path, consider it redundant * / if (pathinfo- > preds) / / partial index? {bool redundant = false / * we check each predicate clause separately * / / check each predicate foreach (l, pathinfo- > preds) {Node * np = (Node *) lfirst (l) separately If (predicate_implied_by (list_make1 (np), qualsofar, false)) {redundant = true; break; / * out of inner foreach loop * /}} if (redundant) continue } / * tentatively add new path to paths, so we can estimate cost * / / try to add a new path to the path so that we can estimate the cost paths = lappend (paths, pathinfo- > path); newcost = bitmap_and_cost_est (root, rel, paths); / / estimated cost if (newcost

< costsofar)//新成本更低 { /* keep new path in paths, update subsidiary variables */ costsofar = newcost; qualsofar = list_concat(qualsofar, list_copy(pathinfo->

Quals); / / add this condition qualsofar = list_concat (qualsofar, list_copy (pathinfo- > preds)); / / add this predicate clauseidsofar = bms_add_members (clauseidsofar, pathinfo- > clauseids) / / add this clause ID lastcell = lnext (lastcell);} else {/ * reject new path, remove it from paths list * / paths = list_delete_cell (paths, lnext (lastcell), lastcell); / / remove the new path} Assert (lnext (lastcell) = = NULL) } / * Keep the cheapest AND-group (or singleton) * / if (I = = 0 | costsofar

< bestcost)//单条路径或者取得最小的成本 { bestpaths = paths; bestcost = costsofar; } /* some easy cleanup (we don't try real hard though) */ list_free(qualsofar); } if (list_length(bestpaths) == 1) return (Path *) linitial(bestpaths); /* 无需AND路径,no need for AND */ return (Path *) create_bitmap_and_path(root, rel, bestpaths);//生成BitmapAndPath } //-------------------------------------------------------------------------- bitmap_scan_cost_est /* * Estimate the cost of actually executing a bitmap scan with a single * index path (no BitmapAnd, at least not at this level; but it could be * a BitmapOr). */ static Cost bitmap_scan_cost_est(PlannerInfo *root, RelOptInfo *rel, Path *ipath) { BitmapHeapPath bpath; Relids required_outer; /* Identify required outer rels, in case it's a parameterized scan */ required_outer = get_bitmap_tree_required_outer(ipath); /* Set up a dummy BitmapHeapPath */ bpath.path.type = T_BitmapHeapPath; bpath.path.pathtype = T_BitmapHeapScan; bpath.path.parent = rel; bpath.path.pathtarget = rel->

Reltarget; bpath.path.param_info = get_baserel_parampathinfo (root, rel, required_outer); bpath.path.pathkeys = NIL; bpath.bitmapqual = ipath; / * * Check the cost of temporary path without considering parallelism. * Parallel bitmap heap path will be considered at later stage. * / bpath.path.parallel_workers = 0; cost_bitmap_heap_scan (& bpath.path, root, rel, bpath.path.param_info, ipath, get_loop_count (root, rel- > relid, required_outer)); / / BitmapHeapPath calculation cost return bpath.path.total_cost } / /-bitmap_and_cost_est / * * Estimate the cost of actually executing a BitmapAnd scan with the given * inputs. * estimate the actual cost of performing the BitmapAnd scan given the input * / static Cost bitmap_and_cost_est (PlannerInfo * root, RelOptInfo * rel, List * paths) {BitmapAndPath apath; BitmapHeapPath bpath; Relids required_outer; / * Set up a dummy BitmapAndPath * / apath.path.type = Tunable BitmapAndPath.apath.path.pathtype = Tunable BitmapAnd; apath.path.parent = rel; apath.path.pathtarget = rel- > reltarget Apath.path.param_info = NULL; / * not used in bitmap trees * / apath.path.pathkeys = NIL; apath.bitmapquals = paths; cost_bitmap_and_node (& apath, root); / * Identify required outer rels, in case it's a parameterized scan * / required_outer = get_bitmap_tree_required_outer ((Path *) & apath); / * Set up a dummy BitmapHeapPath * / bpath.path.type = T_BitmapHeapPath Bpath.path.pathtype = root Bitmap HeapScan; bpath.path.parent = rel; bpath.path.pathtarget = rel- > reltarget; bpath.path.param_info = get_baserel_parampathinfo (root, rel, required_outer); bpath.path.pathkeys = NIL; bpath.bitmapqual = (Path *) & apath; / * * Check the cost of temporary path without considering parallelism. * Parallel bitmap heap path will be considered at later stage. * / bpath.path.parallel_workers = 0; / * Now we can do cost_bitmap_heap_scan * / cost_bitmap_heap_scan (& bpath.path, root, rel, bpath.path.param_info, (Path *) & apath, get_loop_count (root, rel- > relid, required_outer)) / / BitmapHeapPath calculates cost return bpath.path.total_cost;} / /-create_bitmap_and_path / * * create_bitmap_and_path * Creates a path node representing a BitmapAnd. * / BitmapAndPath * create_bitmap_and_path (PlannerInfo * root, RelOptInfo * rel, List * bitmapquals) {BitmapAndPath * pathnode = makeNode (BitmapAndPath); pathnode- > path.pathtype = titled Bitmap and; pathnode- > path.parent = rel; pathnode- > path.pathtarget = rel- > reltarget; pathnode- > path.param_info = NULL / * not used in bitmap trees * / * * Currently, a BitmapHeapPath, BitmapAndPath, or BitmapOrPath will be * parallel-safe if and only if rel- > consider_parallel is set. So, we can * set the flag for this path based only on the relation-level flag, * without actually iterating over the list of children. * / pathnode- > path.parallel_aware = false; pathnode- > path.parallel_safe = rel- > consider_parallel; pathnode- > path.parallel_workers = 0; pathnode- > path.pathkeys = NIL; / * always unordered * / pathnode- > bitmapquals = bitmapquals; / * this sets bitmapselectivity as well as the regular cost fields: * / cost_bitmap_and_node (pathnode, root); / / calculate cost return pathnode } / /-cost_bitmap_and_node / * * cost_bitmap_and_node * Estimate the cost of a BitmapAnd node * estimated BitmapAnd node cost * * Note that this considers only the costs of index scanning and bitmap * creation, not the eventual heap access. In that sense the object isn't * truly a Path, but it has enough path-like properties (costs in particular) * to warrant treating it as one. We don't bother to set the path rows field, * however. * / void cost_bitmap_and_node (BitmapAndPath * path, PlannerInfo * root) {Cost totalCost; Selectivity selec; ListCell * l; / * We estimate AND selectivity on the assumption that the inputs are * independent. This is probably often wrong, but we don't have the info * to do better. * * The runtime cost of the BitmapAnd itself is estimated at 100x * cpu_operator_cost for each tbm_intersect needed. Probably too small, * definitely too simplistic? * / totalCost = 0; selec = 1.0; foreach (l, path- > bitmapquals) {Path * subpath = (Path *) lfirst (l); Cost subCost; Selectivity subselec; cost_bitmap_tree_node (subpath, & subCost, & subselec); selec * = subselec; totalCost + = subCost If (l! = list_head (path- > bitmapquals)) totalCost + = 100.0 * cpu_operator_cost;} path- > bitmapselectivity = selec; path- > path.rows = 0; / * per above, not used * / path- > path.startup_cost = totalCost; path- > path.total_cost = totalCost;} III, tracking analysis

The test script is as follows

Select t1.* from t_dwxx T1 where (dwbh > '10000' and dwbh

< '15000') AND (dwdz between 'DWDZ10000' and 'DWDZ15000'); 启动gdb跟踪 (gdb) b choose_bitmap_andBreakpoint 1 at 0x74e8c2: file indxpath.c, line 1372.(gdb) cContinuing.Breakpoint 1, choose_bitmap_and (root=0x1666638, rel=0x1666a48, paths=0x166fdf0) at indxpath.c:13721372 int npaths = list_length(paths); 输入参数 (gdb) p *paths$1 = {type = T_List, length = 2, head = 0x166fe20, tail = 0x16706b8}(gdb) p *(Node *)paths->

Head- > data.ptr_value$2 = {type = T_IndexPath} (gdb) p * (Node *) paths- > head- > next- > data.ptr_value$3 = {type = T_IndexPath} (gdb) set $p1 = (IndexPath *) paths- > head- > data.ptr_value (gdb) set $p2 = (IndexPath *) paths- > head- > next- > data.ptr_value (gdb) p * $p1 $4 = {path = {type = T_IndexPath, pathtype = T_IndexScan, parent = 0x1666a48, pathtarget = 0x166d988, param_info = param_info, 0x0 = 0x0, 0x0 = 0x0 Parallel_workers = 0, rows = 33, startup_cost = 0.285000000000003, total_cost = 116.20657683302848, pathkeys = 0x0}, indexinfo = 0x166e420, indexclauses = 0x166f528, indexquals = 0x166f730, indexqualcols = 0x166f780, indexorderbys = 0x0, indexorderbycols = 0x0, indexscandir = ForwardScanDirection, indextotalcost = 18.205000000000002, indexselectivity = 0.059246954595791879} (gdb) p * $p2 $5 = {path = {type = T_IndexPath, pathtype = T_IndexScan, parent = 0x1666a48, pathtarget = 0x166d988, param_info = 0x0, parallel_aware = false, parallel_safe = parallel_safe, parallel_safe = false Rows = 33, startup_cost = 0.285000000000003, total_cost = 111.33157683302848, pathkeys = 0x0}, indexinfo = 0x1666c58, indexclauses = 0x166fed0, indexquals = 0x166ffc8, indexqualcols = 0x1670018, indexorderbys = 0x0, indexorderbycols = 0x0, indexscandir = ForwardScanDirection, indextotalcost = 13.855, indexselectivity = 0.0556888888888899}

The first element in the paths corresponds to (dwbh > '10000' and dwbh

< '15000') ,第2个元素对应(dwdz between 'DWDZ10000' and 'DWDZ15000') (gdb) set $ri1=(RestrictInfo *)$p1->

Indexclauses- > head- > data.ptr_value (gdb) set $tmp= (RelabelType *) ((OpExpr *) $ri1- > clause)-> args- > head- > data.ptr_value (gdb) p * (Var *) $tmp- > arg$17 = {xpr = {type = T_Var}, varno = 1, varattno = 3, vartype = 1043, vartypmod = 104,100, varlevelsup = 0, varnoold = 1, varoattno = 3 Location = 76} (gdb) p * (Node *) ((OpExpr *) $ri1- > clause)-> args- > head- > next- > data.ptr_value$18 = {type = T_Const} (gdb) p * (Const *) ((OpExpr *) $ri1- > clause)-> args- > head- > next- > data.ptr_value$19 = {xpr = {type = T_Const}, consttype = 25, consttypmod =-1, constcollid = 100, constlen =-1, constvalue = 23636608, constisnull = false, constbyval = false, location = 89}

Start traversing paths, extract clause conditions and detect whether to use all paths of exactly the same set of clauses, leaving only the lowest cost of these paths, which are put into an array for qsort.

(gdb) 1444 npaths = 0; (gdb) 1445 foreach (l, paths) (gdb)

Collect information into the PathClauseUsage array

(gdb) n1471 pathinfoarray [npaths++] = pathinfo; (gdb) 1445 foreach (l, paths) (gdb) 1476 if (npaths = = 1) (gdb) p npaths$26 = 2 (gdb)

Sort by cost

(gdb) n1480 qsort (pathinfoarray, npaths, sizeof (PathClauseUsage *)

Traverse the path to find the AND group with the lowest cost

1488 for (I = 0; I

< npaths; i++)(gdb) n1495 pathinfo = pathinfoarray[i];(gdb) 1496 paths = list_make1(pathinfo->

Path); (gdb) 1497 costsofar = bitmap_scan_cost_est (root, rel, pathinfo- > path); (gdb) 1499 list_copy (pathinfo- > preds))

Get the current cost and set the current conditional clause

(gdb) p costsofar$27 = 89.003250000000008 (gdb) n1498 qualsofar = list_concat (pathinfo- > quals)

Perform AND operation (path overlay), lower cost, adjust current cost and related variables

(gdb) n1531 newcost = bitmap_and_cost_est (root, rel, paths); (gdb) 1532 if (newcost)

< costsofar)(gdb) p newcost$30 = 88.375456720095343(gdb) n1535 costsofar = newcost;(gdb) n1537 list_copy(pathinfo->

Quals); (gdb) 1536 qualsofar = list_concat (qualsofar, (gdb) 1539 list_copy (pathinfo- > preds))

When dealing with the next AND condition, the cost of a single AND condition is higher than that of the previous condition, leaving the original

1488 for (I = 0; I

< npaths; i++)(gdb) 1495 pathinfo = pathinfoarray[i];(gdb) 1496 paths = list_make1(pathinfo->

Path); (gdb) 1497 costsofar = bitmap_scan_cost_est (root, rel, pathinfo- > path); (gdb) 1499 list_copy (pathinfo- > preds)); (gdb) p costsofar$34 = 94.053 250000000006 (gdb) n1498 qualsofar = list_concat (list_copy (pathinfo- > quals), (gdb) 1500 clauseidsofar = bms_copy (pathinfo- > clauseids); (gdb) 1501 lastcell = list_head (paths) / * for quick deletions * / (gdb) 1503 for (j = I + 1; j

< npaths; j++)(gdb) 1553 if (i == 0 || costsofar < bestcost)(gdb) p i$35 = 1(gdb) p costsofar$36 = 94.053250000000006(gdb) p bestcost$37 = 88.375456720095343(gdb) 构建BitmapAndPath,返回 (gdb) n1563 if (list_length(bestpaths) == 1)(gdb) 1565 return (Path *) create_bitmap_and_path(root, rel, bestpaths);(gdb) 1566 } DONE! (gdb) ncreate_index_paths (root=0x1666638, rel=0x1666a48) at indxpath.c:337337 bpath = create_bitmap_heap_path(root, rel, bitmapqual,感谢各位的阅读,以上就是"PostgreSQL物理优化中的create_index_paths->

The content of choose_bitmap_and function analysis, after the study of this article, I believe you have a deeper understanding of the problem of create_index_paths- > choose_bitmap_and function analysis in PostgreSQL physical optimization, and the specific use needs to be verified by practice. here is, the editor will push for you more articles related to knowledge points, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report