How PostgreSQL builds access paths for append relation 07/02 Update SLTechnology News&Howtos

How PostgreSQL builds access paths for append relation

2025-07-02 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article is about how PostgreSQL builds access paths for append relation. The editor thinks it is very practical, so share it with you as a reference and follow the editor to have a look.

I. data structure

AppendRelInfo

When we expand an inheritable table (partitioned table) or a UNION- all subquery into an "append relationship" (essentially a linked list of child RTE), build an AppendRelInfo for each child RTE.

/ * Append-relation info. * Append-relation information. * When we expand an inheritable table or a UNION-ALL subselect into an * "append relation" (essentially, a list of child RTEs), we build an * AppendRelInfo for each child RTE. The list of AppendRelInfos indicates * which child RTEs must be included when expanding the parent, and each node * carries information needed to translate Vars referencing the parent into * Vars referencing that child. * when we expand an inheritable table (partition table) or UNION- all subquery to an "append relationship" (essentially a linked list of child RTE), * build an AppendRelInfo for each child RTE. The AppendRelInfos linked list indicates which child rte must be included when expanding the parent node, and * each node has all the information needed to convert the Vars that references the parent node to the Vars that references the child node. * * These structs are kept in the PlannerInfo node's append_rel_list. * Note that we just throw all the structs into one list, and scan the * whole list when desiring to expand any one parent. We could have used * a more complex data structure (eg, one list per parent), but this would * be harder to update during operations such as pulling up subqueries, * and not really any easier to scan. Considering that typical queries * will not have many different append parents, it doesn't seem worthwhile * to complicate things. * these structures are stored in the append_rel_list of the PlannerInfo node. * Note that you just put all the structures in a linked list and scan the entire linked list when you want to expand any parent classes. * A more complex data structure (for example, one list per parent node) could have been used, * but it would be more difficult to update it in operations such as extracting child queries, and it would not actually be easier to scan. * considering that a typical query doesn't have many different additions, it doesn't seem worth complicating things. * * Note: after completion of the planner prep phase, any given RTE is an * append parent having entries in append_rel_list if and only if its * "inh" flag is set. We clear "inh" for plain tables that turn out not * to have inheritance children, and (in an abuse of the original meaning * of the flag) we set "inh" for subquery RTEs that turn out to be * flattenable UNION ALL queries. This lets us avoid useless searches * of append_rel_list. * Note: after the plan preparation phase is completed, * if and only if its "inh" flag is set, the given RTE is an entry of an append parent in the append_rel_list. * We clear the "inh" tag for flat tables without child, and * set the "inh" flag for the subquery RTEs in the UNION ALL query (suspected of abusing the tag). * this avoids useless searches for append_rel_list. * * Note: the data structure assumes that append-rel members are single * baserels. This is OK for inheritance, but it prevents us from pulling * up a UNION ALL member subquery if it contains a join. While that could * be fixed with a more complex data structure, at present there's not much * point because no improvement in the plan could result. * Note: the data structure assumes that the additional rel members are independent baserels. This is fine for inheritance but if the UNION ALL membership subquery contains a join then it will prevent us from extracting the UNION ALL membership subquery. * although this problem can be solved with a more complex data structure, it does not make much sense at the moment, as there may not be any improvement in the plan. * / typedef struct AppendRelInfo {NodeTag type; / * * These fields uniquely identify this append relationship. There can be * (in fact, always should be) multiple AppendRelInfos for the same * parent_relid, but never more than one per child_relid, since a given * RTE cannot be a child of more than one append parent. * these fields uniquely identify the append relationship. * there can be (and should always be) multiple AppendRelInfos for the same parent_relid, * but each child_relid cannot have more than one AppendRelInfos, * because a given RTE cannot be a child of multiple append parent. * / Index parent_relid; / * parent rel RT index; RT index of append parent rel * / Index child_relid; / * child rel RT index; RT index of append child rel * / * * For an inheritance appendrel, the parent and child are both regular * relations, and we store their rowtype OIDs here for use in translating * whole-row Vars. For a UNION-ALL appendrel, the parent and child are * both subqueries with no named rowtype, and we store InvalidOid here. * for inheriting appendrel, the parent and subclasses are normal. * We store their rowtype OIDs here to transform the whole-row Vars. * for UNION-ALL appendrel, both the parent query and the subquery are subqueries with no specified row type. * We store InvalidOid here. * / Oid parent_reltype; / * OID of parent's composite type * / Oid child_reltype; / * OID of child's composite type * / / * The N'th element of this list is a Var or expression representing the * child column corresponding to the N'th column of the parent. This is * used to translate Vars referencing the parent rel into references to * the child. A list element is NULL if it corresponds to a dropped * column of the parent (this is only possible for inheritance cases, not * UNION ALL). The list elements are always simple Vars for inheritance * cases, but can be arbitrary expressions in UNION ALL cases. * the Nth element of this list is a Var or expression that represents the child column corresponding to the Nth column of the parent element. * this is used to convert a Vars that references a parent rel into a reference to a child rel. * if the linked list element corresponds to the deleted column of the parent element, the element is NULL * (this only applies to inheritance cases, not UNION ALL). * in the case of inheritance, linked list elements are always simple variables, but can be any expression in the case of UNION ALL. * * Notice we only store entries for user columns (attno > 0). Whole-row * Vars are special-cased, and system columns (attno

< 0) need no special * translation since their attnos are the same for all tables. * 注意，我们只存储用户列的条目(attno >

0). * Whole-row Vars is case sensitive, system column (attno)

< 0)不需要特别的转换， * 因为它们的attno对所有表都是相同的。 * * Caution: the Vars have varlevelsup = 0. Be careful to adjust as needed * when copying into a subquery. * 注意:Vars的varlevelsup = 0。 * 在将数据复制到子查询时，要注意根据需要进行调整。 */ //child's Vars中的表达式 List *translated_vars; /* Expressions in the child's Vars */ /* * We store the parent table's OID here for inheritance, or InvalidOid for * UNION ALL. This is only needed to help in generating error messages if * an attempt is made to reference a dropped parent column. * 我们将父表的OID存储在这里用于继承， * 如为UNION ALL,则这里存储的是InvalidOid。 * 只有在试图引用已删除的父列时，才需要这样做来帮助生成错误消息。 */ Oid parent_reloid; /* OID of parent relation */} AppendRelInfo; RelOptInfo 规划器/优化器使用的关系信息结构体参见PostgreSQL 源码解读（99）- 分区表#5（数据查询路由#2-RelOptInfo数据结构）二、源码解读 set_rel_pathlist函数为基础关系构建访问路径. //是否DUMMY访问路径#define IS_DUMMY_PATH(p) \ (IsA((p), AppendPath) && ((AppendPath *) (p))->

< root->

Simple_rel_array_size); component = root- > simple_rel_ Array [relid]; Assert (component- > part_scheme! = NULL); Assert (list_length (component- > partitioned_child_rels) > = 1); partrels = list_concat (partrels, list_copy (component- > partitioned_child_rels)) } partitioned_rels = list_make1 (partrels);} Assert (list_length (partitioned_rels) > = 1);} / * * For every non-dummy child, remember the cheapest path. Also, identify * all pathkeys (orderings) and parameterizations (required_outer sets) * available for the non-dummy member relations. * for each non-virtual sub-relationship, record the access path with the lowest cost. * at the same time, identify all path keys (sort) and available parameterized information (required_outer collection) for each non-virtual membership * / foreach (l, live_childrels) / / traverse {RelOptInfo * childrel = lfirst (l); ListCell * lcp; Path * cheapest_partial_path = NULL / * For UNION ALLs with non-empty partitioned_child_rels, accumulate * the Lists of child relations. * for UNION ALLs operations containing non-empty partitioned_child_rels, * accumulate sub-relational linked list * / if (rel- > rtekind = = RTE_SUBQUERY & & childrel- > partitioned_child_rels! = NIL) partitioned_rels = lappend (partitioned_rels, childrel- > partitioned_child_rels) / * If child has an unparameterized cheapest-total path, add that to * the unparameterized Append path we are constructing for the parent. * If not, there's no workable unparameterized path. * if the child relationship has a non-parameterized access path with the lowest total cost, * add this path to the non-parameterized Append access path we built for the parent relationship. * * With partitionwise aggregates, the child rel's pathlist may be * empty, so don't assume that a path exists here. * with partitionwise aggregation, the access path list of child relationships may be empty and it cannot be assumed that there is an access path * / if (childrel- > pathlist! = NIL & & childrel- > cheapest_total_path- > param_info = = NULL) accumulate_append_subpath (childrel- > cheapest_total_path, & subpaths, NULL) Else subpaths_valid = false; / * Same idea, but for a partial plan. * / in the same way, deal with some plans in parallel processing if (childrel- > partial_pathlist! = NIL) {cheapest_partial_path = linitial (childrel- > partial_pathlist); accumulate_append_subpath (cheapest_partial_path, & partial_subpaths, NULL) } else partial_subpaths_valid = false; / * * Same idea, but for a parallel append mixing partial and non-partial * paths. * similarly, handle parallel append mixed parallel / non-parallel access paths * / if (pa_subpaths_valid) {Path * nppath = NULL; nppath = get_cheapest_parallel_safe_total_inner (childrel- > pathlist) If (cheapest_partial_path = = NULL & & nppath = = NULL) {/ * Neither a partial nor a parallel-safe path? Forget it. * / / is not a partial path or a parallel secure path. Skip pa_subpaths_valid = false;} else if (nppath = = NULL | | (cheapest_partial_path! = NULL & & cheapest_partial_path- > total_cost)

< nppath->

Total_cost) (gdb) 1497 (cheapest_partial_path! = NULL & & (gdb) 1501 Assert (cheapest_partial_path! = NULL); (gdb) 1502 accumulate_append_subpath (cheapest_partial_path)

Collects all available sorting and parameterized path linked lists for child relationships.

(gdb) 1534 foreach (lcp, childrel- > pathlist) (gdb) (gdb) n1536 Path * childpath = (Path *) lfirst (lcp); (gdb) 1537 List * childkeys = childpath- > pathkeys; (gdb) 1538 Relids childouter = PATH_REQ_OUTER (childpath) (gdb) 1541 if (childkeys! = NIL) (gdb) 1567 if (childouter) (gdb) 1534 foreach (lcp, childrel- > pathlist)

Proceed to the next sub-relationship and complete the process.

(gdb) 1441 foreach (l, live_childrels) (gdb) 1598 if (subpaths_valid)

For example, an unsorted / unparameterized Append access path is constructed for non-parameterized access paths with child relationships.

(gdb) n1599 add_path (rel, (Path *) create_append_path (root, rel, subpaths, NIL, (gdb) p * rel- > pathlistCannot access memory at address 0x0 (gdb) n1607 if (partial_subpaths_valid) (gdb) p * rel- > pathlist$22 = {type = T_List, length = 1, head = 0x1fd0230, tail = 0x1fd0230}

Try unsorted / unparameterized partial Append access paths. If possible, build the parallel-aware access path.

(gdb) 1641 appendpath = create_append_path (root, rel, NIL, partial_subpaths, (gdb) 1650 partial_rows = appendpath- > path.rows; (gdb) 1653 add_partial_path (rel, (Path *) appendpath); (gdb)

Use mixed partial and non-partial parallel append.

1662 if (pa_subpaths_valid & & pa_nonpartial_subpaths! = NIL) (gdb)

Based on the collected subpath keys, build a non-parametric MergeAppend access path

1701 if (subpaths_valid) (gdb) 1702 generate_mergeappend_paths (root, rel, live_childrels)

Complete the call

(gdb) 1719 foreach (l, all_child_outers) (gdb) 1757} (gdb) set_append_rel_pathlist (root=0x1f1cdb8, rel=0x1fc1800, rti=1, rte=0x1efa3d0) at allpaths.c:13541354} (gdb) p * rel- > pathlist$23 = {type = T_List, length = 1, head = 0x1fd0230 Tail = 0x1fd0230} (gdb) (gdb) p * (Node *) rel- > pathlist- > head- > data.ptr_value$24 = {type = T_AppendPath} (AppendPath *) rel- > pathlist- > head- > data.ptr_value$25 = {path = {type = T_AppendPath, pathtype = T_Append, parent = 0x1fc1800, pathtarget = 0x1fc1a38, param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 6, startup_cost = 0, total_cost = 30.5300000000001, total_cost = pathkeys} Partitioned_rels = 0x1fd01f8, subpaths = 0x1fcffc8, first_partial_path = 2}

End the call

(gdb) nset_rel_pathlist (root=0x1f1cdb8, rel=0x1fc1800, rti=1, rte=0x1efa3d0) at allpaths.c:495495 if (rel- > reloptkind = = RELOPT_BASEREL & & (gdb) 496bms_membership (root- > all_baserels)! = BMS_SINGLETON) (gdb) 495if (rel- > reloptkind = = RELOPT_BASEREL & & (gdb) 504if (set_rel_pathlist_hook) (gdb) 508set_cheapest (rel); (gdb) 513} (gdb) Thank you for reading! This is the end of the article on "how PostgreSQL builds access paths for append relation". I hope the above content can be of some help to you, so that you can learn more knowledge. if you think the article is good, you can share it for more people to see!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.