PostgreSQL Source Code Interpretation (98): Partitioned Tables #4 (Query Routing #1: "Expanding" the Partitioned Table)

Updated: 2025-02-14. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

When a partitioned table is queried, how does PG determine which partitions to scan, and what mechanism does it use? This section is the first part of the answer; the remaining parts are covered in the following sections.

Zero. Implementation mechanism

Let's first look at an example: a UNION ALL over two ordinary tables, t_normal_1 and t_normal_2:

    drop table if exists t_normal_1;
    drop table if exists t_normal_2;
    create table t_normal_1 (c1 int not null, c2 varchar(40), c3 varchar(40));
    create table t_normal_2 (c1 int not null, c2 varchar(40), c3 varchar(40));
    insert into t_normal_1 (c1, c2, c3) VALUES (0, 'HASH0', 'HASH0');
    insert into t_normal_2 (c1, c2, c3) VALUES (0, 'HASH0', 'HASH0');

    testdb=# explain verbose select * from t_normal_1 where c1 = 0
    testdb-# union all
    testdb-# select * from t_normal_2 where c1 <> 0;
                                       QUERY PLAN
    --------------------------------------------------------------------------------
     Append  (cost=0.00..34.00 rows=350 width=200)
       ->  Seq Scan on public.t_normal_1  (cost=0.00..14.38 rows=2 width=200)
             Output: t_normal_1.c1, t_normal_1.c2, t_normal_1.c3
             Filter: (t_normal_1.c1 = 0)
       ->  Seq Scan on public.t_normal_2  (cost=0.00..14.38 rows=348 width=200)
             Output: t_normal_2.c1, t_normal_2.c2, t_normal_2.c3
             Filter: (t_normal_2.c1 <> 0)
    (7 rows)

For a UNION ALL of two ordinary tables, PG uses the Append operator: the result set of the sequential scan on t_normal_1 and the result set of the sequential scan on t_normal_2 are appended together and returned as the final result set.

A query on a partitioned table uses a similar mechanism: the result sets of the individual partitions are appended together and returned as the final result set, as the following example shows:

    testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2;
                                         QUERY PLAN
    -------------------------------------------------------------------------------------
     Append  (cost=0.00..30.53 rows=6 width=200)
       ->  Seq Scan on public.t_hash_partition_1  (cost=0.00..15.25 rows=3 width=200)
             Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3
             Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2))
       ->  Seq Scan on public.t_hash_partition_3  (cost=0.00..15.25 rows=3 width=200)
             Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3
             Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))
    (7 rows)

Here the partitioned table t_hash_partition is queried with the condition c1 = 1 OR c1 = 2. The execution plan shows that only two partitions are scanned: the result set of the sequential scan on t_hash_partition_1 and the result set of the sequential scan on t_hash_partition_3 are appended together and returned as the final result set.
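The Append behavior described above can be pictured with a minimal C sketch. This is not PostgreSQL's executor code: `ChildScan` and `append_scans` are hypothetical names invented for this illustration, and each child scan is reduced to an array of c1 values. The sketch only shows that Append emits the rows of each child in turn, without merging or reordering them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the result set of one child scan (not a PostgreSQL type). */
typedef struct ChildScan {
    const int *rows;   /* c1 values produced by this child's scan */
    size_t     nrows;  /* number of rows in the child's result set */
} ChildScan;

/*
 * Toy "Append": copy every row of child 0, then child 1, and so on,
 * into out. Returns the number of rows written (at most out_cap).
 */
size_t append_scans(const ChildScan *children, size_t nchildren,
                    int *out, size_t out_cap)
{
    size_t n = 0;

    assert(children != NULL || nchildren == 0);
    for (size_t i = 0; i < nchildren; i++)
        for (size_t j = 0; j < children[i].nrows && n < out_cap; j++)
            out[n++] = children[i].rows[j];
    return n;
}
```

With two children holding {1, 1} and {2}, the call fills the output with the first child's rows followed by the second child's row, which mirrors the two Seq Scan nodes under Append in the plan above.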

Several problems have to be solved to make this work:

1. Recognize the partitioned table and find all of its partitions (child tables).

2. Use the query's constraints to determine which partitions actually need to be scanned; this is done for performance reasons.

3. Append the result sets and return them as the final result.

This section describes how PG recognizes a partitioned table and finds all of its child tables. The implementing function is expand_inherited_tables.
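Before reading the real source, the overall shape of this expansion step can be sketched in a few lines of C. Everything here is a simplified assumption: `ToyRTE`, `ToyRangeTable`, `toy_expand_rtentry`, `toy_expand_tables`, the `nchildren` field, and the child id scheme are hypothetical and do not exist in PostgreSQL. The sketch keeps two properties of the real code: new entries are appended to the same range table while it is being scanned, so the loop bound is captured before the loop starts, and the inh flag is cleared on a parent that turns out to have no children. Unlike the real code, it does not add the parent itself as a member of the set.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RTES 16

/* Hypothetical, heavily simplified range table entry. */
typedef struct ToyRTE {
    int  relid;      /* table identifier */
    bool inh;        /* true: entry stands for a whole inheritance set */
    int  nchildren;  /* number of child tables (toy catalog data) */
} ToyRTE;

typedef struct ToyRangeTable {
    ToyRTE entries[MAX_RTES];
    int    length;
} ToyRangeTable;

/* Append one entry per child table; clear inh if the parent has no children. */
static void toy_expand_rtentry(ToyRangeTable *rt, int idx)
{
    ToyRTE *parent = &rt->entries[idx];

    if (!parent->inh)
        return;
    if (parent->nchildren == 0) {
        parent->inh = false;        /* childless: not an inheritance set */
        return;
    }
    for (int c = 1; c <= parent->nchildren; c++) {
        /* made-up child id scheme, for illustration only */
        ToyRTE child = { parent->relid * 100 + c, false, 0 };

        assert(rt->length < MAX_RTES);
        rt->entries[rt->length++] = child;
    }
}

/* Scan only the entries that existed before expansion started. */
void toy_expand_tables(ToyRangeTable *rt)
{
    int nrtes = rt->length;   /* captured once: the table grows in the loop */

    for (int i = 0; i < nrtes; i++)
        toy_expand_rtentry(rt, i);
}
```

Capturing the length first means the appended child entries are never revisited by the outer loop; in this sketch they are created with inh = false, so revisiting them would be harmless but wasted work.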

I. Data structures

AppendRelInfo

AppendRelInfo holds append-relation information.

When an inheritable table (including a partitioned table) or a UNION ALL subquery is expanded into an "append relation" (essentially a list of child RTEs), an AppendRelInfo is built for each child RTE.

The list of AppendRelInfos indicates which child RTEs must be included when the parent is expanded, and each node carries the information needed to translate Vars that reference the parent into Vars that reference that child.
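The Var translation described above can be illustrated with a small C sketch. It is an analogy under stated assumptions, not the planner's code: `ToyAppendRelInfo`, `toy_find_appinfo`, and `toy_translate_attno` are hypothetical, and the real translated_vars list of Var nodes is reduced to an integer array in which entry N-1 holds the child column number for parent column N (0 standing in for the NULL used for a dropped column).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified analogue of AppendRelInfo (not the real struct). */
typedef struct ToyAppendRelInfo {
    unsigned   parent_relid;      /* range table index of the parent */
    unsigned   child_relid;       /* range table index of the child */
    const int *translated_attno;  /* entry N-1: child column for parent column N; 0 = dropped */
    int        ncols;             /* number of parent columns */
} ToyAppendRelInfo;

/* Scan the flat list for the (at most one) entry describing child_relid. */
const ToyAppendRelInfo *
toy_find_appinfo(const ToyAppendRelInfo *list, size_t n, unsigned child_relid)
{
    for (size_t i = 0; i < n; i++)
        if (list[i].child_relid == child_relid)
            return &list[i];
    return NULL;
}

/* Map a parent column number to the child's; 0 if dropped or out of range. */
int
toy_translate_attno(const ToyAppendRelInfo *info, int parent_attno)
{
    assert(info != NULL);
    if (parent_attno < 1 || parent_attno > info->ncols)
        return 0;
    return info->translated_attno[parent_attno - 1];
}
```

For a child whose first two columns are stored in the opposite order from the parent and whose parent has a dropped third column, the mapping array would be {2, 1, 0}: parent column 1 translates to child column 2, and parent column 3 has no counterpart.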

    /*
     * Append-relation info.
     *
     * When we expand an inheritable table or a UNION-ALL subselect into an
     * "append relation" (essentially, a list of child RTEs), we build an
     * AppendRelInfo for each child RTE.  The list of AppendRelInfos indicates
     * which child RTEs must be included when expanding the parent, and each node
     * carries information needed to translate Vars referencing the parent into
     * Vars referencing that child.
     *
     * These structs are kept in the PlannerInfo node's append_rel_list.
     * Note that we just throw all the structs into one list, and scan the
     * whole list when desiring to expand any one parent.  We could have used
     * a more complex data structure (eg, one list per parent), but this would
     * be harder to update during operations such as pulling up subqueries,
     * and not really any easier to scan.  Considering that typical queries
     * will not have many different append parents, it doesn't seem worthwhile
     * to complicate things.
     *
     * Note: after completion of the planner prep phase, any given RTE is an
     * append parent having entries in append_rel_list if and only if its
     * "inh" flag is set.  We clear "inh" for plain tables that turn out not
     * to have inheritance children, and (in an abuse of the original meaning
     * of the flag) we set "inh" for subquery RTEs that turn out to be
     * flattenable UNION ALL queries.  This lets us avoid useless searches
     * of append_rel_list.
     *
     * Note: the data structure assumes that append-rel members are single
     * baserels.  This is OK for inheritance, but it prevents us from pulling
     * up a UNION ALL member subquery if it contains a join.  While that could
     * be fixed with a more complex data structure, at present there's not much
     * point because no improvement in the plan could result.
     */
    typedef struct AppendRelInfo
    {
        NodeTag     type;

        /*
         * These fields uniquely identify this append relationship.  There can be
         * (in fact, always should be) multiple AppendRelInfos for the same
         * parent_relid, but never more than one per child_relid, since a given
         * RTE cannot be a child of more than one append parent.
         */
        Index       parent_relid;   /* RT index of append parent rel */
        Index       child_relid;    /* RT index of append child rel */

        /*
         * For an inheritance appendrel, the parent and child are both regular
         * relations, and we store their rowtype OIDs here for use in translating
         * whole-row Vars.  For a UNION-ALL appendrel, the parent and child are
         * both subqueries with no named rowtype, and we store InvalidOid here.
         */
        Oid         parent_reltype; /* OID of parent's composite type */
        Oid         child_reltype;  /* OID of child's composite type */

        /*
         * The N'th element of this list is a Var or expression representing the
         * child column corresponding to the N'th column of the parent.  This is
         * used to translate Vars referencing the parent rel into references to
         * the child.  A list element is NULL if it corresponds to a dropped
         * column of the parent (this is only possible for inheritance cases, not
         * UNION ALL).  The list elements are always simple Vars for inheritance
         * cases, but can be arbitrary expressions in UNION ALL cases.
         *
         * Notice we only store entries for user columns (attno > 0).  Whole-row
         * Vars are special-cased, and system columns (attno < 0) need no special
         * translation since their attnos are the same for all tables.
         *
         * Caution: the Vars have varlevelsup = 0.  Be careful to adjust as needed
         * when copying into a subquery.
         */
        List       *translated_vars;    /* Expressions in the child's Vars */

        /*
         * We store the parent table's OID here for inheritance, or InvalidOid for
         * UNION ALL.  This is only needed to help in generating error messages if
         * an attempt is made to reference a dropped parent column.
         */
        Oid         parent_reloid;  /* OID of parent relation */
    } AppendRelInfo;

PlannerInfo

This data structure stores the information associated with a query statement during planning/optimization.

    /*----------
     * PlannerInfo
     *      Per-query information for planning/optimization
     *
     * This struct is conventionally called "root" in all the planner routines.
     * It holds links to all of the planner's working state, in addition to the
     * original Query.  Note that at present the planner extensively modifies
     * the passed-in Query data structure; someday that should stop.
     *----------
     */
    struct AppendRelInfo;

    typedef struct PlannerInfo
    {
        NodeTag     type;

        Query      *parse;          /* the Query being planned */

        PlannerGlobal *glob;        /* global info for current planner run */

        Index       query_level;    /* 1 at the outermost Query */

        /* for a subplan, points to the parent planner's root */
        struct PlannerInfo *parent_root;    /* NULL at outermost Query */

        /*
         * plan_params contains the expressions that this query level needs to
         * make available to a lower query level that is currently being planned.
         * outer_params contains the paramIds of PARAM_EXEC Params that outer
         * query levels will make available to this query level.
         */
        List       *plan_params;    /* list of PlannerParamItems, see below */
        Bitmapset  *outer_params;

        /*
         * simple_rel_array holds pointers to "base rels" and "other rels" (see
         * comments for RelOptInfo for more info).  It is indexed by rangetable
         * index (so entry 0 is always wasted).  Entries can be NULL when an RTE
         * does not correspond to a base relation, such as a join RTE or an
         * unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
         */
        struct RelOptInfo **simple_rel_array;   /* All 1-rel RelOptInfos */
        int         simple_rel_array_size;  /* allocated size of array */

        /*
         * simple_rte_array is the same length as simple_rel_array and holds
         * pointers to the associated rangetable entries.  This lets us avoid
         * rt_fetch(), which can be a bit slow once large inheritance sets have
         * been expanded.
         */
        RangeTblEntry **simple_rte_array;   /* rangetable as an array */

        /*
         * append_rel_array is the same length as the above arrays, and holds
         * pointers to the corresponding AppendRelInfo entry indexed by
         * child_relid, or NULL if none.  The array itself is not allocated if
         * append_rel_list is empty.
         */
        /* used for set operations such as UNION ALL and for partitioned tables */
        struct AppendRelInfo **append_rel_array;

        /*
         * all_baserels is a Relids set of all base relids (but not "other"
         * relids) in the query; that is, the Relids identifier of the final join
         * we need to form.  This is computed in make_one_rel, just before we
         * start making Paths.
         */
        Relids      all_baserels;

        /*
         * nullable_baserels is a Relids set of base relids that are nullable by
         * some outer join in the jointree; these are rels that are potentially
         * nullable below the WHERE clause, SELECT targetlist, etc.  This is
         * computed in deconstruct_jointree.
         */
        Relids      nullable_baserels;

        /*
         * join_rel_list is a list of all join-relation RelOptInfos we have
         * considered in this planning run.  For small problems we just scan the
         * list to do lookups, but when there are many join relations we build a
         * hash table for faster lookups.  The hash table is present and valid
         * when join_rel_hash is not NULL.  Note that we still maintain the list
         * even when using the hash table for lookups; this simplifies life for
         * GEQO.
         */
        List       *join_rel_list;  /* list of join-relation RelOptInfos */
        struct HTAB *join_rel_hash; /* optional hashtable for join relations */

        /*
         * When doing a dynamic-programming-style join search, join_rel_level[k]
         * is a list of all join-relation RelOptInfos of level k, and
         * join_cur_level is the current level.  New join-relation RelOptInfos are
         * automatically added to the join_rel_level[join_cur_level] list.
         * join_rel_level is NULL if not in use.
         */
        List      **join_rel_level; /* lists of join-relation RelOptInfos */
        int         join_cur_level; /* index of list being extended */

        List       *init_plans;     /* init SubPlans for query */

        List       *cte_plan_ids;   /* per-CTE-item list of subplan IDs */

        List       *multiexpr_params;   /* List of Lists of Params for MULTIEXPR
                                         * subquery outputs */

        List       *eq_classes;     /* list of active EquivalenceClasses */

        List       *canon_pathkeys; /* list of "canonical" PathKeys */

        List       *left_join_clauses;  /* list of RestrictInfos for mergejoinable
                                         * outer join clauses w/nonnullable var on
                                         * left */

        List       *right_join_clauses; /* list of RestrictInfos for mergejoinable
                                         * outer join clauses w/nonnullable var on
                                         * right */

        List       *full_join_clauses;  /* list of RestrictInfos for mergejoinable
                                         * full join clauses */

        List       *join_info_list; /* list of SpecialJoinInfos */

        List       *append_rel_list;    /* list of AppendRelInfos */

        List       *rowMarks;       /* list of PlanRowMarks */

        List       *placeholder_list;   /* list of PlaceHolderInfos */

        List       *fkey_list;      /* list of ForeignKeyOptInfos */

        List       *query_pathkeys; /* desired pathkeys for query_planner() */

        List       *group_pathkeys; /* groupClause pathkeys, if any */
        List       *window_pathkeys;    /* pathkeys of bottom window, if any */
        List       *distinct_pathkeys;  /* distinctClause pathkeys, if any */
        List       *sort_pathkeys;  /* sortClause pathkeys, if any */

        List       *part_schemes;   /* Canonicalised partition schemes used in the
                                     * query. */

        List       *initial_rels;   /* RelOptInfos we are now trying to join */

        /* Use fetch_upper_rel() to get any particular upper rel */
        List       *upper_rels[UPPERREL_FINAL + 1]; /* upper-rel RelOptInfos */

        /* Result tlists chosen by grouping_planner for upper-stage processing */
        struct PathTarget *upper_targets[UPPERREL_FINAL + 1];

        /*
         * grouping_planner passes back its final processed targetlist here, for
         * use in relabeling the topmost tlist of the finished Plan.
         */
        List       *processed_tlist;

        /* Fields filled during create_plan() for use in setrefs.c */
        AttrNumber *grouping_map;   /* for GroupingFunc fixup */
        List       *minmax_aggs;    /* List of MinMaxAggInfos */

        MemoryContext planner_cxt;  /* context holding PlannerInfo */

        double      total_table_pages;  /* # of pages in all tables of query */

        double      tuple_fraction; /* tuple_fraction passed to query_planner */
        double      limit_tuples;   /* limit_tuples passed to query_planner */

        Index       qual_security_level;    /* minimum security_level for quals */
        /* Note: qual_security_level is zero if there are no securityQuals */

        InheritanceKind inhTargetKind;  /* indicates if the target relation is an
                                         * inheritance child or partition or a
                                         * partitioned table */
        bool        hasJoinRTEs;    /* true if any RTEs are RTE_JOIN kind */
        bool        hasLateralRTEs; /* true if any RTEs are marked LATERAL */
        bool        hasDeletedRTEs; /* true if any RTE was deleted from jointree */
        bool        hasHavingQual;  /* true if havingQual was non-null */
        bool        hasPseudoConstantQuals; /* true if any
                                             * RestrictInfo has
                                             * pseudoconstant = true */
        bool        hasRecursion;   /* true if planning a recursive WITH item */

        /* These fields are used only when hasRecursion is true: */
        int         wt_param_id;    /* PARAM_EXEC ID for the work table */
        struct Path *non_recursive_path;    /* a path for non-recursive term */

        /* These fields are workspace for createplan.c */
        Relids      curOuterRels;   /* outer rels above current node */
        List       *curOuterParams; /* not-yet-assigned NestLoopParams */

        /* optional private data for join_search_hook, e.g., GEQO */
        void       *join_search_private;

        /* Does this query modify any partition key columns? */
        bool        partColsUpdated;
    } PlannerInfo;

II. Source code interpretation

The function expand_inherited_tables expands each range table entry that represents an inheritance set into an "append relation".

    /*
     * expand_inherited_tables
     *      Expand each rangetable entry that represents an inheritance set
     *      into an "append relation".  At the conclusion of this process,
     *      the "inh" flag is set in all and only those RTEs that are append
     *      relation parents.
     */
    void
    expand_inherited_tables(PlannerInfo *root)
    {
        Index       nrtes;
        Index       rti;
        ListCell   *rl;

        /*
         * expand_inherited_rtentry may add RTEs to parse->rtable.  The function is
         * expected to recursively handle any RTEs that it creates with inh=true.
         * So just scan as far as the original end of the rtable list.
         */
        nrtes = list_length(root->parse->rtable);
        rl = list_head(root->parse->rtable);
        for (rti = 1; rti <= nrtes; rti++)
        {
            RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);

            expand_inherited_rtentry(root, rte, rti);
            rl = lnext(rl);
        }
    }

    /*
     * expand_inherited_rtentry
     *      Check whether a rangetable entry represents an inheritance set.
     *      If so, add entries for all the child tables to the query's
     *      rangetable, and build AppendRelInfo nodes for all the child tables
     *      and add them to root->append_rel_list.  If not, clear the entry's
     *      "inh" flag to prevent later code from looking for AppendRelInfos.
     *
     * Note that the original RTE is considered to represent the whole
     * inheritance set.  The first of the generated RTEs is an RTE for the same
     * table, but with inh = false, to represent the parent table in its role
     * as a simple member of the inheritance set.
     *
     * A childless table is never considered to be an inheritance set.  For
     * regular inheritance, a parent RTE must always have at least two associated
     * AppendRelInfos: one corresponding to the parent table as a simple member of
     * inheritance set and one or more corresponding to the actual children.
     * Since a partitioned table is not scanned, it might have only one associated
     * AppendRelInfo.
     */
    static void
    expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
    {
        Oid         parentOID;
        PlanRowMark *oldrc;
        Relation    oldrelation;
        LOCKMODE    lockmode;
        List       *inhOIDs;
        ListCell   *l;

        /* Does RT entry allow inheritance? */
        if (!rte->inh)
            return;
        /* Ignore any already-expanded UNION ALL nodes */
        if (rte->rtekind != RTE_RELATION)
        {
            Assert(rte->rtekind == RTE_SUBQUERY);
            return;
        }
        /* Fast path for common case of childless table */
        parentOID = rte->relid;
        if (!has_subclass(parentOID))
        {
            /* Clear flag before returning */
            rte->inh = false;
            return;
        }

        /*
         * The rewriter should already have obtained an appropriate lock on each
         * relation named in the query.  However, for each child relation we add
         * to the query, we must obtain an appropriate lock, because this will be
         * the first use of those relations in the parse/rewrite/plan pipeline.
         * Child rels should use the same lockmode as their parent.
         */
        lockmode = rte->rellockmode;

        /* Scan for all members of inheritance set, acquire needed locks */
        inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);

        /*
         * Check that there's at least one descendant, else treat as no-child
         * case.  This could happen despite above has_subclass() check, if table
         * once had a child but no longer does.
         */
        if (list_length(inhOIDs) < 2)
        {
            /* Clear flag before returning */
            rte->inh = false;
            return;
        }

        /*
         * If parent relation is selected FOR UPDATE/SHARE, we need to mark its
         * PlanRowMark as isParent = true, and generate a new PlanRowMark for each
         * child.
         */
        oldrc = get_plan_rowmark(root->rowMarks, rti);
        if (oldrc)
            oldrc->isParent = true;

        /*
         * Must open the parent relation to examine its tupdesc.  We need not lock
         * it; we assume the rewriter already did.
         */
        oldrelation = heap_open(parentOID, NoLock);

        /* Scan the inheritance set and expand it */
        if (RelationGetPartitionDesc(oldrelation) != NULL)
        {
            /* a partition descriptor exists: this is a partitioned table */
            Assert(rte->relkind == RELKIND_PARTITIONED_TABLE);

            /*
             * If this table has partitions, recursively expand them in the order
             * in which they appear in the PartitionDesc.  While at it, also
             * extract the partition key columns of all the partitioned tables.
             */
            expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
                                       lockmode, &root->append_rel_list);
        }
        else
        {
            /* no partition descriptor: plain inheritance */
            List       *appinfos = NIL;
            RangeTblEntry *childrte;
            Index       childRTindex;

            /*
             * This table has no partitions.  Expand any plain inheritance
             * children in the order the OIDs were returned by
             * find_all_inheritors.
             */
            foreach(l, inhOIDs)
            {
                Oid         childOID = lfirst_oid(l);
                Relation    newrelation;

                /* Open rel if needed; we already have required locks */
                if (childOID != parentOID)
                    newrelation = heap_open(childOID, NoLock);
                else
                    newrelation = oldrelation;

                /*
                 * It is possible that the parent table has children that are temp
                 * tables of other backends.  We cannot safely access such tables
                 * (because of buffering issues), and the best thing to do seems
                 * to be to silently ignore them.
                 */
                if (childOID != parentOID && RELATION_IS_OTHER_TEMP(newrelation))
                {
                    heap_close(newrelation, lockmode);
                    continue;
                }

                expand_single_inheritance_child(root, rte, rti, oldrelation, oldrc,
                                                newrelation,
                                                &appinfos, &childrte,
                                                &childRTindex);

                /* Close child relations, but keep locks */
                if (childOID != parentOID)
                    heap_close(newrelation, NoLock);
            }

            /*
             * If all the children were temp tables, pretend it's a
             * non-inheritance situation; we don't need Append node in that case.
             * The duplicate RTE we added for the parent table is harmless, so we
             * don't bother to get rid of it; ditto for the useless PlanRowMark
             * node.
             */
            if (list_length(appinfos) < 2)
                rte->inh = false;
            else
                root->append_rel_list = list_concat(root->append_rel_list,
                                                    appinfos);
        }

        heap_close(oldrelation, NoLock);
    }

    /*
     * expand_partitioned_rtentry
     *      Recursively expand an RTE for a partitioned table.
     */
    static void
    expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte,
                               Index parentRTindex, Relation parentrel,
                               PlanRowMark *top_parentrc, LOCKMODE lockmode,
                               List **appinfos)
    {
        int         i;
        RangeTblEntry *childrte;
        Index       childRTindex;
        PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);

        check_stack_depth();

        /* A partitioned table should always have a partition descriptor. */
        Assert(partdesc);

        Assert(parentrte->inh);

        /*
         * Note down whether any partition key cols are being updated.  Though it's
         * the root partitioned table's updatedCols we are interested in, we
         * instead use parentrte to get the updatedCols.  This is convenient
         * because parentrte already has the root partrel's updatedCols translated
         * to match the attribute ordering of parentrel.
         */
        if (!root->partColsUpdated)
            root->partColsUpdated =
                has_partition_attrs(parentrel, parentrte->updatedCols, NULL);

        /* First expand the partitioned table itself. */
        expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel,
                                        top_parentrc, parentrel,
                                        appinfos, &childrte, &childRTindex);

        /*
         * If the partitioned table has no partitions, treat this as the
         * non-inheritance case.
         */
        if (partdesc->nparts == 0)
        {
            parentrte->inh = false;
            return;
        }

        for (i = 0; i < partdesc->nparts; i++)
        [...]

    [...] < 2)
    (gdb) p inhOIDs
    $7 = (List *) 0x28fd208
    (gdb) p *inhOIDs
    $8 = {type = T_OidList, length = 7, head = 0x28fd1e0, tail = 0x28fd778}
    (gdb)

expand_inherited_rtentry -> open the relation

    (gdb) n
    1584        if (oldrc)
    (gdb)
    1591        oldrelation = heap_open(parentOID, NoLock);

expand_inherited_rtentry -> the partition descriptor is obtained successfully, so expand_partitioned_rtentry is called

    (gdb)
    1594        if (RelationGetPartitionDesc(oldrelation) != NULL)
    (gdb)
    1596            Assert(rte->relkind == RELKIND_PARTITIONED_TABLE);
    (gdb)
    1603            expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
    (gdb)

expand_inherited_rtentry -> enter expand_partitioned_rtentry

    (gdb) step
    expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980,
        top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1684
    1684        PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);

expand_partitioned_rtentry -> get the partition descriptor

    1684        PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
    (gdb) n
    1686        check_stack_depth();
    (gdb) p *partdesc
    $9 = {nparts = 6, oids = 0x298e4f8, boundinfo = 0x298e530}

expand_partitioned_rtentry -> perform the sanity checks

(gdb) n
1689        Assert(partdesc);
(gdb)
1691        Assert(parentrte->inh);
(gdb)
1700        if (!root->partColsUpdated)
(gdb)
1702                has_partition_attrs(parentrel, parentrte->updatedCols, NULL);
(gdb)
1701            root->partColsUpdated =
(gdb)
1705        expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel,
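Lines 1700 to 1702 of the trace set root->partColsUpdated only if it is not already set, by asking has_partition_attrs whether any updated column belongs to the partition key. The idea can be sketched as below. This is a simplified illustration, not PostgreSQL's implementation: `DemoColSet`, `demo_col`, `demo_has_partition_attrs` and `demo_note_part_cols_updated` are hypothetical names, a 64-bit mask replaces the Bitmapset, column numbers are assumed to lie in 1..63, and partition keys that are expressions are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit k set means column number k is in the set (assumed encoding). */
typedef uint64_t DemoColSet;

/* Build a one-column set; out-of-range column numbers yield the empty set. */
static DemoColSet
demo_col(int attno)
{
    return (attno >= 1 && attno <= 63) ? ((DemoColSet) 1 << attno) : 0;
}

/* Does any partition key column appear in cols? */
static bool
demo_has_partition_attrs(const int *partattrs, int partnatts, DemoColSet cols)
{
    for (int i = 0; i < partnatts; i++)
    {
        if (cols & demo_col(partattrs[i]))
            return true;
    }
    return false;
}

/*
 * The flag is "sticky": once some level of the partition tree has reported
 * a partition key update, later levels do not recompute it.
 */
static bool
demo_note_part_cols_updated(bool already_updated,
                            const int *partattrs, int partnatts,
                            DemoColSet updated_cols)
{
    if (!already_updated)
        already_updated = demo_has_partition_attrs(partattrs, partnatts,
                                                   updated_cols);
    return already_updated;
}
```

For the SELECT traced in this article, updatedCols is 0x0, which corresponds to an empty `updated_cols` set here, so the flag stays false.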

expand_partitioned_rtentry -> first expand the partitioned table itself; enter expand_single_inheritance_child

(gdb) step
expand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, childrel=0x7f4e66827980, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4) at prepunion.c:1778
1778        Query      *parse = root->parse;

expand_single_inheritance_child -> initialize the child RTE (childrte)

(gdb) n
1779        Oid         parentOID = RelationGetRelid(parentrel);
(gdb)
1780        Oid         childOID = RelationGetRelid(childrel);
(gdb)
1797        childrte = copyObject(parentrte);
(gdb) p parentOID
$10 = 16986
(gdb) p childOID
$11 = 16986
(gdb) n
1798        *childrte_p = childrte;
(gdb)
1799        childrte->relid = childOID;
(gdb)
1800        childrte->relkind = childrel->rd_rel->relkind;
(gdb)
1802        if (childOID != parentOID &&
(gdb)
1806            childrte->inh = false;
(gdb)
1807        childrte->requiredPerms = 0;
(gdb)
1808        childrte->securityQuals = NIL;
(gdb)
1809        parse->rtable = lappend(parse->rtable, childrte);
(gdb)
1810        childRTindex = list_length(parse->rtable);
(gdb)
1811        *childRTindex_p = childRTindex;
(gdb) p *childrte    --> relid = 16986, i.e. still the partitioned table itself
$12 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16986, relkind = 112 'p', tablesample = 0x0, subquery = 0x0, security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, coltypmods = 0x0, colcollations = 0x0, ..., lateral = false, inh = false, inFromCl = true, requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fd898, insertedCols = 0x0, updatedCols = 0x0, securityQuals = 0x0}
(gdb) p *childRTindex_p
$13 = 0
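Lines 1797 to 1811 of the trace copy the parent RTE, point the copy at the child, and append it to the range table, whose new length becomes the child's index. A minimal sketch of that step follows. It is an illustration under assumptions rather than PostgreSQL code: `DemoRTE`, `DemoRangeTable` and `demo_add_child_rte` are hypothetical names, and a fixed-size array replaces the List.

```c
#include <stdbool.h>

#define DEMO_MAX_RTES 16

/* Hypothetical, trimmed-down range-table entry. */
typedef struct DemoRTE
{
    unsigned int relid;          /* relation OID */
    char         relkind;        /* 'p' = partitioned table, 'r' = ordinary table */
    bool         inh;            /* does this entry still need expansion? */
    unsigned int requiredPerms;  /* permission bits to check */
} DemoRTE;

typedef struct DemoRangeTable
{
    DemoRTE entries[DEMO_MAX_RTES];
    int     length;
} DemoRangeTable;

/*
 * Copy the parent entry, retarget it at the child, append it, and return
 * the child's 1-based range-table index (0 on a bad index or a full table).
 */
static int
demo_add_child_rte(DemoRangeTable *rt, int parent_index,
                   unsigned int child_oid, char child_relkind)
{
    const DemoRTE *parent;
    DemoRTE        child;

    if (rt->length >= DEMO_MAX_RTES ||
        parent_index < 1 || parent_index > rt->length)
        return 0;

    parent = &rt->entries[parent_index - 1];
    child = *parent;                 /* plays the role of copyObject(parentrte) */
    child.relid = child_oid;
    child.relkind = child_relkind;
    /* Only a partitioned child other than the parent needs more expansion. */
    child.inh = (child_oid != parent->relid && child_relkind == 'p');
    child.requiredPerms = 0;         /* permissions are checked on the original entry */

    rt->entries[rt->length++] = child;
    return rt->length;               /* like list_length(parse->rtable) */
}
```

Starting from a range table that holds only the partitioned table at index 1, the copy of the table itself lands at index 2 and the first partition at index 3, which agrees with child_relid = 3 in the AppendRelInfo printed later in the trace.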

expand_single_inheritance_child -> finish expanding the partitioned table itself and return to expand_partitioned_rtentry

(gdb) p *childRTindex_p
$13 = 0
(gdb) n
1820        if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
(gdb)
1855        if (top_parentrc)
(gdb)
1881    }
(gdb)
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1713
1713        if (partdesc->nparts == 0)

expand_partitioned_rtentry -> start iterating over the partitions in the partition descriptor

1713        if (partdesc->nparts == 0)
(gdb) n
1719        for (i = 0; i < partdesc->nparts; i++)
(gdb)
1721            Oid         childOID = partdesc->oids[i];
(gdb)
1725            childrel = heap_open(childOID, NoLock);
(gdb)
1732            if (RELATION_IS_OTHER_TEMP(childrel))
(gdb)
1735            expand_single_inheritance_child(root, parentrte, parentRTindex,
(gdb) p childOID
$14 = 16989
--
testdb=# select relname from pg_class where oid=16989;
      relname
--------------------
 t_hash_partition_1
(1 row)
--

expand_single_inheritance_child -> enter expand_single_inheritance_child again

(gdb) step
expand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, childrel=0x7f4e668306a0, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4) at prepunion.c:1778
1778        Query      *parse = root->parse;

expand_single_inheritance_child -> start building the AppendRelInfo

...
1820        if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)
(gdb)
1822            appinfo = makeNode(AppendRelInfo);
(gdb) p *childrte
$17 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16989, relkind = 114 'r', tablesample = 0x0, subquery = 0x0, security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, coltypmods = 0x0, colcollations = 0x0, ..., requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fdbc8, insertedCols = 0x0, updatedCols = 0x0, securityQuals = 0x0}
(gdb) p *childrte->relkind
Cannot access memory at address 0x72
(gdb) p childrte->relkind
$18 = 114 'r'
(gdb) p childrte->inh
$19 = false

expand_single_inheritance_child -> after it is built, inspect the AppendRelInfo structure

(gdb) n
1823            appinfo->parent_relid = parentRTindex;
(gdb)
1824            appinfo->child_relid = childRTindex;
(gdb)
1825            appinfo->parent_reltype = parentrel->rd_rel->reltype;
(gdb)
1826            appinfo->child_reltype = childrel->rd_rel->reltype;
(gdb)
1827            make_inh_translation_list(parentrel, childrel, childRTindex,
(gdb)
1829            appinfo->parent_reloid = parentOID;
(gdb)
1830            *appinfos = lappend(*appinfos, appinfo);
(gdb)
1841        if (childOID != parentOID)
(gdb)
1843            childrte->selectedCols = translate_col_privs(parentrte->selectedCols,
(gdb)
1845            childrte->insertedCols = translate_col_privs(parentrte->insertedCols,
(gdb)
1847            childrte->updatedCols = translate_col_privs(parentrte->updatedCols,
(gdb)
1855        if (top_parentrc)
(gdb) p *appinfo
$20 = {type = T_AppendRelInfo, parent_relid = 1, child_relid = 3, parent_reltype = 16988, child_reltype = 16991, translated_vars = 0x28fdc90, parent_reloid = 16986}
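The two passes through line 1820 behave differently: for the copy of the partitioned table itself (relkind 'p', inh = false) the trace jumps straight to line 1855, while for the partition t_hash_partition_1 (relkind 'r') an AppendRelInfo is built. A rough sketch of that decision is shown here. The names `DemoAppendRelInfo`, `demo_needs_appinfo` and `demo_maybe_add_appinfo` are hypothetical, and only three of the real structure's fields are modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the fields printed for $20 in the trace. */
typedef struct DemoAppendRelInfo
{
    int          parent_relid;   /* range-table index of the parent */
    int          child_relid;    /* range-table index of the child */
    unsigned int parent_reloid;  /* OID of the parent relation */
} DemoAppendRelInfo;

/*
 * Same shape as the test at line 1820: skip only a partitioned table
 * whose inh flag is false.
 */
static bool
demo_needs_appinfo(char child_relkind, bool child_inh)
{
    return child_relkind != 'p' || child_inh;
}

/* Append a parent/child link when one is needed; returns the new count. */
static size_t
demo_maybe_add_appinfo(DemoAppendRelInfo *infos, size_t n, size_t cap,
                       int parent_relid, int child_relid,
                       unsigned int parent_reloid,
                       char child_relkind, bool child_inh)
{
    if (!demo_needs_appinfo(child_relkind, child_inh) || n >= cap)
        return n;

    infos[n] = (DemoAppendRelInfo) {parent_relid, child_relid, parent_reloid};
    return n + 1;
}
```

Each stored element links one child index back to its parent index, which is the information the planner later needs to build the Append over the partitions.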

expand_single_inheritance_child -> the call completes and returns

(gdb)
1855        if (top_parentrc)
(gdb) p *appinfo
$20 = {type = T_AppendRelInfo, parent_relid = 1, child_relid = 3, parent_reltype = 16988, child_reltype = 16991, translated_vars = 0x28fdc90, parent_reloid = 16986}
(gdb) n
1881    }
(gdb)
expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1740
1740            if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)

expand_inherited_rtentry -> finish the expand_partitioned_rtentry call and return to expand_inherited_rtentry

(gdb) finish
Run till exit from #0  expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:1740
0x00000000007e55e3 in expand_inherited_rtentry (root=0x28fcdc8, rte=0x28d83d0, rti=1) at prepunion.c:1603
1603            expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,
(gdb)

expand_inherited_rtentry -> finish the expand_inherited_rtentry call and return to expand_inherited_tables

(gdb) n
1665        heap_close(oldrelation, NoLock);
(gdb)
1666    }
(gdb)
expand_inherited_tables (root=0x28fcdc8) at prepunion.c:1490
1490            rl = lnext(rl);
(gdb)
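Back in expand_inherited_tables, the loop advances to the next range-table entry. Because expansion appends child entries to the same range table, the outer loop has to stop somewhere. The sketch below assumes it stops at the range table's original length (this is how I recall the PostgreSQL 11 source; the trace itself cuts off the loop condition). `DemoRTE`, `DemoRangeTable`, `demo_expand_entry` and `demo_expand_all` are hypothetical names, and the child OIDs are made up.

```c
#include <stdbool.h>

#define DEMO_MAX_RTES 32

/* Hypothetical range-table entry: nchildren is how many children to append. */
typedef struct DemoRTE
{
    unsigned int relid;
    bool         inh;
    int          nchildren;
} DemoRTE;

typedef struct DemoRangeTable
{
    DemoRTE entries[DEMO_MAX_RTES];
    int     length;
} DemoRangeTable;

/* Stand-in for the per-entry expansion: appends the entry's children. */
static void
demo_expand_entry(DemoRangeTable *rt, int rti)
{
    DemoRTE parent = rt->entries[rti - 1];

    if (!parent.inh)
        return;

    for (int i = 0; i < parent.nchildren && rt->length < DEMO_MAX_RTES; i++)
    {
        /* Made-up child OIDs, for illustration only. */
        DemoRTE child = {parent.relid + 1 + (unsigned int) i, false, 0};

        rt->entries[rt->length++] = child;
    }
}

/* Expand every original entry once; returns how many entries were visited. */
static int
demo_expand_all(DemoRangeTable *rt)
{
    int nrtes = rt->length;   /* snapshot taken before any child is appended */
    int visited = 0;

    for (int rti = 1; rti <= nrtes; rti++)
    {
        demo_expand_entry(rt, rti);
        visited++;
    }
    return visited;
}
```

Taking the snapshot first means the entries appended during expansion are never revisited by the outer loop; any child that needs further expansion is expected to be handled recursively by the expansion step itself.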

expand_inherited_tables -> finish the expand_inherited_tables call and return to subquery_planner

(gdb) n
1485        for (rti = 1; rti ...
... hasHavingQual = (parse->havingQual != NULL);
(gdb)

DONE!

IV. Reference materials

Parallel Append implementation

Partition Elimination in PostgreSQL 11
