2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how simplehash is used in PostgreSQL's aggregate function implementation. The material is short and easy to follow, so let's work through it step by step.
// src/backend/executor/execGrouping.c
#define SH_HASH_KEY(tb, key) TupleHashTableHash(tb, key)        // SH_HASH_KEY --> TupleHashTableHash
#define SH_EQUAL(tb, a, b) TupleHashTableMatch(tb, a, b) == 0   // SH_EQUAL --> TupleHashTableMatch

I. Data structures
TupleHashTable
Hash table definition
typedef struct TupleHashTableData *TupleHashTable;

typedef struct TupleHashTableData
{
    tuplehash_hash *hashtab;        /* underlying hash table */
    int         numCols;            /* number of columns in lookup key */
    AttrNumber *keyColIdx;          /* attr numbers of key columns */
    FmgrInfo   *tab_hash_funcs;     /* hash functions for table datatype(s) */
    ExprState  *tab_eq_func;        /* comparator for table datatype(s) */
    MemoryContext tablecxt;         /* memory context containing table */
    MemoryContext tempcxt;          /* context for function evaluations */
    Size        entrysize;          /* actual size to make each hash entry */
    TupleTableSlot *tableslot;      /* slot for referencing table entries */
    /* The following fields are set transiently for each table search: */
    TupleTableSlot *inputslot;      /* current input tuple's slot */
    FmgrInfo   *in_hash_funcs;      /* hash functions for input datatype(s) */
    ExprState  *cur_eq_func;        /* comparator for input vs. table */
    uint32      hash_iv;            /* hash-function IV */
    ExprContext *exprcontext;       /* expression context */
} TupleHashTableData;

typedef tuplehash_iterator TupleHashIterator;

/* type definitions */
// hash table type definition
typedef struct SH_TYPE  // tuplehash_hash
{
    /*
     * Size of data / bucket array, 64 bits to handle UINT32_MAX sized hash
     * tables.  Note that the maximum number of elements is lower
     * (SH_MAX_FILLFACTOR)
     */
    uint64      size;

    /* how many elements have valid contents */
    uint32      members;

    /* mask for bucket and size calculations, based on size */
    uint32      sizemask;

    /* boundary after which to grow hashtable */
    uint32      grow_threshold;

    /* hash buckets */
    SH_ELEMENT_TYPE *data;

    /* memory context to use for allocations */
    MemoryContext ctx;

    /* user defined data, useful for callbacks */
    void       *private_data;
} SH_TYPE;  // is actually tuplehash_hash
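The relationship between size, sizemask and grow_threshold can be checked against the values that show up later in the gdb session (size = 256, sizemask = 255, grow_threshold = 230). The following is a minimal sketch, not the simplehash.h code: DemoTable, demo_set_size and demo_initial_bucket are hypothetical names, and the 0.9 fill factor is an assumption chosen because it reproduces the traced threshold.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fill factor; chosen because 256 * 0.9 truncates to the traced 230. */
#define DEMO_FILLFACTOR 0.9

/* Hypothetical stand-in for the size-related fields of SH_TYPE. */
typedef struct DemoTable
{
    uint64_t size;            /* number of buckets, a power of two */
    uint32_t sizemask;        /* size - 1, folds a hash into a bucket index */
    uint32_t grow_threshold;  /* member count after which the table grows */
} DemoTable;

/* Derive the mask and the growth threshold from a power-of-two size. */
static void
demo_set_size(DemoTable *tb, uint64_t size)
{
    assert(size >= 2 && (size & (size - 1)) == 0);  /* must be a power of two */
    assert(size <= UINT32_MAX);                     /* keep the mask in 32 bits */

    tb->size = size;
    tb->sizemask = (uint32_t) (size - 1);
    tb->grow_threshold = (uint32_t) ((double) size * DEMO_FILLFACTOR);
}

/* Map a 32-bit hash value to its initial bucket. */
static uint32_t
demo_initial_bucket(const DemoTable *tb, uint32_t hash)
{
    return hash & tb->sizemask;
}
```

Because the size is always a power of two, the bucket index is a single AND with sizemask instead of a modulo, which is why the struct stores the mask next to the size.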
TupleHashEntryData
Hash table entry
typedef struct TupleHashEntryData *TupleHashEntry;
typedef struct TupleHashTableData *TupleHashTable;

typedef struct TupleHashEntryData
{
    MinimalTuple firstTuple;    /* copy of first tuple in this group */
    void       *additional;     /* user data */
    uint32      status;         /* hash status (see SH_STATUS) */
    uint32      hash;           /* hash value (cached) */
} TupleHashEntryData;

typedef enum SH_STATUS
{
    SH_STATUS_EMPTY = 0x00,
    SH_STATUS_IN_USE = 0x01
} SH_STATUS;
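To see why each entry carries a status word and a cached hash, here is a small open-addressing sketch. It uses plain linear probing over a fixed-size array, which is a deliberate simplification and not the probing scheme of simplehash.h; DemoEntry, demo_hash and demo_insert are hypothetical names, and an int key stands in for firstTuple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SIZE 8u    /* power of two, so DEMO_SIZE - 1 works as a mask */

typedef enum { DEMO_EMPTY = 0x00, DEMO_IN_USE = 0x01 } DemoStatus;

/* Hypothetical entry: key, per-group payload, status and cached hash. */
typedef struct DemoEntry
{
    int      key;       /* stands in for firstTuple */
    long     sum;       /* stands in for the "additional" per-group state */
    uint32_t status;    /* DEMO_EMPTY or DEMO_IN_USE */
    uint32_t hash;      /* cached hash of key */
} DemoEntry;

/* Multiplicative hash, for demonstration only. */
static uint32_t
demo_hash(int key)
{
    return (uint32_t) key * 2654435761u;
}

/*
 * Find the entry for key, creating it if absent.  *found reports whether the
 * entry already existed.  Returns NULL if every slot is taken by other keys.
 */
static DemoEntry *
demo_insert(DemoEntry *data, int key, int *found)
{
    uint32_t hash = demo_hash(key);
    uint32_t bucket = hash & (DEMO_SIZE - 1);

    for (uint32_t probes = 0; probes < DEMO_SIZE; probes++)
    {
        DemoEntry *e = &data[bucket];

        if (e->status == DEMO_EMPTY)
        {
            /* free slot: claim it and remember the hash for later probes */
            e->key = key;
            e->sum = 0;
            e->status = DEMO_IN_USE;
            e->hash = hash;
            *found = 0;
            return e;
        }

        /* compare the cached hash first, the key only if the hashes agree */
        if (e->hash == hash && e->key == key)
        {
            *found = 1;
            return e;
        }

        bucket = (bucket + 1) & (DEMO_SIZE - 1);
    }
    return NULL;
}
```

Caching the hash means a stored entry never has to be rehashed during probing, which fits the remark in the TupleHashTableHash comments that the "tuple already stored in the table" branch never actually runs.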
MinimalTuple
Minimal tuple definition
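The definition that follows pads the struct so that t_infomask2 sits at the same offset modulo MAXIMUM_ALIGNOF as in a full heap tuple header. As a rough check of that arithmetic, the sketch below plugs in two assumed values, MAXIMUM_ALIGNOF = 8 and offsetof(HeapTupleHeaderData, t_infomask2) = 18; both are assumptions for a typical 64-bit build and are not read from the PostgreSQL headers. All DEMO_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; not taken from PostgreSQL headers. */
#define DEMO_MAXIMUM_ALIGNOF    8   /* assumed MAXIMUM_ALIGNOF */
#define DEMO_INFOMASK2_OFFSET   18  /* assumed offsetof(HeapTupleHeaderData, t_infomask2) */

/* Same arithmetic as MINIMAL_TUPLE_OFFSET and MINIMAL_TUPLE_PADDING. */
#define DEMO_MINIMAL_TUPLE_OFFSET \
    ((DEMO_INFOMASK2_OFFSET - sizeof(uint32_t)) / DEMO_MAXIMUM_ALIGNOF * DEMO_MAXIMUM_ALIGNOF)
#define DEMO_MINIMAL_TUPLE_PADDING \
    ((DEMO_INFOMASK2_OFFSET - sizeof(uint32_t)) % DEMO_MAXIMUM_ALIGNOF)

/* Hypothetical mirror of the fixed-size part of MinimalTupleData. */
struct DemoMinimalTuple
{
    uint32_t t_len;                                   /* length word */
    char     mt_padding[DEMO_MINIMAL_TUPLE_PADDING];  /* alignment padding */
    uint16_t t_infomask2;
    uint16_t t_infomask;
    uint8_t  t_hoff;
};

/* Offset of t_infomask2 in the minimal layout, modulo the alignment. */
static size_t
demo_minimal_infomask2_mod(void)
{
    return offsetof(struct DemoMinimalTuple, t_infomask2) % DEMO_MAXIMUM_ALIGNOF;
}
```

Under these assumptions the padding is 6 bytes and t_infomask2 lands at offset 10, which is congruent to 18 modulo 8, so data after the header aligns the same way in both layouts.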
/*
 * MinimalTuple is an alternative representation that is used for transient
 * tuples inside the executor, in places where transaction status information
 * is not required, the tuple rowtype is known, and shaving off a few bytes
 * is worthwhile because we need to store many tuples.  The representation
 * is chosen so that tuple access routines can work with either full or
 * minimal tuples via a HeapTupleData pointer structure.  The access routines
 * see no difference, except that they must not access the transaction status
 * or t_ctid fields because those aren't there.
 *
 * For the most part, MinimalTuples should be accessed via TupleTableSlot
 * routines.  These routines will prevent access to the "system columns"
 * and thereby prevent accidental use of the nonexistent fields.
 *
 * MinimalTupleData contains a length word, some padding, and fields matching
 * HeapTupleHeaderData beginning with t_infomask2. The padding is chosen so
 * that offsetof(t_infomask2) is the same modulo MAXIMUM_ALIGNOF in both
 * structs.   This makes data alignment rules equivalent in both cases.
 *
 * When a minimal tuple is accessed via a HeapTupleData pointer, t_data is
 * set to point MINIMAL_TUPLE_OFFSET bytes before the actual start of the
 * minimal tuple --- that is, where a full tuple matching the minimal tuple's
 * data would start.  This trick is what makes the structs seem equivalent.
 *
 * Note that t_hoff is computed the same as in a full tuple, hence it includes
 * the MINIMAL_TUPLE_OFFSET distance.  t_len does not include that, however.
 *
 * MINIMAL_TUPLE_DATA_OFFSET is the offset to the first useful (non-pad) data
 * other than the length word.  tuplesort.c and tuplestore.c use this to avoid
 * writing the padding to disk.
 */
#define MINIMAL_TUPLE_OFFSET \
    ((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) / MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_PADDING \
    ((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) % MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_DATA_OFFSET \
    offsetof(MinimalTupleData, t_infomask2)

struct MinimalTupleData
{
    uint32      t_len;          /* actual length of minimal tuple */

    char        mt_padding[MINIMAL_TUPLE_PADDING];

    /* Fields below here must match HeapTupleHeaderData! */

    uint16      t_infomask2;    /* number of attributes + various flags */

    uint16      t_infomask;     /* various flag bits, see below */

    uint8       t_hoff;         /* sizeof header incl. bitmap, padding */

    /* ^ - 23 bytes - ^ */

    bits8       t_bits[FLEXIBLE_ARRAY_MEMBER];  /* bitmap of NULLs */

    /* MORE DATA FOLLOWS AT END OF STRUCT */
};

/* typedef appears in htup.h */
#define SizeofMinimalTupleHeader offsetof(MinimalTupleData, t_bits)

typedef struct MinimalTupleData MinimalTupleData;
typedef MinimalTupleData *MinimalTuple;

II. Source code interpretation
TupleHashTableHash
TupleHashTableHash computes the hash value of a tuple, that is, of the values in its grouping columns.
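Before reading the source, the per-column combining step can be sketched in isolation. The source comment states that the key starts from hash_iv and is rotated left by one bit at each column; folding each column's hash in with XOR and letting NULL columns contribute nothing are assumptions made for this sketch, and any final scrambling of the result is left out. demo_rotl1 and demo_combine are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by one bit. */
static uint32_t
demo_rotl1(uint32_t v)
{
    return (v << 1) | (v >> 31);
}

/*
 * Combine per-column hashes into one key, starting from the seed iv.
 * col_hashes[i] is the hash of column i; is_null[i] != 0 marks a NULL
 * column, which is skipped (assumption of this sketch).
 */
static uint32_t
demo_combine(uint32_t iv, const uint32_t *col_hashes, const int *is_null, int ncols)
{
    uint32_t hashkey = iv;

    assert(ncols >= 0);
    for (int i = 0; i < ncols; i++)
    {
        /* rotate first, so the column position influences the result */
        hashkey = demo_rotl1(hashkey);
        if (!is_null[i])
            hashkey ^= col_hashes[i];   /* assumed mixing step: XOR */
    }
    return hashkey;
}
```

The rotation is what makes the result depend on column order: without it, XOR alone would give (a, b) and (b, a) the same key.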
/*
 * Compute the hash value for a tuple
 *
 * The passed-in key is a pointer to TupleHashEntryData.  In an actual hash
 * table entry, the firstTuple field points to a tuple (in MinimalTuple
 * format).  LookupTupleHashEntry sets up a dummy TupleHashEntryData with a
 * NULL firstTuple field --- that cues us to look at the inputslot instead.
 * This convention avoids the need to materialize virtual input tuples unless
 * they actually need to get copied into the table.
 *
 * Also, the caller must select an appropriate memory context for running
 * the hash functions. (dynahash.c doesn't change CurrentMemoryContext.)
 */
static uint32
TupleHashTableHash(struct tuplehash_hash *tb, const MinimalTuple tuple)
{
    // tuple hash table
    TupleHashTable hashtable = (TupleHashTable) tb->private_data;
    // number of key columns
    int         numCols = hashtable->numCols;
    // attribute numbers of the key columns
    AttrNumber *keyColIdx = hashtable->keyColIdx;
    // hash key, seeded with the IV
    uint32      hashkey = hashtable->hash_iv;
    // tuple slot
    TupleTableSlot *slot;
    // hash function pointers
    FmgrInfo   *hashfunctions;
    int         i;

    if (tuple == NULL)  // the tuple is NULL
    {
        /* Process the current input tuple for the table */
        slot = hashtable->inputslot;
        hashfunctions = hashtable->in_hash_funcs;
    }
    else
    {
        /*
         * Process a tuple already stored in the table.
         *
         * (this case never actually occurs due to the way simplehash.h is
         * used, as the hash-value is stored in the entries)
         */
        slot = hashtable->tableslot;
        // store the MinimalTuple in the slot
        ExecStoreMinimalTuple(tuple, slot, false);
        hashfunctions = hashtable->tab_hash_funcs;
    }

    for (i = 0; i < numCols; i++)
    {
        // loop over the key columns
        // get the attribute number
        AttrNumber  att = keyColIdx[i];
        Datum       attr;       // attribute value
        bool        isNull;     // is it NULL?

        /* rotate hashkey left 1 bit at each step */
        // rotate left by one bit at each step
        hashkey = (hashkey [...]

    [...] private_data;
    ExprContext *econtext = hashtable->exprcontext;

    /*
     * We assume that simplehash.h will only ever call us with the first
     * argument being an actual table entry, and the second argument being
     * LookupTupleHashEntry's dummy TupleHashEntryData.  The other direction
     * could be supported too, but is not currently required.
     */
    Assert(tuple1 != NULL);
    slot1 = hashtable->tableslot;
    ExecStoreMinimalTuple(tuple1, slot1, false);
    Assert(tuple2 == NULL);
    slot2 = hashtable->inputslot;

    /* For crosstype comparisons, the inputslot must be first */
    econtext->ecxt_innertuple = slot2;
    econtext->ecxt_outertuple = slot1;
    return !ExecQualAndReset(hashtable->cur_eq_func, econtext);
}
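The match function returns the negated result of the equality qual, so 0 means "the tuples match", and the SH_EQUAL macro at the top of the article turns that back into a boolean with == 0. A minimal sketch of the same convention follows, using strings in place of tuples; demo_match and DEMO_EQUAL are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical match callback in the memcmp style:
 * returns 0 when the two keys are equal, nonzero otherwise.
 */
static int
demo_match(const char *a, const char *b)
{
    int equal = (strcmp(a, b) == 0);    /* plays the role of the equality qual */

    return !equal;                      /* mirrors "return !ExecQualAndReset(...)" */
}

/* Adapter in the style of SH_EQUAL: map the 0-means-equal result to a boolean. */
#define DEMO_EQUAL(a, b) (demo_match((a), (b)) == 0)
```

For example, DEMO_EQUAL("bh01", "bh01") is true while DEMO_EQUAL("bh01", "bh02") is false. The sketch wraps the macro body in parentheses so it can be negated or combined safely inside a larger expression.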
Test script
-- disable parallel
set max_parallel_workers_per_gather=0;
select bh,avg(c1),min(c1),max(c2) from t_agg_simple group by bh;
Tracking and analysis
(gdb) b TupleHashTableHash
Breakpoint 1 at 0x6d3b2b: file execGrouping.c, line 379.
(gdb) b TupleHashTableMatch
Breakpoint 2 at 0x6d3c79: file execGrouping.c, line 446.
(gdb)
(gdb) c
Continuing.

Breakpoint 1, TupleHashTableHash (tb=0x2dd2720, tuple=0x0) at execGrouping.c:379
379  TupleHashTable hashtable = (TupleHashTable) tb->private_data;
(gdb)
Input parameters
(gdb) p *tb
$1 = {size = 256, members = 0, sizemask = 255, grow_threshold = 230, data = 0x2ddca00, ctx = 0x2db5310, private_data = 0x2dd2890}
(gdb) p *tb->data
$2 = {firstTuple = 0x0, additional = 0x0, status = 0, hash = 0}
Get the number of grouped columns
(gdb) n
380  int numCols = hashtable->numCols;
(gdb) p *hashtable
$3 = {hashtab = 0x2dd2720, numCols = 1, keyColIdx = 0x2dd2680, tab_hash_funcs = 0x2db72d0, tab_eq_func = 0x2ddea18, tablecxt = 0x2dcc370, tempcxt = 0x2db7320, entrysize = 24, tableslot = 0x2dd2928, inputslot = 0x2db7238, in_hash_funcs = 0x2db72d0, cur_eq_func = 0x2ddea18, hash_iv = 0, exprcontext = 0x2ddf338}
(gdb) p tb->private_data
$4 = (void *) 0x2dd2890
Get grouped column attribute number
(gdb) n
381  AttrNumber *keyColIdx = hashtable->keyColIdx;
(gdb)
382  uint32 hashkey = hashtable->hash_iv;
(gdb) p *keyColIdx
$5 = 1
If the input tuple is NULL, set the slot and hash functions
(gdb) n
387  if (tuple == NULL)
(gdb) p hashkey
$6 = 0
(gdb) n
390  slot = hashtable->inputslot;
(gdb)
391  hashfunctions = hashtable->in_hash_funcs;
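The trace shows the convention described in the function's header comment: the key passed to the hash callback is NULL (tuple=0x0), and the callback reads the pending input from the table's private state instead. A minimal sketch of that pattern, with an int standing in for the input tuple; DemoState, demo_hash_key and demo_lookup_hash are hypothetical names, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical private state: the pending input lives here, not in the key. */
typedef struct DemoState
{
    const int *input;   /* current input value, set before each lookup */
} DemoState;

/* Multiplicative hash, for demonstration only. */
static uint32_t
demo_hash_int(int v)
{
    return (uint32_t) v * 2654435761u;
}

/*
 * Hash callback.  key == NULL means "hash the pending input held in state";
 * a non-NULL key is a value already stored in the table.
 */
static uint32_t
demo_hash_key(DemoState *state, const int *key)
{
    const int *src = (key == NULL) ? state->input : key;

    assert(src != NULL);
    return demo_hash_int(*src);
}

/* Lookup entry point: stash the input, then hand the callback a NULL key. */
static uint32_t
demo_lookup_hash(DemoState *state, const int *input)
{
    state->input = input;
    return demo_hash_key(state, NULL);
}
```

The point of the indirection is that the input never has to be copied into the table's storage format just to be hashed or compared; a copy is made only when a new group is actually created.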
Start traversing the grouped column
Get hashkey
(gdb) n
406  for (i = 0; i < numCols; i++)
(gdb) p numCols
$8 = 1
(gdb) n
408  AttrNumber att = keyColIdx[i];
(gdb)
413  hashkey = (hashkey [...]
[...] private_data;
(gdb)
Input parameters
(gdb) p *tb
$18 = {size = 256, members = 1, sizemask = 255, grow_threshold = 230, data = 0x2ddca00, ctx = 0x2db5310, private_data = 0x2dd2890}
(gdb) p *tuple1
$19 = {t_len = 21, mt_padding = "\000\000\000\000\000", t_infomask2 = 1, t_infomask = 2, t_hoff = 24 '\030', t_bits = 0x2dcc497 ""}
Compare for a match
(gdb) n
447  ExprContext *econtext = hashtable->exprcontext;
(gdb)
455  Assert(tuple1 != NULL);
(gdb)
456  slot1 = hashtable->tableslot;
(gdb)
457  ExecStoreMinimalTuple(tuple1, slot1, false);
(gdb)
458  Assert(tuple2 == NULL);
(gdb)
459  slot2 = hashtable->inputslot;
(gdb)
462  econtext->ecxt_innertuple = slot2;
(gdb)
463  econtext->ecxt_outertuple = slot1;
(gdb)
464  return !ExecQualAndReset(hashtable->cur_eq_func, econtext);
(gdb) p hashtable->cur_eq_func
$20 = (ExprState *) 0x2ddea18
(gdb) p *hashtable->cur_eq_func
$21 = {tag = {type = T_ExprState}, flags = 7 '\a', resnull = false, resvalue = 0, resultslot = 0x0, steps = 0x2ddeab0, evalfunc = 0x6cd882, expr = 0x0, evalfunc_private = 0x6cb43e, steps_len = 7, steps_alloc = 16, parent = 0x0, ext_params = 0x0, innermost_caseval = 0x0, innermost_casenull = 0x0, innermost_domainval = 0x0, innermost_domainnull = 0x0}
Return value
$22 = true
(gdb) n
465  }
(gdb)
tuplehash_insert (tb=0x2dd2720, key=0x0, found=0x7fff585be487) at ../src/include/lib/simplehash.h:556
556  Assert(entry->status == SH_STATUS_IN_USE);
(gdb)

Thank you for reading. That covers how simplehash is used in PostgreSQL's aggregate function implementation. The details are best confirmed by stepping through the code yourself.