2025-03-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
HJ_BUILD_HASHTABLE -> The startup cost of the outer node is lower than the total cost of building the hash table, and the outer relation is still assumed to be empty (node->hj_OuterNotEmpty is initialized to false), so try to fetch the first tuple of the outer relation. If it is NULL, return NULL right away; otherwise set node->hj_OuterNotEmpty to true.
else if (HJ_FILL_OUTER(node) ||
(gdb)
260 !node->hj_OuterNotEmpty))
(gdb)
259 (outerNode->plan->startup_cost < hashNode->ps.plan->total_cost &&
(gdb)
262 node->hj_FirstOuterTupleSlot = ExecProcNode(outerNode);
(gdb)
263 if (TupIsNull(node->hj_FirstOuterTupleSlot))
(gdb)
269 node->hj_OuterNotEmpty = true;
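The branch traced above can be restated as a small stand-alone predicate. This is only a sketch: the function name and its plain boolean/double parameters are hypothetical stand-ins for HJ_FILL_OUTER(node), the two plan costs and node->hj_OuterNotEmpty, and the fill_inner guard is an assumption suggested by the `else if` in the trace (an earlier branch must exist), not something the trace itself shows.

```c
#include <stdbool.h>

/*
 * Hypothetical restatement of the condition stepped through at lines 259-260.
 * Returns true when the first outer tuple would be fetched before the hash
 * table is built (non-parallel case only).
 */
bool
should_prefetch_outer(bool fill_inner, bool fill_outer,
                      double outer_startup_cost, double hash_total_cost,
                      bool outer_not_empty)
{
    /* Assumption: an inner-filling join must build the hash table anyway. */
    if (fill_inner)
        return false;

    /*
     * An outer-filling join always checks; otherwise the check only pays off
     * when starting the outer side is cheaper than building the hash table
     * and the outer side is not already known to be non-empty.
     */
    return fill_outer ||
           (outer_startup_cost < hash_total_cost && !outer_not_empty);
}
```

In the traced run the join fills neither side, the outer startup cost is the smaller one and hj_OuterNotEmpty is false, so the predicate is true and ExecProcNode(outerNode) is called at line 262.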
HJ_BUILD_HASHTABLE -> Create the hash table
(gdb) n
263 if (TupIsNull(node->hj_FirstOuterTupleSlot))
(gdb)
281 HJ_FILL_INNER(node));
(gdb)
279 hashtable = ExecHashTableCreate(hashNode,
(gdb)
HJ_BUILD_HASHTABLE -> In-memory structure of the hash table (the HashJoinTable struct)
There are 16384 (16K) buckets, and the base-2 logarithm of that count is 14 (the value of log2_nbuckets and log2_nbuckets_optimal).
skewEnabled is false, so the skew optimization is not enabled.
(gdb) p *hashtable
$14 = {nbuckets = 16384, log2_nbuckets = 14, nbuckets_original = 16384, nbuckets_optimal = 16384, log2_nbuckets_optimal = 14, buckets = {unshared = 0x2fb1260, shared = 0x2fb1260}, keepNulls = false, skewEnabled = false, skewBucket = 0x0, skewBucketLen = 0, nSkewBuckets = 0, skewBucketNums = 0x0, nbatch = 1, curbatch = 0, nbatch_original = 1, nbatch_outstart = 1, growEnabled = true, totalTuples = 0, partialTuples = 0, skewTuples = 0, innerBatchFile = 0x0, outerBatchFile = 0x0, outer_hashfunctions = ..., inner_hashfunctions = 0x3053bc0, hashStrict = 0x3053c18, spaceUsed = 0, spaceAllowed = 16777216, spacePeak = 0, spaceUsedSkew = 0, spaceAllowedSkew = 335544, hashCxt = 0x3053a50, batchCxt = 0x2f8b170, chunks = 0x0, current_chunk = 0x0, area = 0x0, parallel_state = 0x0, batches = 0x0, current_chunk_shared = 9187201950435737471}
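Two of the numbers in this dump can be checked by hand. The sketch below uses hypothetical helper names; the 2 percent skew share is an assumption inferred from the ratio of spaceAllowedSkew to spaceAllowed in the dump, not a value the trace states.

```c
#include <stddef.h>

/* Return log2(n) for a power-of-two n, such as a bucket count. */
int
log2_pow2(unsigned int n)
{
    int bits = 0;

    while (n > 1)
    {
        n >>= 1;
        bits++;
    }
    return bits;
}

/*
 * Memory budget for skew buckets as an integer percentage of the total
 * allowance. The percentage is passed in because it is an assumption here.
 */
size_t
skew_space_allowed(size_t space_allowed, unsigned int percent)
{
    return space_allowed * percent / 100;
}
```

With nbuckets = 16384 the first helper gives 14, matching log2_nbuckets, and with spaceAllowed = 16777216 (16 MB) and 2 percent the second gives 335544, matching spaceAllowedSkew.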
HJ_BUILD_HASHTABLE -> Hash functions used
(gdb) p *hashtable->inner_hashfunctions
$15 = {fn_addr = 0x4c8a0a, fn_oid = 400, fn_nargs = 1, fn_strict = true, fn_retset = false, fn_stats = 2 '\002', fn_extra = 0x0, fn_mcxt = 0x3053a50, fn_expr = 0x0}
(gdb) p *hashtable->outer_hashfunctions
$16 = {fn_addr = 0x4c8a0a, fn_oid = 400, fn_nargs = 1, fn_strict = true, fn_retset = false, fn_stats = 2 '\002', fn_extra = 0x0, fn_mcxt = 0x3053a50, fn_expr = 0x0}
HJ_BUILD_HASHTABLE -> Assign the hash table to the Hash node and execute that node; the resulting total number of tuples is 10000
(gdb) n
289 hashNode->hashtable = hashtable;
(gdb)
290 (void) MultiExecProcNode((PlanState *) hashNode);
(gdb)
297 if (hashtable->totalTuples == 0 && !HJ_FILL_OUTER(node))
(gdb) p hashtable->totalTuples
$18 = 10000
HJ_BUILD_HASHTABLE -> The number of batches is 1, so only one batch needs to be processed.
(gdb) n
304 hashtable->nbatch_outstart = hashtable->nbatch;
(gdb) p hashtable->nbatch
$19 = 1
HJ_BUILD_HASHTABLE -> Reset hj_OuterNotEmpty to false
(gdb) n
311 node->hj_OuterNotEmpty = false;
(gdb)
313 if (parallel)
HJ_BUILD_HASHTABLE -> Non-parallel execution; switch the state to HJ_NEED_NEW_OUTER
(gdb)
313 if (parallel)
(gdb) n
340 node->hj_JoinState = HJ_NEED_NEW_OUTER;
HJ_NEED_NEW_OUTER -> Fetch the next tuple of the outer relation (by calling ExecHashJoinOuterGetTuple)
349 if (parallel)
(gdb) n
354 outerTupleSlot =
(gdb)
357 if (TupIsNull(outerTupleSlot))
(gdb) p *outerTupleSlot
$20 = {type = T_TupleTableSlot, tts_isempty = false, tts_shouldFree = false, tts_shouldFreeMin = false, tts_slow = true, tts_tuple = 0x2f88300, tts_tupleDescriptor = 0x7f0710d02bd0, tts_mcxt = 0x2ee1640, tts_buffer = 507, tts_nvalid = 1, tts_values = 0x2ee22a8, tts_isnull = 0x2ee22d0, tts_mintuple = 0x0, tts_minhdr = {t_len = 0, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 0}, t_tableOid = 0, t_data = 0x0}, tts_off = 2, tts_fixedTupleDescriptor = true}
HJ_NEED_NEW_OUTER -> Set the related variables
(gdb) n
371 econtext->ecxt_outertuple = outerTupleSlot;
(gdb)
372 node->hj_MatchedOuter = false;
(gdb)
378 node->hj_CurHashValue = hashvalue;
(gdb)
379 ExecHashGetBucketAndBatch(hashtable, hashvalue,
(gdb) p hashvalue
$21 = 2324234220
(gdb) n
381 node->hj_CurSkewBucketNo = ExecHashGetSkewBucket(hashtable,
(gdb)
383 node->hj_CurTuple = NULL;
(gdb) p *node
$22 = {js = {ps = {type = T_HashJoinState, plan = 0x2faaff8, state = 0x2ee1758, ExecProcNode = 0x70291d, ExecProcNodeReal = 0x70291d, instrument = 0x0, worker_instrument = 0x0, worker_jit_instrument = 0x0, qual = 0x0, lefttree = 0x2ee2070, righttree = 0x2ee2918, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0, ps_ResultTupleSlot = 0x2f20d98, ps_ExprContext = 0x2ee1fb0, ps_ProjInfo = 0x2ee3550, scandesc = ...}, ...}, hashclauses = 0x2f21430, hj_OuterHashKeys = 0x2f22230, hj_InnerHashKeys = 0x2f22740, hj_HashOperators = 0x2f227a0, hj_HashTable = 0x2f88ee8, hj_CurHashValue = 2324234220, hj_CurBucketNo = 16364, hj_CurSkewBucketNo = -1, hj_CurTuple = 0x0, hj_OuterTupleSlot = 0x2f212f0, hj_HashTupleSlot = 0x2ee3278, hj_NullOuterTupleSlot = 0x0, hj_NullInnerTupleSlot = 0x0, hj_FirstOuterTupleSlot = 0x0, hj_JoinState = 2, hj_MatchedOuter = false, hj_OuterNotEmpty = true}
(gdb) p *econtext
$25 = {type = T_ExprContext, ecxt_scantuple = 0x0, ecxt_innertuple = 0x0, ecxt_outertuple = 0x2ee2248, ecxt_per_query_memory = 0x2ee1640, ecxt_per_tuple_memory = 0x2f710c0, ecxt_param_exec_vals = 0x0, ecxt_param_list_info = 0x0, ecxt_aggvalues = 0x0, ecxt_aggnulls = 0x0, caseValue_datum = 0, caseValue_isNull = true, domainValue_datum = 0, domainValue_isNull = true, ecxt_estate = 0x2ee1758, ecxt_callbacks = 0x0}
(gdb) p *node->hj_HashTupleSlot
$26 = {type = T_TupleTableSlot, tts_isempty = true, tts_shouldFree = false, tts_shouldFreeMin = false, tts_slow = false, tts_tuple = 0x0, tts_tupleDescriptor = 0x2ee3060, tts_mcxt = 0x2ee1640, tts_buffer = 0, tts_nvalid = 0, tts_values = 0x2ee32d8, tts_isnull = 0x2ee32f0, tts_mintuple = 0x0, tts_minhdr = {t_len = 0, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 0}, t_tableOid = 0, t_data = 0x0}, tts_off = 0, tts_fixedTupleDescriptor = true}
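The bucket number in this dump follows from the hash value and the bucket count printed earlier. The sketch below is a stand-alone illustration with a hypothetical function name: masking with nbuckets - 1 reproduces hj_CurBucketNo = 16364 for this run, while the batch formula for nbatch > 1 is an assumption (batch bits taken just above the bucket bits) that this single-batch trace cannot confirm.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a 32-bit hash value to a bucket and a batch.
 * nbuckets and nbatch are assumed to be powers of two.
 */
void
get_bucket_and_batch(uint32_t hashvalue, uint32_t nbuckets, int log2_nbuckets,
                     uint32_t nbatch, uint32_t *bucketno, uint32_t *batchno)
{
    /* The low log2_nbuckets bits select the bucket. */
    *bucketno = hashvalue & (nbuckets - 1);

    /* Assumption: with several batches, the next bits select the batch. */
    if (nbatch > 1)
        *batchno = (hashvalue >> log2_nbuckets) & (nbatch - 1);
    else
        *batchno = 0;
}
```

Calling it with hashvalue = 2324234220, nbuckets = 16384, log2_nbuckets = 14 and nbatch = 1 yields bucket 16364 and batch 0, which agrees with hj_CurBucketNo in the dump of *node.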
HJ_NEED_NEW_OUTER -> Switch the state to HJ_SCAN_BUCKET and start scanning the hash table
(gdb) n
407 node->hj_JoinState = HJ_SCAN_BUCKET;
(gdb)
HJ_SCAN_BUCKET -> No match; switch the state to HJ_FILL_OUTER_TUPLE
(gdb)
416 if (parallel)
(gdb) n
427 if (!ExecScanHashBucket(node, econtext))
(gdb)
430 node->hj_JoinState = HJ_FILL_OUTER_TUPLE;
(gdb)
431 continue;
(gdb)
HJ_FILL_OUTER_TUPLE -> Switch the state to HJ_NEED_NEW_OUTER
Whether or not a tuple is emitted, the next state is HJ_NEED_NEW_OUTER.
209 switch (node->hj_JoinState)
(gdb)
483 node->hj_JoinState = HJ_NEED_NEW_OUTER;
HJ_FILL_OUTER_TUPLE -> This is not an outer join, so no fill is needed; go back to the HJ_NEED_NEW_OUTER processing logic
(gdb) n
485 if (!node->hj_MatchedOuter &&
(gdb)
486 HJ_FILL_OUTER(node))
(gdb)
485 if (!node->hj_MatchedOuter &&
(gdb)
549 }
(gdb)
HJ_SCAN_BUCKET -> Set a breakpoint at the point where the bucket scan finds a match
(gdb) b nodeHashjoin.c:441
Breakpoint 3 at 0x7025c3: file nodeHashjoin.c, line 441.
(gdb) c
Continuing.
Breakpoint 3, ExecHashJoinImpl (pstate=0x2ee1d98, parallel=false) at nodeHashjoin.c:447
447 if (joinqual == NULL || ExecQual(joinqual, econtext))
HJ_SCAN_BUCKET -> A matching tuple exists; set the related flags.
(gdb) n
449 node->hj_MatchedOuter = true;
(gdb)
450 HeapTupleHeaderSetMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple));
(gdb)
453 if (node->js.jointype == JOIN_ANTI)
(gdb) n
464 if (node->js.single_match)
(gdb)
465 node->hj_JoinState = HJ_NEED_NEW_OUTER;
(gdb)
HJ_SCAN_BUCKET -> Perform the projection and return
467 if (otherqual == NULL || ExecQual(otherqual, econtext))
(gdb)
468 return ExecProject(node->js.ps.ps_ProjInfo);
(gdb)
In general, Hash Join first builds a hash table over the inner relation, then fetches the tuples of the outer relation one at a time and, when a tuple matches, performs the projection and returns the result tuple. Apart from building the hash table, the remaining steps keep switching between states until the number of tuples requested by the Portal has been returned.
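The state switching summarized here can be imitated with a toy, in-memory inner join on integer keys. Everything in this sketch is hypothetical (the ToyHashJoin struct, toy_init, toy_next, the fixed sizes); it reuses the state names from the trace only for readability, and because it models an inner join it goes straight from HJ_SCAN_BUCKET back to HJ_NEED_NEW_OUTER without the HJ_FILL_OUTER_TUPLE step.

```c
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS  8     /* toy bucket count, a power of two */
#define MAX_INNER 16    /* toy limit on inner tuples that get hashed */

typedef enum { HJ_BUILD_HASHTABLE, HJ_NEED_NEW_OUTER, HJ_SCAN_BUCKET } JoinState;

typedef struct {
    const int *inner; size_t ninner;   /* inner relation (build side) */
    const int *outer; size_t nouter;   /* outer relation (probe side) */
    int head[NBUCKETS];                /* first inner index per bucket, -1 if empty */
    int next[MAX_INNER];               /* chain links between inner indexes */
    size_t outer_pos;                  /* next outer tuple to fetch */
    int cur_outer;                     /* outer key currently being probed */
    int cur;                           /* current position in the bucket chain */
    JoinState state;
} ToyHashJoin;

static unsigned bucket_of(int key) { return (unsigned) key & (NBUCKETS - 1); }

void toy_init(ToyHashJoin *hj, const int *inner, size_t ninner,
              const int *outer, size_t nouter)
{
    hj->inner = inner; hj->ninner = ninner;
    hj->outer = outer; hj->nouter = nouter;
    hj->outer_pos = 0; hj->cur_outer = 0; hj->cur = -1;
    hj->state = HJ_BUILD_HASHTABLE;
}

/* Store the next matching key in *out and return true, or return false when done. */
bool toy_next(ToyHashJoin *hj, int *out)
{
    for (;;) {
        switch (hj->state) {
        case HJ_BUILD_HASHTABLE:        /* runs once: hash the inner relation */
            for (int b = 0; b < NBUCKETS; b++)
                hj->head[b] = -1;
            for (size_t i = 0; i < hj->ninner && i < MAX_INNER; i++) {
                unsigned b = bucket_of(hj->inner[i]);
                hj->next[i] = hj->head[b];
                hj->head[b] = (int) i;
            }
            hj->state = HJ_NEED_NEW_OUTER;
            break;
        case HJ_NEED_NEW_OUTER:         /* fetch one outer tuple, pick its bucket */
            if (hj->outer_pos >= hj->nouter)
                return false;
            hj->cur_outer = hj->outer[hj->outer_pos++];
            hj->cur = hj->head[bucket_of(hj->cur_outer)];
            hj->state = HJ_SCAN_BUCKET;
            break;
        case HJ_SCAN_BUCKET:            /* walk the chain, emit on a key match */
            while (hj->cur >= 0) {
                int i = hj->cur;
                hj->cur = hj->next[i];
                if (hj->inner[i] == hj->cur_outer) {
                    *out = hj->cur_outer;
                    return true;        /* state is kept, the next call resumes here */
                }
            }
            hj->state = HJ_NEED_NEW_OUTER;
            break;
        }
    }
}
```

Each call to toy_next returns one joined key, so the caller decides how many results to pull, much as the Portal does in the walkthrough; the hash table is built only on the first call.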
IV. References
Hash Joins: Past, Present and Future (PGCon 2017)
A Look at How Postgres Executes a Tiny Join - Part 1
A Look at How Postgres Executes a Tiny Join - Part 2
Assignment 2 Symmetric Hash Join