2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains the logic of the PageAddItemExtended function in PostgreSQL, the routine that places a new item (for a heap table, a tuple) on a page. It covers the supporting type and macro definitions, the annotated source of the function, and a gdb trace of a single INSERT.
1. Test table
testdb=# drop table if exists t_insert;
NOTICE:  table "t_insert" does not exist, skipping
DROP TABLE
testdb=# create table t_insert (id int, c1 char(10), c2 char(10), c3 char(10));
CREATE TABLE
2. Source code analysis
The code that places data on a page lives mainly in bufpage.c, and the key function is PageAddItemExtended. The function's original English comments are already clear; additional explanatory comments (the ones starting with //) have been added to make it easier to follow.
Types, macros, and structures used by the function:
1. Page: pointer to char
    typedef char *Pointer;
    typedef Pointer Page;

2. Item: pointer to char
    typedef Pointer Item;

3. Size
    typedef size_t Size;

4. OffsetNumber: unsigned short (16 bits)
    typedef uint16 OffsetNumber;

5. PageHeader: pointer to the PageHeaderData structure
    typedef struct PageHeaderData
    {
        /* XXX LSN is member of *any* block, not only page-organized ones */
        PageXLogRecPtr pd_lsn;      /* LSN: next byte after last byte of xlog
                                     * record for last change to this page */
        uint16      pd_checksum;    /* checksum */
        uint16      pd_flags;       /* flag bits, see below */
        LocationIndex pd_lower;     /* offset to start of free space */
        LocationIndex pd_upper;     /* offset to end of free space */
        LocationIndex pd_special;   /* offset to start of special space */
        uint16      pd_pagesize_version;
        TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */
        ItemIdData  pd_linp[FLEXIBLE_ARRAY_MEMBER]; /* line pointer array */
    } PageHeaderData;

    typedef PageHeaderData *PageHeader;

    #define PD_HAS_FREE_LINES   0x0001  /* are there any unused line pointers? */
    #define PD_PAGE_FULL        0x0002  /* not enough free space for new tuple? */
    #define PD_ALL_VISIBLE      0x0004  /* all tuples on page are visible to
                                         * everyone */
    #define PD_VALID_FLAG_BITS  0x0007  /* OR of all valid pd_flags bits */

6. SizeOfPageHeaderData: size of the page header, computed as the offset of pd_linp
    #define offsetof(type, field)   ((long) &((type *)0)->field)
    #define SizeOfPageHeaderData (offsetof(PageHeaderData, pd_linp))

7. BLCKSZ
    #define BLCKSZ 8192

8. PageGetMaxOffsetNumber: 0 if pd_lower (the start of free space) is less than or equal to the header size; otherwise pd_lower minus the header size (24 bytes), divided by the ItemId size (4 bytes)
    #define PageGetMaxOffsetNumber(page) \
        (((PageHeader) (page))->pd_lower <= SizeOfPageHeaderData ? 0 : \
         ((((PageHeader) (page))->pd_lower - SizeOfPageHeaderData) \
          / sizeof(ItemIdData)))

9. InvalidOffsetNumber: the invalid offset value
    #define InvalidOffsetNumber     ((OffsetNumber) 0)

10. OffsetNumberNext: input value + 1
    #define OffsetNumberNext(offsetNumber) \
        ((OffsetNumber) (1 + (offsetNumber)))

11. ItemId: pointer to the ItemIdData structure
    typedef struct ItemIdData
    {
        unsigned    lp_off:15,      /* offset to tuple (from start of page) */
                    lp_flags:2,     /* state of item pointer, see below */
                    lp_len:15;      /* byte length of tuple */
    } ItemIdData;

    typedef ItemIdData *ItemId;

    #define LP_UNUSED       0       /* unused (should always have lp_len=0) */
    #define LP_NORMAL       1       /* used (should always have lp_len>0) */
    #define LP_REDIRECT     2       /* HOT redirect (should have lp_len=0) */
    #define LP_DEAD         3       /* dead, may or may not have storage */

12. PageGetItemId: get the ItemIdData pointer for a given line offset
    #define PageGetItemId(page, offsetNumber) \
        ((ItemId) (&((PageHeader) (page))->pd_linp[(offsetNumber) - 1]))

13. PAI_OVERWRITE: flag bit, actual value 1
    #define PAI_OVERWRITE           (1 << 0)

14. PAI_IS_HEAP: flag bit, actual value 2
    #define PAI_IS_HEAP             (1 << 1)

15. OffsetNumberIsValid: whether a line offset is valid
    #define OffsetNumberIsValid(offsetNumber) \
        ((bool) ((offsetNumber != InvalidOffsetNumber) && \
                 (offsetNumber <= MaxOffsetNumber)))

16. ItemIdIsUsed / ItemIdHasStorage: whether a line pointer is in use / has storage allocated
    #define ItemIdIsUsed(itemId) \
        ((itemId)->lp_flags != LP_UNUSED)
    #define ItemIdHasStorage(itemId) \
        ((itemId)->lp_len != 0)

17. PageHasFreeLinePointers / PageClearHasFreeLinePointers: test whether there is an unused line pointer between the header and pd_lower, and clear that hint (done when the hint turns out to be wrong)
    #define PageHasFreeLinePointers(page) \
        (((PageHeader) (page))->pd_flags & PD_HAS_FREE_LINES)
    #define PageClearHasFreeLinePointers(page) \
        (((PageHeader) (page))->pd_flags &= ~PD_HAS_FREE_LINES)

18. MaxHeapTuplesPerPage: maximum number of tuples a heap page can hold, computed as (block size - header size) / (aligned tuple header size + line pointer size)
    #define MaxHeapTuplesPerPage \
        ((int) ((BLCKSZ - SizeOfPageHeaderData) / \
                (MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData))))
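The arithmetic behind PageGetMaxOffsetNumber and MaxHeapTuplesPerPage can be checked with a small standalone C sketch. It does not include any PostgreSQL header: the constants (8192-byte block, 24-byte page header, 4-byte line pointer, 23-byte heap tuple header, 8-byte maximum alignment) are assumptions that mirror a default 64-bit build, and the names max_align, max_offset_number, and max_heap_tuples_per_page are hypothetical stand-ins for the real macros.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for a default 64-bit build; not read from PostgreSQL headers. */
enum {
    BLOCK_SIZE = 8192,              /* stands in for BLCKSZ */
    PAGE_HEADER_SIZE = 24,          /* stands in for SizeOfPageHeaderData */
    ITEM_ID_SIZE = 4,               /* stands in for sizeof(ItemIdData) */
    HEAP_TUPLE_HEADER_SIZE = 23,    /* stands in for SizeofHeapTupleHeader */
    MAX_ALIGNMENT = 8               /* stands in for MAXIMUM_ALIGNOF */
};

/* Round len up to the next multiple of MAX_ALIGNMENT (stand-in for MAXALIGN). */
static size_t max_align(size_t len)
{
    return (len + MAX_ALIGNMENT - 1) & ~((size_t) MAX_ALIGNMENT - 1);
}

/* Stand-in for PageGetMaxOffsetNumber: how many line pointers the page has. */
static unsigned max_offset_number(unsigned pd_lower)
{
    if (pd_lower <= PAGE_HEADER_SIZE)
        return 0;
    return (pd_lower - PAGE_HEADER_SIZE) / ITEM_ID_SIZE;
}

/* Stand-in for MaxHeapTuplesPerPage. */
static int max_heap_tuples_per_page(void)
{
    return (int) ((BLOCK_SIZE - PAGE_HEADER_SIZE) /
                  (max_align(HEAP_TUPLE_HEADER_SIZE) + ITEM_ID_SIZE));
}
```

Under these assumptions, a page whose pd_lower is 56 holds 8 line pointers, a 61-byte tuple occupies 64 bytes after alignment, and a heap page can hold at most 291 tuples.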
Walkthrough of the PageAddItemExtended function

    /*
     * PageAddItemExtended
     * Input:
     *   page         - pointer to the page
     *   item         - pointer to the data
     *   size         - size of the data
     *   offsetNumber - requested line offset for the data
     *   flags        - flag bits (overwrite or not / heap data or not)
     * Output:
     *   OffsetNumber - line offset at which the data was actually stored
     */
    OffsetNumber
    PageAddItemExtended(Page page,
                        Item item,
                        Size size,
                        OffsetNumber offsetNumber,
                        int flags)
    {
        PageHeader  phdr = (PageHeader) page;   // page header pointer
        Size        alignedSize;                // size after alignment
        int         lower;                      // low end of free space
        int         upper;                      // high end of free space
        ItemId      itemId;                     // line pointer
        OffsetNumber limit;                     // one past the last existing line pointer
        bool        needshuffle = false;        // must existing line pointers be moved?

        /*
         * Be wary about corrupted page pointers
         */
        if (phdr->pd_lower < SizeOfPageHeaderData ||
            phdr->pd_lower > phdr->pd_upper ||
            phdr->pd_upper > phdr->pd_special ||
            phdr->pd_special > BLCKSZ)
            ereport(PANIC,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg("corrupted page pointers: lower = %u, upper = %u, special = %u",
                            phdr->pd_lower, phdr->pd_upper, phdr->pd_special)));

        /*
         * Select offsetNumber to place the new item at
         */
        // first line offset that would extend the line pointer array
        limit = OffsetNumberNext(PageGetMaxOffsetNumber(page));

        /* was offsetNumber passed in? */
        if (OffsetNumberIsValid(offsetNumber))
        {
            // the caller specified a line offset
            /* yes, check it */
            if ((flags & PAI_OVERWRITE) != 0)
            {
                // PAI_OVERWRITE is set: store the item at exactly this offset,
                // which must be an unused line pointer (or limit)
                if (offsetNumber < limit)
                {
                    // get the ItemId at the requested offset
                    itemId = PageGetItemId(phdr, offsetNumber);
                    // the slot is in use or already has storage: refuse
                    if (ItemIdIsUsed(itemId) || ItemIdHasStorage(itemId))
                    {
                        elog(WARNING, "will not overwrite a used ItemId");
                        return InvalidOffsetNumber;
                    }
                }
            }
            else
            {
                // PAI_OVERWRITE is not set: insert at this offset and shift
                // the later line pointers to make room
                if (offsetNumber < limit)
                    needshuffle = true; /* need to move existing linp's */
            }
        }
        else
        {
            // no line offset was specified
            /* offsetNumber was not passed in, so find a free slot */
            /* if no free slot, we'll put it at limit (1st open slot) */
            if (PageHasFreeLinePointers(phdr))
            {
                // the page header hints that reclaimed line pointers exist
                /*
                 * Look for "recyclable" (unused) ItemId.  We check for no storage
                 * as well, just to be paranoid --- unused items should never have
                 * storage.
                 */
                // scan for the first unused line offset
                for (offsetNumber = 1; offsetNumber < limit; offsetNumber++)
                {
                    itemId = PageGetItemId(phdr, offsetNumber);
                    if (!ItemIdIsUsed(itemId) && !ItemIdHasStorage(itemId))
                        break;
                }
                // none found: the hint was wrong, so clear it
                if (offsetNumber >= limit)
                {
                    /* the hint is wrong, so reset it */
                    PageClearHasFreeLinePointers(phdr);
                }
            }
            else
            {
                // no reclaimed line pointers: append at the end of the array
                /* don't bother searching if hint says there's no free slot */
                offsetNumber = limit;
            }
        }

        /* Reject placing items beyond the first unused line pointer */
        if (offsetNumber > limit)
        {
            // the requested offset lies beyond the first open slot
            elog(WARNING, "specified item offset is too large");
            return InvalidOffsetNumber;
        }

        /* Reject placing items beyond heap boundary, if heap */
        if ((flags & PAI_IS_HEAP) != 0 && offsetNumber > MaxHeapTuplesPerPage)
        {
            // heap data, but the offset exceeds the per-page tuple maximum
            elog(WARNING, "can't put more than MaxHeapTuplesPerPage items in a heap page");
            return InvalidOffsetNumber;
        }

        /*
         * Compute new lower and upper pointers for page, see if it'll fit.
         *
         * Note: do arithmetic as signed ints, to avoid mistakes if, say,
         * alignedSize > pd_upper.
         */
        if (offsetNumber == limit || needshuffle)
            // a line pointer is added, so lower grows by one ItemIdData
            lower = phdr->pd_lower + sizeof(ItemIdData);
        else
            // an unused line pointer is reused, so lower is unchanged
            lower = phdr->pd_lower;

        alignedSize = MAXALIGN(size);                       // align the size
        upper = (int) phdr->pd_upper - (int) alignedSize;   // reserve space for the data

        if (lower > upper)                                  // does not fit
            return InvalidOffsetNumber;

        /*
         * OK to insert the item.  First, shuffle the existing pointers if needed.
         */
        // get the line pointer
        itemId = PageGetItemId(phdr, offsetNumber);

        if (needshuffle)
            // move the existing line pointers back by one slot
            memmove(itemId + 1, itemId,
                    (limit - offsetNumber) * sizeof(ItemIdData));

        /* set the item pointer */
        // fill in the line pointer for the new data
        ItemIdSetNormal(itemId, upper, size);

        /*
         * Items normally contain no uninitialized bytes.  Core bufpage consumers
         * conform, but this is not a necessary coding rule; a new index AM could
         * opt to depart from it.  However, data type input functions and other
         * C-language functions that synthesize datums should initialize all
         * bytes; datumIsEqual() relies on this.  Testing here, along with the
         * similar check in printtup(), helps to catch such mistakes.
         *
         * Values of the "name" type retrieved via index-only scans may contain
         * uninitialized bytes; see comment in btrescan().  Valgrind will report
         * this as an error, but it is safe to ignore.
         */
        VALGRIND_CHECK_MEM_IS_DEFINED(item, size);

        /* copy the item's data onto the page */
        // place the data in the data area
        memcpy((char *) page + upper, item, size);

        /* adjust page header */
        // update lower and upper
        phdr->pd_lower = (LocationIndex) lower;
        phdr->pd_upper = (LocationIndex) upper;

        // on success, return the actual line offset
        return offsetNumber;
    }

3. Trace analysis
Let's use gdb to track and analyze the PageAddItemExtended function.
Test scenario: first insert 8 rows of data, then delete row 2, and then insert 1 row of data:
testdb=# -- insert 8 rows of data
testdb=# insert into t_insert values (1, '11', '12', '13');
INSERT 0 1
testdb=# insert into t_insert values (2, '11', '12', '13');
INSERT 0 1
testdb=# insert into t_insert values (3, '11', '12', '13');
INSERT 0 1
testdb=# insert into t_insert values (4, '11', '12', '13');
INSERT 0 1
testdb=# insert into t_insert values (5, '11', '12', '13');
INSERT 0 1
testdb=# insert into t_insert values (6, '11', '12', '13');
INSERT 0 1
testdb=# insert into t_insert values (7, '11', '12', '13');
INSERT 0 1
testdb=# insert into t_insert values (8, '11', '12', '13');
INSERT 0 1
testdb=#
testdb=# checkpoint;
CHECKPOINT
testdb=# -- delete row 2
testdb=# delete from t_insert where id = 2;
DELETE 1
testdb=#
testdb=# checkpoint;
CHECKPOINT
testdb=#
testdb=# -- get the backend pid
testdb=# select pg_backend_pid();
 pg_backend_pid
----------------
           1572
(1 row)
Use gdb to track the process of inserting the last row of data:
# start gdb and attach to the backend process
[root@localhost demo]# gdb -p 1572
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attaching to process 1572
...
(gdb)

# set a breakpoint
(gdb) b PageAddItemExtended
Breakpoint 1 at 0x845119: file bufpage.c, line 196.
(gdb)

# switch to psql and execute the insert statement
testdb=# insert into t_insert values (9, '11', '12', '13');
(suspended)

# switch back to gdb
(gdb) c
Continuing.

Breakpoint 1, PageAddItemExtended (page=0x7feaaefac300 "\001", item=0x29859f8 "2\234\030", size=61, offsetNumber=0, flags=2) at bufpage.c:196
196         PageHeader  phdr = (PageHeader) page;
# the breakpoint is on the first line of the function

# bt prints the call stack
(gdb) bt
#0  PageAddItemExtended (page=0x7feaaefac300 "\001", item=0x29859f8 "2\234\030", size=61, offsetNumber=0, flags=2) at bufpage.c:196
#1  0x00000000004cf4f9 in RelationPutHeapTuple (relation=0x7feac6e2ccb8, buffer=141, tuple=0x29859e0, token=false) at hio.c:53
#2  0x00000000004c34ec in heap_insert (relation=0x7feac6e2ccb8, tup=0x29859e0, cid=0, options=0, bistate=0x0) at heapam.c:2487
#3  0x00000000006c076b in ExecInsert (mtstate=0x2984c10, slot=0x2985250, planSlot=0x2985250, estate=0x29848c0, canSetTag=true) at nodeModifyTable.c:529
#4  0x00000000006c29f3 in ExecModifyTable (pstate=0x2984c10) at nodeModifyTable.c:2126
#5  0x000000000069a7d8 in ExecProcNodeFirst (node=0x2984c10) at execProcnode.c:445
#6  0x0000000000690994 in ExecProcNode (node=0x2984c10) at ../../../src/include/executor/executor.h:237
#7  0x0000000000692e5e in ExecutePlan (estate=0x29848c0, planstate=0x2984c10, use_parallel_mode=false, operation=CMD_INSERT, sendTuples=false, numberTuples=0, direction=ForwardScanDirection, dest=0x2990dc8, execute_once=true) at execMain.c:1726
#8  0x0000000000690e58 in standard_ExecutorRun (queryDesc=0x2981020, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:363
#9  0x0000000000690cef in ExecutorRun (queryDesc=0x2981020, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:306
#10 0x0000000000851d84 in ProcessQuery (plan=0x2990c68, sourceText=0x28c5ef0 "insert into t_insert values (9, '11', '12', '13');", params=0x0, queryEnv=0x0, dest=0x2990dc8, completionTag=0x7ffdbc052d10 "") at pquery.c:161
#11 0x00000000008534f4 in PortalRunMulti (portal=0x292b490, isTopLevel=true, setHoldSnapshot=false, dest=0x2990dc8, altdest=0x2990dc8, completionTag=0x7ffdbc052d10 "") at pquery.c:1286
#12 0x0000000000852b32 in PortalRun (portal=0x292b490, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2990dc8, altdest=0x2990dc8, completionTag=0x7ffdbc052d10 "") at pquery.c:799
#13 0x000000000084cebc in exec_simple_query (query_string=0x28c5ef0 "insert into t_insert values (9, '11', '12', '13');") at postgres.c:1122
#14 0x0000000000850f3c in PostgresMain (argc=1, argv=0x28efaa8, dbname=0x28ef990 "testdb", username=0x28ef978 "xdb") at postgres.c:4153
#15 0x00000000007c0168 in BackendRun (port=0x28e7970) at postmaster.c:4361
#16 0x00000000007bf8fc in BackendStartup (port=0x28e7970) at postmaster.c:4033
#17 0x00000000007bc139 in ServerLoop () at postmaster.c:1706
#18 0x00000000007bb9f9 in PostmasterMain (argc=1, argv=0x28c0b60) at postmaster.c:1379
#19 0x00000000006f19e8 in main (argc=1, argv=0x28c0b60) at main.c:228

# use the next command to single-step
(gdb) next
207         if (phdr->pd_lower < SizeOfPageHeaderData ||
(gdb) next
208             phdr->pd_lower > phdr->pd_upper ||
(gdb) next
207         if (phdr->pd_lower < SizeOfPageHeaderData ||
(gdb) next
209             phdr->pd_upper > phdr->pd_special ||
(gdb) next
208             phdr->pd_lower > phdr->pd_upper ||
(gdb) next
210             phdr->pd_special > BLCKSZ)
(gdb) next
209             phdr->pd_upper > phdr->pd_special ||
(gdb) next
219         limit = OffsetNumberNext(PageGetMaxOffsetNumber(page));
(gdb) next
222         if (OffsetNumberIsValid(offsetNumber))
(gdb) p offsetNumber
$1 = 0
# offsetNumber is 0: no line offset was specified by the caller
(gdb) next
247             if (PageHasFreeLinePointers(phdr))

# view the page header
(gdb) p *phdr
$2 = {pd_lsn = {xlogid = 1, xrecoff = 3677462648}, pd_checksum = 0, pd_flags = 0, pd_lower = 56, pd_upper = 7680, pd_special = 8192, pd_pagesize_version = 8196, pd_prune_xid = 1612849, pd_linp = 0x7feaaefac318}
(gdb)

# view the line pointers
(gdb) p phdr->pd_linp[0]
$3 = {lp_off = 8128, lp_flags = 1, lp_len = 61}
(gdb) p phdr->pd_linp[1]
$4 = {lp_off = 8064, lp_flags = 1, lp_len = 61}
...
(gdb) p phdr->pd_linp[8]
$11 = {lp_off = 0, lp_flags = 0, lp_len = 0}
# The existing line pointers have lp_flags = 1 (LP_NORMAL), meaning they are in use.
# This includes pd_linp[1], which belongs to the deleted row.
...
(gdb) p size
$17 = 61
(gdb) p alignedSize
$18 = 5045956
(gdb) next
300         upper = (int) phdr->pd_upper - (int) alignedSize;
(gdb) p alignedSize
$19 = 64
# after alignment the size is 64
(gdb) next
302         if (lower > upper)
(gdb) p lower
$20 = 60
(gdb) p upper
$21 = 7616
(gdb) next
308         itemId = PageGetItemId(phdr, offsetNumber);
(gdb)
310         if (needshuffle)
(gdb) next
315         ItemIdSetNormal(itemId, upper, size);
(gdb) next
332         memcpy((char *) page + upper, item, size);
(gdb) p *itemId
$23 = {lp_off = 7616, lp_flags = 1, lp_len = 61}
...
(gdb) next
338         return offsetNumber;
(gdb)
# the data is inserted at line offset 9 (line pointer array index 8)
(gdb) p offsetNumber
$24 = 9
(gdb) p phdr->pd_linp[8]
$25 = {lp_off = 7616, lp_flags = 1, lp_len = 61}

# Note: the new row is not placed at offset 2. The line pointer of the deleted
# second row has not been reclaimed yet (it is still LP_NORMAL and pd_flags = 0,
# so PD_HAS_FREE_LINES is not set), and the function appends at limit instead.
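The numbers seen in the trace can be reproduced with a toy model of the slot selection and the lower/upper computation. This is only a sketch of the path taken when the caller passes no offset; ToyPage, Placement, and toy_add_item are hypothetical names, the constants (24-byte header, 4-byte line pointer, 8-byte alignment) are assumptions, and the model ignores PAI_OVERWRITE, shuffling, the heap limit, and the corruption checks. It also supports at most 16 existing line pointers.

```c
#include <assert.h>
#include <stdbool.h>

enum { HEADER_SIZE = 24, ITEM_ID_SIZE = 4, MAX_ALIGNMENT = 8 };

/* Hypothetical, simplified view of the page state the function reads. */
typedef struct ToyPage {
    int  pd_lower;
    int  pd_upper;
    bool has_free_lines;    /* models the PD_HAS_FREE_LINES hint */
    bool slot_used[16];     /* slot_used[i] describes line offset i + 1 */
} ToyPage;

typedef struct Placement {
    int offset_number;      /* 0 means the item does not fit */
    int new_lower;
    int new_upper;
} Placement;

/* Models only the "offsetNumber was not passed in" path. */
static Placement toy_add_item(const ToyPage *page, int size)
{
    Placement result = {0, page->pd_lower, page->pd_upper};
    int limit = (page->pd_lower - HEADER_SIZE) / ITEM_ID_SIZE + 1;
    int offset = limit;

    /* Search for a reusable slot only when the hint says one exists. */
    if (page->has_free_lines) {
        for (offset = 1; offset < limit; offset++)
            if (!page->slot_used[offset - 1])
                break;
    }

    /* Appending a line pointer grows lower; reusing a slot does not. */
    int lower = (offset == limit) ? page->pd_lower + ITEM_ID_SIZE
                                  : page->pd_lower;
    int aligned = (size + MAX_ALIGNMENT - 1) & ~(MAX_ALIGNMENT - 1);
    int upper = page->pd_upper - aligned;

    if (lower > upper)
        return result;      /* not enough free space */

    result.offset_number = offset;
    result.new_lower = lower;
    result.new_upper = upper;
    return result;
}
```

Fed with the traced state (pd_lower = 56, pd_upper = 7680, hint not set, size = 61), the model yields offset 9, lower 60, and upper 7616, matching the gdb output. If the hint were set and slot 2 were marked unused (for example after VACUUM has reclaimed it), the same call would return offset 2 and leave pd_lower at 56.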
# view the data in the page
testdb=# select * from heap_page_items(get_raw_page('t_insert', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
----+--------+----------+--------+---------+---------+----------+--------+-------------+------------+--------+--------+-------+------------------------------------------------------------------------------
  1 |   8128 |        1 |     61 | 1612841 |       0 |        0 | (0,1)  |           4 |       2306 |     24 |        |       | \x01000000173131202020202020202017313220202020202020201731332020202020202020
  2 |   8064 |        1 |     61 | 1612842 | 1612849 |        0 | (0,2)  |        8196 |        258 |     24 |        |       | \x02000000173131202020202020202017313220202020202020201731332020202020202020
  3 |   8000 |        1 |     61 | 1612843 |       0 |        0 | (0,3)  |           4 |       2306 |     24 |        |       | \x03000000173131202020202020202017313220202020202020201731332020202020202020
  4 |   7936 |        1 |     61 | 1612844 |       0 |        0 | (0,4)  |           4 |       2306 |     24 |        |       | \x04000000173131202020202020202017313220202020202020201731332020202020202020
...
  7 |   7744 |        1 |     61 | 1612847 |       0 |        0 | (0,7)  |           4 |       2306 |     24 |        |       | \x07000000173131202020202020202017313220202020202020201731332020202020202020
...

With the definitions, the annotated source, and the trace in hand, the logic of the PageAddItemExtended function in PostgreSQL should now be clearer. Repeating the trace on your own instance is a good way to confirm it.