2025-02-22 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)05/31 Report--
This article examines the role of the lazy_vacuum_heap function in PostgreSQL's vacuum process, first through the source code and then through a gdb session.
This section briefly describes how PostgreSQL processes a manually issued VACUUM and focuses on the implementation of lazy_vacuum_heap, which is reached through the call chain ExecVacuum -> vacuum -> vacuum_rel -> heap_vacuum_rel -> lazy_scan_heap -> lazy_vacuum_heap. The function visits the heap table, marks the line pointers of dead tuples as unused, and compacts the free space on the pages where those tuples reside.
I. Data structures
Macro definitions
Vacuum and Analyze command options
/* ----------------------
 *		Vacuum and Analyze Statements
 *
 * Even though these are nominally two statements, it's convenient to use
 * just one node type for both.  Note that at least one of VACOPT_VACUUM
 * and VACOPT_ANALYZE must be set in options.
 * ----------------------
 */
typedef enum VacuumOption
{
	VACOPT_VACUUM = 1 ...

...

	for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
	{
		BlockNumber tblk;		// block number
		OffsetNumber toff;		// offset
		ItemId		itemid;		// line pointer

		// get the block number from the item pointer
		tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
		if (tblk != blkno)		// not the same block: leave the loop
			break;				/* past end of tuples for this block */
		// get the offset
		toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
		// get the line pointer
		itemid = PageGetItemId(page, toff);
		// mark it as unused
		ItemIdSetUnused(itemid);
		// record the offset
		unused[uncnt++] = toff;
	}

	// defragment the page
	PageRepairFragmentation(page);

	/*
	 * Mark buffer dirty before we write WAL.
	 */
	MarkBufferDirty(buffer);

	/* XLOG stuff */
	if (RelationNeedsWAL(onerel))
	{
		// write the WAL record
		XLogRecPtr	recptr;

		recptr = log_heap_clean(onerel, buffer,
								NULL, 0, NULL, 0,
								unused, uncnt,
								vacrelstats->latestRemovedXid);
		PageSetLSN(page, recptr);
	}

	/*
	 * End critical section, so we safely can do visibility tests (which
	 * possibly need to perform IO and allocate memory!). If we crash now the
	 * page (including the corresponding vm bit) might not be marked all
	 * visible, but that's fine. A later vacuum will fix that.
	 */
	END_CRIT_SECTION();

	/*
	 * Now that we have removed the dead tuples from the page, once again
	 * check if the page has become all-visible.  The page is already marked
	 * dirty, exclusively locked, and, if needed, a full page image has been
	 * emitted in the log_heap_clean() above.
	 */
	if (heap_page_is_all_visible(onerel, buffer, &visibility_cutoff_xid,
								 &all_frozen))
		PageSetAllVisible(page);

	/*
	 * All the changes to the heap page have been done. If the all-visible
	 * flag is now set, also set the VM all-visible bit (and, if possible, the
	 * all-frozen bit) unless this has already been done previously.
	 */
	if (PageIsAllVisible(page))
	{
		uint8		vm_status = visibilitymap_get_status(onerel, blkno, vmbuffer);
		uint8		flags = 0;

		/* Set the VM all-frozen bit to flag, if needed */
		if ((vm_status & VISIBILITYMAP_ALL_VISIBLE) == 0)
			flags |= VISIBILITYMAP_ALL_VISIBLE;
		if ((vm_status & VISIBILITYMAP_ALL_FROZEN) == 0 && all_frozen)
			flags |= VISIBILITYMAP_ALL_FROZEN;

		Assert(BufferIsValid(*vmbuffer));
		if (flags != 0)
			visibilitymap_set(onerel, blkno, buffer, InvalidXLogRecPtr,
							  *vmbuffer, visibility_cutoff_xid, flags);
	}

	return tupindex;
}

/*
 * PageRepairFragmentation
 *
 * Frees fragmented space on a page.
 * It doesn't remove unused line pointers! Please don't change this.
 *
 * This routine is usable for heap pages only, but see PageIndexMultiDelete.
 *
 * As a side effect, the page's PD_HAS_FREE_LINES hint bit is updated.
 */
void
PageRepairFragmentation(Page page)
{
	Offset		pd_lower = ((PageHeader) page)->pd_lower;
	Offset		pd_upper = ((PageHeader) page)->pd_upper;
	Offset		pd_special = ((PageHeader) page)->pd_special;
	itemIdSortData itemidbase[MaxHeapTuplesPerPage];	// work array for live items
	itemIdSort	itemidptr;
	ItemId		lp;
	int			nline,
				nstorage,
				nunused;
	int			i;
	Size		totallen;

	/*
	 * It's worth the trouble to be more paranoid here than in most places,
	 * because we are about to reshuffle data in (what is usually) a shared
	 * disk buffer.  If we aren't careful then corrupted pointers, lengths,
	 * etc could cause us to clobber adjacent disk buffers, spreading the data
	 * loss further.  So, check everything.
	 */
	if (pd_lower < SizeOfPageHeaderData ||
		pd_lower > pd_upper ||
		pd_upper > pd_special ||
		pd_special > BLCKSZ ||
		pd_special != MAXALIGN(pd_special))
		ereport(ERROR,
				(errcode(ERRCODE_DATA_CORRUPTED),
				 errmsg("corrupted page pointers: lower = %u, upper = %u, special = %u",
						pd_lower, pd_upper, pd_special)));

	/*
	 * Run through the line pointer array and collect data about live items.
	 */
	nline = PageGetMaxOffsetNumber(page);	// get the maximum offset number
	itemidptr = itemidbase;
	nunused = totallen = 0;
	for (i = FirstOffsetNumber; i ...
	...
			... lp_len != 0)
				itemidptr->offsetindex = i - 1;
				itemidptr->itemoff = ItemIdGetOffset(lp);
				// sanity check
				if (itemidptr->itemoff < (int) pd_upper ||
					itemidptr->itemoff >= (int) pd_special)
					ereport(ERROR,
							(errcode(ERRCODE_DATA_CORRUPTED),
							 errmsg("corrupted item pointer: %u",
									itemidptr->itemoff)));
				// aligned length
				itemidptr->alignedlen = MAXALIGN(ItemIdGetLength(lp));
				totallen += itemidptr->alignedlen;
				itemidptr++;	// next element of the array
			}
		}
		else
		{
			/* Unused entries should have lp_len = 0, but make sure */
			ItemIdSetUnused(lp);
			nunused++;
		}
	}

	// number of elements stored in the array
	nstorage = itemidptr - itemidbase;
	if (nstorage == 0)
	{
		/* Page is completely empty, so just reset it quickly */
		((PageHeader) page)->pd_upper = pd_special;
	}
	else
	{
		/* Need to compact the page the hard way */
		if (totallen > (Size) (pd_special - pd_lower))
			ereport(ERROR,
					(errcode(ERRCODE_DATA_CORRUPTED),
					 errmsg("corrupted item lengths: total %u, available space %u",
							(unsigned int) totallen, pd_special - pd_lower)));

		compactify_tuples(itemidbase, nstorage, page);
	}

	/* Set hint bit for PageAddItem */
	if (nunused > 0)
		PageSetHasFreeLinePointers(page);	// unused line pointers exist: set the hint
	else
		PageClearHasFreeLinePointers(page);	// clear the hint
}

/*
 * After removing or marking some line pointers unused, move the tuples to
 * remove the gaps caused by the removed items.
 */
static void
compactify_tuples(itemIdSort itemidbase, int nitems, Page page)
{
	PageHeader	phdr = (PageHeader) page;
	Offset		upper;
	int			i;

	/* sort itemIdSortData array into decreasing itemoff order */
	qsort((char *) itemidbase, nitems, sizeof(itemIdSortData),
		  itemoffcompare);

	upper = phdr->pd_special;
	for (i = 0; i < nitems; i++)
	{
		itemIdSort	itemidptr = &itemidbase[i];
		ItemId		lp;

		lp = PageGetItemId(page, itemidptr->offsetindex + 1);
		upper -= itemidptr->alignedlen;
		memmove((char *) page + upper,
				(char *) page + itemidptr->itemoff,
				itemidptr->alignedlen);
		lp->lp_off = upper;
	}

	phdr->pd_upper = upper;
}

/*
 * ItemIdSetUnused
 *		Set the item identifier to be UNUSED, with no storage.
 *		Beware of multiple evaluations of itemId!
 */
#define ItemIdSetUnused(itemId) \
( \
	(itemId)->lp_flags = LP_UNUSED, \
	(itemId)->lp_off = 0, \
	(itemId)->lp_len = 0 \
)

III. Tracking analysis
Test script: delete data, execute vacuum
11:04:59 (xdb@[local]:5432)testdb=# delete from t1 where id < 600;
DELETE 100
14:26:16 (xdb@[local]:5432)testdb=# checkpoint;
CHECKPOINT
11:18:29 (xdb@[local]:5432)testdb=# vacuum verbose t1;

lazy_vacuum_heap
Start gdb and set a breakpoint
(gdb) b lazy_vacuum_heap
Breakpoint 7 at 0x6bdf2e: file vacuumlazy.c, line 1472.
(gdb) c
Continuing.

Breakpoint 7, lazy_vacuum_heap (onerel=0x7f4c70d96688, vacrelstats=0x1873928) at vacuumlazy.c:1472
1472        Buffer      vmbuffer = InvalidBuffer;
(gdb)
Input parameters
1 - relation
(gdb) p *onerel
$14 = {rd_node = {spcNode = 1663, dbNode = 16402, relNode = 50820}, rd_smgr = 0x18362e0, rd_refcnt = 1, rd_backend = -1,
  rd_islocaltemp = false, rd_isnailed = false, rd_isvalid = true, rd_indexvalid = 1 '\001', rd_statvalid = false,
  rd_createSubid = 0, rd_newRelfilenodeSubid = 0, rd_rel = 0x7f4c70d95bb8, rd_att = 0x7f4c70d95cd0, rd_id = 50820,
  rd_lockInfo = {lockRelId = {relId = 50820, dbId = 16402}}, rd_rules = 0x0, rd_rulescxt = 0x0, trigdesc = 0x0,
  rd_rsdesc = 0x0, rd_fkeylist = 0x0, rd_fkeyvalid = false, rd_partkeycxt = 0x0, rd_partkey = 0x0, rd_pdcxt = 0x0,
  rd_partdesc = 0x0, rd_partcheck = 0x0, rd_indexlist = 0x7f4c70d94820, rd_oidindex = 0, rd_pkindex = 0,
  rd_replidindex = 0, rd_statlist = 0x0, rd_indexattr = 0x0, rd_projindexattr = 0x0, rd_keyattr = 0x0, rd_pkattr = 0x0,
  rd_idattr = 0x0, rd_projidx = 0x0, rd_pubactions = 0x0, rd_options = 0x0, rd_index = 0x0, rd_indextuple = 0x0,
  rd_amhandler = 0, rd_indexcxt = 0x0, rd_amroutine = 0x0, rd_opfamily = 0x0, rd_opcintype = 0x0, rd_support = 0x0,
  rd_supportinfo = 0x0, rd_indoption = 0x0, rd_indexprs = 0x0, rd_indpred = 0x0, rd_exclops = 0x0, rd_exclprocs = 0x0,
  rd_exclstrats = 0x0, rd_amcache = 0x0, rd_indcollation = 0x0, rd_fdwroutine = 0x0, rd_toastoid = 0,
  pgstat_info = 0x182a030}
2 - vacrelstats
The table has an index. It has 124 pages in total and all 124 were scanned; it held 9501 live tuples before and 9401 tuples now, with 100 tuples deleted. The ItemPointers of the deleted tuples are stored in the dead_tuples array (which holds num_dead_tuples entries).
(gdb) p *vacrelstats
$15 = {hasindex = true, old_rel_pages = 124, rel_pages = 124, scanned_pages = 124, pinskipped_pages = 0,
  frozenskipped_pages = 0, tupcount_pages = 124, old_live_tuples = 9501, new_rel_tuples = 9401, new_live_tuples = 9401,
  new_dead_tuples = 0, pages_removed = 0, tuples_deleted = 100, nonempty_pages = 124, num_dead_tuples = 100,
  max_dead_tuples = 36084, dead_tuples = 0x1884820, num_index_scans = 0, latestRemovedXid = 397073,
  lock_waiter_detected = false}
(gdb)
1. Initialize variables
(gdb) n
1474        pg_rusage_init(&ru0);
(gdb)
1475        npages = 0;
(gdb)
1477        tupindex = 0;
(gdb) p ru0
$16 = {tv = {tv_sec = 1548743482, tv_usec = 626506}, ru = {ru_utime = {tv_sec = 0, tv_usec = 40060}, ru_stime = {
      tv_sec = 0, tv_usec = 114769}, {ru_maxrss = 8900, __ru_maxrss_word = 8900}, {ru_ixrss = 0, __ru_ixrss_word = 0}, {
      ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 5455, __ru_minflt_word = 5455},
    {ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 2616,
      __ru_inblock_word = 2616}, {ru_oublock = 376, __ru_oublock_word = 376}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {
      ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 814,
      __ru_nvcsw_word = 814}, {ru_nivcsw = 2, __ru_nivcsw_word = 2}}}
2. Traverse the array of vacrelstats->num_dead_tuples item pointers (ItemPointer)
(gdb) n
1478        while (tupindex < vacrelstats->num_dead_tuples)
(gdb)
2.1 Get the block number and read the block into a buffer
1485            vacuum_delay_point();
(gdb)
1487            tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
(gdb)
1488            buf = ReadBufferExtended(onerel, MAIN_FORKNUM, tblk, RBM_NORMAL,
(gdb)
(gdb) p tblk
$17 = 29
(gdb) p buf
$18 = 175
2.2 Try to take the cleanup lock on the buffer; if that fails, move on to the next tuple
1490            if (!ConditionalLockBufferForCleanup(buf))
(gdb)
2.3 Call lazy_vacuum_page to free up space and defragment the page
1496            tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
(gdb) p tupindex
$1 = 0
(gdb) n
1500            page = BufferGetPage(buf);
(gdb) p tupindex
$2 = 2
2.4 Get the page and its free space
(gdb) n
1501            freespace = PageGetHeapFreeSpace(page);
(gdb)
2.5 Release the buffer and record the free space
(gdb)
1503            UnlockReleaseBuffer(buf);
(gdb)
1504            RecordPageWithFreeSpace(onerel, tblk, freespace);
(gdb)
1505            npages++;
(gdb)
lazy_vacuum_page
Step into the lazy_vacuum_page function
1496            tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats,
(gdb) p tblk
$3 = 30
(gdb) p buf
$4 = 178
(gdb) p tupindex
$5 = 2
(gdb)
(gdb) step
lazy_vacuum_page (onerel=0x7f4c70d95570, blkno=30, buffer=178, tupindex=2, vacrelstats=0x18676a8, vmbuffer=0x7fffaef4a19c) at vacuumlazy.c:1535
1535        Page        page = BufferGetPage(buffer);
(gdb)
Input parameters: the block number, the buffer number, the index into the dead-tuple array, and vacrelstats (statistics plus auxiliary information such as the dead-tuple array)
(gdb) p vacrelstats->dead_tuples[0]
$6 = {ip_blkid = {bi_hi = 0, bi_lo = 29}, ip_posid = 168}
1. Initialize related variables
(gdb) n
1537        int         uncnt = 0;
(gdb)
1541        pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
(gdb)
1543        START_CRIT_SECTION();
(gdb)
1545        for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
(gdb) p page
$7 = (Page) 0x7f4c44f46380 "\001"
(gdb) p *page
$8 = 1 '\001'
(gdb) p *(PageHeader *) page
$9 = (PageHeader) 0x4ec2441800000001
(gdb) p *(PageHeader) page
$10 = {pd_lsn = {xlogid = 1, xrecoff = 1321354264}, pd_checksum = 0, pd_flags = 1, pd_lower = 1188, pd_upper = 7856,
  pd_special = 8192, pd_pagesize_version = 8196, pd_prune_xid = 0, pd_linp = 0x7f4c44f46398}
(gdb)
2. Traverse the array of dead tuples
2.1 Get the block number; if it differs from the current block, leave the loop
2.2 Get the offset and the line pointer
2.3 Mark the line pointer as unused and record the offset
(gdb) n
1551            tblk = ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);
(gdb)
1552            if (tblk != blkno)
(gdb) p tblk
$11 = 30
(gdb) n
1554            toff = ItemPointerGetOffsetNumber(&vacrelstats->dead_tuples[tupindex]);
(gdb) p vacrelstats->dead_tuples[tupindex]
$12 = {ip_blkid = {bi_hi = 0, bi_lo = 30}, ip_posid = 162}
(gdb) n
1555            itemid = PageGetItemId(page, toff);
(gdb) p toff
$13 = 162
(gdb) n
1556            ItemIdSetUnused(itemid);
(gdb) p itemid
$14 = (ItemId) 0x7f4c44f4661c
(gdb) p *itemid
$15 = {lp_off = 0, lp_flags = 3, lp_len = 0}
(gdb) n
1557            unused[uncnt++] = toff;
(gdb)
1545        for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
(gdb)
3. Call PageRepairFragmentation to defragment the page
3.1 Sanity checks (rigorous coding!)
(gdb) b vacuumlazy.c:1560
Breakpoint 2 at 0x6be604: file vacuumlazy.c, line 1560.
(gdb) c
Continuing.

Breakpoint 2, lazy_vacuum_page (onerel=0x7f4c70d95570, blkno=30, buffer=178, tupindex=5, vacrelstats=0x18676a8, vmbuffer=0x7fffaef4a19c) at vacuumlazy.c:1560
1560        PageRepairFragmentation(page);
(gdb)
(gdb) step
PageRepairFragmentation (page=0x7f4c44f46380 "\001") at bufpage.c:481
481         Offset      pd_lower = ((PageHeader) page)->pd_lower;
(gdb) n
482         Offset      pd_upper = ((PageHeader) page)->pd_upper;
(gdb)
483         Offset      pd_special = ((PageHeader) page)->pd_special;
(gdb)
500         if (pd_lower < SizeOfPageHeaderData ||
(gdb) p pd_lower
$17 = 1188
(gdb) p pd_upper
$18 = 7856
(gdb) p pd_special
$19 = 8192
(gdb) n
501             pd_lower > pd_upper ||
(gdb)
502             pd_upper > pd_special ||
(gdb)
504             pd_special != MAXALIGN(pd_special))
(gdb)
503             pd_special > BLCKSZ ||
3.2 Get the maximum offset number and initialize variables
(gdb)
513         nline = PageGetMaxOffsetNumber(page);
(gdb) n
514         itemidptr = itemidbase;
(gdb)
515         nunused = totallen = 0;
(gdb) p nline
$20 = 291
(gdb) p *itemidptr
$21 = {offsetindex = 162, itemoff = 8144, alignedlen = 48}
(gdb)
3.3 Traverse the line pointer array
3.3.1 Get the line pointer lp
3.3.2 If the ItemId is in use, record it in the itemidbase array; otherwise, mark the ItemId as unused
(gdb)
516         for (i = FirstOffsetNumber; i