Shulou (Shulou.com), 05/31 report. Updated 2025-04-21. From: SLTechnology News&Howtos, Database.
This article walks through the implementation logic of the ReserveXLogInsertLocation and CopyXLogRecordToWAL functions in PostgreSQL.
ReserveXLogInsertLocation reserves the appropriate amount of space for an XLOG record, and CopyXLogRecordToWAL copies the XLOG record into that reserved space in the WAL buffer.
I. Data structures
Global variables
    /* flags for the in-progress insertion */
    static uint8 curinsert_flags = 0;

    /*
     * These are used to hold the record header while constructing a record.
     * 'hdr_scratch' is not a plain variable, but is palloc'd at initialization,
     * because we want it to be MAXALIGNed and padding bytes zeroed.
     *
     * For simplicity, it's allocated large enough to hold the headers for any
     * WAL record.
     */
    static XLogRecData hdr_rdt;
    static char *hdr_scratch = NULL;

    #define SizeOfXlogOrigin (sizeof(RepOriginId) + sizeof(char))

    #define HEADER_SCRATCH_SIZE \
        (SizeOfXLogRecord + \
         MaxSizeOfXLogRecordBlockHeader * (XLR_MAX_BLOCK_ID + 1) + \
         SizeOfXLogRecordDataHeaderLong + SizeOfXlogOrigin)

    /* An array of XLogRecData structs, to hold registered data. */
    static XLogRecData *rdatas;
    static int  num_rdatas;     /* entries currently used */
    static int  max_rdatas;     /* allocated size */

    /* whether XLogBeginInsert has been called */
    static bool begininsert_called = false;

    static XLogCtlData *XLogCtl = NULL;

    /*
     * A chain of XLogRecDatas to hold the "main data" of a WAL record, registered
     * with XLogRegisterData(...).
     */
    static XLogRecData *mainrdata_head;
    static XLogRecData *mainrdata_last = (XLogRecData *) &mainrdata_head;
    static uint32 mainrdata_len;    /* total # of bytes in chain */

    /*
     * ProcLastRecPtr points to the start of the last XLOG record inserted by the
     * current backend. It is updated for all inserts. XactLastRecEnd points to
     * end+1 of the last record, and is reset when we end a top-level transaction,
     * or start a new one; so it can be used to tell if the current transaction has
     * created any XLOG records.
     *
     * While in parallel mode, this may not be fully up to date. When committing,
     * a transaction can assume this covers all xlog records written either by the
     * user backend or by any parallel worker which was present at any point during
     * the transaction. But when aborting, or when still in parallel mode, other
     * parallel backends may have written WAL records at later LSNs than the value
     * stored here. The parallel leader advances its own copy, when necessary,
     * in WaitForParallelWorkersToFinish.
     */
    XLogRecPtr  ProcLastRecPtr = InvalidXLogRecPtr;
    XLogRecPtr  XactLastRecEnd = InvalidXLogRecPtr;
    XLogRecPtr  XactLastCommitEnd = InvalidXLogRecPtr;

    /* For WALInsertLockAcquire/Release functions */
    static int  MyLockNo = 0;
    static bool holdingAllLocks = false;

    /*
     * Private, possibly out-of-date copy of shared LogwrtResult.
     * See discussion above.
     */
    static XLogwrtResult LogwrtResult = {0, 0};

    /*
     * The number of bytes in a WAL segment usable for WAL data
     * (that is, excluding page headers).
     */
    static int UsableBytesInSegment;
Macro definition
The flags used by the XLogRegisterBuffer function
    /* flags for XLogRegisterBuffer */
    #define REGBUF_FORCE_IMAGE  0x01    /* force a full-page image */
    #define REGBUF_NO_IMAGE     0x02    /* don't take a full-page image */
    #define REGBUF_WILL_INIT    (0x04 | 0x02)   /* page will be re-initialized at
                                                 * replay (implies NO_IMAGE) */
    #define REGBUF_STANDARD     0x08    /* page follows "standard" page layout,
                                         * (data between pd_lower and pd_upper
                                         * will be skipped) */
    #define REGBUF_KEEP_DATA    0x10    /* include data even if a full-page image
                                         * is taken */

    /* Flag bits for the record being inserted, set using XLogSetRecordFlags(). */
    #define XLOG_INCLUDE_ORIGIN     0x01    /* include the replication origin */
    #define XLOG_MARK_UNIMPORTANT   0x02    /* record not important for durability */

    #define XLogSegmentOffset(xlogptr, wal_segsz_bytes) \
        ((xlogptr) & ((wal_segsz_bytes) - 1))

    /*
     * Calculate the amount of space left on the page after 'endptr'. Beware
     * multiple evaluation!
     */
    #define INSERT_FREESPACE(endptr) \
        (((endptr) % XLOG_BLCKSZ == 0) ? 0 : (XLOG_BLCKSZ - (endptr) % XLOG_BLCKSZ))
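To make the two position macros concrete, here is a minimal standalone sketch that re-declares them outside PostgreSQL. The `XLogRecPtr` typedef, the 8 kB `XLOG_BLCKSZ`, the 16 MB segment size used in the usage note, and the helper names `page_free_space` and `segment_offset` are assumptions made for this example; they are not copied from the server headers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* assumption: a WAL position is a 64-bit byte address */

#define XLOG_BLCKSZ 8192        /* assumption: default WAL page size */

#define XLogSegmentOffset(xlogptr, wal_segsz_bytes) \
    ((xlogptr) & ((wal_segsz_bytes) - 1))

#define INSERT_FREESPACE(endptr) \
    (((endptr) % XLOG_BLCKSZ == 0) ? 0 : (XLOG_BLCKSZ - (endptr) % XLOG_BLCKSZ))

/* Hypothetical helper: bytes left on the WAL page that contains ptr. */
static uint64_t page_free_space(XLogRecPtr ptr)
{
    return INSERT_FREESPACE(ptr);
}

/* Hypothetical helper: offset of ptr inside its segment (size must be a power of two). */
static uint64_t segment_offset(XLogRecPtr ptr, uint64_t seg_size)
{
    assert((seg_size & (seg_size - 1)) == 0);
    return XLogSegmentOffset(ptr, seg_size);
}
```

With the start position 5514538672 that appears in the trace in part III, `page_free_space` gives 4432, the same free space gdb reports there. A position exactly on a page boundary yields 0, which is why the copy loop has to move to the next page before writing.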
XLogRecData
The functions in xloginsert.c construct a chain of XLogRecData structs to represent the final WAL record.

    /*
     * The functions in xloginsert.c construct a chain of XLogRecData structs
     * to represent the final WAL record.
     */
    typedef struct XLogRecData
    {
        struct XLogRecData *next;   /* next struct in chain, or NULL */
        char       *data;           /* start of rmgr data to include */
        uint32      len;            /* length of rmgr data to include */
    } XLogRecData;

II. Source code interpretation
ReserveXLogInsertLocation
Reserve the right amount of space in the WAL (buffer) for a record of the given size. *StartPos is set to the beginning of the reserved section and *EndPos to its end + 1. *PrevPtr is set to the beginning of the previous record; it is used to set the xl_prev field of this record.

    /*
     * Reserves the right amount of space for a record of given size from the WAL.
     * *StartPos is set to the beginning of the reserved section, *EndPos to
     * its end+1. *PrevPtr is set to the beginning of the previous record; it is
     * used to set the xl_prev of this record.
     *
     * This is the performance critical part of XLogInsert that must be serialized
     * across backends. The rest can happen mostly in parallel. Try to keep this
     * section as short as possible, insertpos_lck can be heavily contended on a
     * busy system.
     *
     * NB: The space calculation here must match the code in CopyXLogRecordToWAL,
     * where we actually copy the record to the reserved space.
     */
    static void
    ReserveXLogInsertLocation(int size, XLogRecPtr *StartPos, XLogRecPtr *EndPos,
                              XLogRecPtr *PrevPtr)
    {
        XLogCtlInsert *Insert = &XLogCtl->Insert;   /* insert control structure */
        uint64      startbytepos;   /* start position */
        uint64      endbytepos;     /* end position */
        uint64      prevbytepos;    /* start of the previous record */

        size = MAXALIGN(size);      /* align the size */

        /* All (non xlog-switch) records should contain data. */
        Assert(size > SizeOfXLogRecord);

        /*
         * The duration the spinlock needs to be held is minimized by minimizing
         * the calculations that have to be done while holding the lock. The
         * current tip of reserved WAL is kept in CurrBytePos, as a byte position
         * that only counts "usable" bytes in WAL, that is, it excludes all WAL
         * page headers. The mapping between "usable" byte positions and physical
         * positions (XLogRecPtrs) can be done outside the locked region, and
         * because the usable byte position doesn't include any headers, reserving
         * X bytes from WAL is almost as simple as "CurrBytePos += X".
         */
        SpinLockAcquire(&Insert->insertpos_lck);    /* take the lock */

        startbytepos = Insert->CurrBytePos;
        endbytepos = startbytepos + size;
        prevbytepos = Insert->PrevBytePos;
        /* advance the shared insert position */
        Insert->CurrBytePos = endbytepos;
        Insert->PrevBytePos = startbytepos;

        SpinLockRelease(&Insert->insertpos_lck);    /* release the lock */

        /* output values: map usable byte positions to physical positions */
        *StartPos = XLogBytePosToRecPtr(startbytepos);
        *EndPos = XLogBytePosToEndRecPtr(endbytepos);
        *PrevPtr = XLogBytePosToRecPtr(prevbytepos);

        /*
         * Check that the conversions between "usable byte positions" and
         * XLogRecPtrs work consistently in both directions.
         */
        Assert(XLogRecPtrToBytePos(*StartPos) == startbytepos);
        Assert(XLogRecPtrToBytePos(*EndPos) == endbytepos);
        Assert(XLogRecPtrToBytePos(*PrevPtr) == prevbytepos);
    }

    /*
     * Converts a "usable byte position" to XLogRecPtr. A usable byte position
     * is the position starting from the beginning of WAL, excluding all WAL
     * page headers.
     */
    static XLogRecPtr
    XLogBytePosToRecPtr(uint64 bytepos)
    {
        uint64      fullsegs;
        uint64      fullpages;
        uint64      bytesleft;
        uint32      seg_offset;
        XLogRecPtr  result;

        fullsegs = bytepos / UsableBytesInSegment;
        bytesleft = bytepos % UsableBytesInSegment;

        if (bytesleft < XLOG_BLCKSZ - SizeOfXLogLongPHD)
        {
            /* fits on first page of segment */
            seg_offset = bytesleft + SizeOfXLogLongPHD;
        }
        else
        {
            /* account for the first page on segment with long header */
            seg_offset = XLOG_BLCKSZ;
            bytesleft -= XLOG_BLCKSZ - SizeOfXLogLongPHD;

            fullpages = bytesleft / UsableBytesInPage;
            bytesleft = bytesleft % UsableBytesInPage;

            seg_offset += fullpages * XLOG_BLCKSZ + bytesleft + SizeOfXLogShortPHD;
        }

        XLogSegNoOffsetToRecPtr(fullsegs, seg_offset, wal_segment_size, result);

        return result;
    }
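The mapping from a "usable byte position" to a physical WAL position can be tried outside the server with a small sketch. Everything configurable here is an assumption for the example: 8 kB pages, 16 MB segments, a 24-byte short page header and a 40-byte long page header, plus the lowercase function name `byte_pos_to_rec_ptr`. The arithmetic follows the XLogBytePosToRecPtr listing above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Assumed build constants: 8 kB pages, 16 MB segments, 24/40-byte page headers. */
#define XLOG_BLCKSZ          8192
#define WAL_SEGMENT_SIZE     (16 * 1024 * 1024)
#define SIZE_OF_SHORT_PHD    24
#define SIZE_OF_LONG_PHD     40
#define USABLE_BYTES_IN_PAGE (XLOG_BLCKSZ - SIZE_OF_SHORT_PHD)
#define USABLE_BYTES_IN_SEGMENT \
    ((WAL_SEGMENT_SIZE / XLOG_BLCKSZ) * USABLE_BYTES_IN_PAGE - \
     (SIZE_OF_LONG_PHD - SIZE_OF_SHORT_PHD))

/* Map a position that counts only payload bytes to a physical WAL address. */
static XLogRecPtr byte_pos_to_rec_ptr(uint64_t bytepos)
{
    uint64_t fullsegs = bytepos / USABLE_BYTES_IN_SEGMENT;
    uint64_t bytesleft = bytepos % USABLE_BYTES_IN_SEGMENT;
    uint64_t seg_offset;

    if (bytesleft < XLOG_BLCKSZ - SIZE_OF_LONG_PHD) {
        /* fits on the first page of the segment, after the long header */
        seg_offset = bytesleft + SIZE_OF_LONG_PHD;
    } else {
        /* skip the first page, then count full pages with short headers */
        uint64_t fullpages;

        seg_offset = XLOG_BLCKSZ;
        bytesleft -= XLOG_BLCKSZ - SIZE_OF_LONG_PHD;
        fullpages = bytesleft / USABLE_BYTES_IN_PAGE;
        bytesleft = bytesleft % USABLE_BYTES_IN_PAGE;
        seg_offset += fullpages * XLOG_BLCKSZ + bytesleft + SIZE_OF_SHORT_PHD;
    }

    assert(seg_offset < WAL_SEGMENT_SIZE);
    return fullsegs * (uint64_t) WAL_SEGMENT_SIZE + seg_offset;
}
```

Under these assumptions, the CurrBytePos value 5498377520 seen in the trace in part III maps to 5514538672, which is the StartPos that gdb prints there. Position 0 maps to 40, just past the long header of the very first page.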
CopyXLogRecordToWAL
CopyXLogRecordToWAL is a subroutine of XLogInsertRecord that copies an XLOG record into an already-reserved area of the WAL.

    /*
     * Subroutine of XLogInsertRecord. Copies a WAL record to an already-reserved
     * area in the WAL.
     */
    static void
    CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata,
                        XLogRecPtr StartPos, XLogRecPtr EndPos)
    {
        char       *currpos;        /* current pointer into the WAL buffer */
        int         freespace;      /* free space on the current page */
        int         written;        /* bytes written so far */
        XLogRecPtr  CurrPos;        /* current WAL position */
        XLogPageHeader pagehdr;     /* page header */

        /*
         * Get a pointer to the right place in the right WAL buffer to start
         * inserting to.
         */
        CurrPos = StartPos;
        currpos = GetXLogBuffer(CurrPos);
        freespace = INSERT_FREESPACE(CurrPos);

        /*
         * there should be enough space for at least the first field (xl_tot_len)
         * on this page.
         */
        Assert(freespace >= sizeof(uint32));

        /* Copy record data */
        written = 0;
        while (rdata != NULL)
        {
            char       *rdata_data = rdata->data;
            int         rdata_len = rdata->len;

            while (rdata_len > freespace)
            {
                /*
                 * Write what fits on this page, and continue on the next page.
                 */
                Assert(CurrPos % XLOG_BLCKSZ >= SizeOfXLogShortPHD || freespace == 0);
                memcpy(currpos, rdata_data, freespace);
                rdata_data += freespace;
                rdata_len -= freespace;
                written += freespace;
                CurrPos += freespace;

                /*
                 * Get pointer to beginning of next page, and set the xlp_rem_len
                 * in the page header. Set XLP_FIRST_IS_CONTRECORD.
                 *
                 * It's safe to set the contrecord flag and xlp_rem_len without a
                 * lock on the page. All the other flags were already set when the
                 * page was initialized, in AdvanceXLInsertBuffer, and we're the
                 * only backend that needs to set the contrecord flag.
                 */
                currpos = GetXLogBuffer(CurrPos);
                pagehdr = (XLogPageHeader) currpos;
                pagehdr->xlp_rem_len = write_len - written;
                pagehdr->xlp_info |= XLP_FIRST_IS_CONTRECORD;

                /* skip over the page header */
                if (XLogSegmentOffset(CurrPos, wal_segment_size) == 0)
                {
                    /* first page of a segment: long header */
                    CurrPos += SizeOfXLogLongPHD;
                    currpos += SizeOfXLogLongPHD;
                }
                else
                {
                    /* any other page: short header */
                    CurrPos += SizeOfXLogShortPHD;
                    currpos += SizeOfXLogShortPHD;
                }
                freespace = INSERT_FREESPACE(CurrPos);
            }

            Assert(CurrPos % XLOG_BLCKSZ >= SizeOfXLogShortPHD || rdata_len == 0);
            /* here rdata_len <= freespace, so the rest fits on this page */
            memcpy(currpos, rdata_data, rdata_len);
            currpos += rdata_len;
            CurrPos += rdata_len;
            freespace -= rdata_len;
            written += rdata_len;

            rdata = rdata->next;    /* next chunk of data */
        }
        Assert(written == write_len);

        /*
         * If this was an xlog-switch, it's not enough to write the switch record,
         * we also have to consume all the remaining space in the WAL segment. We
         * have already reserved that space, but we need to actually fill it.
         */
        if (isLogSwitch && XLogSegmentOffset(CurrPos, wal_segment_size) != 0)
        {
            /* An xlog-switch record doesn't contain any data besides the header */
            Assert(write_len == SizeOfXLogRecord);

            /* Assert that we did reserve the right amount of space */
            Assert(XLogSegmentOffset(EndPos, wal_segment_size) == 0);

            /* Use up all the remaining space on the current page */
            CurrPos += freespace;

            /*
             * Cause all remaining pages in the segment to be flushed, leaving the
             * XLog position where it should be, at the start of the next segment.
             * We do this one page at a time, to make sure we don't deadlock
             * against ourselves if wal_buffers < wal_segment_size.
             */
            while (CurrPos < EndPos)
            {
                /*
                 * The minimal action to flush the page would be to call
                 * WALInsertLockUpdateInsertingAt(CurrPos) followed by
                 * AdvanceXLInsertBuffer(...). The page would be left initialized
                 * mostly to zeros, except for the page header (always the short
                 * variant, as this is never a segment's first page).
                 *
                 * The large vistas of zeros are good for compressibility, but the
                 * headers interrupting them every XLOG_BLCKSZ (with values that
                 * differ from page to page) are not. The effect varies with
                 * compression tool, but bzip2 for instance compresses about an
                 * order of magnitude worse if those headers are left in place.
                 *
                 * Rather than complicating AdvanceXLInsertBuffer itself (which is
                 * called in heavily-loaded circumstances as well as this lightly-
                 * loaded one) with variant behavior, we just use GetXLogBuffer
                 * (which itself calls the two methods we need) to get the pointer
                 * and zero most of the page. Then we just zero the page header.
                 */
                currpos = GetXLogBuffer(CurrPos);
                MemSet(currpos, 0, SizeOfXLogShortPHD);     /* zero the header */

                CurrPos += XLOG_BLCKSZ;
            }
        }
        else
        {
            /* Align the end position, so that the next record starts aligned */
            CurrPos = MAXALIGN64(CurrPos);
        }

        if (CurrPos != EndPos)
            elog(PANIC, "space reserved for WAL record does not match what was written");
    }

III. Trace analysis
The test script is as follows:

    drop table t_wal_longtext;
    create table t_wal_longtext(c1 int not null,c2 varchar(3000),c3 varchar(3000),c4 varchar(3000));
    insert into t_wal_longtext(c1,c2,c3,c4)
    select i,rpad('C2-'||i,3000,'2'),rpad('C3-'||i,3000,'3'),rpad('C4-'||i,3000,'4')
    from generate_series(1,7) as i;

ReserveXLogInsertLocation
Insert a row:

    insert into t_wal_longtext(c1,c2,c3,c4) VALUES(8,'C2-8','C3-8','C4-8');

Set a breakpoint and enter ReserveXLogInsertLocation.

    (gdb) b ReserveXLogInsertLocation
    Breakpoint 1 at 0x54d574: file xlog.c, line 1244.
    (gdb) c
    Continuing.

    Breakpoint 1, ReserveXLogInsertLocation (size=74, StartPos=0x7ffebea9d768, EndPos=0x7ffebea9d760, PrevPtr=0x244f4c8) at xlog.c:1244
    1244        XLogCtlInsert *Insert = &XLogCtl->Insert;
    (gdb)
Input parameters:
size=74 is the size of the XLOG record to be inserted; the other three arguments are output values that the function fills in.
Continue execution.
Alignment: 74 -> 80. MAXALIGN rounds the size up to a multiple of 8, because a uint64 occupies 8 bytes.

    (gdb) n
    1249        size = MAXALIGN(size);
    (gdb)
    1252        Assert(size > SizeOfXLogRecord);
    (gdb) p size
    $1 = 80
    (gdb)
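The rounding step can be reproduced with a few lines of C. This is a sketch under the assumption that the maximum alignment is 8 bytes, as on a typical 64-bit platform; `MAXIMUM_ALIGNOF` is defined locally and `max_align` is a hypothetical name, so the block illustrates the same rounding rather than the server's own MAXALIGN macro.

```c
#include <assert.h>
#include <stdint.h>

#define MAXIMUM_ALIGNOF 8   /* assumption: 8-byte maximum alignment */

/* Round len up to the next multiple of MAXIMUM_ALIGNOF (which must be a power of two). */
static uint64_t max_align(uint64_t len)
{
    assert((MAXIMUM_ALIGNOF & (MAXIMUM_ALIGNOF - 1)) == 0);
    return (len + (MAXIMUM_ALIGNOF - 1)) & ~((uint64_t) (MAXIMUM_ALIGNOF - 1));
}
```

`max_align(74)` returns 80, matching the value gdb prints for size, and a value that is already aligned is returned unchanged.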
Inspect the insert control structure (Insert), where:
CurrBytePos = 5498377520, hexadecimal 0x147BA9530
PrevBytePos = 5498377464, hexadecimal 0x147BA94F8
RedoRecPtr = 5514382312, hexadecimal 0x148AECBE8 --> "Latest checkpoint's REDO location" in pg_control

    (gdb) n
    1264        SpinLockAcquire(&Insert->insertpos_lck);
    (gdb)
    1266        startbytepos = Insert->CurrBytePos;
    (gdb) p *Insert
    $2 = {insertpos_lck = 1 '\001', CurrBytePos = 5498377520, PrevBytePos = 5498377464, pad = '\000', RedoRecPtr = 5514382312, forcePageWrites = false, fullPageWrites = true, exclusiveBackupState = EXCLUSIVE_BACKUP_NONE, nonExclusiveBackups = 0, lastBackupStart = 0, WALInsertLocks = 0x7f97d1eeb100}
    (gdb)

Set the corresponding values.
Note that the positions kept in the insert control structure do not count page headers; they count only usable log data bytes, so they are smaller than the corresponding physical WAL positions.

    (gdb) n
    1267        endbytepos = startbytepos + size;
    (gdb)
    1268        prevbytepos = Insert->PrevBytePos;
    (gdb)
    1269        Insert->CurrBytePos = endbytepos;
    (gdb)
    1270        Insert->PrevBytePos = startbytepos;
    (gdb)
    1272        SpinLockRelease(&Insert->insertpos_lck);
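The reservation step traced here amounts to a handful of assignments under a lock. The following toy model illustrates the idea and is not PostgreSQL code: `InsertState` and `reserve` are hypothetical names, and a C11 `atomic_flag` stands in for the server's spinlock insertpos_lck.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the insert control structure: only the fields used here. */
typedef struct InsertState {
    atomic_flag lock;           /* plays the role of insertpos_lck */
    uint64_t    curr_byte_pos;  /* tip of reserved WAL, in usable bytes */
    uint64_t    prev_byte_pos;  /* start of the previously reserved record */
} InsertState;

/* Reserve size bytes; outputs are usable byte positions, not physical addresses. */
static void reserve(InsertState *ins, uint64_t size,
                    uint64_t *start, uint64_t *end, uint64_t *prev)
{
    assert(size % 8 == 0);      /* the caller has already aligned the size */

    /* spin until the flag was previously clear, i.e. we now own the lock */
    while (atomic_flag_test_and_set_explicit(&ins->lock, memory_order_acquire))
        ;

    *start = ins->curr_byte_pos;
    *end = *start + size;
    *prev = ins->prev_byte_pos;
    ins->curr_byte_pos = *end;      /* the next record starts where this one ends */
    ins->prev_byte_pos = *start;    /* this record becomes "previous" for the next */

    atomic_flag_clear_explicit(&ins->lock, memory_order_release);
}
```

Starting from CurrBytePos = 5498377520 and PrevBytePos = 5498377464 and reserving 80 bytes, the sketch yields start 5498377520, end 5498377600 and prev 5498377464. Only plain additions happen inside the critical section; the more expensive mapping to physical positions is left to the caller, outside the lock.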
As mentioned earlier, the "usable byte positions" have to be converted to XLogRecPtr values.
Calculate the actual start / end / previous positions:
StartPos = 5514538672, hexadecimal 0x148B12EB0
EndPos = 5514538752, hexadecimal 0x148B12F00
PrevPtr = 5514538616, hexadecimal 0x148B12E78

    (gdb) n
    1274        *StartPos = XLogBytePosToRecPtr(startbytepos);
    (gdb)
    1275        *EndPos = XLogBytePosToEndRecPtr(endbytepos);
    (gdb)
    1276        *PrevPtr = XLogBytePosToRecPtr(prevbytepos);
    (gdb)
    1282        Assert(XLogRecPtrToBytePos(*StartPos) == startbytepos);
    (gdb) p *StartPos
    $4 = 5514538672
    (gdb) p *EndPos
    $5 = 5514538752
    (gdb) p *PrevPtr
    $6 = 5514538616
    (gdb)

The assertions that check the conversion in both directions pass.

    (gdb) n
    1283        Assert(XLogRecPtrToBytePos(*EndPos) == endbytepos);
    (gdb)
    1284        Assert(XLogRecPtrToBytePos(*PrevPtr) == prevbytepos);
    (gdb)
    1285    }
    (gdb)
    XLogInsertRecord (rdata=0xf9cc70, fpw_lsn=5514538520, flags=1 '\001') at xlog.c:1072
    1072            inserted = true;
    (gdb)
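The reverse direction, from a physical position back to a usable byte position, can be sketched in the same spirit. This block is an assumption-laden illustration rather than a copy of the server's XLogRecPtrToBytePos: it assumes 8 kB pages, 16 MB segments, 24-byte short and 40-byte long page headers, and uses the hypothetical name `rec_ptr_to_byte_pos`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Assumed build constants: 8 kB pages, 16 MB segments, 24/40-byte page headers. */
#define XLOG_BLCKSZ          8192
#define WAL_SEGMENT_SIZE     (16 * 1024 * 1024)
#define SIZE_OF_SHORT_PHD    24
#define SIZE_OF_LONG_PHD     40
#define USABLE_BYTES_IN_PAGE (XLOG_BLCKSZ - SIZE_OF_SHORT_PHD)
#define USABLE_BYTES_IN_SEGMENT \
    ((WAL_SEGMENT_SIZE / XLOG_BLCKSZ) * USABLE_BYTES_IN_PAGE - \
     (SIZE_OF_LONG_PHD - SIZE_OF_SHORT_PHD))

/* Strip all page headers that precede ptr and return the payload-only position. */
static uint64_t rec_ptr_to_byte_pos(XLogRecPtr ptr)
{
    uint64_t fullsegs = ptr / WAL_SEGMENT_SIZE;
    uint64_t fullpages = (ptr % WAL_SEGMENT_SIZE) / XLOG_BLCKSZ;
    uint64_t offset = ptr % XLOG_BLCKSZ;
    uint64_t result = fullsegs * (uint64_t) USABLE_BYTES_IN_SEGMENT;

    if (fullpages == 0) {
        /* first page of the segment: only the long header precedes the data */
        if (offset > 0) {
            assert(offset >= SIZE_OF_LONG_PHD);
            result += offset - SIZE_OF_LONG_PHD;
        }
    } else {
        /* first page's payload, then whole pages, then the partial page */
        result += (XLOG_BLCKSZ - SIZE_OF_LONG_PHD) +
                  (fullpages - 1) * USABLE_BYTES_IN_PAGE;
        if (offset > 0) {
            assert(offset >= SIZE_OF_SHORT_PHD);
            result += offset - SIZE_OF_SHORT_PHD;
        }
    }
    return result;
}
```

Feeding in the three positions from the trace (5514538672, 5514538752, 5514538616) gives back 5498377520, 5498377600 and 5498377464, which are the start, end and previous byte positions from the reservation step. That is the round trip the three assertions check.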
DONE!
CopyXLogRecordToWAL, scenario 1: the record does not cross a WAL page boundary
The test script is as follows:

    insert into t_wal_longtext(c1,c2,c3,c4) VALUES(8,'C2-8','C3-8','C4-8');

Continue tracing the previous SQL statement.
Set a breakpoint and enter CopyXLogRecordToWAL.

    (gdb) b CopyXLogRecordToWAL
    Breakpoint 3 at 0x54dcdf: file xlog.c, line 1479.
    (gdb) c
    Continuing.

    Breakpoint 3, CopyXLogRecordToWAL (write_len=74, isLogSwitch=false, rdata=0xf9cc70, StartPos=5514538672, EndPos=5514538752) at xlog.c:1479
    1479        CurrPos = StartPos;
    (gdb)

Input parameters:
write_len=74 --> size to be written
isLogSwitch=false --> whether this is a log switch (it is not)
rdata=0xf9cc70 --> address of the data to be written
StartPos=5514538672 --> start position
EndPos=5514538752 --> end position

    (gdb) n
    1480        currpos = GetXLogBuffer(CurrPos);
    (gdb)

Get a pointer into the right WAL buffer to determine where to insert.
Step into GetXLogBuffer; the input parameter ptr is 5514538672, the start position.

    (gdb) step
    GetXLogBuffer (ptr=5514538672) at xlog.c:1854
    1854        if (ptr / XLOG_BLCKSZ == cachedPage)
    (gdb) p ptr / 8192
    $7 = 673161
    (gdb)
    (gdb) p cachedPage
    $8 = 673161
    (gdb)

GetXLogBuffer: ptr / XLOG_BLCKSZ == cachedPage, so the cached-page branch is taken.
Note: cachedPage is a static variable; where it is assigned will be analyzed later.

    (gdb) n
    1856            Assert(((XLogPageHeader) cachedPos)->xlp_magic == XLOG_PAGE_MAGIC);
    (gdb)
    1857            Assert(((XLogPageHeader) cachedPos)->xlp_pageaddr == ptr - (ptr % XLOG_BLCKSZ));
    (gdb)
    1858            return cachedPos + ptr % XLOG_BLCKSZ;

GetXLogBuffer: the memory at cachedPos begins with an XLogPageHeader structure.

    (gdb) p *((XLogPageHeader) cachedPos)
    $14 = {xlp_magic = 53400, xlp_info = 5, xlp_tli = 1, xlp_pageaddr = 5514534912, xlp_rem_len = 71}
    (gdb)
    (gdb) x/24bx (0x7f97d29fe000)
    0x7f97d29fe000: 0x98 0xd0 0x05 0x00 0x01 0x00 0x00 0x00
    0x7f97d29fe008: 0x00 0x20 0xb1 0x48 0x01 0x00 0x00 0x00
    0x7f97d29fe010: 0x47 0x00 0x00 0x00 0x00 0x00 0x00 0x00
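The 24 raw bytes and the decoded structure can be cross-checked by hand or with a short decoder. The field offsets used below (magic at 0, info at 2, timeline at 4, page address at 8, remaining length at 16) and little-endian byte order are assumptions inferred from this dump; `PageHeaderView`, `read_le` and `decode_page_header` are hypothetical names for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of the page header fields printed by gdb. */
typedef struct PageHeaderView {
    uint16_t magic;
    uint16_t info;
    uint32_t tli;
    uint64_t pageaddr;
    uint32_t rem_len;
} PageHeaderView;

/* Read an n-byte little-endian unsigned integer starting at p. */
static uint64_t read_le(const unsigned char *p, size_t n)
{
    uint64_t v = 0;

    assert(n <= 8);
    for (size_t i = n; i > 0; i--)
        v = (v << 8) | p[i - 1];
    return v;
}

/* Decode the first 24 bytes of a WAL page, using the assumed field offsets. */
static PageHeaderView decode_page_header(const unsigned char *raw)
{
    PageHeaderView h;

    h.magic    = (uint16_t) read_le(raw + 0, 2);
    h.info     = (uint16_t) read_le(raw + 2, 2);
    h.tli      = (uint32_t) read_le(raw + 4, 4);
    h.pageaddr = read_le(raw + 8, 8);
    h.rem_len  = (uint32_t) read_le(raw + 16, 4);
    return h;
}
```

Decoding the bytes shown by `x/24bx` gives magic 0xD098 (53400), info 5, timeline 1, page address 0x148B12000 (5514534912) and a remaining length of 0x47 (71), in agreement with the structure gdb printed.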
Back in CopyXLogRecordToWAL, the buffer address is 0x7f97d29feeb0.

    (gdb) n
    1945    }
    (gdb)
    CopyXLogRecordToWAL (write_len=74, isLogSwitch=false, rdata=0xf9cc70, StartPos=5514538672, EndPos=5514538752) at xlog.c:1481
    1481        freespace = INSERT_FREESPACE(CurrPos);
    (gdb)
    (gdb) p currpos
    $16 = 0x7f97d29feeb0 ""
    (gdb)

Calculate the free space and make sure the page has at least 4 bytes left for the first field (xl_tot_len).

    (gdb) n
    1487        Assert(freespace >= sizeof(uint32));
    (gdb) p freespace
    $21 = 4432
    (gdb)

Start copying the record data.

    (gdb) n
    1490        written = 0;        --> tracks the number of bytes written
    (gdb)
    1491        while (rdata != NULL)

For a detailed analysis of rdata, see part IV. Continue execution.

    (gdb) n
    1493            char       *rdata_data = rdata->data;
    (gdb)
    1494            int         rdata_len = rdata->len;
    (gdb)
    1496            while (rdata_len > freespace)
    (gdb) p rdata_len
    $34 = 46
    (gdb) p freespace
    $35 = 4432
    (gdb)

rdata_len < freespace, so the inner loop is skipped. The assertion passes again, and the memory copy is performed.

    (gdb) n
    1536            Assert(CurrPos % XLOG_BLCKSZ >= SizeOfXLogShortPHD || rdata_len == 0);
    (gdb)
    1537            memcpy(currpos, rdata_data, rdata_len);
    (gdb)
    1538            currpos += rdata_len;
    (gdb)
    1539            CurrPos += rdata_len;
    (gdb)
    1540            freespace -= rdata_len;
    (gdb)
    1541            written += rdata_len;
    (gdb)
    1543            rdata = rdata->next;
    (gdb)
    1491        while (rdata != NULL)
    (gdb) p currpos
    $36 = 0x7f97d29feede ""
    (gdb) p CurrPos
    $37 = 5514538718
    (gdb) p freespace
    $38 = 4386
    (gdb) p written
    $39 = 46
    (gdb)

rdata has four parts; continue by writing the second, third and fourth parts.

    ...
    1491        while (rdata != NULL)
    (gdb)
    1493            char       *rdata_data = rdata->data;
    (gdb)
    1494            int         rdata_len = rdata->len;
    (gdb)
    1496            while (rdata_len > freespace)
    (gdb)
    1536            Assert(CurrPos % XLOG_BLCKSZ >= SizeOfXLogShortPHD || rdata_len == 0);
    (gdb)
    1537            memcpy(currpos, rdata_data, rdata_len);
    (gdb)
    1538            currpos += rdata_len;
    (gdb)
    1539            CurrPos += rdata_len;
    (gdb)
    1540            freespace -= rdata_len;
    (gdb)
    1541            written += rdata_len;
    (gdb)
    1543            rdata = rdata->next;
    (gdb)
    1491        while (rdata != NULL)
    (gdb)

All 74 bytes have been written.

    (gdb)
    1545        Assert(written == write_len);
    (gdb) p written
    $40 = 74
    (gdb)

No log switch is needed.
Align CurrPos.

    (gdb) n
    1552        if (isLogSwitch && XLogSegmentOffset(CurrPos, wal_segment_size) != 0)
    (gdb)
    1599            CurrPos = MAXALIGN64(CurrPos);
    (gdb) p CurrPos
    $41 = 5514538746
    (gdb) n
    1602        if (CurrPos != EndPos)
    (gdb) p CurrPos
    $42 = 5514538752
    (gdb)
    (gdb) p 5514538746 % 8
    $44 = 2        --> 6 padding bytes are needed, 5514538746 -> 5514538752

After alignment, CurrPos must equal EndPos; otherwise a PANIC is raised.

    (gdb) p EndPos
    $45 = 5514538752

End of the call.

    (gdb) n
    1604    }
    (gdb)
    XLogInsertRecord (rdata=0xf9cc70, fpw_lsn=5514538520, flags=1 '\001') at xlog.c:1098
    1098            if ((flags & XLOG_MARK_UNIMPORTANT) == 0)
    (gdb)
DONE!
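The copy loop, including the page-crossing branch that this trace did not exercise, can be modelled with a toy WAL made of tiny pages. This is a sketch for illustration only: the 32-byte page, the 8-byte header, the `Chunk` type and the functions `toy_freespace` and `toy_copy` are all made up, and the first header byte of each continued page plays the role of xlp_rem_len.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 32    /* made-up page size, far smaller than a real WAL page */
#define TOY_HDR_SIZE  8     /* made-up page header size */

/* Hypothetical chain element, shaped like XLogRecData. */
typedef struct Chunk {
    struct Chunk *next;
    const char   *data;
    uint32_t      len;
} Chunk;

/* Bytes left on the page containing pos; 0 when pos sits on a page boundary. */
static uint32_t toy_freespace(uint32_t pos)
{
    return (pos % TOY_PAGE_SIZE == 0) ? 0 : TOY_PAGE_SIZE - pos % TOY_PAGE_SIZE;
}

/*
 * Copy the chain into wal starting at start. At every page crossing, store the
 * number of bytes still to come in the first header byte of the new page and
 * skip the header. Returns the position just past the last byte written.
 */
static uint32_t toy_copy(unsigned char *wal, uint32_t wal_size, uint32_t start,
                         const Chunk *c, uint32_t total_len)
{
    uint32_t pos = start;
    uint32_t freespace = toy_freespace(pos);
    uint32_t written = 0;

    for (; c != NULL; c = c->next) {
        const char *src = c->data;
        uint32_t    len = c->len;

        while (len > freespace) {
            /* write what fits on this page, then continue on the next one */
            memcpy(wal + pos, src, freespace);
            src += freespace;
            len -= freespace;
            written += freespace;
            pos += freespace;

            assert(pos + TOY_HDR_SIZE <= wal_size);
            wal[pos] = (unsigned char) (total_len - written);   /* remaining length */
            pos += TOY_HDR_SIZE;
            freespace = toy_freespace(pos);
        }

        assert(pos + len <= wal_size);
        memcpy(wal + pos, src, len);    /* here len <= freespace */
        pos += len;
        freespace -= len;
        written += len;
    }
    assert(written == total_len);
    return pos;
}
```

For example, copying a 10-byte chunk followed by a 20-byte chunk starting at offset 20 of a zeroed 96-byte buffer fills the first page, records 18 remaining bytes in the header of the second page, and finishes at offset 58. The real function additionally distinguishes long and short headers and aligns the final position.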
CopyXLogRecordToWAL, scenario 2: the record crosses a WAL page boundary (to be analyzed later)
IV. Further discussion on WAL Record
In memory, a WAL record is held through rdata, which here is the static variable hdr_rdt of type XLogRecData. The XLOG record is organized as a linked list of XLogRecData entries. This design is convenient: the writer does not need to know the record's structure and simply copies the entries one by one in list order.
rdata consists of four parts:
The first part is XLogRecord + XLogRecordBlockHeader + XLogRecordDataHeaderShort, 46 bytes in total. In the dump below, the block header starts at offset 24 and the data header at offset 44; the bytes in between identify the relation and block.
The second part is xl_heap_header, 5 bytes.
The third part is the tuple data, 20 bytes.
The fourth part is xl_heap_insert, 3 bytes.
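Because the record is just a chain of (pointer, length) pairs, its total size is the sum of the chunk lengths. A minimal sketch, with the struct re-declared locally using standard integer types and a hypothetical helper `chain_total_len`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local re-declaration of the chain element for this example. */
typedef struct XLogRecData {
    struct XLogRecData *next;   /* next struct in chain, or NULL */
    char               *data;   /* start of data to include */
    uint32_t            len;    /* length of data to include */
} XLogRecData;

/* Hypothetical helper: total number of bytes described by the chain. */
static uint32_t chain_total_len(const XLogRecData *rdata)
{
    uint32_t total = 0;

    for (; rdata != NULL; rdata = rdata->next) {
        assert(rdata->len == 0 || rdata->data != NULL);
        total += rdata->len;
    }
    return total;
}
```

For the four parts listed here, 46 + 5 + 20 + 3 gives 74, which matches the write_len passed to CopyXLogRecordToWAL and the xl_tot_len stored in the record header.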
Part 1

    (gdb) p *rdata
    $22 = {next = 0x244f2c0, data = 0x244f4c0 "J", len = 46}
    (gdb) p *(XLogRecord *) rdata->data        --> XLogRecord
    $27 = {xl_tot_len = 74, xl_xid = 2268, xl_prev = 5514538616, xl_info = 0 '\000', xl_rmid = 10 '\n', xl_crc = 1158677949}
    (gdb) p *(XLogRecordBlockHeader *) (0x244f4c0+24)        --> XLogRecordBlockHeader
    $29 = {id = 0 '\000', fork_flags = 32 ' ', data_length = 25}
    (gdb) x/2bx (0x244f4c0+44)        --> XLogRecordDataHeaderShort
    0x244f4ec: 0xff 0x03

Part 2

    (gdb) p *rdata->next
    $23 = {next = 0x244f2d8, data = 0x7ffebea9d830 "\004", len = 5}
    (gdb) p *(xl_heap_header *) rdata->next->data
    $32 = {t_infomask2 = 4, t_infomask = 2050, t_hoff = 24 '\030'}

Part 3

    (gdb) p *rdata->next->next
    $24 = {next = 0x244f2a8, data = 0x24e6a2f "", len = 20}
    (gdb) x/20bc 0x24e6a2f
    0x24e6a2f: 0 '\000' 8 '\b' 0 '\000' 0 '\000' 0 '\000' 11 '\v' 67 'C' 50 '2'
    0x24e6a37: 45 '-' 56 '8' 11 '\v' 67 'C' 51 '3' 45 '-' 56 '8' 11 '\v'
    0x24e6a3f: 67 'C' 52 '4' 45 '-' 56 '8'
    (gdb)

Part 4

    (gdb) p *rdata->next->next->next
    $25 = {next = 0x0, data = 0x7ffebea9d840 "\b", len = 3}
    (gdb)
    (gdb) p *(xl_heap_insert *) rdata->next->data
    $33 = {offnum = 8, flags = 0 '\000'}