Analysis of Old Master Node in PostgreSQL

2025-04-16 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)05/31 Report--

This article analyzes how an Old Master node is handled in PostgreSQL. Many people run into this situation in real deployments, so the following walks through what happens and how to deal with it.

In a PostgreSQL HA environment based on streaming replication, a failure such as a network outage or a hardware fault causes the Standby node to be promoted to Master, while the database on the Old Master node is not damaged. After the fault has been resolved, the Old Master node can become a Standby node of the New Master by using the pg_rewind tool, without having to be rebuilt from a backup.

What exactly does pg_rewind do when you run it?

Zero. Principle

In a PostgreSQL HA environment, after a Standby node is promoted to Master, it switches to a new timeline, for example from 1 to 2. The Old Master node stays on the original timeline, for example 1. How, then, does pg_rewind let the Old Master node read the relevant data from the New Master node and become the new Standby node?

To put it simply, there are the following steps:

1. Determine the last checkpoint at which the data on the New Master and the Old Master was still consistent. At this location, the data on both nodes is exactly the same. It can be found by reading the timeline history files of the New Master and Old Master nodes, which are located in the $PGDATA/pg_wal/ directory and are named XX.history.

2. Starting from the checkpoint obtained in step 1, the Old Master node reads the WAL records in its local log files, finds the blocks that changed after this checkpoint, and stores information such as the block numbers in a linked list.

3. Using the block information obtained in step 2, copy the corresponding blocks from the New Master node and replace the corresponding blocks on the Old Master node.

4. Copy all files other than data files from the New Master node, including configuration files and so on. (Copying the data files as well would be no different from taking a full backup.)

5. The Old Master starts the database and applies WAL records starting from the checkpoint.
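Steps 2 and 3 can be illustrated with a minimal sketch. It assumes the data file is just an in-memory byte array and uses a tiny block size; BlockNode, remember_block, copy_changed_blocks and free_blocks are hypothetical names chosen for illustration and are not pg_rewind functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Tiny block size for illustration only; PostgreSQL's default is 8192. */
#define BLOCK_SIZE 8

/* Hypothetical linked-list node recording one changed block (step 2). */
typedef struct BlockNode
{
    uint32_t          blkno;
    struct BlockNode *next;
} BlockNode;

/* Record blkno in the list unless it is already there; returns the new head. */
BlockNode *
remember_block(BlockNode *head, uint32_t blkno)
{
    BlockNode *node;

    for (node = head; node != NULL; node = node->next)
        if (node->blkno == blkno)
            return head;

    node = malloc(sizeof(BlockNode));
    if (node == NULL)
        abort();                /* out of memory is fatal in this sketch */
    node->blkno = blkno;
    node->next = head;
    return node;
}

/*
 * Step 3: overwrite every remembered block in target with the copy from
 * source. Returns the number of blocks copied.
 */
int
copy_changed_blocks(const BlockNode *head,
                    const unsigned char *source, unsigned char *target)
{
    const BlockNode *node;
    int              copied = 0;

    assert(source != NULL && target != NULL);

    for (node = head; node != NULL; node = node->next)
    {
        size_t offset = (size_t) node->blkno * BLOCK_SIZE;

        memcpy(target + offset, source + offset, BLOCK_SIZE);
        copied++;
    }
    return copied;
}

/* Release the list. */
void
free_blocks(BlockNode *head)
{
    while (head != NULL)
    {
        BlockNode *next = head->next;

        free(head);
        head = next;
    }
}
```

Only the blocks named in the list are transferred, which is why this approach is much cheaper than copying whole data files.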

After the master/standby switchover, the timeline of the New Master node changes to n + 1. With pg_rewind, the Old Master can start synchronizing with the New Master from the point where the two diverged and become the New Standby node.
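How the divergence point can be derived from two timeline histories is sketched below. This is a simplified illustration, not the pg_rewind implementation: TimelineEntry and find_common_ancestor are hypothetical names, and each history is assumed to be already parsed into an array ordered from the oldest timeline to the newest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* 64-bit WAL location */
typedef uint32_t TimeLineID;    /* 32-bit timeline ID */

/* Hypothetical history entry: timeline tli covers WAL from begin to end. */
typedef struct
{
    TimeLineID tli;
    XLogRecPtr begin;
    XLogRecPtr end;
} TimelineEntry;

/*
 * Walk both histories while they agree. Returns the index of the last
 * common entry and stores the divergence point in *divergence, or
 * returns -1 if the histories share no ancestor at all.
 */
int
find_common_ancestor(const TimelineEntry *source, int nsource,
                     const TimelineEntry *target, int ntarget,
                     XLogRecPtr *divergence)
{
    int n = (nsource < ntarget) ? nsource : ntarget;
    int i;

    assert(source != NULL && target != NULL && divergence != NULL);

    for (i = 0; i < n; i++)
    {
        if (source[i].tli != target[i].tli ||
            source[i].begin != target[i].begin)
            break;
    }

    if (i == 0)
        return -1;              /* no shared history */

    /* The histories split where the shorter of the two common entries ends. */
    i--;
    *divergence = (source[i].end < target[i].end) ? source[i].end
                                                  : target[i].end;
    return i;
}
```

For example, if the New Master left timeline 1 at 0/5000028 while the Old Master kept writing on timeline 1 up to 0/5000060, the function reports timeline 1 as the last common timeline and 0/5000028 as the divergence point.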

I. Data structures

XLogRecPtr

A 64-bit address in the WAL record address space.

/*
 * Pointer to a location in the XLOG. These pointers are 64 bits wide,
 * because we don't want them ever to overflow.
 */
typedef uint64 XLogRecPtr;
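A WAL location is usually printed as two 32-bit hexadecimal halves separated by a slash, which is what the "%X/%X" format strings in the pg_rewind source produce. The sketch below shows that split and its inverse; format_lsn and parse_lsn are hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;

/*
 * Write lsn into buf as "HIGH/LOW" in uppercase hex.
 * Returns the number of characters written, as snprintf does.
 */
int
format_lsn(XLogRecPtr lsn, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen, "%" PRIX32 "/%" PRIX32,
                    (uint32_t) (lsn >> 32),  /* high 32 bits */
                    (uint32_t) lsn);         /* low 32 bits */
}

/* Parse "HIGH/LOW" back into a 64-bit location; returns 1 on success. */
int
parse_lsn(const char *text, XLogRecPtr *lsn)
{
    uint32_t hi;
    uint32_t lo;

    assert(text != NULL && lsn != NULL);

    if (sscanf(text, "%" SCNx32 "/%" SCNx32, &hi, &lo) != 2)
        return 0;

    *lsn = ((XLogRecPtr) hi << 32) | lo;
    return 1;
}
```

Because the value is a plain 64-bit integer, two locations can be compared directly with the usual operators such as >= to decide which one is further along in the WAL.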

TimeLineID

Timeline ID

typedef uint32 TimeLineID;

II. Source code interpretation

The pg_rewind source code is relatively simple; see the comments for details.

int
main(int argc, char **argv)
{
    static struct option long_options[] = {
        {"help", no_argument, NULL, '?'},
        {"target-pgdata", required_argument, NULL, 'D'},
        {"source-pgdata", required_argument, NULL, 1},
        {"source-server", required_argument, NULL, 2},
        {"version", no_argument, NULL, 'V'},
        {"dry-run", no_argument, NULL, 'n'},
        {"no-sync", no_argument, NULL, 'N'},
        {"progress", no_argument, NULL, 'P'},
        {"debug", no_argument, NULL, 3},
        {NULL, 0, NULL, 0}
    };                                  // command-line options
    int         option_index;           // option index
    int         c;                      // option character (ASCII code)
    XLogRecPtr  divergerec;             // divergence point
    int         lastcommontliIndex;
    XLogRecPtr  chkptrec;               // checkpoint record location
    TimeLineID  chkpttli;               // timeline
    XLogRecPtr  chkptredo;              // checkpoint REDO location
    size_t      size;
    char       *buffer;                 // buffer
    bool        rewind_needed;          // whether a rewind is required
    XLogRecPtr  endrec;                 // end point
    TimeLineID  endtli;                 // end timeline
    ControlFileData ControlFile_new;    // new control file

    set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_rewind"));
    progname = get_progname(argv[0]);

    /* Process command-line arguments */
    if (argc > 1)
    {
        if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
        {
            usage(progname);
            exit(0);
        }
        if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
        {
            puts("pg_rewind (PostgreSQL) " PG_VERSION);
            exit(0);
        }
    }

    while ((c = getopt_long(argc, argv, "D:nNP", long_options, &option_index)) != -1)
    {
        switch (c)
        {
            case '?':
                fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
                exit(1);

            case 'P':
                showprogress = true;
                break;

            case 'n':
                dry_run = true;
                break;

            case 'N':
                do_sync = false;
                break;

            case 3:
                debug = true;
                break;

            case 'D':           /* -D or --target-pgdata */
                datadir_target = pg_strdup(optarg);
                break;

            case 1:             /* --source-pgdata */
                datadir_source = pg_strdup(optarg);
                break;

            case 2:             /* --source-server */
                connstr_source = pg_strdup(optarg);
                break;
        }
    }

    if (datadir_source == NULL && connstr_source == NULL)
    {
        fprintf(stderr, _("%s: no source specified (--source-pgdata or --source-server)\n"), progname);
        fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
        exit(1);
    }

    if (datadir_source != NULL && connstr_source != NULL)
    {
        fprintf(stderr, _("%s: only one of --source-pgdata or --source-server can be specified\n"), progname);
        fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
        exit(1);
    }

    if (datadir_target == NULL)
    {
        fprintf(stderr, _("%s: no target data directory specified (--target-pgdata)\n"), progname);
        fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
        exit(1);
    }

    if (optind < argc)
    {
        fprintf(stderr, _("%s: too many command-line arguments (first is \"%s\")\n"),
                progname, argv[optind]);
        fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
        exit(1);
    }

    /*
     * Don't allow pg_rewind to be run as root, to avoid overwriting the
     * ownership of files in the data directory. We need only check for root
     * -- any other user won't have sufficient permissions to modify files in
     * the data directory.
     */
#ifndef WIN32
    if (geteuid() == 0)
    {
        // running as root
        fprintf(stderr, _("cannot be executed by \"root\"\n"));
        fprintf(stderr, _("You must run %s as the PostgreSQL superuser.\n"), progname);
        exit(1);
    }
#endif

    get_restricted_token(progname);

    /* Set mask based on PGDATA permissions */
    if (!GetDataDirectoryCreatePerm(datadir_target))
    {
        fprintf(stderr, _("%s: could not read permissions of directory \"%s\": %s\n"),
                progname, datadir_target, strerror(errno));
        exit(1);
    }

    umask(pg_mode_mask);

    /* Connect to remote server */
    if (connstr_source)
        libpqConnect(connstr_source);

    /*
     * Ok, we have all the options and we're ready to start. Read in all the
     * information we need from both clusters.
     */
    // read the target control file
    buffer = slurpFile(datadir_target, "global/pg_control", &size);
    digestControlFile(&ControlFile_target, buffer, size);
    pg_free(buffer);

    // read the source control file
    buffer = fetchFile("global/pg_control", &size);
    digestControlFile(&ControlFile_source, buffer, size);
    pg_free(buffer);

    sanityChecks();

    /*
     * If both clusters are already on the same timeline, there's nothing to
     * do.
     */
    if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
        ControlFile_source.checkPointCopy.ThisTimeLineID)
    {
        printf(_("source and target cluster are on the same timeline\n"));
        rewind_needed = false;
    }
    else
    {
        // find the divergence point
        findCommonAncestorTimeline(&divergerec, &lastcommontliIndex);
        printf(_("servers diverged at WAL location %X/%X on timeline %u\n"),
               (uint32) (divergerec >> 32), (uint32) divergerec,
               targetHistory[lastcommontliIndex].tli);

        /*
         * Check for the possibility that the target is in fact a direct
         * ancestor of the source. In that case, there is no divergent history
         * in the target that needs rewinding.
         */
        if (ControlFile_target.checkPoint >= divergerec)
        {
            // the target's checkpoint is at or past the divergence point: rewind
            rewind_needed = true;
        }
        else
        {
            // the target's checkpoint is before the divergence point
            ...
        }
    ...
           (uint32) (chkptrec >> 32), (uint32) chkptrec, chkpttli);

    /*
     * Build the filemap, by comparing the source and target data directories.
     */
    // create the filemap
    filemap_create();
    pg_log(PG_PROGRESS, "reading source file list\n");
    fetchSourceFileList();
    pg_log(PG_PROGRESS, "reading target file list\n");
    traverse_datadir(datadir_target, &process_target_file);

    /*
     * Read the target WAL from last checkpoint before the point of fork, to
     * extract all the pages that were modified on the target cluster after
     * the fork. We can stop reading after reaching the final shutdown record.
     * XXX: If we supported rewinding a server that was not shut down cleanly,
     * we would need to replay until the end of WAL here.
     */
    // populate the filemap from the target's WAL
    pg_log(PG_PROGRESS, "reading WAL in target\n");
    extractPageMap(datadir_target, chkptrec, lastcommontliIndex,
                   ControlFile_target.checkPoint);
    filemap_finalize();

    if (showprogress)
        calculate_totals();

    /* this is too verbose even for verbose mode */
    // in debug mode, print the filemap
    if (debug)
        print_filemap();

    /*
     * Ok, we're ready to start copying things over.
     */
    if (showprogress)
    {
        pg_log(PG_PROGRESS, "need to copy %lu MB (total source directory size is %lu MB)\n",
               (unsigned long) (filemap->fetch_size / (1024 * 1024)),
               (unsigned long) (filemap->total_size / (1024 * 1024)));

        fetch_size = filemap->fetch_size;
        fetch_done = 0;
    }

    /*
     * This is the point of no return. Once we start copying things, we have
     * modified the target directory and there is no turning back!
     */
    executeFileMap();

    progress_report(true);

    // create the backup_label file and update the control file
    pg_log(PG_PROGRESS, "\ncreating backup label and updating control file\n");
    createBackupLabel(chkptredo, chkpttli, chkptrec);

    /*
     * Update control file of target. Make it ready to perform archive
     * recovery when restarting.
     *
     * minRecoveryPoint is set to the current WAL insert location in the
     * source server. Like in an online backup, it's important that we recover
     * all the WAL that was generated while we copied the files over.
     */
    // update the control file
    memcpy(&ControlFile_new, &ControlFile_source, sizeof(ControlFileData));

    if (connstr_source)
    {
        // get the current WAL insert location of the source
        endrec = libpqGetCurrentXlogInsertLocation();
        // get the timeline
        endtli = ControlFile_source.checkPointCopy.ThisTimeLineID;
    }
    else
    {
        endrec = ControlFile_source.checkPoint;
        endtli = ControlFile_source.checkPointCopy.ThisTimeLineID;
    }

    // update the control file
    ControlFile_new.minRecoveryPoint = endrec;
    ControlFile_new.minRecoveryPointTLI = endtli;
    ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
    update_controlfile(datadir_target, progname, &ControlFile_new, do_sync);

    pg_log(PG_PROGRESS, "syncing target data directory\n");
    // sync the target data directory to disk
    syncTargetDirectory();

    printf(_("Done!\n"));

    return 0;
}

This concludes the analysis of the Old Master node in PostgreSQL. Thank you for reading.