What is the use of PG_REWIND in Postgresql 07/19 Update SLTechnology News&Howtos

What is the use of PG_REWIND in Postgresql

2025-07-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/01 Report--

This article mainly introduces the use of PG_REWIND in Postgresql, has a certain reference value, interested friends can refer to, I hope you can learn a lot after reading this article, the following let the editor take you to understand it.

PostgreSQL in the process of operation, if another slave library, or the master database, is no longer synchronized with the master database for some reason, or the master library crash can not come up, what if you are making use of the current master library or standby library to create a twins?

In fact, PG has thought of this problem for a long time. PG has a unique command pg_rewind that can help you and recreate you.

Let's see what pg_rewind can do for us.

Pg_rewind works a bit like rsync. It can seamlessly read different data blocks between the source and destination directories, while duplicate data blocks will no longer be read. This approach is actually a good solution to the above problems, because if master-slave replication is broken on either side, using PG_REWIND can quickly synchronize data from one side of the data you think is complete to the other side without having to do full replication, the biggest advantage is to save time.

Of course, if you have a good chance of knowing checksum (including MYSQL's binlog checksum), you can think of how to know whether the data on these two sides are consistent, the necessary check block. If postgresql wants to use the pg_rewind function, you need to make some of the following settings

1 full_page_writes = on

2 wal_log_hints = on

3 hot_standby = on

4 if you are initializing a database cluster (the postgresql stand-alone is also called a database cluster, don't be confused with the meaning of the real cluster), it is also possible to do data checksum.

The main working principle is that the difference between the source and the destination in the destination cluster is when the data between the two servers begins to be out of sync. Start by knowing these differences.

1 copy using the file system

2 copy the data by establishing a connection with libpq

In addition to copying data files, you also need to copy transaction committed files, pg_xact, configuration files, and so on. Generate the backup label file, specify the wal log point to be restored at the beginning, apply the log after the recovery point, refresh the pg_control file (after the checkpoint is set and refresh the log, the location of the checkpoint is saved in the file pg_control), and finally execute initdb-S to brush the data to disk and close.

Question 1, how does PG_REWIND recognize that the two PG are once managed by primary and standby

In fact, it is identified by database system identifier, and the coding of the same master-slave database system identifier is the same. It also depends on whether version is consistent with catalog version number.

With regard to bakcup label, it includes check point, and subsequent replication depends on this check point point destination cluster so that the WAL logs from the source cluster after CHECKPOINT LOCATION can be applied continuously.

Of course, the principle is not only that, let's do an experiment to see the powerful functions of pg_rewind.

First of all, there are two PG, 192.168.198.120 master library 192.168.198.176 slave library

After data synchronization through pg_basebackup, some related database building, table and data insertion are carried out on 192.168.198.120 to see if PG_REWIND can synchronize related data.

Pg_basebackup command is not said, the default everyone will, will not be Baidu, or look at my previous things about this.

1 the data of the two servers below are already consistent, replicated through pg_basebackup

2perform promote operation on 176to simulate the failure of master library and replace master database with slave database, so there is no relationship between master library and slave library (the snooping address of postgresql of 176C needs to be modified, which is the basis of not mentioning it any more)

4 We carried out the operation of creating a pg_ rewind library on 176. at this time, the data of the two libraries are already inconsistent.

Finally, it is executed on 120

Pg_rewind-target-pgdata=/pgdata/data-source-server='host=192.168.198.176 port=5432 user=repl dbname=postgres password=repl'-P-debug

After collecting the data, modify the configuration file of 120 to simulate the failure of the master library and the "slave library" data that has been promoted to the master database.

We're done here, but in fact, some comments say that pg_rewind can do data synchronization. I'm interested in whether pg_rewind can do data synchronization in the end.

First of all, we have to confirm a few points, and on the basis of reaching a consensus, we can continue the following work.

Even if it is data synchronization, it must be that the relationship between the previous two nodes is a master-slave relationship, or at least there is a relationship, if there is no relationship between the two nodes, then this work cannot be done.

2 whether data synchronization requires promote operation as a prerequisite, or modify the recovery.conf file directly

Test 1:

Perform data synchronization, then shut down the slave library, change recover.conf to recover.done, then restart the slave library, change the data of the master library, and use pg_rewind for data synchronization

As a result, the failure can prove that if you want to upgrade the current slave library directly after the master-slave failure, the idea of data synchronization through pg_rewind can cool off.

Test 2 whether it is feasible to synchronize the data of the master database for a long time with the original slave database (the slave library has been promoted to the master database). The long time here actually depends on the amount of data. The following situation is that the pg_wal file cannot be found. Here, you can try to copy the missing pg_wal from the original slave library or turn on archive to ensure that your pg_wal is sufficient.

In the figure, the source master database reports that there is no 0000000400000000000000003F file. Find the original slave library. If the current master database copies this file to the original master database, pg_rewind will work normally.

Therefore, the relevant pg_wal files should also be kept well, and it is best to have archive as the backing for data recovery.

Thank you for reading this article carefully. I hope the article "what is the use of PG_REWIND in Postgresql" shared by the editor will be helpful to you. At the same time, I also hope you will support us and pay attention to the industry information channel. More related knowledge is waiting for you to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.