Principle of PostgreSQL transaction mechanism 07/19 Update SLTechnology News&Howtos

Principle of PostgreSQL transaction mechanism

2025-07-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly explains "the principle of PostgreSQL transaction mechanism". Interested friends may wish to have a look at it. The method introduced in this paper is simple, fast and practical. Next, let the editor take you to learn the principle of PostgreSQL transaction mechanism.

Background note:

Transaction characterized by ACID is an important and basic function of relational database. Understanding the implementation principle of transactions is beneficial not only to the development of database products themselves, but also to the development of applications that use databases. Therefore, taking PostgreSQL as the object, this paper briefly introduces the principle of transaction implementation.

Summary of content:

The principle of transaction implementation can be interpreted as what technology DBMS adopts to ensure the ACID characteristics of transactions. PostgreSQL's implementation techniques for ACID are shown in the table below.

Table 1: four characteristics of transaction ACID and implementation techniques of response

ACID

Realization technology

Atomicity

MVCC

Consistency

Constraints (primary key, foreign key, etc.)

Isolation

MVCC

Persistence

WAL

You can see that MVCC and WAL are the two main technologies that support ACID in PostgreSQL. MVCC and WAL are two mature technologies, which are usually implemented in relational databases, but the specific implementation methods of each database are very different. Let's introduce the basic implementation principles of MVCC and WAL in PostgreSQL.

1. MVCC

MVCC (Multiversion Concurrency Control) is multi-version concurrency control, which can avoid blocking each other between read and write transactions. Compared with the usual blocking technology, it can greatly improve the concurrency performance of business. The principle of MVCC implementation in PostgreSQL can be summarized as follows:

1) multiple line versions of the same logical line in the data file (called Tuple)

2) ID for each row version of the header record to create and delete the row version of the transaction (called xmin and xmax, respectively)

3) the status of each transaction (running, aborting or committing) is recorded in the pg_clog file

4) according to the above data and applying certain rules, each transaction will only see a specific line version.

MVCC read and write transactions can work on different row versions, so they can be executed concurrently without conflict.

Figure 1: an example of data update based on MVCC

2. WAL

When the system goes down unexpectedly, you need to roll back the changes made by the outstanding transaction and ensure that the changes made by the submitted transaction are in effect. The first thing that is easy to do with the MVCC mentioned earlier in PostgreSQL is to set the status of all "running" transactions recorded in all pg_clog files to "aborted", which do not end at the time of downtime. For the second point, you must ensure that the changes are actually written to the persistent storage when the transaction commits. However, it is time-consuming to refresh the transaction modified data directly to disk, so WAL (Write-Ahead Log) is introduced to solve this problem.

The basic principles of WAL are as follows:

1) record the updates in the WAL log before updating the data page

2) Asynchronous refresh the dirty pages and WAL Buffer of data Buffer to disk

3) Buffer Manager ensures that dirty data is never refreshed to disk before the corresponding WAL record

4) flush the WAL log to disk synchronously when the transaction is committed

5) when Checkpoint occurs, flush all dirty pages of the data Buffer to disk

Figure 2: Buffer modification when data is updated

Figure 3: disk synchronization during update submission and Checkpoint

QroomA:

1. Does DDL support transactions in Q:PostgreSQL?

A: yes. DDL in PostgreSQL is handled in a similar way to normal DML and is transactional.

2. Does the processing of BLOB data in Q:PostgreSQL support transactions?

A: yes. The transaction processing of BLOB (bytea or large object) data is not much different from ordinary data, but because of the large BLOB data, transactions involving BLOB will result in large WAL log files.

3. How is the large data in Q:PostgreSQL, such as BLOB, stored in the data page?

A: the default size of the data page is 8K, which may cause a page not to hold the entire data row when there is a lot of big data. In response to this situation, PostgreSQL adopts a technique called TOAST, where larger columns only have a pointer in the row, and the complete data is placed in a separate TOAST table. In the TOAST table, the data is cut into several chunk, and each chunck is stored as a data row.

4. There is no illusion between repeatable readable and serializable isolation levels in Q:PostgreSQL, so what is the difference between them?

A: according to the SQL protocol, being able to avoid phantom reading has met the requirements of the "serializable" isolation level. However, the "serializability" defined by the SQL specification is not serializable in a strict sense, and just avoiding illusory reading does not mean that several transactions executed concurrently can be transformed into several transactions that are executed in a certain order. Repeatable readings in PostgreSQL can avoid illusion but not serializability in the strict sense, but serializability is. By the way, serializability in Oracle is not serializable in the strict sense, it is actually equivalent to repeatable readability in PostgreSQL.

5. What if the transaction ID in Q:PostgreSQL has been allocated?

A: redistribution from scratch (in fact, redistribution starts at 3, 0meme 1 and 2 has been used for a special purpose, which is called transaction rollback). However, this may lead to the problem of transaction ID conflicts, and there are two measures to solve this problem in PostgreSQL. First, periodically clean up the old transaction ID left in the data file and set them to a special value (2). When comparing the new transaction with the old transaction, this special transaction ID is always older than other ordinary transaction ID. This ensures that the scope of the transaction ID in the system will not be too large. Second, when comparing the new and old transactions, it is not simply to compare the arithmetic values of the two transaction ID, but also to take into account the special transaction ID and transaction rollback. For example, according to internal comparison rules, transaction ID 0x00000005 of unsigned INT type is newer than 0xFFFFFFFF. Because the first measure has ensured that the span between transaction ID in the system will not be too large (no more than 2 ^ 31), the 0x00000005 must be the result of the rollback of the transactional ID and the rollback of the 0xFFFFFFFF has not occurred (or one less rollback than 0x00000005).

At this point, I believe you have a deeper understanding of the "principle of PostgreSQL transaction mechanism", might as well come to the actual operation of it! Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.