What are the defects of PostgreSQL in real-world scenarios?
This article introduces PostgreSQL's defects in real-world scenarios. Many people run into these situations in practice, so let the editor walk you through how to deal with each of them. I hope you read carefully and come away with something useful!
Defect 1: disastrous XID wraparound
To be honest, this flaw is a real headache, and I suggest you read more about it. It has caused many prolonged outages, some lasting several days. If you search for enough material on Google, you will find plenty of people who have stepped on this landmine. Almost every PostgreSQL operator without deep expert experience eventually runs into it.
Maybe at some point in the future XIDs will transition to 64-bit integers, but until then we still have to live with this challenge. The good news is that, unlike a fault baked into aircraft software, this failure can be avoided: monitor transaction ID age and make sure vacuuming keeps up.
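As a minimal sketch of such monitoring, using the standard age() function and the pg_database catalog:
-- Age is measured in transactions; autovacuum forces anti-wraparound freezing
-- once age(datfrozenxid) exceeds autovacuum_freeze_max_age (200 million by default).
-- Alert well before any database approaches that threshold.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;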
Defect 2: failover will probably lose data
If a running primary server fails suddenly, a streaming replication setup will almost certainly lose committed data. Some will say, "that's the price of asynchronous replication," but it does not have to be. PostgreSQL does support synchronous replication with quorum commit for fault-tolerant durability, but its tighter performance envelope imposes more complex design requirements on any application that wants to keep the impact small.
In this mode, although waiting does not consume system resources, transaction locks continue to be held until the transfer is confirmed. As a result, incautious use of synchronous replication can degrade the performance of database applications through increased response times and contention.
Quorum commit over synchronous replication is useful in some cases, but I do not recommend it for general use. The mechanism is somewhat similar to Kafka's ISR replication with acks=all and min.insync.replicas configured, but it differs in the subtleties of running arbitrary queries and in the transaction-processing semantics of the target. I am not aware of any deployment that has successfully used quorum commit with failover to achieve both high availability and high durability at significant data scale. If any reader has such a case, I would love to hear about it!
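For reference, a minimal sketch of what enabling quorum commit looks like (PostgreSQL 10+; the standby names here are placeholders):
-- Require confirmation from any one of the listed standbys before commit returns.
ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (replica_a, replica_b)';
SELECT pg_reload_conf();
-- The commit guarantee can also be tuned per session, for example:
SET synchronous_commit = 'remote_apply';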
As far as relational databases go, Galera Cluster's group replication is not perfect either, but it is closer to the ideal. They even encourage geographically distributed replication, which would likely be catastrophic for a PostgreSQL replication setup using quorum commit.
Defect 3: inefficient replication that spreads corruption
By far, streaming replication is the most commonly used replication mechanism in production deployments. It is a form of physical replication, meaning it replicates changes to the on-disk binary data itself.
Whenever a database page (8 KB by default) on disk needs to be modified by a write, even by a single byte, a copy of the entire page, edited with the requested changes, is written to the write-ahead log (WAL). Physical streaming replication piggybacks on this existing WAL infrastructure, shipping it as a change stream to the replicas.
As a result, under physical replication, large index builds create a flood of WAL entries that can easily become a bottleneck for the replication stream. Worse, the page-granularity read-modify-write process means that hardware-induced corruption on the host can propagate to the replicas far more easily, something I have personally witnessed in production.
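To get a feel for the volume involved, one can measure the WAL generated by an operation. A rough sketch for psql (pg_current_wal_lsn and pg_wal_lsn_diff exist in PostgreSQL 10+; big_table is a hypothetical table):
-- Record the WAL position before the operation (\gset stores it in a psql variable).
SELECT pg_current_wal_lsn() AS lsn_before \gset
CREATE INDEX big_table_created_at_idx ON big_table (created_at);
-- Bytes of WAL written by the index build (plus anything else in between).
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), :'lsn_before') AS wal_bytes;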
This is the opposite of logical replication, which replicates only logical data changes. At least in theory, a large index build would then result in only a single command being shipped over the network. Although PostgreSQL has supported logical replication for quite a long time, most deployments use physical streaming replication because it is more robust, more widely supported, and easier to use.
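For reference, a minimal logical replication setup looks like this (PostgreSQL 10+; all names and connection details are placeholders):
-- On the publisher (the current primary):
CREATE PUBLICATION app_pub FOR ALL TABLES;
-- On the subscriber (after creating matching table definitions there):
CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=primary.example.com dbname=app user=replicator'
    PUBLICATION app_pub;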
Defect 4: MVCC garbage collection is frequently painful
Like most mainstream databases, PostgreSQL uses multi-version concurrency control (MVCC) to implement concurrent transactions. However, its particular implementation routinely causes operational pain around garbage row versions and the VACUUM process that cleans them up. In general, an UPDATE creates a new copy (a "row version") of each modified row, leaving the old version on disk until it can be cleared away.
The situation has improved steadily over the years, but it is a complex system and a black box to anyone encountering the problem for the first time. For example, knowing about heap-only tuples (HOT) and when that optimization kicks in can make or break a heavy in-place update workload, such as a continuously updated counter column. The default autovacuum settings work in most cases, but when they fail, the consequences are dire.
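One way to spot trouble brewing is to watch dead row versions accumulate; a sketch using the standard statistics views:
-- Tables with the most dead row versions waiting to be cleaned up by VACUUM.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;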
In contrast, MySQL and Oracle use redo and undo logs; they do not need a similar background garbage-collection process. The tradeoff they make is mainly additional latency in transaction commit and rollback operations.
Despite that latency, an undo-based design of this kind may prove very valuable to PostgreSQL at some point in the distant future.
Defect 5: process-per-connection = pain at scale
PostgreSQL spawns a process for each connection, whereas most other databases use a more efficient connection concurrency model. There is a relatively low threshold (roughly twice the number of CPU cores) beyond which adding connections begins to degrade performance, and eventually a higher threshold, hard to estimate and highly workload dependent, where performance craters. This makes performance tuning more difficult.
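A quick way to see where a deployment stands relative to its configured ceiling, using standard views and settings:
-- Compare open connections against the configured maximum.
SELECT count(*) AS open_connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity;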
The standard workaround of connection pooling certainly helps, but it adds architectural complexity. On one particularly large deployment, I ended up having to layer in a second tier of pgbouncer: one tier ran on the application servers and another on the database servers, aggregating the connections of roughly one million client processes in total. Tuning it was about 40% dark art, 40% brute force, and 10% pure luck.
Process scalability has been improving incrementally with each major release, but in the end the architecture's performance remains limited compared with the thread-per-connection model used by MySQL.
For more technical details, see https://brandur.org/postgres-connections.
Defect 6: primary key indexes waste space
A table in PostgreSQL consists of a primary key index plus a separate row store called the heap. Other databases integrate the two, or support index-organized tables, in which a primary key lookup lands directly on the row data; in PostgreSQL, a lookup must instead go through the index and then fetch the complete row from the heap, with the attendant extra CPU and I/O.
PostgreSQL's CLUSTER command reorganizes a table according to an index to improve performance, but it is impractical for most OLTP workloads. It rewrites the entire table under an exclusive lock, blocking all reads and writes, and PostgreSQL does not maintain the clustered layout for newly written data, so the operation has to be rerun periodically. It is therefore only usable if you can afford to take the table out of service for long stretches.
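For illustration (the table and index names here are hypothetical):
-- CLUSTER takes an exclusive lock, blocking all reads and writes
-- while it rewrites the table in index order.
CLUSTER my_table USING my_table_pkey;
-- New rows are not kept in this order, so the command must be rerun periodically:
CLUSTER my_table;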
More crucially, though, index-organized tables save space, because the index does not need a separate copy of the row data. For tables with small rows that are mostly covered by the primary key, such as join tables, this can easily halve the table's storage.
Consider the following table, which stores a "like" for each social interaction with any kind of object:
CREATE TABLE likes (
    object_type INTEGER NOT NULL,
    object_id BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY,
    user_id BIGINT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY (object_type, object_id, user_id)
);
PostgreSQL maintains the primary key index separately from the base table storage. The index holds a complete copy of the object_type, object_id, and user_id columns for every row, so 20 of the 28 bytes in each row (about 70%) are duplicated. If PostgreSQL supported index-organized tables, none of that extra space would be consumed.
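To see the duplication concretely, compare the heap and index sizes once the table holds data (likes_pkey is the index name PostgreSQL generates for the primary key by default):
-- Heap and primary key index are separate relations with separate storage.
SELECT pg_size_pretty(pg_relation_size('likes'))      AS heap_size,
       pg_size_pretty(pg_relation_size('likes_pkey')) AS pkey_index_size;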
Defect 7: major version upgrades can require downtime
For a large database, some major version upgrades require hours of downtime while the data is migrated. With a typical streaming replication setup, you cannot do this gracefully by upgrading a replica and failing over: the on-disk binary format is incompatible across major versions, so the wire protocol between primary and replica is effectively incompatible as well.
Hopefully logical replication will eventually replace streaming replication entirely, which would let users adopt an online rolling-upgrade strategy. In a previous large-scale horizontal deployment, we made a significant engineering investment in custom infrastructure, using an additional trigger-based replication system (also used for shard migration), to perform these upgrades without downtime.
Defect 8: somewhat cumbersome replication setup
To be fair, MySQL's out-of-the-box replication is considerably more troublesome, but compared with some NoSQL stores (such as MongoDB and Redis) or some cluster-oriented replication systems (such as MySQL Group Replication and Galera Cluster), setting up replication in PostgreSQL still leaves much to be desired in terms of ease of use and avoidance of edge cases. In theory, logical replication gives third-party solutions the flexibility to fill these gaps, but so far, using it in place of streaming replication comes with substantial caveats.
Defect 9: no planner hints mechanism
Many databases offer planner hints that let you steer the query planner toward an execution strategy it would not otherwise choose. But the PostgreSQL development team has refused to support planner hints for years, seemingly regarding them as a crutch, like a smarter form of compiler flags.
I do understand their reasoning, which is chiefly to prevent users from attacking problems with query hints that ought to be fixed by writing proper queries. However, that philosophy becomes infuriating when you watch a production database collapse under a sudden, unexpected change in query plan, with no hints available and no idea what to do.
In many such cases, a hint could mitigate the problem within minutes and buy the engineering team the hours or even days they need to craft a proper fix for the query. There are some indirect workarounds that involve disabling certain query planner strategies, but they are risky and should never be used without restraint.
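One such workaround, scoped as narrowly as possible (enable_seqscan is a standard planner setting; the query here is purely illustrative):
-- SET LOCAL confines the change to the current transaction only.
BEGIN;
SET LOCAL enable_seqscan = off;  -- steer the planner away from sequential scans
EXPLAIN SELECT * FROM likes WHERE user_id = 42;
COMMIT;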
Of course, it would be ideal if both needs could be satisfied at once.
Defect 10: no block compression
InnoDB's page compression in MySQL usually halves storage and is nearly "free" from a performance standpoint. PostgreSQL automatically compresses only individual large values (via TOAST), which does not help the most common way data is stored in a relational database, where particularly large values are rare. For most RDBMS use cases, a row is typically a few hundred bytes or less, which means compression only really pays off when applied across many rows at once, in blocks.
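A small sketch showing that only individual large values benefit (TOAST compression kicks in when a single stored value is big enough):
-- A 100,000-character value stored in a table is compressed transparently.
CREATE TEMP TABLE toast_demo (payload TEXT);
INSERT INTO toast_demo VALUES (repeat('x', 100000));
-- Reports far fewer than 100000 bytes for this highly compressible value;
-- typical few-hundred-byte rows see no such benefit.
SELECT pg_column_size(payload) AS stored_bytes FROM toast_demo;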
Block compression is admittedly hard to retrofit onto PostgreSQL's core data structures, but despite some shortcomings, the "hole punch" strategy adopted by MySQL's InnoDB storage engine seems to work well in practice.
The only widely used general-purpose block compression setup in the PostgreSQL world is running it on ZFS, which many people find works well. ZFS on Linux is production-grade these days, but it does bring some administrative overhead that does not exist with more "off-the-shelf" filesystems such as XFS or ext4.
That concludes "what are the defects of PostgreSQL in real-world scenarios". Thank you for reading. If you want to learn more about the industry, follow the site; the editor will keep publishing practical, high-quality articles for you!