What Is the Solution to PostgreSQL's Pain Points?

This article takes a detailed look at PostgreSQL's pain points on Linux and the possible solutions that have been discussed for them.
The kernel must work for a wide range of workloads; it is not surprising that it does not always perform as well as some user communities expect. The PostgreSQL relational database management system project is one community that sometimes feels a little left out. At the invitation of the organizers of the 2014 Linux Storage, Filesystem, and Memory Management Summit, PostgreSQL developers Robert Haas, Andres Freund, and Josh Berkus came to discuss their most painful problems and possible solutions.
PostgreSQL is an old system dating back to 1996, and it is run by many users on a variety of operating systems. As a result, the PostgreSQL developers are limited in the amount of Linux-specific code they can add. The system is based on cooperating processes, not threads, and uses System V shared memory for interprocess communication. Importantly, PostgreSQL maintains its own internal buffers, but it also uses buffered I/O to read and write disk data. This combination of buffering schemes leads to some of the problems experienced by PostgreSQL users.
Slow synchronization
The first problem described concerns how data gets from the buffer cache to disk. PostgreSQL uses a form of journaling that it calls write-ahead logging: changes are first written to the log, and once the log is safely on disk, the main database blocks can be written back. Much of this work is done by a "checkpoint" process, which writes log entries and then flushes a batch of data back to the various files on disk. The log writes are relatively small and contiguous; they work well, and, according to Andres, the PostgreSQL developers are happy with how that part of the system behaves on Linux.
Writing the data back is another matter. The checkpoint process paces these writes to keep the I/O system from being overwhelmed and starving everything else. But when it gets around to calling fsync() to ensure the data is safely written, all of those carefully paced writes are pushed into the request queue at once, causing an I/O storm. The problem, they said, is not that fsync() is too slow; it is too fast. It dumps so much data into the I/O subsystem that everything else, including read requests from applications, is blocked. That creates pain for users and, therefore, for PostgreSQL developers.
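To make the pattern concrete, here is a minimal C sketch of the behavior described above. It is not PostgreSQL's actual checkpoint code; the file name, block size, and pacing interval are illustrative assumptions. The paced pwrite() calls only dirty the page cache, and the single fsync() at the end is what pushes everything to the device at once.

    /* Sketch of the checkpoint write pattern: paced writes, then one
     * big fsync().  Hypothetical file name and sizes. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define BLOCK_SIZE 8192      /* PostgreSQL's default page size */
    #define NBLOCKS    4096      /* ~32 MB of checkpoint data */

    int main(void)
    {
        char block[BLOCK_SIZE];
        memset(block, 'x', sizeof block);

        int fd = open("datafile", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Phase 1: paced writes.  Each pwrite() only dirties the page
         * cache; the usleep() spreads the writes out so background
         * writeback does not overwhelm the device. */
        for (int i = 0; i < NBLOCKS; i++) {
            if (pwrite(fd, block, BLOCK_SIZE, (off_t)i * BLOCK_SIZE) < 0) {
                perror("pwrite");
                return 1;
            }
            usleep(1000);        /* pacing: ~1 ms between blocks */
        }

        /* Phase 2: durability.  fsync() pushes all the dirty pages
         * accumulated above into the request queue at once -- this is
         * the I/O storm that starves concurrent readers. */
        if (fsync(fd) < 0) { perror("fsync"); return 1; }

        close(fd);
        return 0;
    }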
Ted Ts'o asked whether being able to limit the checkpoint process to a specific percentage of the available bandwidth would help. Robert responded that I/O priorities would be better: the checkpoint process should be able to use 100% of the bandwidth when no other process needs it. Using the ionice mechanism (which controls I/O priority in the CFQ scheduler) was suggested, but it, too, is problematic: it does not apply to I/O operations initiated by an fsync() call. Even when the data being flushed was written by the checkpoint process (which is not always the case), the priority is not honored when fsync() actually performs the I/O.
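For reference, this is roughly what using the ionice mechanism from a program looks like. This is a hedged sketch: glibc provides no wrapper for the underlying system call, so the constants below are copied from the kernel's UAPI headers. As the discussion notes, the priority set this way tags only I/O submitted directly by the process; writeback performed later on behalf of fsync() is not governed by it.

    /* Sketch: lowering a process's I/O priority via ioprio_set(),
     * the system call behind ionice(1). */
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #define IOPRIO_CLASS_SHIFT 13
    #define IOPRIO_PRIO_VALUE(class, data) \
        (((class) << IOPRIO_CLASS_SHIFT) | (data))
    #define IOPRIO_CLASS_IDLE  3   /* run only when the disk is idle */
    #define IOPRIO_WHO_PROCESS 1

    int main(void)
    {
        /* Ask CFQ to treat this process (pid 0 == self) as idle-class. */
        if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
                    IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)) < 0) {
            perror("ioprio_set");
            return 1;
        }
        /* The catch: writes submitted now are tagged idle-class, but
         * the writeback triggered by a later fsync() is not. */
        return 0;
    }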
Ric Wheeler suggested that the PostgreSQL developers need better control over the speed at which they write data; Chris Mason added that the O_DATASYNC option could be used to gain better control over when I/O requests are generated. The problem is that this approach requires PostgreSQL to know the speed of the storage device.
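A sketch of what that suggestion might look like in practice, using a hypothetical data file; note that the flag is spelled O_DSYNC in the Linux headers. Each write becomes durable as it completes, so the I/O is spread across the writes rather than deferred to one giant fsync(), but choosing how fast to issue those writes is exactly where knowledge of the device's speed comes in.

    /* Sketch: synchronous data writes via O_DSYNC, spreading the
     * durability cost across the writes. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char block[8192];
        memset(block, 'x', sizeof block);

        int fd = open("datafile", O_WRONLY | O_CREAT | O_DSYNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Each pwrite() here implies a data sync; no storm at the end,
         * but the pacing must match what the device can absorb. */
        for (int i = 0; i < 128; i++) {
            if (pwrite(fd, block, sizeof block,
                       (off_t)i * sizeof block) < 0) {
                perror("pwrite");
                return 1;
            }
        }

        close(fd);
        return 0;
    }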
The discussion came back to I/O priorities. The request queue is maintained by the I/O scheduler, but the schedulers favored by PostgreSQL users tend to avoid CFQ (the Completely Fair Queueing scheduler) entirely, and the alternatives do not implement the I/O priority mechanism at all. Worse, even the schedulers that do offer I/O priorities work with a request queue of limited length. A large flush operation fills that queue quickly, and at that point I/O priorities lose most of their effect: a high-priority request cannot even be enqueued if there is no room for it, so it never receives the expected preferential treatment. It seems that I/O priorities will not solve the problem.
The right solution remains unclear. Ted said that if the PostgreSQL developers could provide a small program that reproduces the problem by generating the same I/O pattern as a running database, kernel developers could experiment with different approaches to a solution. Such a program might be derived from PostgreSQL's initialization scripts, but a standalone tool is what the kernel development community would prefer to see.
Double buffering
PostgreSQL needs to do its own buffering, but it also uses buffered I/O for a variety of reasons. That leads to a problem: database data is often stored in memory twice, once in PostgreSQL's buffer and once in the kernel's page cache. This greatly increases PostgreSQL's memory use, to the detriment of the system as a whole.
Much of that memory waste could be eliminated. Consider the case of a dirty buffer in PostgreSQL's cache that is newer than the corresponding page held in the kernel's page cache. The page cache copy is useless: the only thing that will ever happen to it is being overwritten when PostgreSQL flushes its dirty buffer. In this situation it would be nice if PostgreSQL could tell the kernel to remove the corresponding page from the page cache, but there is no good API for that. According to Andres, calling fadvise() with FADV_DONTNEED can actually cause the indicated pages to be read in; few people understand this behavior well, but all agreed that it should not work that way. Nor can they use madvise() without mapping the files, and doing that from a large number of cooperating processes can be very slow.
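As a sketch of the call sequence PostgreSQL would like to rely on (with a hypothetical helper name): flush the newer private copy, then ask the kernel to drop its stale page cache copy. posix_fadvise() is a real interface; the point of the discussion is that its POSIX_FADV_DONTNEED behavior at the time was surprising.

    /* Sketch: flush our newer copy, then hint that the kernel's page
     * cache copy of this file range is no longer needed. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int drop_stale_cache(int fd, off_t offset, off_t len)
    {
        /* Make sure our dirty buffer has reached the device first... */
        if (fdatasync(fd) < 0) {
            perror("fdatasync");
            return -1;
        }
        /* ...then hint that the cached pages are useless.  Unlike most
         * calls, posix_fadvise() returns the error number directly. */
        int err = posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
        if (err != 0) {
            fprintf(stderr, "posix_fadvise: error %d\n", err);
            return -1;
        }
        return 0;
    }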
It would also be useful to move pages in the opposite direction: PostgreSQL might want to remove a clean page from its own buffer while leaving a copy in the page cache. That might be done with a special write operation that does not actually cause I/O, or with a system call that transfers a physical page into the page cache. There was some vague discussion of what such an interface might look like, but this part of the conversation reached no definite conclusion.
Regressions
Another common problem for PostgreSQL users is that recently added kernel features can cause performance regressions. For example, the transparent huge pages feature brings no benefit to PostgreSQL's workload, but it slows things down significantly. Apparently a great deal of time goes into the compaction code, which shuffles memory around trying to create free huge pages without actually producing many of them. On many systems, the terrible performance problems simply disappear when transparent huge pages are turned off.
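Beyond the system-wide switch (writing "never" to /sys/kernel/mm/transparent_hugepage/enabled), a process can opt its own memory out of transparent huge pages. The sketch below is illustrative, not something PostgreSQL did at the time; the mapping size is an arbitrary stand-in for a buffer pool.

    /* Sketch: opting one mapping out of THP with madvise(). */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 64 * 1024 * 1024;  /* stand-in for a buffer pool */
        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        /* Ask the kernel not to back this range with huge pages,
         * keeping khugepaged and compaction away from it. */
        if (madvise(buf, len, MADV_NOHUGEPAGE) < 0) {
            perror("madvise");
            return 1;
        }
        return 0;
    }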
Mel Gorman replied that if compaction is hurting performance, that is a bug. That said, he has not seen any bugs in transparent huge pages for quite a while. He added that a patch exists that limits the number of processes performing compaction at any given time, but it has not been merged because nobody has ever observed a workload suffering from too many processes running compaction. He thought it might be time to re-examine that particular patch.
Another pain point comes from the zone reclaim feature, whereby the kernel reclaims pages from some memory zones even when the system as a whole is not short of memory. Zone reclaim slows down PostgreSQL workloads; PostgreSQL users usually just disable the feature on their servers. Andres pointed out that he has dealt with zone-reclaim-related performance problems several times as a consultant; it has been a good way for him to make money, but it would still be better if these problems were fixed.
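Disabling zone reclaim amounts to setting the vm.zone_reclaim_mode sysctl to zero. Below is a minimal sketch, written in C only for consistency with the other examples; in practice administrators run `sysctl vm.zone_reclaim_mode=0` or set it in /etc/sysctl.conf.

    /* Sketch: turning off zone reclaim via the sysctl file. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "w");
        if (!f) { perror("fopen"); return 1; }  /* requires root */
        fputs("0\n", f);                        /* 0 == reclaim disabled */
        fclose(f);
        return 0;
    }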
According to Mel, zone reclaim mode was written on the assumption that all of the processes in a system would fit into a single NUMA node. That assumption no longer makes sense; it is outdated, he said, and the default for the option should be changed to "off". Nobody in the room seemed to object, so this may change in the near future.
The PostgreSQL developers noted that, in general, kernel upgrades tend to be scary: the performance characteristics of the Linux kernel vary considerably from one release to the next, which makes upgrading an uncertain business. There was some talk of running PostgreSQL benchmarks on new kernels, but no firm conclusion was reached. On the whole, though, the developers of the two projects were happy to be talking; if nothing else, the discussion represents a new level of communication between the two projects.