2025-04-01 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/01 Report
In this issue, the editor explains why the default page size of Linux is 4KB. The article is rich in content and analyzes the question from a professional point of view; I hope you gain something from reading it.
Why's THE Design is a series of articles about design decisions in the field of computing. In each article in the series, we raise a specific question and discuss the advantages and disadvantages of the design, and its impact on specific implementations, from different perspectives. If there is a question you would like covered, you can leave a message at the bottom of the article.
We all know that Linux manages memory on a page-by-page basis. Whether loading data from disk into memory or writing data from memory back to disk, the operating system operates in units of pages; even if we write only a single byte to disk, we still need to flush the entire page containing it.
Linux supports both normal-sized memory pages and huge pages (Huge Page) [^1]. The default page size on most processors is 4KB. Although some processors use 8KB, 16KB, or 64KB as the default, 4KB is still the mainstream default page size for operating systems. Beyond the normal page size, different processors also offer huge pages of various sizes; on x86 processors we can use 2MB pages.
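You can check the page size your own system uses. A minimal sketch (using Python's standard library; on most Linux/x86-64 machines this prints 4096, but other architectures or configurations may report a different power of two):

```python
import mmap
import os

# Ask the kernel for the size of a normal memory page.
# os.sysconf("SC_PAGE_SIZE") and mmap.PAGESIZE report the same value.
page_size = os.sysconf("SC_PAGE_SIZE")
print(page_size)
```

The equivalent from a shell is `getconf PAGE_SIZE`; huge page sizes, if configured, show up under `/proc/meminfo` and `/sys/kernel/mm/hugepages/`.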
The 4KB memory page is really a legacy of history: a size settled on in the 1980s that has been retained to this day. Although today's hardware is far richer than in the past, we still use the page size that was mainstream back then. As shown in the figure below, anyone who has assembled a computer should be familiar with these memory modules:
Figure 1 - Random access memory
Today, a 4KB page may no longer be the best choice; 8KB or 16KB might be better. But 4KB was a trade-off made for the specific circumstances of the past. In this article we should not dwell too much on the number 4KB itself, but rather pay attention to the factors that determined it, so that when we encounter similar scenarios we can weigh the same considerations. We will introduce the following two factors that affect the memory page size:
A page size that is too small results in larger page tables, increasing the lookup time and overhead of the TLB (translation lookaside buffer) during address translation.
A page size that is too large wastes memory space, causes internal fragmentation, and reduces memory utilization.
When the page size was designed last century, both of the factors above were weighed, and 4KB was ultimately chosen as the most common operating system page size. Below we describe each factor's impact on operating system performance in detail.
Page table entries
We introduced virtual memory in Linux in the article "Why Linux needs virtual memory". Each process sees an independent virtual address space. The virtual address space is only a logical concept: a process still needs to access the physical memory behind it, and translating virtual addresses to physical addresses requires each process to hold a page table.
To store the mapping data for the 128 TiB virtual address space of a 64-bit operating system, Linux introduced a four-level page table to assist virtual address translation in 2.6.10 [^2] and a five-level page table structure in 4.11 [^3]. In the future, page tables with even more levels may be introduced to support full 64-bit virtual addresses.
Figure 2 - Four-level page table structure
In the four-level page table structure shown above, the operating system uses the lowest 12 bits of the virtual address as the offset within the page, and divides the remaining 36 bits into four groups of 9 bits, each serving as an index into the table at the corresponding level. Any virtual address can be translated to its corresponding physical address by walking this multi-level page table [^4].
Because the size of the operating system's virtual address space is fixed, and the whole space is divided evenly into N pages of the same size, the page size ultimately determines the hierarchical structure and the number of page table entries in each process: the smaller the virtual page, the more virtual pages and page table entries a single process has.
Because the current virtual page size is 4096 bytes, the last 12 bits of the virtual address can represent any offset within a page. If the page size were reduced to 512 bytes, today's four- or five-level page table structure would become five or six levels, which would not only add extra memory accesses to every translation but also increase the memory occupied by each process's page table entries.
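To make the split concrete, here is a small illustrative sketch of how a 48-bit virtual address decomposes under x86-64 four-level paging: a 12-bit page offset plus four 9-bit indices. The level names (pgd, pud, pmd, pte) follow the Linux convention; the 9 bits per level come from each table page holding 4KB / 8-byte entries = 512 entries, which is also why shrinking the page would force more levels.

```python
PAGE_SHIFT = 12      # log2(4096): bits used for the in-page offset
BITS_PER_LEVEL = 9   # each table page holds 512 entries (4KB / 8-byte entry)

def split_virtual_address(vaddr: int) -> dict:
    """Split a 48-bit virtual address into four table indices + offset."""
    parts = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    shift = PAGE_SHIFT
    # Walk from the lowest level (pte) up to the top level (pgd).
    for level in ("pte", "pmd", "pud", "pgd"):
        parts[level] = (vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1)
        shift += BITS_PER_LEVEL
    return parts

print(split_virtual_address(0x7FFF_DEAD_BEEF))
```

This is only a model of the address arithmetic, not of the actual table walk, which the MMU performs in hardware on each TLB miss.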
Fragmentation
Because memory-mapping hardware works at the granularity of pages, the operating system treats the virtual page as the smallest unit of memory allocation: even if a user program requests only 1 byte, the operating system allocates a whole virtual page for it. As shown in the figure below, if the page size is 16KB, requesting 1 byte wastes ~99.9939% of that page.
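This waste is easy to quantify. A quick sketch computing the wasted fraction of a single page for a 1-byte allocation at a few page sizes:

```python
def wasted_fraction(page_size: int, requested: int = 1) -> float:
    """Fraction of a page left unused when `requested` bytes are allocated."""
    return (page_size - requested) / page_size

# Internal fragmentation for a 1-byte allocation at common page sizes.
for kb in (4, 16, 64):
    size = kb * 1024
    print(f"{kb:>2}KB page: {wasted_fraction(size):.4%} wasted")
```

For a 16KB page this gives (16384 - 1) / 16384 ≈ 99.9939%, the figure quoted above; a 4KB page wastes "only" ~99.9756% for the same request, and real allocators mitigate this by packing many small allocations into one page.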
Figure 3 - Internal fragmentation with large memory pages
As the page size increases, internal fragmentation becomes more and more serious; smaller pages reduce fragmentation and improve memory utilization. Today, memory is rarely the resource that limits a program's operation (most online services need more CPU rather than more memory), but in the last century memory was a scarce and expensive resource, so improving the utilization of that scarce resource was something designers had to consider:
Figure 4 - The price of memory
In the 1980s and 1990s, machines shipped with only 512KB or 2MB of memory at ridiculous prices, whereas several GB is commonplace today [^8]. So although memory utilization still matters, now that memory prices have fallen dramatically, internal fragmentation is no longer a key problem to solve.
Besides memory utilization, larger pages also increase the overhead of memory copying because of Linux's copy-on-write mechanism: when multiple processes share the same memory and one of them writes to a shared virtual page, the whole page is copied. The smaller the page size, the smaller the copy-on-write overhead.
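A toy model makes the page-size dependence visible (this is an illustration of the cost arithmetic, not kernel code): after a fork, pages stay shared until the first write, and a single-byte write still copies the entire page, so the same writes copy far more memory when pages are larger.

```python
def cow_bytes_copied(written_offsets, page_size: int) -> int:
    """Bytes copied under copy-on-write when the given offsets are written.

    Each distinct page touched is copied in full, regardless of how
    few bytes within it were actually modified.
    """
    touched_pages = {off // page_size for off in written_offsets}
    return len(touched_pages) * page_size

# One byte written at each of four widely separated offsets.
writes = [0, 10_000, 200_000, 3_000_000]
for size in (4096, 2 * 1024 * 1024):  # a 4KB page vs a 2MB huge page
    print(f"page size {size}: {cow_bytes_copied(writes, size)} bytes copied")
```

With 4KB pages the four writes copy four pages (16KB); with 2MB huge pages the same writes land in two pages and copy 4MB, roughly 256 times more.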
Summary
As we mentioned above, the 4KB page is a default decided in the last century, and from today's point of view it is probably no longer the ideal choice. Architectures such as arm64 and ia64 already support page sizes of 8KB, 16KB, and beyond; as memory prices fall and system memory grows, larger pages may be the better choice for an operating system. Let's review the two factors that determine the memory page size:
A page size that is too small results in larger page tables, increasing TLB (translation lookaside buffer) lookup time and overhead during address translation, but it also reduces internal fragmentation and improves memory utilization.
A page size that is too large wastes memory space, causes internal fragmentation, and reduces memory utilization, but it reduces the number of page table entries per process and the TLB translation time.
Similar trade-offs are common in system design. To take a not entirely apt example: when deploying services onto a cluster, the resources on each node are limited, and the footprint of a single service affects either the cluster's resource utilization or the system's overhead. If we deploy 32 services each using 1 CPU, we can use the cluster's resources fully, but so many instances bring considerable overhead; if we deploy 4 services each using 8 CPUs, the per-service overhead is small, but it may leave large unusable gaps on the nodes.
This is why the default page size of Linux is 4KB. If you have had similar doubts, the analysis above may help you understand the decision.