What is the reverse mapping mechanism of the Linux kernel?

2025-04-18 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

1. The development of reverse mapping

Early versions of the Linux kernel had no concept of reverse mapping. To find the page table entries (ptes) that map a given physical page, the kernel had to traverse the list of every mm in the system, walk every vma of each mm, and check whether that vma maps the page. This is slow and inefficient: in the worst case every mm must be visited before all the ptes that map the page are found.
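The cost of this brute-force search can be sketched in user space. This is a toy model, not kernel code: the `toy_` types and functions are hypothetical, and each vma is simplified to a contiguous run of physical frames, which real vmas are not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's mm_struct and vm_area_struct. */
struct toy_vma {
    unsigned long first_pfn;   /* first physical frame mapped (simplification) */
    unsigned long nr_pages;    /* number of consecutive frames mapped */
    struct toy_vma *next;      /* next vma in the same mm */
};

struct toy_mm {
    struct toy_vma *vmas;      /* all vmas of this address space */
    struct toy_mm *next;       /* next mm in the system-wide list */
};

/* Does this vma map the given physical frame? */
static bool toy_vma_maps(const struct toy_vma *vma, unsigned long pfn)
{
    return pfn >= vma->first_pfn && pfn - vma->first_pfn < vma->nr_pages;
}

/*
 * Count the vmas that map pfn by scanning every vma of every mm.
 * *visited reports how many vmas had to be inspected along the way.
 */
size_t toy_find_mappers(const struct toy_mm *mm_list, unsigned long pfn,
                        size_t *visited)
{
    size_t found = 0;

    assert(visited != NULL);
    *visited = 0;
    for (const struct toy_mm *mm = mm_list; mm; mm = mm->next) {
        for (const struct toy_vma *vma = mm->vmas; vma; vma = vma->next) {
            (*visited)++;
            if (toy_vma_maps(vma, pfn))
                found++;
        }
    }
    return found;
}
```

Whether or not the page is mapped anywhere, the number of vmas visited equals the total number of vmas in the system, which is the inefficiency described above.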

To fix this, a pointer was added to the page structure that describes each physical page. The pointer leads to an array-like structure that lists every pte mapping the page. Reverse mapping can then find all the ptes easily, but the extra bookkeeping wastes memory.
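A minimal sketch of this per-page pte bookkeeping follows. It assumes a simple linked list where the text describes an array-like structure, and all `toy_` names are hypothetical. It shows both the benefit (unmapping touches only the ptes that matter) and the cost (one extra allocation per mapping).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned long toy_pte_t;      /* 0 means "not present" */

/* One link per pte that maps the page: this is the memory overhead. */
struct toy_pte_chain {
    toy_pte_t *ptep;
    struct toy_pte_chain *next;
};

struct toy_page {
    struct toy_pte_chain *chain;      /* every pte mapping this page */
    unsigned int mapcount;
};

/* Record that *ptep maps page. Returns 0, or -1 if allocation fails. */
int toy_page_add_rmap(struct toy_page *page, toy_pte_t *ptep)
{
    struct toy_pte_chain *link;

    assert(page != NULL && ptep != NULL);
    link = malloc(sizeof(*link));
    if (!link)
        return -1;
    link->ptep = ptep;
    link->next = page->chain;
    page->chain = link;
    page->mapcount++;
    return 0;
}

/* Clear every pte that maps the page and return how many were cleared. */
unsigned int toy_unmap_all(struct toy_page *page)
{
    unsigned int cleared = 0;

    assert(page != NULL);
    while (page->chain) {
        struct toy_pte_chain *link = page->chain;

        page->chain = link->next;
        *link->ptep = 0;              /* tear down this mapping */
        free(link);
        page->mapcount--;
        cleared++;
    }
    return cleared;
}
```

Once `toy_unmap_all` returns, the page's `mapcount` is zero and, in this model, the page could be handed back to the allocator.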

Then, in kernel 2.6, the kernel developers reused the mapping field of the page structure and organized all the vmas that map a page in a red-black tree, which gave rise to the reverse mapping mechanisms for anonymous pages and file pages.
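Reusing one field for two kinds of pages requires a way to tell them apart. The sketch below assumes the common trick of tagging the low bit of an aligned pointer, which is how the kernel is generally understood to mark anonymous pages in the mapping field. The `toy_` names and the exact flag value are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAPPING_ANON ((uintptr_t)0x1)   /* assumed tag bit */

struct toy_anon_vma { int id; };            /* rmap root for anonymous pages */
struct toy_address_space { int id; };       /* rmap root for file pages */

struct toy_page {
    uintptr_t mapping;   /* tagged pointer: anon_vma or address_space */
};

void toy_page_set_anon(struct toy_page *page, struct toy_anon_vma *av)
{
    /* The tag bit is only free if the pointer is at least 2-byte aligned. */
    assert(((uintptr_t)av & TOY_MAPPING_ANON) == 0);
    page->mapping = (uintptr_t)av | TOY_MAPPING_ANON;
}

void toy_page_set_file(struct toy_page *page, struct toy_address_space *as)
{
    assert(((uintptr_t)as & TOY_MAPPING_ANON) == 0);
    page->mapping = (uintptr_t)as;
}

bool toy_page_is_anon(const struct toy_page *page)
{
    return (page->mapping & TOY_MAPPING_ANON) != 0;
}

/* Return the anon_vma of an anonymous page, or NULL for a file page. */
struct toy_anon_vma *toy_page_anon_vma(const struct toy_page *page)
{
    if (!toy_page_is_anon(page))
        return NULL;
    return (struct toy_anon_vma *)(page->mapping & ~TOY_MAPPING_ANON);
}

/* Return the address_space of a file page, or NULL for an anonymous page. */
struct toy_address_space *toy_page_file_mapping(const struct toy_page *page)
{
    if (toy_page_is_anon(page))
        return NULL;
    return (struct toy_address_space *)page->mapping;
}
```

The design choice is that no extra field is needed per page: the same word leads to the structure that holds the vmas, whichever kind of page it is.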

(Figure: reverse mapping of anonymous pages)

(Figure: reverse mapping of file pages)

Later, reverse mapping of anonymous pages ran into efficiency and lock contention problems. This led to the current design, in which anon_vma_chain (avc) structures link the reverse mapping structures of each level and allow finer-grained locking. The development of reverse mapping has thus followed the development of the Linux kernel itself, in a process of continuous optimization and evolution.

2. Application scenarios of reverse mapping

So why do you need a reverse mapping mechanism in the Linux kernel? What kind of problems does it solve?

Imagine the following scenario:

(1) A physical page is mapped by the vmas of several processes. The system is short of memory and needs to reclaim pages, and this page turns out to be a suitable candidate. Can it be returned directly to the buddy system? No. Because the page is shared by several processes, every mapping of the page must be torn down first, and finding those mappings is what reverse mapping does.

(2) Sometimes the contents of one page need to be migrated to another page. This cannot be done in isolation: some processes may have the page mapped in their vmas, so we again need to know which vmas map the page. This is also what reverse mapping does.

In practice, the main users of reverse mapping are memory reclaim and page migration. When either happens, the kernel checks each candidate page to see whether it is mapped; if so, it calls try_to_unmap to remove the page table mappings. This article explains the reverse mapping mechanism starting from the try_to_unmap function.

Looking at other kernel subsystems in detail, we find reverse mapping doing key work in memory reclaim, memory compaction (defragmentation), CMA, huge pages, page migration, and more. Understanding how reverse mapping is implemented in the Linux kernel is therefore the basis for understanding these subsystems; without it, the essence of these technologies stays out of reach. Understanding the reverse mapping mechanism is essential to understanding Linux kernel memory management.

3. Reverse mapping of anonymous pages

Anonymous pages are shared mainly when a parent process forks a child. During fork, the parent's vmas are all copied to the child, and the call path dup_mmap -> anon_vma_fork sets up the child's rmap structures and links them to those of the parent.

The vmas of all child processes that share the parent's pages are linked mainly through the red-black tree in the anon_vma data structure, with anon_vma_chain (avc) structures connecting each vma to its anon_vma (av). Establishing this relationship is fairly involved and touches the vma, avc, and av data structures.
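The relationship between vma, avc, and av can be sketched as follows. This is a simplified user-space model under stated assumptions: plain linked lists stand in for the red-black tree, every `toy_` name is hypothetical, teardown is omitted for brevity, and the fork step only approximates what anon_vma_fork is understood to do (link the child vma into each anon_vma of the parent vma, then give the child an anon_vma of its own).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_vma;
struct toy_avc;

struct toy_anon_vma {
    struct toy_avc *vmas;              /* avcs of every vma linked here */
};

struct toy_avc {                       /* stand-in for anon_vma_chain */
    struct toy_vma *vma;
    struct toy_anon_vma *anon_vma;
    struct toy_avc *next_in_vma;       /* other anon_vmas of the same vma */
    struct toy_avc *next_in_anon_vma;  /* other vmas of the same anon_vma */
};

struct toy_vma {
    struct toy_anon_vma *anon_vma;     /* where newly faulted pages point */
    struct toy_avc *chain;             /* every anon_vma this vma belongs to */
};

/* Link vma and av through a fresh avc. Returns 0, or -1 if out of memory. */
static int toy_link(struct toy_vma *vma, struct toy_anon_vma *av)
{
    struct toy_avc *avc = malloc(sizeof(*avc));

    if (!avc)
        return -1;
    avc->vma = vma;
    avc->anon_vma = av;
    avc->next_in_vma = vma->chain;
    vma->chain = avc;
    avc->next_in_anon_vma = av->vmas;
    av->vmas = avc;
    return 0;
}

/* Give a vma an anon_vma of its own and link the two. */
int toy_anon_vma_prepare(struct toy_vma *vma)
{
    struct toy_anon_vma *av = calloc(1, sizeof(*av));

    if (!av)
        return -1;
    if (toy_link(vma, av) != 0) {
        free(av);
        return -1;
    }
    vma->anon_vma = av;
    return 0;
}

/* Fork step: join every anon_vma of the parent vma, then add a private one. */
int toy_anon_vma_fork(struct toy_vma *child, const struct toy_vma *parent)
{
    assert(child->chain == NULL);
    for (const struct toy_avc *avc = parent->chain; avc; avc = avc->next_in_vma)
        if (toy_link(child, avc->anon_vma) != 0)
            return -1;
    return toy_anon_vma_prepare(child);
}

/* How many vmas would an rmap walk starting from this anon_vma visit? */
size_t toy_count_vmas(const struct toy_anon_vma *av)
{
    size_t n = 0;

    for (const struct toy_avc *avc = av->vmas; avc; avc = avc->next_in_anon_vma)
        n++;
    return n;
}
```

In this model, a page that points at the parent's anon_vma is reachable from both the parent and the child vma after fork, while a page that points at the child's own anon_vma is reachable only from the child.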

A page is associated with its vma during page fault handling, in do_anonymous_page.

When memory is reclaimed or pages are migrated, the kernel path eventually calls:

    try_to_unmap    // mm/rmap.c
    -> rmap_walk
    -> rmap_walk_anon
    -> anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff_start, pgoff_end)
    -> rwc->rmap_one
    -> try_to_unmap_one

For a candidate page, the kernel obtains the anon_vma associated with the page, traverses the red-black tree of that anon_vma to find every vma that shares the page, and calls try_to_unmap_one on each vma to remove the corresponding page table entry.
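The walk can be sketched in user space as below. It assumes a linked list in place of the interval tree and a 4 KiB page size; the `toy_` names, including the example callback `toy_record_hit`, are hypothetical. The address computation (vm_start plus the page offset relative to vm_pgoff) follows the usual way a page offset is translated into a virtual address inside a vma.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KiB pages */

struct toy_vma {
    unsigned long vm_start;   /* first virtual address */
    unsigned long vm_end;     /* one past the last virtual address */
    unsigned long vm_pgoff;   /* page offset that corresponds to vm_start */
};

struct toy_avc {              /* stand-in for anon_vma_chain */
    struct toy_vma *vma;
    struct toy_avc *next;     /* kernel: interval tree node; here: a list */
};

struct toy_anon_vma {
    struct toy_avc *head;
};

/* Callback type, playing the role of rwc->rmap_one. */
typedef void (*toy_rmap_one_t)(struct toy_vma *vma, unsigned long address,
                               void *arg);

static int toy_vma_covers(const struct toy_vma *vma, unsigned long pgoff)
{
    unsigned long nr = (vma->vm_end - vma->vm_start) >> TOY_PAGE_SHIFT;

    return pgoff >= vma->vm_pgoff && pgoff - vma->vm_pgoff < nr;
}

/* Virtual address at which this vma maps the page with offset pgoff. */
unsigned long toy_vma_address(const struct toy_vma *vma, unsigned long pgoff)
{
    assert(toy_vma_covers(vma, pgoff));
    return vma->vm_start + ((pgoff - vma->vm_pgoff) << TOY_PAGE_SHIFT);
}

/* Call rmap_one for every vma that covers pgoff; return how many did. */
size_t toy_rmap_walk_anon(const struct toy_anon_vma *av, unsigned long pgoff,
                          toy_rmap_one_t rmap_one, void *arg)
{
    size_t visited = 0;

    assert(av != NULL && rmap_one != NULL);
    for (const struct toy_avc *avc = av->head; avc; avc = avc->next) {
        if (!toy_vma_covers(avc->vma, pgoff))
            continue;
        rmap_one(avc->vma, toy_vma_address(avc->vma, pgoff), arg);
        visited++;
    }
    return visited;
}

/* Example callback: remember each address where a pte would be cleared. */
struct toy_hits {
    unsigned long addr[8];
    size_t n;
};

void toy_record_hit(struct toy_vma *vma, unsigned long address, void *arg)
{
    struct toy_hits *hits = arg;

    (void)vma;
    if (hits->n < 8)
        hits->addr[hits->n++] = address;
}
```

A real try_to_unmap_one would use the vma's mm and the computed address to walk the page tables and clear the pte; the callback here only records the address.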

4. Reverse mapping of file pages

File pages are shared mainly when multiple processes use the same library, such as libc. The library file is read into the page cache only once and is then mapped into the vma of each process through that process's page tables.

To manage shared file pages, the vmas are kept in the interval tree of the file's address_space. A vma is added to this interval tree at mmap or fork time.

When a page fault occurs on a file mapping, the page is associated with the address_space.

When memory is reclaimed or pages are migrated, the kernel path eventually calls:

    try_to_unmap    // mm/rmap.c
    -> rmap_walk
    -> rmap_walk_file
    -> vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff_start, pgoff_end)
    -> rwc->rmap_one

For each candidate file page that is mapped, the kernel traverses the interval tree of the page's address_space and, for every vma that covers the page, calls try_to_unmap_one to find the pte and remove the mapping.
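A user-space sketch of the file case follows. It assumes a plain array in place of the interval tree, 4 KiB pages, and non-empty vmas; the `toy_` names are hypothetical. The point it illustrates is that different processes may map different windows of the same file, so the same file page can appear at a different virtual address in each vma, or not at all.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KiB pages */

struct toy_vma {
    unsigned long vm_start;   /* first virtual address */
    unsigned long vm_end;     /* one past the last virtual address */
    unsigned long vm_pgoff;   /* file page index mapped at vm_start */
};

struct toy_address_space {
    struct toy_vma **i_mmap;  /* kernel: interval tree; here: a plain array */
    size_t nr_vmas;
};

/* Last file page index covered by a (non-empty) vma. */
static unsigned long toy_vma_last_pgoff(const struct toy_vma *vma)
{
    assert(vma->vm_end > vma->vm_start);
    return vma->vm_pgoff +
           ((vma->vm_end - vma->vm_start) >> TOY_PAGE_SHIFT) - 1;
}

/*
 * Find every vma that maps file page `index` and store the virtual address
 * of that page in each such vma. Returns the number of addresses stored,
 * which never exceeds max.
 */
size_t toy_rmap_walk_file(const struct toy_address_space *mapping,
                          unsigned long index,
                          unsigned long *addrs, size_t max)
{
    size_t n = 0;

    assert(mapping != NULL && (addrs != NULL || max == 0));
    for (size_t i = 0; i < mapping->nr_vmas && n < max; i++) {
        const struct toy_vma *vma = mapping->i_mmap[i];

        if (index < vma->vm_pgoff || index > toy_vma_last_pgoff(vma))
            continue;             /* this vma's window misses the page */
        addrs[n++] = vma->vm_start +
                     ((index - vma->vm_pgoff) << TOY_PAGE_SHIFT);
    }
    return n;
}
```

The linear scan is where an interval tree pays off: it can return only the vmas whose page range overlaps the queried index without visiting the rest.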

5. Reverse mapping of ksm pages

With ksm, the kernel merges pages that have identical content (ksm manages only anonymous pages), marks the page table entries that map the merged page read-only, and frees the original duplicate pages. This can save a lot of memory and is especially useful on hosts that run many virtual machines.
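The merge step can be illustrated with a small user-space sketch. It is not how ksm is implemented (there are no trees, checksums, or locking here), and the `toy_` names are hypothetical; it only shows the idea of comparing contents, sharing one copy, and write-protecting the mappings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

struct toy_pte {
    unsigned char *page;   /* backing page of this mapping */
    bool writable;
};

/*
 * If a and b point at distinct pages with identical content, make both
 * share a's page read-only and return the now-unused duplicate so the
 * caller can free it. Otherwise change nothing and return NULL.
 */
unsigned char *toy_ksm_try_merge(struct toy_pte *a, struct toy_pte *b)
{
    unsigned char *duplicate;

    assert(a != NULL && b != NULL && a->page != NULL && b->page != NULL);
    if (a->page == b->page)
        return NULL;                        /* already shared */
    if (memcmp(a->page, b->page, TOY_PAGE_SIZE) != 0)
        return NULL;                        /* contents differ */

    duplicate = b->page;
    b->page = a->page;
    a->writable = false;                    /* a later write must fault and copy */
    b->writable = false;
    return duplicate;
}
```

Marking both mappings read-only is what keeps sharing safe: a write by either process has to fault first, at which point it can be given a private copy.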

ksm maintains two red-black trees: the stable tree and the unstable tree. Each node in the stable tree is a stable_node. The pages managed by stable_nodes are the merged pages (called kpages), and the page table entries of the pages that share a kpage are marked read-only. Each original candidate page has an rmap_item that describes its reverse mapping (the red-black tree of the rmap_item's anon_vma member holds all the vmas that map the candidate page). When the page is merged, its rmap_item is added to the linked list (hlist) of the corresponding stable tree node.

When memory is reclaimed or pages are migrated, the kernel path eventually calls:

    try_to_unmap    // mm/rmap.c
    -> rmap_walk
    -> rmap_walk_ksm    // mm/ksm.c
    -> hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist)
    -> anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root, 0, ULONG_MAX)
    -> rwc->rmap_one

For a ksm page, reverse mapping first gets the stable tree node of the page, then traverses the node's hlist to get the anon_vma of each rmap_item, and then finds all the vmas in that anon_vma's red-black tree, just as for the anonymous pages described above. Finally try_to_unmap_one finds each pte and removes the mapping.
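The two-level traversal can be sketched as nested loops. This is a toy model with hypothetical `toy_` names: singly linked lists stand in for both the hlist and the anon_vma tree, and each vma just carries an illustrative owner pid so the visiting order is observable.

```c
#include <assert.h>
#include <stddef.h>

struct toy_vma {
    int owner_pid;                   /* illustrative: which process owns it */
    struct toy_vma *next;
};

struct toy_anon_vma {
    struct toy_vma *vmas;            /* kernel: red-black tree of vmas */
};

struct toy_rmap_item {
    struct toy_anon_vma *anon_vma;   /* rmap root of one merged original */
    struct toy_rmap_item *next;      /* kernel: hlist linkage */
};

struct toy_stable_node {
    struct toy_rmap_item *hlist;     /* one rmap_item per merged page */
};

/*
 * Visit every vma reachable from a stable node: outer loop over the
 * node's rmap_items, inner loop over each item's anon_vma. Stores the
 * owner pid of each visited vma and returns how many were stored
 * (at most max).
 */
size_t toy_rmap_walk_ksm(const struct toy_stable_node *node,
                         int *pids, size_t max)
{
    size_t n = 0;

    assert(node != NULL && (pids != NULL || max == 0));
    for (const struct toy_rmap_item *item = node->hlist; item;
         item = item->next) {
        for (const struct toy_vma *vma = item->anon_vma->vmas; vma;
             vma = vma->next) {
            if (n == max)
                return n;
            pids[n++] = vma->owner_pid;
        }
    }
    return n;
}
```

Compared with an ordinary anonymous page, the only extra step is the outer loop: one kpage stands for several original pages, each with its own anon_vma.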
