Example Analysis of Anonymous Inode

2025-01-16 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)05/31 Report--

This article explains anonymous inodes in Linux in detail through worked examples. It is shared here for reference, and I hope you take something away from it.

01 Only the "file" wins people's hearts

When a girl asks you to catch 100 fireflies for her, she is not tormenting you; she has fallen in love with you. When, after countless grudges and mutual hurt, she asks you to catch 100 fireflies again, it must be because she still loves you.

Why? Because that is the trope, distilled from the occasional glance at costume dramas.

The biggest trope in Linux is "everything is a file". If you love someone, catch fireflies for her; if you want to get something done, turn it into a "file".

Since ancient times deep affection has never lasted, so why is it that only the "file" wins people's hearts? Because in user space the most direct form of a file is an fd obtained from open(), and with that fd you can go almost anywhere and do almost anything.

The most common operations on an fd are open, close, mmap, ioctl, poll and so on. mmap maps the object behind the fd into memory, so you can access the contents of the file through pointers. If the underlying object is a framebuffer, V4L2 or DRM device, mmap also lets user space operate directly on video memory and multimedia data; both V4L2 and DRM, for example, support exporting the underlying dma_buf as an fd. poll lets the caller block until an event occurs. As for ioctl, it gives you the flexibility to attach arbitrary control commands to the fd.
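As a minimal sketch of the open-then-mmap pattern described above, the following helper writes a string into a scratch file through a mapped pointer and reads it back through the fd. The function name mmap_roundtrip and the /tmp path template are assumptions made for illustration; they do not come from the original article.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Write `msg` into a scratch file through an mmap'ed pointer, then read it
 * back through the fd with pread(). Returns 0 on success, -1 on error.
 * The /tmp path template is an assumption of this sketch. */
int mmap_roundtrip(const char *msg, char *out, size_t out_len)
{
    assert(msg && out);

    char path[] = "/tmp/fd_demo_XXXXXX";
    size_t len = strlen(msg) + 1;
    if (len > out_len)
        return -1;

    int fd = mkstemp(path);               /* open(): the fd is our handle */
    if (fd < 0)
        return -1;
    unlink(path);                         /* the fd stays valid without the name */

    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(p, msg, len);                  /* plain pointer access to the file */

    ssize_t n = pread(fd, out, len, 0);   /* the same bytes, seen through the fd */
    munmap(p, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}
```

Calling mmap_roundtrip("hello", buf, sizeof buf) should leave "hello" in buf. Note that the file keeps working after unlink(): the fd refers to the open file, not to its name.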

Across processes, Linux supports passing an fd over a Unix-domain socket, which makes shared memory, cross-process dma_buf sharing and similar schemes possible. For example, one process can send an fd with a helper such as send_fd, built on sendmsg() with an SCM_RIGHTS control message.

The other process can then receive the fd with a matching recv_fd helper built on recvmsg().
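The original listings for send_fd and recv_fd are not reproduced here, so the following is a minimal sketch of how such helpers are commonly written with SCM_RIGHTS ancillary data. The names follow the article; the signatures and the one-byte payload are assumptions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send `fd` over the Unix-domain socket `sock` as SCM_RIGHTS ancillary data.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    assert(sock >= 0 && fd >= 0);

    char byte = 'F';                        /* carry at least one byte of data */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {                                 /* correctly aligned control buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg = {0};

    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;           /* "this message carries fds" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from `sock`. Returns the new fd, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg = {0};
    int fd = -1;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;                              /* a new fd number for the same file */
}
```

In practice the two ends of the socket live in different processes, for example after socketpair() followed by fork(), or through a named Unix socket. The receiver gets its own fd number, but it refers to the same open file as the sender's.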

An fd can thus travel freely between processes. It can ultimately refer to a dma_buf and be mmap'ed at the same time, and that dma_buf can in turn be accessed by graphics cards, display controllers, video decoders/encoders and other devices. The fd breaks through device, CPU and process barriers and goes wherever it pleases.

02 inode is the source, file is the running water

If we think of a file as an object, then the inode is its source and corresponds one-to-one with the final object. A dentry is a path alias (a "vest") for the inode; with the "ln" command, for example, we can create many hard-link aliases for the same inode. A file is the running water: each time a process opens the object it gets a new file, and user space receives an "fd" handle through which it operates on the object.

In the classic model, none of inode, dentry and file is absent.

Picture one inode with two dentries: processes A and B open the first dentry, and processes C and D open the second. What changes from one open to the next is the file and the fd; what stays the same is the inode; the dentry alias in the middle is not that important.
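The relationship between names, inodes and open files can be checked from user space. The sketch below creates one file, gives it a second hard-link name, opens each name once and compares inode numbers. The function name two_names_one_inode and the /tmp directory template are illustrative assumptions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create one file with two hard-link names and open each name once.
 * Returns 1 if both names and both fds resolve to the same inode while the
 * fds themselves differ, 0 if not, -1 on error.
 * The directory template under /tmp is an assumption of this sketch. */
int two_names_one_inode(void)
{
    char dir[] = "/tmp/inode_demo_XXXXXX";
    char a[64], b[64];
    struct stat sa, sb, fa, fb;
    int result = -1;

    if (!mkdtemp(dir))
        return -1;
    int n = snprintf(a, sizeof a, "%s/first", dir);
    assert(n > 0 && (size_t)n < sizeof a);
    n = snprintf(b, sizeof b, "%s/second", dir);
    assert(n > 0 && (size_t)n < sizeof b);

    int fd1 = open(a, O_CREAT | O_RDWR, 0600);  /* inode plus first dentry */
    if (fd1 >= 0 && link(a, b) == 0) {          /* second dentry, same inode */
        int fd2 = open(b, O_RDWR);              /* a second open file */
        if (fd2 >= 0) {
            if (stat(a, &sa) == 0 && stat(b, &sb) == 0 &&
                fstat(fd1, &fa) == 0 && fstat(fd2, &fb) == 0)
                result = sa.st_ino == sb.st_ino && fa.st_ino == fb.st_ino &&
                         sa.st_ino == fa.st_ino && sa.st_nlink == 2 &&
                         fd1 != fd2;
            close(fd2);
        }
    }
    if (fd1 >= 0)
        close(fd1);
    unlink(a);
    unlink(b);
    rmdir(dir);
    return result;
}
```

On a file system that supports hard links the function should return 1: two paths, two fds, one inode with a link count of 2.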

But in the classic trio of inode, dentry and file, one member can be absent, and that is the dentry. Sometimes users want all the convenience an fd brings but do not want the inode to leave any path in the file system. Put simply: I want an fd, but one that you cannot find by searching any path from "/" downwards. It does not exist under the root file system; it is a John Doe with no alias, a legend.

For example, the recently famous userfaultfd lets us handle page faults in user space. We first obtain an fd through the userfaultfd system call and can then perform various ioctls on it.

The fd we get from the userfaultfd system call has no path such as /xxx/yyy/zzz in any file system. It corresponds to an anonymous inode with no name, so there is obviously no way to obtain it with fd = open("xxx", ...): "xxx" is a path, and an anonymous inode has none. Instead, the process gets the fd of the anon_inode directly from a system call such as userfaultfd.
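To make the "no path" point concrete, the sketch below creates a userfaultfd, performs the UFFDIO_API handshake, and asks /proc/self/fd what the fd points to. On Linux the link target is expected to be of the form "anon_inode:[userfaultfd]" rather than a path, though the exact text is the kernel's choice. The helper names are hypothetical, and the function reports -1 where userfaultfd is unavailable (older kernels, seccomp filters, or insufficient privilege).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open a userfaultfd. There is no path argument: the system call hands the
 * fd back directly. glibc ships no wrapper, hence syscall(2). */
static int open_uffd(void)
{
    int fd = -1;
#ifdef UFFD_USER_MODE_ONLY
    /* Newer kernels let unprivileged callers in with this flag. */
    fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | UFFD_USER_MODE_ONLY);
#endif
    if (fd < 0)
        fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC);
    return fd;
}

/* Returns 1 if the userfaultfd's /proc link starts with "anon_inode:",
 * 0 if it does not, and -1 if userfaultfd or /proc is unavailable.
 * The link text is copied into `link` (at most len - 1 bytes). */
int userfaultfd_is_anonymous(char *link, size_t len)
{
    assert(link && len > 1);

    int uffd = open_uffd();
    if (uffd < 0)
        return -1;

    /* The first ioctl on the fd must be the UFFDIO_API handshake. */
    struct uffdio_api api = { .api = UFFD_API, .features = 0 };
    if (ioctl(uffd, UFFDIO_API, &api) < 0) {
        close(uffd);
        return -1;
    }

    char proc[64];
    snprintf(proc, sizeof proc, "/proc/self/fd/%d", uffd);
    ssize_t n = readlink(proc, link, len - 1);
    close(uffd);
    if (n < 0)
        return -1;
    link[n] = '\0';
    return strncmp(link, "anon_inode:", 11) == 0;
}
```

The same readlink check works for any fd, which makes it a handy way to tell an anonymous inode from a file that merely had its name removed.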

A man leaves his name as he passes; a wild goose leaves its cry; even the killer signs the wall: "Wu Song, the tiger slayer". But the anon inode is not like that. It is a master of traceless footwork: it gives an fd all its power to roam, yet it never sets foot in the file system. This is what users really need, and forcing it through an open on a dentry would be superfluous.

03 Anonymous inode instances in the kernel

We can now pick any user of anon inode and see how it works. Take userfaultfd, which is implemented as a system call.

The core of that code is a call to anon_inode_getfd_secure(), which creates an anonymous inode and returns an fd handle. Do not forget that this kind of "file" can also have file_operations: userfaultfd_fops is passed as an argument to anon_inode_getfd_secure().

In this way, we can implement our own special "file" logic in the file_operations callbacks such as ioctl, poll and read. This is the stage on which we are free to perform.

One level below anon_inode_getfd_secure() is __anon_inode_getfd(), and one level below that is __anon_inode_getfile().

So in essence the kernel creates an anon_inode, creates a pseudo file on top of it, and finally binds the fd to the file with fd_install(fd, file). Once again, users can do whatever they like with this fd, while the kernel backs it with a different file_operations implementation for each use.

Add a system call, back it with an anon_inode, create a special fd, and let users poll it and ioctl it: the room for imagination grows. This approach is so convenient and flexible that it has become a trope in itself. Examples under the fs directory of the kernel include:

eventfd, eventpoll, fscontext, io_uring, fanotify, inotify, signalfd, timerfd.
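Several of the facilities listed above can be observed from user space. The sketch below creates an eventfd, a timerfd, an epoll instance and a signalfd through their glibc wrappers and prints where /proc/self/fd says each one points. On Linux the targets are expected to look like "anon_inode:[eventfd]"; the helper names report_fd and count_anonymous are assumptions made for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/signalfd.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Print where /proc/self/fd says `fd` points, then close it.
 * Returns 1 if the target starts with "anon_inode:", 0 if not, -1 on error. */
int report_fd(const char *what, int fd)
{
    assert(what);

    char proc[64], link[128];
    if (fd < 0)
        return -1;
    snprintf(proc, sizeof proc, "/proc/self/fd/%d", fd);
    ssize_t n = readlink(proc, link, sizeof link - 1);
    close(fd);
    if (n < 0)
        return -1;
    link[n] = '\0';
    printf("%-8s -> %s\n", what, link);
    return strncmp(link, "anon_inode:", 11) == 0;
}

/* Count how many of four fd-creating calls returned an anonymous inode. */
int count_anonymous(void)
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);

    int hits = 0;
    hits += report_fd("eventfd", eventfd(0, 0)) == 1;
    hits += report_fd("timerfd", timerfd_create(CLOCK_MONOTONIC, 0)) == 1;
    hits += report_fd("epoll", epoll_create1(0)) == 1;
    hits += report_fd("signalfd", signalfd(-1, &mask, 0)) == 1;
    return hits;
}
```

With /proc mounted, count_anonymous() should return 4: none of these fds has a path in any file system.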

As the poem goes: "When autumn comes on the eighth day of the ninth month, my flower blooms and all the others perish. In battle array my fragrance rises sky-high; the whole capital is clad in golden armor." Files, even anonymous ones, fill the whole Linux world with their fragrance.

04 Using anonymous inodes from user space

Before we part, remember that all the user ever sees is the fd, and the anonymous inode is used through it. Let's build an example in which user space handles a page fault; it is simplified directly from the userfaultfd man page. In the main thread we request one page of memory with mmap, then pass the page's start address and length to the kernel through the UFFDIO_REGISTER ioctl on the userfaultfd, telling the kernel that page faults on this page are to be handled in user space.

Then, in the fault_handler_thread thread created by pthread_create(), we poll the userfaultfd to wait for the event and copy a page filled with 0x66 into the page where the fault occurred.

When the program runs, the main thread triggers a page fault as it executes addr[0] = 0x5A5A5A5A. In the fault thread, the blocking poll returns once the fault occurs; the thread then reads a uffd_msg structure with read(), one member of which holds the address of the fault. After that, it copies the page filled with 0x66 into the faulting page with the UFFDIO_COPY ioctl.

So, in the end, when the main thread prints the results, addr[0] reads 5A5A5A5A while addr[1] reads 66666666. Seeing page faults handled so flexibly from user space left my friends floored.
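The original listing is not reproduced here, so what follows is a condensed sketch of the flow just described, modelled on the example in the userfaultfd man page. The function name run_userfault_demo and its out parameter are assumptions made so the result can be checked; error handling is trimmed, and the function returns -1 where userfaultfd is unavailable.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

static size_t page_size;

/* Handler thread: wait for one fault, then resolve it with a 0x66 page. */
static void *fault_handler_thread(void *arg)
{
    int uffd = (int)(intptr_t)arg;
    struct pollfd pfd = { .fd = uffd, .events = POLLIN };
    struct uffd_msg msg;

    void *page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;
    memset(page, 0x66, page_size);

    if (poll(&pfd, 1, -1) == 1 &&                        /* blocks until a fault */
        read(uffd, &msg, sizeof msg) == (ssize_t)sizeof msg &&
        msg.event == UFFD_EVENT_PAGEFAULT) {
        struct uffdio_copy copy = {
            .dst = msg.arg.pagefault.address & ~((uint64_t)page_size - 1),
            .src = (uintptr_t)page,
            .len = page_size,
        };
        ioctl(uffd, UFFDIO_COPY, &copy);                 /* fill page, wake faulter */
    }
    munmap(page, page_size);
    return NULL;
}

/* Returns 0 and stores addr[0] and addr[1] in out[]; returns -1 if
 * userfaultfd is unavailable or setup fails. */
int run_userfault_demo(uint32_t out[2])
{
    assert(out);
    page_size = (size_t)sysconf(_SC_PAGE_SIZE);

    int uffd = -1;
#ifdef UFFD_USER_MODE_ONLY
    uffd = (int)syscall(SYS_userfaultfd,
                        O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
#endif
    if (uffd < 0)
        uffd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (uffd < 0)
        return -1;

    uint32_t *addr = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct uffdio_api api = { .api = UFFD_API };
    struct uffdio_register reg = {
        .range = { .start = (uintptr_t)addr, .len = page_size },
        .mode = UFFDIO_REGISTER_MODE_MISSING,
    };
    pthread_t thr;
    if (addr == MAP_FAILED || ioctl(uffd, UFFDIO_API, &api) < 0 ||
        ioctl(uffd, UFFDIO_REGISTER, &reg) < 0 ||
        pthread_create(&thr, NULL, fault_handler_thread,
                       (void *)(intptr_t)uffd) != 0) {
        if (addr != MAP_FAILED)
            munmap(addr, page_size);
        close(uffd);
        return -1;
    }

    addr[0] = 0x5A5A5A5A;      /* first touch faults; the handler fills the page */
    out[0] = addr[0];
    out[1] = addr[1];

    pthread_join(thr, NULL);
    munmap(addr, page_size);
    close(uffd);
    return 0;
}
```

On older glibc versions the program may need to be linked with -pthread. The store to addr[0] is retried after the handler installs the page, which is why the first word ends up different from the rest of the page.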

You can see that:

What poll() waits for is completely customized.

What read() reads is completely customized.

What ioctl() controls is completely customized.

With the unchanging form of the "file", we gain all the agility and freedom of custom poll, read and ioctl.
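eventfd offers a small illustration of such customized semantics: a write() adds to a 64-bit counter, poll() reports readability only while the counter is non-zero, and a read() returns the counter and resets it. The sketch below exercises that behavior; the function name eventfd_sum is an assumption made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Write two increments to an eventfd and read the counter back once.
 * Returns the value read, or 0 on error. Both increments must be non-zero. */
uint64_t eventfd_sum(uint64_t a, uint64_t b)
{
    assert(a > 0 && b > 0);

    uint64_t value = 0;
    int efd = eventfd(0, EFD_NONBLOCK);
    if (efd < 0)
        return 0;

    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    int idle = poll(&pfd, 1, 0);          /* counter is zero: not readable */

    if (write(efd, &a, sizeof a) != (ssize_t)sizeof a ||
        write(efd, &b, sizeof b) != (ssize_t)sizeof b) {
        close(efd);
        return 0;
    }

    int ready = poll(&pfd, 1, 0);         /* counter is non-zero: readable */
    if (idle != 0 || ready != 1 ||
        read(efd, &value, sizeof value) != (ssize_t)sizeof value)
        value = 0;                        /* read() returns the accumulated sum */
    close(efd);
    return value;
}
```

Here read() does not return "file contents" at all; it returns whatever the file_operations behind the fd decide it should, in this case the sum of the two writes.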

That concludes this example analysis of anonymous inodes. I hope it has been of some help.
