2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how file descriptors (fd) are allocated on Linux. The editor is sharing it here for reference; hopefully it leaves you with a working understanding of the topic.
Recently I have been going through a lot of network communication code at my company, which naturally raises the question of how IO events are monitored. I was surprised to find that polling with select is still very popular there. I told my colleagues that select should be abandoned by now, on Linux and on Windows alike, because the select system call has a fatal pitfall on both platforms.
On Windows, a single fd_set cannot hold more than FD_SETSIZE socket handles (64 in the win32 winsock2.h; the VS2010 version is the reference here). The fd_set structure stores the handles in an array, and each FD_SET call appends one socket handle to that array, provided the count stays below FD_SETSIZE. See the definition of the FD_SET macro in winsock2.h for details.
The problem is this: once the fd_set already holds FD_SETSIZE socket handles, any further FD_SET call has no effect, and the IO events of that socket handle are silently missed!
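The silent drop described above can be modelled in portable C. This is only a sketch of the behaviour, not the winsock2.h source: the names model_fd_set, MODEL_FD_SETSIZE and model_fd_set_add are hypothetical, and unlike the real FD_SET macro the helper returns a status so the dropped handle becomes visible.

```c
#include <stddef.h>

#define MODEL_FD_SETSIZE 64   /* mirrors the Winsock default mentioned above */

/* Hypothetical stand-in for an array-based fd_set: a count plus an array. */
struct model_fd_set {
    unsigned int count;
    int handles[MODEL_FD_SETSIZE];
};

/* Returns 1 if the handle is in the set afterwards, 0 if it was dropped.
 * The real macro reports nothing, which is exactly the pitfall. */
static int model_fd_set_add(struct model_fd_set *set, int handle)
{
    unsigned int i;

    for (i = 0; i < set->count; i++)
        if (set->handles[i] == handle)
            return 1;                 /* already present */

    if (set->count >= MODEL_FD_SETSIZE)
        return 0;                     /* set is full: the handle is ignored */

    set->handles[set->count++] = handle;
    return 1;
}
```

With this model, the 65th distinct handle is rejected while the first 64 stay registered, so a caller that never checks the count would simply never hear about events on the extra socket.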
On Linux, the problem also lies in the fd_set structure and the FD_SET macro. Here fd_set uses a sequence of bits (a bitmap) to record each fd whose IO events are to be monitored. The bookkeeping is slightly more involved, as follows
/usr/include/sys/select.h

    typedef long int __fd_mask;

    #define __NFDBITS   (8 * sizeof (__fd_mask))
    #define __FDELT(d)  ((d) / __NFDBITS)
    #define __FDMASK(d) ((__fd_mask) 1 << ((d) % __NFDBITS))

    typedef struct
      {
    #ifdef __USE_XOPEN
        __fd_mask fds_bits[__FD_SETSIZE / __NFDBITS];
    # define __FDS_BITS(set) ((set)->fds_bits)
    #else
        __fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS];
    # define __FDS_BITS(set) ((set)->__fds_bits)
    #endif
      } fd_set;

    #define FD_SET(fd, fdsetp) __FD_SET (fd, fdsetp)

/usr/include/bits/select.h

    #define __FD_SET(d, set) (__FDS_BITS (set)[__FDELT (d)] |= __FDMASK (d))
As you can see, the position of each bit in the fd_set bitmap corresponds to the value of an fd. The number of bits in fd_set is defined by __FD_SETSIZE, and __FD_SETSIZE is defined as 1024 in /usr/include/bits/typesizes.h (the include chain is sys/socket.h -> bits/types.h -> bits/typesizes.h).
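To make the fd-to-bit mapping concrete, here is a small sketch that recomputes the word index and bit index for a descriptor and checks whether that word still lies inside an fd_set. The helper names (fd_word_index, fd_bit_index, fd_set_words, fd_fits_in_fd_set) are made up for illustration, and the sketch assumes the bitmap is an array of long-sized words, as in the header quoted above.

```c
#include <stddef.h>
#include <sys/select.h>

/* Bits held by one word of the bitmap (assumes long-sized words). */
#define WORD_BITS (8 * sizeof(long))

/* Index of the array element that holds the bit for fd (fd must be >= 0). */
static size_t fd_word_index(int fd)
{
    return (size_t)fd / WORD_BITS;
}

/* Position of the bit for fd inside that array element. */
static size_t fd_bit_index(int fd)
{
    return (size_t)fd % WORD_BITS;
}

/* Number of words actually available inside an fd_set. */
static size_t fd_set_words(void)
{
    return sizeof(fd_set) / sizeof(long);
}

/* 1 if FD_SET(fd, ...) would stay inside the structure, 0 otherwise. */
static int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd_word_index(fd) < fd_set_words();
}
```

For fd equal to FD_SETSIZE, fd_word_index() lands exactly one element past the end of the array, which is the out-of-bounds write that FD_SET would perform.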
The problem is that when fd >= 1024, the FD_SET macro writes past the end of the array. This is in fact stated clearly in man select, as follows
NOTES
An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor.
Many people have never noticed this, my former self included. The blog post "collapse caused by select" by Yunfeng also describes this problem.
So select is not safe on Linux. If you want to use it, you have to make sure no fd ever reaches 1024, which is hard to guarantee; otherwise, just use poll or epoll.
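As an illustration of both options, the sketch below shows a guarded variant of FD_SET and a minimal poll()-based wait. The function names checked_fd_set and wait_readable are hypothetical; poll() itself takes an array of struct pollfd and has no FD_SETSIZE ceiling on descriptor values.

```c
#include <poll.h>
#include <sys/select.h>

/* Hypothetical wrapper: refuse descriptors that do not fit in an fd_set.
 * Returns 0 if the bit was set, -1 if fd is out of range. */
static int checked_fd_set(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_SET(fd, set);
    return 0;
}

/* Wait until fd has data to read, using poll() instead of select().
 * Returns 1 if POLLIN was reported, 0 if not (for example on timeout),
 * and -1 if poll() itself failed. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return (pfd.revents & POLLIN) ? 1 : 0;
}
```

The guard only turns undefined behavior into a visible error; a descriptor rejected by checked_fd_set() still has to be monitored some other way, which is why switching to poll or epoll is usually the simpler fix.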
That was a bit of a digression, but it leads to the topic of this article: how fd values are allocated and determined on Linux. We all know that an fd is an int, but how does its value grow? Below is a brief analysis, using kernel version 2.6.30 as the example. Criticism is welcome.
First, we need to know which function allocates the fd. I'll take pipe as the example, since it is a typical syscall that allocates fds. The syscall implementations of pipe and pipe2 are defined in fs/pipe.c, as follows
    SYSCALL_DEFINE2(pipe2, int __user *, fildes, int, flags)
    {
        int fd[2];
        int error;

        error = do_pipe_flags(fd, flags);
        if (!error) {
            if (copy_to_user(fildes, fd, sizeof(fd))) {
                sys_close(fd[0]);
                sys_close(fd[1]);
                error = -EFAULT;
            }
        }
        return error;
    }

    SYSCALL_DEFINE1(pipe, int __user *, fildes)
    {
        return sys_pipe2(fildes, 0);
    }
Further analysis of the do_pipe_flags() implementation shows that it uses get_unused_fd_flags(flags) to allocate the fd, which is a macro located in include/linux/fs.h:

    #define get_unused_fd_flags(flags) alloc_fd(0, (flags))
So we have found the protagonist: alloc_fd(), the function with which the kernel actually performs fd allocation. It is located in fs/file.c, and its implementation is simple, as follows
    int alloc_fd(unsigned start, unsigned flags)
    {
        struct files_struct *files = current->files;
        unsigned int fd;
        int error;
        struct fdtable *fdt;

        spin_lock(&files->file_lock);
    repeat:
        fdt = files_fdtable(files);
        fd = start;
        if (fd < files->next_fd)
            fd = files->next_fd;

        if (fd < fdt->max_fds)
            fd = find_next_zero_bit(fdt->open_fds->fds_bits,
                                    fdt->max_fds, fd);

        error = expand_files(files, fd);
        if (error < 0)
            goto out;

        /*
         * If we needed to expand the fs array we
         * might have blocked - try again.
         */
        if (error)
            goto repeat;

        if (start <= files->next_fd)
            files->next_fd = fd + 1;

        FD_SET(fd, fdt->open_fds);
        if (flags & O_CLOEXEC)
            FD_SET(fd, fdt->close_on_exec);
        else
            FD_CLR(fd, fdt->close_on_exec);
        error = fd;
    #if 1
        /* Sanity check */
        if (rcu_dereference(fdt->fd[fd]) != NULL) {
            printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd);
            rcu_assign_pointer(fdt->fd[fd], NULL);
        }
    #endif

    out:
        spin_unlock(&files->file_lock);
        return error;
    }
In the pipe system call, start is always 0. The key function, expand_files(), decides whether the process's open file table needs to be expanded for the given fd value. Its header comment reads as follows.
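    /*
     * Expand files.
     * This function will expand the file structures, if the requested size exceeds
     * the current capacity and there is room for expansion.
     * Return <0 error code on error; 0 when nothing done; 1 when files were
     * expanded and execution may have blocked.
     * The files->file_lock should be held on entry, and will be held on exit.
     */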
We won't delve into its implementation here. Back in alloc_fd(), we can now see the principle by which fds are allocated:
Each time, the free fd with the lowest value is allocated first. When allocation fails, the error code EMFILE is returned, meaning the current process has too many open fds.
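The lowest-free-first rule can be reproduced with a small user-space model. This is a simplified sketch, not kernel code: it uses a byte array instead of a bitmap, a fixed table of 16 slots instead of expand_files(), and the names model_files, model_alloc_fd and model_free_fd are hypothetical. The release path is not shown in the article; the model assumes that freeing a descriptor lowers the next_fd hint.

```c
#include <errno.h>

#define MODEL_MAX_FDS 16   /* tiny stand-in for the per-process limit */

struct model_files {
    unsigned char open_fds[MODEL_MAX_FDS]; /* 1 = in use (kernel: a bitmap) */
    unsigned int next_fd;                  /* no free slot exists below this */
};

/* Allocate the lowest free descriptor >= start, or return -EMFILE. */
static int model_alloc_fd(struct model_files *files, unsigned int start)
{
    unsigned int fd = start;

    if (fd < files->next_fd)
        fd = files->next_fd;          /* skip the range known to be full */

    while (fd < MODEL_MAX_FDS && files->open_fds[fd])
        fd++;                         /* stand-in for find_next_zero_bit() */

    if (fd >= MODEL_MAX_FDS)
        return -EMFILE;               /* the model never grows the table */

    if (start <= files->next_fd)
        files->next_fd = fd + 1;

    files->open_fds[fd] = 1;
    return (int)fd;
}

/* Release a descriptor and pull the hint back so it can be reused. */
static void model_free_fd(struct model_files *files, unsigned int fd)
{
    files->open_fds[fd] = 0;
    if (fd < files->next_fd)
        files->next_fd = fd;
}
```

Allocating three descriptors from an empty table yields 0, 1 and 2; after freeing 1, the next allocation returns 1 again and the one after that returns 3.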
This also matches what I observed in a server program written at the company (kernel 2.6.18), which prints the fd of every client connection: if a new connection is assigned fd 8, then after it is closed the next new connection is assigned 8 again, and the fds of subsequent connections then increase by 1 each.
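The same reuse pattern is easy to observe from user space. The sketch below uses /dev/null instead of client sockets and assumes a single-threaded process, so nothing else opens or closes descriptors in between; the function names are made up for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open, close, and open again. Returns 1 if the second open received the
 * same descriptor number as the first, 0 if not, -1 if open() failed. */
static int fd_is_reused_after_close(void)
{
    int first, second;

    first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    close(first);

    second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;
    close(second);

    return first == second;
}

/* Open two descriptors without closing in between. Returns 1 if the
 * second number is higher than the first, 0 if not, -1 on failure. */
static int second_fd_is_higher(void)
{
    int a, b, higher;

    a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return -1;
    b = open("/dev/null", O_RDONLY);
    if (b < 0) {
        close(a);
        return -1;
    }
    higher = b > a;
    close(b);
    close(a);
    return higher;
}
```

In a multi-threaded server the numbers can interleave differently, because another thread may take the freed slot first.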
I therefore went on to look at how fds are allocated for sockets, and found that it also ends in alloc_fd(0, (flags)). The call sequence is as follows
socket (syscall) -> sock_map_fd() -> sock_alloc_fd() -> get_unused_fd_flags()
The open system call also uses get_unused_fd_flags(); the details are not listed here.
Now back to the select question from the beginning. Because of how Linux allocates fds, each new fd value is guaranteed to be as small as possible. For systems that are not IO-heavy, the chance of an fd reaching 1024 within one process is relatively small. So we cannot draw an absolute conclusion about whether select should be abandoned. If the system is designed with other measures that guarantee fd values stay below 1024, there is nothing wrong with using select.
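One possible measure of that kind, offered here as an assumption rather than something the original author describes, is to lower the soft RLIMIT_NOFILE limit to FD_SETSIZE at startup. The limit is one greater than the highest descriptor number the process may obtain, so newly allocated fds then stay below FD_SETSIZE. The function name cap_nofile_to_fd_setsize is hypothetical.

```c
#include <sys/resource.h>
#include <sys/select.h>

/* Lower the soft RLIMIT_NOFILE to FD_SETSIZE if it is currently higher.
 * Returns 0 on success, -1 if getrlimit() or setrlimit() failed. */
static int cap_nofile_to_fd_setsize(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    if (lim.rlim_cur > (rlim_t)FD_SETSIZE) {
        lim.rlim_cur = FD_SETSIZE;    /* only the soft limit is lowered */
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
            return -1;
    }
    return 0;
}
```

With the cap in place, running out of descriptors shows up as an EMFILE error from open, socket or accept instead of an out-of-bounds FD_SET. Descriptors that were already open or inherited with higher numbers are not affected by the new limit, so the call belongs early in startup.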
But in the case of network communication programs, this assumption should never be made, so try not to use select!