2025-04-07 Update From: SLTechnology News&Howtos
What is the difference between select, poll, and epoll? This article analyzes each mechanism in turn, hoping to help readers looking for a clear, practical answer.
1 select
select works by setting and then checking a fixed-size data structure (fd_set) whose bits mark the file descriptors to monitor; processing then proceeds based on which bits are set.
This design brings several disadvantages:
The number of fds a single process can monitor is limited, which limits the number of connections it can serve.
The maximum number of fds a single process can pass to select is defined by the FD_SETSIZE macro, which is the size of 32 integers (32 * 32 = 1024 on 32-bit machines, 32 * 64 = 2048 on 64-bit machines). It can be raised by redefining the macro and recompiling the kernel, but performance may be affected, which requires further testing.
Generally speaking, the system-wide fd limit depends heavily on memory; the specific number can be seen with cat /proc/sys/fs/file-max. The FD_SETSIZE default is 1024 on 32-bit machines and 2048 on 64-bit machines.
Scanning the sockets is a linear scan, i.e. polling, which is inefficient:
select only reports that some I/O event occurred, not on which streams, so the caller must poll all streams indiscriminately to find the ones ready for reading or writing. The more streams handled at once, the longer this indiscriminate polling takes: O(n).
When there are many sockets, every call to select has to traverse up to FD_SETSIZE sockets, active or not, wasting a lot of CPU time. If instead a callback could be registered with each socket and fire automatically when it became active, polling could be avoided; that is exactly what epoll and kqueue do.
Shortcomings
Passing results back to user space requires kernel copies; maintaining a data structure that holds many fds makes copying it between user space and the kernel expensive.
Every call to select copies the fd set from user space into the kernel, which is costly when there are many fds.
Every call also makes the kernel traverse all the fds passed in, which is likewise costly with many fds.
The number of file descriptors select supports is too small: 1024 by default.
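The mechanism above can be sketched with Python's select module, which wraps the same POSIX select(2) call; the pipe and variable names here are illustrative, not from the original text:

```python
# Minimal sketch of select(): watch a pipe until data is ready to read.
import os
import select

r, w = os.pipe()          # r: read end, w: write end of a pipe
os.write(w, b"hello")     # make the read end "ready"

# select() takes three fd lists (read, write, error) and returns the
# subsets that are ready; the kernel scans every fd we pass in, which
# is the O(n) linear scan described above.
readable, writable, errored = select.select([r], [], [], 1.0)

assert r in readable      # the pipe has data, so its fd is reported
print(os.read(r, 5))      # b'hello'
os.close(r)
os.close(w)
```

Note that the three fd lists are rebuilt and passed in on every call, which is exactly the repeated user-to-kernel copy listed among the shortcomings.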
2 poll
poll's implementation is very similar to select's; it differs mainly in how the fd set is described. poll uses an array of pollfd structures instead of select's fd_set, and the rest works much the same: multiple descriptors are managed by polling and processed according to their state, but poll has no hard limit on the maximum number of file descriptors. poll shares select's drawback that the whole array of file descriptors is copied between user space and the kernel address space on every call, with overhead that grows linearly in the number of fds regardless of whether those fds are ready.
poll copies the array passed in by the user into kernel space,
then queries the state of the device behind each fd:
if a device is ready, it marks the corresponding event and continues traversing;
if, after traversing all the fds, no ready device has been found,
it suspends the current process until a device becomes ready or the timeout expires; after being woken, it traverses all the fds again. This cycle can involve many wasted traversals.
There is no fixed limit on the maximum number of connections, because the caller supplies a pollfd array of whatever size it needs.
Shortcomings
The whole fd array is still copied between user space and the kernel address space on every call, whether or not most entries are meaningful.
If a reported fd is not handled, the next call to poll reports it again.
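Both points can be seen in a small sketch using Python's select.poll wrapper (the pipe setup is illustrative, assumed for the demonstration):

```python
# Minimal sketch of poll(): the fds are registered once in a poll
# object instead of being rebuilt in lists on every call.
import os
import select

r, w = os.pipe()
p = select.poll()
p.register(r, select.POLLIN)   # interest set: readable events on r

os.write(w, b"hi")
events = p.poll(1000)          # timeout in milliseconds
# events is a list of (fd, eventmask) pairs for the ready descriptors
assert events[0][0] == r
assert events[0][1] & select.POLLIN

# If we do not read the data, the next poll() reports the fd again,
# matching the shortcoming described above.
assert p.poll(0)[0][0] == r
os.close(r)
os.close(w)
```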
3 epoll
epoll can be read as "event poll": epoll tells us which stream had which I/O event. epoll is therefore event-driven (each event is associated with an fd), so every operation we perform on these streams is meaningful. The complexity also drops to O(1).
3.1 trigger mode
Level-triggered vs edge-triggered, selected via the EPOLLET flag:
LT, the default mode (level-triggered):
as long as the fd still has data to read, every epoll_wait returns its event, reminding the user program to act on it.
ET, the "high-speed" mode (edge-triggered):
the event is reported only once, and is not reported again until new data arrives, regardless of whether unread data remains in the fd. Therefore, in ET mode, when reading an fd you must drain its buffer completely, i.e. read until read returns less than the requested amount or fails with EAGAIN.
epoll registers fds via epoll_ctl and uses event-based readiness notification: once an fd becomes ready, the kernel uses a callback-like mechanism to activate it, and epoll_wait receives the notification.
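The register-then-wait flow and the ET drain-until-EAGAIN rule can be sketched with Python's select.epoll wrapper (Linux-only; the pipe and buffer sizes are illustrative assumptions):

```python
# Linux-only sketch of epoll in edge-triggered (EPOLLET) mode,
# using a pipe as the watched fd.
import os
import select   # select.epoll is only available on Linux

r, w = os.pipe()
os.set_blocking(r, False)      # ET mode requires non-blocking reads

ep = select.epoll()
ep.register(r, select.EPOLLIN | select.EPOLLET)   # edge-triggered

os.write(w, b"abcd")
assert ep.poll(1.0)            # the edge (newly arrived data) fires once

# Drain until EAGAIN: in ET mode we must empty the buffer, because no
# further event arrives until *new* data is written.
data = b""
while True:
    try:
        chunk = os.read(r, 2)
    except BlockingIOError:    # errno EAGAIN: buffer is drained
        break
    data += chunk
assert data == b"abcd"
assert ep.poll(0) == []        # no new edge, so nothing is reported
ep.close()
os.close(r)
os.close(w)
```

Dropping the EPOLLET flag gives the default level-triggered behaviour, where unread data keeps being reported on every epoll_wait.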
3.2 benefits
There is no hard limit on maximum concurrent connections; the upper bound on open fds is far above 1024 (roughly 100,000 fds can be monitored per 1 GB of memory).
Efficiency improves because epoll does not poll: it does not slow down as the number of fds increases, since the callback fires only for fds that are active.
In other words, epoll's greatest advantage is that it only cares about "active" connections, independent of the total connection count, so in real network environments epoll is far more efficient than select and poll.
Memory copying: it is commonly claimed that epoll uses mmap to map a region of memory shared between the kernel and user space and thereby avoid copies. In current Linux implementations epoll does not actually use mmap; epoll_wait still copies ready events out to user space, but the copy is cheap because only the ready fds are returned, never the whole interest set.
On the surface epoll has the best performance, but when the number of connections is small and all of them are very active, select and poll may outperform epoll; after all, epoll's notification mechanism involves many callbacks.
Both epoll and select provide multiplexed I/O. Both are supported in current Linux kernels, but epoll is Linux-specific, while select is specified by POSIX and is implemented in most operating systems.
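Because of this portability difference, many libraries pick the mechanism at runtime. Python's standard selectors module is one example of that pattern: it chooses epoll on Linux, kqueue on BSD/macOS, and falls back to poll or select elsewhere, so application code stays the same (the pipe and the "my pipe" tag below are illustrative):

```python
# Portable sketch: selectors picks the best mechanism the platform offers.
import os
import selectors

sel = selectors.DefaultSelector()   # e.g. EpollSelector on Linux
r, w = os.pipe()
sel.register(r, selectors.EVENT_READ, data="my pipe")

os.write(w, b"ping")
for key, events in sel.select(timeout=1.0):
    assert key.fd == r and events & selectors.EVENT_READ
    print(key.data, "->", os.read(key.fd, 4))   # my pipe -> b'ping'

sel.close()
os.close(r)
os.close(w)
```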
4 Summary
select, poll, and epoll are all I/O multiplexing mechanisms: each can monitor multiple descriptors and, once a descriptor is ready (readable or writable), notify the program so it can perform the corresponding read or write.
But select, poll, and epoll are all essentially synchronous I/O, because the program itself must perform the read or write after the readiness event, i.e. the read/write step blocks the caller. Asynchronous I/O, by contrast, does not make the program read or write itself; the asynchronous I/O implementation is responsible for copying the data from the kernel into user space.
That is the answer to the question of what the difference between select, poll, and epoll is. I hope the above content is of some help; if you still have doubts, further reading on Linux I/O multiplexing will fill in more detail.