
Summary of IO Multiplexing

2025-07-12 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

Interview question: describe the IO multiplexing models you know, and explain why IO multiplexing is efficient.

select, poll and epoll are all IO multiplexing mechanisms: a single call monitors multiple file descriptors, and once a descriptor is ready (usually ready for reading or writing), the process is notified so that it can perform the corresponding read or write. All three are essentially synchronous IO, because after the readiness event arrives the process itself is still responsible for the read or write, and it is blocked while that read or write takes place. Asynchronous IO, by contrast, does not require the process to read or write on its own: the process only initiates the operation, and the kernel carries it out, including copying the data between kernel space and user space.
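The "synchronous" point can be seen in a minimal sketch using Python's standard select module. The function name read_when_ready is hypothetical, and socket.socketpair() is assumed here only as a stand-in for a real network connection. select() reports that the descriptor is readable, and the process still has to call recv() itself.

```python
import select
import socket


def read_when_ready(timeout=1.0):
    """Wait for readiness with select(), then read the data ourselves."""
    a, b = socket.socketpair()  # stand-in for a real connection (assumption)
    try:
        b.sendall(b"ping")
        # select() only reports that `a` is readable; it reads nothing.
        readable, _, _ = select.select([a], [], [], timeout)
        if a not in readable:
            return None
        # The process performs the actual read itself: synchronous IO.
        return a.recv(1024)
    finally:
        a.close()
        b.close()
```

Calling read_when_ready() returns the bytes that were sent, but only because the caller issued recv() after being told the socket was ready.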

The implementation of select is similar to that of poll. Epoll is the enhanced version of poll and select.

Select:

select essentially works by setting and checking a data structure that holds a flag bit for each fd, and then decides what to process next from those bits. The disadvantages of this are:

1. The number of fds that a single process can monitor is limited, that is, the number of connections it can listen on is limited.

Generally speaking, the system-wide ceiling on open files has a lot to do with system memory, and the specific number can be seen with cat /proc/sys/fs/file-max. The limit for select itself is the compile-time constant FD_SETSIZE. The defaults commonly quoted are 1024 for 32-bit machines and 2048 for 64-bit machines, although glibc on Linux defines FD_SETSIZE as 1024 on both.

2. Sockets are scanned linearly, that is, by polling, which is inefficient:

When there are many sockets, every call to select() traverses all of the monitored sockets (up to FD_SETSIZE of them) to find the ready ones, no matter which sockets are actually active. This wastes a lot of CPU time. If a callback can be registered for each socket and run automatically when the socket becomes active, polling is avoided, which is what epoll and kqueue do.

3. A data structure holding a large number of fds has to be maintained, and copying that structure between user space and kernel space on every call is expensive.
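The cost described in points 2 and 3 can be sketched with Python's standard select module. This is an illustration under assumptions, not a benchmark: the function name select_scan and the use of socket.socketpair() in place of real client connections are hypothetical choices. The whole watched list is handed to select() on each call even though only one connection is active.

```python
import select
import socket


def select_scan(num_idle=50):
    """Watch num_idle idle connections plus one active one with select()."""
    pairs = [socket.socketpair() for _ in range(num_idle + 1)]
    try:
        watched = [left for left, _ in pairs]
        # Only the last connection receives data; the others stay idle.
        pairs[-1][1].sendall(b"x")
        # The complete list must be passed in again on every call.
        readable, _, _ = select.select(watched, [], [], 1.0)
        return len(watched), len(readable)
    finally:
        for left, right in pairs:
            left.close()
            right.close()
```

With the default argument, 51 descriptors are submitted and one comes back ready. The other 50 were still handed over and checked.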

Poll:

poll is essentially no different from select. It copies the array passed in by the user into kernel space and then queries the device status of each fd. If a device is ready, it adds an entry to the device wait queue and continues traversing. If no device is found ready after all fds have been traversed, the current process is suspended until a device becomes ready or the timeout expires; after being woken up, it traverses the fds again. This process involves many unnecessary traversals.

It has no limit on the maximum number of connections because the fds are stored in a linked list, but it still has a drawback, plus one characteristic worth noting:

1. The whole fd array is copied between user space and the kernel address space on every call, whether or not the copy is useful.

2. poll is level-triggered ("horizontal trigger"): if an fd is reported but not handled, it is reported again the next time poll is called.
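The re-reporting behaviour of poll can be checked with a small sketch using select.poll() from Python's standard library (available on Unix-like systems, not on Windows). The function name poll_twice_without_reading is hypothetical, and a socketpair is assumed as a stand-in for a real connection.

```python
import select
import socket


def poll_twice_without_reading():
    """Call poll() twice while leaving the pending data unread."""
    a, b = socket.socketpair()
    poller = select.poll()
    try:
        poller.register(a.fileno(), select.POLLIN)
        b.sendall(b"data")
        first = poller.poll(1000)   # timeout is in milliseconds
        # Nothing was read from `a`, so the fd is still ready and
        # is reported again by the next call.
        second = poller.poll(1000)
        return len(first), len(second)
    finally:
        a.close()
        b.close()
```

Both calls report the same descriptor, because the unread data keeps it in the ready state.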

Epoll:

epoll supports both level-triggered and edge-triggered modes. Its most distinctive feature is edge triggering: it tells the process only which fds have just become ready, and it notifies only once. Another feature is that epoll uses event-based readiness notification: each fd is registered through epoll_ctl, and once the fd is ready the kernel activates it through a callback-like mechanism, so that epoll_wait receives the notification.
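The difference between the two trigger modes can be sketched with select.epoll() from Python's standard library, which is available on Linux only. The function name epoll_reports is hypothetical and the socketpair is an assumed stand-in for a real connection. In Python, register() corresponds to epoll_ctl and poll() to epoll_wait.

```python
import select
import socket


def epoll_reports(edge_triggered):
    """Wait twice on an fd with unread data; return the event counts."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        flags = select.EPOLLIN
        if edge_triggered:
            flags |= select.EPOLLET  # notify only on the state change
        ep.register(a.fileno(), flags)  # wraps epoll_ctl(EPOLL_CTL_ADD)
        b.sendall(b"data")
        first = ep.poll(0.5)    # wraps epoll_wait; timeout is in seconds
        second = ep.poll(0.5)   # nothing was read in between
        return len(first), len(second)
    finally:
        ep.close()
        a.close()
        b.close()
```

In edge-triggered mode the second wait times out with no events, so a handler is expected to drain the socket when it is notified. In level-triggered mode the fd is reported both times.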

Advantages of epoll:

1. There is no practical limit on the maximum number of concurrent connections: the upper limit on open FDs is far greater than 1024 (about 100,000 connections can be monitored with 1 GB of memory).

2. Efficiency is higher. epoll does not poll, so performance does not degrade as the number of FDs grows. Only active FDs invoke the callback; that is, the biggest advantage of epoll is that it only cares about the "active" connections, not the total number of connections. In a real network environment, where most connections are idle at any given moment, epoll is therefore much more efficient than select and poll.

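3. Less memory copying. Each fd is registered with the kernel once through epoll_ctl, so the full fd set is not copied between user space and kernel space on every wait, and epoll_wait returns only the ready events. epoll is often described as using mmap() to share memory with the kernel and speed up message delivery, but the Linux implementation copies the ready events to user space; the saving comes from not re-copying the whole set.

Summary of differences among select, poll and epoll: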

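1. Maximum number of connections a process can open: select is limited by FD_SETSIZE; poll has no such limit because it is based on a linked list; epoll has an upper limit, but it is very large (about 100,000 connections with 1 GB of memory).

2. IO efficiency when the number of FDs increases sharply: select and poll traverse every fd on each call, so they slow down as the number of FDs grows; epoll relies on a callback for each fd, so only active FDs cost anything.

3. Message delivery mode: select and poll copy the fd set between user space and kernel space on every call; epoll registers each fd once through epoll_ctl and returns only the ready events.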
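To make the comparison concrete, the sketch below registers many idle connections with epoll once and then waits. It uses select.epoll() from Python's standard library (Linux only). The function name epoll_active_only and the socketpair stand-ins are assumptions made for illustration.

```python
import select
import socket


def epoll_active_only(num_idle=50):
    """Register many connections once; only the active one is returned."""
    pairs = [socket.socketpair() for _ in range(num_idle + 1)]
    ep = select.epoll()
    try:
        for left, _ in pairs:
            # Each fd is registered a single time, not on every wait.
            ep.register(left.fileno(), select.EPOLLIN)
        active_fd = pairs[-1][0].fileno()
        pairs[-1][1].sendall(b"x")
        events = ep.poll(1.0)  # returns (fd, event mask) pairs
        only_active = bool(events) and events[0][0] == active_fd
        return len(pairs), len(events), only_active
    finally:
        ep.close()
        for left, right in pairs:
            left.close()
            right.close()
```

The wait call takes no fd list at all. The result contains a single entry for the one connection that received data, regardless of how many idle connections were registered.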

