2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article walks through the Linux network I/O models and the Reactor model. The editor finds the material practical and shares it here in the hope that you take something useful away from it.
Preface
Network I/O can be understood as the flow of data over the network. Typically we open a TCP or UDP channel to a remote end over a socket and then read from and write to it. A single socket can be served efficiently by a single thread, but how do we get high performance with 10K socket connections or more?
Introduction to basic concepts
Process (thread) switching
Every operating system can schedule processes: it can suspend the currently running process and resume a previously suspended one.
Blocking of processes (threads)
A running process sometimes has to wait for another event to complete, such as acquiring a lock or finishing an I/O read or write. While it waits, the system automatically puts the process into the blocked state, and a blocked process does not consume CPU.
File descriptor
In Linux, a file descriptor is an abstraction that refers to an open file; it is a non-negative integer. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process.
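As a small illustration of the point above, the sketch below opens a file and checks that the kernel hands back a non-negative integer. The helper names open_devnull and fd_is_open are made up for this example, and /dev/null is only a convenient file that exists on any Linux system.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens /dev/null read-only and returns the descriptor the kernel hands
 * back (a non-negative integer), or -1 on failure. The caller closes it. */
int open_devnull(void)
{
    return open("/dev/null", O_RDONLY);
}

/* Returns 1 if fd currently refers to an open file in this process,
 * 0 otherwise. fcntl(F_GETFD) fails with EBADF on a closed descriptor. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}
```

Sockets are referred to by file descriptors in exactly the same way, which is why the same read, write, and close calls work on them.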
Linux signal processing
A Linux process can receive a signal from the system or from another process and then run the handler function registered for that signal. A signal is the software analogue of a hardware interrupt.
User space, kernel space, and buffers were covered in the chapter on the zero-copy mechanism and are omitted here.
The read and write process of network I/O
When user space initiates a read on a socket, a context switch occurs and the user process blocks. (R1) It waits for network data to arrive and be copied from the NIC into the kernel buffer; (R2) the data is then copied from the kernel buffer into the user process buffer.
When user space initiates a send on a socket, a context switch occurs and the user process blocks while the data is copied from the user process buffer into the kernel buffer. When the copy completes, the process is switched back in and resumes.
Five network I/O models in Linux
Blocking I/O (blocking IO)
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen);
Blocking I/O is the most basic and also the simplest model. All operations are performed sequentially.
In the blocking I/O model, a user-space application makes a system call (recvfrom) and is blocked until the data is ready in the kernel buffer and has been copied from the kernel into the user process. The system then wakes the process so that it can handle the data.
The process is blocked throughout both consecutive phases, R1 and R2.
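A minimal sketch of a blocking read, assuming a local socket pair stands in for a real network connection. The function name blocking_roundtrip is hypothetical. Because the data is sent before the read, recvfrom returns at once here; on an idle connection the same call would sit in R1 until data arrived.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends msg over one end of a local socket pair and reads it back from the
 * other end with a blocking recvfrom(). Returns bytes received, or -1. */
ssize_t blocking_roundtrip(const char *msg, char *out, size_t outlen)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    ssize_t n = -1;
    if (send(sv[0], msg, strlen(msg), 0) != -1) {
        /* Blocking call: the process sleeps until data is in the kernel
         * buffer (R1) and has been copied into out (R2). */
        n = recvfrom(sv[1], out, outlen, 0, NULL, NULL);
    }
    close(sv[0]);
    close(sv[1]);
    return n;
}
```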
Non-blocking I/O (nonblocking IO)
Non-blocking I/O is also a form of synchronous I/O. It relies on polling, with the socket opened in non-blocking mode. An operation that cannot complete immediately does not block; instead it returns an error code (EWOULDBLOCK) indicating that it has not completed.
Each poll checks for data in the kernel and gets EWOULDBLOCK if the data is not ready. The process keeps issuing recvfrom calls, and it can of course do other work between attempts.
Once the data is ready in the kernel, it is copied to user space; the call returns the data instead of an error, and the process handles it. Note that the process is still blocked for the whole duration of the copy.
The process is not blocked in the R1 phase, but it must poll constantly; it is blocked in the R2 phase.
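The polling step can be sketched as follows. The helper names set_nonblocking and try_recv and the return value -2 for "not ready yet" are conventions invented for this example; O_NONBLOCK, EAGAIN, and EWOULDBLOCK are the real POSIX names.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Switches fd to non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* One polling attempt. Returns the number of bytes read (> 0), 0 if the
 * peer closed the connection, -2 if no data is ready yet (R1 unfinished),
 * or -1 on any other error. */
ssize_t try_recv(int fd, char *buf, size_t len)
{
    ssize_t n = recvfrom(fd, buf, len, 0, NULL, NULL);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

A caller would invoke try_recv in a loop and do other work whenever it returns -2, which is exactly the constant polling cost described above.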
I/O multiplexing (IO multiplexing)
Back-end services usually hold a large number of socket connections. It is far more efficient to query the read and write status of many sockets in one call and handle whichever ones are ready. This is called I/O multiplexing: "multi" refers to the many sockets, and "plexing" refers to reusing the same process for all of them.
select, poll, and epoll are all blocking calls.
Unlike blocking I/O, select does not wait for data on every socket before returning; it resumes the user process as soon as data is ready on some of the sockets. How does it know that some data is ready in the kernel? It leaves that to the system.
The process is still blocked in both R1 and R2, but R1 comes with a trick: in a multi-process or multi-threaded program, a single process (thread) can be dedicated to blocking in select, which frees the other threads.
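A minimal sketch of waiting on two sockets with a single select call. The function name wait_readable and the choice to report fd_a first when both are ready are assumptions made for this example.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Blocks for up to timeout_ms until fd_a or fd_b becomes readable.
 * Returns the ready fd (fd_a if both are ready), or -1 on timeout/error. */
int wait_readable(int fd_a, int fd_b, int timeout_ms)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);
    FD_SET(fd_b, &readfds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds must be the highest-numbered fd plus one. */
    int maxfd = fd_a > fd_b ? fd_a : fd_b;
    int ready = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (ready <= 0)
        return -1;

    /* select() rewrote readfds to contain only the ready descriptors. */
    if (FD_ISSET(fd_a, &readfds))
        return fd_a;
    return fd_b;
}
```

The one thread that calls wait_readable sleeps through R1 for both sockets at once; the actual recvfrom on the returned fd (R2) still happens afterwards.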
Signal-driven I/O (SIGIO)
You provide a signal handler and associate it with the socket. After the sigaction call returns, the process is free to do other work.
When the data is ready in the kernel, the process receives a SIGIO signal, is interrupted to run the signal handler, calls recvfrom to read the data from the kernel into user space, and then processes it.
The user process does not block during the R1 phase, but it still blocks during R2.
Asynchronous I/O (POSIX's aio_ series of functions)
In contrast to synchronous I/O, an asynchronous read (aio_read) does not block the calling process, whether or not the data is ready in the kernel buffer. Once aio_read returns, the process can get on with other logic.
When the socket data is ready in the kernel, the system copies it directly from the kernel to user space and then notifies the user process with a signal.
The process does not block in either phase, R1 or R2.
A closer look at I/O multiplexing
select
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
1) Use copy_from_user to copy fd_set from user space to kernel space.
2) Register the callback function __pollwait.
3) Iterate over all fds and call each one's poll method (for a socket this is sock_poll, which calls tcp_poll, udp_poll, or datagram_poll as appropriate).
4) Taking tcp_poll as an example, its core is a call to __pollwait, the callback registered above.
5) The main job of __pollwait is to put current (the calling process) on the device's wait queue. Different devices have different wait queues; for tcp_poll it is sk->sk_sleep. Note that putting a process on a wait queue does not mean the process is asleep. When the device receives a message (network device) or finishes filling in file data (disk device), it wakes the processes sleeping on its wait queue, and current is woken.
6) The poll method returns a mask describing whether read and write operations are ready, and fd_set is assigned according to this mask.
7) If no fd has returned a ready mask after all of them have been traversed, the process that called select (current) calls schedule_timeout and goes to sleep.
8) When a device driver finds its resource readable or writable, it wakes the processes sleeping on its wait queue. If nothing wakes the process before the timeout expires (as specified by timeout), the process that called select is woken anyway, gets the CPU, and traverses the fds again to check whether any is ready.
9) Copy fd_set from kernel space back to user space.
Shortcomings of select
Every call to select copies the fd set from user space to kernel space, which is expensive when there are many fds.
Every call to select also makes the kernel traverse all the fds passed in, which is likewise expensive when there are many fds.
select supports too few file descriptors; the default limit is 1024.
epoll
int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
When epoll_create is called, the kernel builds a red-black tree to store the sockets later added through epoll_ctl, and also creates a doubly linked list, rdllist, to hold ready events. A call to epoll_wait only needs to look at the data in rdllist.
When epoll_ctl adds, modifies, or deletes events on the epoll object, it operates on the red-black tree rbr, which is very fast.
Each event added to epoll establishes a callback relationship with the device (such as a network card). When the corresponding event occurs on the device, the callback is invoked and adds the event to rdllist. In the kernel this callback is called ep_poll_callback.
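From user space, the three calls fit together roughly as in the sketch below. The helper names epoll_watch_readable and epoll_next_ready are invented for this example, and it uses epoll_create1(0), the newer variant of epoll_create that takes flags instead of the size hint.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates an epoll instance and registers fd for read-ready events.
 * Returns the epoll descriptor, or -1 on error. */
int epoll_watch_readable(int fd)
{
    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    struct epoll_event ev;
    ev.events = EPOLLIN;   /* level-triggered read readiness */
    ev.data.fd = fd;       /* handed back unchanged by epoll_wait */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        close(epfd);
        return -1;
    }
    return epfd;
}

/* Waits up to timeout_ms and returns the first ready fd, or -1 if none
 * became ready (or on error). */
int epoll_next_ready(int epfd, int timeout_ms)
{
    struct epoll_event events[8];
    int n = epoll_wait(epfd, events, 8, timeout_ms);
    if (n <= 0)
        return -1;
    return events[0].data.fd;
}
```

Note that the fd is registered once and then only epoll_wait is called repeatedly, which is where epoll differs from passing the whole fd set to select on every call.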
Two trigger modes of epoll
epoll has two trigger modes: level-triggered (LT) and edge-triggered (ET, selected with the EPOLLET flag). LT is the default mode; ET is the "high speed" mode and should only be used with non-blocking sockets.
In LT (level-triggered) mode, every epoll_wait reports the read event as long as the file descriptor still has data to read.
In ET (edge-triggered) mode, epoll_wait reports a file descriptor only when a new event is detected on it. Once a descriptor is reported readable, you must read it until it is empty (until the read returns EWOULDBLOCK); otherwise the next epoll_wait will not report the event again until new data arrives.
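The difference between the two modes can be observed with a small experiment, sketched here under the assumption that a local socket pair behaves like a network socket for this purpose. The function name count_notifications is hypothetical. It writes one byte, deliberately never reads it, and counts how many of two consecutive epoll_wait calls report the descriptor.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* trigger_flags is 0 for level-triggered or EPOLLET for edge-triggered.
 * Returns how many of two epoll_wait() calls reported the unread data
 * (0..2), or -1 on setup error. */
int count_notifications(uint32_t trigger_flags)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    struct epoll_event ev;
    ev.events = EPOLLIN | trigger_flags;
    ev.data.fd = sv[1];

    int count = -1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[1], &ev) == 0 &&
        send(sv[0], "x", 1, 0) == 1) {
        struct epoll_event out;
        count = 0;
        /* The byte is never read, so it stays in the kernel buffer. */
        for (int i = 0; i < 2; i++) {
            if (epoll_wait(epfd, &out, 1, 100) == 1)
                count++;
        }
    }
    close(epfd);
    close(sv[0]);
    close(sv[1]);
    return count;
}
```

With flags 0 both waits report the descriptor; with EPOLLET only the first one does, which is why an ET handler has to drain the socket before waiting again.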
Advantages of epoll over select
epoll addresses all three shortcomings of select.
The first shortcoming is solved in epoll_ctl. Each time a new event is registered on the epoll handle (EPOLL_CTL_ADD in epoll_ctl), that fd is copied into the kernel once, rather than being copied again on every epoll_wait. epoll guarantees that each fd is copied only once over its whole lifetime; epoll_wait copies no fds in.
The second shortcoming: epoll attaches a callback to each fd. When the device becomes ready and wakes the waiters on its wait queue, the callback runs and adds the ready fd to a ready list. All epoll_wait has to do is check whether the ready list contains anything, with no traversal of every fd.
The third shortcoming: epoll has no such limit. The maximum number of fds it supports is the maximum number of files that can be opened, which is generally far greater than 2048. On a machine with 1 GB of memory, for example, it is about 100,000. This number depends heavily on system memory.
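High performance of epoll
epoll stores the file descriptor events to be monitored in a red-black tree, so adding, deleting, and modifying through epoll_ctl is fast.
epoll does not need to traverse every fd to find the ready ones; it returns the ready list directly.
epoll_wait copies only the ready events to user space, so its cost scales with the number of ready fds rather than the number of monitored fds. (The frequently repeated claim that epoll has used mmap since Linux 2.6 to share memory between kernel and user space, making it zero-copy, is not accurate: epoll_wait copies the ready events into the caller's buffer.)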
Is epoll's I/O model synchronous or asynchronous?
Concept definition
Synchronous I/O operation: causes the requesting process to block until the I/O operation completes.
Asynchronous I/O operation: does not cause the requesting process to block. The process only handles the notification that the I/O operation has completed and does not read or write the data itself; the kernel performs the actual reading and writing.
Blocking and non-blocking: whether the process/thread has to wait when the data it wants to access is not ready.
Asynchronous I/O requires an I/O call that does not block in any phase. As described earlier, an I/O operation has two phases: R1 waits for the data to be ready, and R2 copies the data from the kernel to the process. With epoll, the process blocks in epoll_wait during R1 and then performs the R2 copy itself by calling read or recv, so epoll is classified as synchronous I/O.
Reactor model
The central idea of Reactor is to register all the I/O events to be handled with a central I/O multiplexer while the main thread/process blocks on that multiplexer. As soon as an I/O event arrives or becomes ready, the multiplexer returns and dispatches the pre-registered I/O event to its handler.
Introduction to related concepts:
Event: a state. For example, a read-ready event is the state in which data can be read from the kernel.
Event demultiplexer: waiting for events is generally delegated to epoll or select. Events arrive randomly and asynchronously, so epoll has to be called in a loop. The module that wraps this loop in a framework is the event demultiplexer (in short, a wrapper around epoll).
Event handler: once an event occurs, some process or thread has to handle it. That handler is the event handler, and it generally runs on a different thread from the event demultiplexer.
General process of Reactor
1) The application registers read/write-ready events and their event handlers with the event demultiplexer.
2) The event demultiplexer waits for a read/write-ready event to occur.
3) A read/write-ready event occurs, the event demultiplexer is activated, and it calls the corresponding event handler.
4) The event handler first reads the data from the kernel into user space and then processes it.
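The four steps above can be sketched on top of epoll as follows. This is a single-threaded illustration only, not how any particular framework implements it: the type and function names (event_handler, struct registration, reactor_register, reactor_run_once, count_bytes) are all hypothetical, and "processing" is reduced to counting received bytes.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handler signature: the ready fd plus user context. */
typedef void (*event_handler)(int fd, void *ctx);

/* One registered event source. It must outlive its registration,
 * because epoll stores only a pointer to it. */
struct registration {
    int fd;
    event_handler on_readable;
    void *ctx;
};

/* Step 1: register a read-ready event and its handler. */
int reactor_register(int epfd, struct registration *reg)
{
    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.ptr = reg;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, reg->fd, &ev);
}

/* Steps 2 and 3: wait for ready events, then dispatch each one to its
 * handler. Returns the number of events dispatched, or -1 on error. */
int reactor_run_once(int epfd, int timeout_ms)
{
    struct epoll_event events[16];
    int n = epoll_wait(epfd, events, 16, timeout_ms);
    for (int i = 0; i < n; i++) {
        struct registration *reg = events[i].data.ptr;
        reg->on_readable(reg->fd, reg->ctx);
    }
    return n;
}

/* Step 4: an example handler. It copies the data from the kernel into a
 * user-space buffer, then "processes" it by adding the byte count to the
 * long that ctx points to. */
void count_bytes(int fd, void *ctx)
{
    char buf[256];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n > 0)
        *(long *)ctx += n;
}
```

A real event loop would call reactor_run_once repeatedly; the multithreaded variants listed below hand the handler call off to worker threads instead of running it inline.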
Single thread + Reactor
Multithreading + Reactor
Multithreading + multiple Reactor
General flow of Proactor Model
1) The application registers a read-completion event and its handler with the event demultiplexer and issues an asynchronous read request to the system.
2) The event demultiplexer waits for the read-completion event.
3) While the demultiplexer waits, the system uses parallel kernel threads to perform the actual read, copies the data into the process buffer, and finally notifies the event demultiplexer that the read is complete.
4) The event demultiplexer picks up the read-completion event and activates its handler.
5) The read-completion handler processes the data directly in the user process buffer.
The difference between Proactor and Reactor
Proactor is based on asynchronous I/O, while Reactor is generally based on I/O multiplexing.
With Proactor, the application does not have to copy data from the kernel to user space itself; the system does that.
That covers the Linux network I/O models and the Reactor model. The editor believes some of these points come up in day-to-day work, and hopes you learned something from this article.