2025-04-06 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
This article comes from the WeChat official account "Developing Internal Skills Practice" (ID: kfngxl). Author: Zhang Yanfei (allen).
As we all know, a server program has to call listen before it can receive requests from clients. The following code, for example, is all too familiar.

    int main(int argc, char const *argv[])
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        bind(fd, ...);
        listen(fd, 128);
        accept(fd, ...);
    }

So let's think about a question today: why do we need listen before we can accept a connection? In other words, what exactly does listen do inside the kernel when it runs?
If you also want to find out these secrets inside listen, please follow me!
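For readers who want something they can compile, here is a minimal sketch of the same socket, bind, listen sequence with the elided arguments filled in. The helper name make_listener, the loopback address, and the use of port 0 (let the kernel pick a free port) are assumptions made for illustration; they do not come from the original article.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1:port and
 * put it into the listening state. Returns the fd, or -1 on failure.
 * Passing port 0 asks the kernel to choose a free port. */
int make_listener(unsigned short port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* bind attaches the address; listen is the call this article studies. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd; /* the caller can now call accept(fd, NULL, NULL) */
}
```

A caller would use it as int fd = make_listener(0, 128); and then loop on accept. Everything the rest of the article describes happens inside that single listen call.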
1. Creating a socket
The first step in building a socket server is to create a socket by calling the socket function. From the user-space point of view, the call returns a file descriptor fd. Inside the kernel, however, that descriptor stands for a set of kernel objects, whose general structure is as follows.
[Figure: structure of the kernel objects behind a socket]
For now a rough understanding of this structure is enough; we will come back to it later when the source code calls through a function pointer.
2. The kernel executes listen
2.1 The listen system call
I found the source code of the listen system call in net/socket.c.

    // file: net/socket.c
    SYSCALL_DEFINE2(listen, int, fd, int, backlog)
    {
        // Look up the socket kernel object by fd
        sock = sockfd_lookup_light(fd, &err, &fput_needed);
        if (sock) {
            // Read the kernel parameter net.core.somaxconn
            somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn;
            if ((unsigned int)backlog > somaxconn)
                backlog = somaxconn;

            // Call the listen function registered by the protocol stack
            err = sock->ops->listen(sock, backlog);
        }
    }

A socket file descriptor in user mode is just an integer, which the kernel cannot use directly. So the first line of this function looks up the corresponding socket kernel object from the file descriptor passed in by the user.
It then reads the net.core.somaxconn kernel parameter, compares it with the backlog passed in by the user, and passes the smaller of the two on to the next step.
So although listen lets us pass in a backlog (a value that affects both the semi-connected queue and the fully connected queue), any value larger than net.core.somaxconn is capped at net.core.somaxconn.
Finally, the call to sock->ops->listen enters the listen function of the protocol stack.
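The clamp in the system call can be reproduced in a few lines of user-space C. This is only a sketch: effective_backlog is a hypothetical name, and the somaxconn argument stands in for the value of net.core.somaxconn.

```c
/* User-space sketch of the clamp performed in the listen system call.
 * effective_backlog is a hypothetical name, not a kernel function;
 * somaxconn stands in for net.core.somaxconn. */
unsigned int effective_backlog(int backlog, unsigned int somaxconn)
{
    /* Like the kernel excerpt, compare as unsigned: a negative backlog
     * turns into a very large value and is capped as well. */
    if ((unsigned int)backlog > somaxconn)
        return somaxconn;
    return (unsigned int)backlog;
}
```

With somaxconn at 128, a requested backlog of 512 comes out as 128, while a requested backlog of 5 stays at 5.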
2.2 Protocol stack listen
Here we need the socket kernel object structure diagram from the first section, which shows that sock->ops->listen actually executes inet_listen.

    // file: net/ipv4/af_inet.c
    int inet_listen(struct socket *sock, int backlog)
    {
        // Not yet in the LISTEN state
        if (old_state != TCP_LISTEN) {
            // Start listening
            err = inet_csk_listen_start(sk, backlog);
        }

        // Set the maximum length of the fully connected queue
        sk->sk_max_ack_backlog = backlog;
    }

Let's look at the last line first. sk->sk_max_ack_backlog is the maximum length of the fully connected queue. This gives us a key technical point: the length of the server's fully connected queue is the smaller of the backlog passed to listen and net.core.somaxconn.
If you hit a fully connected queue overflow in production and want to lengthen the queue, you need to consider both the backlog passed to listen and net.core.somaxconn.
Now back to the inet_csk_listen_start function.

    // file: net/ipv4/inet_connection_sock.c
    int inet_csk_listen_start(struct sock *sk, const int nr_table_entries)
    {
        struct inet_connection_sock *icsk = inet_csk(sk);

        // icsk->icsk_accept_queue is the receive queue; see Section 2.3
        // Allocation and initialization of the receive queue kernel object; see Section 2.4
        int rc = reqsk_queue_alloc(&icsk->icsk_accept_queue, nr_table_entries);
    }

At the beginning of the function, the struct sock object is cast to an inet_connection_sock named icsk.
A brief note on why this cast is legal: inet_connection_sock contains sock. tcp_sock, inet_connection_sock, inet_sock, and sock are nested layer by layer, much like inheritance in object-oriented languages.
For a TCP socket, the sock object is really a tcp_sock, so in TCP code it can be cast to tcp_sock, inet_connection_sock, or inet_sock whenever needed.
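The nesting trick can be shown with simplified stand-in structures. The sketch below is not the kernel layout: every type and field name ending in _demo is invented for illustration. The only point it demonstrates is that a struct embedded as the first member shares its address with the enclosing struct.

```c
#include <stddef.h>

/* Simplified stand-ins for sock, inet_sock, inet_connection_sock and
 * tcp_sock. All names and fields here are illustrative assumptions. */
struct sock_demo      { int sk_state; };
struct inet_sock_demo { struct sock_demo sk; int inet_port; };
struct icsk_demo      { struct inet_sock_demo icsk_inet; int accept_queue_len; };
struct tcp_sock_demo  { struct icsk_demo inet_conn; int snd_cwnd; };

/* Each struct embeds its "parent" as the FIRST member, so a pointer to
 * the innermost sock_demo of a tcp_sock_demo has the same address as the
 * enclosing objects. The cast is valid only when sk really points into
 * such an object, which is the guarantee TCP code relies on. */
struct icsk_demo *icsk_from_sock(struct sock_demo *sk)
{
    return (struct icsk_demo *)sk;
}
```

Given a struct tcp_sock_demo tp, the call icsk_from_sock((struct sock_demo *)&tp) returns a pointer to tp.inet_conn.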
The next line, the call to reqsk_queue_alloc, involves two important things: the definition of the receive queue's data structure, and the allocation and initialization of the receive queue. Both matter, so we cover them in Sections 2.3 and 2.4 respectively.
2.3 Receive queue definition
icsk->icsk_accept_queue is defined in inet_connection_sock and is an object of type request_sock_queue. It is the main data structure the kernel uses to receive client connection requests. The fully connected queue and the semi-connected queue we usually talk about are both implemented inside it.
Let's look at the code.

    // file: include/net/inet_connection_sock.h
    struct inet_connection_sock {
        struct inet_sock icsk_inet;
        struct request_sock_queue icsk_accept_queue;
        ...
    };

Next, the definition of request_sock_queue:

    // file: include/net/request_sock.h
    struct request_sock_queue {
        // Fully connected queue
        struct request_sock *rskq_accept_head;
        struct request_sock *rskq_accept_tail;

        // Semi-connected queue
        struct listen_sock *listen_opt;
    };

The fully connected queue needs no complicated lookups; accept simply takes entries in first-in-first-out (FIFO) order. It is therefore managed as a linked list through rskq_accept_head and rskq_accept_tail.
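A head/tail linked list used as a FIFO can be sketched as follows. This is a user-space illustration under assumed names (req_demo, accept_queue_demo, and the two functions); it is not the kernel's code, only the same shape of data structure.

```c
#include <stddef.h>

/* Illustrative request object; dl_next links entries in arrival order. */
struct req_demo {
    int id;
    struct req_demo *dl_next;
};

/* Head and tail pointers, in the spirit of rskq_accept_head/tail. */
struct accept_queue_demo {
    struct req_demo *head;
    struct req_demo *tail;
};

/* Append at the tail: a connection that finished the handshake. */
void accept_queue_add(struct accept_queue_demo *q, struct req_demo *req)
{
    req->dl_next = NULL;
    if (q->head == NULL)
        q->head = req;
    else
        q->tail->dl_next = req;
    q->tail = req;
}

/* Remove from the head: what an accept call would take next. */
struct req_demo *accept_queue_remove(struct accept_queue_demo *q)
{
    struct req_demo *req = q->head;
    if (req == NULL)
        return NULL;
    q->head = req->dl_next;
    if (q->head == NULL)
        q->tail = NULL;
    return req;
}
```

Both operations are constant time, which is all a FIFO needs; no search over the list is ever required.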
The data object behind the semi-connected queue is listen_opt, which is of type listen_sock.

    // file:
    struct listen_sock {
        u8 max_qlen_log;
        u32 nr_table_entries;
        ...
        struct request_sock *syn_table[0];
    };

During the third step of the handshake the server must quickly find the request_sock object it saved during the first step, so the semi-connected queue is actually managed as a hash table: struct request_sock *syn_table[0]. Both max_qlen_log and nr_table_entries relate to the length of the semi-connected queue.
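To make the hash-table idea concrete, here is a small user-space sketch: a header followed by an array of bucket heads, with pending requests chained per bucket. All names ending in _demo, the lopt_* functions, and the toy hash are assumptions for illustration; the kernel's actual hash function and request layout are different.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative pending request, keyed by the peer's address and port. */
struct syn_req_demo {
    uint32_t peer_addr;
    uint16_t peer_port;
    struct syn_req_demo *next;          /* chain within one bucket */
};

struct listen_sock_demo {
    uint32_t nr_table_entries;          /* bucket count, a power of two */
    struct syn_req_demo *syn_table[];   /* array of bucket heads */
};

/* Toy hash, for illustration only. */
static uint32_t bucket_of(uint32_t addr, uint16_t port, uint32_t nr)
{
    return (addr ^ port) & (nr - 1);
}

struct listen_sock_demo *lopt_new(uint32_t nr)
{
    struct listen_sock_demo *l =
        calloc(1, sizeof(*l) + nr * sizeof(l->syn_table[0]));
    if (l)
        l->nr_table_entries = nr;
    return l;
}

/* First handshake step: remember the request in its bucket. */
void lopt_add(struct listen_sock_demo *l, struct syn_req_demo *r)
{
    uint32_t b = bucket_of(r->peer_addr, r->peer_port, l->nr_table_entries);
    r->next = l->syn_table[b];
    l->syn_table[b] = r;
}

/* Third handshake step: find the saved request again. */
struct syn_req_demo *lopt_find(const struct listen_sock_demo *l,
                               uint32_t addr, uint16_t port)
{
    uint32_t b = bucket_of(addr, port, l->nr_table_entries);
    for (struct syn_req_demo *r = l->syn_table[b]; r; r = r->next)
        if (r->peer_addr == addr && r->peer_port == port)
            return r;
    return NULL;
}
```

A lookup only walks one bucket's chain instead of every pending request, which is why a hash table suits the semi-connected queue better than a plain list.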
2.4 Receive queue allocation and initialization
Now that we understand the data structures of the fully connected and semi-connected queues, let's go back to inet_csk_listen_start. It calls reqsk_queue_alloc to allocate and initialize the important icsk_accept_queue object.

    // file: net/ipv4/inet_connection_sock.c
    int inet_csk_listen_start(struct sock *sk, const int nr_table_entries)
    {
        int rc = reqsk_queue_alloc(&icsk->icsk_accept_queue, nr_table_entries);
    }

The reqsk_queue_alloc function creates and initializes the receive queue kernel object, request_sock_queue. That includes allocating memory, computing the length of the semi-connected queue, initializing the head of the fully connected queue, and so on.
Let's get into its source code:

    // file: net/core/request_sock.c
    int reqsk_queue_alloc(struct request_sock_queue *queue,
                          unsigned int nr_table_entries)
    {
        size_t lopt_size = sizeof(struct listen_sock);
        struct listen_sock *lopt;

        // Compute the length of the semi-connected queue
        nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
        nr_table_entries = ...

        // Allocate memory for the listen_sock object, which contains the semi-connected queue
        lopt_size += nr_table_entries * sizeof(struct request_sock *);
        if (lopt_size > PAGE_SIZE)
            lopt = vzalloc(lopt_size);
        else
            lopt = kzalloc(lopt_size, GFP_KERNEL);

        // Initialize the head of the fully connected queue
        queue->rskq_accept_head = NULL;

        // Set up the semi-connected queue
        lopt->nr_table_entries = nr_table_entries;
        queue->listen_opt = lopt;
    }

The function starts by declaring a struct listen_sock pointer. This listen_sock is what we usually call the semi-connected queue.
Next it computes the length of the semi-connected queue and, once the actual size is known, allocates the memory. Finally, the head of the fully connected queue, queue->rskq_accept_head, is set to NULL, and the semi-connected queue is attached to the receive queue, queue.
One detail to note: each element of the semi-connected queue is allocated only the size of a pointer (sizeof(struct request_sock *)). This is really a hash table. The actual request_sock objects are allocated during the handshake and linked into the hash table once their hash value has been computed.
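The size arithmetic is easy to check in isolation. The sketch below uses a made-up header struct (listen_sock_hdr_demo) in place of the real listen_sock, so the absolute numbers are not the kernel's; what it shows is that the table grows by exactly one pointer per entry.

```c
#include <stddef.h>

struct request_sock_demo;   /* opaque: the table stores only pointers */

/* Hypothetical stand-in for the fixed part of listen_sock. */
struct listen_sock_hdr_demo {
    unsigned char max_qlen_log;
    unsigned int  nr_table_entries;
};

/* Mirrors the excerpt:
 *   lopt_size  = sizeof(struct listen_sock);
 *   lopt_size += nr_table_entries * sizeof(struct request_sock *); */
size_t lopt_alloc_size(size_t nr_table_entries)
{
    return sizeof(struct listen_sock_hdr_demo)
         + nr_table_entries * sizeof(struct request_sock_demo *);
}
```

On a platform with 8-byte pointers, a table of 256 entries adds 2048 bytes on top of the header, regardless of how large an individual request object is.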
2.5 Semi-connected queue length calculation
In the previous section we mentioned that the length of the semi-connected queue is computed in reqsk_queue_alloc. Because the calculation is a little involved, it gets a section of its own.

    // file: net/core/request_sock.c
    int reqsk_queue_alloc(struct request_sock_queue *queue,
                          unsigned int nr_table_entries)
    {
        // Compute the length of the semi-connected queue
        nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
        nr_table_entries = max_t(u32, nr_table_entries, 8);
        nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);

        // For efficiency, do not record nr_table_entries itself;
        // record the exponent N for which 2 to the power of N equals nr_table_entries
        for (lopt->max_qlen_log = 3;
             (1 << lopt->max_qlen_log) < nr_table_entries;
             lopt->max_qlen_log++);
        ...
    }

The nr_table_entries passed in can be traced back to where reqsk_queue_alloc is called: it is the smaller of the kernel parameter net.core.somaxconn and the backlog the user passed to listen.
Inside reqsk_queue_alloc, three more comparisons and calculations are performed.
min_t(u32, nr_table_entries, sysctl_max_syn_backlog) takes the minimum once more, this time against the sysctl_max_syn_backlog kernel parameter.
max_t(u32, nr_table_entries, 8) ensures that nr_table_entries is never less than 8, so that a novice passing in too small a value cannot prevent connections from being established.
roundup_pow_of_two(nr_table_entries + 1) rounds the value up to a power of 2.
If that gave you a headache, fair enough: the description is rather abstract. Let's work through two concrete cases instead.
Case 1: suppose net.core.somaxconn on a server is 128 and net.ipv4.tcp_max_syn_backlog is 8192. How long is the semi-connected queue when the user passes a backlog of 5?
Following the code, we split the calculation into four steps; the final result is 16.
min(backlog, somaxconn) = min(5, 128) = 5
min(5, tcp_max_syn_backlog) = min(5, 8192) = 5
max(5, 8) = 8
roundup_pow_of_two(8 + 1) = 16
Case 2: somaxconn and tcp_max_syn_backlog stay the same, and the backlog passed to listen is raised to 512. Calculating again, the result is 256.
min(backlog, somaxconn) = min(512, 128) = 128
min(128, tcp_max_syn_backlog) = min(128, 8192) = 128
max(128, 8) = 128
roundup_pow_of_two(128 + 1) = 256
To sum up the calculation in one sentence: the semi-connected queue length is min(backlog, somaxconn, tcp_max_syn_backlog) + 1, rounded up to a power of 2, and never less than 16. The kernel source I am reading is 3.10; your kernel version may differ slightly.
If you hit a semi-connected queue overflow in production and want to lengthen the queue, you need to consider three values together: the backlog passed to listen and the kernel parameters somaxconn and tcp_max_syn_backlog.
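The four steps can be written as a small user-space function and checked against the two worked cases. This is a sketch of the 3.10 logic as described in this article; syn_queue_len and roundup_pow_of_two_demo are hypothetical names, and the three arguments stand in for the listen backlog, net.core.somaxconn, and net.ipv4.tcp_max_syn_backlog.

```c
/* Smallest power of two that is >= n. Assumes 1 <= n <= 2^31 so the
 * shift cannot overflow. Illustrative replacement for the kernel's
 * roundup_pow_of_two. */
static unsigned int roundup_pow_of_two_demo(unsigned int n)
{
    unsigned int p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Semi-connected queue length as computed in the 3.10 excerpts above. */
unsigned int syn_queue_len(unsigned int backlog, unsigned int somaxconn,
                           unsigned int max_syn_backlog)
{
    /* Step 1: the listen system call caps backlog at somaxconn. */
    unsigned int n = backlog < somaxconn ? backlog : somaxconn;
    /* Step 2: cap again at tcp_max_syn_backlog. */
    if (n > max_syn_backlog)
        n = max_syn_backlog;
    /* Step 3: never go below 8. */
    if (n < 8)
        n = 8;
    /* Step 4: add one and round up to a power of two. */
    return roundup_pow_of_two_demo(n + 1);
}
```

Calling syn_queue_len(5, 128, 8192) yields 16 and syn_queue_len(512, 128, 8192) yields 256, matching the two cases worked out by hand.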
Finally, to speed up comparisons, the kernel does not record the semi-connected queue length directly. Instead it uses a less obvious approach and records only the exponent: if the queue length is 16, max_qlen_log is 4 (2 to the power of 4 is 16); if the queue length is 256, max_qlen_log is 8 (2 to the power of 8 is 256). All you need to know is that this is done for performance.
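The loop that derives the exponent is short enough to reproduce in user space. The function name max_qlen_log_for is an assumption for illustration; the loop body follows the for statement in the reqsk_queue_alloc excerpt.

```c
/* Returns the smallest exponent, starting from 3, whose power of two is
 * at least nr_table_entries. Assumes nr_table_entries <= 2^31 so the
 * shift stays within range. */
unsigned int max_qlen_log_for(unsigned int nr_table_entries)
{
    unsigned int log = 3;
    while ((1u << log) < nr_table_entries)
        log++;
    return log;
}
```

For a queue length of 16 the function returns 4, and for 256 it returns 8, the same values as in the text.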
3. Summary
Computer science students memorize the server-side socket flow by rote: first bind, then listen, then accept. Yet we seem to pay little attention to why listen is needed before accept.
From today's brief walk through the listen source code, we found that the main job of listen is to allocate and initialize the receive queue, which comprises the fully connected queue and the semi-connected queue. The fully connected queue is a linked list, while the semi-connected queue is a hash table because it needs fast lookups (a more accurate name for it would in fact be the semi-connected hash table).
The fully connected and semi-connected queues are two key data structures in the three-way handshake; with them in place, the server can respond properly to a client's handshake. That is why the server has to call listen.
Along the way we also learned how the kernel determines the lengths of the two queues.
1. Length of the fully connected queue
The maximum length of the fully connected queue is the smaller of the backlog passed to listen and net.core.somaxconn. To lengthen it, adjust both backlog and somaxconn.
2. Length of the semi-connected queue
The maximum length of the semi-connected queue is min(backlog, somaxconn, tcp_max_syn_backlog) + 1, rounded up to a power of 2, and never less than 16. To lengthen it, you need to consider backlog, somaxconn, and tcp_max_syn_backlog together. Any article on the Internet claiming that changing a single parameter is enough to lengthen the semi-connected queue is wrong.
So don't overlook the details; you may find unexpected gains!