2025-04-05 Update From: SLTechnology News&Howtos
This article walks through how Linux sends a packet, using a UDP socket over IPv4 as the example and following the packet from the socket layer down to the device driver.
Socket layer
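+-------------+
| Application |
+-------------+
       |
       ↓
+------------------------------------------+
| socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) |
+------------------------------------------+
       |
       ↓
+-------------------+
| sendto(sock, ...) |
+-------------------+
       |
       ↓
+--------------+
| inet_sendmsg |
+--------------+
       |
       ↓
+---------------+
| inet_autobind |
+---------------+
       |
       ↓
+-----------+
| UDP layer |
+-----------+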
socket(...): creates a socket structure and initializes its operation functions. Because we asked for a UDP socket, the structure holds the UDP-specific functions.
sendto(sock, ...): the application calls this function to start sending a packet; it in turn calls inet_sendmsg.
inet_sendmsg: checks whether the socket is already bound to a source port. If not, it calls inet_autobind to assign one, and then calls the UDP layer function.
inet_autobind: calls the get_port function registered on the socket to obtain a free port. Because this is a UDP socket, get_port resolves to the corresponding function in the UDP code.
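The automatic port assignment described above can be observed from user space. The following is a minimal sketch, assuming a Linux host with a working loopback interface; the helper name autobind_demo is made up for this illustration. It sends one datagram from a socket that was never bound and compares the local port before and after the first sendto.

```python
import socket

def autobind_demo():
    """Return the sender's local port before and after its first sendto()."""
    # A throwaway receiver on loopback so the datagram has a destination.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    receiver.bind(("127.0.0.1", 0))

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        port_before = sender.getsockname()[1]  # no source port bound yet
        sender.sendto(b"hello", receiver.getsockname())
        port_after = sender.getsockname()[1]   # port picked by the kernel
        return port_before, port_after
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print(autobind_demo())
```

On Linux the first value should be 0 and the second an ephemeral port chosen by the kernel, which matches the inet_autobind step.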
UDP layer
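       |
       ↓
+-------------+
| udp_sendmsg |
+-------------+
       |
       ↓
+----------------------+
| ip_route_output_flow |
+----------------------+
       |
       ↓
+-------------+
| ip_make_skb |
+-------------+
       |
       ↓
+------------------------+
| udp_send_skb(skb, fl4) |
+------------------------+
       |
       ↓
+----------+
| IP layer |
+----------+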
udp_sendmsg: the entry point for sending packets in the UDP module. This function is long: it first calls ip_route_output_flow to obtain routing information (mainly the source IP and the network card), then calls ip_make_skb to build an skb, and finally associates the network card information with the skb.
ip_route_output_flow: uses the routing table and the destination IP to decide which device the packet should leave from. If the socket is not bound to a source IP, the function also picks the most suitable source IP from the routing table. If the socket is bound to a source IP but the routing table says the destination cannot be reached from the network card that owns that source IP, the packet is discarded, the send fails, and sendto returns an error. Finally, the function stores the device and source IP it found in a flowi4 structure and returns it to udp_sendmsg.
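The source IP selection can be observed without sending anything. This is a small sketch, assuming a Linux host; source_ip_for is a hypothetical helper name and port 9 is an arbitrary placeholder. Calling connect() on a UDP socket transmits no packet; as far as I know it only performs a route lookup comparable to the one above and records the chosen source address on the socket.

```python
import socket

def source_ip_for(dest_ip, dest_port=9):
    """Ask the kernel which source IP it would use to reach dest_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # No packet is sent here: connect() on UDP only resolves the route
        # and stores the selected local address on the socket.
        sock.connect((dest_ip, dest_port))
        return sock.getsockname()[0]

if __name__ == "__main__":
    print(source_ip_for("127.0.0.1"))
```

If the routing table has no route to the destination, connect() raises OSError, which is the user-space view of the failure case described for ip_route_output_flow.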
ip_make_skb: builds the skb. The IP header is already allocated in the resulting skb and partly initialized (this is where the source IP of the IP header is set). The function also calls __ip_append_data; if fragmentation is needed, it happens in __ip_append_data, which also checks whether the socket's send buffer has been used up and returns ENOBUFS if it has.
udp_send_skb(skb, fl4): fills the UDP header into the skb, handles the checksum, and then calls the corresponding IP layer function.
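To make the "fill in the UDP header and handle the checksum" step concrete, here is a user-space sketch of the same arithmetic. It is not the kernel code: the function names are invented for this example, and it only follows the standard UDP-over-IPv4 layout (8-byte header, checksum computed over a pseudo-header plus the datagram).

```python
import socket
import struct

def ones_complement_sum(data):
    """16-bit one's-complement sum used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(src_ip, dst_ip, udp_length):
    """IPv4 pseudo-header: source, destination, zero, protocol, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_UDP, udp_length))

def build_udp_datagram(src_ip, dst_ip, src_port, dst_port, payload):
    """Return UDP header + payload with the checksum field filled in."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    covered = pseudo_header(src_ip, dst_ip, length) + header + payload
    csum = ~ones_complement_sum(covered) & 0xFFFF
    if csum == 0:
        csum = 0xFFFF    # 0 on the wire means "no checksum", so send 0xFFFF
    return struct.pack("!HHHH", src_port, dst_port, length, csum) + payload

def checksum_ok(src_ip, dst_ip, datagram):
    """A valid datagram sums to 0xFFFF together with its pseudo-header."""
    covered = pseudo_header(src_ip, dst_ip, len(datagram)) + datagram
    return ones_complement_sum(covered) == 0xFFFF
```

The addresses and ports passed in are up to the caller; in the kernel they come from the flowi4 structure and the socket.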
IP layer
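          |
          ↓
   +-------------+
   | ip_send_skb |
   +-------------+
          |
          ↓
+-------------------+       +-------------------+              +---------------+
| __ip_local_out_sk |------>| NF_INET_LOCAL_OUT |------------->| dst_output_sk |
+-------------------+       +-------------------+              +---------------+
                                                                       |
                                                                       ↓
+------------------+        +----------------------+             +-----------+
| ip_finish_output |<-------| NF_INET_POST_ROUTING |<------------| ip_output |
+------------------+        +----------------------+             +-----------+
          |
          ↓
+-------------------+       +------------------+           +----------------------+
| ip_finish_output2 |------>| dst_neigh_output |---------->| neigh_resolve_output |
+-------------------+       +------------------+           +----------------------+
                                                                       |
                                                                       ↓
                                                              +----------------+
                                                              | dev_queue_xmit |
                                                              +----------------+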
ip_send_skb: the entry point for sending packets in the IP module. This function simply calls the functions below.
__ip_local_out_sk: sets the length and checksum of the IP header, then calls the netfilter hook below.
NF_INET_LOCAL_OUT: a netfilter hook. How packets are handled here can be configured through iptables. If the packet is not dropped, processing continues.
dst_output_sk: calls the appropriate output function based on the information in the skb. For UDP over IPv4 this is ip_output.
ip_output: writes the network card information obtained by udp_sendmsg above into the skb, then calls the NF_INET_POST_ROUTING hook.
NF_INET_POST_ROUTING: the user may have configured SNAT here, which changes the routing information of the skb.
ip_finish_output: checks whether the routing information changed in the previous step. If it did, dst_output_sk must be called again (this time it may not lead to ip_output but to the output function specified by netfilter, for example xfrm4_transport_output). Otherwise processing continues.
ip_finish_output2: finds the next hop address in the routing table based on the destination IP, then calls __ipv4_neigh_lookup_noref to look up the next hop's neigh information in the ARP table. If there is none, it calls __neigh_create to construct an empty neigh structure.
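The idea of "find the next hop for a destination IP" can be sketched as a longest-prefix match. This is a toy model, not how the kernel stores or searches routes; the routing table below and the function name next_hop are hypothetical and exist only for illustration.

```python
import ipaddress

# Hypothetical routing table: (destination network, gateway).
# A gateway of None means the network is directly attached (on-link).
ROUTES = [
    ("0.0.0.0/0", "192.168.0.1"),
    ("192.168.0.0/24", None),
    ("10.8.0.0/16", "192.168.0.254"),
]

def next_hop(dest, routes=ROUTES):
    """Return the next hop IP for dest using longest-prefix match."""
    dest_ip = ipaddress.ip_address(dest)
    best = None
    for network, gateway in routes:
        net = ipaddress.ip_network(network)
        if dest_ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    if best is None:
        return None  # no matching route
    # For an on-link destination the next hop is the destination itself.
    return best[1] if best[1] is not None else str(dest_ip)
```

The next hop IP, not the final destination IP, is what gets looked up in the ARP table in the following step.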
dst_neigh_output: if ip_finish_output2 did not obtain the neigh information in the previous step, this function goes to neigh_resolve_output; otherwise it calls neigh_hh_output directly, which fills the MAC address from the neigh information into the skb and then calls dev_queue_xmit to send the packet.
neigh_resolve_output: sends an ARP request to obtain the MAC address of the next hop, fills that MAC address into the skb, and then calls dev_queue_xmit.
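The choice between the two output paths can be modeled against the ARP table that Linux exposes in /proc/net/arp. The sketch below is a simplification: the real decision depends on the neighbour entry's state inside the kernel, while here it depends only on the "completed" flag (0x2). The sample table contents and the helper names are made up for illustration.

```python
# Hypothetical sample in the format of /proc/net/arp.
SAMPLE_PROC_NET_ARP = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.0.1      0x1         0x2         aa:bb:cc:dd:ee:ff     *        eth0
192.168.0.50     0x1         0x0         00:00:00:00:00:00     *        eth0
"""

ATF_COM = 0x02  # flag bit meaning the entry has a resolved MAC address

def parse_arp_table(text):
    """Parse /proc/net/arp style text into {ip: {mac, dev, complete}}."""
    table = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, _hw_type, flags, mac, _mask, dev = fields[:6]
        complete = bool(int(flags, 16) & ATF_COM)
        table[ip] = {"mac": mac, "dev": dev, "complete": complete}
    return table

def output_path(next_hop_ip, table):
    """Toy version of the decision made in dst_neigh_output."""
    entry = table.get(next_hop_ip)
    if entry and entry["complete"]:
        return "neigh_hh_output", entry["mac"]  # MAC known: send right away
    return "neigh_resolve_output", None          # MAC unknown: ARP first
```

On a real machine the same parser could be fed the contents of /proc/net/arp instead of the sample string.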
Netdevice subsystem
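                       |
                       ↓
              +----------------+
  +-----------| dev_queue_xmit |
  |           +----------------+
  |                    |
  |                    ↓
  |           +-----------------+
  |           | Traffic Control |
  |           +-----------------+
  | loopback           |
  |    or              |<--------------------------------------------------------+
  | IP tunnels         |                                                         |
  |                    ↓                                                         |
  |         +---------------------+ Failed +----------------------+      +---------------+
  +-------->| dev_hard_start_xmit |------->| raise NET_TX_SOFTIRQ |----->| net_tx_action |
            +---------------------+        +----------------------+      +---------------+
                       |
            +----------+---------------+
            |                          |
            ↓                          ↓
    +----------------+    +-------------------------+
    | ndo_start_xmit |    | packet taps (AF_PACKET) |
    +----------------+    +-------------------------+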
dev_queue_xmit: the entry function of the netdevice subsystem. It first looks up the qdisc of the device. If there is none (as with loopback or IP tunnels), it calls dev_hard_start_xmit directly; otherwise the packet is handled by the Traffic Control module.
Traffic Control: mainly performs filtering and priority handling. If the queue is full here, the packet is dropped; see the traffic control documentation for details. When this step is done, processing also continues to dev_hard_start_xmit.
dev_hard_start_xmit: first copies the skb to the "packet taps", which is where tcpdump gets its data, and then calls ndo_start_xmit. If dev_hard_start_xmit returns an error (NETDEV_TX_BUSY in most cases), the caller puts the skb aside, raises the NET_TX_SOFTIRQ soft interrupt, and leaves it to the soft interrupt handler net_tx_action to retry later. For loopback or IP tunnels there is no retry logic after a failure.
ndo_start_xmit: a function pointer that points to the specific driver's function for sending data.
Device Driver
ndo_start_xmit is bound to the corresponding function of the specific network card driver. From this step on, the packet is handled by the driver. Different drivers handle it differently, so the details are not covered here; the general flow is:
Put the skb into the network card's send queue.
Notify the network card to send the packet.
When the network card has finished sending, it raises an interrupt to the CPU.
After receiving the interrupt, clean up the skb.
While the driver is sending data, it also needs to interact with the netdevice subsystem in places. For example, when the network card's queue is full, the driver must tell the upper layer to stop sending, and then notify it to resume once the queue has room again.
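The interaction between a full send queue, the busy return value, and the later retry can be illustrated with a toy simulation. Everything here is hypothetical (the ToyTxRing class, the send_all helper, string constants in place of the kernel's numeric return codes); it mirrors only the control flow described above, not any real driver.

```python
from collections import deque

# Symbolic stand-ins for the kernel return codes.
NETDEV_TX_OK = "NETDEV_TX_OK"
NETDEV_TX_BUSY = "NETDEV_TX_BUSY"

class ToyTxRing:
    """Toy model of a network card's fixed-size send queue."""

    def __init__(self, size):
        self.size = size
        self.ring = deque()
        self.queue_stopped = False

    def start_xmit(self, skb):
        """Plays the role of ndo_start_xmit."""
        if len(self.ring) >= self.size:
            self.queue_stopped = True
            return NETDEV_TX_BUSY
        self.ring.append(skb)            # put the skb into the send queue
        if len(self.ring) == self.size:
            self.queue_stopped = True    # tell the upper layer to stop sending
        return NETDEV_TX_OK

    def tx_complete_irq(self, count=1):
        """Plays the role of the 'send finished' interrupt handler."""
        n = min(count, len(self.ring))
        freed = [self.ring.popleft() for _ in range(n)]  # clean up sent skbs
        if freed:
            self.queue_stopped = False   # room again: wake the upper layer
        return freed

def send_all(ring, skbs):
    """Send every skb, retrying after a busy result like net_tx_action would."""
    pending = deque(skbs)
    sent = []
    while pending:
        if ring.start_xmit(pending[0]) == NETDEV_TX_BUSY:
            sent.extend(ring.tx_complete_irq())  # pretend one send completed
            continue                             # retry the same skb
        pending.popleft()
    sent.extend(ring.tx_complete_irq(count=ring.size))
    return sent
```

With a ring of size 2, the third packet gets a busy result and is only accepted after a completion frees a slot, which is the stop and wake behaviour described in the paragraph above.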
Other
SO_SNDBUF: as the process above shows, UDP has no dedicated send buffer; SO_SNDBUF is only a limit. When the memory used by the skbs allocated for this socket exceeds this value, ENOBUFS is returned, so as long as no ENOBUFS errors occur there is no point in increasing it. The man page for sendto says of ENOBUFS: "(Normally, this does not occur in Linux. Packets are just silently dropped when a device queue overflows.)" The device queue mentioned here should be the queue in Traffic Control, which suggests that on Linux the default SO_SNDBUF value is large enough for that queue. What puzzles me is that the length and number of queues are configurable; if they are configured large enough, it seems ENOBUFS ought to be possible.
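The limit can be read and changed per socket with getsockopt and setsockopt. A minimal sketch follows; sndbuf_roundtrip is a made-up helper name and the requested size is arbitrary. Note that on Linux, according to the socket(7) man page, the kernel doubles the requested value and caps it at net.core.wmem_max, so the value read back usually differs from the value that was set.

```python
import socket

def sndbuf_roundtrip(requested):
    """Return (default SO_SNDBUF, SO_SNDBUF after requesting a new size)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        return default, effective

if __name__ == "__main__":
    print(sndbuf_roundtrip(65536))
```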
txqueuelen: many sources say this controls the length of the queue in the qdisc, but it seems that only some qdisc types use this setting, such as pfifo_fast, the Linux default.
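The current txqueuelen of a device can be read from sysfs. This is a small sketch, assuming a Linux system with /sys mounted; tx_queue_len is a hypothetical helper name, and it returns None when the device or the file does not exist.

```python
from pathlib import Path

def tx_queue_len(dev="lo"):
    """Return the txqueuelen of a network device, or None if unavailable."""
    path = Path("/sys/class/net") / dev / "tx_queue_len"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    print(tx_queue_len("lo"))
```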
Hardware TX: a network card generally has its own ring queue, whose size can be configured with ethtool. When the driver receives a send request, it usually puts it in this queue and then tells the network card to send the data. When the queue is full, the driver returns NETDEV_TX_BUSY to the caller in the upper layer.
Packet taps (AF_PACKET): a packet passes through here both when it is sent for the first time and when a send is retried. If there is a retry, I am not sure whether tcpdump captures the packet twice. Logically it should not; perhaps I have misunderstood something.