
How to handle the tcp_mark_head_lost warning reported by the Linux kernel

2025-04-02 Update From: SLTechnology News&Howtos


Problem description

Recently, a host logged the following kernel warning:

Jul 8 10:47:42 cztest kernel: ------------[ cut here ]------------
Jul 8 10:47:42 cztest kernel: WARNING: at net/ipv4/tcp_input.c:2269 tcp_mark_head_lost+0x113/0x290()
Jul 8 10:47:42 cztest kernel: Modules linked in: iptable_filter ip_tables binfmt_misc cdc_ether usbnet mii xt_multiport dm_mirror dm_region_hash dm_log dm_mod intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif ipmi_devintf ipmi_si mei_me pcspkr iTCO_wdt mxm_wmi iTCO_vendor_support dcdbas mei sg sb_edac edac_core ipmi_msghandler shpchp lpc_ich wmi acpi_power_meter xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper crct10dif_pclmul crct10dif_common syscopyarea crc32c_intel sysfillrect sysimgblt fb_sys_fops igb ttm ptp drm ahci pps_core libahci dca i2c_algo_bit libata megaraid_sas i2c_core fjes [last unloaded: ip_tables]
Jul 8 10:47:42 cztest kernel: CPU: 10 PID: 0 Comm: swapper/10 Tainted: G        W      ------------   3.10.0-514.16.1.el7.x86_64 #1
Jul 8 10:47:42 cztest kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.3.4 11/08/2016
Jul 8 10:47:42 cztest kernel: 0000000000000000 dd79fe633eacd853 ffff88103e743880 ffffffff81686ac3
Jul 8 10:47:42 cztest kernel: ffff88103e7438b8 ffffffff81085cb0 ffff8806d5c57800 ffff88010a4e6c80
Jul 8 10:47:42 cztest kernel: 0000000000000001 00000000f90e778c 0000000000000001 ffff88103e7438c8
Jul 8 10:47:42 cztest kernel: Call Trace:
Jul 8 10:47:42 cztest kernel: [] dump_stack+0x19/0x1b
Jul 8 10:47:42 cztest kernel: [] warn_slowpath_common+0x70/0xb0
Jul 8 10:47:42 cztest kernel: [] warn_slowpath_null+0x1a/0x20
Jul 8 10:47:42 cztest kernel: [] tcp_mark_head_lost+0x113/0x290
Jul 8 10:47:42 cztest kernel: [] tcp_update_scoreboard+0x67/0x80
Jul 8 10:47:42 cztest kernel: [] tcp_fastretrans_alert+0x6dd/0xb50
Jul 8 10:47:42 cztest kernel: [] tcp_ack+0x8dd/0x12e0
Jul 8 10:47:42 cztest kernel: [] tcp_rcv_established+0x118/0x760
Jul 8 10:47:42 cztest kernel: [] tcp_v4_do_rcv+0x10a/0x340
Jul 8 10:47:42 cztest kernel: [] ? security_sock_rcv_skb+0x16/0x20
Jul 8 10:47:42 cztest kernel: [] tcp_v4_rcv+0x799/0x9a0
Jul 8 10:47:42 cztest kernel: [] ? iptable_filter_hook+0x36/0x80 [iptable_filter]
Jul 8 10:47:42 cztest kernel: [] ip_local_deliver_finish+0xb4/0x1f0
Jul 8 10:47:42 cztest kernel: [] ip_local_deliver+0x59/0xd0
Jul 8 10:47:42 cztest kernel: [] ? ip_rcv_finish+0x350/0x350
Jul 8 10:47:42 cztest kernel: [] ip_rcv_finish+0x8a/0x350
Jul 8 10:47:42 cztest kernel: [] ip_rcv+0x2b6/0x410
Jul 8 10:47:42 cztest kernel: [] __netif_receive_skb_core+0x582/0x800
Jul 8 10:47:42 cztest kernel: [] ? tcp4_gro_receive+0x134/0x1b0
Jul 8 10:47:42 cztest kernel: [] ? __slab_free+0x81/0x2f0
Jul 8 10:47:42 cztest kernel: [] __netif_receive_skb+0x18/0x60
Jul 8 10:47:42 cztest kernel: [] netif_receive_skb_internal+0x40/0xc0
Jul 8 10:47:42 cztest kernel: [] napi_gro_receive+0xd8/0x130
Jul 8 10:47:42 cztest kernel: [] igb_clean_rx_irq+0x387/0x700 [igb]
Jul 8 10:47:42 cztest kernel: [] ? skb_release_data+0xf2/0x140
Jul 8 10:47:42 cztest kernel: [] igb_poll+0x383/0x770 [igb]
Jul 8 10:47:42 cztest kernel: [] ? tcp_write_timer_handler+0x200/0x200
Jul 8 10:47:42 cztest kernel: [] net_rx_action+0x170/0x380
Jul 8 10:47:42 cztest kernel: [] __do_softirq+0xef/0x280
Jul 8 10:47:42 cztest kernel: [] call_softirq+0x1c/0x30
Jul 8 10:47:42 cztest kernel: [] do_softirq+0x65/0xa0
Jul 8 10:47:42 cztest kernel: [] irq_exit+0x115/0x120
Jul 8 10:47:42 cztest kernel: [] do_IRQ+0x58/0xf0
Jul 8 10:47:42 cztest kernel: [] common_interrupt+0x6d/0x6d
Jul 8 10:47:42 cztest kernel: [] ? cpuidle_enter_state+0x52/0xc0
Jul 8 10:47:42 cztest kernel: [] cpuidle_idle_call+0xd9/0x210
Jul 8 10:47:42 cztest kernel: [] arch_cpu_idle+0xe/0x30
Jul 8 10:47:42 cztest kernel: [] cpu_startup_entry+0x245/0x290
Jul 8 10:47:42 cztest kernel: [] start_secondary+0x1ba/0x230
Jul 8 10:47:42 cztest kernel: ---[ end trace 6bc65b0c591c1794 ]---
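To see how often a host hits this warning, the WARNING line can be counted in the kernel log. The following is a minimal sketch, not part of the original article: the function name `count_tcp_mark_head_lost_warnings` is hypothetical, and the log location in the usage comment (`/var/log/messages`) is an assumption based on the syslog-style timestamps above.

```shell
#!/bin/sh
# Hypothetical helper: count tcp_mark_head_lost warnings in a kernel log.
# Reads the file given as an argument, or standard input when no file is given.
count_tcp_mark_head_lost_warnings() {
    # grep -c prints the number of matching lines; it exits 1 when the
    # count is 0, so "|| true" keeps the function's exit status at 0.
    grep -c -- 'WARNING: at net/ipv4/tcp_input\.c:[0-9]* tcp_mark_head_lost' "$@" || true
}

# Usage (assumed log locations, adjust for your system):
#   count_tcp_mark_head_lost_warnings /var/log/messages
#   dmesg | count_tcp_mark_head_lost_warnings
```

A count that keeps growing suggests the condition recurs under normal traffic rather than being a one-off event.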

The host environment is as follows:

System | Dell Inc.; PowerEdge R620
Platform | Linux
Kernel | CentOS 3.10.0-514.16.1.el7.x86_64
Total Memory | 64 GB

Analysis

The way this stack trace is printed is similar to the XFS warning case. Broadly, when SACK and FACK are enabled in the kernel, the fast retransmit and selective retransmission needed during network transfers are handled by the tcp_mark_head_lost function in tcp_input.c, which marks the segments lost in transit. As shown below, the stack trace reported by the system is triggered by the tcp_verify_left_out call inside tcp_mark_head_lost:

// source/include/net/tcp.h
#define tcp_verify_left_out(tp) WARN_ON(tcp_left_out(tp) > tp->packets_out)

static inline unsigned int tcp_left_out(const struct tcp_sock *tp)
{
        return tp->sacked_out + tp->lost_out;
}

// source/include/asm-generic/bug.h
#define __WARN() warn_slowpath_null(__FILE__, __LINE__)

#ifndef WARN_ON
#define WARN_ON(condition) ({ \
        int __ret_warn_on = !!(condition); \
        if (unlikely(__ret_warn_on)) \
                __WARN(); \
        unlikely(__ret_warn_on); \
})
#endif

// source/net/ipv4/tcp_input.c
/* Detect loss in event "A" above by marking head of queue up as lost.
 * For FACK or non-SACK(Reno) senders, the first "packets" number of segments
 * are considered lost. For RFC3517 SACK, a segment is considered lost if it
 * has at least tp->reordering SACKed seqments above it; "packets" refers to
 * the maximum SACKed segments to pass before reaching this limit.
 */
static void tcp_mark_head_lost(struct sock *sk, int packets, int mark_head)
{
        struct tcp_sock *tp = tcp_sk(sk);
        tcp_verify_left_out(tp); // triggers dump_stack
}
...
static void tcp_update_scoreboard(struct sock *sk, int fast_rexmit)
{
        struct tcp_sock *tp = tcp_sk(sk);

        if (tcp_is_reno(tp)) {
                tcp_mark_head_lost(sk, 1, 1);
        } else if (tcp_is_fack(tp)) {
                int lost = tp->fackets_out - tp->reordering;
                if (lost <= 0)
                        lost = 1;
                tcp_mark_head_lost(sk, lost, 0);
        } else {
                int sacked_upto = tp->sacked_out - tp->reordering;
                if (sacked_upto >= 0)
                        tcp_mark_head_lost(sk, sacked_upto, 0);
                else if (fast_rexmit)
                        tcp_mark_head_lost(sk, 1, 1);
        }
}
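The condition behind the warning is plain arithmetic: the segments counted as SACKed plus the segments counted as lost must never exceed the segments still in flight. The sketch below only mirrors that comparison with made-up numbers so the trigger is easy to see; the function name is hypothetical and nothing here reads real socket state.

```shell
#!/bin/sh
# Illustrative model of the check behind tcp_verify_left_out():
#   WARN_ON(tp->sacked_out + tp->lost_out > tp->packets_out)
# Arguments: sacked_out lost_out packets_out
# Returns 0 (true) when the warning condition would fire.
tcp_left_out_exceeds_packets_out() {
    sacked_out=$1
    lost_out=$2
    packets_out=$3
    [ $((sacked_out + lost_out)) -gt "$packets_out" ]
}

# Example with hypothetical counters:
#   tcp_left_out_exceeds_packets_out 3 2 4   -> true  (3 + 2 = 5 > 4)
#   tcp_left_out_exceeds_packets_out 3 1 4   -> false (3 + 1 = 4, not > 4)
```

In a healthy connection the counters stay consistent, so the warning indicates that the kernel's own bookkeeping has gone wrong rather than that the network misbehaved.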

According to Red Hat knowledge base article redhat-536483, this warning is generally caused by a TCP bug: it can be triggered when the kernel uses a TCP socket buffer linked list that has already been freed:

Root Cause

A use after free issue related to the TCP kernel socket buffer linked list. Thus it is a bug in the TCP kernel code. Although the bug is in TCP kernel code, but it could get triggered in multiple ways. It could get triggered due to NFS, or due to even an application (say java process).

Resolution

Upgrade the kernel

As the changelog below shows, Red Hat appears to have fixed a use-after-free bug in the tcp_* functions in kernel 3.10.0-520, so upgrading is worth trying:

CentOS 7.x changelog

* Thu Nov 03 2016 Rafael Aquini [3.10.0-520.el7]
- [net] tcp: fix use after free in tcp_xmit_retransmit_queue() (Mateusz Guzik) [1379531] {CVE-2016-6828}
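Before planning an upgrade, it helps to check whether the running kernel already includes the build named in the changelog. The following is a rough sketch under stated assumptions: the function name is hypothetical, the 520 threshold is taken from the changelog entry above, and the comparison is only meaningful for el7 kernels of the form 3.10.0-N. Passing the check does not guarantee the warning is gone, since the article itself only says the fix may apply.

```shell
#!/bin/sh
# Hypothetical helper: does an el7 kernel release include build 3.10.0-520?
# Argument: a release string (defaults to the running kernel, uname -r).
# Returns 0 if build >= 520, 1 if older, 2 if the release is not 3.10.0-N.
kernel_has_tcp_uaf_fix() {
    release="${1:-$(uname -r)}"
    case "$release" in
        3.10.0-[0-9]*)
            build="${release#3.10.0-}"   # e.g. "514.16.1.el7.x86_64"
            build="${build%%.*}"         # e.g. "514"
            [ "$build" -ge 520 ]
            ;;
        *)
            return 2
            ;;
    esac
}

# Usage:
#   kernel_has_tcp_uaf_fix && echo "build 520 or later" || echo "older or unknown"
```

For the host in this article, `3.10.0-514.16.1.el7.x86_64` yields build 514, which is older than 520.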

Disable FACK/SACK

According to the Red Hat knowledge base, tcp_mark_head_lost is mainly used to mark segments as lost during fast retransmit and selective acknowledgment (SACK), so temporarily disabling the FACK and SACK parameters may avoid the problem:

sysctl -w net.ipv4.tcp_fack=0
sysctl -w net.ipv4.tcp_sack=0
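Settings changed with `sysctl -w` apply only to the running kernel and are lost on reboot. If the workaround needs to persist, one option is a drop-in file under `/etc/sysctl.d/`. The sketch below only prints the file content; the function name and the file name `90-tcp-sack.conf` in the usage comment are hypothetical choices, not something the original article prescribes.

```shell
#!/bin/sh
# Hypothetical helper: print a sysctl drop-in that disables FACK and SACK.
render_tcp_sack_sysctl_conf() {
    cat <<'EOF'
# Workaround for tcp_mark_head_lost warnings: disable FACK and SACK.
net.ipv4.tcp_fack = 0
net.ipv4.tcp_sack = 0
EOF
}

# Usage (requires root; the file name is an assumption):
#   render_tcp_sack_sysctl_conf > /etc/sysctl.d/90-tcp-sack.conf
#   sysctl --system                              # reload all sysctl configuration files
#   sysctl net.ipv4.tcp_fack net.ipv4.tcp_sack   # confirm the current values
```

Keeping the workaround in its own file makes it easy to remove once the kernel has been upgraded.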

Try the second approach first. If the problem persists, consider upgrading the kernel.

References

redhat-536483

bug-1367091

CVE-2016-6828

kernel-commit

Summary

The above is the whole content of this article. I hope the content of this article has a certain reference and learning value for everyone's study or work. Thank you for your support.
