Linux Advanced Flow Control: tc Usage


When testing MHA, an important step is to test the network between the MHA Manager node and the Master node. For jitter, MHA provides the secondary_check parameter as a safeguard, but in a one-master/one-slave deployment that parameter does not help: the latest slave and the oldest slave are the same instance, so there is effectively nothing extra for the secondary check to connect through, and whether a failover is actually triggered becomes unpredictable. In our tests it sometimes failed over and sometimes did not. We downgraded from MHA 0.57 to 0.56 and, after careful testing, found the same problem there. In the end we dropped secondary_check and instead adjusted the ping timeout/retry settings in the MHA code (by default the master is declared dead after 4 failed attempts).
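
For reference, the relevant knobs live in the [server default] section of the MHA app configuration; the lines below are only an illustrative sketch (the host names are placeholders), not our final settings:

[server default]
ping_interval=3
secondary_check_script=masterha_secondary_check -s remote_host1 -s remote_host2

In a one-master/one-slave topology we simply leave secondary_check_script out and rely on the adjusted ping retry count instead.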

The next problem follows: for more in-depth testing we need to simulate network jitter properly, and at that point the traditional approach of service network stop; sleep xxx; service network start becomes limiting. One obvious reason is that after the network service is restarted, the VIP is gone.
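
For completeness, that traditional simulation amounts to something like the following run on the Master (the 30-second window here is just an arbitrary example value):

# service network stop; sleep 30; service network start

which is exactly where the VIP problem mentioned above comes from.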

Even so, that approach can roughly simulate the MHA scenario and confirm that jitter within the configured time window does not cause a mistaken failover.

After this round of testing we had a much clearer idea of what needs adjusting and which details still need further study, and we picked up some practical experience along the way.

Still, the network testing did not feel thorough enough: real network jitter is rarely a network card disappearing outright, but rather timeouts, packet loss and similar symptoms.

If we can simulate those network problems realistically and use them to debug and test MHA, we can get very close to the real failure scenarios. That is where tc came into the picture.

In Linux traffic control, the sending rate can only be shaped at the network card where the bottleneck occurs. There are two kinds of control (the following draws on https://www.ibm.com/developerworks/cn/linux/1412_xiehy_tc/index.html):

Queue control (QoS): scheduling rules for the send queue at the bottleneck; common examples are SFQ and PRIO

Flow control, i.e. bandwidth control: shaping of the queue; common examples are TBF and HTB

There are two kinds of Linux flow control algorithms:

Classless algorithms, used for leaf queues without branches, e.g. SFQ

Classful algorithms, used for queues with multiple branches, e.g. PRIO, TBF, HTB

Two of these, SFQ and TBF, are worth a quick look.

SFQ (Stochastic Fairness Queueing) is a simple implementation of the fair-queueing family of algorithms. It is less accurate than some other methods, but it achieves a high degree of fairness with very little computation.

SFQ only takes effect on a network card where traffic is congested and a waiting queue forms; if the egress card has no waiting queue, SFQ does nothing.
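
Purely as an illustration (SFQ is not used in the MHA tests below), attaching SFQ to an interface is a one-liner; perturb 10 re-hashes the flows every 10 seconds:

# tc qdisc add dev eth2 root sfq perturb 10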

The token bucket filter (TBF) is a simple queueing discipline: packets are only passed at up to a configured rate, though short bursts above that rate may be allowed.
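
A standalone TBF example, also just for illustration and with arbitrary values: cap the interface at 0.5 Mbit/s with a 5 KB bucket and at most 70 ms of queueing delay:

# tc qdisc add dev eth2 root tbf rate 0.5mbit burst 5kb latency 70ms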

First, simulate a simple 100 ms network delay.

Use the following command, adjusting the device name to match the actual network card on your machine:

# tc qdisc add dev eth2 root netem delay 100ms

If you ping from the machine itself, the latency is still very low, on the order of 0.0x ms:

[root@oel642 ~] # ping 192.168.253.129

PING 192.168.253.129 (192.168.253.129) 56 (84) bytes of data.

64 bytes from 192.168.253.129: icmp_seq=1 ttl=64 time=0.011 ms

64 bytes from 192.168.253.129: icmp_seq=2 ttl=64 time=0.044 ms

64 bytes from 192.168.253.129: icmp_seq=3 ttl=64 time=0.051 ms

From another host, with the delay option in effect, ping consistently shows close to the configured delay:

[root@oel643 ~] # ping 192.168.253.129

PING 192.168.253.129 (192.168.253.129) 56 (84) bytes of data.

64 bytes from 192.168.253.129: icmp_seq=1 ttl=64 time=202 ms

64 bytes from 192.168.253.129: icmp_seq=2 ttl=64 time=101 ms

64 bytes from 192.168.253.129: icmp_seq=3 ttl=64 time=101 ms

64 bytes from 192.168.253.129: icmp_seq=4 ttl=64 time=101 ms

64 bytes from 192.168.253.129: icmp_seq=5 ttl=64 time=100 ms

To remove the tc rule, use:

# tc qdisc del dev eth2 root netem
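
To check which rule is currently attached (or to confirm that it has been removed), list the qdiscs with their statistics:

# tc -s qdisc show dev eth2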

The following form produces a range of delays: here a base delay of 100 milliseconds, fluctuating up and down by 10 milliseconds.

[root@oel642 ~] # tc qdisc add dev eth2 root netem delay 100ms 10ms

The results of ping are as follows:

64 bytes from 192.168.253.129: icmp_seq=278 ttl=64 time=98.3 ms

64 bytes from 192.168.253.129: icmp_seq=279 ttl=64 time=99.1 ms

64 bytes from 192.168.253.129: icmp_seq=280 ttl=64 time=93.4 ms

64 bytes from 192.168.253.129: icmp_seq=281 ttl=64 time=95.5 ms
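
The jitter can also follow a distribution instead of a flat range; for example, 100 ms +/- 20 ms drawn from a normal distribution (the distribution tables are installed with the iproute package):

# tc qdisc change dev eth2 root netem delay 100ms 20ms distribution normal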

Several other network conditions are worth considering, packet loss among them; in traffic-hijacking scenarios the packet loss rate is something to watch closely.

We can push it further with a packet loss rate of 10%, which is already quite severe.

[root@oel642 ~] # tc qdisc add dev eth2 root netem loss 10%

The ping results are as follows. As the summary shows, the measured loss stays around the configured 10%; here it is 8%.

64 bytes from 192.168.253.129: icmp_seq=421 ttl=64 time=0.486 ms

64 bytes from 192.168.253.129: icmp_seq=422 ttl=64 time=0.413 ms

64 bytes from 192.168.253.129: icmp_seq=423 ttl=64 time=0.616 ms

^C

--- 192.168.253.129 ping statistics ---

426 packets transmitted, 390 received, 8% packet loss, time 425724ms

rtt min/avg/max/mdev = 0.144/64.257/120.621/49.069 ms
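
Loss can also be made bursty by adding a correlation, so that once a packet is dropped the next one is more likely to be dropped too; for example, 10% loss with 25% correlation:

# tc qdisc change dev eth2 root netem loss 10% 25%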

Packet duplication can be simulated as well; for example, set the duplication rate to 50%.

# tc qdisc add dev eth2 root netem duplicate 50%

The results of using ping are as follows:

PING 192.168.253.128 (192.168.253.128) 56 (84) bytes of data.

64 bytes from 192.168.253.128: icmp_seq=1 ttl=64 time=0.402 ms

64 bytes from 192.168.253.128: icmp_seq=1 ttl=64 time=0.409 ms (DUP!)

64 bytes from 192.168.253.128: icmp_seq=2 ttl=64 time=0.788 ms

64 bytes from 192.168.253.128: icmp_seq=3 ttl=64 time=0.887 ms

64 bytes from 192.168.253.128: icmp_seq=4 ttl=64 time=0.721 ms

64 bytes from 192.168.253.128: icmp_seq=4 ttl=64 time=0.757 ms (DUP!)

64 bytes from 192.168.253.128: icmp_seq=5 ttl=64 time=1.33 ms
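
duplicate accepts an optional correlation in the same way, e.g. 50% duplication with 25% correlation:

# tc qdisc change dev eth2 root netem duplicate 50% 25%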

Corrupted packets can also be simulated, for example a 50% corruption rate:

# tc qdisc add dev eth2 root netem corrupt 50%

The results of ping are as follows:

64 bytes from 192.168.253.128: icmp_seq=51 ttl=64 time=0.468 ms

64 bytes from 192.168.253.128: icmp_seq=52 ttl=64 time=0.822 ms

Wrong data byte # 23 should be 0x17 but was 0x15

# 16 10 11 12 13 14 15 16 15 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f

# 48 30 31 32 33 34 35 36 37

64 bytes from 192.168.253.128: icmp_seq=53 ttl=64 time=1.71 ms

Wrong data byte # 53 should be 0x35 but was 0x37

# 16 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f

# 48 30 31 32 33 34 37 36 37

64 bytes from 192.168.253.128: icmp_seq=54 ttl=64 time=0.000 ms

64 bytes from 192.168.253.128: icmp_seq=56 ttl=64 time=0.000 ms

To simulate packets arriving out of order, we add some randomness: 25% of packets are sent immediately, the rest are delayed by 10 milliseconds, and the correlation coefficient is 50%.

[root@oel641 ~] # tc qdisc change dev eth2 root netem delay 10ms reorder 25% 50%

The results of ping are as follows:

64 bytes from 192.168.253.128: icmp_seq=200 ttl=64 time=1.24 ms

64 bytes from 192.168.253.128: icmp_seq=201 ttl=64 time=0.587 ms

64 bytes from 192.168.253.128: icmp_seq=202 ttl=64 time=1.01 ms

64 bytes from 192.168.253.128: icmp_seq=203 ttl=64 time=0.790 ms

64 bytes from 192.168.253.128: icmp_seq=204 ttl=64 time=0.998 ms

64 bytes from 192.168.253.128: icmp_seq=205 ttl=64 time=0.285 ms

64 bytes from 192.168.253.128: icmp_seq=206 ttl=64 time=0.882 ms
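
These netem options can also be combined in a single rule (delete any existing root qdisc first); for example, delay, jitter, loss and corruption together, with arbitrary values:

# tc qdisc add dev eth2 root netem delay 100ms 10ms loss 1% corrupt 0.1%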

For more complex scenarios we can add bandwidth restrictions as well, for example capping the link at 256 kbit/s with a maximum queueing delay of 50 ms.

[root@oel641 ~] # tc qdisc add dev eth2 root handle 1:0 netem delay 100ms

[root@oel641 ~] # tc qdisc add dev eth2 parent 1:1 handle 10: tbf rate 256kbit burst 10000 latency 50ms

rate 256kbit is the bandwidth cap, burst 10000 is a bucket of roughly 10 KB, and latency 50ms is the maximum time a packet may wait in the queue.

Without flow control, the transfer runs at full speed, around 90 MB/s here:

[root@oel642]# scp 192.168.253.128:~/Percona-Server-5.6.14-rel62.0-483.Linux.x86_64.tar.gz .

Percona-Server-5.6.14-rel62.0-483.Linux.x86_64.tar.gz 100% 93MB 92.9MB/s 00:01

With flow control in place, the transfer stays within the configured limit:

[root@oel642]# scp 192.168.253.128:~/Percona-Server-5.6.14-rel62.0-483.Linux.x86_64.tar.gz .

Percona-Server-5.6.14-rel62.0-483.Linux.x86_64.tar.gz 0% 208KB 16.8KB/s 1:34:05 ETA
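
When the test is finished, deleting the root qdisc removes the whole tree, i.e. both the netem delay and the tbf child, in one go:

# tc qdisc del dev eth2 root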

Of course, these scenarios should only be simulated in a test environment; otherwise the damage from an unexpected problem can easily outweigh the benefit.
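
For the MHA jitter tests themselves, a rough sketch like the one-liner below keeps the impairment bounded to a fixed window so a rule cannot be left behind by accident; the interface, the 500 ms delay and the 60-second window are all example values to be matched against the MHA ping/timeout settings being verified:

# tc qdisc add dev eth2 root netem delay 500ms 100ms; sleep 60; tc qdisc del dev eth2 root netem

If masterha_manager triggers a failover inside that window even though the network recovers, the timeout and retry settings discussed at the beginning need another look.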
