Why network services must have load balancing and high availability

This article explains why network services must have load balancing and high availability, and I hope it proves useful in practical work. Load balancing touches on many things; the theory is not deep and there is plenty of material online, so today I will answer the question from accumulated industry experience.

In many enterprises, key network services not only handle a large volume of traffic but also cannot be taken offline at will, so they must provide both load balancing and high availability. Because of the high throughput, many people suggest simply installing more network cards in the business servers to spread the load. But more network cards mean more IP addresses, which wastes IP resources; worse, if one card goes offline while a client is connected, the client has to reconnect to another card's IP to continue the session, which is a headache. Is there a way to get the best of both worlds, so that a server can spread load across multiple network cards while presenting a single IP to the outside? Yes. Today I will share how to bind multiple NICs to the same IP, which achieves both load balancing and high availability for network services.

I. Environment requirements

Switch equipment: two switches that support dynamic link aggregation, or one ordinary switch

Network cards: two NICs

Operating system: CentOS 6.8

Service requirement: the NetworkManager service must be stopped

II. Bonding technology

Bonding binds multiple NICs to the same IP address to provide services, achieving high availability or load balancing. Of course, you cannot simply assign the same IP address to two network cards directly. With bonding technology, the two network cards are given the same MAC address, so they can keep providing service on the same IP without interruption.
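
Bonding is implemented by a kernel driver. As a quick sanity check before configuring anything (a minimal sketch; once an ifcfg-bond0 file exists, CentOS 6 normally loads the module for you), you can load and inspect the driver like this:

modprobe bonding
lsmod | grep bonding
# list the driver version and the options it accepts (mode, miimon, ...)
modinfo bonding | grep -E '^(version|parm)'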

III. Bonding working modes (7 modes)

Mode 0: balance-rr (round-robin policy)

Features: packets are transmitted sequentially in order (the first packet goes out eth0, the next goes out eth3, and so on until the last transmission is completed). This mode provides load balancing and high availability (fault tolerance). However, if the packets of one connection or session are sent out of different interfaces and travel over different links, they are likely to arrive out of order at the client; out-of-order packets have to be requested and sent again, so network throughput drops.

Mode 1: active-backup (active-backup policy)

Features: when the active slave fails, another standby slave is activated; only one MAC address is visible externally, which avoids confusing the switch. This mode provides only high availability (fault tolerance). Its advantage is highly available network connectivity, but resource utilization is low: only one interface works at a time, so with N network interfaces the utilization is 1/N.

Mode 2: balance-xor (XOR policy)

Features: packets are transmitted according to the specified transmit hash policy. The default policy is (source MAC address XOR destination MAC address) modulo the number of slaves. Other transmit policies can be specified through the xmit_hash_policy option. This mode provides load balancing and fault tolerance.
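
To make the default hash concrete, here is a small worked example in bash (a sketch using made-up addresses; for simplicity only the last octet of each MAC is used):

SRC=0x26     # last octet of the local MAC 00:0c:29:c8:72:26
DST=0x1b     # last octet of a peer's MAC (hypothetical)
SLAVES=2     # number of slaves in the bond
echo $(( (SRC ^ DST) % SLAVES ))   # prints 1, so this peer's traffic leaves via the second slave

Because the hash depends only on the address pair, all traffic between the same two hosts always uses the same slave; this mode therefore balances across many peers rather than within a single connection.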

Mode 3: broadcast (broadcast policy)

Features: every packet is transmitted on every slave interface. This mode provides fault tolerance, but it multiplies the traffic that the network devices have to carry.

Mode 4: 802.3ad (IEEE 802.3ad dynamic link aggregation)

Features: creates an aggregation group in which all members share the same speed and duplex settings; multiple slaves work within the same active aggregator according to the 802.3ad specification.

The slave used for outgoing traffic is chosen by the transmit hash policy, which can be changed from the default XOR policy through the xmit_hash_policy option. Note that not all transmit policies are 802.3ad compliant, particularly with respect to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard; different implementations tolerate this differently.

Necessary conditions:

Condition 1: ethtool supports reading the speed and duplex setting of each slave

Condition 2: the switch supports IEEE 802.3ad dynamic link aggregation

Condition 3: most switches require specific configuration to enable 802.3ad mode

Mode 5: balance-tlb (adaptive transmit load balancing)

Features: channel bonding that does not require any special switch support. Outgoing traffic is distributed across the slaves according to the current load (computed relative to each slave's speed). If the slave that is currently receiving traffic fails, another slave takes over the MAC address of the failed slave. This mode provides load balancing.

Necessary condition: ethtool must support reading the speed of each slave

Mode 6: balance-alb (adaptive load balancing)

Features: this mode includes balance-tlb, plus receive load balancing (rlb) for IPv4 traffic, and it does not require any switch support. Receive load balancing is implemented through ARP negotiation: the bonding driver intercepts the ARP replies sent by the local machine and rewrites the source hardware address to the unique hardware address of one of the slaves in the bond, so that different peers use different hardware addresses to communicate. This mode provides load balancing and high availability (fault tolerance).

Traffic received from connections initiated by the server is also balanced. When the local machine sends an ARP request, the bonding driver copies and saves the peer's IP information from the ARP packet. When the ARP reply arrives from the peer, the driver extracts its hardware address and sends an ARP reply assigning that peer to one of the slaves in the bond. A problem with using ARP negotiation for load balancing is that every broadcast ARP request uses the bond's own hardware address, so once peers learn that address all received traffic collapses onto the current slave. This is handled by sending updates (ARP replies) to all peers containing their individually assigned hardware addresses, which redistributes the traffic. Received traffic is also redistributed when a new slave is added to the bond or when an inactive slave is reactivated. The receive load is distributed sequentially (round robin) among the highest-speed slaves in the bond.

When a link is reconnected, or a new slave joins the bond, received traffic is redistributed across all currently active slaves by sending each client an ARP reply with its assigned MAC address. The updelay parameter must be set to a value greater than or equal to the switch's forwarding delay, so that the ARP replies sent to the peers are not blocked by the switch.

Necessary conditions:

Condition 1: ethtool must support reading the speed of each slave

Condition 2: the underlying driver supports setting the hardware address of a device, so that there is always one slave (curr_active_slave) using the bond's hardware address while every slave in the bond keeps a unique hardware address. If curr_active_slave fails, its hardware address is taken over by the newly selected curr_active_slave.

In actual production environments, mode 0, mode 1 and mode 6 are the most widely used; the other modes rarely come up.
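
Before the example, a note on where the mode is set: it can be configured per bond through BONDING_OPTS in the ifcfg file (the method used below), or globally through the bonding module options. A sketch of the older module-option approach, assuming the bond is named bond0:

# /etc/modprobe.d/bonding.conf  (global alternative to per-interface BONDING_OPTS)
alias bond0 bonding
options bonding mode=1 miimon=100

On CentOS 6 the BONDING_OPTS method shown next is generally preferred, because each bond can then carry its own options.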

IV. Creating an instance

Let's take the creation of an active-backup (mode 1) bond as an example to demonstrate the process. For the other modes, you only need to change mode={0-6} and restart the network service.

Step 1: create the bond configuration file for the dual network cards

[root@Centos6 network-scripts] # vim ifcfg-bond0

[root@Centos6 network-scripts] # cat ifcfg-bond0

DEVICE=bond0

BOOTPROTO=none

BONDING_OPTS="miimon=100 mode=1"

IPADDR=10.1.253.253

PREFIX=16

[root@Centos6 network-scripts] #
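
For reference (not part of the file listing above), the two values passed to the driver through BONDING_OPTS mean the following:

# miimon=100  -> check each slave's link state via MII every 100 ms,
#                so a failed card is detected within roughly a tenth of a second
# mode=1      -> active-backup: one slave carries all traffic, the other waits as a hot standby
BONDING_OPTS="miimon=100 mode=1"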

Step 2: modify the configuration files of the two network cards

[root@Centos6 network-scripts] # vim ifcfg-eth0

[root@Centos6 network-scripts] # cat ifcfg-eth0

DEVICE=eth0

BOOTPROTO=none

MASTER=bond0

SLAVE=yes

[root@Centos6 network-scripts] #

[root@Centos6 network-scripts] # vim ifcfg-eth3

[root@Centos6 network-scripts] # cat ifcfg-eth3

DEVICE=eth3

BOOTPROTO=none

MASTER=bond0

SLAVE=yes

[root@Centos6 network-scripts] #

The basic file configuration is now complete, and we can move on to the next step.

Step 3: restart the network service (make sure the NetworkManager service is stopped)

[root@Centos6 network-scripts] # service network restart

Shutting down interface bond0: [OK]

Shutting down loopback interface: [OK]

Bringing up loopback interface: [OK]

Bringing up interface bond0: Determining if ip address 10.1.253.253 is already in use for device bond0...

[OK]

[root@Centos6 network-scripts] # ifconfig

bond0     Link encap:Ethernet  HWaddr 00:0C:29:C8:72:26
          inet addr:10.1.253.253  Bcast:10.1.255.255  Mask:255.255.0.0
          inet6 addr: fe80::20c:29ff:fec8:7226/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:13897 errors:0 dropped:0 overruns:0 frame:0
          TX packets:869 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1295315 (1.2 MiB)  TX bytes:84869 (82.8 KiB)

eth0      Link encap:Ethernet  HWaddr 00:0C:29:C8:72:26
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:12376 errors:0 dropped:0 overruns:0 frame:0
          TX packets:789 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1145956 (1.0 MiB)  TX bytes:77363 (75.5 KiB)

eth3      Link encap:Ethernet  HWaddr 00:0C:29:C8:72:26
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:1528 errors:0 dropped:0 overruns:0 frame:0
          TX packets:80 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:149913 (146.3 KiB)  TX bytes:7506 (7.3 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:344 errors:0 dropped:0 overruns:0 frame:0
          TX packets:344 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:24240 (23.6 KiB)  TX bytes:24240 (23.6 KiB)

[root@Centos6 network-scripts] #

Use cat /proc/net/bonding/bond0 to view the current working status of the bond and its network cards.

[root@Centos6 ~] # cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)

Primary Slave: None

Currently Active Slave: eth0

MII Status: up

MII Polling Interval (ms): 100

Up Delay (ms): 0

Down Delay (ms): 0

Slave Interface: eth0

MII Status: up

Speed: 1000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:0c:29:c8:72:26

Slave queue ID: 0

Slave Interface: eth3

MII Status: up

Speed: 1000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:0c:29:c8:72:30

Slave queue ID: 0

[root@Centos6 ~] #
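
Besides /proc/net/bonding/bond0, the bonding driver exposes the same information under sysfs, which is convenient in scripts; a few checks you can run:

cat /sys/class/net/bond0/bonding/mode            # e.g. "active-backup 1"
cat /sys/class/net/bond0/bonding/slaves          # e.g. "eth0 eth3"
cat /sys/class/net/bond0/bonding/active_slave    # e.g. "eth0"
cat /sys/class/net/bond0/bonding/miimon          # e.g. "100"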

Step 4: from CentOS 7, ping the CentOS 6 machine that has just been configured with the bonded dual network cards, then disable either network card at random and observe whether any packets are lost during the ping.

Now we start pinging CentOS 6 from CentOS 7; everything is fine. Next we disable CentOS 6's network card 2 (eth3) and watch whether the ping loses any packets.
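
The exact commands are not shown in the capture, so here is one way to run the test (a sketch; disconnecting the virtual NIC on the hypervisor side also works and is closer to a real cable pull):

# On the CentOS 7 client: keep a continuous ping running against the bond IP
ping 10.1.253.253

# On the CentOS 6 server: take the second slave down to simulate a failure ...
ip link set eth3 down
# ... and later bring it back up
ip link set eth3 up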

[root@Centos6 network-scripts] # cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)

Primary Slave: None

Currently Active Slave: eth0

MII Status: up

MII Polling Interval (ms): 100

Up Delay (ms): 0

Down Delay (ms): 0

Slave Interface: eth0

MII Status: up

Speed: 1000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:0c:29:c8:72:26

Slave queue ID: 0

Slave Interface: eth3

MII Status: down

Speed: Unknown

Duplex: Unknown

Link Failure Count: 0

Permanent HW addr: 00:0c:29:c8:72:30

Slave queue ID: 0

[root@Centos6 network-scripts] #

With network card 2 disabled, the network remains connected; even on a machine with poor performance you lose at most two or three packets.

Enable network card 2 again, disable network card 1, and keep observing the ping.

[root@Centos6 network-scripts] # cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)

Primary Slave: None

Currently Active Slave: eth3

MII Status: up

MII Polling Interval (ms): 100

Up Delay (ms): 0

Down Delay (ms): 0

Slave Interface: eth0

MII Status: down

Speed: Unknown

Duplex: Unknown

Link Failure Count: 1

Permanent HW addr: 00:0c:29:c8:72:26

Slave queue ID: 0

Slave Interface: eth3

MII Status: up

Speed: 1000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:0c:29:c8:72:30

Slave queue ID: 0

[root@Centos6 network-scripts] #

With NIC 2 enabled and NIC 1 disabled, the network is still connected normally. Even if anything is lost, it is only two or three packets, which has little impact on the external service. In my test, disabling either network card caused no packet loss at all.

Therefore, in mode 1 the service continues normally when either network card fails; in other words it provides high availability, but it offers no load balancing and its resource utilization is low.
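
To watch the failover happen in real time during such a test, you can keep an eye on the active slave from another terminal, for example:

watch -n 1 'grep "Currently Active Slave" /proc/net/bonding/bond0'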

V. Comparison of bonding modes

1. Balanced round-robin policy (mode 0): without going through switches that support link aggregation, only the failure of network card 2 can be tolerated. If the two network cards are connected to two switches that support dynamic link aggregation, either card can fail at any time. So round-robin mode provides load balancing and high availability, but it requires the two network cards to be connected to two switches that support dynamic link aggregation.

[root@Centos6 network-scripts] # cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)

MII Status: up

MII Polling Interval (ms): 100

Up Delay (ms): 0

Down Delay (ms): 0

Slave Interface: eth0

MII Status: up

Speed: 1000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:0c:29:c8:72:26

Slave queue ID: 0

Slave Interface: eth3

MII Status: down

Speed: Unknown

Duplex: Unknown

Link Failure Count: 1

Permanent HW addr: 00:0c:29:c8:72:30

Slave queue ID: 0

[root@Centos6 network-scripts] #

2. Adaptive load balancing policy (mode 6): this mode is similar to the balanced round-robin policy (mode 0), except that mode 6 does not need a switch that supports dynamic link aggregation to achieve balanced distribution, the two network cards do not need to be bound to the same MAC (each can use its own), and they do not need to be split across two switches; both cards can be connected to the same ordinary switch.
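
To try mode 6 with the bond built in the earlier example, only the BONDING_OPTS line needs to change before restarting the network (a sketch using the same file names as before):

# /etc/sysconfig/network-scripts/ifcfg-bond0  (only this line changes)
BONDING_OPTS="miimon=100 mode=6"

# then apply it
service network restart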

[root@Centos6 network-scripts] # cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: adaptive load balancing

Primary Slave: None

Currently Active Slave: eth3

MII Status: up

MII Polling Interval (ms): 100

Up Delay (ms): 0

Down Delay (ms): 0

Slave Interface: eth0

MII Status: down

Speed: Unknown

Duplex: Unknown

Link Failure Count: 1

Permanent HW addr: 00:0c:29:c8:72:26

Slave queue ID: 0

Slave Interface: eth3

MII Status: up

Speed: 1000 Mbps

Duplex: full

Link Failure Count: 0

Permanent HW addr: 00:0c:29:c8:72:30

Slave queue ID: 0

[root@Centos6 network-scripts] #

A reminder: the balanced round-robin policy (mode 0) can only be fully verified when the two network cards are connected to two switches that support dynamic link aggregation; the virtual machine environment used here does not satisfy that, so it could not be verified. If you want high availability and redundancy in a real production environment, mode 0 is recommended: with the two network cards connected to different switches, you still have a spare card if one card fails and a spare switch if one switch fails, so it offers the strongest guarantee, although the environment costs more to build. With mode 6 the two network cards connect to the same switch, so if that switch fails there is no load balancing or high availability to speak of. For an intranet service, mode 6 is worth considering.

That covers why network services must have load balancing and high availability, and how dual-NIC bonding provides both. I hope it helps in practical application; if anything remains unclear, further industry material or an experienced engineer can fill in the details.
