Hyper-V Converged Cluster Network


I have always wanted to write this blog and discuss this technology with you: Hyper-V Converged Network, which I translated into Chinese as "converged network". Lao Wang later realized that the technology does not make much sense without clustering, so he added the word "cluster" to the title. Few people in China have written about this topic, although many people abroad have been trying this technology since 2012, and the converged network model in Windows Server 2016 now supports RDMA, so I decided to write this post in the hope of introducing the technology to more Microsoft IT pros in China. Discussion and corrections are welcome.

Before introducing the converged network, I would like to discuss cluster network planning on Windows. A few years ago Lao Wang had the chance to take part in several implementation projects and found that engineers deploying Microsoft clusters in China do not seem to care much about cluster network planning. In some small environments they use a single network card to carry all of the cluster's traffic; the more conscientious use two. Large enterprises usually separate management, VM, heartbeat and storage networks, which seems to be the standard cluster network configuration for large and medium-sized enterprises here, and the wealthier ones add Teaming, MPIO, multichannel and similar technologies to each type of network.

According to Microsoft's best practices, the cluster network should be divided into five kinds: management, VM, migration, heartbeat and storage, with migration traffic isolated on its own network. I can understand the reasoning. Microsoft is presumably thinking of large private-cloud or virtualization deployments hosting many virtual machines, where large-scale migrations may happen frequently while the VMs are also serving heavy concurrent client traffic; in that case a single network may struggle and performance degrades. In Lao Wang's observation, however, domestic Hyper-V deployments rarely reach that scale; a 20 Gb LACP team can usually carry VM access plus migration traffic just fine, so a cluster architecture with a dedicated migration network is rarely seen in China.

So an architecture of management, VM + migration, heartbeat and storage is probably more common for domestic users. Some companies simply put management, VM and migration together on one LACP team, plus heartbeat and storage, for a total of three cluster networks, which is also a common domestic design.

What Lao Wang wants to discuss with you is the "heartbeat" cluster network in these designs. In China this network is usually called the heartbeat network, as if it were used only for heartbeat detection, but that is not really the case. Strictly speaking it should be called the cluster communication network; some people abroad also call it the CSV network.

The cluster communication network actually carries traffic for three functions, and some of them have changed since 2012.

1. Health detection network. As everyone knows, all nodes in a cluster need to access the same shared storage; when you create a cluster role or feature, the data it generates lives on that shared storage, and the nodes continuously perform health detection against each other. The purpose of this detection is simply to know whether the other nodes are still alive. Take a two-node cluster as an example: every second the nodes perform a handshake with each other over port 3343, each packet being 134 bytes, so each side confirms that the other is online and vice versa. When the missed detections reach a certain threshold the node is considered down. By default, nodes in the same subnet are checked every second, and a node that fails to respond five times is treated as offline; the remaining nodes then consult their own copy of the cluster registry to see which roles that node was hosting and fail them over. Cross-subnet detection also runs every second, with five failures marking the node as offline. If the Hyper-V role is installed, the defaults become ten failures in the local subnet and twenty failures across subnets. Windows Server 2016 adds cross-site detection: if sites are configured in the cluster, the cross-site policy takes precedence over the cross-subnet policy whenever nodes are in different sites, whether or not they are also in different subnets.
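These delays and thresholds are exposed as cluster common properties, so you can inspect or adjust them from PowerShell with the FailoverClusters module. A minimal sketch; the value set here is only an example, not a recommendation:

# view the current heartbeat settings (delays in milliseconds, thresholds in missed heartbeats)
Get-Cluster | Format-List *SubnetDelay*, *SubnetThreshold*

# example: relax the same-subnet threshold to 10 missed heartbeats
(Get-Cluster).SameSubnetThreshold = 10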

As mentioned above, the health detection network is a very important part of the cluster, because its results decide whether a node is considered alive. The demand it places on bandwidth is low, but the demand on quality is extremely high: it must not be disturbed by other traffic and must not drop packets, otherwise lost packets or interference will make the health detection inaccurate and nodes that are actually alive will be treated as offline.

Therefore Lao Wang suggests dividing a cluster into at least two kinds of networks: one carrying management, VM, migration and storage traffic, and one dedicated to cluster communication. Note that this health detection is a full-mesh test inside the cluster: every node sends probes to every other node, so each node knows the status of all the others.

2. Intra-cluster database synchronization traffic

As mentioned above, when a node goes offline the cluster triggers failover: the other nodes check their own cluster database to see which roles the offline node was hosting, bring those roles online, mount the cluster groups they depend on, and clients reconnect. That is the simplest description of cluster failover, and you may have noticed the cluster database in it. This database has nothing to do with a SQL Server or Exchange database; it is the database the cluster uses to store the state of every node and of every role each node hosts. It lives mainly in the registry of each cluster node, and if there is a disk witness a copy is kept there as well. For example: node 1 is currently online and runs the DHCP and file server roles, both online; or node 1 runs the Hyper-V role with ten virtual machines, five online and five offline.

The main purpose of the cluster database is to record this kind of data, cluster configuration, cluster membership, and state changes such as add, create, start, delete, stop and offline, and to synchronize it to the other nodes. The bandwidth used is not large, but the data must be accurate: the cluster database on every node must be identical. That is the most important requirement, so every node synchronizes the cluster database in real time to keep its contents consistent, which is what allows another node to take over accurately when one node goes offline. Whenever we add or delete roles in the cluster or add new cluster resources, a database synchronization is triggered, and that synchronization must reach every node promptly and accurately. Like health detection, this kind of traffic does not need much bandwidth, but the quality of the network must be guaranteed.
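If you want to see the kind of state the cluster database records, the FailoverClusters cmdlets expose it directly on any cluster node; the same information also sits under the HKLM\Cluster registry hive mentioned above. A quick sketch:

# list the cluster groups (roles), their owner nodes and current states
Get-ClusterGroup

# list the individual cluster resources and their states
Get-ClusterResource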

3. CSV metadata updates

CSV is a great thing. Back in 2008 there was no CSV, so to migrate a virtual machine in a Hyper-V cluster you had to move the whole cluster disk it was bound to along with it. CSV makes this much more convenient. Unlike a traditional cluster disk, a CSV volume is not mounted on only one node at a time; with a traditional disk, a failover means taking the disk offline, dismounting NTFS, then bringing it online and mounting it on the new node, which takes time and is inconvenient. With CSV, a volume is accessible on every node at once, coordinated by the CSV orchestrator node, so when one node goes down another node can take over directly, with no need to remount the cluster disk, because the CSV volume is mounted everywhere all along. A metadata update happens when we create, delete or modify content on a CSV volume from one node: CSV synchronizes that operation to the other nodes in the background, and it is only the metadata that is synchronized, not the file content itself. Like the other traffic types, this requires high quality rather than high bandwidth; once there is delay, updating a file on a CSV volume can become very slow while it waits for synchronization. Note that CSV in 2012 depends heavily on SMB, so if you use CSV do not disable the SMB-related services.
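Because CSV rides on SMB between nodes, it is useful to check whether a CSV volume is doing direct I/O or has dropped into redirected mode; on 2012 R2 and later the FailoverClusters module can show this. A minimal sketch:

# show, per node, whether each CSV is in Direct or FileSystemRedirected I/O mode
Get-ClusterSharedVolumeState

# list the CSV volumes themselves and their current owner nodes
Get-ClusterSharedVolume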

So the well-known "heartbeat network" actually carries all three kinds of traffic described above. Looking closely, the heartbeat card we tend to ignore turns out to do a lot of important work. Through this introduction I believe you now have a basic understanding: when planning a cluster network, this cluster communication network must be separated out so that it gets the highest quality and is protected from interference by other traffic, especially migration traffic. In other words, there should be at least two kinds of cluster networks.

In general, this cluster communication network needs very little bandwidth, even 10 Mbps or 100 Mbps would do, but there are special scenarios. SQL Server and Exchange clusters, for example, often need to track application changes, which causes frequent cluster database updates, and in some Hyper-V scenarios frequent CSV creation and migration operations lead to frequent CSV metadata traffic. Lao Wang suggests that a single 1 GbE card, or two 1 GbE cards with SMB Multichannel or teaming, is enough; 10 GbE would be a waste, and that bandwidth is better given to virtual machine migration and storage.

In the 2012 R2 era some foreign experts said that CSV metadata traffic could be separated from the rest of the cluster communication traffic, but Lao Wang has not yet seen a successful case of this, so for now let us treat cluster communication traffic as consisting of health detection, cluster database updates and CSV metadata updates.

OK, with that foundation laid, let's get to today's highlight: the converged network. Rather than explaining too much up front, let's walk into a cluster environment and see, step by step, what it really looks like.

Today's converged network environment is structured as follows

12DC: 80.0.0.1, acting as the domain controller

ISCSI: 50.0.0.99, 60.0.0.100, acting as the iSCSI server

HV01

Four network cards: two of them form a teaming, from which three virtual adapters are carved out with Converged Network technology.

Mgmt 80.0.0.2

Clus 90.0.0.2

VM 100.0.0.2

The other two cards are used for iSCSI MPIO: 50.0.0.1, 60.0.0.2

HV02

Four network cards: two of them form a teaming, from which three virtual adapters are carved out with Converged Network technology.

Mgmt 80.0.0.3

Clus 90.0.0.3

VM 100.0.0.3

The other two cards are used for iSCSI MPIO: 50.0.0.3, 60.0.0.4

This article does not cover installing the servers or configuring iSCSI and MPIO; let's go straight to the Converged Network configuration.

First, in the current scenario you can see five network connections on each host: the four physical cards described above plus a teaming interface, switch independent with address hash, named vTeam on all of our nodes.

In a real environment I recommend configuring the teaming in LACP mode with two 10 GbE cards, so both cards carry traffic in and out.

Here I am using two 1 GbE cards; for production I would recommend at least four 1 GbE or two 10 GbE cards.
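The vTeam team itself is assumed to exist already; if you are building it from scratch, a minimal sketch with the in-box LBFO cmdlets might look like this (NIC01 and NIC02 are placeholder names for your physical adapters; TransportPorts is the load-balancing value that corresponds to Address Hash in the GUI):

# create the switch-independent, address-hash team used in this walkthrough
New-NetLbfoTeam -Name vTeam -TeamMembers NIC01,NIC02 -TeamingMode SwitchIndependent -LoadBalancingAlgorithm TransportPorts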

Up to this point the setup is no different from a traditional cluster. Now comes the interesting part: what exactly is this converged network we keep talking about?

Let Lao Wang briefly introduce it. Simply put, a Converged Network means creating multiple virtual network adapters in the Hyper-V parent partition on top of a single network card or teaming. This technology first appeared in 2012. We used to assume that one network card could present only one virtual network adapter, but since 2012 that is no longer true: with Converged Network technology we can build several virtual network adapters on top of one network card or teaming.

Each virtual network adapter created this way can be given its own IP, QoS and VLAN settings. Isn't that cool? One physical network card backs one virtual switch, yet that switch can expose several virtual network adapters, each with its own IP, QoS and VLAN configuration. This used to be the territory of hardware vendors, and in 2012 Microsoft implemented it on its own virtualization platform. Without further ado, let's start the implementation.

# Create a virtual switch on top of the vTeam team, with the minimum bandwidth mode set to Weight. Since we will create a dedicated virtual network adapter for management, we set AllowManagementOS to False; if it were True, a management virtual network adapter would be generated automatically.

New-VMSwitch -Name vSwitch -NetAdapterName vTeam -MinimumBandwidthMode Weight -AllowManagementOS $False

# Set the default minimum bandwidth weight of the virtual switch to 20. Traffic from any virtual network adapter connected to this switch that has no minimum bandwidth weight of its own falls into this default bucket; in other words, flows that do not match another minimum bandwidth weight share a guaranteed 20% of the total bandwidth weight under contention.

Set-VMSwitch "vSwitch"-DefaultFlowMinimumBandwidthWeight 20

# Create three virtual network adapters, Mgmt, Clus and VM, on the same virtual switch

Add-VMNetworkAdapter -ManagementOS -Name "Mgmt" -SwitchName "vSwitch"

Add-VMNetworkAdapter -ManagementOS -Name "Clus" -SwitchName "vSwitch"

Add-VMNetworkAdapter -ManagementOS -Name "VM" -SwitchName "vSwitch"

# Set the minimum share of the total bandwidth each virtual network adapter is guaranteed. VM (VM + live migration) carries the most traffic, so it gets the largest weight; management traffic gets the next largest, and the cluster network gets the least. Minimum bandwidth weights are relative values from 0 to 100, and it is recommended that the weights of the virtual network adapters plus the default flow weight add up to no more than 100 (if they exceed 100 the back end rebalances the calculation). A common approach is to give the virtual network adapters a combined weight of around 70-90 and leave the remainder of 100 to the default flow. For example, with the weights below (25 + 10 + 45) plus the default flow's 20, the total is exactly 100, so under contention on a 2 Gbps team the VM adapter is guaranteed roughly 0.9 Gbps.

Set-VMNetworkAdapter -ManagementOS -Name "Mgmt" -MinimumBandwidthWeight 25

Set-VMNetworkAdapter -ManagementOS -Name "Clus" -MinimumBandwidthWeight 10

Set-VMNetworkAdapter -ManagementOS -Name "VM" -MinimumBandwidthWeight 45
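To double-check how the weights landed, you can query the switch and the host virtual network adapters afterwards; using wildcards keeps the check independent of exact property names:

# show the bandwidth-related settings of the virtual switch, including the default flow weight
Get-VMSwitch -Name vSwitch | Format-List Name, *Bandwidth*

# show the bandwidth settings of the host virtual network adapters
Get-VMNetworkAdapter -ManagementOS | Format-List Name, *Bandwidth*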

# While setting the weights, each virtual network adapter can also be placed in its own VLAN for isolation, using the -Access and -VlanId parameters of Set-VMNetworkAdapterVlan.
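For example, assuming hypothetically that VLANs 80, 90 and 100 have been provisioned on the physical switch for the management, cluster and VM networks, the assignment could look like this:

Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Mgmt" -Access -VlanId 80

Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Clus" -Access -VlanId 90

Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "VM" -Access -VlanId 100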

# Assign IP addresses to each virtual network adapter according to the plan above. Note that the DNS server assigned to the Mgmt adapter is 80.0.0.1, and NetBIOS registration is disabled on the Clus adapter.

New-NetIPAddress -InterfaceAlias "vEthernet (Mgmt)" -IPAddress 80.0.0.2 -PrefixLength 24

New-NetIPAddress -InterfaceAlias "vEthernet (Clus)" -IPAddress 90.0.0.2 -PrefixLength 24

New-NetIPAddress -InterfaceAlias "vEthernet (VM)" -IPAddress 100.0.0.2 -PrefixLength 24
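The DNS and NetBIOS steps mentioned in the comment above are not shown in the original commands; one possible sketch is below. The NetBIOS change writes the per-interface NetBT registry value, where 2 means disable:

# point the Mgmt adapter at the domain controller for DNS
Set-DnsClientServerAddress -InterfaceAlias "vEthernet (Mgmt)" -ServerAddresses 80.0.0.1

# disable NetBIOS over TCP/IP on the Clus adapter (NetbiosOptions 2 = disable)
$guid = (Get-NetAdapter -Name "vEthernet (Clus)").InterfaceGuid
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Services\NetBT\Parameters\Interfaces\Tcpip_$guid" -Name NetbiosOptions -Value 2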

# Make sure the network adapters are in the following order, with Mgmt first, so it is the primary interface for outbound traffic and DNS resolution.
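On 2012 R2 and later the old binding-order dialog matters less than the interface metric, so a practical way to keep Mgmt as the preferred outbound interface is to give it the lowest metric; the values below are only illustrative:

# lower metric = higher priority; Mgmt wins outbound routing and name resolution
Set-NetIPInterface -InterfaceAlias "vEthernet (Mgmt)" -InterfaceMetric 10
Set-NetIPInterface -InterfaceAlias "vEthernet (VM)" -InterfaceMetric 20
Set-NetIPInterface -InterfaceAlias "vEthernet (Clus)" -InterfaceMetric 50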

HV02 is configured in the same way. Once it is done, like HV01 it shows the three virtual network adapters built on the two-card teaming, plus the two iSCSI network cards.

Does this feel a bit magical? At first Lao Wang guessed it would be the same as a VMware port group: virtual machines connect to a port group, each port group can use a different network card, and it can also provide failover, VLAN ID and QoS settings. Microsoft's version is almost the same, except that an individual virtual network adapter cannot do its own failover, but in practice it looks a little different. On VMware ESXi, once the port group is created the virtual machine connects directly to the port group, which makes the effect more intuitive and easier to understand.

With Microsoft's converged network, on a single machine the virtual network adapters hardly seem to do anything special, because virtual machines still connect to the virtual switch vSwitch, and which card the migration traffic actually uses is not something a standalone host can show you. All we can see is that we built a teaming from two network cards, created a virtual switch on it, and carved out several virtual network adapters, each of which can be given its own QoS, IP and VLAN. But how the traffic is actually split, and what the point of it all is, is not yet visible. Many articles on the Internet stop right here, which left Lao Wang confused: on a single machine this structure does not mean much by itself, it just looks nice, and it will not split traffic unless you also configure VLANs, hosts files and similar settings.

So it is only when we get to the cluster that things become clear. Lao Wang will not explain the steps of building the cluster here; I am sure everyone knows them. In the cluster we can see the five networks that have been added.

After adjustment, as shown in the figure below, all five networks are in use. The Clus network is used mainly for cluster communication: health detection, cluster database updates and CSV metadata updates. If the Clus network becomes unavailable, the cluster automatically reroutes cluster communication over the Mgmt or VM networks. The two iSCSI networks, ISCSI01 and ISCSI02, are MPIO networks and take part in neither cluster communication, nor management, nor client access traffic. The VM network is set as the network used for live migration. In fact Microsoft suggests selecting more than one network here, for example VM with Mgmt as a fallback so migration can still complete over Mgmt if the VM network fails; you can also use PowerShell to choose a second migration network based on metric and priority values, or add several networks with similar metric values so that live migration is load-balanced across them.
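The same adjustments can be made from PowerShell instead of Failover Cluster Manager. The sketch below assumes the cluster networks have been renamed Clus, VM, ISCSI01 and ISCSI02 to match the adapters, and uses the MigrationExcludeNetworks parameter that the live migration settings dialog manipulates; treat it as an illustration rather than the only way:

# role 1 = cluster communication only, 0 = not used by the cluster, 3 = cluster and client
(Get-ClusterNetwork -Name "Clus").Role = 1
(Get-ClusterNetwork -Name "ISCSI01").Role = 0
(Get-ClusterNetwork -Name "ISCSI02").Role = 0

# prefer the VM network for live migration by excluding every other cluster network
$exclude = (Get-ClusterNetwork | Where-Object { $_.Name -ne "VM" }).Id -join ";"
Get-ClusterResourceType -Name "Virtual Machine" | Set-ClusterParameter -Name MigrationExcludeNetworks -Value $exclude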

The article is drawing to a close, so think back to what we did: we built a teaming from two cards at the bottom, created a virtual switch on that teaming, created three virtual network adapters on the switch and gave each a QoS policy so it operates within its own bandwidth weight without interfering with the other networks, assigned each virtual network adapter an IP address, and finally joined them to the cluster networks. Together with the cluster network roles, this achieves real traffic separation, with every card in use.

Lao Wang believes this is a new way of thinking about cluster network planning. If a server has only two or four ports, or switch ports are limited and we are not allowed to take up six or eight cards, we can use this converged network design to save ports while still managing the virtual network adapters sensibly with QoS. At the same time, because the bottom layer is a teaming, we also get teaming's fault tolerance: the upper-layer virtual network adapters only know about the virtual switch, and the virtual switch only knows about the teaming, so as long as the teaming survives, the whole architecture keeps running. If a single network card in the teaming fails, the architecture does not collapse; you simply add a replacement card, and the upper layers barely notice.

In this converged network architecture, Lao Wang has not yet explained why the two iSCSI cards are kept separate instead of being virtualized out of the same teaming. In fact everything could be put into one teaming, but I wanted to show you the more ideal architecture. From foreign material Lao Wang learned that iSCSI is not supported on top of NIC teaming; if you want iSCSI MPIO you should use two separate network cards, or two iSCSI virtual network adapters each backed by its own virtual switch, so that MPIO can be used at the physical layer and the cluster's virtual machines can benefit as well.

In addition, NIC teaming in the 2012 era does not support RDMA. If the storage beneath your converged cluster is an SOFS architecture, teaming two cards also means losing RDMA; for an SMB-based design in the 2012 era, Lao Wang therefore suggested keeping two 10 GbE cards separate, without teaming, to take full advantage of SMB Multichannel and SMB RDMA.

Of course, Microsoft eventually listened to the community. In 2016 it introduced a new technology, Switch Embedded Teaming, which creates a virtual switch that supports RDMA, vRSS and similar technologies in a converged network.

Examples of the new converged network commands in 2016:

# create a Switch Embedded Teaming switch that supports RDMA and VRSS

New-VMSwitch -Name CNSwitch -AllowManagementOS $True -NetAdapterName NIC01,NIC02 -EnableEmbeddedTeaming $True
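The commands that follow assume host virtual network adapters named Storage1, Storage2 and Live-Migration already exist on CNSwitch; they are not created by the switch itself. A minimal sketch of that missing step:

# create the host vNICs referenced by the RDMA and team-mapping commands below
Add-VMNetworkAdapter -ManagementOS -Name "Storage1" -SwitchName "CNSwitch"
Add-VMNetworkAdapter -ManagementOS -Name "Storage2" -SwitchName "CNSwitch"
Add-VMNetworkAdapter -ManagementOS -Name "Live-Migration" -SwitchName "CNSwitch"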

# enable RDMA on the storage and live-migration virtual network adapters

Get-NetAdapterRDMA -Name *Storage* | Enable-NetAdapterRDMA

Get-NetAdapterRDMA -Name *Live-Migration* | Enable-NetAdapterRDMA

# set the association between vNIC and physical NIC

Set-VMNetworkAdapterTeamMapping -VMNetworkAdapterName Storage1 -ManagementOS -PhysicalNetAdapterName NIC01

Set-VMNetworkAdapterTeamMapping -VMNetworkAdapterName Storage2 -ManagementOS -PhysicalNetAdapterName NIC02

Extended reading

Discussion on the converged Network of ISCSI and 2012

https://community.spiceworks.com/topic/314930-iscsi-in-hyper-v-2012-10gb-converged-fabric

https://social.technet.microsoft.com/Forums/windowsserver/en-US/1c23a379-e7d6-4d47-8e21-0963ad931484/iscsi-in-converged-network-scenario?forum=winserverhyperv

http://www.aidanfinn.com/?p=13947

http://www.aidanfinn.com/?p=14509

Applicable scenarios and examples of converged network

http://www.davidmercer.co.uk/windows-2012-r2-hyper-v-converged-network-setup/

http://www.windows-infrastructure.de/hyper-v-converged-network/

https://www.starwindsoftware.com/blog/musings-on-windows-server-converged-networking-storage

About the use and in-depth explanation of DefaultFlowMinimumBandwidthAbsolute and MinimumBandwidthWeight parameters

https://blogs.technet.microsoft.com/meamcs/2012/05/06/converged-fabric-in-windows-server-2012-hyper-v/

https://social.technet.microsoft.com/Forums/windowsserver/en-US/7c265109-b703-4e50-aab0-b7508b32207f/defaultflowminimumbandwidthweight-setvmswitch-question?forum=winserverpowershell

http://www.aidanfinn.com/?p=13891

Converged network architecture through SCVMM management

https://charbelnemnom.com/2014/05/create-a-converged-network-fabric-in-vmm-2012-r2/

https://www.starwindsoftware.com/blog/how-to-deploy-switch-embedded-teaming-set-using-scvmm-2016

2016 Switch Embedded Teaming

http://www.aidanfinn.com/?p=18813

https://technet.microsoft.com/en-us/library/mt403349.aspx

http://www.tech-coffee.net/how-to-deploy-a-converged-network-with-windows-server-2016/

https://community.mellanox.com/docs/DOC-2904
