WSFC 2016 fault domain site awareness


It is my pleasure to introduce another cool feature of WSFC 2016: the fault domain.

Similar technologies exist in CloudStack, OpenStack, and other platforms.

Take CloudStack as an example. In CloudStack we can manually define an architecture model for the resource pool, from Region at the top, through Zone, Pod, and Cluster, down to Host at the bottom. By planning such a model, we describe the underlying cloud resource architecture inside the cloud management software, which both shields the technical details from users and lets the management software operate according to the model.

The uppermost and largest unit is the Region. For example, when we create virtual machines on Aliyun, Azure, or AWS, we only need to select a region; that selection corresponds to a Region in the background. Different Regions should mean different geographic areas, so if a user deploys two virtual machines in two different Regions, the probability of both failing at the same time is very low.

Next is the Zone. In CloudStack, a Zone mainly refers to a data center. A Zone, that is, a data center, may contain multiple Pods, i.e. racks; different racks may use different network infrastructure. A rack may contain multiple Clusters, a Cluster may contain multiple Hosts, and the same Zone may share a secondary storage.

After such an architecture is defined, users do not need to know much. They only need to know to deploy to the Region closest to their visitors, and that virtual machines placed in different Regions sit in different, far-apart locations and will not fail at the same time. That is enough.

Some cloud platforms, such as Azure, let users choose different availability sets. The corresponding concept here is the rack: placing virtual machines in different availability sets means they are placed in different Pods, which Microsoft also calls Racks. If you deploy virtual machines in an availability set, Azure can guarantee an SLA of 99.95%.

In a cloud management system, the resource administrator is mainly concerned with continuously ensuring the SLA of tenant resources by preparing a variety of high availability and disaster recovery solutions. Here we focus on the high availability scenario, where the cloud computing industry usually talks about failure domains.

What is a failure domain? For example, if one of my nodes breaks and the virtual machines on it can drift to other nodes, that node is a failure domain: a single node fails, the cluster system we deployed fails over, and the SLA is not affected.

By default, the failover scope of most cloud management platforms is the node level within a Cluster. Ideally, Lao Wang believes every defined level, Region, Zone, Pod, and Cluster, should be failure-aware. For example, if a node in a cluster fails, first migrate the virtual machine to other nodes in the same Cluster; if that is not possible, migrate to another Cluster in the same Pod; if that fails, migrate to a different Pod in the Zone, then to a different Zone; and even if the whole Zone is paralyzed, transfer the workload to another Region to resume operation. If this could be done it would be extremely powerful, truly achieving the goal of continuous availability in cloud computing.

In reality, however, according to Lao Wang's observation, the Region, Zone, Pod, and Cluster defined in our software are partly for viewing and only partly functional. If constructs such as Region, Zone, and Pod do not respond to failures in the cloud resource system, they will not deliver the effect described above. At most we can set policies for the Clusters or Hosts in the same Pod, have all Pods and Clusters in a Zone use the same shared secondary storage, or aggregate them in reports, and so on.

In practice, some cloud vendors do realize Region-level, Pod-level, and Cluster-level fault domain awareness: for example, when a Cluster fails, workloads are restored to another Region to continue running, or during Cluster maintenance, other Clusters in the same Pod temporarily take over the service.

Lao Wang has said all this in the hope of conveying the concept of the fault domain, which is common not only in Microsoft WSFC 2016 but also in other cloud management software such as CloudStack and OpenStack. Lao Wang believes this feature will become very common in cloud platform management and in large or hosted data centers, and that it is a good feature for standardized management.

This technique can be called the software-defined failure domain, or fault domain.

For example, we know that a tenant's virtual machine runs on node HV01, HV01 is in Cluster01, Cluster01 is in Pod01, Pod01 is in Zone01, and Zone01 is in Region Beijing. We used to keep this mapping in mind; now we can define it in software in the management system.

If the HV01 node fails at the Host level, the cluster automatically implements the host-level failure domain and its workloads are transferred to other nodes. Note that any cluster system capable of normal failover has, by default, already physically implemented the host fault domain.

If the Cluster breaks, then the Cluster is the failure domain. A Cluster carries many virtual machines, and they go down with it; without a cross-Cluster transfer mechanism, downtime occurs here.

If the Pod level breaks, then the Pod is the failure domain: a Pod carries many Clusters, the Clusters carry many applications, and all of them go down.

If the Zone level breaks, the entire Zone is the failure domain.

If the Region level breaks, the entire Region is the failure domain.

As you can see, the higher the level, the greater the impact. So if you are an architect at a cloud provider or a large hosted data center provider, you should first strengthen the reliability of each failure domain level and prepare a disaster recovery mechanism: for each level, know what happens when it fails and how to recover. Secondly, you should find ways to transfer users' cloud resources across fault domains. For example, if a workload currently runs in a Cluster and that Cluster breaks, can it be transferred to another Pod or another Zone to continue running? Once this is implemented, users should be advised to deploy their resources across different fault-domain-aware locations.

The fault domain hierarchy is usually something the administrator keeps in mind, or something written into the contract. With cloud platform or management software, the first step is merely to record the fault domains so they can be seen on screen, managed centrally, and used for reports and architecture views.

More importantly, we need fault domain awareness: ensure that resources a user places in different failure domains will not fail at the same time, ensure that a failure in one domain can be detected and handled across domains, and let user virtual machines run normally on cluster nodes within the same Region while, if the whole Region fails, migrating them directly across Regions to keep running. So the definition of the fault domain provides the following functions:

1. Implement the fault domain architecture at the software level, making it easy to record, view, and map to the physical architecture.

2. Implement cloud management and placement policies according to the fault domain architecture.

3. Realize fault domain awareness, ensuring that resources a user places in different failure domains will not fail at the same time.

4. Realize cross-fault-domain awareness, allowing resources to be transferred across failure domain levels in the event of a failure.

5. Through the fault domain function, improve the affinity of storage, network, and compute resources within the same fault domain, ensuring that storage and compute in the same fault domain work together quickly.

To realize a fault domain, the work divides into two steps: 1. logical definition, 2. concrete implementation.

Logical definition means we first define the fault domain data at the software level and then add resources to it. It then looks as if we have fault domains, but it is only for show; if the cloud platform supports it, you can build some reports or policy controls on top.

Concrete implementation means really achieving the function of the fault domain: resources that users place in different domains will not go down at the same time, and when a lower-level fault domain becomes unavailable, user resources can be migrated across it to a higher level. To achieve this we can partly rely on technical means; where the technology is not in place, we can also rely on manual operation, that is, the human brain.

Defining a fault domain is not as simple as saying a few words. Administrators must keep it in mind during maintenance: for example, never maintain all fault domains at the same time; finish one fault domain before starting another. And if the technical means fail to transfer workloads across the fault domain level when a level really breaks, the administrator needs to manually evacuate the user resources and recover them in another failure domain.

Therefore, the failure domain is not only a software definition and a technical implementation; it also requires administrators to maintain the fault domain mindset: if user resources span different fault domains, how should I maintain them? When a given fault domain level breaks, what should I pay attention to, which process do I follow to restore, and how do I recover across levels? Once the software and technical layers are in place, what remains is mostly maintaining the process.

OK, that was a lot of general concepts. Let's look at the answer sheet Microsoft WSFC 2016 hands in on the failure domain.

In WSFC 2016, Microsoft provides four failure domain levels: Site, Rack, Chassis, and Node. Node is the default after cluster installation; Site, Rack, and Chassis we can create ourselves and nest between levels, and each failure domain can carry a detailed description for easy viewing. What surprises me is that the fault domain in WSFC 2016 is not just talk: WSFC itself can genuinely implement fault domain awareness. So far, Lao Wang has observed effective scenarios for each of the Site, Rack, and Chassis fault domain levels.

For example, Nodes in the same Site fail over within the Site by default; only if no cluster node in the Site is available are workloads transferred to a different Site. There are also many Site-level affinity optimizations: you can configure a preferred Site at the cluster level or at the application level, and virtual machines in a Site use storage in the same Site, so if the storage moves to another Site the virtual machines move with it, and so on. Fault domain awareness at the WSFC 2016 Site level is the focus of the rest of this article.

Another scenario is Storage Spaces Direct (S2D), which I believe many people following Microsoft technology have tested. Similar to VSAN, it contributes the internal storage of each node into a storage pool, builds storage spaces, CSVs, and SOFS on that pool, and finally delivers them to applications. In the S2D scenario, when we write data to a storage space, the data is scattered across different nodes; after configuring fault-tolerance such as two-way mirroring, three-way mirroring, or dual parity, more than one copy of the data is kept. If the fault-tolerance level allows, a specified number of nodes can fail and be replaced by new nodes, or a disk can fail and a new disk take over. Therefore, by default, S2D already has disk-level and node-level fault domains.

We can also combine S2D with the new WSFC 2016 fault domain feature. Before enabling S2D, we build Rack or Chassis fault domains for the nodes; when S2D performs fault tolerance it will follow this topology, for example always spreading data copies across nodes in different Racks or different Chassis, so that even a whole Rack or Chassis failure will not affect S2D. Note: if you want this behavior, it is recommended to plan the failure domain levels before S2D is enabled; if S2D is already built, nodes must be removed and rejoined after being placed in the failure domain.
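A minimal sketch of that ordering, assuming a fresh cluster whose nodes have already been placed under Rack fault domains with the commands shown later in this lab; the pool name pattern "S2D*" and the rack-level awareness value are assumptions to verify against your environment:

# enable Storage Spaces Direct only after the Rack fault domains exist,
# so the pool is created with knowledge of the topology
Enable-ClusterStorageSpacesDirect

# optionally make rack-level awareness the default for new virtual disks;
# FaultDomainAwarenessDefault also accepts values such as StorageScaleUnit (node) or StorageChassis
Get-StoragePool -FriendlyName "S2D*" | Set-StoragePool -FaultDomainAwarenessDefault StorageRack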

WSFC 2016 failure domain cluster configuration commands:

Get-ClusterFaultDomain: gets the cluster failure domain architecture, for the whole cluster or for any single failure domain

Set-ClusterFaultDomain: moves a resource into a fault domain, or configures fault-domain-related parameters

New-ClusterFaultDomain: creates a cluster failure domain at the Site, Rack, or Chassis level

Remove-ClusterFaultDomain: deletes a failure domain

Lab example:

# create Beijing site and Shanghai site

New-ClusterFaultDomain-Type Site-Name "Beijing"-Description "Primary Site"

New-ClusterFaultDomain-Type Site-Name "Shanghai"-Description "Secondary Site"

# create a Rack in the Beijing site data center and a Rack in the Shanghai site data center

New-ClusterFaultDomain -Type Rack -Name "Rack Beijing1" -Location "Fumadasha 17, Room 501"

New-ClusterFaultDomain -Type Rack -Name "Rack Shanghai1" -Location "TaiDidash 14, Room 203"

# create two chassis in the Beijing Rack and two chassis in the Shanghai Rack

New-ClusterFaultDomain -Type Chassis -Name "Chassis 01" -Location "Rack Beijing01 Ontop"

New-ClusterFaultDomain -Type Chassis -Name "Chassis 02" -Location "Rack Beijing01 Under"

New-ClusterFaultDomain -Type Chassis -Name "Chassis 03" -Location "Rack Shanghai01 Ontop"

New-ClusterFaultDomain -Type Chassis -Name "Chassis 04" -Location "Rack Shanghai01 Under"

Note that each failure domain Name must be unique.

# add the servers to their chassis

Set-ClusterFaultDomain -Name "HV01" -Parent "Chassis 01"

Set-ClusterFaultDomain -Name "HV02" -Parent "Chassis 02"

Set-ClusterFaultDomain -Name "HV03" -Parent "Chassis 03"

Set-ClusterFaultDomain -Name "HV04" -Parent "Chassis 04"

# build the chassis, rack, and site parent relationships

Set-ClusterFaultDomain -Name "Chassis 01" -Parent "Rack Beijing1"

Set-ClusterFaultDomain -Name "Chassis 02" -Parent "Rack Beijing1"

Set-ClusterFaultDomain -Name "Chassis 03" -Parent "Rack Shanghai1"

Set-ClusterFaultDomain -Name "Chassis 04" -Parent "Rack Shanghai1"

Set-ClusterFaultDomain -Name "Rack Beijing1" -Parent "Beijing"

Set-ClusterFaultDomain -Name "Rack Shanghai1" -Parent "Shanghai"

# get the fault domain topology

Get-ClusterFaultDomain

# get complete information of failure domain

Get-ClusterFaultDomain | ft Name,Type,ParentName,ChildrenNames,Location,Description -AutoSize

Here you can see all the failure domains Lao Wang defined earlier, their nesting relationships, and the Location and description attributes. These are new properties, mainly used to label a fault domain for easy troubleshooting or lookup; as you can see, they can go from the city level, to the data center building, to the room, to the rack, and even to the position of a chassis within the rack. You can also enter city plus data center information in a Site's Location, and both Location and description can be changed later with Set-ClusterFaultDomain. There are many ways to use them; explore on your own. The key is to be accurate, meaningful, and clear.

# get a single failure domain topology

# get a certain level of failure domain topology
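The concrete commands for these two queries appeared as screenshots in the original; as a hedged sketch, the -Name and -Type parameters of Get-ClusterFaultDomain should cover both cases (the names below come from the lab example above):

Get-ClusterFaultDomain -Name "Chassis 01"

Get-ClusterFaultDomain -Type Rack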

# Delete a failure domain

A failure domain can only be deleted if it has no child resources beneath it. For example, to delete Chassis 01, which still contains the HV01 node, we must first remove HV01 from it.

# remove the resources under the fault domain to be deleted

Set-ClusterFaultDomain-Name "HV01"-Parent ""

# Delete a failure domain
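The deletion command itself was a screenshot in the original; a minimal sketch using the cmdlet listed earlier (the chassis name comes from the lab example):

Remove-ClusterFaultDomain -Name "Chassis 01"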

In addition to using PowerShell to configure the cluster failure domain, you can export the configuration as XML, edit it, and import the XML back.

# Export failure domain schema xml

Get-ClusterFaultDomainXML | Out-File

After editing the XML with your tool of choice, import it back with the following command for it to take effect:

$xml = Get-Content | Out-String

Set-ClusterFaultDomainXML -XML $xml
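For completeness, a hedged end-to-end sketch of the round trip; the file path is a hypothetical placeholder, not from the original:

# export the fault domain schema to a file (hypothetical path)
Get-ClusterFaultDomainXML | Out-File C:\Temp\FaultDomains.xml

# edit the file, then read it back and apply it
$xml = Get-Content C:\Temp\FaultDomains.xml | Out-String
Set-ClusterFaultDomainXML -XML $xml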

Above, Lao Wang briefly introduced the new fault domain feature in WSFC 2016 and how to configure it, but so far we have only created the fault domains logically. We can see that they map to the physical layout, but they have not yet taken effect, because how effective a failure domain is depends on how far the cloud management platform, or WSFC, can actually sense it.

In conventional designs, failure domains are defined in the cloud platform management system and operate on the overall cloud infrastructure. WSFC does the opposite: Site, Rack, and Chassis are defined at the cluster level, so all of them operate within the scope of a single cluster. This is one difference between Microsoft's architecture and other vendors'.

Among these levels, Lao Wang believes WSFC does its best work in site awareness of the failure domain: we define the site failure domains and place the nodes under them.

Site awareness: after sites are set up, cluster nodes transfer roles within the site on every failover or maintenance-mode drain, and only move to other sites when no healthy node remains locally. This technology was not available in previous versions; previously, similar requirements were met by setting the preferred owners of each application. 2016 brings site awareness into the cluster itself.

Storage site awareness: virtual machines in the cluster follow their CSV within a site. Suppose the CSV migrates to another site; within a minute the virtual machine detects that its CSV has moved and live-migrates after it. Once site awareness is configured, it ensures that storage is always accessed directly within the preferred site; if the storage moves, the virtual machine follows.

Cluster group site awareness: we can also configure site awareness per application, for example setting the preferred site of SQL instance 1 to Beijing and SQL instance 2 to Shanghai. This enables a multi-master scenario where SQL1 fails over first within the Beijing site and SQL2 first within the Shanghai site.
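As a sketch of that configuration (the group names SQLInstance1 and SQLInstance2 are hypothetical placeholders; PreferredSite is the same property used throughout this article, and the site names match the lab example):

(Get-ClusterGroup -Name "SQLInstance1").PreferredSite = "Beijing"

(Get-ClusterGroup -Name "SQLInstance2").PreferredSite = "Shanghai"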

Preferred site: after configuring site awareness, we can designate a preferred site. The preferred site ensures that when the cluster reaches a 50/50 vote split, dynamic quorum always removes a vote from the non-preferred site so the preferred site keeps running. Yes, this is a replacement for LowerQuorumPriorityNodeID. On a cold start of the cluster, virtual machines also start at the preferred site first.

Their priority order is as follows

Storage site awareness > cluster group site awareness > preferred site awareness

Beyond this, site awareness adds a new cross-site heartbeat detection mechanism, which we will introduce in the next blog.
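As a quick hedged preview (CrossSiteDelay and CrossSiteThreshold are the cluster common properties WSFC 2016 adds for this; the values shown are the documented defaults):

# milliseconds between heartbeats sent to nodes in a different site,
# and how many missed heartbeats are tolerated before the node is considered down
(Get-Cluster).CrossSiteDelay = 1000
(Get-Cluster).CrossSiteThreshold = 20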

OK, let's start the experimental verification.

Clear all failure domain configuration

Experimental environment

The cluster currently runs one virtual machine and two DTC roles, all based on CSV and all running on HV01 in the Beijing site

The fault domain is configured as follows

To verify site awareness, no preferred owners are configured; the three clustered applications run on HV01 and belong to the same site fault domain, Beijing.

Manually stop the cluster service on node HV01; the applications automatically migrate first to HV02 within the same site.

Open the cluster log to view:

Although we succeeded here, there is a certain chance the effect will not be achieved. In 2016, once a site failure domain is built and the cluster fails over, a traditional cluster role will try to fail over directly inside the site, but a CSV-based virtual machine will not necessarily do so, because the CSV can drift: the CSV coordinator may currently be on HV04 while the virtual machine runs on HV01. That can happen too.

If the CSV coordinator and the virtual machine are on the same node, then when that node goes down, the CSV may land in another site; and if the CSV lands on a node in another site, the virtual machine will follow it.

If the CSV coordinator and HV01 are not in the same Site, then when HV01 fails over, the virtual machine goes wherever the CSV's site is: the virtual machine follows the CSV, which guarantees the virtual machine the best storage performance even at the price of ignoring in-site awareness. That is why storage awareness has the highest priority.

If a virtual machine is powered on, it checks every minute whether it and its CSV are in the same Site, and live-migrates if not. A powered-off virtual machine is not checked, but during failover or maintenance it will first go to the site where the CSV is located, which can be seen in the cluster log.

Next we come to another technology, the preferred site. Preferred sites can be configured at the cluster level, the storage level, and the cluster group level. The first thing to configure here is the storage level: we manually pin CSVs to sites to guarantee CSV-site affinity, for example CSV01 always follows the Beijing site and CSV02 always follows the Shanghai site.

In this way, virtual machines in the Beijing site will always find CSV01 in the Beijing site, and virtual machine failover or maintenance will always prefer the Beijing site, because the CSV is pinned there. If the CSV itself undergoes maintenance or fails, it returns to the Beijing site as soon as possible, and CSVs pinned to the same site will load-balance across that site's nodes.

# get the cluster group name of CSV

Get-ClusterSharedVolume | Get-ClusterGroup

As you can see, a CSV is itself a cluster group.

# configure the preferred site using the CSV cluster group name obtained above

(Get-ClusterGroup -Name CSVClusterGroupName).PreferredSite = "beijing"
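To confirm the pinning took effect, a quick check (PreferredSite is the same property just set):

Get-ClusterSharedVolume | Get-ClusterGroup | ft Name, PreferredSite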

OK, now we can rest assured: the virtual machine always follows its CSV, and the CSV always follows its pinned site, which ensures that when the virtual machine fails and nodes are still available at the local site, it is always migrated within the local site first.

Next, let's configure the preferred site at the cluster level. The purpose is to ensure that the preferred site always wins when a 50/50 vote split occurs.

# View the voting and witness voting of the current node of the cluster
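The exact command was a screenshot in the original; a hedged sketch using standard cluster properties (NodeWeight is the assigned vote, DynamicWeight the vote after dynamic quorum adjusts it, and WitnessDynamicWeight the witness's current vote):

Get-ClusterNode | ft Name, State, NodeWeight, DynamicWeight

(Get-Cluster).WitnessDynamicWeight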

Disable the witness disk directly to simulate a quorum witness failure. With the votes split 50/50, dynamic quorum automatically selects one node and removes its vote.

You can see that, by default, the cluster removes the vote from a Shanghai site node.

If we manually designate a preferred site, for example Shanghai, then the next time a 50/50 split occurs, dynamic quorum will always remove the vote from a Beijing site node.

ClusterLog

With the witness online, every node keeps its full vote.

When the witness fails, one vote becomes surplus, and dynamic quorum immediately removes a vote, by default at random.

By default, dynamic quorum removes HV03's vote.

After the manual change, it removes HV01's vote instead.

The above is one effect of configuring the preferred site at the cluster level; in this respect it does not look much different from 2012 R2's LowerQuorumPriorityNodeID.

# configure the cluster level preferred site back to Beijing

(Get-Cluster).PreferredSite = "Beijing"

Having seen storage-level site awareness and the new preferred site feature that replaces LowerQuorumPriorityNodeID, let's look at the main course: multi-master site awareness for applications.

Before looking at application multi-master site awareness, Lao Wang first wants to show the effect of site awareness at the cluster level. Since we have already configured storage site awareness, the complete behavior can now be seen.

Step 1: all applications and virtual machines run on HV01 in the Beijing site.

Step 2: manually stop the HV01 cluster service; the CSV migrates to another node in the same site, and all roles migrate to nodes in the same site.

Step 3: stop the HV02 cluster service as well, losing the entire Beijing site; all applications migrate across the site failure domain to Shanghai.

So far, this is the effect of fault domain site awareness. For those who understand the fault domain concept but do not know Microsoft WSFC, it may look cool: applications are aware of the fault domain and transfer within it first; virtual machines in the same site fault domain get the best storage performance; and if the local site fault domain is unavailable, they transfer across fault domains to a new site and keep running. It looks very good. To a trained eye, though, this is partly a repackaged concept: a similar effect could be achieved with the older preferred owner technique. Still, the new configuration method looks more professional and helps the fault domain concept land in practice.

Compared with plain site awareness, Lao Wang pays more attention to the other large piece, the preferred site, which spans the cluster level, the storage level, and the cluster group level. Among these he values the storage-level preferred site most, since it integrates with site awareness to ensure that virtual machines in a site always access storage in the same site, keeping CSV storage and virtual machines together for the best performance. This was not available in previous versions.

To sort out the new fault domain features WSFC 2016 adds:

1. Fault domain definition: new; used by site awareness and S2D awareness

2. Site awareness: replaces preferred owners

3. Preferred site awareness: replaces LowerQuorumPriorityNodeID

4. Application multi-master preferred site awareness: replaces preferred owners

5. CSV storage preferred site awareness: storage follows the site, VMs follow the storage

6. New parameters for cross-site heartbeat detection

Failure domains and site awareness support WSFC's traditional same-domain deployments, workgroup deployments, and same-forest multi-domain deployments.

As this is the first time Microsoft has introduced the fault domain attribute in an on-premises product, Lao Wang thinks what Microsoft has done is OK and expects continued improvement. In Lao Wang's view, the fault domain could actually be defined at the SCVMM level, from site, data center, rack, and chassis down to cluster and Host; top-down might work better, because at this stage it is defined bottom-up and the scale of a single cluster is limited. At the VMM level we would not be limited by the size of one cluster, and could even configure different failover, network, and storage policies per site, rack, chassis, or data center, which would be better still. I hope this function keeps improving, and that more on-premises products and applications above the cluster cooperate with the failure domain.

Finally, with application multi-master preferred site awareness, we can give different cluster groups different preferred sites, so that each group fails over first within its preferred site and only moves to another site when no node there is available. To a certain extent this reduces the cost of cross-site transfers, ensures the resources of each site are used reasonably, and, by transferring across sites only when the local site has no nodes left, reduces downtime and improves application uptime.

# configure multi-master site awareness

Beijing is the preferred site for devtestdtc and MikeWang:

(Get-ClusterGroup -Name devtestdtc).PreferredSite = "Beijing"

(Get-ClusterGroup -Name MikeWang).PreferredSite = "Beijing"

Shanghai is the preferred site for devtestdtc1:

(Get-ClusterGroup -Name devtestdtc1).PreferredSite = "Shanghai"
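A quick check that the assignments landed (OwnerNode and PreferredSite are standard cluster group properties):

Get-ClusterGroup | ft Name, OwnerNode, PreferredSite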

Currently devtestdtc and MikeWang run on HV01, and devtestdtc1 runs on HV04.

Bring down HV01: devtestdtc and MikeWang transfer to HV02 in the same Beijing site.

Bring down HV04: devtestdtc1 transfers to HV03 in the same Shanghai site.

Whether for site awareness or for application multi-master preferred site awareness, it is recommended to configure the application's failback attribute, so that when the preferred site or the local site becomes available again, the application returns to its original site to run.
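A hedged sketch of that failback setting (AutoFailbackType, FailbackWindowStart, and FailbackWindowEnd are standard cluster group properties; the group name comes from the lab above, and restricting failback to the 2-4 o'clock window is only an illustration):

# 1 = allow failback; restrict it to a nightly window (hours 2 through 4)
$group = Get-ClusterGroup -Name devtestdtc
$group.AutoFailbackType = 1
$group.FailbackWindowStart = 2
$group.FailbackWindowEnd = 4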
