In the previous article, Lao Wang used a 2008 R2 WSFC cluster as the example to walk through how the cluster handles quorum changes. By 2012 this had changed a great deal, to the point where we need to rethink quorum itself.
Quorum in the 2008 era and earlier was rather rigid. It was as if you and the cluster had signed a contract: under a three-node Node Majority model, at least two nodes must be running, so once the cluster was down to its last node, the cluster service shut down. In the 2008 era the cluster mainly insisted on honoring the quorum model; the quorum protocol you agreed to could not be violated. Even if the one remaining node could still serve clients, the cluster would shut it down anyway, unless you started the cluster with forced quorum. It is fair to say forced quorum was used a lot in the 2008 era.
By 2012 that old quorum model had changed and is no longer so rigid. Imagine that you and the cluster have gotten to know each other well and reached a much smarter agreement: the cluster no longer demands that you strictly follow the quorum protocol. Or rather, the purpose of the quorum protocol has changed. In 2012, quorum is mainly about keeping the cluster continuously available; it no longer emphasizes compliance with the quorum model, but focuses on keeping the cluster alive.
Microsoft began adding dynamic vote technology in 2012. Starting with 2012, WSFC can adjust votes dynamically as node states change. Take a three-node cluster using the Node Majority quorum model: when one node fails, two votes would normally remain. In the 2008 era, if another node then failed, the cluster shut down, because the quorum model was no longer being followed; you had violated the protocol.
2012 does not do that. 2012 dynamically adjusts the node votes so that the total number of votes in the cluster is always odd; that way, whenever a partition occurs, a majority can still be formed. When one node of a three-node cluster fails, a 2012 cluster removes one more vote so that only a single node in the cluster holds a vote. If another node then fails, the cluster can still stay up part of the time. Why only part of the time? Lao Wang will answer that later. But with some probability the cluster can last all the way to the last node.
This is a relatively new way of thinking. In the 2008 era this situation shut the cluster down, and we could only force-start it on the last node. In 2012 we no longer need to, because the cluster adjusts the votes for us dynamically. The 2012 R2 era is smarter still: the witness's vote can also be adjusted dynamically, which is what truly lets the cluster hold on until the last node.
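For readers who would rather check this from PowerShell than from the GUI, here is a minimal sketch, assuming a 2012 R2 node with the FailoverClusters module available. DynamicQuorum and WitnessDynamicWeight are the cluster properties WSFC exposes for dynamic quorum and the dynamic witness; nothing else here is specific to Lao Wang's lab.

```powershell
# Sketch: inspect dynamic quorum state on a 2012 R2 cluster node
# (assumes the FailoverClusters module is installed)
Import-Module FailoverClusters

# 1 = dynamic quorum enabled (the default since 2012)
(Get-Cluster).DynamicQuorum

# 2012 R2 only: 1 = the witness currently carries a vote, 0 = it does not
(Get-Cluster).WitnessDynamicWeight
```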
I won't say much more about theory. Many people say they have seen the concept of dynamic quorum plenty of times yet still don't understand what it actually does, so next let's look at the real effect.

As you can see, Lao Wang has now created a cluster in a 2012 R2 environment, with three nodes and no disk witness or file share witness configured.
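For reference, a cluster like this one could be stood up from PowerShell roughly as follows. This is only a sketch, not the exact commands used in this lab: the cluster name and static address are hypothetical placeholders, and -NoStorage simply keeps shared disks (and therefore any disk witness) out of the picture for now.

```powershell
# Sketch: create a 3-node cluster with no shared storage, so no witness yet
# "DemoCluster" and 10.0.0.100 are placeholders, not values from this lab
New-Cluster -Name DemoCluster -Node HV01,HV02,HV03 `
            -StaticAddress 10.0.0.100 -NoStorage

# With no witness configured, quorum is effectively Node Majority
Get-ClusterQuorum
```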
Back in 2016, when my good friend Junsen Zhang was configuring a cluster, he asked me where the Node Majority option had gone in 2012 and how quorum should now be configured; he did not recognize the new interface. Indeed, the cluster quorum UI has changed since 2012.
It now looks like the following. Lao Wang appreciates Microsoft's good intentions here: Microsoft knows the quorum concept is not easy for everyone to understand and design, so it built intelligent dynamic node votes and a dynamic witness, and the cluster automatically determines the most suitable quorum model. Under normal circumstances we can simply choose the default quorum configuration and manage nothing else. If the cluster detects a disk that is suitable as a witness, a disk witness is preferred over plain Node Majority as the cluster quorum.
Many friends may ask where Node Majority quorum went. In fact, starting with 2012, Node Majority has simply become "none", that is, no quorum witness is configured. If we pick the second option on the quorum selection screen, selecting the quorum witness, we get the following interface, where we manually configure the cluster's witness. If we want Node Majority, we choose not to configure a quorum witness. If we choose the default, the cluster decides the quorum model on its own, and it automatically configures Node Majority if it detects that no witness is available.
If we choose Advanced quorum configuration in the Configure Cluster Quorum wizard, then besides manually selecting the quorum model, we can also set node votes in the GUI. Here we can specify that certain nodes never carry a vote, something that previously could only be done from the command line. If you want to set the quorum model to Disk Only, you also have to go through the advanced quorum configuration.
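The choices the wizard exposes can also be made from PowerShell. A hedged sketch follows, using the disk and node names from this lab; Set-ClusterQuorum and the NodeWeight property are the standard FailoverClusters interfaces for this, but treat the exact values as examples.

```powershell
# Node Majority, i.e. "do not configure a quorum witness"
Set-ClusterQuorum -NoWitness

# Configure a disk witness (the wizard's "Configure a disk witness" option)
Set-ClusterQuorum -DiskWitness "Cluster Disk 1"

# Disk-only quorum, reachable only through the advanced configuration
Set-ClusterQuorum -DiskOnly "Cluster Disk 1"

# Permanently take a node's vote away (the GUI equivalent is the node vote
# checkboxes in the advanced wizard); HV03 here is just an example
(Get-ClusterNode -Name "HV03").NodeWeight = 0
```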
The above briefly introduces the changes in the new cluster quorum wizard since 2012, to help you get familiar with the basic environment. Node Majority, disk witness, file share witness, and the disk-only quorum model are all still there, just presented differently. The disk-only quorum mode is no longer recommended in the 2012 era, because the disk becomes a single point of failure and cannot take full advantage of dynamic quorum.
Starting with 2012 R2, when we create a multi-node cluster, we often see prompts and warnings like the following.

The reason is that starting with 2012 R2, WSFC wants you to always configure a witness to ensure the highest availability of the cluster, because dynamic node votes on their own, as in 2012, still carry some risk. In 2012 R2, whether you have an odd or an even number of nodes, as long as there is a witness disk the cluster can be kept alive all the way to the last node. For example, with 3 nodes + a witness disk, the cluster automatically removes the witness disk's vote, so the cluster has three votes. If one node fails, the cluster is down to 2 votes, so it automatically adds the witness's vote back, giving 3 votes, which is odd. If another node then fails, the last node and the witness remain, and the cluster still survives.
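If you want to watch this vote arithmetic while failing nodes, a small sketch like the following works on 2012 R2; it only uses the DynamicWeight and WitnessDynamicWeight properties already mentioned above.

```powershell
# Dynamic quorum tries to keep the total number of live votes odd:
# sum the node votes currently in play and add the witness vote
$nodeVotes   = (Get-ClusterNode | Measure-Object -Property DynamicWeight -Sum).Sum
$witnessVote = (Get-Cluster).WitnessDynamicWeight
"Node votes: $nodeVotes  Witness vote: $witnessVote  Total: $($nodeVotes + $witnessVote)"
```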
Let's look at the actual effect. First, the case of three-node Node Majority quorum.

Our three nodes are all on the same subnet, and we deliberately did not configure a witness disk.

In 2012 R2 you can see node votes directly in the cluster GUI; you can see that each node currently holds one vote and is working normally.
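The same vote columns the GUI shows can be read from PowerShell as well; in Lao Wang's lab the GUI's assigned vote and current vote line up with the NodeWeight and DynamicWeight properties. A sketch:

```powershell
# NodeWeight = the configured vote, DynamicWeight = the vote the cluster
# is currently letting the node cast
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight -AutoSize
```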
We again created a clustered DTC application, currently hosted on node HV02.
HV02 is powered off directly, and the clustered DTC automatically fails over to HV01. You can see that the cluster has used the dynamic vote technology added in 2012 to automatically remove one vote from the two remaining nodes, always keeping the cluster's vote count odd. In the 2008 era, once one node of a three-node Node Majority cluster failed, the cluster immediately started warning that it could not tolerate another failure; if one more node failed, the cluster shut down. In 2012 there is no such prompt, because the cluster no longer primarily cares about complying with the quorum model; it just keeps itself available.
Now we power off HV01 as well, leaving only HV03 in the cluster, and you can see the clustered DTC has moved to HV03 and is serving normally. Cheers! Under the three-node Node Majority model we can now survive to the last node, which raises the availability of the clustered application to a certain extent. Previously we would have had to force-start the last node; now we don't, because dynamic quorum adjusts the cluster node votes for us and keeps the cluster continuously available.
Now HV01 and HV02 gradually recover and rejoin the cluster. You can see that as the nodes rejoin, the cluster again dynamically adjusts the node votes. After HV02 joined, with two nodes present, the cluster randomly assigned the single vote to node 2.

Node 1 comes online and joins the cluster, and the cluster dynamically adjusts the votes.

After node 1 is fully online and has joined the cluster, the cluster returns to an odd count of three votes.

The clustered application stays available throughout; the downtime is only the time it takes for the cluster group to go offline on one node and come back online on another.
So far this looks beautiful: the cluster automatically adjusts the node votes for us, and even with three nodes it can survive down to the last one. In reality, though, this Node Majority approach of dynamically adjusting node votes has a flaw. Lao Wang said above that with three nodes the cluster can survive to the last node only part of the time; here is why.

Assume this scenario: a three-node cluster loses one node, leaving two. Can a two-node majority in WSFC 2012 R2 keep the cluster running to the end? The answer is yes, but the risk is very high.

With two nodes remaining, the cluster randomly selects one node to keep a vote and removes the vote of the other node.
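Before taking either of the two remaining nodes down, it is worth checking which one is currently carrying the vote; a quick sketch:

```powershell
# Find which surviving node still carries the dynamic vote; losing that
# node without a clean shutdown is exactly case 3 described below
Get-ClusterNode |
    Where-Object { $_.State -eq 'Up' -and $_.DynamicWeight -eq 1 } |
    Select-Object Name, NodeWeight, DynamicWeight
```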
If HV03 loses power, fine; the cluster doesn't care, because HV03 is not the selected voting node. If it fails, the cluster still runs normally.

Now HV03 is restored and HV02 is still the selected voting node. We try shutting down the HV02 operating system gracefully.

You can see that the vote has been handed over to HV03, and the clustered application is still running normally. It is as if node 2 and node 3 are colleagues: node 2 says, "I'm going off shift, the rest of the work is yours," node 3 agrees, the work is handed over, node 2 shuts down, and node 3 finishes the remaining work.

In the last case, we assume the node that currently holds the vote suddenly loses power.

You can see that HV02 is currently the voting node and is powered off directly, with no chance to hand its vote over to HV03, so the vote never reaches HV03.

The cluster service goes down and becomes inaccessible; here dynamic quorum cannot carry the cluster to the last node.

The only way out is to force the cluster to start.
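For completeness, the forced start would look roughly like this on the surviving node; Start-ClusterNode with the force quorum switch (or the equivalent net start form) is the standard way, and HV03 here simply stands in for whichever node is left.

```powershell
# On the last surviving node (HV03 in this scenario), force the cluster
# service to start even though quorum was lost
Start-ClusterNode -Name HV03 -ForceQuorum

# Equivalent from an elevated command prompt:
# net start clussvc /forcequorum
```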
Therefore, under Node Majority quorum, when only two nodes remain, whether the cluster keeps running depends on the situation:

Case 1. The non-voting node loses power: the cluster keeps running normally.

Case 2. The voting node's operating system is shut down gracefully: the vote is handed over normally and the cluster keeps running.

Case 3. The voting node loses power: the vote is not handed over in time, the cluster cannot run, and a forced start is required.

From this we can see that Node Majority with dynamic quorum can only be said to survive to the last node in some scenarios. We can only hope to hit case 1 or case 2; once case 3 happens, the only option is a forced start.
Dynamically adjusting node votes is a new technology in 2012 and later. As we have seen, it is undeniably a good technology that intelligently solves problems for administrators in most scenarios, but it has its shortcoming, namely case 3.
In the 2012 R2 era, Microsoft added the dynamic witness on top of 2012's dynamic quorum: besides the nodes, the witness can also have its vote adjusted automatically. As long as there is a witness, regardless of case 1, case 2, or case 3, the cluster can keep running normally. It is fair to say that with a witness, forced starts are almost never needed anymore.
Next let's look at a three nodes + disk witness scenario. Lao Wang demonstrated this scenario on 2008 before, and frankly it was not very useful there: on 2008, three nodes + a disk witness could tolerate at most one failure, leaving two nodes + the disk witness, and one more failure shut the cluster down. In Lao Wang's view that is poor from a compute availability perspective, because only the nodes do the computing, yet I had to keep two compute nodes available and keep a witness disk available as well.

In 2012 R2 this scenario has changed dramatically.
We added Cluster Disk 1, which is only 1 GB, the smallest of all the cluster disks but larger than 512 MB, so Cluster Disk 1 is the preferred choice when the cluster picks a quorum disk.

Here we verify that, with the default quorum configuration, the cluster always configures the witness disk for us.

You can see that the cluster automatically chose Cluster Disk 1 as the witness disk for us, which follows best practice: whether the node count is odd or even, in 2012 R2 the cluster recommends that you configure a witness disk.
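To confirm from PowerShell which resource the cluster picked as the witness, a quick sketch:

```powershell
# Shows the quorum type and the witness resource the cluster selected
Get-ClusterQuorum | Format-List Cluster, QuorumResource, QuorumType
```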
The clustered DTC is currently running on HV03.

HV03 is powered off directly. The remaining nodes now hold two votes, and you can see that the cluster has automatically added the witness's vote.

HV01 is also powered off. You can see that after HV01 goes down the cluster does not adjust the votes further, because only node HV02 and the witness disk are left in the cluster.

The clustered DTC works properly on HV02.

When HV01 and HV03 are repaired, you can see that all three nodes' votes are restored and the witness disk's vote is automatically removed.
At this point, I believe everyone has a feel for dynamic quorum. It is a new way of thinking: the cluster automatically adjusts the votes of the nodes and the witness for us to keep the cluster always available.

Under Node Majority, the cluster can hold on to the last node only in some scenarios. With a disk witness, as long as the witness disk survives and remains accessible, the cluster can hold on to the last node. Therefore, when using 2012 R2 clusters, it is recommended that you always configure a witness disk, regardless of whether the number of nodes is odd or even.

In the next article we will continue with dynamic quorum, simulating a multi-site, four-node dynamic quorum scenario.