Previously, I mainly introduced the types of clusters and some basic cluster knowledge. Starting with this chapter, we will focus on in-depth study and theoretical analysis of Microsoft failover clustering.
Microsoft failover clustering is the typical high-availability cluster solution introduced in the previous article. It is built into the roles and features of Windows Server and requires no additional tools. A failover cluster usually works in an active-passive mode: a given clustered application is served by only one node at a time. The cluster uses a heartbeat detection mechanism to monitor whether nodes are alive; once a node is detected as down, the clustered applications it hosted are brought online on another node by consulting the cluster database.
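To make the heartbeat mechanism concrete, here is a minimal PowerShell sketch (run on any cluster node; the values set at the end are illustrative assumptions, not recommendations) that reads and tunes the heartbeat-related cluster properties:

```powershell
# Requires the FailoverClusters module (installed with the cluster feature).
Import-Module FailoverClusters

# Heartbeat interval (ms) and how many missed heartbeats mark a node as down.
$cluster = Get-Cluster
$cluster | Format-List SameSubnetDelay, SameSubnetThreshold,
                       CrossSubnetDelay, CrossSubnetThreshold

# Example: judge a same-subnet node down after 10 missed 1-second heartbeats.
# (Illustrative values only.)
$cluster.SameSubnetDelay     = 1000
$cluster.SameSubnetThreshold = 10
```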
At the same time, failover clustering has well-developed health awareness for clustered applications, for nodes, and for the cluster as a whole. These capabilities began to be strengthened in the 2008 era and matured around 2012 R2.
Some clustered applications can cooperate with DNS round robin, or have their own request-distribution technology; in those cases an active-active working mode can be built on top of failover clustering, for example SOFS and SQL Server AlwaysOn.
Microsoft failover clustering was introduced in the NT4 era, when it was called MSCS (Microsoft Cluster Service). It became fully usable in the 2003 era, and more enterprises began to use MSCS to build clusters. Although Windows Server 2003 was an excellent and stable system, MSCS cluster configuration in that era was genuinely troublesome, which deterred many people. That said, Lao Wang believes the 2003-era cluster is one of the versions that makes it easiest for IT staff to understand how a cluster actually works.
By 2008, the cluster began to change. It was renamed WSFC (Windows Server Failover Clustering) and gained a cluster creation wizard, helping people create highly available clusters more quickly and easily. Creating a cluster used to take the better part of a day in the 2003 era; in the 2008 era it might take only 1-2 hours.
Lao Wang regards the 2008 era as a turning point for WSFC. In this era the cluster abandoned the 2003-era UI and replaced it with a new one, hid technical details such as the original cluster groups behind a seemingly simpler way of adding cluster roles, and introduced the new cluster validation report.
In 2008 R2, Microsoft shipped a large number of updates. For clustering, it launched the CSV (Cluster Shared Volumes) file system, which changed how cluster virtualization operates. In 2008, virtual machines on a cluster were handled as traditional cluster groups: if 10 virtual machines ran on one cluster disk and you wanted to migrate one of them, you could only migrate the other 9 along with it, because together they formed a single traditional cluster group. With CSV in 2008 R2 this changed: all nodes can read the CSV file system at the same time, and virtual machines got a new cluster group model that allows migrating one virtual machine at a time.
The 2012 era was a brilliant generation for WSFC. In 2012, Microsoft introduced the concept of dynamic voting, and the 2012 R2 era updated this to dynamic quorum: the cluster automatically adjusts node votes and the witness vote to ensure the cluster always has an odd number of votes. In the past this had to be achieved by the administrator's design; the cluster would not do it automatically.
At the same time, the 2012 era added many new properties for minority-vote and even-vote scenarios, such as LowerQuorumPriorityNodeID, which lets us choose which side shuts down when the two sides of a partition have the same number of votes. For cluster quorum and cluster node maintenance, the 2012 and 2012 R2 era introduced many great, very intelligent features, such as DrainOnShutdown and CAU (Cluster-Aware Updating).
To put it simply, clusters in the 2012 and 2012 R2 era became more intelligent on an already mature foundation: the cluster can perform part of its own maintenance and management to keep itself continuously highly available, and new intelligent features let administrators tailor cluster designs to business scenarios.
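To see dynamic voting in action, here is a small PowerShell sketch (assuming 2012 R2 or later) that inspects the cluster's dynamic quorum state:

```powershell
Import-Module FailoverClusters

# DynamicQuorum = 1 means the cluster adjusts votes automatically (the default).
(Get-Cluster).DynamicQuorum

# NodeWeight is the configured vote; DynamicWeight is the vote the cluster
# is actually counting right now.
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight

# 2012 R2+: whether the witness vote is currently being counted.
(Get-Cluster).WitnessDynamicWeight
```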
In the 2016 era, Lao Wang thinks the cluster moved closer to the cloud, using some cloud capabilities to assist the cluster, and it also supports the currently popular hyper-converged technology. In the 2016 era, cluster quorum can use a blob on Azure as the witness. Previously, when designing the witness location, architects might place it with the cluster at one of the sites, or separately at a third site; now, with a blob on Azure, we can save some cost and also benefit from the redundancy of Azure storage. The 2016 cluster also began to support SDS (Storage Spaces Direct) technology, similar to VSAN, which contributes the internal storage of multiple server nodes to form a clustered storage pool, and then runs clustered applications on top of that contributed pool.
To sum up, clustering in the 2016 era not only uses cloud capabilities to optimize the cluster but also keeps up with currently popular technologies, such as virtual machine resiliency to transient outages, rolling upgrades, SDS hyper-convergence, storage replication, stretch clustering, and so on. We can see that WSFC keeps improving with the pace of the times, constantly updating, and can meet the needs of more scenarios.
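For the Azure blob witness mentioned above, 2016 exposes a cloud witness quorum option; a minimal sketch, assuming a hypothetical storage account name and a placeholder access key:

```powershell
Import-Module FailoverClusters

# Configure a cloud witness backed by an Azure storage account (2016+).
# "contosowitness" and the access key below are placeholders.
Set-ClusterQuorum -CloudWitness `
    -AccountName "contosowitness" `
    -AccessKey   "<storage-account-access-key>"
```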
Above, Lao Wang has briefly introduced the development history of Microsoft failover clustering, which touches on some concepts such as cluster groups, quorum, and witness; I will explain these for you below. As for the new 2012 and 2016 features, I will also write blogs to discuss them with you when I have time. From here on, Microsoft failover cluster will be referred to as WSFC.
Before formally introducing the detailed cluster concepts, let's take a look at the hardware and software requirements of WSFC.
1. Ensure that all nodes can access shared storage with the same content, whether via SAS, iSCSI, FCoE, JBOD, RBOD, or SDS. The same shared storage must be accessible to every node in the cluster, so that in a failover other nodes can access resources from shared storage.
2. Make sure the cluster node OS is genuine, not pirated; otherwise you may encounter some strange problems.
3. Ensure that cluster nodes are domain members. Some changes have occurred since 2012: the 2012 era introduced an AD-detached cluster architecture, in which clustered applications do not have to create VCOs, but nodes are still required to join the domain. 2016 supports true workgroup clusters as well as cross-domain and cross-forest clusters.
4. Identical hardware across cluster nodes is not a mandatory requirement, but it is strongly recommended that every node use the same server hardware; otherwise, for example, virtual machines may fail to migrate. Lao Wang also suggests adopting a modular, scalable architecture for cluster node servers, making full use of high-availability and hardware offload technologies such as redundant switches, hot plugging, RAID, MPIO, NIC teaming (LBFO), ODX, RDMA, and RSS to eliminate single points of failure.
5. Make sure the account that creates the cluster has local administrator privileges on the cluster nodes, and also has certain write permissions in AD, or permissions pre-staged by the AD administrator.
6. Make sure at least one network card is available on each cluster node. Even with only one NIC you can install a cluster, but at least two are recommended. As mentioned in an earlier blog (portal: http://wzde2012.blog.51cto.com/6474289/1947451), if you do not virtualize it is best to have three NICs; for virtualization, four are recommended. The cluster's external management address can be IPv6 or IPv4, DHCP or static; Microsoft has imposed no strict restrictions since the 2008 era. We usually use static IP addresses, but DHCP can sometimes reduce downtime in multi-site scenarios, as will be mentioned in later blogs.
I would like to say a few more words about cluster network settings. For the cluster communication (heartbeat) NIC, it is recommended to disable NetBIOS registration and lookup. After doing so, the cluster name will only be resolved on the external network, and no NetBIOS or DNS resolution will be attempted on the heartbeat network, preventing interference.
It is also recommended to adjust the NIC binding order so that the cluster's external NIC has the highest priority, followed by the heartbeat or storage NICs. Microsoft has said that NIC order no longer matters after the 2012 era, but it is still recommended to follow this practice.
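Putting the requirements above into practice, here is a minimal sketch of validating and creating a two-node cluster from PowerShell; the node names, cluster name, and IP address are hypothetical:

```powershell
Import-Module FailoverClusters

# Validate the candidate nodes first; this produces the cluster
# validation report discussed later in this article.
Test-Cluster -Node node1, node2

# Create the cluster; this step creates the CNO described below.
New-Cluster -Name DEMOCLU -Node node1, node2 -StaticAddress 192.168.1.100

# Add a third node later if needed.
Add-ClusterNode -Cluster DEMOCLU -Name node3
```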
Regarding editions: in the 2003 era, only the Enterprise and Datacenter editions could install the MSCS cluster feature. In the 2008 era, likewise, only Enterprise and Datacenter could install the WSFC feature; Standard and Web editions could only install NLB clusters. Starting with 2008 there is also a Core installation, which can also install the cluster feature, but again only on Enterprise and Datacenter editions. From the 2012 era onward there are only Standard and Datacenter editions, and both have the same functionality, including the cluster feature, differing only in virtualized OS licensing; 2016 is the same as 2012 in this respect.
WSFC supports mixing virtual and physical nodes: one node can be a physical machine and another a virtual machine, all nodes can be virtual machines, or you can build a virtual cluster on top of a physical machine cluster. The WSFC cluster does not care whether nodes are physical or virtual, as long as the system, network, and storage configuration meets the cluster requirements.
The above are some hard requirements for cluster installation, along with recommendations. In practice I suggest consulting TechNet and following its documentation; Lao Wang has only selected the key installation requirements here.
With these key points as an introduction, we will now begin to introduce the working principles and detailed concepts of WSFC clusters.
First, let's look at a simple view of how WSFC failover works, to form a general impression.
1. First, deploy and configure the cluster nodes as required, ensuring that the cluster servers use redundancy technologies to eliminate single points of failure in server, network, and storage.
2. Ensure that all nodes in the cluster can access the shared storage.
3. Cluster applications write application data to cluster shared storage
4. The administrator adds a role or feature on the node 1 server. After it is added, node 1's cluster database records the new role and its associated information, and this information is later synchronized to node 2 and the cluster witness disk.
5. Cluster nodes perform network-wide handshake detection at the predetermined heartbeat frequency.
6. When the node 1 server suddenly shuts down, node 2's missed-heartbeat count reaches the threshold, and node 1 is judged to be offline.
7. After node 2 determines that node 1 is offline, it consults the synchronized cluster database to see which clustered applications node 1 was hosting, brings those applications online again with their associated IP addresses, and brings the cluster disk online on node 2.
8. Clients continue to access the cluster name and use the cluster service as usual, but the clustered applications originally on node 1 are now provided by node 2; the failover is complete.
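The sequence above can be observed, and a planned failover triggered, from PowerShell; a sketch with hypothetical role and node names:

```powershell
Import-Module FailoverClusters

# See which node currently owns each cluster group (the record keeping of step 4).
Get-ClusterGroup | Format-Table Name, OwnerNode, State

# Planned failover: move one role's whole cluster group to node2,
# the same movement that happens automatically in steps 6-7.
Move-ClusterGroup -Name "FileServer1" -Node node2

# After an unplanned failure, the cluster log helps reconstruct steps 5-7.
# (C:\Temp must already exist.)
Get-ClusterLog -TimeSpan 15 -Destination C:\Temp
```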
Let's sum up the two key points here.
1. Shared storage must be accessible to all nodes and must mount and unmount correctly, because a traditional clustered application's data must be written to shared storage, making shared storage the authoritative data source. When a node goes down, other nodes consult the cluster database, mount the shared storage, and bring the application back online, so access to shared storage must be guaranteed.
2. The cluster database is one of the main concepts in WSFC operation. It records the current state of clustered applications, for example: node 1 currently runs a DHCP role whose status is online, and a file server role whose status is offline. It also records state changes to the cluster configuration, cluster membership, and cluster resources (added, created, started, deleted, stopped, taken offline, and so on). The purpose of the cluster database is to let each node know what cluster services the others are running; once another node goes down, the surviving nodes connect to shared storage according to the state information in the cluster database and perform the failover.
When it comes to the detailed cluster concepts, Lao Wang would like to start with CNO and VCO, because after some consideration, explaining these two concepts first makes the later concepts easier to understand.
Well, what is a CNO? Recall that when Lao Wang wrote the introduction to clustering, he said that a cluster lets a group of computers work together so that they appear to be a single computer providing services. That apparent external computer is the CNO, the Cluster Name Object, also called the cluster management object. When we run the cluster creation wizard, we are prompted to enter a cluster name and a cluster IP address. The wizard then uses the account under which it runs to create a computer object in AD; the name of that computer object is the cluster name we entered, and a corresponding DNS record is also created. With a computer object, a DNS record, and an IP address, it looks just like a real computer.
The key point here is to ensure that the account running the cluster creation wizard has permission to create computer objects in AD. By default this permission needs to be granted at the AD domain level. Alternatively, the AD administrator can pre-stage a CNO computer object with the corresponding name, grant the account that will run the creation wizard full control over it, and add that account to the local administrators group on the cluster nodes. These are the permissions the account requires.
Once the CNO is created, it serves as the cluster management point: in the future we manage the cluster by entering the CNO cluster name directly. Besides being the management point, cluster roles that do not need their own VCO can use the CNO directly as their external access name. The CNO also has certain self-management capabilities: when it is created properly, it maintains the cluster roles' virtual computer objects (VCOs) and the VCOs' DNS records.
The so-called VCO exists because some applications running on top of a WSFC cluster need their own separate computer objects and names, such as SQL Server and the file server role; when they need Kerberos authentication, they need computer objects. When we add roles in the cluster, the VCO is actually created by the CNO in the same OU in AD. You therefore also need to make sure the CNO has permission to create computer objects in that OU; the VCOs' DNS records are likewise created by the CNO. If a cluster resource name cannot be connected to, check whether the CNO or VCO computer object is offline, or whether the CNO lacks permission to create DNS records or VCO objects.
When the DNS record of a CNO or VCO is registered with an IP address, it is bound to a node; that node is the host node for the CNO or VCO. For example, the host node of the CNO may be node 1 and the host node of a VCO may be node 2. When node 1 fails, the CNO's IP address and name fail over to node 2 and bind there; if node 1 comes back alive and node 2 then dies, the IP addresses and names of the CNO and VCO move back to node 1. "Bound" here can be understood as "hosted".
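To check the CNO and VCO state described above, a sketch assuming the ActiveDirectory module is available; DEMOCLU is a hypothetical cluster name:

```powershell
Import-Module FailoverClusters
Import-Module ActiveDirectory

# Network Name resources correspond to the CNO (cluster core) and VCOs (roles).
Get-ClusterResource |
    Where-Object ResourceType -eq "Network Name" |
    Format-Table Name, OwnerGroup, OwnerNode, State

# Verify the computer object exists and is enabled in AD.
Get-ADComputer -Identity DEMOCLU | Format-List Name, Enabled, DistinguishedName
```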
Having covered the concepts of CNO and VCO, the concept of the cluster group is relatively easier to understand.
A cluster group can be understood as the minimum failover unit of the cluster. In the 2003 era, after creating a cluster we could directly see how cluster groups worked; that is what I meant above when I said the 2003-era cluster makes it easier to understand how the cluster works. In the 2003 era, after a cluster was created there was a cluster group containing the cluster IP address, cluster name, and cluster witness disk. This is the most basic and important information the cluster needs: as long as these resources are alive, the cluster can provide services to the outside. This concept continues through 2016: after a cluster is created there is a default cluster group behind it, also known as the cluster core resource group, which can be understood as the parent of the other cluster roles; clustered applications can only be created once the cluster core resource group is up. When we open the cluster management console we can see a current host server; the machine hosting the core resource group is the cluster's host server.
When we add a role to the cluster, a traditional clustered application asks us to enter the application name, the application IP address, and an available cluster disk used to store the application's data. The application name, application IP address, and selected cluster disk together form a cluster group.
For example, suppose we have created a traditional file server role and want to fail it over as planned. Essentially the migration happens as a whole cluster group: the clustered application's IP address and DNS name are transferred from node 1 to node 2, and the shared disk mounted on node 1 is mounted on node 2.
In a traditional cluster group, a single cluster disk can usually be occupied by only one clustered application. For example, if the file server uses a cluster disk, other clustered applications cannot reuse that disk. Moreover, if you want to migrate, you can only migrate the whole cluster group.
This is very inconvenient for virtual machines. Since a clustered application can use only one cluster disk, when clusters were virtualized in the 2008 era, virtual machines created in one cluster group all lived on the same cluster disk; when you wanted to migrate one of them, you had to migrate all the other virtual machines in the same cluster group along with it. Microsoft soon realized that this traditional cluster group approach was not well suited to virtualization, so in 2008 R2 it launched the CSV cluster file system and changed how cluster virtual machines operate: it abandoned the traditional cluster group failover practice and gave virtual machines on the cluster a new cluster group model. Unlike a traditional cluster group, the new virtual machine cluster group contains only the information specific to a single virtual machine, such as its configuration, its disks, and its state. Each planned or unplanned migration of a virtual machine involves only this information, and there is no longer any need to migrate all virtual machines together.
That is a brief introduction to cluster groups. Just remember that the cluster group is the smallest failover unit of the cluster: when we migrate a traditional clustered application, planned or unplanned, the application's entire cluster group is actually migrated to another node for hosting. For virtual machines since 2008 R2, we can migrate the cluster group containing only a single virtual machine's information.
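The CSV change described above can be seen from PowerShell; a sketch (2008 R2 or later, disk name hypothetical) that converts a cluster disk into a Cluster Shared Volume:

```powershell
Import-Module FailoverClusters

# Traditional cluster disks are exclusive to one cluster group at a time.
Get-ClusterResource | Where-Object ResourceType -eq "Physical Disk"

# Add a disk to CSV; it then appears under C:\ClusterStorage on every node,
# letting each VM live in its own cluster group.
Add-ClusterSharedVolume -Name "Cluster Disk 2"

Get-ClusterSharedVolume | Format-Table Name, State
```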
Having covered cluster groups, let's look at the cluster validation report. In the 2003 era there was no such thing; 2003 only produced an installation log after the cluster installation completed. With 2008 the cluster switched to a new UI and the cluster validation report was introduced. Today the cluster validation report is also a troubleshooting tool for many Microsoft engineers.
To put it simply, what is a cluster validation report? It can be understood as the cluster's private doctor. When we create a cluster, it is strongly recommended to run a validation report, which diagnoses, from the perspectives of system configuration, network, storage, and so on, whether the current environment is suitable for creating a cluster. Conditions that are already satisfied are shown with a green check mark. Conditions that do not follow Microsoft best practices but will not prevent the cluster from being created are shown with an exclamation mark. Conditions that do not meet the cluster's requirements and would cause cluster creation to fail are shown with a red cross and must be fixed.
Usually, Lao Wang recommends running the cluster validation report when creating a cluster, and also whenever the cluster environment changes, such as changes to the network or storage or the addition of new nodes, to ensure the changes will not cause failures in the existing cluster.
Special attention should be paid to storage in the cluster validation report. When running the validation we can choose what to validate: system configuration, network, storage, and so on. Validating storage may take cluster disks temporarily offline, and if business is running on them there may be a brief outage. So be careful when you check the storage tests in the validation; if you leave storage validation unchecked, it will not cause downtime, and everything else runs normally.
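A sketch of running the validation from PowerShell while skipping the potentially disruptive storage tests:

```powershell
Import-Module FailoverClusters

# Full validation on a new, not-yet-clustered pair of nodes.
Test-Cluster -Node node1, node2

# On a production cluster, skip the storage tests to avoid taking
# cluster disks offline during validation.
Test-Cluster -Ignore "Storage"
```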
The cluster validation report is the cluster's important private doctor. It tells us what is correct, what is incorrect, and what can be improved. When we open a case with Microsoft about a cluster problem, the Microsoft support engineer may first ask you to generate and export the cluster validation report and send him a copy. The validation report checks a great deal of cluster content; friends interested in WSFC should take a good look at it when they have time.
Next comes the top priority of this article: the concepts of quorum, witness, and voting in WSFC. Frankly speaking, Lao Wang was confused by these concepts when he first worked on Microsoft products; he could make things work but could never figure out their true meaning. I believe many friends who have just entered the Microsoft world may have the same problem, so Lao Wang will try to explain these things clearly.
Before the concepts of witness and voting, let's look at quorum. Why on earth does the cluster need quorum? Have you ever thought about it? At first, Lao Wang was always confused about the role of quorum; after a period of study and research, I have a little experience of my own.
Therefore, Lao Wang believes the first role of quorum is to determine, according to the predetermined quorum model, whether the cluster can survive when the state of cluster nodes changes, and to shut down the cluster when it cannot.
The second function is to keep the partition with the majority of nodes running normally when the cluster is partitioned. For example, suppose the cluster spans two sites, Beijing and Guangzhou, with three cluster nodes in Beijing and two in Guangzhou, and the quorum model is Node Majority. Now the network between Beijing and Guangzhou fails and Beijing cannot reach Guangzhou. Quorum will judge: the cluster has two sites, and the Beijing site still has three nodes; that is the majority, so it can survive. The three Beijing nodes re-form the cluster; when the Guangzhou nodes' network recovers, the cluster is re-formed and continues.
Besides maintaining a majority, quorum also has to deal with split-brain. For example, the cluster has two sites, Beijing and Guangzhou, with two nodes each, using the Node Majority quorum model. If the network between Beijing and Guangzhou fails, the nodes in Beijing and Guangzhou will each try to write data to the shared disk; each side thinks it is the one alive and is the master, so cluster data can be corrupted and the cluster stops working properly. This situation was often encountered in the 2008 R2 era: the two sites have the same number of nodes and suddenly cannot communicate, or a witness disk is used but cannot be reached, which leads to this kind of split-brain.
Therefore, the purpose of quorum is to always ensure that, when the cluster partitions, the majority of nodes keep operating and the minority side shuts down. In other words, we should always keep the cluster's vote count odd, because once the votes are even, split-brain can happen. How do we keep the vote count always odd? On one hand, we can use the cluster's built-in technology; on the other hand, the architect's design must be correct: with four nodes you must design a disk witness or file share witness, otherwise split-brain is imminent.
So what is a witness? You may often hear of a quorum disk or witness disk. The so-called witness is a technology introduced in the WSFC 2008 era to help us avoid split-brain with an even number of nodes.
By default, each node in the cluster has its own vote. When a node passes network health detection, its connection to shared storage is normal, and its system is available, that node's vote is valid. Suppose we have a four-node cluster: if every node works properly there are four votes; across two sites, that is two votes per site. When a network partition occurs, the cluster would split-brain, but if a witness disk is introduced, it also carries a vote. When a 50/50 partition occurs, whichever side can first establish a connection to the witness disk gains its vote, wins, and continues to run the cluster; the side that cannot reach the witness disk is shut down.
Disk witness and file share witness serve the same purpose; both were originally introduced by Microsoft to address split-brain. They can solve part of the split-brain problem, although split-brain can sometimes still occur. And although they serve the same purpose, there are differences between them: both can handle split-brain, but in a partition in time (nodes failing one after another), a disk witness copes well while a file share witness does not. For example, nodes 1 and 2 use a file share witness. A DHCP role is added on node 1, then node 1 goes down; with only node 2 alive we add a file server role, then node 2 goes down and we power node 1 back on. You will see that node 1 cannot start the cluster and provide services.
The reason is that node 1 does not have the latest cluster database. A file share witness holds no copy of the cluster database, so the cluster detects that node 1's database is stale and prevents the node from starting. Not so with a disk witness: the witness disk also holds a copy of the cluster database, and the cluster nodes synchronize their database to it. With a disk witness, node 1 would have the chance to contact the witness disk, synchronize the latest cluster database from it, and then bring cluster resources online. Therefore, if you can choose a disk witness, try to choose a disk witness; the disk witness is the golden rule!
In the 2012 era, the cluster made an important update, introducing the intelligent mechanisms of dynamic voting and dynamic witness. To put it simply, you can use a disk witness with 4 nodes or with 3 nodes, because the cluster dynamically adjusts the vote count for you. For example, if the cluster has 3 nodes plus a witness disk, the witness disk's vote is automatically removed and the cluster has 3 votes; if we add another node, making four, the witness's vote is added back and the cluster has 5 votes. So from the 2012 era onward, it is always recommended to configure a disk witness for a WSFC cluster, whether the node count is odd or even, because the cluster automatically adjusts the votes; dynamic quorum can handle almost 80% of split-brain scenarios for us.
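Configuring the recommended disk witness and watching the dynamic vote adjustment; a sketch, with a hypothetical witness disk name (2012 R2 or later for WitnessDynamicWeight):

```powershell
Import-Module FailoverClusters

# Follow the "disk witness is the golden rule" advice.
Set-ClusterQuorum -DiskWitness "Cluster Disk 1"

# With 3 nodes + witness, expect the witness vote to be dropped (0);
# add a 4th node and it is counted again (1).
Get-ClusterNode | Format-Table Name, DynamicWeight
(Get-Cluster).WitnessDynamicWeight
```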
In some extreme cases, for example when the witness disk at the third site cannot be reached, we can also handle a 50/50 split-brain by forcing quorum, or with technologies such as LowerQuorumPriorityNodeID. Forced start can also bring up the cluster in a minority partition. For example, in a 5-node cluster, four nodes are at site 1; site 1 crashes and site 2 is still alive. By default the quorum algorithm does not allow a minority of votes to provide services, but sometimes, however few nodes remain, we just need services up; in that case we can use forced start to bring up the site 2 node to provide services. So the current WSFC can handle split-brain, 50/50 partitions, and minority-survival scenarios quite well.
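A sketch of the forced-start technique mentioned above, run on the surviving minority node (node name hypothetical); -FixQuorum is the PowerShell equivalent of starting the cluster service with /fq:

```powershell
Import-Module FailoverClusters

# On the surviving site-2 node: start the cluster service with forced quorum,
# overriding the majority requirement. Use only when the majority site is lost.
Start-ClusterNode -Name node5 -FixQuorum

# When the other site returns (2012 R2+), it detects the forced quorum
# and defers to it rather than forming its own cluster.
Get-ClusterNode | Format-Table Name, State, DynamicWeight
```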
Taking the 2008 R2 cluster, the most widely used in China, as an example, 2008 R2 has four WSFC quorum models.
Node Majority: each cluster node has its own vote, and a majority of votes is required for the cluster to operate normally. For example, 5 nodes need at least 3 valid votes, and 3 nodes need at least 2. Below the required valid votes, the cluster by default stops providing services.
Node and Disk Majority: each node has its own vote, and the witness disk also has one vote. While the witness disk is alive, up to half of the cluster nodes may fail; for example, with 6 nodes and a live witness disk, 3 nodes can fail and the cluster still survives thanks to the witness disk's extra vote. When the witness disk is down, a majority of node votes is required for the cluster to survive; for example, with the witness disk down, a 5-node cluster can lose at most 2 nodes, keeping at least 3 valid votes.
Node and File Share Majority: works like Node and Disk Majority, except the witness is a file share rather than a disk; as mentioned above, a file share witness holds no copy of the cluster database.
Disk Only: in this quorum model, only the witness disk has a vote. All nodes try to acquire the witness disk, and when a partition occurs, whichever node can connect to the witness disk survives. For example, in an 8-node cluster, even if 7 nodes fail and only one remains, as long as it can reach the disk the cluster survives, and clustered applications are hosted on that node as far as possible. This model, introduced in 2003, has the advantage that the cluster can survive with a single node as long as the disk is available; the disadvantage is that the disk becomes a single point of failure, and Microsoft no longer recommends this quorum model after 2003.
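These quorum models map directly onto Set-ClusterQuorum switches; a sketch using the 2008 R2-era model names, with a hypothetical witness resource and share path:

```powershell
Import-Module FailoverClusters

# Node Majority: nodes only, best with an odd node count.
Set-ClusterQuorum -NodeMajority

# Node and Disk Majority: node votes plus a witness disk vote.
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"

# Node and File Share Majority: the witness is a file share (no DB copy).
Set-ClusterQuorum -NodeAndFileShareMajority "\\fileserver\witness"

# Disk Only: legacy model, the disk is the single vote (not recommended).
Set-ClusterQuorum -DiskOnly "Cluster Disk 1"
```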
Above we have introduced quorum, witness, voting, and the quorum models; let's string these concepts together and see whether they are easier to understand.
The cluster needs quorum to determine whether it can survive, and quorum evaluates the cluster's votes according to the requirements of the quorum model. When the vote count is at or above the minimum the quorum model allows, quorum decides the cluster can survive; when it falls below that minimum, the cluster shuts down. When a split-brain partition occurs, quorum uses the witness's vote to select one partition to continue providing services while the other side shuts down.
Lao Wang has also thought of a vivid example to illustrate quorum and voting. Let's use a restaurant as an example.
Suppose that, to open for business, the restaurant must have at least three waiters on duty.
After registering with the industry and commerce bureau, Lao Wang went back, opened the shop, and hung up a sign; the sign is the cluster's external name, and when people see it they come in to eat. Every employee clocks in automatically when coming to work: I am healthy today, I can communicate with the others normally, I can work; this is voting. Then the waiter Xiao Li asked for leave, and the remaining four waiters could still work. Then another waiter, Xiao Huang, was getting married and asked for leave; three waiters remained, who could also still work, though each did a little more than before. Then Xiao Hong asked for leave too: manager, I am sick and cannot come to work. Now only two waiters were left. At this point the industry and commerce bureau immediately stepped in: your restaurant committed to at least three people providing service, and now only two remain; you cannot stay open, and must close until someone comes back. The restaurant is now closed; the interior is untouched, but for the time being customers cannot come in to eat. Once the waiters are all back, service to the outside can resume.
I do not know whether this example is clear: quorum can be understood as an agreement you signed with the cluster, and the cluster abides by this agreement to judge whether your cluster can survive.
As for split-brain: suppose the Sincerity restaurant has branches in Beijing and Tianjin, and both branches must report to the industry and commerce bureau; only a branch that has reported may serve customers normally. Under normal circumstances the bureau deals with only one branch, and only when something is wrong with that branch does the other branch report instead. The Tianjin branch has two waiters, and the Beijing branch also has two. Normally they communicate by phone every day: is it your turn or mine to report today?
Suddenly one day the phone line broke down. Beijing could not reach Tianjin, and Tianjin could not reach Beijing. What to do? Reports still had to be made, so each side thought: I should report, I should provide services. Both sides called the bureau to report, but the reports were inconsistent. The bureau leaders heard from both Beijing and Tianjin and were bewildered: what on earth is going on in your restaurant, which of you is open today? This is split-brain.
With a witness, it is as if the two branches made a deal with the bureau: leader, under normal circumstances we confirm by phone who reports to you; once the phone breaks down, let the fat man decide. See this fat man? He is our company's expert; just listen to whichever branch he is in. The bureau leader agreed. Two days later the phone line broke again, but the fat man was in Beijing, so the bureau said: the fat man is in your branch, so your branch provides services to the outside first; Tianjin waits until its line is repaired.
Forced quorum is as if the line between the Beijing and Tianjin branches failed yet again, but the Beijing branch went straight to the bureau leaders: leaders, our side is more important today, please listen to us. The bureau leader said: well, since you are in a hurry, today we listen to you. After the Beijing branch finished reporting, it could provide services normally, and the bureau later notified Tianjin: today I listened to Beijing; you take a day off tomorrow.
From 2012 R2 onward, once this kind of forced quorum occurs, there is no need to notify Tianjin not to open tomorrow. After 2012 R2, once Tianjin detects that forced quorum has taken place on the Beijing side, Tianjin will not start automatically; it knows that a forced quorum operation is already in effect in Beijing, and that it should abide by that operation.
With the LowerQuorumPriorityNodeID property, or the preferred site property in 2016, it is as if the bureau laid down the rules in advance: when Beijing and Tianjin each have two people and the fat man is absent, whom should I listen to? For example, if LowerQuorumPriorityNodeID is set to a Tianjin node, then when split-brain occurs, the Beijing side is automatically promoted to provide services to the outside.
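A sketch of pre-deciding the losing side, as in the analogy above; node and site names are hypothetical (LowerQuorumPriorityNodeID requires 2012 R2, site fault domains require 2016):

```powershell
Import-Module FailoverClusters

# 2012 R2: in a 50/50 tie with no reachable witness, the partition containing
# this node loses; here the "Tianjin" node is marked lower priority.
(Get-Cluster).LowerQuorumPriorityNodeID = (Get-ClusterNode -Name "tianjin-node1").Id

# 2016: declare sites, assign nodes to them, and prefer one site for ties.
New-ClusterFaultDomain -Name "Beijing" -Type Site
New-ClusterFaultDomain -Name "Tianjin" -Type Site
Set-ClusterFaultDomain -Name "beijing-node1" -Parent "Beijing"
Set-ClusterFaultDomain -Name "tianjin-node1" -Parent "Tianjin"
(Get-Cluster).PreferredSite = "Beijing"
```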
The above is Lao Wang's brief introduction to the basics of WSFC; I hope every friend has gained something. The theoretical foundation is now complete. In the next part, Lao Wang will use a 2008 R2 environment to demonstrate how cluster quorum really behaves and the correct ways to handle it.