
WSFC2016 On Azure


In this article, Lao Wang will introduce how to run WSFC on Azure, along with some points that need attention.

There are several reasons for writing this article.

1. To dispel the myth that clusters cannot run on public cloud platforms.

2. To share some thinking about running clusters on public cloud platforms.

3. To introduce cloud S2D and the cloud witness (quorum) implemented by WSFC 2016 with the help of Azure.

First of all, can WSFC clusters run on public cloud platforms? The answer is yes. Theoretically, as long as we can meet the requirements for building a cluster, we can deploy clusters on any private, public, or hybrid cloud platform.

Think about the most basic prerequisites we need to prepare when building a cluster:

1. Ensure that an activated, healthy operating system is available

2. Ensure good network quality between nodes

3. Ensure that shared storage can be seen between nodes, or use third-party replication to achieve an effect similar to shared storage

To achieve the latter, the third-party plug-in must work with the cluster, so that a disk replicated by the plug-in can be recognized by the cluster as a cluster disk. In addition, if you want to run on a public cloud or other cloud platform, make sure that the third-party replication plug-in is supported by that platform.

Once the prerequisites have been confirmed, we can evaluate the various public cloud platforms to see whether they can meet the needs of the cluster.

First, the operating system. The cluster requires an activated, healthy, and stable OS, so you need to ensure that the OS provided by the public cloud is healthy enough that system problems will not cause cluster instability.

Also ensure that public cloud cluster nodes can have their machine names changed and be assigned static IPs without affecting normal OS operation.

Network

Cluster nodes can get by without multiple network cards, but there must be at least one, which will carry business traffic, CSV traffic, and cluster communication traffic, and which the cluster will use for health detection.

Therefore, when we think about deploying clusters on a public cloud platform, the most important point is that the virtual machines on the public cloud can communicate with each other normally through the network card, and the network quality must be stable and efficient, because all the traffic rides on one card. The best practice is to use multiple network cards for public cloud cluster nodes, and it is even better if the performance of the public cloud virtual machine NIC can be optimized.

Storage

In the traditional view, the cluster must access shared storage and the application must write its data there. As applications and third-party plug-ins have evolved, applications have begun to support using local disk + replication to achieve a cluster disk, or a high-availability effect.

One very important point about running the cluster on the public cloud: we are running a Guest Cluster inside VMs on someone else's platform.

This means that we have no access to the underlying storage architecture such as FC, FCoE, iSCSI, JBOD, or RBOD, and no right to control how storage is allocated, so what kind of shared storage a virtual machine can use depends entirely on what the public cloud vendor exposes, which is usually not much. Some conscientious vendors may open up technologies similar to Shared VHDX, where the underlying virtualization mounts one virtual disk to multiple nodes at the same time, or provide an iSCSI gateway that securely presents an iSCSI target to the virtual machine nodes.

If the vendor does not provide such a solution, we need to use a third-party product to replicate local disks across nodes so that the cluster treats them as cluster disks; as mentioned before, this requires that the cluster supports the plug-in and that the public cloud platform allows the plug-in to be deployed.

Authentication

If you plan to deploy the cluster as a workgroup model, you do not need to consider the public cloud platform's support for AD domains

If you plan to deploy the cluster using the AD domain model, make sure the public cloud vendor supports running AD domain controllers in virtual machines, that the installed AD domain can provide domain services normally in the public cloud environment, and that the AD domain controller will not be prevented from running normally by the platform's disk cache settings or network behavior.

Once the cluster is deployed in the AD domain model, it will need to write the cluster CNO and VCO objects, so the public cloud must support virtual machines running the AD domain architecture, the AD domain virtual machines must be able to communicate normally with the cluster node virtual machines on the public cloud, and the cluster nodes' AD authentication requests and CNO/VCO read and write requests must be able to complete normally.

If the above four basic requirements can be met, it is feasible for the cluster to run on the public cloud. Then we can think about the next step: do we really need to deploy the cluster in the public cloud, in what scenarios should we do so, and what benefits can the cluster gain from being deployed in the public cloud?

First of all, let's take a look at the benefits that we know public clouds can bring to us.

1. Large enough scale and abundant resources, so we can deploy as many nodes as we need

2. We only pay for the resources we actually use; unused resources cost nothing.

3. The public cloud platform will be responsible for maintaining the underlying facilities such as servers, storage, networks, and so on, and we do not need to spend energy and financial resources to maintain these facilities.

4. Public cloud platforms usually define a Region mechanism: if the storage, VM, and network resources are all in the same Region, they will be placed as close together as possible to achieve better performance.

5. Public cloud platforms usually have complete failover and disaster recovery plans and guarantee an SLA when we purchase services; as users we can get compensation if the SLA is violated. For public cloud storage disks, we usually get multiple replicas, replicated either within a Region or across Regions. For the network, by default traffic stays as close as possible to the VM and storage and uses the optimized links within the Region; if a link in the Region has a problem, the public cloud vendor can usually switch it quickly. For virtual machine resources, if we place virtual machines in different Region fault domains or different rack fault domains, we can ensure that resources in different racks or Regions are not maintained at the same time when the public cloud platform performs maintenance.

So, since the public cloud itself already has storage replication, redundant network links, and fault domain capabilities for virtual machines, and it all sounds so perfect, why not just run applications on virtual machines directly? Why run WSFC on a public cloud at all?

In fact, it is not that simple. From the most fundamental point of view, public cloud vendors need to guarantee the SLA of the IaaS architecture, to ensure that your subscription does not violate the SLA and that they are not asked for compensation. That is the vendor's perspective: they stand on the operations side to ensure that the hardware of their cloud platform serves normally and that maintenance does not cause downtime. The application runs inside your virtual machine, and the high availability of your application has nothing to do with the public cloud vendor; you need to care about it and choose the right solution.

Here Lao Wang is only talking about the IaaS scenario. If the cloud platform provides PaaS-level services that meet your needs, for example a SQL PaaS service, then you may not need to deploy WSFC on the public cloud, because the reason we deploy WSFC is that we care about the applications inside our VMs continuing to provide services. If my application is stateful and deployed on only one VM, then when that VM breaks, how its state and the application itself get migrated to another node to keep running is not something the public cloud vendor cares about at the IaaS level. The vendor only guarantees that the network, storage, and servers will not fail in a way that violates the SLA, and that maintenance or failures will not break the SLA. But if stateful applications run in your VMs and you want them to keep providing services even when a VM fails, then you need to deploy a cluster architecture on top of IaaS.

If you think the PaaS services provided by the public cloud vendor can meet your needs and you accept this new way of thinking, then for something like SQL PaaS, which is an application-level cloud service, the public cloud vendor is responsible for everything from the underlying hardware up through the system to the high availability of the application. For example, when public cloud vendors provide PaaS database services, they usually offer sharding, clustering, availability groups, and other technologies to help ensure the high availability of the application.

If you do not trust PaaS, cannot see inside it, do not know how it fails over, and are not familiar with it, and you want a familiar model while still caring about keeping your application continuously available, then on IaaS you have little choice but to deploy WSFC.

Therefore, WSFC running on a public cloud platform makes sense in the following two situations:

1. You have purchased the public cloud's IaaS service, there are applications inside the virtual machines, and you care about the continuity of those applications

2. You are not willing to accept the cloud platform's PaaS services; you want more control and to manage the high availability of applications in a way you are familiar with.

There are also several typical cloud-optimized scenarios for WSFC running on a public cloud platform:

1. Development and testing: deploy the production WSFC locally and stand up an identical WSFC environment in the cloud for development, testing, and debugging

2. Disaster recovery: WSFC deploys some nodes locally and some nodes in the cloud; under normal operation the local side runs the workload and the cloud nodes have no vote, and when the local side fails the workload fails over directly to the public cloud. This scenario requires a high-quality network link between the local side and the public cloud

3. Cloud burst: deploy a small number of WSFC nodes locally for daily use. Once usage explodes and the local nodes can no longer cope, you can burst to the cloud and extend the cluster directly to public cloud nodes, use them only during the burst, and fall back to the local nodes afterwards, so you are billed only for the burst period.

Through the above introduction, I believe you should know

1. The benefits of deploying WSFC on the public cloud

2. The scenarios in which WSFC needs to be deployed on the public cloud

3. The typical scenarios in which WSFC runs on the public cloud

4. The points WSFC needs to pay attention to when deployed on the public cloud

It doesn't matter if you don't understand it all yet. Let's take WSFC on the Azure public cloud platform as an example and work through what was discussed above in an experiment.

Feasibility evaluation of WSFC On Azure

1. Operating system

Azure has many built-in virtual machine templates. After deployment the virtual machine is activated, the machine name can be changed, and the templates are optimized and can be trusted.

2. Network

Azure networking uses a VNET architecture. Cloud virtual machines in the same VNET can communicate normally, and fixed IPs are supported for virtual machines. Note, however, that Azure virtual machines are created with a single network card by default. For cluster best practice we may want multiple network cards, and adding network cards to Azure virtual machines that are already created and running is very troublesome. Therefore, if you want WSFC nodes to follow the multi-NIC best practice, it is best to decide at the planning stage so that the Azure virtual machines are created with multiple network cards.
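As a rough illustration of planning multiple NICs up front, here is a hedged sketch using the classic (service management) Azure PowerShell module; the cloud service, VNET, subnet, image, password, and IP values are hypothetical, and multi-NIC virtual machines require the larger VM sizes.

# Hedged sketch with the classic Azure PowerShell module; names, image and addresses are illustrative
$vm = New-AzureVMConfig -Name "node1" -InstanceSize "Large" -ImageName "<Windows Server 2016 image>" |
    Add-AzureProvisioningConfig -Windows -AdminUsername "cluadmin" -Password "<password>" |
    Set-AzureSubnet -SubnetNames "Subnet-1" |
    Add-AzureNetworkInterfaceConfig -Name "ClusterNic" -SubnetName "Subnet-1" -StaticVNetIPAddress "10.0.0.15"
New-AzureVM -ServiceName "<cloud service>" -VNetName "<vnet name>" -VMs $vm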

3. Storage

Before WSFC 2016, if you wanted to take WSFC to Azure (WSFC 2008 R2 and later), Azure did not expose shared storage such as iSCSI, SAS, or SCSI to Guest VM-level WSFC, there was no way to allocate the underlying storage directly to virtual machines, and Shared VHDX technology was nowhere to be found. In other words, Azure itself does not support shared storage for Guest VM WSFC: Azure virtual machines hold exclusive locks on each of their virtual disks, disks based on Azure Blob storage do not support persistent reservations, and running an iSCSI Server inside a virtual machine to serve WSFC directly is not supported either. So before WSFC 2016, if you wanted to cluster on Azure, the only options for cluster storage were these:

> Run WSFC on Azure, but only with cluster applications that do not require shared storage; HA is achieved through application-level replication of each virtual machine's attached local disks, such as SQL AG

> Use a third-party solution: install plug-ins on the WSFC nodes, replicate each node's local disks through the plug-in, and wrap them so they appear as cluster disks the cluster can recognize. Products in this space include SIOS DataKeeper and StarWind.

> Use ExpressRoute to connect the local site and the public cloud, and expose local iSCSI to the public cloud WSFC nodes through a secure channel

You can see that the scenarios for running WSFC on Azure were very limited, because most cluster applications still need shared storage to write data to in order to fail over. Applications like SQL AG that replicate at the application level are still a minority, not everyone is familiar with ExpressRoute, and ExpressRoute is very expensive and not every company can afford it. As a result, one batch of users gave up because they were not sure WSFC could run on the public cloud at all, and another batch got WSFC running on Azure but abandoned it because shared storage could not be exposed to the virtual machines and their cluster applications could not run. In practice, before WSFC 2016 very few people ran WSFC on Azure.

With WSFC 2016 came a number of new cloud-optimized features that give WSFC on Azure new possibilities. The technologies WSFC 2016 can use together with Azure are Storage Spaces Direct, Storage Replica, and Cloud Witness, of which Storage Spaces Direct, hereafter S2D, helps WSFC on Azure the most. It is an exciting feature: simply put, with S2D the cluster no longer needs shared storage; you can use the local disks of each node, contributing them to form a cluster-wide storage pool. This storage pool can do SSD/HDD/NVMe tiered storage, or be all-SSD or all-NVMe. On top of the cluster storage pool you then build cluster storage spaces, that is, virtual disks; each virtual disk can have its own fault-tolerance strategy (simple, mirror, parity), and storage QoS can be defined at the node and disk level. This is a new milestone in Microsoft's push into software-defined storage.

Now that we have mentioned S2D, we also have to mention the concepts of storage pool, storage space, cluster storage pool, JBOD, and SOFS; this is also an opportunity to catch up on cluster storage fundamentals.

Cluster storage in the traditional sense needs little introduction: through SAN, iSCSI, and similar technologies, the target is presented over multipath to multiple node servers so that all nodes can recognize the disk.

So what are storage pools and storage spaces? Simply put, Lao Wang sees them as Microsoft's basic implementation of storage virtualization: through OS-level functionality working with the hardware, you can use cheap shared SAS, JBOD, or RBOD solutions to build reliable application storage, or even cluster storage, with the whole process defined entirely at the Windows software level.

Picture the architecture of a traditional SAN. At the bottom there is a collection of disks; above it, a controller layer with its own CPUs centrally manages the physical disks. At the controller layer the storage device can usually control disk fault-tolerance policy, deduplication, thin provisioning, storage tiering, cache settings, and a series of other storage management operations. Finally there is the connection adapter layer: after storage is allocated to a node, the node connects to it over multiple paths through iSCSI, FC, or FCoE.

Microsoft's concept of storage pools and storage spaces is that, starting with Windows Server 2012, the system has built-in support for all three layers: defining the disk set, storage control, and final delivery. The traditional storage functions we just described can all be realized in 2012: the disk set corresponds to the storage pool, the storage controller corresponds to the storage space, and the connection adapter corresponds to SMB. Starting in 2012 the SMB protocol was also optimized: SMB Multichannel automatically aggregates multiple paths, and SMB transfers can use hardware technologies such as RDMA and RSS to optimize performance. Microsoft's vision for storage pools and storage spaces is to replace expensive traditional storage; using only shared SAS and cheap JBOD or RBOD, you can get the advanced functions of an array implemented in the operating system.
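To make the pool / space / delivery mapping concrete, here is a minimal single-server sketch using the in-box Storage cmdlets; the pool and virtual disk names are made up for illustration.

# Collect poolable disks, build a storage pool, then carve a mirrored storage space (virtual disk) out of it
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "Pool01" -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $disks
New-VirtualDisk -StoragePoolFriendlyName "Pool01" -FriendlyName "VDisk01" -ResiliencySettingName Mirror -UseMaximumSize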

Storage pools and storage spaces are nice on their own: plug the machine into a JBOD and configure resiliency, deduplication, tiering, and thin provisioning at the system level. That looks cool, but on a standalone machine the value of the scenario is limited. To really see the value of storage pools and storage spaces, build the architecture inside a cluster. Starting with WSFC 2012 we can deploy a cluster storage pool on top of the cluster and then build storage spaces on top of the cluster storage pool. When it lands, the so-called storage space corresponds to a virtual disk; when creating each virtual disk you can choose its fault-tolerance scheme, thin provisioning, and other settings.

Once we have deployed the cluster storage pool and cluster storage space architecture, our storage architecture changes: it is equivalent to clustering the controller layer of the storage. The storage pool and storage space we configure on the cluster, along with the disk fault-tolerance policy, deduplication, thin provisioning, and storage tiering we set, will survive even when one of the nodes goes down; we have achieved high availability of the storage controller.

At the same time, a storage space virtual disk built on the clustered storage pool appears directly under the cluster's disks after it is created. Because the controller is clustered, the virtual disk can be seen by the disk manager of every node and is supported as a cluster disk.

In this way we can realize the Cluster in a Box scenario: my cluster is deployed inside one box containing two compute nodes and shared SAS, or bound to a JBOD. We do not need shared storage; using JBOD or shared SAS we can build a cluster and bring it up, with the cluster disks built from the nodes' cluster storage pool and cluster storage space. Ideally there is also RDMA to build an SMB3 RDMA multichannel architecture, or a single JBOD can be connected independently to the two cluster nodes. Either way, we can achieve clustering with cheap storage combined with 2012 storage virtualization technology.

So what is SOFS? Many people think SOFS is just a bundle of storage spaces and storage pools; in fact, after Lao Wang's introduction you will find they are not the same thing at all.

What a storage pool and storage space ultimately deliver is a disk, a disk that can be seen among the cluster disks. The cluster does not care about the underlying storage architecture, whether it is iSCSI, SAN, or a cluster storage pool; either way it ends up with a qualified cluster disk that all nodes can see.

SOFS, on the other hand, does not care how you provide the disk, because SOFS simply wants to run a share on top of a CSV. SOFS only recognizes CSV; the underlying layer can be SAN, iSCSI, or storage spaces. SOFS only cares whether you can provide a healthy CSV. In other words, even on a traditional SAN architecture we can still have SOFS.

So what does SOFS do? Bluntly, it is a continuously available storage scheme realized by SMB technology combined with clustering, CSV, and DNS round robin. When we create a cluster file server and choose "Scale-Out File Server for application data", we get a SOFS; after that we create a share and get a UNC path. If a cluster role supports UNC paths, it can use SOFS. Currently Hyper-V and SQL can use the SOFS UNC path to write application data into the SOFS share, and with an application that is SOFS-aware we get continuously available storage connections.

Because SOFS performs transparent failover, when we use SOFS we do not access a traditional virtual cluster IP; the SOFS UNC name resolves to the local IP of each node. When an application accesses the SOFS path, it is actually relying on DNS round robin: the file server behind the SOFS UNC path is active-active, all nodes can host connections, and each time the most appropriate node is selected to provide storage services. Starting with 2012 R2, SOFS can also balance load per share, for example NODE1 serving \\SOFS\Share1 and NODE2 serving \\SOFS\Share2.

In addition, SOFS is called a Scale-Out File Server because it supports adding servers that automatically join the SMB load distribution. For example, if two nodes currently serve a SOFS with three shares, one node has to handle two shares; add a third node and each node handles one share, achieving scale-out load balancing.

Essentially, the point of using SOFS is its scale-out load balancing and its transparent failover capability. In the WSFC 2012 era SOFS was not yet optimized: if a virtual machine was stored on an SMB path and the file server node providing that path lost power, all the virtual machines on it would crash. SOFS later changed this. Each time an application accesses SOFS, the SOFS witness service tracks which node the user is connected to and follows the application's SMB client session; if that node goes down, the session is transferred directly and transparently to another node. For Hyper-V and SQL the user feels no interruption. That is the core: on top of DNS round robin and the cluster, SOFS implements a cluster-based witness detection engine that keeps users' SMB client sessions alive.

Above, Lao Wang briefly reviewed the new cluster storage concepts introduced since 2012. Now for the highlight: S2D, the most important thing to understand before we implement WSFC on Azure.

Simply put, you can think of the S2D architecture as an upgraded version of the 2012-era JBOD + storage virtualization + clustering approach. In 2012 we said that building a cluster with a JBOD was cheap, but you still had to buy the JBOD; 2016 is cheaper still, using the local disks of each node directly to build the cluster. That is more powerful and easier to adopt, because JBOD is still rarely used domestically; an architecture built from contributed local disks is something more people may be willing to try.

The S2D workflow is basically this: we specify the NoStorage parameter when creating the cluster; after the cluster is created we enable S2D, which scans each cluster node. Besides the C system/boot disk, each node also has additional data disks of the same size, which can be SAS, SCSI, or SATA; eligible disks found by the scan are added to the cluster's S2D enclosure, and then we can go on to build the cluster storage pool, cluster storage spaces, CSV, SOFS, and so on. So S2D does not replace the original storage pool and storage space; it is simply a technology that aggregates the non-system disks on each node of a cluster into a usable storage pool.

With such a technology, we have much more room to play with WSFC on Azure. Why? Because what we worried about when running WSFC on Azure was nothing more than shared storage. Now, with S2D, I simply start several virtual machines, attach data disks to them, create the cluster inside the virtual machines, and enable S2D, which recognizes that each node's additional data disks can form a storage pool. From there we can build the cluster storage pool and cluster storage space; once we reach the storage space stage, the virtual disk we create appears among the cluster disks and is recognized as a cluster disk. Underneath it is an HA clustered storage controller built through S2D, and the virtual disk / cluster disk created from the storage space can use mirror or parity resiliency.

Of course, a prerequisite for using S2D in the cloud is that the underlying virtualization host cooperates with our Guest VM. As far as Lao Wang knows, only when the underlying host is a Hyper-V or ESXi host can a Guest S2D Cluster be implemented by attaching disks.

OK, S2D, the most important piece, has been introduced. Later we will run a series of experiments to show the S2D architecture Lao Wang described above and help you build a picture of it in your head.

For our deployment of WSFC on Azure, in addition to the key technology S2D, we can also use Storage Replica and cloud witness (quorum) technology to optimize the cluster.

Storage Replica is a replication technology introduced in Windows Server 2016, implemented at the Volume Manager / Partition Manager layer, which gives us controllable synchronous or asynchronous replication of storage. Replication can be done between local disks, across servers, across multi-site cluster disks, or across clusters.

There is a typical scenario where we can take advantage of this storage technology on Azure.

Say we have an application environment currently running in China on an S2D cluster and we want a standby copy in Europe. We can set up two S2D clusters, one in China and one in Europe, and have the S2D cluster disks in the two Regions replicate to each other through Storage Replica. Then, if the S2D cluster running in China fails, we can use the S2D cluster in Europe, and the storage data will already be synchronized.
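As a hedged sketch of what pairing two such clusters might look like with the Storage Replica cmdlets (all cluster, node, replication group, and volume names below are hypothetical, and the data and log volumes must already exist on both sides):

# Exchange access between the two hypothetical clusters, then create the partnership from China to Europe
Grant-SRAccess -ComputerName "cn-node1" -Cluster "EU-S2D-CLU"
Grant-SRAccess -ComputerName "eu-node1" -Cluster "CN-S2D-CLU"
New-SRPartnership -SourceComputerName "CN-S2D-CLU" -SourceRGName "rg-cn" `
    -SourceVolumeName "C:\ClusterStorage\Volume1" -SourceLogVolumeName "L:" `
    -DestinationComputerName "EU-S2D-CLU" -DestinationRGName "rg-eu" `
    -DestinationVolumeName "C:\ClusterStorage\Volume1" -DestinationLogVolumeName "L:"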

Azure's own disks can already be replicated locally or geo-replicated, so the reason we use Storage Replica inside the guest is that, for the application's sake, we need controllable replication management; whether cluster-to-cluster or node-to-node, we use Storage Replica to ensure that dependent applications can resume quickly in case of failure.

Cloud Witness is a new feature of WSFC 2016 working with Azure. Simply put, it moves the witness to Azure. To use Cloud Witness we create a storage account on Azure and note its name, primary access key, and service endpoint; we then select Cloud Witness in WSFC 2016 and fill in this information. The cluster connects to the Azure storage account over HTTPS REST, generates a blob under the storage account, and uses that blob as the cloud witness for the local WSFC or for a WSFC running on Azure. After a cluster registers to use the cloud witness, a ClusterInstanceID is stored under the quorum blob to identify which cluster is currently using it.

Cloud Witness fits many scenarios, for example a multi-site cluster with nodes in Beijing and Shanghai whose data disks are replicated through Storage Replica. The witness disk cannot use replication technology because it contains a cluster database that depends on timestamps, so we would have to put the witness disk at a third site, reachable from both Beijing and Shanghai, ideally somewhere neutral that both sites can access normally; then, when a partition happens, whichever site can reach the witness disk gets started first. But what if the enterprise really has no such third site, or deploying a witness disk there is very expensive? Usually you would then choose a file share witness for this kind of multi-site cluster, but a file share witness has the time partition problem Lao Wang mentioned earlier: it works well for cluster partitions, but it does not handle time partitions as well as a disk witness. Cloud witness suits enterprises that cannot place a disk witness or share witness at a third site: Cloud Witness can be your third site. As long as you pay a small fee, Cloud Witness lets you follow multi-site quorum best practices; all you need is for port 443 on your WSFC nodes to reach Azure over the Internet.

OK, the theoretical preparation is basically done. Let's go straight to the hands-on part of WSFC On Azure. This time we use the China version of Azure as an example. When building WSFC on Azure, we need to prepare the following on Azure:

1. A planned VNET, with the address space, subnets, and DNS planned in advance

2. A standard storage account for the cloud witness

3. Ensure that the VM, storage, and network all operate within the same Region

4. A planned cloud service to ensure that all WSFC virtual machines are under the same cloud service

5. For the WSFC nodes, use an availability set under the same cloud service

6. Add at least two new data disks for each WSFC node

Beyond the above six points, there is one more decision: whether the WSFC 2016 deployment model should be workgroup-based or AD domain-based. WSFC 2016 supports deploying the cluster in a pure workgroup architecture, but such a cluster can serve very few applications; the most typical is a SQL Server cluster based on SA authentication. Lao Wang will blog about that deployment model later.

In this article we focus on the traditional cluster model, that is, an AD-based cluster using AD for authentication, which ensures that most applications are supported.

So if we decide to deploy AD on Azure, there are a few things we need to pay attention to.

1. Try to use static IP for virtual machines

2. Do not store the AD database on the C disk. The Windows C disk of an Azure virtual machine has caching enabled, and the AD database cannot work with that kind of cache; a separate non-cached data disk must be attached for the AD database.

3. If you confirm that Azure virtual machines need to use a domain controller, plan the VNET accordingly: give the domain controller a fixed internal IP (DIP), keep that DIP, and configure it as the DNS server of the VNET, so that all WSFC nodes in the same VNET see the domain controller's IP as their DNS and can join the IaaS domain on Azure normally.

There are several typical scenarios for DC On Azure

1. Install a DC in an Azure virtual machine, connect it to the local AD network, and use Azure as a remote AD site

2. Some virtual machines running on Azure need to provide Web services to users, but authentication has to go back to the local domain controller every time, which carries security risks and hurts performance. You can deploy a domain controller in the cloud so that cloud users authenticate against the cloud DC and local users authenticate against the local DC.

3. The domain is used only by virtual machines on Azure; a domain controller is deployed there to provide centralized authentication and the greatest degree of control.

Experimental verification

Create a planned virtual network and select a location in northern China

Edit the virtual network address space; in VMM terms this corresponds to the VM Network and VM Subnet concepts.

Plan the virtual network DNS server as 10.0.0.4. Note that if we want to deploy WSFC on Azure, the virtual machines must use a manually planned virtual network; otherwise, by default, each virtual machine gets its own automatically created virtual network and they will not be in the same subnet.

After completing the network planning in the first step, we create a storage account for the cloud witness later. On the China version of the classic portal, we find that storage accounts can no longer be operated there; storage account operations are redirected to the new portal.

Create a storage account, select Standard, keep the default replication, and choose the China North location so that resources stay as close together as possible.

Create the china16dc, node1, and node2 virtual machines, all three under the same cloud service, and make sure they use the same virtual network, virtual network subnet, and storage account. For the node1 and node2 WSFC nodes, we create an availability set, using the rack fault domain capability provided by the Azure platform to ensure our two machines always sit on different racks and are never maintained at the same time.

For the Node1 and Node2 WSFC nodes, endpoint 443 needs to be enabled so that the nodes can later contact the cloud witness.

On the new portal we can set the IP addresses of the three machines to static. We choose static, set the domain controller to 10.0.0.4 as planned, and give the other two nodes .5 and .6.
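The same static IP assignment can also be done with the classic Azure PowerShell module; a hedged sketch for the domain controller, with the cloud service name as a placeholder:

# Pin the DC's internal IP to 10.0.0.4 as planned; repeat with .5 and .6 for node1 and node2
Get-AzureVM -ServiceName "<cloud service>" -Name "china16dc" |
    Set-AzureStaticVNetIP -IPAddress "10.0.0.4" |
    Update-AzureVM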

Open the domain controller virtual machine to see that the IP address has been configured according to our plan

Install the DC on Azure IaaS normally, create a cluadmin account for installing the WSFC cluster, and add it to the Domain Admins group.

Make sure the Active Directory database is on the E disk; in short, it must be on a non-C, non-D, non-cached data disk!
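A sketch of promoting the domain controller with the database, logs, and SYSVOL placed on the non-cached E: disk; the paths are illustrative, and ha.com is the domain name used later in this article.

# Promote the DC with the NTDS database, logs and SYSVOL on the non-cached data disk
Install-WindowsFeature AD-Domain-Services -IncludeManagementTools
Install-ADDSForest -DomainName "ha.com" `
    -DatabasePath "E:\NTDS" -LogPath "E:\NTDS" -SysvolPath "E:\SYSVOL" `
    -InstallDns -SafeModeAdministratorPassword (Read-Host -AsSecureString "DSRM password")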

Join Node1 and Node2 to the ha.com domain of the IaaS DC normally.
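For reference, a one-line sketch to run on each node (it will prompt for the cluadmin password):

# Run on Node1 and Node2: join the ha.com domain and reboot
Add-Computer -DomainName "ha.com" -Credential "ha.com\cluadmin" -Restart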

So far we have taken the first big step: building an IaaS domain environment on Azure that can provide services normally. When we log in remotely, we can manage the WSFC node virtual machines directly as ha.com\cluadmin.

Next we are going to deploy the S2D cluster on Azure, attaching two data disks of the same size to Node1 and Node2.

For the data disks of the S2D nodes we can use the default storage account, or we can create a separate premium-performance storage account and attach SSD data disks to the virtual machines for higher performance; this is necessary if your cluster application needs it.
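A hedged sketch of attaching the two data disks with the classic Azure PowerShell module; the cloud service name, sizes, and labels are illustrative, and host caching is turned off for the cluster data disks:

# Attach two empty data disks to node1 (repeat for node2)
Get-AzureVM -ServiceName "<cloud service>" -Name "node1" |
    Add-AzureDataDisk -CreateNew -DiskSizeInGB 128 -DiskLabel "s2d-disk1" -LUN 0 -HostCaching None |
    Add-AzureDataDisk -CreateNew -DiskSizeInGB 128 -DiskLabel "s2d-disk2" -LUN 1 -HostCaching None |
    Update-AzureVM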

We said above that S2D opens up many possibilities for WSFC on Azure, so what specific WSFC scenarios does Azure support now?

Use ExpressRoute to expose a local iSCSI Server to the Azure WSFC cluster

Use third-party plug-ins to replicate the nodes' local storage and present it as cluster disks

SQL AG that does not use S2D (AG itself does not necessarily require shared storage and can be replicated at the application level)

SQL Cluster based on S2D

SQL AG based on S2D

Provide SOFS based on S2D and expose it to a SQL cluster under the same VNET (or a local cluster) for SQL file storage on SOFS

Provide SOFS based on S2D to serve as the RDS UPD storage path; RDS can live in the public cloud or be connected back to the local private cloud.

S2D-based WSFC cluster for other cluster loads, such as traditional file servers, general programs, general services, etc.

WSFC cluster based on S2D to realize storage replication across Region clusters

Next, Lao Wang will implement the SOFS-on-S2D scenario from the list above: we build an S2D storage model on WSFC on Azure, and the upper layer then provides a SOFS path. This SOFS can be used within the cluster itself or delivered to other clusters under the same VNET.

Next we start enabling S2D on WSFC on Azure. In his experiments, Lao Wang found that enabling S2D on Azure differs from doing it on-premises: the Enable-ClusterStorageSpacesDirect command cannot be used to build it, but Enable-ClusterS2D can. In addition, winrm /quickconfig needs to be executed on each S2D node.

Execute winrm /quickconfig on Node1 and Node2

Install the Failover Clustering feature on Node1 and Node2, and then create the cluster by executing the following command:

New-Cluster -Name sofs -Node Node1,Node2 -StaticAddress 10.0.0.12 -NoStorage

After creation we have a working WSFC on Azure, but the cluster has no storage yet, which is only acceptable if we use applications such as SQL AG that do not require shared storage.

Next we need to enable S2D for the cluster so that it collects the local disks of each node and assembles them into an available cluster storage pool.

If you use Enable-ClusterStorageSpacesDirect to create S2D here, you will hit a warning about the disks' VPD page 83 IDs, followed by an SDS error saying no suitable disks can be found.

This problem can be avoided by using the Enable-ClusterS2D command instead. Since our disks are all HDD, we do not need any cache settings.

Enable-ClusterS2D -CacheState Disabled -AutoConfig 0 -SkipEligibilityChecks

When the command completes, a warning is shown; opening the htm report shows it is just the execution result: the two system disks could not be collected into the available storage pool, while the two data disks on each node, four in total, were added to the storage pool available to the cluster.
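If you want to double-check from PowerShell which disks were picked up, a quick hedged sketch:

# List the disks the cluster can see and whether they are eligible for pooling
Get-PhysicalDisk | Select-Object FriendlyName, SerialNumber, CanPool, Size
# After the cluster storage pool is created in the next step, it shows up as a non-primordial pool
Get-StoragePool | Where-Object IsPrimordial -eq $false | Select-Object FriendlyName, HealthStatus, Size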

Open the cluster enclosures view to see the existing enclosure information.

Click Pools to create a new storage pool. After entering a name, you can see the available disk group collected by S2D, which looks a little different from the 2012 R2 JBOD display.

In a 2012 R2 JBOD architecture, building a cluster storage pool displays directly as Clustered Storage Spaces.

Select the available physical disks; if there are fewer than three disks here, the next step will tell you the storage pool cannot be created.

After creation we have the cluster storage pool. Click New Virtual Disk to build the cluster storage space.

The creation results are as follows; as you can see, we can choose the storage data layout for each virtual disk: simple, mirror, or parity.

Normally, once the cluster storage space virtual disk is created, the disk appears directly under the cluster's storage. There is cooperation between the cluster and storage spaces: once a virtual disk carved from the storage space is detected and presented to all nodes, it is treated as a cluster disk and automatically taken over.

Add the disk to CSV. Note that the available storage needs to be formatted as NTFS before it can be used for CSV.
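As an aside, once S2D is enabled and the cluster storage pool exists, the virtual disk / format / CSV steps can also be collapsed into a single call; a hedged sketch with an illustrative pool and volume name:

# Create the virtual disk, format it and convert it to a Cluster Shared Volume in one step
New-Volume -StoragePoolFriendlyName "<cluster pool name>" -FriendlyName "VDisk01" -FileSystem CSVFS_NTFS -Size 100GB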

Finally, we deliver a SOFS path by adding the role through the cluster. Note that SOFS needs to generate a VCO in the same OU as the CNO, so we must grant the CNO object management rights on the OU and permission to create DNS records in the DNS zone; otherwise the SOFS will not be created successfully.

DNS has registered the DNN (Distributed Network Name) records normally.

The SOFS file server is added normally on top of CSV, and then we create a share path for the SQL cluster to use. We now have SOFS on an S2D cluster on Windows Azure!
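For reference, a hedged PowerShell equivalent of the GUI steps above; the role name, folder, and account in the access list are illustrative:

# Add the Scale-Out File Server role and publish a share on the CSV for the SQL cluster
Add-ClusterScaleOutFileServerRole -Name "SOFS"
New-Item -Path "C:\ClusterStorage\Volume1\SQLData" -ItemType Directory
New-SmbShare -Name "SQLData" -Path "C:\ClusterStorage\Volume1\SQLData" -FullAccess "ha.com\cluadmin"
# Applications would then point at \\SOFS\SQLData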

Through this experiment, I believe you now understand the process of building the S2D architecture for a WSFC cluster on a cloud platform.

Attach data disks to the virtual machines at the cloud platform level -> create the WSFC cluster -> enable the S2D feature to collect disks -> create the cluster storage pool -> create the storage space -> obtain cluster disks -> build Cluster Shared Volumes -> deliver SOFS

It looks a little complicated, but once the steps are clear it is not difficult. The key is the change in thinking: simply put, we have found a new cluster storage architecture model that takes full advantage of storage virtualization. S2D lets cluster disks be built for the cluster without the provider exposing shared storage, and it brings many new possibilities for deploying WSFC on the public cloud. In the future we hope Microsoft introduces even more convenient technologies to help us deploy WSFC on all kinds of cloud platforms and bring WSFC anywhere.

By comparison, cloud witness is not as important to WSFC on Azure as S2D, but it can still help us optimize the cluster, whether for local WSFC or WSFC on Azure: local WSFC nodes can use a cloud witness as long as they can reach Azure over port 443. For WSFC on Azure, although we can build the cluster through S2D, we still have to think about the cluster witness. A witness disk carved out of S2D would work, but it is not best practice; instead we can use cloud witness technology, which is completely independent of the WSFC S2D storage, and let a storage blob on Azure safeguard the cluster witness.

Cloud Witness does not require much preparation: make sure port 443 on the WSFC nodes is open so they can communicate with Azure storage, and prepare a standard storage account.

Copy the primary access key of the storage account prepared in advance.

Open the cluster quorum configuration wizard and select the advanced quorum configuration

Select the quorum witness option and configure a cloud witness

Enter the storage account name and the primary access key you copied. For global Azure the service endpoint can be left at the default; the China version of Azure needs it changed to core.chinacloudapi.cn.

Click Next. The WSFC 2016 node contacts the Azure storage account over port 443, follows the service endpoint to find a writable storage path, and records the cluster's ClusterInstanceID in a newly generated block blob. This blob then provides the cloud witness for our WSFC quorum.

After successful configuration, you can see that the current witness has become a cloud witness on the cluster.
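The same cloud witness can also be configured without the wizard; a hedged one-liner, with the storage account name and key as placeholders and the China Azure endpoint mentioned above:

# Configure the cloud witness directly from PowerShell
Set-ClusterQuorum -CloudWitness -AccountName "<storage account name>" `
    -AccessKey "<primary access key>" -Endpoint "core.chinacloudapi.cn"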

Cloud Witness is a very good technology. When we do not have a third site, it helps a local multi-site architecture follow best practice by using the public cloud as the witness, with Azure storage ensuring the witness's availability, and it only requires opening port 443. For WSFC on Azure, where a third-site witness is hard to deploy, cloud witness is also a natural choice.

When facing a cluster network partition, for example when two nodes cannot talk to each other, whichever node can establish a connection to the cloud witness over port 443 wins and continues to provide services. As long as the cloud witness is present, the cluster can survive down to the last node.

If your WSFC on Azure cluster does not need to follow best practice, you can also build a file share on another Azure machine, such as the DC, and use it as the file share witness for WSFC on Azure.

A seemingly similar and feasible option is Azure File Services: many friends will think of pointing the cluster's file share witness at an Azure Files path, but in practice the cluster rejects an Azure File Services share as a witness, because using Azure File Services requires entering specific authentication information assigned by Azure. So if you want to use an Azure technology for the WSFC witness, the cloud witness is the only choice.

Well, it is time to fill in the hole dug in the previous article: can Cloud Witness handle a time partition or not?

A time partition means that while other nodes are offline, I modify cluster information, for example adding resources, and then shut down; before I come back online, can the nodes that missed the change boot up and continue providing services? In the 2012 R2 era a disk witness could handle this, because the disk witness holds a copy of the cluster database: even if a node was offline while others modified resources, it can retrieve the latest cluster database from the disk witness. A file share witness cannot; in that case, a node that missed the change finds at boot that the cluster cannot start, because it has no up-to-date cluster database, and it has to wait for the node that made the change to come up.

Experiments show that Cloud Witness also cannot handle a time partition, because the Cloud Witness blob contains no cluster database.

Only a disk witness can handle a time partition, so if you are worried about that scenario you need to find a way for the cluster to use a disk witness, built via S2D or ExpressRoute, or you can deploy an odd number of nodes on Azure, which gives you a 66% chance of making it down to the last node!
