
On the High Availability of vSphere

2026-02-13 Update From: SLTechnology News&Howtos shulou


High availability can be achieved at the following levels:

1 Application layer: for example, clusters for MySQL and Oracle database applications, which mainly detect whether the MySQL or Oracle application has stopped running.

2 Operating system layer: for example, Windows Failover Clustering (WFC).

3 Virtualization layer: for example, vSphere High Availability (HA) and vSphere Fault Tolerance (FT).

4 Physical layer: for example, multiple network adapters, SAN, and so on.

The vSphere HA and Fault Tolerance (FT) functions minimize or eliminate unplanned downtime by providing rapid recovery from outages and continuous availability, respectively.

With vSphere, enterprises can easily raise the baseline level of availability provided for all applications and achieve higher availability at lower cost and with simpler operations. With vSphere, you can:

A Provide higher availability independent of hardware, operating systems, and applications.

B Reduce planned downtime for common maintenance operations.

C Provide automatic recovery in the event of a failure.

1. vSphere HA provides rapid recovery from outages

vSphere HA uses multiple ESXi hosts configured as a cluster to provide rapid recovery from outages and cost-effective high availability for applications running in virtual machines.

vSphere HA protects application availability in the following ways:

1 It protects against server failure by restarting the virtual machines on other hosts in the cluster.

2 It protects against application failure by continuously monitoring each virtual machine (using heartbeats exchanged between the host and the virtual machine through VMware Tools) and resetting the virtual machine when a failure is detected.

Unlike other clustering solutions, vSphere HA provides the infrastructure to protect all workloads:

A No special software needs to be installed in the application or the virtual machine. All workloads are protected by vSphere HA. After vSphere HA is configured, no action is required to protect new virtual machines; they are protected automatically.

B vSphere HA can be combined with vSphere Distributed Resource Scheduler (DRS) to protect against failures and to provide load balancing across the hosts in the cluster.

Compared to traditional failover solutions, vSphere HA has several advantages:

Minimal setup

After a vSphere HA cluster is set up, all virtual machines in the cluster get failover support without additional configuration.

Reduced hardware cost and setup

Virtual machines act as portable containers for applications and can be moved between hosts, so administrators avoid duplicate configuration on multiple machines. When you use vSphere HA, you must have sufficient resources to fail over the number of hosts that you want vSphere HA to protect. However, the vCenter Server system manages resources and configures the cluster automatically.

Increased application availability

Any application running inside a virtual machine gains higher availability. Virtual machines can recover from hardware failures, and vSphere HA also guards against guest operating system crashes by monitoring VMware Tools heartbeats and restarting virtual machines that stop responding.

Integration with DRS and vMotion

If a host fails and its virtual machines are restarted on other hosts, DRS can provide migration recommendations or migrate virtual machines to balance resource allocation.

A vSphere HA cluster allows a collection of ESXi hosts to work together as a group, so that they provide a higher level of availability for virtual machines than each ESXi host could provide on its own.

The hosts in the cluster are monitored, and in the event of a failure, the virtual machines on the failed host are restarted on other hosts in the cluster.

When you create a vSphere HA cluster, a single host is automatically elected as the master host (also called the preferred host). The master host communicates with vCenter Server and monitors the state of the other hosts, which are the slave hosts (also called secondary hosts), and of their virtual machines.

If vSphere HA is enabled for the cluster, all active hosts (hosts that are not in standby mode, not in maintenance mode, and not disconnected) participate in an election to choose the cluster's master host. The host that mounts the largest number of datastores has an advantage in the election. There is only one master host per cluster; all other hosts are slave hosts. If the master host fails, is shut down, or is removed from the cluster, a new election is held.
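The election rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual vSphere HA agent logic: the `Host` class, the host names, and the tie-break on host name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str                 # hypothetical host name
    mounted_datastores: int   # number of datastores the host mounts
    active: bool = True       # False if in standby, in maintenance mode, or disconnected


def elect_master(hosts):
    """Return the active host that mounts the most datastores, or None."""
    candidates = [h for h in hosts if h.active]
    if not candidates:
        return None
    # The tie-break on name is an assumption, used here only to make the result deterministic.
    return max(candidates, key=lambda h: (h.mounted_datastores, h.name))


hosts = [
    Host("esxi-01", mounted_datastores=4),
    Host("esxi-02", mounted_datastores=6, active=False),  # in maintenance mode, so not a candidate
    Host("esxi-03", mounted_datastores=5),
]
print(elect_master(hosts).name)  # esxi-03
```

Calling `elect_master` again after removing the current master from the list models the re-election that takes place when the master fails or leaves the cluster.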

The master host in a cluster has several responsibilities:

1 It monitors the state of the slave hosts. If a slave host fails or becomes unreachable, the master host determines which virtual machines need to be restarted.

2 It monitors the power state of all protected virtual machines. If a virtual machine fails, the master host ensures that it is restarted. Using a local placement engine, the master host also determines where the restart should take place.

3 It manages the list of cluster hosts and handles the addition and removal of hosts; that is, the master host maintains the cluster inventory.

4 It manages the list of protected virtual machines and updates the list after each user-initiated power operation. vCenter Server tells the master host which virtual machines to protect or unprotect: a virtual machine is protected once it is powered on, so that it is restarted on another host if its host fails, and it no longer needs protection once it is powered off.

5 It caches the cluster configuration and notifies the slave hosts when the cluster configuration changes.

6 It sends heartbeats to the slave hosts so that they know the master is alive. If the slave hosts stop receiving these heartbeats, a new master host is elected.

7 It reports state information to vCenter Server, which normally communicates only with the master host.

One of the functions performed by the master host is virtual machine protection. When a virtual machine is protected, vSphere HA guarantees that it attempts to power the virtual machine back on after a failure.

The master host commits to protecting a virtual machine when it observes that the virtual machine's power state has changed from powered off to powered on. In the event of a failover, the master host must restart the protected virtual machines for which it is responsible. This responsibility belongs to the master host that holds an exclusive lock on a system-defined file on the datastore that contains the virtual machine's configuration file.

Responsibilities of slave hosts in the cluster:

1 A slave host monitors the state of the virtual machines running on it locally and reports significant changes in their runtime state to the master host.

2 A slave host monitors the health of the master host. If the master host fails, the slave hosts take part in electing a new master.

3 A slave host implements the vSphere HA features that do not require coordination by the master host, such as virtual machine health monitoring.

View the status of master and slave:

Host failure types and detection

The master host of a vSphere HA cluster is responsible for detecting failures of slave hosts. Depending on the type of failure detected, the virtual machines running on the affected host may need to be failed over.

A vSphere HA cluster detects three types of host failure:

1 The host stops functioning (that is, it fails).

2 The host becomes network isolated.

3 The host loses network connectivity with the master host.

vSphere HA hosts communicate over the management network and through storage devices. When the master cannot reach a slave over the management network, it uses the storage network (heartbeat datastores) to check whether the slave is still alive.

The master host monitors the slave hosts in the cluster by exchanging network heartbeats with them over the management network. When the master host stops receiving these heartbeats from a slave host, for example because a network interface on the master or the slave has failed, it checks whether the host is alive before declaring it failed. The liveness check determines whether the slave host is still exchanging heartbeats with one of the datastores (that is, over the storage network). If it is, the master host assumes that the slave is in a network partition or is network isolated, and continues to monitor the host and its virtual machines.

Network partition: one or more slaves cannot reach the master over the management network even though their own network connectivity is working. In this case, vSphere HA can use the storage network to check whether the partitioned hosts are alive and whether the virtual machines on them need to be protected.

Network isolation: one or more slaves lose all management network connectivity, so that they can reach neither the master nor any other ESXi host. In this case, the slave host informs the master through the storage network that it is isolated.

Note: if the network infrastructure is sufficiently redundant and at least one network path is always available, host network isolation should be rare.

When a management network failure occurs in a vSphere HA cluster, some hosts in the cluster may not be able to communicate with other hosts through the management network. There may be multiple partitions in a cluster.

A partitioned cluster leads to degraded virtual machine protection and cluster management:

1 Virtual machine protection. vCenter Server allows a virtual machine to be powered on, but the virtual machine is protected only if it runs in the same partition as the master host responsible for it.

2 Cluster management. vCenter Server can communicate with only some of the hosts in the cluster and can connect to only one master host. As a result, configuration changes that affect vSphere HA may not take effect until the partition is resolved, and one partition may operate under the old configuration while another uses the new settings.

Summary: when the master host in a vSphere HA cluster cannot communicate with a slave host over the management network, it uses datastore heartbeats to determine whether the slave host has failed, is in a network partition, or is network isolated. If the slave host has also stopped sending datastore heartbeats, it is considered to have failed and its virtual machines are restarted elsewhere.
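The decision described in this section can be written down as a small classification function. The following Python sketch is a simplified model of the rules stated above, not the real agent protocol; the function and parameter names are hypothetical.

```python
from enum import Enum


class HostState(Enum):
    HEALTHY = "healthy"
    FAILED = "failed"
    PARTITIONED = "network partitioned"
    ISOLATED = "network isolated"


def classify_slave(network_heartbeat: bool,
                   datastore_heartbeat: bool,
                   reports_isolation: bool = False) -> HostState:
    """Classify a slave host from the master's point of view."""
    if network_heartbeat:
        return HostState.HEALTHY       # reachable over the management network
    if not datastore_heartbeat:
        return HostState.FAILED        # silent on both the network and the datastore
    if reports_isolation:
        return HostState.ISOLATED      # alive, and has signalled isolation via the datastore
    return HostState.PARTITIONED       # alive, but unreachable over the management network


def restart_vms_elsewhere(state: HostState) -> bool:
    """Only a failed host has its virtual machines restarted on other hosts."""
    return state is HostState.FAILED


print(classify_slave(False, False).value)        # failed
print(classify_slave(False, True).value)         # network partitioned
print(classify_slave(False, True, True).value)   # network isolated
```

For an isolated host, what happens to its virtual machines depends on the configured host isolation response rather than on the master, which is why `restart_vms_elsewhere` returns `True` only for the failed state.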

vCenter Server uses the vSphere HA host state to report whether a host is a master host or a slave host. If the HA State column is enabled, this state is shown on the host's Summary tab in vSphere Client and in the host list view of the cluster or datacenter. The HA state "Running (Master)" indicates that the host is the vSphere HA master host; the state "Connected (Slave)" indicates that it is a vSphere HA slave host.

Note: if you disconnect the host from the cluster, all virtual machines registered with that host are not protected by vSphere HA.

vCenter Server uses admission control to ensure that sufficient resources are available in the cluster to provide failover protection.

1. "Number of host failures allowed by the cluster" admission control policy

With the "number of host failures allowed by the cluster" admission control policy, vSphere HA allows a specified number of hosts to fail while ensuring that enough resources remain in the cluster to fail over the virtual machines from those hosts.

With this policy, vSphere HA performs admission control in the following way:

1 Calculate the slot size

The slot size has two components: CPU and memory.

A vSphere HA calculates the CPU component by taking the CPU reservation of each powered-on virtual machine and selecting the largest value. If no CPU reservation is specified for a virtual machine, it is assigned a default value of 32 MHz.

B vSphere HA calculates the memory component by taking the memory reservation plus memory overhead of each powered-on virtual machine and selecting the largest value. There is no default value for the memory reservation.

2 Use the slot size to calculate the current failover capacity

After calculating the slot size, vSphere HA determines the CPU and memory resources that each host has available for virtual machines. You can find the host resource data used by vSphere HA by connecting to the host directly with vSphere Client and opening the host's resources tab. The maximum number of slots each host can support is then determined: divide the host's CPU resources by the CPU component of the slot size and round the result down, do the same for the host's memory resources, and compare the two numbers. The smaller number is the number of slots the host can support.

The current failover capacity is the number of hosts that can fail, starting with the largest, while still leaving enough slots for all powered-on virtual machines.

Appendix: advanced runtime information

If you select the "number of host failures allowed by the cluster" admission control policy, a link to advanced runtime information is displayed in the vSphere HA section of the cluster's Summary tab in vSphere Client. Click this link to display the following information about the cluster:

A Slot size.

B Total number of slots in the cluster.

C Used slots. The number of slots assigned to powered-on virtual machines. If you have defined an upper bound for the slot size using the advanced options, this number can exceed the number of powered-on virtual machines, because some virtual machines then occupy multiple slots.

D Available slots. The number of slots available to power on additional virtual machines in the cluster. vSphere HA reserves the number of slots required for failover; the remaining slots can be used to power on new virtual machines.

E Failover slots. The total number of slots minus the used slots and the available slots.

F Total number of powered-on virtual machines in the cluster.

G Total number of hosts in the cluster.

H Total number of good hosts in the cluster. The number of hosts that are connected, not in maintenance mode, and free of vSphere HA errors.

Example: admission control using the "number of host failures allowed by the cluster" policy

The example shows how the slot size is calculated and used with this admission control policy. Make the following assumptions about the cluster:

1 The cluster consists of three hosts, each with a different amount of available CPU and memory resources. The first host (H1) has 9 GHz of CPU and 9 GB of memory available, the second host (H2) has 9 GHz and 6 GB, and the third host (H3) has 6 GHz and 6 GB.

2 There are five powered-on virtual machines in the cluster with differing CPU and memory requirements. VM1 needs 2 GHz and 1 GB, VM2 needs 2 GHz and 1 GB, VM3 needs 1 GHz and 2 GB, VM4 needs 1 GHz and 1 GB, and VM5 needs 1 GHz and 1 GB.

3 The number of host failures allowed by the cluster is set to 1.

The calculation proceeds as follows:

1 Compare the CPU and memory requirements of the virtual machines and select the largest values to obtain the slot size.

The largest CPU requirement (shared by VM1 and VM2) is 2 GHz, and the largest memory requirement (VM3) is 2 GB. The slot size is therefore 2 GHz of CPU and 2 GB of memory.

2 Determine the maximum number of slots that each host can support.

H1 can support four slots (the smaller of 9 GHz / 2 GHz and 9 GB / 2 GB, each rounded down). H2 can support three slots (the smaller of 9 GHz / 2 GHz and 6 GB / 2 GB), and H3 can also support three slots.

3 Calculate the current failover capacity.

The largest host is H1. If it fails, six slots remain in the cluster, which is enough for all five powered-on virtual machines. If both H1 and H2 fail, only three slots remain, which is not enough. The current failover capacity is therefore 1.

The cluster has one available slot (the six slots on H2 and H3 minus the five used slots).
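The arithmetic of this example can be reproduced with a short Python sketch. It is a simplified model of the slot calculation described above: the function names are hypothetical, CPU is expressed in MHz and memory in GB, the memory figures are taken to already include overhead, and every virtual machine is assumed to occupy exactly one slot.

```python
DEFAULT_CPU_MHZ = 32  # default from the text for a VM without a CPU reservation


def slot_size(vms):
    """vms: list of (cpu_mhz, mem_gb). Returns the largest CPU and memory values."""
    cpu = max(cpu_mhz or DEFAULT_CPU_MHZ for cpu_mhz, _ in vms)
    mem = max(mem_gb for _, mem_gb in vms)
    return cpu, mem


def slots_per_host(host, slot):
    """Round each ratio down and keep the smaller one (assumes non-zero slot components)."""
    host_cpu, host_mem = host
    slot_cpu, slot_mem = slot
    return min(host_cpu // slot_cpu, host_mem // slot_mem)


def failover_capacity(host_slots, needed):
    """Count how many hosts can fail, largest first, while 'needed' slots remain."""
    remaining = sorted(host_slots)
    failures = 0
    while remaining:
        remaining.pop()                 # the largest remaining host fails
        if sum(remaining) < needed:
            break
        failures += 1
    return failures


def available_slots(host_slots, used, tolerated_failures):
    """Slots left for new VMs after reserving the largest hosts for failover."""
    ordered = sorted(host_slots)
    kept = ordered[:len(ordered) - tolerated_failures]
    return sum(kept) - used


hosts = {"H1": (9000, 9), "H2": (9000, 6), "H3": (6000, 6)}
vms = [(2000, 1), (2000, 1), (1000, 2), (1000, 1), (1000, 1)]

slot = slot_size(vms)
per_host = {name: slots_per_host(res, slot) for name, res in hosts.items()}
print(slot)                                                   # (2000, 2)
print(per_host)                                               # {'H1': 4, 'H2': 3, 'H3': 3}
print(failover_capacity(per_host.values(), needed=len(vms)))  # 1
print(available_slots(per_host.values(), used=len(vms), tolerated_failures=1))  # 1
```

Using integer division (`//`) is what implements the rounding down; rounding up would give H1 five slots and overstate the failover capacity.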

This admission control policy is not recommended when the hosts in the cluster have differing hardware capacity, because it is then difficult to determine how many host failures can actually be tolerated. Use it only when the hardware capacity of the hosts in the cluster is uniform.

2. "Percentage of reserved cluster resources" admission control policy

vSphere HA can be configured to perform admission control by reserving a specified percentage of the cluster's CPU and memory resources for recovery from host failures.

With the "percentage of reserved cluster resources" admission control policy, vSphere HA ensures that the specified percentage of aggregate CPU and memory resources is reserved for failover.

With this policy, vSphere HA enforces admission control as follows:

1 Calculate the total resource requirements of all powered-on virtual machines in the cluster.

2 Calculate the total host resources available for virtual machines.

3 Calculate the current CPU failover capacity and current memory failover capacity of the cluster.

4 Determine whether the current CPU failover capacity or the current memory failover capacity is less than the corresponding configured failover capacity (provided by the user). If so, admission control disallows the operation.

Note: the "percentage of reserved cluster resources" admission control policy also checks that the cluster contains at least two vSphere HA-enabled hosts (not counting hosts that are entering maintenance mode). If there is only one vSphere HA-enabled host, the operation is not allowed even if a sufficient percentage of resources is available. The reason for this extra check is that vSphere HA cannot perform a failover if the cluster contains only a single host.

Calculate the current failover capacity

The total resource requirements of the powered-on virtual machines consist of two components, CPU and memory. vSphere HA calculates these values as follows:

1 The CPU component is the sum of the CPU reservations of the powered-on virtual machines. If no CPU reservation is specified for a virtual machine, it is assigned a default value of 32 MHz.

2 The memory component is the sum of the memory reservation (plus memory overhead) of each powered-on virtual machine.

The total host resources available for virtual machines are obtained by adding up the CPU and memory resources of the hosts.

The current CPU failover capacity is calculated by subtracting the total CPU resource requirements from the total host CPU resources and dividing the result by the total host CPU resources. The current memory failover capacity is calculated in the same way.

Example: admission control using the "percentage of reserved cluster resources" policy

The example shows how the current failover capacity is calculated and used with this admission control policy. Make the following assumptions about the cluster:

1 The cluster consists of the same three hosts as in the previous example: H1 with 9 GHz and 9 GB, H2 with 9 GHz and 6 GB, and H3 with 6 GHz and 6 GB.

2 The same five virtual machines are powered on: VM1 and VM2 each need 2 GHz and 1 GB, VM3 needs 1 GHz and 2 GB, and VM4 and VM5 each need 1 GHz and 1 GB.

3 The configured failover capacity is set to 25%.

The total resource requirements of the powered-on virtual machines are 7 GHz of CPU and 6 GB of memory. The total host resources available for virtual machines are 24 GHz of CPU and 21 GB of memory.

Based on these figures, the current CPU failover capacity is 70% ((24 GHz - 7 GHz) / 24 GHz). Similarly, the current memory failover capacity is 71% ((21 GB - 6 GB) / 21 GB).

Because the configured failover capacity of the cluster is 25%, 45% of the cluster's total CPU resources and 46% of its memory resources are still available to power on additional virtual machines.
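The percentages in this example follow directly from the formula above and can be checked with a few lines of Python. This is a sketch with hypothetical function names; percentages are truncated to whole numbers to match the figures in the text, and `can_power_on` evaluates the capacity that would result after the new virtual machine is powered on, which is one interpretation of the check in step 4.

```python
def failover_capacity_pct(host_total, vm_required):
    """(total - required) / total, as a whole-number percentage (truncated)."""
    return (host_total - vm_required) * 100 // host_total


def can_power_on(host_total, vm_required, new_vm, configured_pct, ha_hosts):
    """Admit a new VM only if CPU and memory capacity both stay at or above the configured level.

    host_total, vm_required and new_vm are (cpu, memory) pairs in the same units.
    """
    if ha_hosts < 2:
        return False  # with a single HA-enabled host there is nowhere to fail over to
    for total, required, extra in zip(host_total, vm_required, new_vm):
        if failover_capacity_pct(total, required + extra) < configured_pct:
            return False
    return True


CONFIGURED_PCT = 25
cpu_pct = failover_capacity_pct(24, 7)   # GHz
mem_pct = failover_capacity_pct(21, 6)   # GB
print(cpu_pct, mem_pct)                                    # 70 71
print(cpu_pct - CONFIGURED_PCT, mem_pct - CONFIGURED_PCT)  # 45 46
print(can_power_on((24, 21), (7, 6), (2, 1), CONFIGURED_PCT, ha_hosts=3))   # True
print(can_power_on((24, 21), (7, 6), (12, 1), CONFIGURED_PCT, ha_hosts=3))  # False
```

In the last call, a virtual machine needing 12 GHz would leave only 20% CPU failover capacity, which is below the configured 25%, so the power-on is refused.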

3. "Specify failover host" admission control policy

You can designate specific hosts as failover hosts when you configure vSphere HA.

With the "specify failover host" admission control policy, when a host fails, vSphere HA attempts to restart its virtual machines on one of the designated failover hosts.

Note: if you use the "specify failover host" admission control policy and designate multiple failover hosts, DRS does not load balance the failover hosts.

The current failover hosts are displayed in the vSphere HA section of the cluster's Summary tab in vSphere Client. The status icon next to each host can be green, yellow, or red.

1 Green. The host is connected, is not in maintenance mode, and has no vSphere HA errors. No powered-on virtual machines reside on the host.

2 Yellow. The host is connected, is not in maintenance mode, and has no vSphere HA errors. However, powered-on virtual machines reside on the host.

3 Red. The host is disconnected, is in maintenance mode, or has vSphere HA errors.
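The three status rules above amount to a small decision function. The Python sketch below only restates those rules (with yellow as the middle status); the function and parameter names are hypothetical and do not correspond to any vSphere API.

```python
def failover_host_icon(connected, in_maintenance, has_ha_error, powered_on_vms):
    """Return the status colour of a designated failover host."""
    if not connected or in_maintenance or has_ha_error:
        return "red"
    # The host is usable; it is fully free only if no powered-on VMs reside on it.
    return "yellow" if powered_on_vms > 0 else "green"


print(failover_host_icon(True, False, False, powered_on_vms=0))  # green
print(failover_host_icon(True, False, False, powered_on_vms=2))  # yellow
print(failover_host_icon(True, True, False, powered_on_vms=0))   # red
```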

Requirements for vSphere HA clusters

Before setting up a vSphere HA cluster, make sure the following requirements are met:

1 All hosts must be licensed for vSphere HA.

2 The cluster must contain at least two hosts.

3 All hosts need to be configured with static IP addresses.

4 All hosts should have at least one management network in common; best practice is to have at least two.

5 On ESXi hosts version 4.0 and later, use a VMkernel network with the management check box selected.

6 To ensure that any virtual machine can run on any host in the cluster, all hosts should have access to the same virtual machine networks and datastores. Likewise, virtual machines must reside on shared rather than local storage; otherwise they cannot be failed over when a host fails.

Note: vSphere HA uses datastore heartbeating to distinguish between partitioned, isolated, and failed hosts. Accordingly, you must make sure that the datastores reserved for vSphere HA are readily available at all times.

7 For virtual machine monitoring to work, VMware Tools must be installed.

Summary: the requirements for a vSphere HA cluster are similar to those for vMotion.

Create a vSphere HA cluster

vSphere HA is enabled at the cluster level. A vSphere HA-enabled cluster is a prerequisite for Fault Tolerance. VMware recommends that you first create an empty cluster. After you have planned the resources and network architecture of the cluster, use vSphere Client to add hosts to it and specify the cluster's vSphere HA settings.

Steps

1 Select the Hosts and Clusters view.

2 Right-click the datacenter in the inventory tree and click New Cluster.

3 Complete the New Cluster wizard.

Do not enable vSphere HA (or DRS) at this point.

4 Click Finish to close the wizard and create the cluster.

An empty cluster has now been created.

5 Following your plan for the cluster's resources and network architecture, use vSphere Client to add hosts to the cluster.

6 Right-click the cluster and click Edit Settings.

In the cluster's Settings dialog box, you can modify the vSphere HA (and other) settings for the cluster.

7 On the Cluster Features page, select Turn On vSphere HA.

8 Configure the vSphere HA settings for the cluster as needed:

A Host monitoring status

B Admission control

C Virtual machine options

D Virtual machine monitoring

E Datastore heartbeating

9 Click OK to close the cluster's Settings dialog box.

1. Cluster features

The first panel of the New Cluster wizard lets you specify the basic options for the cluster.

In this panel, you can specify the cluster name and choose one or both cluster features.

Name: specifies the name of the cluster. The name appears in the vSphere Client inventory panel. You must specify a name to continue creating the cluster.

Turn On vSphere HA: if this check box is selected, virtual machines are restarted on other hosts in the cluster when a host fails. vSphere HA must be turned on before vSphere Fault Tolerance can be enabled on any virtual machine in the cluster.

Turn On vSphere DRS: if this check box is selected, DRS balances the load of virtual machines across the cluster. DRS also places and migrates virtual machines when they are protected by HA.

2. Host monitoring status

After you create the cluster, enable host monitoring so that vSphere HA can monitor the heartbeats sent by the vSphere HA agent on each host in the cluster.

If host monitoring is enabled, each host in the cluster is checked to make sure it is running. If a host fails, its virtual machines are restarted on another host. Host monitoring is also required for the vSphere Fault Tolerance recovery process to work properly.

Note: if you need to perform network maintenance that might trigger host isolation responses, VMware recommends that you first suspend vSphere HA by disabling host monitoring. After the maintenance is complete, re-enable host monitoring.

3. Enable or disable admission control

In the New Cluster wizard, you can enable or disable admission control for the vSphere HA cluster and choose a policy for how it is enforced.

Enable: disallow virtual machine power-on operations that violate availability constraints. This enables admission control and enforces availability constraints while preserving failover capacity. Any operation on a virtual machine that would violate the availability constraints is not permitted.

Disable: allow virtual machine power-on operations that violate availability constraints. This disables admission control. For example, a virtual machine can be powered on even if doing so leaves insufficient failover capacity. No warning is displayed when this happens, and the cluster does not turn red. If the cluster has insufficient failover capacity, vSphere HA can still perform failovers, and it uses the virtual machine restart priority setting to determine which virtual machines to power on first.

If admission control is enabled, vSphere HA offers three policies for enforcing it:

1 Number of host failures allowed by the cluster

2 Percentage of cluster resources reserved as failover spare capacity

3 Specify failover hosts

4. Virtual machine options

The default virtual machine settings control the order in which virtual machines are restarted (virtual machine restart priority) and how vSphere HA responds when hosts lose network connectivity with each other (host isolation response).

These settings apply to all virtual machines in the cluster in the event of a host failure or host isolation. You can also configure exceptions for specific virtual machines.

Virtual machine restart priority setting

The virtual machine restart priority determines the relative order in which virtual machines are restarted after a host failure. Virtual machines are restarted sequentially on new hosts, starting with those that have the highest priority and continuing with lower-priority ones, until all virtual machines have been restarted or no more cluster resources are available. If the number of host failures exceeds what admission control allows, the lower-priority virtual machines may not be restarted until more resources become available. If failover hosts are specified, virtual machines are restarted on those failover hosts.

The values for this setting are Disabled, Low, Medium (the default), and High. If you select Disabled, vSphere HA is disabled for the virtual machine, which means it is not restarted on other ESXi hosts when its host fails. The Disabled setting does not affect virtual machine monitoring: if a virtual machine fails on a host that is functioning properly, the virtual machine is still reset on that same host. You can change this setting for individual virtual machines.

The appropriate restart priority settings depend on your needs. VMware recommends assigning a higher restart priority to the virtual machines that provide the most important services.
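The restart ordering can be illustrated with a simplified Python sketch. It models cluster resources as a number of free slots and stops at the first virtual machine that no longer fits; both are simplifying assumptions, and the function and VM names are hypothetical.

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def restart_plan(vms, free_slots):
    """vms: list of (name, priority, slots_needed). Returns (restarted, waiting)."""
    # VMs whose restart priority is "disabled" are never restarted on other hosts.
    eligible = [vm for vm in vms if vm[1] != "disabled"]
    eligible.sort(key=lambda vm: PRIORITY_ORDER[vm[1]])  # stable: keeps input order within a priority
    restarted, waiting = [], []
    out_of_resources = False
    for name, _, slots in eligible:
        if not out_of_resources and slots <= free_slots:
            free_slots -= slots
            restarted.append(name)
        else:
            out_of_resources = True   # everything after this point waits for more resources
            waiting.append(name)
    return restarted, waiting


vms = [
    ("web", "medium", 1),
    ("db", "high", 2),
    ("test", "low", 1),
    ("scratch", "disabled", 1),
]
print(restart_plan(vms, free_slots=3))  # (['db', 'web'], ['test'])
```

Here the high-priority `db` is restarted first even though it was listed second, `test` has to wait because no slots remain, and `scratch` is skipped entirely.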

Host isolation response setting

The host isolation response determines what happens when a host in a vSphere HA cluster loses its management network connections but continues to run.

When this happens, the host executes its isolation response. The available responses are Leave powered on (the default), Power off, and Shut down. You can also customize this property for individual virtual machines.

To use the Shut down setting, VMware Tools must be installed in the guest operating system of the virtual machine. The advantage of shutting down the virtual machine is that its state is preserved. Shutting down is better than powering off the virtual machine, which does not flush the most recent changes to disk or commit transactions.

5. Virtual machine and application monitoring

If VMware Tools heartbeats are not received from a virtual machine within a set time, VM Monitoring restarts that virtual machine. Similarly, if heartbeats are not received from an application running in the virtual machine, Application Monitoring can restart the virtual machine. You can enable these features and configure how sensitively vSphere HA monitors for non-responsiveness.

When virtual machine monitoring is enabled, the VM Monitoring service (using VMware Tools) evaluates whether each virtual machine in the cluster is running by checking for regular heartbeats and I/O activity from the VMware Tools process running inside the guest. If no heartbeats or I/O activity are received, the most likely cause is that the guest operating system has failed or that VMware Tools is not being allocated any time to complete its tasks. In this case, the VM Monitoring service determines that the virtual machine has failed and reboots it to restore service.

You can configure the level of monitoring sensitivity. Highly sensitive monitoring concludes more quickly that a failure has occurred. However, if the monitored virtual machine or application is actually still working but heartbeats were not received because of factors such as resource constraints, highly sensitive monitoring may incorrectly conclude that the virtual machine has failed. Low-sensitivity monitoring results in a longer service interruption between the actual failure and the virtual machine reset. Select an option that is an effective compromise for your needs.
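The trade-off between sensitivity levels can be made concrete with a small Python sketch. The interval values below are illustrative assumptions chosen for the example, not settings taken from this article, and the function name is hypothetical.

```python
# Illustrative failure intervals in seconds (assumed values, for demonstration only).
FAILURE_INTERVAL = {"high": 30, "medium": 60, "low": 120}


def vm_needs_reset(seconds_since_heartbeat, seconds_since_io, sensitivity):
    """Treat a VM as failed only if both heartbeats and I/O activity have stopped."""
    interval = FAILURE_INTERVAL[sensitivity]
    heartbeat_lost = seconds_since_heartbeat > interval
    io_idle = seconds_since_io > interval
    return heartbeat_lost and io_idle


# A VM that has been silent for 45 seconds under resource pressure:
print(vm_needs_reset(45, 200, "high"))  # True: high sensitivity already resets it
print(vm_needs_reset(45, 200, "low"))   # False: low sensitivity keeps waiting
print(vm_needs_reset(45, 10, "high"))   # False: recent I/O shows the guest is still alive
```

The first two calls show the compromise described above: the same 45 seconds of silence triggers a reset at high sensitivity, which may be a false positive, while low sensitivity waits longer and so extends the outage if the failure is real.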

Second, vSphere Fault Tolerance provides continuous availability

VSphere HA provides a basic level of protection for virtual machines by restarting the virtual machine when the host fails. So its disadvantage is that there is downtime, which can be a few minutes or more than ten minutes.

vSphere Fault Tolerance can be enabled for virtual machines to achieve higher levels of availability and data protection than vSphere HA provides, thereby ensuring business continuity.

vSphere Fault Tolerance ensures continuous availability of a virtual machine by creating and maintaining a secondary virtual machine that is identical to the primary virtual machine and can replace it at any time in the event of a failover.

To get the best results from Fault Tolerance, you should first be familiar with how it works, how to enable it for clusters and virtual machines, and how best to use it.

Fault Tolerance can be enabled for most mission-critical virtual machines. A duplicate virtual machine (called the secondary virtual machine) is created and runs in virtual lockstep with the primary virtual machine. VMware vLockstep captures inputs and events that occur on the primary virtual machine and sends them to the secondary virtual machine, which runs on another host. Using this information, the secondary virtual machine's execution is identical to that of the primary virtual machine. Because the secondary virtual machine runs in virtual lockstep with the primary, it can take over execution at any point without interruption, providing fault-tolerant protection.
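The record-and-replay idea behind lockstep execution can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the ReplicatedCounter class, the event names, and the in-memory log are invented for illustration and say nothing about how vLockstep is actually implemented.

```python
from typing import List, Tuple

Event = Tuple[str, int]


class ReplicatedCounter:
    """A tiny deterministic state machine standing in for a virtual machine."""

    def __init__(self) -> None:
        self.state = 0

    def apply(self, event: Event) -> None:
        kind, value = event
        if kind == "add":
            self.state += value
        elif kind == "mul":
            self.state *= value
        else:
            raise ValueError(f"unknown event kind: {kind}")


primary = ReplicatedCounter()
secondary = ReplicatedCounter()
log_channel: List[Event] = []  # stands in for the FT logging network

# The primary handles each input and records it for the secondary.
for event in [("add", 5), ("mul", 3), ("add", -2)]:
    primary.apply(event)
    log_channel.append(event)

# The secondary replays the same inputs in the same order.
for event in log_channel:
    secondary.apply(event)

print(primary.state, secondary.state)  # 13 13
```

Because both machines are deterministic and see the same inputs in the same order, they end in the same state, which is why the secondary can take over at any point.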

Figure: primary and secondary virtual machines in a Fault Tolerance pair

The primary virtual machine and the secondary virtual machine continuously exchange detection signals. This exchange lets the two virtual machines in the pair monitor each other's status to ensure continuous Fault Tolerance protection. If the host running the primary virtual machine fails, the system performs a transparent failover: the secondary virtual machine is immediately activated to replace the primary virtual machine, a new secondary virtual machine is started, and Fault Tolerance redundancy is re-established within seconds. If the host running the secondary virtual machine fails, the secondary virtual machine is also replaced immediately. In either case, users experience no service interruption or data loss.

The primary virtual machine and its secondary copy are not allowed to run on the same host. This restriction ensures that a single host failure cannot cause the loss of both virtual machines.
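The placement rule amounts to an anti-affinity constraint. The following Python sketch shows the idea; the function name and host names are hypothetical, and real placement also weighs resources and compatibility, which this sketch ignores.

```python
from typing import List, Optional


def pick_secondary_host(primary_host: str, candidate_hosts: List[str]) -> Optional[str]:
    """Return the first candidate host that is not the primary's host, if any."""
    for host in candidate_hosts:
        if host != primary_host:
            return host
    return None  # no valid host: the pair cannot be protected


print(pick_secondary_host("esxi-01", ["esxi-01", "esxi-02"]))  # esxi-02
print(pick_secondary_host("esxi-01", ["esxi-01"]))             # None
```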

vSphere features not supported by Fault Tolerance

The following vSphere features are not supported for fault-tolerant virtual machines.

1 Snapshots. Snapshots must be removed before enabling Fault Tolerance on a virtual machine. In addition, it is not possible to take a snapshot of a virtual machine that has Fault Tolerance enabled.

2 Storage vMotion. You cannot invoke Storage vMotion for a virtual machine that has Fault Tolerance enabled. To migrate the storage, temporarily turn off Fault Tolerance before performing the Storage vMotion operation. After the migration is complete, you can turn Fault Tolerance on again.

3 Linked clones. You cannot enable Fault Tolerance on a virtual machine that is a linked clone, nor can you create a linked clone from a virtual machine that has Fault Tolerance enabled.

4 Virtual machine backup. You cannot use Storage API for Data Protection, VMware Data Recovery, or similar backup products that require virtual machine snapshots (as performed by ESXi) to back up FT-enabled virtual machines. To back up a fault-tolerant virtual machine in this way, you must first disable FT and then re-enable it after performing the backup. Storage array-based snapshots do not affect FT.

Use the Fault Tolerance feature with DRS

When you enable the Enhanced vMotion Compatibility (EVC) feature, you can use vSphere Fault Tolerance with vSphere Distributed Resource Scheduler (DRS). This combination not only gives fault-tolerant virtual machines a better initial placement, but also includes them in the cluster's load-balancing calculations.

When EVC is enabled on the cluster, DRS makes initial placement recommendations for fault-tolerant virtual machines, moves them during cluster load rebalancing, and allows you to assign a DRS automation level to the primary virtual machine (the secondary virtual machine always has the same setting as its associated primary virtual machine).

During initial placement or load balancing, DRS does not place more than a fixed number of primary or secondary virtual machines on a host. This limit is controlled by the advanced option das.maxftvmsperhost, whose default value is 4. However, if you set this option to 0, DRS ignores this restriction.
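The effect of das.maxftvmsperhost can be expressed as a small check. This Python sketch only mirrors the rule as stated above; the function name and its arguments are hypothetical and are not part of any VMware API.

```python
DEFAULT_MAX_FT_VMS_PER_HOST = 4  # default of das.maxftvmsperhost, per the text


def can_place_ft_vm(ft_vms_on_host: int,
                    max_ft_vms_per_host: int = DEFAULT_MAX_FT_VMS_PER_HOST) -> bool:
    """Return True if one more primary or secondary VM may be placed on the host."""
    if max_ft_vms_per_host == 0:
        return True  # a value of 0 means the limit is ignored
    return ft_vms_on_host < max_ft_vms_per_host


print(can_place_ft_vm(3))      # True: the host would reach the limit of 4
print(can_place_ft_vm(4))      # False: the host is already at the limit
print(can_place_ft_vm(10, 0))  # True: the limit is disabled
```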

Fault Tolerance requirements:

The following are the cluster, host, and virtual machine requirements that you need to know before using vSphere Fault Tolerance.

Cluster requirements for Fault Tolerance:

The following cluster requirements must be met before using Fault Tolerance.

1 Host certificate checking is enabled. How to enable it is described later in this article.

2 At least two FT-certified hosts are running the same Fault Tolerance version number or host build number (that is, the same ESXi version, including patches). The Fault Tolerance version number is displayed on the Summary tab of the host in the vSphere Client.

3 The ESXi hosts can access the same virtual machine datastores (SAN, NAS, iSCSI) and networks. See the details below.

4 Fault Tolerance logging and vMotion networking are configured. See the details below.

5 A vSphere HA cluster has been created and enabled. See "Creating a vSphere HA Cluster". vSphere HA must be enabled before you power on fault-tolerant virtual machines or add hosts to a cluster that already supports fault-tolerant virtual machines.
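Requirement 2 in the list above (at least two hosts on the same FT version or build) can be checked mechanically. The Python sketch below assumes you have already collected a mapping from host name to a version or build string; the function name, host names, and build strings are made up for illustration.

```python
from collections import Counter
from typing import Dict


def has_compatible_host_pair(host_builds: Dict[str, str]) -> bool:
    """Return True if at least two hosts report the same FT version or build."""
    counts = Counter(host_builds.values())
    return any(count >= 2 for count in counts.values())


# Hypothetical inventory: two hosts share a build, one differs.
inventory = {"esxi-01": "build-A", "esxi-02": "build-A", "esxi-03": "build-B"}
print(has_compatible_host_pair(inventory))  # True
```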

Host requirements for Fault Tolerance:

The following host requirements must be met before using Fault Tolerance.

1 The processors on the hosts must come from an FT-compatible processor group. In addition, it is strongly recommended that the processors of the hosts be compatible with each other. Information about supported processors can be found at http://kb.vmware.com/kb/1008027.

2 Hardware Virtualization (HV) must be enabled in the BIOS of each host.

Note: when a host cannot support Fault Tolerance, you can view the reason on the Summary tab of the host in the vSphere Client. Click the blue caption icon next to the Host Configured for FT field to view a list of Fault Tolerance requirements that the host does not meet.

Virtual machine requirements for Fault Tolerance:

The following virtual machine requirements must be met before using Fault Tolerance.

1 Virtual machines must be stored in virtual RDM or thick-provisioned virtual machine disk (VMDK) files. If the virtual machine is stored in a thin-provisioned VMDK file, a message appears when you try to enable Fault Tolerance, indicating that the VMDK file must be converted. To perform this conversion, the virtual machine must be powered off.

2 Virtual machine files must be stored on shared storage. Acceptable shared storage solutions include Fibre Channel, (hardware and software) iSCSI, NFS, and NAS.

3 Only virtual machines with a single vCPU are compatible with Fault Tolerance.
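The three virtual machine requirements above lend themselves to a simple pre-check. The following Python sketch is illustrative only: the VmConfig fields and the disk-type labels are invented for this example and do not correspond to any vSphere API objects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VmConfig:
    name: str
    disk_type: str           # "thick", "thin", or "virtual_rdm" (labels invented here)
    on_shared_storage: bool
    vcpu_count: int


def ft_requirement_violations(vm: VmConfig) -> List[str]:
    """List the requirements from the text that this virtual machine does not meet."""
    problems = []
    if vm.disk_type not in ("thick", "virtual_rdm"):
        problems.append("disk must be a virtual RDM or a thick-provisioned VMDK")
    if not vm.on_shared_storage:
        problems.append("virtual machine files must be on shared storage")
    if vm.vcpu_count != 1:
        problems.append("virtual machine must have exactly one vCPU")
    return problems


candidate = VmConfig("web01", disk_type="thin", on_shared_storage=True, vcpu_count=2)
for problem in ft_requirement_violations(candidate):
    print(problem)
```

An empty list means the virtual machine passes these three checks; it does not cover the cluster and host requirements.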

The specific steps are as follows:

Prepare clusters and hosts for Fault Tolerance:

To enable Fault Tolerance for a cluster, you must meet the prerequisites for this feature and then perform specific configuration steps on the hosts. After completing these steps and creating the cluster, you can also check that the configuration meets the requirements for enabling Fault Tolerance.

Before enabling Fault Tolerance for the cluster, complete the following tasks:

1 Enable host certificate checking (if you are upgrading from a previous version of vCenter Server).

2 Configure the network for each host.

3 Create a vSphere HA cluster, add hosts, and check for compliance.

After you have prepared the cluster and hosts for Fault Tolerance, you can turn on Fault Tolerance for virtual machines.

1. Enable host certificate check

Using host certificate checking, ESXi hosts can be configured to authenticate each other, ensuring that a more secure environment is maintained. Host certificate checking is required for the ESXi host where the fault-tolerant virtual machine resides.

Steps

1 Connect vSphere Client to vCenter Server.

2 Select Administration, and then select vCenter Server Settings. The vCenter Server Settings window appears.

3 Click SSL Settings in the left pane.

4 Select the vCenter requires verified host SSL certificates check box.

5 Click OK.

2. Configure the network for the host

On each host that you want to add to the vSphere HA cluster, you must configure two different network switches so that the host supports both vMotion and vSphere Fault Tolerance.

Note: in practice, plan for at least three network cards on each ESXi host: one for communication with vCenter Server and bridged virtual machine traffic, one for vMotion, and one for Fault Tolerance logging.

For vMotion port groups, refer to the vMotion migration documentation.

Prerequisites

Multiple gigabit network interface cards (NICs) are required. Each host that supports Fault Tolerance needs a minimum of two physical gigabit NICs: for example, one dedicated to Fault Tolerance logging and another dedicated to vMotion. VMware recommends three or more NICs to ensure availability.

Note that the vMotion and FT logging NICs must be on different subnets, and the FT logging NIC does not support IPv6.
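The subnet and IPv6 constraints in the note above can be verified before you configure the port groups. The sketch below uses Python's standard ipaddress module; the function name and the example addresses are assumptions made for illustration.

```python
import ipaddress
from typing import List


def ft_network_problems(vmotion_cidr: str, ft_logging_cidr: str) -> List[str]:
    """Check the planned vMotion and FT logging addresses against the stated rules."""
    vmotion = ipaddress.ip_interface(vmotion_cidr)
    ft_logging = ipaddress.ip_interface(ft_logging_cidr)
    problems = []
    if ft_logging.version != 4:
        problems.append("FT logging NIC must use IPv4")
    # Networks of different IP versions cannot be compared, so check versions first.
    if vmotion.version == ft_logging.version and vmotion.network.overlaps(ft_logging.network):
        problems.append("vMotion and FT logging must be on different subnets")
    return problems


print(ft_network_problems("192.168.10.11/24", "192.168.20.11/24"))  # []
print(ft_network_problems("192.168.10.11/24", "192.168.10.12/24"))
```

The second call reports that both interfaces sit in the same subnet, which the note above rules out.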

Create a port group dedicated to Fault Tolerance logging:

Steps

1 Connect vSphere Client to vCenter Server.

2 In the vCenter Server inventory, select the host, and then click the Configuration tab.

3 Select Networking under Hardware, and then click Add Networking. The Add Network Wizard appears.

4 Select VMkernel under Connection Types, and then click Next.

5 Select Create a vSphere standard switch, and then click Next.

6 Provide a label for the switch.

7 Select Use this port group for Fault Tolerance logging, and then click Next.

8 Provide the IP address and subnet mask, and then click Next.

9 Click finish.

Repeat these steps on the other ESXi hosts to create their port groups for Fault Tolerance logging.

To confirm that vMotion and Fault Tolerance are successfully enabled on a host, view the Summary tab for the host in the vSphere Client. In the General pane, the vMotion Enabled and Host Configured for FT fields should both show Yes.

3. Create vSphere HA clusters and check for compliance

vSphere Fault Tolerance is used in a vSphere HA cluster environment. After configuring the network on each host, create a vSphere HA cluster and add the hosts to it. You can then check whether the cluster is configured correctly and meets the requirements for successfully enabling Fault Tolerance.

Steps

1 Connect vSphere Client to vCenter Server.

2 Create a cluster and add hosts to it.

3 In the vCenter Server inventory, select the cluster, and then click the Profile Compliance tab.

4 Click Check Compliance Now to run the compliance tests.

To view the tests that were run, click Description.

The results of the compliance test are displayed at the bottom of the screen. The host will be marked as compliant or non-compliant.

Provide Fault Tolerance for virtual machines:

After you have taken all the steps required to enable vSphere Fault Tolerance for the cluster, you can turn on the Fault Tolerance feature for individual virtual machines.

The option to turn on Fault Tolerance is unavailable (grayed out) if any of the following conditions is true:

1 The host where the virtual machine resides does not have a license for this feature.

2 The host where the virtual machine resides is in maintenance mode or standby mode.

3 The virtual machine is disconnected or orphaned (its .vmx file cannot be accessed).

4 The user does not have permission to turn on this feature.
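The four conditions above combine into a single availability check. This Python sketch only restates them; the FtContext fields and their string values are hypothetical and do not map to real vSphere API properties.

```python
from dataclasses import dataclass


@dataclass
class FtContext:
    host_licensed: bool
    host_mode: str            # "normal", "maintenance", or "standby" (labels invented here)
    vm_connection: str        # "connected", "disconnected", or "orphaned"
    user_has_permission: bool


def turn_on_ft_available(ctx: FtContext) -> bool:
    """Return True only when none of the four blocking conditions applies."""
    return (
        ctx.host_licensed
        and ctx.host_mode not in ("maintenance", "standby")
        and ctx.vm_connection not in ("disconnected", "orphaned")
        and ctx.user_has_permission
    )


ready = FtContext(True, "normal", "connected", True)
in_maintenance = FtContext(True, "maintenance", "connected", True)
print(turn_on_ft_available(ready))           # True
print(turn_on_ft_available(in_maintenance))  # False
```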

Turn on Fault Tolerance for the virtual machine

You can turn on vSphere Fault Tolerance through the vSphere Client.

Steps

1 Select the Hosts and Clusters view.

2 Right-click a virtual machine and select Fault Tolerance > Turn On Fault Tolerance.

If you select more than one virtual machine, the Fault Tolerance menu is disabled. Fault Tolerance can only be turned on for one virtual machine at a time.

The selected virtual machine is designated as the primary virtual machine, and a secondary virtual machine is created on another host. The primary virtual machine now has Fault Tolerance enabled.

View information about fault-tolerant virtual machines

You can use the vSphere Client to view the fault-tolerant virtual machines in the vCenter Server inventory.

Note: you cannot disable Fault Tolerance from a secondary virtual machine.

The vSphere Fault Tolerance pane is displayed on the Summary tab of the primary virtual machine and contains information about the virtual machine.
