What happens when a vSAN host fails? This article walks through how vSAN detects the failure, how the repair timer works, and what happens when the failed host comes back.
This situation differs slightly from a disk failure. When a disk fails, vSAN knows what happened, recognizes that the disk cannot be recovered, and immediately triggers a component rebuild. With a host failure, however, vSAN cannot tell what happened, so the affected components enter the "absent" state. Once vSAN notices that a component (the VMDK in this example) is absent, a 60-minute timer starts. If the component comes back within 60 minutes, vSAN resynchronizes the mirror copy. If it does not, vSAN creates a new mirror copy elsewhere. Note that this timeout can be reduced by changing the advanced setting VSAN.ClomRepairDelay.
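A minimal sketch of adjusting that timeout, assuming shell access to each ESXi host (the path /VSAN/ClomRepairDelay corresponds to the VSAN.ClomRepairDelay setting above; 30 is only an example value, in minutes):

esxcfg-advcfg -g /VSAN/ClomRepairDelay    # read the current repair delay (default: 60 minutes)
esxcfg-advcfg -s 30 /VSAN/ClomRepairDelay # example: shorten the delay to 30 minutes

The setting is per host, so it has to be applied consistently on every host in the cluster; some references also restart the clomd service afterward (/etc/init.d/clomd restart) for the change to take effect, which is worth verifying against your vSAN version.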
If the previously failed host recovers and rejoins the cluster, vSAN checks the rebuild status of the affected objects. If an object has already been rebuilt on one or more other nodes, nothing further happens. If the rebuild is still in progress, the components on the recovered host are resynchronized as well, in case something goes wrong with the new components. Once everything is in sync, the components on the recovered host are discarded and the newly created copies take over. If, however, the new copies fail to complete synchronization for some reason, the original components on the recovered host remain in use.
Note: when a host fails, all virtual machines running on it are restarted by vSphere HA. vSphere HA may restart those virtual machines on any available host in the cluster, regardless of whether that host holds any of their vSAN components.
Supplementary tuning notes (from https://blog.51cto.com/roberthu/2049330):
vSAN 6.2 advanced parameter tuning
esxcfg-advcfg -s 1024 /LSOM/heapSize
esxcfg-advcfg -s 180 /VSAN/ClomMaxComponentSizeGB
esxcfg-advcfg -s 512 /LSOM/blPLOGCacheLines (default 128K, raised to 512K)
esxcfg-advcfg -s 32 /LSOM/blLLOGCacheLines (default 128K, set to 32K as given in the source)
These parameters must be changed before virtual machines are deployed on the host.
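As a quick sanity check (a sketch, not part of the original notes), the current values can be read back with the -g flag:

esxcfg-advcfg -g /LSOM/heapSize
esxcfg-advcfg -g /VSAN/ClomMaxComponentSizeGB
esxcfg-advcfg -g /LSOM/blPLOGCacheLines
esxcfg-advcfg -g /LSOM/blLLOGCacheLines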
Appendix:
The meaning of congestion
Congestion is a feedback mechanism by which a vSAN disk group reduces the rate of inbound IO requests it accepts from the vSAN DOM client layer. The reduction is driven by IO latency: a bottleneck in the lower layer causes latency, and the congestion mechanism shifts that latency from the lower layer up to the incoming traffic without changing the total throughput of the system. This avoids unnecessary queuing and tail-drop queues in the vSAN LSOM layer, and thus avoids wasting CPU cycles processing IO requests that would eventually be discarded. Whatever the type of congestion, temporary and small congestion values are usually harmless. Persistent and large congestion values, however, lead to higher latency and lower throughput than expected, and should be investigated and addressed to restore baseline performance.
Reporting of congestion
vSAN measures and reports congestion as a scalar value between 0 and 255. The IO delay introduced grows exponentially with the congestion value.
Dealing with congestion
Check whether the congestion is persistent and high (> 50). In many cases, high congestion values are caused by misconfiguration or by poorly performing hardware. If the congestion value is consistently high, check the following (a command sketch for the first item follows this list):
1. The maximum queue depth supported by the IO controller and devices. A maximum queue depth of less than 100 can cause problems. Check that the controller is certified and listed on the vSAN HCL.
2. An incorrect firmware or device-driver version. Refer to the VMware HCL for vSAN-compatible versions.
3. Incorrect sizing. Undersized cache-tier disks or memory can result in high congestion values.
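A sketch for the first item, assuming shell access to the host (the "Device Max Queue Depth" field appears in the output of esxcli storage core device list):

esxcli storage core device list | grep -iE "Display Name|Device Max Queue Depth"

This prints each device together with its maximum queue depth, which can then be compared against the 100 threshold above.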
If none of the above applies, further debugging is needed to determine whether the workload baseline can be adjusted to reduce congestion. In particular, note whether:
1. all disk groups are congested, or
2. the congestion value of one or two disk groups is abnormally higher than that of the others.
In case (1), the vSAN cluster back end is most likely unable to handle the IO workload. If possible, adjust the baseline in one of the following ways:
1. shut down some virtual machines, or
2. reduce the number of outstanding IOs/threads per virtual machine, or
3. for write workloads, reduce the size of the working set.
In case (2), where congestion on one disk group is much higher than on the other disk groups in the system, write IO activity is unevenly distributed across disk groups. If this persists, try increasing the number of disk stripes in the vSAN storage policy used to create the virtual machine disks.
Common types of reported congestion and their solutions
The types of congestion and remedies for each type are listed below:
1. SSD congestion: this usually occurs when the active working set of write IO for a particular disk group is significantly larger than that disk group's cache tier. In both hybrid and all-flash vSAN clusters, data is first written to the write cache (also known as the write buffer). A process called destaging moves data from the write buffer to the capacity disks. Because the write cache sustains a high write rate, write performance is not limited by the capacity disks. However, if the benchmark fills the write cache faster than the destaging process can drain it, SSD congestion is raised to tell the vSAN DOM client layer to slow IO down to a rate the disk group can handle.
Remedy: to avoid SSD congestion, resize the virtual machine disks used by the benchmark. For best results, keep the active working set below 40% of the cumulative write-cache size across all disk groups. Note that in a hybrid vSAN cluster the write cache is 30% of the cache-tier disk size; in an all-flash cluster the write cache equals the cache-tier disk size, but never exceeds 600 GB.
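As a hypothetical sizing example (the numbers are invented for illustration): a hybrid cluster with 4 disk groups, each with a 400 GB cache-tier SSD, has 4 × 400 GB × 30% = 480 GB of write cache in total, so the active working set should stay below 40% × 480 GB = 192 GB.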
2. Log congestion: this usually occurs when vSAN LSOM logs (metadata for IO operations that have not yet been destaged) consume a large amount of space in the write cache. Typically, a large number of small writes to a small working set produces many LSOM log entries, which leads to this type of congestion. In addition, if the benchmark does not issue 4K-aligned IO, the vSAN stack issues extra IOs to achieve 4K alignment, and the increased IO count can also cause log congestion.
Remedy: check that the benchmark issues IO requests aligned on 4K boundaries. If it does not, fix the alignment. Otherwise, check whether the benchmark uses a very small working set (the working set is considered small when the total size of the accessed virtual machine disks is less than 10% of the cache-tier size; see above for how the cache size is calculated). If so, increase the working set to about 40% of the cache-tier size. If neither condition holds, reduce the write traffic in one of two ways: lower the number of outstanding IOs issued by the benchmark, or reduce the number of virtual machines the benchmark creates.
3. Component congestion: this congestion indicates that some components have a large number of outstanding commit operations because their IO requests are queued, which can increase latency. It is typically caused by a large volume of writes concentrated on a few virtual machine disks.
Remedy: increase the number of virtual machine disks used by the benchmark, and make sure the benchmark does not direct its IO at only a small number of disks.
4. Memory and slab congestion: this usually means that the heap memory or slab space used by the vSAN LSOM layer is insufficient to maintain its internal data structures. vSAN reserves a fixed amount of system memory for internal operations, and a benchmark that issues IO aggressively and without limits can exhaust the memory allocated to vSAN.
Remedy: reduce the working set of the benchmark. Alternatively, increase the following settings on the test bed to enlarge the memory reserved for the vSAN LSOM layer. Note that these settings apply per disk group, and they are not recommended on production clusters. They can be changed through esxcli (see Knowledge Base article 1038578), as listed below (a command sketch follows the list):
/LSOM/blPLOGCacheLines: default 128K, increase to 512K
/LSOM/blPLOGLsnCacheLines: default 4K, increase to 32K
/LSOM/blLLOGCacheLines: default 128K, set to 32K (value as given in the source)
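A sketch of how these could be set with esxcli, assuming (as in the esxcfg-advcfg commands earlier) that values are specified in units of K, i.e. 512 stands for 512K; verify on a test host first:

esxcli system settings advanced set -o /LSOM/blPLOGCacheLines -i 512
esxcli system settings advanced set -o /LSOM/blPLOGLsnCacheLines -i 32
esxcli system settings advanced set -o /LSOM/blLLOGCacheLines -i 32
esxcli system settings advanced list -o /LSOM/blPLOGCacheLines   # read back the current value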