VMware vSphere 5.1 Cluster in-depth Analysis (27)

VMware vSphere 5.1 Clustering Deepdive

HA, DRS, Storage DRS, Stretched Clusters

Duncan Epping & Frank Denneman

Translated by Tim2009

Contents

Copyright

About the Authors

Knowledge Points

Preface

Part I: vSphere High Availability

Chapter 1: Introduction to vSphere High Availability

Chapter 2: High Availability Components

Chapter 3: Fundamental Concepts

Chapter 4: Restarting Virtual Machines

Chapter 5: Adding Resiliency to HA (Network Redundancy)

Chapter 6: Admission Control

Chapter 7: VM and Application Monitoring

Chapter 8: Integration

Chapter 9: Summary

Part II: vSphere DRS (Distributed Resource Scheduler)

Chapter 1: Introduction to vSphere DRS

Chapter 2: vMotion and EVC

Chapter 3: DRS Dynamic Entitlement

Chapter 4: Resource Pools and Controls

Chapter 5: Calculating DRS Recommendations

Chapter 6: Guiding DRS Recommendations

Chapter 7: Introduction to DPM

Chapter 8: Calculating DPM Recommendations

Chapter 9: Guiding DPM Recommendations

Chapter 10: Summary

Part III: vSphere Storage DRS

Chapter 1: Introduction to vSphere Storage DRS

Chapter 2: Storage DRS Algorithms

Chapter 3: Storage I/O Control (SIOC)

Chapter 4: Datastore Configuration

Chapter 5: Datastore Architecture and Design

Chapter 6: Impact on Storage vMotion

Chapter 7: Affinity

Chapter 8: Datastores and Maintenance Mode

Chapter 9: Summary

Part IV: Stretched Cluster Architecture

Chapter 1: Stretched Cluster Architecture

Chapter 2: vSphere Configuration

Chapter 3: Failure Scenarios

Chapter 4: Summary

Chapter 5: Appendix

(After a trip to Wuhan over the Dragon Boat Festival holiday, updates now continue; please bear with the delay.)

Part IV: Stretched Cluster Architecture

Chapter 1: Stretched Cluster Architecture

In this section we continue the discussion of a specific infrastructure and look at how HA, DRS and Storage DRS can be leveraged and deployed to increase availability, both for your workloads and for the resources provided to them. We will guide you through some of the design considerations and decision points along the way. Of course, to make the appropriate decisions about implementation details, a full understanding of your environment is required. In any case, we hope this section gives you a proper understanding of how certain features fit together and how to build toward your ideal architecture as requirements arrive in your environment.

Scenario

The scenario we have chosen is a stretched cluster, also referred to as a VMware vSphere Metro Storage Cluster (vMSC) solution. We chose this scenario because it allows us to explain a multitude of design and architectural considerations. Although this scenario has been tested and validated in our lab environment, every environment is unique, and our recommendations are based on our experience; yours may differ.

A VMware vSphere Metro Storage Cluster (vMSC) configuration is a VMware vSphere 5 certified solution that combines clustering with synchronous storage array replication. This solution is typically deployed in environments where the distance between the data centers is limited, often metropolitan or campus environments.

The main advantage of the stretched cluster model is that it enables fully active, workload-balanced data centers. Many customers find this attractive because virtual machines can be migrated between sites with vMotion and Storage vMotion, enabling on-demand, non-intrusive movement of workloads across sites. The ability of the cluster to provide this freedom of active load balancing should be the primary design and implementation goal.

The benefits of a stretched cluster solution include:

Workload mobility

Automated load balancing across sites

Enhanced downtime avoidance

Disaster avoidance

Technical requirements and constraints

Due to the technical constraints of live migration of virtual machines, there are specific requirements that must be considered for a stretched cluster implementation. These requirements are listed in the storage section of the VMware Hardware Compatibility Guide and are summarized below (a small validation sketch follows the list):

Storage connectivity using Fibre Channel, iSCSI, SVD (Storage Virtualization Devices) and FCoE is supported.

Storage connectivity using NAS (NFS protocol) is not supported in a vMSC configuration at the time of writing (August 2012).

The maximum supported network latency between sites for the ESXi management networks is 10 ms round-trip time (RTT).

Note that 10 ms of latency for vMotion (Metro vMotion) is only supported with Enterprise Plus licenses.

The maximum supported latency for synchronous storage replication links is 5 ms RTT; typically, your storage vendor will specify the maximum RTT they allow.

The ESXi vMotion network requires a minimum of 622 Mbps of bandwidth, with redundant links.
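To make these limits concrete, below is a minimal pre-deployment check written as a plain Python sketch. The numeric thresholds are taken from the list above; the function name, structure and example measurements are hypothetical and only meant to illustrate how the constraints relate.

# Minimal sketch: check a proposed inter-site link against the vMSC limits listed above.
# Thresholds come from the list; the example measurements below are invented.

MAX_MGMT_VMOTION_RTT_MS = 10.0     # ESXi management network / Metro vMotion (Enterprise Plus)
MAX_SYNC_REPLICATION_RTT_MS = 5.0  # synchronous storage replication link
MIN_VMOTION_BANDWIDTH_MBPS = 622   # redundant vMotion links

def check_link(mgmt_rtt_ms, replication_rtt_ms, vmotion_bandwidth_mbps):
    """Return a list of violations; an empty list means the documented limits are met."""
    violations = []
    if mgmt_rtt_ms > MAX_MGMT_VMOTION_RTT_MS:
        violations.append(f"Management/vMotion RTT {mgmt_rtt_ms} ms exceeds {MAX_MGMT_VMOTION_RTT_MS} ms")
    if replication_rtt_ms > MAX_SYNC_REPLICATION_RTT_MS:
        violations.append(f"Replication RTT {replication_rtt_ms} ms exceeds {MAX_SYNC_REPLICATION_RTT_MS} ms")
    if vmotion_bandwidth_mbps < MIN_VMOTION_BANDWIDTH_MBPS:
        violations.append(f"vMotion bandwidth {vmotion_bandwidth_mbps} Mbps is below {MIN_VMOTION_BANDWIDTH_MBPS} Mbps")
    return violations

# Example with made-up measurements for a metro-distance link:
print(check_link(4.2, 3.1, 1000) or "Link meets the documented vMSC limits")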

The storage requirements are slightly more complex than for a typical synchronous storage replication solution. A vSphere Metro Storage Cluster requires what is in effect a single storage subsystem that stretches across the sites. In this design, a given datastore must be accessible (readable and writable) from both sites simultaneously. Furthermore, when problems occur, the ESXi hosts must be able to continue to access the datastores from either site, without affecting ongoing storage operations.

This precludes traditional synchronous replication solutions, which create a primary/standby relationship between the active (primary) LUN, where the data is accessed, and the standby LUN, which receives the replication. In those solutions, to access the standby LUN, replication must be stopped (or reversed) and the LUN made visible to the hosts; the newly promoted standby LUN has a completely different LUN ID and is essentially a newly available copy of a former primary. That type of solution is suited to traditional disaster recovery, where virtual machines are expected to be restarted at the second site. A vMSC configuration, by contrast, requires simultaneous access from the second site without interruption, so that running virtual machines can be migrated between sites; a normal vMotion does not migrate a virtual machine's disk files.

The storage subsystem of a vMSC must be readable and writable at both sites simultaneously, and all disk writes are committed synchronously at both sites to ensure data consistency no matter where the data is read. This storage architecture demands significant bandwidth and very low latency between the sites in the cluster. Increased distance and latency cause delays in writing to disk, which degrades performance dramatically and can prevent successful vMotion between cluster sites located in different geographical areas.
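As a rough back-of-the-envelope illustration of why distance matters, the sketch below (plain Python, assumed numbers) estimates the write latency a guest would see when every write must also be acknowledged by the remote array; only the 5 ms RTT ceiling comes from the requirements above, the local service time is invented.

# Rough estimate: in a synchronously replicated vMSC, a write completes only after the
# remote array acknowledges it, so the inter-site round trip adds to the local service time.

def synchronous_write_latency_ms(local_service_ms, inter_site_rtt_ms):
    return local_service_ms + inter_site_rtt_ms

LOCAL_SERVICE_MS = 1.0             # hypothetical local array service time
for rtt in (1.0, 3.0, 5.0):        # 5 ms is the documented vMSC ceiling
    total = synchronous_write_latency_ms(LOCAL_SERVICE_MS, rtt)
    print(f"RTT {rtt:.0f} ms -> about {total:.1f} ms per write "
          f"({total / LOCAL_SERVICE_MS:.0f}x the local-only latency)")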

Uniform and Non-Uniform

vMSC solutions are classified into two categories based on a fundamental difference in how hosts access storage. It is important to understand the different types of stretched storage solutions, because they affect your design. The two main categories described in the VMware Hardware Compatibility List are:

Uniform host access configuration - ESXi hosts at both sites connect to storage nodes in the storage cluster across all sites; the paths presented to the ESXi hosts are stretched across the inter-site distance.

Non-uniform host access configuration - ESXi hosts at each site connect only to the storage node(s) at the same site; the paths presented to the ESXi hosts from the storage nodes are limited to the local site.

Let us describe both categories in more depth from an architectural and implementation perspective to make sure they are clear.

Uniform: hosts in Datacenter-A and Datacenter-B can access both the storage system in Datacenter-A and the storage system in Datacenter-B. In effect, the storage area network is stretched between the sites and all hosts can access all LUNs. In these configurations, read-write access to a LUN takes place on one of the two arrays, while a synchronous mirror is maintained hidden and read-only on the second array. For example, if a LUN containing a datastore is read-write on the array in Datacenter-A, all ESXi hosts access that datastore via the array in Datacenter-A. For the ESXi hosts in Datacenter-A this is local access; ESXi hosts in Datacenter-B that run virtual machines located on this datastore send their I/O across the inter-site network. In case of an outage, or when control of the LUN is deliberately transferred to Datacenter-B, all ESXi hosts continue to see the same LUN being presented, except that it is now accessed via the array in Datacenter-B.

Figure 160: Uniform storage architecture

As you can see, the ideal situation is for a virtual machine to access a datastore through the array that controls it (read-write) in the same data center. This minimizes traffic between the data centers and avoids the performance impact of reads traversing the interconnect.

The concept of a virtual machine's site affinity is dictated by the location of the read-write copy of the datastore. "Site affinity" is sometimes called "site bias" or "LUN locality"; it means that when a virtual machine has affinity with Datacenter-A, it reads from and writes to the copy of the datastore in Datacenter-A. This is explained in more detail in the DRS chapter.

Non-uniform: hosts in Datacenter-A have access only to the array local to their data center, and that array (together with its peer array in the opposite data center) is responsible for providing access to the full set of datastores. Most solutions in this category rely on a concept sometimes called a virtual LUN, which allows the ESXi hosts in each data center to read and write to the same datastore/LUN.

The key point is that even if two virtual machines reside on the same datastore but are located in different data centers, they write locally. A crucial aspect of this configuration is defining a site affinity (site affinity is again sometimes referred to as "site bias" or "LUN locality") for each LUN/datastore. In other words, if anything happens to the link between the sites, the storage system on the preferred site for a given datastore will be the only one left with read-write access to it. This, of course, is to prevent data corruption in a failure scenario.

Figure 161: Non-uniform storage architecture
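To make the idea of site affinity more tangible, here is a minimal plain-Python sketch (hypothetical host and datastore names) of how one might record which site owns the read-write copy of each datastore and check whether a virtual machine's placement keeps its I/O local.

# Hypothetical model of datastore site affinity ("site bias" / "LUN locality").
# Keeping a VM on a host in the site that owns the read-write copy of its datastore
# keeps I/O local; otherwise the traffic crosses the inter-site link.

datastore_affinity = {              # datastore -> site owning the read-write copy
    "Frimley01": "Frimley",
    "Bluefin01": "Bluefin",
}

host_site = {                       # ESXi host -> site where it physically resides
    "esxi01.frimley.local": "Frimley",
    "esxi03.bluefin.local": "Bluefin",
}

def io_is_local(host, datastore):
    """True when the host sits in the same site as the datastore's read-write copy."""
    return host_site[host] == datastore_affinity[datastore]

print(io_is_local("esxi01.frimley.local", "Frimley01"))  # True: I/O stays within Frimley
print(io_is_local("esxi03.bluefin.local", "Frimley01"))  # False: I/O crosses the inter-site link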

As the uniform configuration is the most commonly deployed today, we use uniform storage in our test case. Note that many of the design considerations also apply to non-uniform configurations; where that is not the case, we will point it out.

Scenario Architecture

In this section we describe the architecture configured for this scenario. We also discuss some of the basic configuration and the various vSphere features in more detail; for in-depth explanations of each feature, refer to the HA and DRS sections of this book. We make recommendations based on VMware best practices and operational guides, and in our failure scenarios we explain how, in practice, downtime can be prevented or limited.

Architecture

The scenario's architecture consists of a single vSphere 5.1 cluster with four ESXi hosts, managed by a vSphere vCenter Server. We decided to use vSphere 5.1 in order to test the improved handling of the Permanent Device Loss (PDL) scenario introduced in vSphere 5.0 Update 1. These enhancements are aimed primarily at stretched cluster environments, and we discuss them in more detail in the vSphere HA portion of this part. It is worth noting that PDL behavior has not changed in vSphere 5.1.

For our testing purposes we simulated a customer environment with two sites, the first called Frimley and the second called Bluefin. The network between the Frimley data center and the Bluefin data center is a stretched layer 2 network with minimal distance between the sites, as is typical of campus cluster scenarios. The vCenter Server and the virtual machines all run on the same cluster.

There are two ESXi hosts at each site, and the vCenter Server instance, located in the Bluefin data center, is configured with vSphere DRS to manage the hosts. Only one vCenter Server instance is used in a stretched cluster environment, unlike a traditional VMware Site Recovery Manager configuration, which requires two vCenter Server instances. The configuration of VM-Host affinity rules is discussed in Chapter 15. In our scenario, iSCSI is used as the main protocol.
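As a companion to the VM-Host affinity rules mentioned above, the following plain-Python sketch models the intent of per-site "should run on" rules; the rule names, VM names and host names are hypothetical, and this is a conceptual model rather than the vSphere API itself.

# Plain-Python model of DRS VM-Host "should run on" rules that keep virtual machines
# in the site owning the read-write copy of their datastore (all names are invented).

from dataclasses import dataclass

@dataclass
class VmHostShouldRule:
    name: str
    vm_group: list           # VMs associated with the site
    host_group: list         # hosts located in that site
    mandatory: bool = False  # "should" rule: DRS/HA may violate it in a failure

rules = [
    VmHostShouldRule("Frimley-VMs-to-Frimley-hosts",
                     vm_group=["vm-app01", "vm-db01"],
                     host_group=["esxi01.frimley.local", "esxi02.frimley.local"]),
    VmHostShouldRule("Bluefin-VMs-to-Bluefin-hosts",
                     vm_group=["vm-app02", "vm-db02"],
                     host_group=["esxi03.bluefin.local", "esxi04.bluefin.local"]),
]

for r in rules:
    kind = "must" if r.mandatory else "should"
    print(f"{r.name}: {len(r.vm_group)} VMs {kind} run on {len(r.host_group)} hosts")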

The vSphere 5.1 cluster is connected to a NetApp MetroCluster in a fabric configuration using the uniform device access model. This configuration is described in depth in NetApp's technical white paper "TR-3548". It means that every host in the cluster is connected to storage nodes at both sites: each node is connected to two fabric switches, and the nodes at the second site are likewise connected to two similar switches. For any given LUN, one of the two storage nodes presents the LUN as read-write via iSCSI.

The opposite storage node maintains the replicated, read-only copy, which is effectively hidden from the ESXi hosts until it is needed.

With NetApp MetroCluster, an iSCSI connection is bound to a specified virtual IP address, which the ESXi hosts use to connect to the storage controller. In a failure scenario, this IP address shifts to the opposite storage controller, allowing seamless access to storage without requiring reconfiguration of the target's IP address.

A total of eight LUNs were created: four accessed through a virtual iSCSI IP address in the Frimley data center, and another four accessed through a virtual iSCSI IP address in the Bluefin data center.
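The resulting layout can be summarized with the small sketch below; the datastore names and IP addresses are invented for illustration, and only the four-plus-four split and the per-site virtual iSCSI IPs come from the text.

# Hypothetical summary of the scenario's datastore layout: four LUNs with site affinity
# to Frimley and four to Bluefin, each reached via that site's virtual iSCSI IP address.

lun_layout = {
    "Frimley": {"virtual_iscsi_ip": "10.1.1.10",
                "datastores": ["Frimley01", "Frimley02", "Frimley03", "Frimley04"]},
    "Bluefin": {"virtual_iscsi_ip": "10.1.2.10",
                "datastores": ["Bluefin01", "Bluefin02", "Bluefin03", "Bluefin04"]},
}

for site, cfg in lun_layout.items():
    print(f"{site}: {len(cfg['datastores'])} datastores presented via {cfg['virtual_iscsi_ip']}")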

Figure 162: infrastructure

Table 27: infrastructure
