The ingenious design of the simple, highly available ZStack Mini

Updated 2026-02-07. From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31 report

This article looks at the ingenious design behind the simple, highly available ZStack Mini. The editor finds it very practical and shares it here in the hope that readers will take something away from it.

ZStack Mini products have "4S" features. Three of them are Simple, Scalable, and Smart. This article is about the fourth "S": Strong (robust).

Both the Chinese word for "robust" and the English word "strong" usually describe the physique of a person or animal. They imply a greater ability to survive, or to recover quickly after an injury or the loss of some physiological function.

Because of this connotation, the term is also borrowed to describe IT systems and applications. A robust application or IT system is not completely "disabled" by a bug or hardware failure; it keeps working and returns to normal quickly.

The robustness of an IT system can be described in terms of "RAS": Reliability, Availability, and Serviceability. Put simply, reliability means that the components making up the IT system are of good quality and unlikely to fail. Availability means that even if one or more components fail, the application keeps running normally. Serviceability (maintainability) means that when a component or system fails, a replacement mechanism can be activated immediately so that the failed component or system quickly returns to normal.

Reliability and serviceability design therefore serve availability design to a certain extent: their purpose is to improve availability and keep the business running continuously, with as little interruption as possible.

IT system availability is usually expressed as a number of "9s", such as five 9s or six 9s. The figure is the percentage of time the system is available, and it corresponds to an allowable downtime, usually calculated per year.

Two examples: one 9, or 90% availability, allows no more than 36.5 days of downtime per year; five 9s, or 99.999% availability, allow roughly 5.26 minutes of downtime per year, that is, under five and a half minutes.

Figure note: a comparison table of availability and allowable downtime compiled by E Enterprise Research Institute. For every additional 9 of availability, the allowable downtime per unit of time shrinks to 1/10. For example, four 9s (99.99%) allow close to 53 minutes of downtime per year, while five 9s allow less than 5.5 minutes per year.
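The relationship between the number of 9s and the yearly downtime budget can be reproduced with a few lines of Python. This is only a sketch: the function name `allowed_downtime_minutes` is made up for illustration, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(nines: int) -> float:
    """Return the yearly downtime budget in minutes for an availability of N nines."""
    unavailability = 10 ** (-nines)  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(1, 8):
        availability_pct = (1 - 10 ** (-n)) * 100
        minutes = allowed_downtime_minutes(n)
        print(f"{n} nine(s): {availability_pct:.5f}% -> {minutes:,.3f} min/year")
```

Each extra 9 divides the budget by ten, which is why one 9 leaves 36.5 days and five 9s leave only a little over five minutes.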

Because steps such as the power-on self-test (POST) take a long time, a single server restart may take more than 5 minutes. This means that if the server goes down just once a year, five 9s of availability cannot be guaranteed even if recovery starts immediately. However, a hardware single point of failure (SPOF) cannot be avoided completely, and some software failures can only be resolved by restarting the server, so a "2N" system has become a common way to ensure high availability.

For example, two identical systems run the same application and access the same data, usually as one primary and one standby (active-passive). When the primary system fails, the standby system takes over. Because the standby has been running all along, it does not need to go through the time-consuming hardware and software startup. In theory, the length of the service interruption depends only on how fast the switch from primary to standby happens, so not just five 9s but even six or seven 9s are achievable.
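To see why the switchover time matters so much, the following sketch converts "outages per year" and "minutes lost per outage" into a number of whole 9s. The helper names and the sample durations (a 6-minute restart, a 30-second and a 3-second switchover) are illustrative assumptions, not measurements from the article.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # assuming a 365-day year


def unavailability(outages_per_year: float, minutes_per_outage: float) -> float:
    """Fraction of the year the service is down."""
    return outages_per_year * minutes_per_outage / MINUTES_PER_YEAR


def whole_nines(unavail: float) -> int:
    """Number of complete 9s implied by an unavailability fraction."""
    if unavail <= 0:
        raise ValueError("unavailability must be positive")
    return math.floor(-math.log10(unavail))


if __name__ == "__main__":
    # Hypothetical scenarios: one outage per year, different recovery times.
    scenarios = {
        "single server, 6 min restart": 6.0,
        "2N, 30 s switchover": 0.5,
        "2N, 3 s switchover": 0.05,
    }
    for label, minutes in scenarios.items():
        u = unavailability(1, minutes)
        print(f"{label}: {whole_nines(u)} nines")
```

Under these assumed numbers, a single restart per year already drops the system below five 9s, while a switchover measured in seconds leaves room for six or seven.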

The theory is simple, but the implementation is complex. For instance, how can the data and state of the two systems be kept as consistent as possible so that the switchover is fast?

The traditional architecture separates computing from storage. It requires at least two servers connected to a dual-controller storage system. Applications are synchronized between the two servers, and the high availability of data is handled by the dual-controller storage system, which usually uses dual-ported storage media (such as FC or SAS hard drives) and switches control of data access when necessary (for example, when one of the controllers fails). The network subsystem is usually doubly redundant as well, so the whole solution is very complex. Dual-ported hard disks reduce the data synchronization workload, but they are often regarded as specialized equipment, which runs against the trend of combining standardized hardware with "software-defined" functionality.

By running software-defined storage on industry-standard servers, hyper-converged infrastructure (HCI) unifies computing and storage, makes expansion more flexible, and reduces the complexity of deployment, operation, and maintenance. However, the distributed software-defined storage in most hyper-converged systems uses a three-copy mechanism to avoid data loss. Taking maintainability into account as well, these systems usually start at three or four nodes, which in effect raises the purchasing threshold for users.

In other words, leaving network devices aside, whether computing and storage are separated or integrated, a small-scale highly available deployment of either kind typically needs no fewer than 3 devices or nodes. For example, the 2U 4-node servers commonly used in hyper-converged systems count as 4 servers.

Architecturally, ZStack Mini combines characteristics of both approaches. On the one hand, it is a hyper-converged system that integrates computing and storage. On the other hand, the storage subsystem of each node is based on the RAID technology commonly used in traditional storage systems.

Interestingly, with this combination ZStack Mini requires only two servers at minimum, that is, one 2U 2-node server. Although both are 2U multi-node designs, a 2U 2-node server can cost much less than a 2U 4-node server, which makes it significantly easier for users to accept.

So how does ZStack Mini ensure high availability of data and applications with (at least) 2 nodes? What about the utilization of storage space? Please take a look at our analysis below.

Minimalist architecture helps improve reliability

Reliability is an integral part of availability. Reliable components that run stably for a long time contribute to the overall availability of the system, but reliability is constrained by cost. A "high-cost, high-availability" system is not pointless, but its threshold is too high.

ZStack Mini inherits from the ZStack cloud engine, and its product form (2U 2-node) resembles that of 2U 4-node hyper-converged products: a 2U chassis, dual redundant power supplies, and almost the same footprint. Both can serve as the minimum deployment unit (a 3-node or 4-node hyper-converged system uses a 2U 4-node chassis), but the ZStack Mini architecture, with only 2 nodes, is undoubtedly simpler.

One picture shows ZStack Mini, a 2U chassis with two built-in server nodes; the other shows a hyper-converged all-in-one appliance built on the more mainstream 2U 4-node design. The 2U 4-node appliance clearly contains many more hardware components than the 2U 2-node ZStack Mini. Its layout is also more compact, so each node faces greater challenges in areas such as expandability and heat dissipation.

Whether it is ZStack Mini or a 2U 4-node hyper-converged appliance, there is a variety of IT hardware inside, and every type of hardware, indeed every individual component, has a failure rate. Take the Western Digital Ultrastar DC HC310 (4 TB) hard disk widely used in ZStack Mini as an example: its annual failure rate is 0.44%. The more hardware a system uses, the greater the risk of failure.

Figure note: reliability specifications of the Ultrastar DC HC310 series hard drives as published on Western Digital's official website. The annual failure rate (AFR) in the image is 0.44%, and the MTBF, the mean time between failures, is 2 million hours.
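The two published figures are consistent with each other, and they also show how risk grows with component count. The sketch below assumes a constant failure rate (an exponential lifetime model) and independent drives; both are simplifying assumptions, and the function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def afr_from_mtbf(mtbf_hours: float) -> float:
    """Annual failure rate implied by an MTBF, assuming a constant failure rate."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def prob_any_failure(afr: float, disks: int) -> float:
    """Probability that at least one of `disks` independent drives fails in a year."""
    return 1 - (1 - afr) ** disks


if __name__ == "__main__":
    afr = afr_from_mtbf(2_000_000)
    print(f"AFR implied by a 2,000,000 h MTBF: {afr * 100:.2f}%")
    for count in (4, 8, 16):
        p = prob_any_failure(0.0044, count)
        print(f"{count} drives: {p * 100:.2f}% chance of at least one failure per year")
```

With a per-drive AFR of 0.44%, the chance that at least one drive fails in a year rises roughly in proportion to the number of drives, which is the point the article makes about systems with more hardware.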

The component most closely tied to the hard disks is the SATA/SAS RAID card. ZStack Mini uses a Broadcom RAID card with a lithium battery backup unit (BBU), which protects the data held in the RAID card's cache during a sudden power loss so that it can still be written to the hard disks.

Note: after a hard disk is unplugged, ZStack Mini automatically rebuilds the data. The corresponding monitoring page in the management console shows a "rebuilding" status, and the performance monitoring page shows continuous read and write I/O activity. Until the rebuild completes, the RAID health status remains "degraded".

This simulated test shows that any ZStack Mini node can withstand the failure of a single data disk without data loss or application downtime. The application virtual machine is unaware of the failure and continues its current task until the task completes or someone intervenes manually.

Between nodes: 2N guarantees high availability of applications

The RAID technology within a node ensures that a single disk failure does not affect the application. However, after the failed disk is replaced, traditional hardware RAID takes a long time to rebuild the data: several hours, depending on the capacity of the hard disk. If another hard disk fails during this period, data is lost and the application is interrupted. In addition, the CPU, memory, network card, and other components are not redundant, and their failure may also cause downtime. This is where the other ZStack Mini node comes into play, acting as a node-level replica.
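A rough feel for the rebuild window and the risk during it can be had from a back-of-the-envelope calculation. Everything here is an assumption for illustration: the sustained rebuild rate of 100 MB/s is hypothetical, the 4 TB capacity and 2-million-hour MTBF are the drive figures quoted earlier, and failures are modelled as independent with a constant rate.

```python
import math


def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Time to rewrite one whole disk at a given sustained rebuild rate."""
    capacity_bytes = capacity_tb * 1e12
    seconds = capacity_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


def prob_second_failure(remaining_disks: int, window_hours: float, mtbf_hours: float) -> float:
    """Chance that any remaining disk fails during the rebuild window."""
    # 1 - exp(-x), computed with expm1 to stay accurate for tiny x.
    return -math.expm1(-remaining_disks * window_hours / mtbf_hours)


if __name__ == "__main__":
    window = rebuild_hours(capacity_tb=4, rebuild_mb_per_s=100)  # assumed rate
    risk = prob_second_failure(remaining_disks=3, window_hours=window, mtbf_hours=2_000_000)
    print(f"Rebuild window: {window:.1f} h")
    print(f"Chance of a second disk failure in that window: {risk:.2e}")
```

The per-rebuild risk is small under these assumptions, but it is not zero, and it does nothing for non-disk components, which is why a second node is still needed.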

The picture shows the front of ZStack Mini, which is filled with hot-swappable 3.5-inch hard drives. The back is shown below. Almost all other components sit inside the nodes, which means that replacing any component other than a hard drive requires taking the node down.

As the saying goes, "an army is trained for a thousand days to be used for a single moment". When one node cannot work properly, the other node steps in with data and state that have been kept synchronized all along. This is what is usually called node-level high availability. To verify this feature, we set virtual machines to "highly available" and then cut power to their node without warning to see whether the applications could keep running.

Video note: node 1 of ZStack Mini is in the "rebuilding" state because one of its hard drives was unplugged in the previous test. In this test, E Enterprise Research Institute simulated a sudden power outage on this already "faulty" node to verify the high availability function of ZStack Mini.

There are four virtual machines on node 1. Three of them, "rendering Server", "Transcoding Server", and "Network Management platform", are set to high availability. For comparison, the fourth, named "CentOS7.2", does not use the high availability feature. On the transcoding server, E Enterprise Research Institute took the rendered video exported in the previous test and transcoded it using the XCoder software.

During transcoding, with about 1/3 of the job complete, node 1 was powered off directly, without any preparation, to simulate a sudden power outage. ZStack Mini then reported that node 1 had lost contact and that the "Network Management platform" was missing. Next, ZStack Mini started the "high availability" process and began migrating the application virtual machines that had the feature enabled. About a minute later, the virtual machines that had been on node 1 with "high availability" turned on were restarted on node 2.

After the "transcoding server" restarted, XCoder's earlier task progress was cleared and the task started again automatically. Our tests showed that when either node of ZStack Mini is powered off, the virtual machines with "high availability" enabled automatically migrate to the other, healthy node.
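The behaviour observed in this test can be summarised as a small placement rule. The sketch below is not ZStack code; the `VM` class and `fail_over` function are hypothetical and only model what the test showed: virtual machines with high availability enabled are restarted on the surviving node, while the others stay down with the failed node.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class VM:
    name: str
    node: str
    ha: bool              # whether "high availability" is enabled for this VM
    running: bool = True


def fail_over(vms: List[VM], failed_node: str, surviving_node: str) -> List[VM]:
    """Return the VM placement after `failed_node` loses power."""
    result = []
    for vm in vms:
        if vm.node != failed_node:
            result.append(vm)  # unaffected by the failure
        elif vm.ha:
            # Restarted (not live-migrated) on the other node, so in-flight work is lost.
            result.append(replace(vm, node=surviving_node, running=True))
        else:
            result.append(replace(vm, running=False))  # stays down with its node
    return result


if __name__ == "__main__":
    vms = [
        VM("rendering Server", "node1", ha=True),
        VM("Transcoding Server", "node1", ha=True),
        VM("Network Management platform", "node1", ha=True),
        VM("CentOS7.2", "node1", ha=False),
    ]
    for vm in fail_over(vms, failed_node="node1", surviving_node="node2"):
        state = "running" if vm.running else "stopped"
        print(f"{vm.name}: {vm.node}, {state}")
```

Modelling the move as a restart rather than a live migration matches the observation that the transcoding job began again from the start.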

These two stages of verification show that ZStack Mini offers good availability under both hard disk failure and node-level failure. The application either keeps running without interruption or resumes after a short pause, and no data is lost.

Computing and storage: efficiency and data persistence

While using this ZStack Mini, we talked to some potential users interested in the product and found one recurring question: with two nodes, one primary and one standby, availability is guaranteed, but isn't hardware utilization only half? Isn't that wasteful?

The question can be looked at from two angles: computing resources and storage resources.

At the application level, as mentioned in the previous test, the "high availability" function is optional for each virtual machine. Only when it is enabled does the virtual machine occupy computing resources on both nodes at the same time. This is the price that must be paid to keep the application running continuously.

If an application's availability requirements are not that high, "high availability" can be left disabled to avoid unnecessary waste.

From a storage perspective, all ZStack Mini user data is mirrored across the two nodes, so even if one node is completely destroyed, no data is lost. In terms of data disk utilization, the ratio is 1:1 (mirroring) between nodes and 3:1 (RAID 5 with 4 disks) within a node, so the overall efficiency is 0.5 × 0.75 = 0.375, that is, less than half.

That doesn't look high, does it? For comparison, the storage utilization of a three-copy hyper-converged system is 1/3, or about 0.333, so ZStack Mini actually has a slight advantage.
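The utilization comparison is simple enough to check in code. The helper names below are hypothetical; the only inputs are the layouts described in the text (two mirrored nodes with 4-disk RAID 5 each, versus three full copies).

```python
def raid5_efficiency(disks: int) -> float:
    """Usable fraction of a RAID 5 group: one disk's worth of capacity holds parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) / disks


def replica_efficiency(copies: int) -> float:
    """Usable fraction when every block is stored `copies` times."""
    return 1 / copies


if __name__ == "__main__":
    mirrored_raid5 = replica_efficiency(2) * raid5_efficiency(4)
    triple_copy = replica_efficiency(3)
    print(f"2-node mirror + 4-disk RAID 5: {mirrored_raid5:.3f}")
    print(f"three copies:                  {triple_copy:.3f}")
```

The same helpers make it easy to see how the figure would change with a different number of disks per RAID group.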

ZStack also explains the advantages of Mini in data persistence:

For two-copy data, persistence fails when any two disks on different computing nodes are damaged at the same time. Using Google's annual disk failure rate of 1.7% (higher than the figures published by hard disk manufacturers), the probability is 1.7% × 1.7% × (1/2) = 0.01445%, which corresponds to a data persistence of 99.98555%, that is, more than three 9s.

For two copies plus RAID 5, persistence fails only when four disks are damaged at the same time, two on each computing node (not four on one side, or three on one side and one on the other). The data persistence is therefore 1 - 1.7% × 1.7% × 1.7% × 1.7% × (18/35), or about 99.999995%, that is, more than seven 9s. Here 18/35 is the probability that, when 4 of the 8 hard drives fail at the same time, exactly 2 fail on each of the two nodes.

With three copies, data is lost as soon as any three disks holding the copies are damaged. The persistence of three copies is 1 - 1.7% × 1.7% × 1.7% = 99.99951%, that is, more than five 9s.
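The three persistence estimates can be recomputed side by side. This sketch follows the simplified model in the text (independent disks, a 1.7% annual failure rate, simultaneous failures). The 1/2 factor for two copies is taken from the formula as quoted, and the split factor for the RAID 5 case is derived here as C(4,2)² / C(8,4) = 18/35, which is an assumption based on the description of 8 drives with 2 failures on each node. Function names are hypothetical.

```python
from math import comb

ANNUAL_DISK_FAILURE = 0.017  # 1.7% per year, the Google figure cited in the text


def two_copy_loss(p: float) -> float:
    """Loss probability for two copies, using the 1/2 factor from the quoted formula."""
    return p * p * 0.5


def two_copy_raid5_loss(p: float, disks_per_node: int = 4) -> float:
    """Loss probability for two mirrored nodes, each running RAID 5."""
    total = 2 * disks_per_node
    # Of all ways 4 of the drives can fail, the share with exactly 2 on each node.
    split = comb(disks_per_node, 2) ** 2 / comb(total, 4)  # 36/70 = 18/35 for 4+4
    return p ** 4 * split


def three_copy_loss(p: float) -> float:
    """Loss probability for three copies: all three disks fail."""
    return p ** 3


if __name__ == "__main__":
    p = ANNUAL_DISK_FAILURE
    for label, loss in (
        ("two copies", two_copy_loss(p)),
        ("two copies + RAID 5", two_copy_raid5_loss(p)),
        ("three copies", three_copy_loss(p)),
    ):
        print(f"{label}: loss {loss:.3e}, persistence {(1 - loss) * 100:.7f}%")
```

Under this model, the mirrored RAID 5 layout comes out roughly two orders of magnitude more durable than three plain copies, while also using slightly less raw capacity.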

Maintainability is a general term, but it is reflected in every detail of product design.

For example, most modern x86 servers use hot-swappable, tool-free designs, which is maintainability at the hardware level. Faulty components can be replaced by hand without tools, reducing maintenance time and naturally contributing to availability (after all, availability can also be measured by downtime).

On the software side, ZStack Mini saves a lot of time during initialization. Many functions take only a mouse click, after which several related processes complete silently in the background. This, too, is maintainability in practice: it minimizes manual operations and avoids human error.

Of course, this is far from enough. Speaking about the future of ZStack Mini, ZStack introduced the upcoming ZStack Mini 3.0, which will add several major new features:

(1) Backup function

The current version 2.0 can already be set up for backup, but version 3.0 will officially introduce an external disk backup feature that takes regular backups of the system and allows those backups to be restored on new machines. In the future it will also support backup to the cloud, giving users peace of mind about their data.

(2) Improved high availability for applications

In our "node failure" test, the application virtual machine was set to be highly available, yet when the node failed it still needed a short pause before resuming service. In version 3.0, a node failure will trigger an uninterrupted switchover, and users will not notice any pause in the application virtual machine.

(3) Tencent App Center integrated into ZStack Mini

With the current ZStack Mini 2.0 platform, users have to create virtual machines and install applications manually after deployment. In 3.0, ZStack will work with ISVs to integrate application templates directly into Tencent App Center according to the application characteristics of different industries. Users can then simply download and deploy, skipping the complex configuration involved in installing applications. Upgrades and maintenance are better supported as well, which greatly improves maintainability.

