Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to analyze the quorum Mechanism in distributed system

2025-01-15 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Share

Shulou(Shulou.com)05/31 Report--

How to analyze the quorum mechanism in distributed systems, many novices are not very clear about this, in order to help you solve this problem, the following editor will explain in detail for you, people with this need can come to learn, I hope you can gain something.

In distributed systems, we usually increase data redundancy to achieve higher data reliability. Multiple copies and erasure codes are the two most commonly used data redundancy technologies. The Quorum mechanism is a mechanism that often cooperates with redundant copy management.

Let's first consider a simple scenario, your system is 3 copies of the design, how to read and write process is better? Intuitively speaking, when writing, 3 copies are written successfully to report success, so it is OK to read any copy when reading. This is the simplest copy strategy: write-all-read-one, or WARO for short, and the most commonly used copy strategy.

Write-all-read-one

For redundant systems with multiple replicas, when writing, all replica nodes are successful (write all), and when reading, you can just read a copy (read one). This is easy to understand, because there is an assumption that as long as the writing is successful, so many copies of the data are consistent, and you can read any of them correctly (regardless of silence or other problems, the problems caused by silence are solved by data self-checking).

Now we want to ask a question: why can you read any copy of the data when reading it in WARO mode?

The point is: we are reading successful data, which is guaranteed by the premise of Write-All. Now consider the pros and cons:

Advantages:

The implementation is simple, does not need to consider complex exception handling, and is efficient when reading.

Disadvantages:

The usability of the write is not very good, because the update write operation needs to be successful in all N copies, so once there is a copy exception and the write fails, the update operation is not available. For the update service, although the system has N replicas, the system cannot tolerate any replica anomalies; the availability of reads is good, and the system can provide read services as long as one copy of the N replicas is normally serviced. In other words, the system can tolerate 1 copy exception.

So, we see the difference between WARO read and write availability, the availability of the update service is too low, although the system uses multiple copies, but the availability of the update operation is equivalent to no copy.

Can you optimize this? Yes, in fact, it is the CAP theory. WARO ensures a high degree of data consistency, but at the expense of the availability of write updates, our optimization idea is to appropriately increase the availability of natural data, and the consistency of natural data will be lowered. Quorum is a mechanism to improve the availability of write updates.

QuorumQuorum definition

WARO sacrifices the availability of write updates to bring the simplicity of the system, and the availability of reads reaches the highest state of the replica mechanism. We relax the WARO condition, do not require Write-all, so as to make a tradeoff between the availability of read-write services, this is Quorum.

In the Quorum mechanism, when a write update operation w [I] is successful in all W copies of N copies, the update operation is called "successfully submitted update operation", and the corresponding data is "successfully submitted data".

Make R > Nmurw, because the update operation w [I] is successful on W copies, if you need to read at most R copies, you must be able to read the updated data v [I] when reading data. If an update w [I] is successful on W replicas, the set of any R replicas must intersect with the set of successful W replicas, so reading R replicas must read the updated data vi of w [I]. Schematic diagram:

If you take a closer look at the above description, the deduction comes from this:

Because the update operation conditions in WARO mode are too stringent, we want to improve the availability of the update operation, so we relax it appropriately to allow N of the update operations to fail, and it doesn't matter if we succeed, but the usability of writing will come up. Although the reliability of the data will be in an incomplete state for a short time, it will bring about an improvement in usability. Because after the introduction of Quorum mechanism, when the update operation is successful, N copies are not necessarily the latest successful data, so when we read them, we are not free to read any copies, but we must read the successful data, so we have to read R copies every time, and ringing W must satisfy > N, because only in this way can we ensure that the R copies you read must contain the correct data. If the condition of ReneW > N is guaranteed, the set of read replicas and the set of written replicas can be guaranteed to intersect.

For example, the system is 5 copies, so that the data of the first five copies are all the same, all v1. An update operation w2 and the first three copies are successful, and the replica situation becomes (v2 v2 v2 v1 v1). At this point, the set of any three copies must contain v2.

As a matter of fact, in the above definition, making Wendy NJournal Renewal 1 is actually equivalent to WARO, so WARO is a special case of the Quorum mechanism.

Think about a question: you read R copies of the data, although it must contain the correct data, but how do you identify the correct data? Remember not to replace your God's perspective.

To solve this problem, we have a very important conclusion: relying solely on the quorum mechanism can not guarantee the strong consistency of the data. Because even if you read the correct data, you will not be able to identify the correct data without the assistance of other means.

For example:

5 copies of the system, so that W = 3grave3, the initialization values are all v1, then it is [v1, v1] after a write update, the system status becomes [v2, v1, v1] because of Renew3, according to the rules of qurom, we only need to read three copies of data to read the correct value. Exhaustively, then R = 3, there are three cases in the list of copies read: [v2, v2, v1] [v2, v2, v2] [v2, v1, v1]

We see that either way, v2 is on the list, and from God's point of view, we know that this is the right data. But for these three scenarios, what do you need to do to make sure that the correct data is v2 each time?

In fact, only in the second case [v2, v2, v2] can you conclude that v2 is correct, because there is only v2 in the list.

In the first case [v2, v2, v1] and the third case [v2, v1, v1] you need other further measures before you can identify the correct data. For example, in the first case [v2, v2, v1] which value would you take as the correct value? V2? Because v2 is the majority? But in the third case, v1 is still the majority.

This situation requires some additional action to confirm the correct copy, and it may not be distinguishable in the end.

How to determine the latest submitted value?

The Quorum mechanism states that only W of N replicas need to be updated successfully. When reading R replicas, the latest successful data can be read when the condition of Wigr > N is satisfied. However, due to the existence of update failures, only reading R copies is not necessarily certain which is the correct data.

Let's take this as an example: let's assume that the system of Numb5Magne3 Magazine Renew3 is initialized to [v1, v1]. After a successful update, the status at a certain time is [v2, v1, v1], that is, the first three copies have been updated successfully. Because Wend3 meets the conditions, this is a successful update.

At this time, if you read any three copies, you must be able to read v2, and there are not many cases. There are only three cases in which we can enumerate:

[v2, v2, v2] [v2, v2, v1] [v2, v1, v1]

We cannot tell whether v1 or v2 is the latest successfully submitted data?

We can brainstorm.

Method 1, each time you write a successful copy, write another metadata to record the request, such as the successful version v2, so that you can know which version of the data is correct.

Method 2, or we further strengthen the reading conditions:

Restricted update operations must be strictly incremented, that is, only the previous update operation is successful before the latter update operation can be submitted, so the data version number successfully submitted must be continuously incremented. Read R replicas, for the data with the highest version number of the R replicas, if W already exists, the data is considered to be the latest successfully submitted data. If there are less than W successfully submitted data, assuming X, the other copies will continue to be read until W copies of this version are successfully read, then the data is the latest successfully submitted data. If among all the copies, the number of the data does not meet W, then the second largest version in R is the latest successfully submitted copy.

So let's think about these three situations:

[v2, v2, v2] three copies are all v2, the number of v2 versions > = 3, then v2 is the latest successful version [v2, v2, v1] [v2, v2, v1]: this situation still needs to be read, because the number of v2, v1 is not satisfied > = 3; [v2, v1]: the number of v2 > = 3, so it is concluded that v2 is the latest successfully submitted data. [v2, v1, v1]: at this time, we can finally conclude that v2 is the latest data successfully submitted. Two data are v2, one is v1, then we still need to continue to read. There are two cases: [v2, v1, v1] this situation is similar, but also continue to read, know that the number of versions of a certain data meets the conditions > = 3

If the process of continuing to read fails, or times out, then it is impossible to judge.

Is it helpful for you to read the above content? If you want to know more about the relevant knowledge or read more related articles, please follow the industry information channel, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Servers

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report