2025-04-14 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces the basics of blockchain consensus protocols. Many readers run into these questions in real projects, so let the editor walk you through how to approach them. I hope you read carefully and come away with something useful!
High availability is one of the core challenges of distributed system design, and Byzantine fault tolerance is a general approach to it. Byzantine systems take their name from the Byzantine Generals Problem: in ancient times, several Byzantine generals led their armies to capture an enemy city. Each general controls only his own troops and communicates with the others through messengers (a message is known only to the two generals involved; the other generals can ask about it, but cannot verify its correctness). If some of the generals are enemy spies, how can the loyal generals reach an agreed plan of action without being misled by the spies? That is the Byzantine Generals Problem. Each node of a distributed system corresponds to a general, message passing between servers corresponds to the messengers, and a faulty server may transmit wrong information to the other servers. A Byzantine fault-tolerant system is therefore a system of n servers that satisfies the following two conditions for every request:
All non-Byzantine servers produce the same result from the same input.
If the input information is correct, all non-Byzantine servers must accept it and compute the corresponding result.
A failed server is called a Byzantine server. These two conditions together are called consensus, and they are the most basic fault-tolerance problem in distributed systems. In practice, a Byzantine system generally assumes that no more than f of the servers are Byzantine, and every request must satisfy two properties at the same time:
Safety: any completed request will never be changed and can be seen by later requests.
Liveness: any request from a non-Byzantine client will eventually be accepted and executed; no factor may prevent such a request from being executed.
In real networks, solving the Byzantine problem requires exponential-time algorithms. With the rise of large-scale distributed systems, especially cloud computing, simplified Byzantine protocols that improve performance by relaxing liveness have emerged and are used in specific systems, such as PBFT, Paxos, and Raft. Current designs fall into two families: state-machine Byzantine protocols and Quorum Byzantine protocols. The main functional difference is that the former assigns a sequence number to every request and executes all requests in order, which suits distributed computing systems that are sensitive to system state; the latter needs no sequence numbers and can execute multiple requests concurrently, which suits distributed storage systems. Research on Byzantine fault tolerance mainly aims to reduce system overhead and narrow the gap between Byzantine fault-tolerant techniques and practical production systems.
A state-machine Byzantine system maintains a single shared state, with all servers taking consistent actions. It consists of three protocols: the consistency (agreement) protocol, the checkpoint protocol, and the view-change protocol.
The goal of the consistency protocol is to have client requests execute in the same determined order on every server. Typically one node acts as the primary, responsible for ordering the clients' requests, and the rest are replicas that execute requests in the order the primary provides. All servers work under the same configuration, called a view; each time the primary changes, the view changes. The consistency protocol has at least three phases: sending the request, assigning the sequence number (pre-prepare), and returning the result (reply). Depending on the protocol design, it may also include interactive phases such as prepare and sequence-number confirmation (commit).
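The quorum counting behind the prepare and commit phases can be sketched in a few lines. The class below is a toy model, not code from any real PBFT implementation; it assumes n = 3f + 1 replicas and uses the PBFT-style thresholds of 2f matching PREPAREs and 2f + 1 matching COMMITs.

```python
from collections import defaultdict

class Replica:
    """Toy model of one replica in a PBFT-style consistency protocol.
    All names and simplifications here are illustrative only."""
    def __init__(self, replica_id, n, f):
        self.replica_id = replica_id
        self.n, self.f = n, f
        # (view, seq, request_digest) -> set of sender ids
        self.prepares = defaultdict(set)
        self.commits = defaultdict(set)

    def on_prepare(self, view, seq, digest, sender):
        """A request is 'prepared' after 2f matching PREPAREs
        (on top of the primary's PRE-PREPARE)."""
        self.prepares[(view, seq, digest)].add(sender)
        return len(self.prepares[(view, seq, digest)]) >= 2 * self.f

    def on_commit(self, view, seq, digest, sender):
        """A request is 'committed-local' after 2f + 1 matching COMMITs."""
        self.commits[(view, seq, digest)].add(sender)
        return len(self.commits[(view, seq, digest)]) >= 2 * self.f + 1
```

With n = 4 and f = 1, a replica needs two matching PREPAREs from distinct peers before it is prepared, and three matching COMMITs before it executes the request.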
Every time a Byzantine system executes a request, each server writes a log. If logs are not cleaned up in time, they occupy more and more resources and degrade the system's performance and availability. Moreover, because Byzantine servers exist, the consistency protocol cannot guarantee that every server has executed the same requests, so the states of different servers may diverge. A periodic checkpoint protocol cleans the logs and corrects server state in a timely manner.
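A minimal sketch of what the checkpoint protocol achieves, assuming the PBFT convention that 2f + 1 matching checkpoint messages certify a stable checkpoint; the function names are hypothetical:

```python
from collections import Counter

def stable_checkpoint(votes, f):
    """votes: list of (seq, state_digest) checkpoint messages from replicas.
    Returns the highest sequence number certified by at least 2f + 1
    matching messages, or None. A toy sketch, not a full protocol."""
    counts = Counter(votes)
    certified = [seq for (seq, _), c in counts.items() if c >= 2 * f + 1]
    return max(certified) if certified else None

def truncate_log(log, stable_seq):
    """Once a checkpoint is stable, entries at or below it can be discarded."""
    return [entry for entry in log if entry["seq"] > stable_seq]
```

A replica whose own state digest disagrees with the certified one would additionally fetch the correct state from its peers; that repair step is omitted here.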
If the primary itself is faulty, replicas may receive different requests carrying the same sequence number, or the same request may be assigned multiple sequence numbers, which directly prevents requests from executing correctly. The view-change protocol replaces the primary with a replica when the primary can no longer perform its duties, while guaranteeing that requests already executed by non-Byzantine servers are not tampered with.
Castro and Liskov [4] proposed Practical Byzantine Fault Tolerance (PBFT), which for the first time reduced the complexity of a Byzantine protocol from exponential to polynomial and made it feasible to apply Byzantine protocols in distributed systems. The algorithm solves Byzantine fault tolerance in asynchronous networks without a liveness guarantee; although liveness is not guaranteed, the probability of the algorithm entering an infinite loop is very low, so it is perfectly usable in engineering. Barbara Liskov went on to win the Turing Award in 2008. PBFT makes two assumptions to provide replicated state:
1. Only a small fraction of the servers in the system may be Byzantine.
2. The system relies on majority voting to reach consensus and ensure correct results.
If there are 3f+1 nodes in the system, PBFT can tolerate at most f Byzantine servers. Briefly, here is why. Suppose there are three generals, one of whom is a traitor. If A is the traitor, A may order B to attack while ordering C to retreat. When B and C compare notes, they find two inconsistent messages, but neither can tell who the traitor is: from B's point of view, the traitor could equally be A or C. So one traitor among three generals cannot be handled. To guarantee correctness, PBFT requires that any two quorums intersect in at least one non-traitor, which gives (N − f) + (N − f) − N ≥ f + 1, i.e. N ≥ 3f + 1.
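The counting argument can be checked mechanically. The helper below (hypothetical, for illustration) searches for the smallest N whose quorums of size N − f always intersect in at least f + 1 replicas, guaranteeing at least one honest replica in the intersection:

```python
def min_replicas(f):
    """Smallest N such that two quorums of size N - f intersect in at
    least f + 1 replicas: (N - f) + (N - f) - N >= f + 1, i.e. N >= 3f + 1."""
    n = 1
    while (n - f) + (n - f) - n < f + 1:
        n += 1
    return n
```

For f = 1, 2, 3 this yields N = 4, 7, 10, matching the 3f + 1 bound.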
PBFT's consistency protocol is shown in the figure below; each client request completes in five phases. PBFT uses pairwise server interaction to reach agreement before executing a client's request. Because the client can get no information about the servers' internal running state, faults of the primary in PBFT can only be detected by the servers themselves: if a server fails to complete a client's request within a certain period, the view-change protocol is triggered.
PBFT's design puts all the work on the server side: reaching agreement, monitoring a Byzantine primary, and so on. Its consistency protocol is therefore complex, with two phases that require pairwise interaction among servers, large volumes of data to process, and heavy computation.
PBFT has been used in many scenarios; for example, Liskov used it to implement an NFS service. A recent popular application is the IBM-led blockchain project Hyperledger. Besides PBFT, Hyperledger also introduces Sieve, an in-house consensus protocol built on PBFT whose purpose is to additionally reach consensus on the outputs of nodes. The reason is that an important feature of Hyperledger is smart contracts on the blockchain, i.e. code executed on the chain; their execution can make the final ledger state nondeterministic, so Sieve adds signatures over code-execution results to the PBFT implementation for verification.
A state-machine Byzantine system executes only one request at a time, so the execution order is exactly the same across servers. Many distributed storage applications, however, do not require a strict execution order but are very sensitive to response time, which motivated Quorum Byzantine systems. In a Quorum system, shared data is stored on a set of servers, and read/write operations access only a constant-size subset (a quorum) of the servers. Such protocols share one property: with the quorum size chosen appropriately, any quorum contains the latest data and can serve reads. On this basis, a Quorum Byzantine system keeps the system's consistency semantics valid even when servers suffer Byzantine faults. A write works as follows: first, client c reads the timestamps from a constant-size quorum Q, and picks a timestamp t greater than every timestamp it has read and greater than any timestamp c has used before; then c sends (v, t), where v is the data to be updated and t the timestamp, to all servers, until it receives acknowledgements from some constant-size quorum. A read works by having client c collect the latest data returned by every server in a constant-size quorum Q, and then select the required value according to the specific protocol's rules.
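The read/write flow above can be sketched as a toy in-process simulation. This assumes direct access to server state and takes the quorum size q as a parameter; a real protocol derives q from n and f and runs over the network, and the Byzantine-resistant value-selection rules are omitted here.

```python
import random

class QuorumStore:
    """Toy quorum read/write over n simulated servers with quorum size q.
    With n = 5 and q = 3, any two quorums intersect, so a read always
    sees the latest completed write."""
    def __init__(self, n, q):
        self.n, self.q = n, q
        self.servers = [{"ts": 0, "val": None} for _ in range(n)]

    def write(self, value):
        # 1. Read timestamps from a quorum; pick one larger than all seen.
        quorum = random.sample(range(self.n), self.q)
        ts = max(self.servers[i]["ts"] for i in quorum) + 1
        # 2. Send (value, ts); here simulated as direct writes to a quorum.
        for i in random.sample(range(self.n), self.q):
            if ts > self.servers[i]["ts"]:
                self.servers[i] = {"ts": ts, "val": value}
        return ts

    def read(self):
        # Collect (ts, val) from a quorum and return the freshest value.
        quorum = random.sample(range(self.n), self.q)
        return max((self.servers[i]["ts"], self.servers[i]["val"])
                   for i in quorum)[1]
```

Because 3 + 3 > 5, the read quorum always overlaps the last write quorum in at least one server, so taking the value with the highest timestamp returns the latest write.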
If we relax the Byzantine setting and disallow traitors in the system (i.e. no hacker attacks and no forged messages), we get the common setting for building distributed systems inside a data center. Paxos was the first consensus algorithm for such systems; it is not Byzantine fault-tolerant (Byzantine Paxos variants aside). Paxos runs on a set of processes called replicas that must agree on a value even when failures occur. If more than half of the replicas stay up without crashing for long enough, all running replicas will agree on some proposed value; in state-machine replication, this means agreeing on which record comes next in the (multi-replica) log.
Wu Di's diagram illustrates the Paxos flow. Paxos has two phases, Prepare and Accept, and two main roles, Proposer and Acceptor. Before a value is accepted by a majority, each Acceptor may accept several different values. But once a value has been accepted by a majority (i.e. chosen), it will never change: the Prepare phase will discover that value, the Accept phase will reuse it, and all subsequent proposals will choose the same value. Note that each phase proceeds as soon as responses from a majority arrive. And because machines may crash, an Acceptor must persist acceptedProposalID, acceptedValue, and minProposal. From the flow you can see that Prepare serves two functions:
1. A large proposal id blocks agreement from completing under an unfinished smaller proposal id, so to reduce wasted Prepare requests, a Proposer picks an id larger than any proposal id it has seen before.
2. Once a value is chosen, subsequent Prepares will discover it and use it as the value in the Accept phase. This also shows that one round of Paxos agreement needs at least two network round trips.
(Author: Wu Di. Link: https://zhuanlan.zhihu.com/p/20872811)
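The Prepare/Accept flow can be condensed into a single-decree sketch. The field names follow the persisted state mentioned above (minProposal, acceptedProposalID, acceptedValue); the proposer logic is simplified and omits retries and real persistence.

```python
class Acceptor:
    """Single-decree Paxos acceptor. In a real system minProposal,
    acceptedProposalID and acceptedValue must be persisted to disk."""
    def __init__(self):
        self.min_proposal = 0
        self.accepted_id = 0
        self.accepted_value = None

    def prepare(self, proposal_id):
        if proposal_id > self.min_proposal:
            self.min_proposal = proposal_id
            # Promise, reporting any value already accepted.
            return True, self.accepted_id, self.accepted_value
        return False, self.accepted_id, self.accepted_value

    def accept(self, proposal_id, value):
        if proposal_id >= self.min_proposal:
            self.min_proposal = proposal_id
            self.accepted_id = proposal_id
            self.accepted_value = value
            return True
        return False

def propose(acceptors, proposal_id, value):
    """Proposer: Prepare, then Accept, reusing any previously accepted
    value reported by the highest-numbered promise."""
    promises = [a.prepare(proposal_id) for a in acceptors]
    granted = [p for p in promises if p[0]]
    if len(granted) <= len(acceptors) // 2:
        return None  # no majority promised; would retry with a larger id
    prior = max(granted, key=lambda p: p[1])
    chosen = prior[2] if prior[1] > 0 else value
    acks = sum(a.accept(proposal_id, chosen) for a in acceptors)
    return chosen if acks > len(acceptors) // 2 else None
```

Running two proposals in sequence shows the key property: once "X" is chosen by a majority, a later proposal with a higher id still decides "X", not its own value.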
Paxos is not suitable for Byzantine settings. In the figure below, for example, server N2 falsifies data and prevents the Paxos system from reaching consensus.
Zab (the protocol adopted by ZooKeeper) is a Paxos-variant protocol; the "ab" in its name stands for atomic broadcast, because Zab makes every node in the system perform the same operations in the same order. Atomic broadcast and consensus are considered equivalent problems [3]. Raft is a consensus protocol that has emerged in recent years and is easier to understand while matching Paxos in correctness and performance. As the figure above shows, both Zab and Raft have a dedicated step that elects a Leader, and both decompose the consensus problem into a series of independent sub-problems: leader election, log replication, safety, and so on. A change of Leader is marked by an epoch in Zab and a term in Raft; each Leader election increments the epoch or term, and nodes only accept the Leader with the highest epoch or term. After the election, the Leader proposes operations that all replicas apply in a single order.
In all Paxos-family protocols, each value (i.e. each proposal) is essentially a log record identified by two parts, called slot and ballot in Paxos, epoch and counter in Zab, and term and index in Raft. The Leader, elected by a legal majority (quorum), broadcasts a proposal for the current log position, and followers apply the proposal locally after the Leader commits it. All Paxos-family protocols guarantee a global order.
Those are the similarities among the Paxos-family protocols; their designs also differ in several ways. The first is leader election. Zab and Raft both divide execution into stages (epochs or terms), while Paxos does not. Each epoch starts with a new election, continues with broadcasting, and ends when the Leader fails. In Zab and Raft there is at most one Leader at any time; Paxos does not require this, because Paxos has no separate leader-election phase, so multiple Leaders can coexist in a Paxos system, yet correctness is still guaranteed, determined by the ballot numbers and quorums. The Zab protocol has three phases, and every node is in exactly one of them at any time: the discovery phase, i.e. leader election; then the synchronization phase, in which followers catch up on all of the new Leader's logs since the last epoch; and finally the broadcast phase, the normal operation mode in the figure above, in which the Leader keeps committing client decisions until it fails. Raft has no explicit synchronization phase: the Leader synchronizes with each follower during normal operation, by comparing the index and term of each log entry. As a result, Raft's state is simpler.
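Raft's per-follower synchronization by comparing index and term can be sketched as the AppendEntries consistency check (simplified: logs here are just lists of term numbers, with a sentinel entry of term 0 at index 0):

```python
def append_entries(follower_log, prev_index, prev_term, entries):
    """Simplified Raft AppendEntries consistency check: the follower
    accepts new entries only if its log has an entry at prev_index whose
    term equals prev_term; otherwise the leader must retry lower."""
    if prev_index >= len(follower_log) or follower_log[prev_index] != prev_term:
        return False, follower_log
    # Delete any conflicting suffix, then append the leader's entries.
    return True, follower_log[:prev_index + 1] + entries
```

If the check fails, the real protocol has the Leader decrement its notion of the follower's matching index and retry, walking back until the logs agree; that loop is omitted here.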
The second difference is the communication mechanism between replicas. Zab uses a message-passing model and needs at least three messages per update: proposal, acknowledgement, and commit, as the figure above shows, while Raft relies on RPC.
Paxos systems have several typical application scenarios in building distributed systems. The first that comes to mind is server replication, using state-machine replication to synchronize state across multiple servers. The second is metadata replication: because Paxos systems usually cannot handle large volumes of data, some systems replicate only metadata via Paxos, such as Apache BookKeeper and Kafka. The third is synchronization services, such as various distributed locks. The fourth can be called barrier orchestration: in a distributed graph-processing framework, for example, BSP-style models need large numbers of barriers for synchronization, which is usually coordinated by a Paxos system, as in Apache Giraph and Hama, and Google Pregel.
In theory, both Paxos and Raft can enter an endless loop in which no vote ever passes (though in practice the probability is very low). All of them satisfy safety while relaxing liveness, and the same is true of PBFT.
In conventional infrastructure, mastering the Paxos protocol has been enough to meet every high-availability design challenge. With the popularity of Bitcoin and blockchains, however, the environment Paxos assumes no longer applies: once networks are decentralized, the participants of a distributed system are no longer fully trusted, so the demands on consensus algorithms become more diverse. Consensus algorithms can be called one of the few core technologies of blockchains. At present, blockchain applications fall into three categories:
Private chain: blockchain applications deployed inside an enterprise, where all nodes can be trusted.
Consortium chain: a semi-closed trading ecosystem whose nodes trust each other unequally.
Public chain: an open trading ecosystem that provides a global trading network for consortium chains and private chains.
Because a private chain is a closed storage ecosystem, a Paxos-family system achieves the best performance there. A consortium chain is semi-open, semi-closed, so Byzantine fault tolerance is one of the suitable choices, as in IBM's Hyperledger project. For a public chain, the requirements on the consensus algorithm go beyond ordinary distributed-system construction, and the nature of the transactions adds further security considerations.
Proof of Work (PoW), used by Bitcoin, was the first consensus algorithm for public chains. In a public blockchain, every participating node is itself a center and stores a complete copy of the transaction ledger. The nodes cannot all keep the books at the same time: their environments differ and the information they receive differs, so simultaneous bookkeeping would inevitably produce inconsistency and confusion. Since nodes cannot keep accounts simultaneously, some node must be chosen to hold the bookkeeping right; but designating special nodes with that power would run counter to the original intention of decentralization. The Bitcoin blockchain solves the consistency problem of a decentralized ledger through competitive bookkeeping: nodes compete for the right to record with their computing power, or "hash power". In the Bitcoin system, a mathematical contest takes place roughly every 10 minutes (a node's hash power determines its probability of winning a round; nodes with more power are more likely to win). The winner of the contest gains the right to record once, so that during that period only the winner keeps the books and synchronizes the ledger to the other nodes. This is proof of work. Besides proof of work, there are other consensus algorithms, mainly addressing PoW's drawbacks of wasted computing resources and low throughput.
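The contest can be illustrated with a toy proof-of-work loop. Bitcoin actually uses double SHA-256 against a full numeric difficulty target; this sketch keeps only the essential asymmetry: finding a nonce is expensive, checking it is a single hash.

```python
import hashlib

def mine(block_data, difficulty):
    """Toy proof of work: find a nonce such that SHA-256(data + nonce)
    starts with `difficulty` hex zeros. Expected work grows as 16^difficulty."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce, digest
        nonce += 1

def verify(block_data, nonce, difficulty):
    """Verification is one hash: cheap to check, expensive to produce."""
    digest = hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

Raising `difficulty` by one multiplies the expected mining work by 16 while leaving verification cost unchanged, which is exactly the property that makes forging a long chain impractical.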
For example, in the Proof of Stake (PoS) mechanism, the basic idea is that a node's chance of producing the next block should be proportional to its stake (proportion of ownership) in the network. In Delegated Proof of Stake (DPoS), each shareholder can delegate their voting rights to a representative, and the 100 representatives with the most votes take turns producing blocks on a fixed schedule.
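Both ideas can be sketched in a few lines; the helper names and the top-3 cutoff are illustrative only (real DPoS chains use on the order of 100 delegates, and real PoS schemes derive randomness from the chain itself rather than a local seed).

```python
import random

def pick_block_producer(stakes, seed=None):
    """Toy PoS selection: the probability of producing the next block is
    proportional to stake. `stakes` maps node -> stake."""
    rng = random.Random(seed)
    nodes, weights = zip(*stakes.items())
    return rng.choices(nodes, weights=weights, k=1)[0]

def dpos_schedule(votes, slots):
    """Toy DPoS: the top delegates by vote count produce blocks
    round-robin. `votes` maps delegate -> vote count."""
    top = sorted(votes, key=votes.get, reverse=True)[:3]  # top 3 for brevity
    return [top[i % len(top)] for i in range(slots)]
```

The round-robin schedule is what gives DPoS its throughput advantage: block producers are known in advance, so no per-block contest is needed.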
To design a blockchain, then, we must make our own trade-off between throughput and security when choosing a consensus algorithm. From the blockchain perspective, Paxos and PoW can be seen as the two extremes of that trade-off [6], and it is not hard to understand why Hyperledger, which targets general-purpose blockchains, chose PBFT.
That concludes "what is a blockchain consensus protocol". Thank you for reading. If you want to learn more about the industry, you can follow this site, where the editor will keep publishing practical, high-quality articles!