Quick introduction to Block chain (4)-- BFT (Byzantine Fault tolerance) consensus algorithm 04/16 Update SLTechnology News&Howtos

Quick introduction to Block chain (4)-- BFT (Byzantine Fault tolerance) consensus algorithm

2025-04-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/03 Report--

Quick introduction to blockchain (4)-- BFT (Byzantine Fault tolerance) consensus algorithm 1. Introduction to BFT 1. Introduction to Byzantine General problem

The Byzantine General problem (Byzantine Generals Problem) is a famous example abstracted by Leslie Lamport (2013 Turing Prize winner) to describe the distributed system consistency problem (Distributed Consensus) in the paper.

A simple informal description of the question of General Byzantine is as follows:

The Byzantine Empire wanted to attack a powerful enemy and sent 10 troops to surround it. This enemy is no better than the Byzantine Empire, but it is enough to withstand simultaneous attacks by five conventional Byzantine armies. For some reasons, these 10 armies cannot break through at a single point together and must attack at the same time in a separate encirclement. There is no chance that any of their troops will attack alone, unless at least six troops attack at the same time. They are scattered around the enemy country and rely on messengers to communicate with each other to negotiate the intention and time of attack. The problem that bothers these generals is that they are not sure whether there are traitors among them, and traitors may change their intention or timing of attack without authorization. In this state, can the Byzantine generals find a distributed protocol that allows them to negotiate remotely and win the battle? This is the famous Byzantine general problem.

The question of General Byzantium does not consider whether the messengers will be intercepted or unable to convey information, that is, there is absolutely no problem with the channel of message transmission. Lamport has proved that it is impossible to try to achieve consistency through message delivery on unreliable channels where messages may be lost. Therefore, when studying the Byzantine general problem, we assume that the channel is not a problem, and then do the research on consistency and fault tolerance.

2. The question of the two generals

Before the question of Byzantium, there was a question of two generals (Two Generals Paradox): whether the two generals should reach an agreement to attack or retreat through a messenger, but how to reach an agreement if the courier might get lost or be stopped by the enemy (lost or forged)?

According to the FLP impossible principle, there is no general solution to the two generals problem.

3. Introduction to BFT

BFT (Byzantine Fault Tolerance), that is, Byzantine fault tolerance, is a fault-tolerant technology in the field of distributed computing. Byzantine fault tolerance comes from the Byzantine general problem. The Byzantium problem is a model of the real world. Computers and networks may behave unpredictably due to hardware errors, network congestion or interruptions, and malicious attacks. Byzantine fault-tolerant technology is designed to deal with real abnormal behavior and to meet the normative requirements of the problems to be solved.

The blockchain network environment is consistent with the Byzantine problem model, with functioning servers (loyal Byzantine generals), faulty servers, and saboteur servers (renegade Byzantine generals). The core of the consensus algorithm is to form a consensus on the state of the network among normal nodes.

Usually, the node that fails is called Byzantine node, while the normal node is non-Byzantine node.

The Byzantine fault-tolerant system is a system with n nodes. For each request, the whole system satisfies the following conditions:

A. all non-Byzantine nodes use the same input information and produce the same results

B. if the information entered is correct, then all non-Byzantine nodes must receive the information and calculate the corresponding results.

The assumptions commonly used in the Byzantine system include:

A, the behavior of Byzantine nodes can be arbitrary, and Byzantine nodes can collude with each other.

B, the errors between nodes are irrelevant

C, through the asynchronous network connection between nodes, messages in the network may be lost, out of order and delayed to arrive, but most protocols assume that messages can be delivered to the destination in a limited time.

D, the information transmitted between servers can be detected by third parties, but the content of the information can not be tampered with or forged and the integrity of the information can be verified.

The original Byzantine fault-tolerant system is not practical because it needs to show its theoretical feasibility. In addition, additional clock synchronization mechanism is needed, and the complexity of the algorithm increases exponentially with the increase of nodes.

2. PBFT algorithm 1. Brief introduction of PBFT algorithm

PBFT (Practical Byzantine Fault Tolerance), a practical Byzantine fault-tolerant algorithm, was proposed by Miguel Castro and Barbara Liskov in their paper "Practical Byzantine Fault Tolerance and Proactive Recovery" published in 1999. PBFT algorithm can work in asynchronous environment, and the inefficiency of the original Byzantine fault-tolerant algorithm is solved by optimization, and the complexity of the algorithm is reduced from exponential to polynomial, which makes the Byzantine fault-tolerant algorithm feasible in practical system applications and has been widely used. The PBFT algorithm can guarantee both Safety and Liveness when the total number of failure nodes is not more than 1 prime 3.

PBFT algorithm uses cryptography-related technologies (RSA signature algorithm, message verification coding and digest) to ensure that the message delivery process can not be tampered with and destroyed.

2. The principle of PBFT algorithm

PBFT is a state machine replica replication algorithm, that is, the service is modeled as a state machine, and the state machine replicates in different nodes of the distributed system. Each copy of the state machine saves the state of the service and implements the operation of the service. A collection of all copies is represented by an uppercase letter R, and an integer from 0 to | R |-1 is used to represent each copy. For ease of description, it is usually assumed that the number of failed nodes is f, and the total number of service nodes is | R | = 3f+1, where f is the maximum number of copies that may fail. Although there can be more than 3f+1 copies, additional copies do not improve reliability except for performance degradation.

All copies operate in a rotation process called View. In a view, one replica acts as the primary node (primary) and the other replica nodes act as backup nodes (backups). Views are sequentially numbered integers. The primary node is calculated by the formula p = v mod | R |. V is the view number, p is the copy number, and | R | is the number of copies set. When the master node fails, it is necessary to start the view rotation process.

The implementation process of PBFT algorithm is as follows:

(1) REQUEST

Client C sends to master node p

Request.

O: the specific operation of the request

T: the timestamp appended by the client during the request

C: client identification.

REQUEST: contains the message content m, and the message digest d (m).

The client signs the request.

(2) PRE-PREPARE

The master node receives the request from the client and needs to verify whether the signature of the client request message is correct.

Illegal requests are discarded. Correct request, assign a number n, number n is mainly used to sort client requests. And then broadcast a message

The message is sent to other replica nodes.

V: view number

D client message summary

M: message content

Sign the master node.

N is [h, H] to be in a certain range.

(3) PREPARE

Replica node I receives a PRE-PREPARE message from the primary node and requires the following verification:

A. whether the PRE-PREPARE message signature of the primary node is correct.

B. Whether the current replica node has received a PRE-PREPARE message with the same v and the number n, but with a different signature.

Whether the abstracts of C, d and m are consistent.

Whether D and n are in the interval [h, H].

Illegal requests are discarded. Correct request, replica node I sends a message to other nodes, including the primary node

The message, v, n, d, m is the same as the PRE-PREPARE message above, and I is the current replica node number.

Sign the replica node I. Record PRE-PREPARE and PREPARE messages to log for resuming outstanding request operations during view rotation.

If view rotation occurs in the PREPARE phase, the request for the PREPARE phase will be discarded.

(4) COMMIT

The primary and replica nodes receive the PREPARE message and need to verify the following:

A. whether the signature of the replica node PREPARE message is correct.

B. whether the current replica node has received n under the same view v.

Whether C and n are in the interval [h, H].

Whether D and d are the same as the d in the PRE-PPREPARE currently received

Illegal requests are discarded. If the replica node I receives 2f+1 authenticated PREPARE messages indicating that most nodes in the network have received the consent message, then send a message to other nodes, including the primary node

The message, v, n, d, I is the same as the PREPARE message above.

Sign the replica node I. COMMIT messages are logged to the log to recover outstanding requested operations during view rotation. Record PREPARE messages sent by other replica nodes to log.

The COMMIT phase is used to ensure that most nodes in the network have received enough information to reach a consensus. If the view rotation occurs in the COMMIT phase, the original request of the COMMIT phase will be saved, and the request number will not be lost.

(5) REPLY

The primary and replica nodes receive the COMMIT message and need to verify the following:

A. whether the signature of the replica node COMMIT message is correct.

B. whether the current replica node has received n under the same view v.

Whether the abstracts of C, d and m are consistent.

Whether D and n are in the interval [h, H].

Illegal requests are discarded. If the replica node I receives 2f+1 verified COMMIT messages, it means that most of the nodes in the current network have reached a consensus, run the client's request operation o, and return

To the client, r: is the result of the request operation. If the client receives the same REPLY message, it means that the request initiated by the client has reached a network-wide consensus, otherwise the client needs to determine whether to resend the request to the master node. Record COMMIT messages sent by other replica nodes to log.

3. Garbage collection of PBFT algorithm

In order to ensure that the previous request can be restored during the view rotation, each replica node records some messages to the local log. When the request is executed, the replica node needs to clear the previous request record message. The easiest thing to do is to perform consensus synchronization of the current state again after the Reply message, but the cost is relatively high, so you can perform a state synchronization after executing multiple request Ks (for example, 100s). A status synchronization message is a CheckPoint message.

Copy node I send

For other nodes, n is the last view request number reserved by the current node, d is a summary of the current state, and the CheckPoint message is logged to log. If the replica node I receives 2f+1 authenticated CheckPoint messages, the message in the previous log is cleared and n is used as the current stable checkpoint.

In practice, when replica node I sends CheckPoint messages to other nodes, other nodes have not completed K requests, so they will not immediately respond to I requests, but will move forward at their own pace, but the CheckPoint issued at this time does not form a stable. In order to prevent I from processing requests too quickly, set a high and low water level range [h, H] mentioned above to solve the problem. The low water level h is equal to the number of the previous stable checkpoint, and the high water level H = h + L, where L is the specified value, which is an integral multiple of the number of requests processed in the checkpoint cycle, and can be set to L = 2K. When the replica node I processes the request over the high water level H, it stops and waits for the stable checkpoint to change before moving on.

4. View rotation of PBFT algorithm

If the master node does evil, it may assign the same sequence number to different requests, or not assign the sequence number, or make the adjacent sequence numbers discontiguous. The backup node should have the responsibility to proactively check the validity of these serial numbers. If the primary node drops or does not broadcast the client's request, the client sets the timeout mechanism, and if the timeout occurs, the request message is broadcast to all replica nodes. The replica node detects that the master node has done something wrong or goes offline and initiates the view rotation protocol.

Replica node broadcasts to other nodes

News. N is the number of the latest stable checkpoint, C is the collection of CheckPoint messages validated by 2f+1, and P is the collection of PRE-PREPARE and PREPARE messages of outstanding requests from the current replica node.

Broadcast to other nodes when the master node p = v + 1 mod | R | receives 2f valid VIEW-CHANGE messages

News. V is a valid collection of VIEW-CHANGE messages. O is the collection of unfinished PRE-PREPARE messages re-initiated by the primary node. The selection rules for PRE-PREPARE message collections are as follows:

A, select the smallest stable checkpoint number min-s in V, and select the largest prepare message number max-s in V.

B. between min-s and max-s, if there is a P message set, create

News. Otherwise, create an empty PRE-PREPARE message, that is:

, m (null) is an empty message and d (null) is an empty message digest.

The replica node receives the NEW-VIEW message from the master node, verifies the validity, and, if valid, enters the PRE-PREPARE 1 state and starts the PRE-PREPARE message processing process in O.

5. Advantages and disadvantages of PBFT algorithm.

Advantages of PBFT algorithm:

PBFT algorithm has high transaction throughput, high availability and easy to understand.

Disadvantages of PBFT algorithm:

A. Computing efficiency depends on the number of nodes participating in the protocol. Because each replica node needs to synchronize P2P consensus with other nodes, the performance will decline rapidly with the increase of nodes. However, in the case of fewer nodes, it can have good performance, and the probability of bifurcation is very low, so it is not suitable for block chains with too many nodes, and the scalability is poor.

B. The system node is fixed and can not cope with the open environment of the public chain. It is only applicable to the alliance chain or private.

There is a chain environment.

C, PBFT algorithm requires the total number of nodes n > = 3f+1 (where f represents the number of bad nodes). The number of failure nodes in the system should not exceed 1 / 3 of the nodes in the whole network, and the fault tolerance rate is relatively low.

6. Application scenario of PBFT algorithm

The number of nodes in PBFT algorithm is fixed, the identity of nodes is determined in advance, and can not be dynamically added or deleted, so it can only be applied to federation chain or private chain scenarios with a fixed number of nodes.

PBFT is used in many scenarios. In block chain scenarios, it is generally suitable for private chain and alliance chain scenarios that require strong consistency, but if DPOS nodes represent election rules, it can also be applied to public chains, and the problem of Byzantine fault tolerance can be solved in an untrusted network. In the Hyperledger Fabric project, the consensus module is designed as a pluggable module to support consensus algorithms such as PBFT, Raft and so on.

3. POW algorithm 1. Brief introduction of POW algorithm

POW (Proof of Work), that is, proof of workload, is also called mining. Workload proof guesses a value (nonce) through calculation, so that the hash value of the content after piecing together the transaction data meets the prescribed upper limit. As the Hash problem requires a large number of calculations under the current computing model, it can ensure that only a few legal proposals can appear in the system for a period of time. If a legal proposal can be put forward, it will prove that the sponsor has indeed paid a certain amount of work.

Hash cash is a workload proof mechanism invented by Adam Back in 1997 to combat email denial of service and spam gateway abuse.

2. The principle of POW algorithm

The main feature of workload proof is that the client needs to do some difficult work to get a result, but it is easy for the verifier to check whether the client has done the corresponding work. A core feature of the workload proof scheme is asymmetry: the work is moderate for the requester and easy to verify for the verifier.

Given a basic string, add an integer value called nonce after the base string, and perform a SHA256 hash operation on the changed string (adding nonce). If the hash result (expressed in hexadecimal form) begins with a string (such as "0000"), the verification passes. In order to achieve the goal of workload proof, it is necessary to continuously increase the nonce value and perform SHA256 hashing on the new string.

Because the given basic string is uncertain in different situations, for different combinations of basic string and nonce, the number of times to calculate the hash value at the beginning of a string by using SHA256 is uncertain, but it will be a statistical probability event.

According to the rule, it is expected to make about 2 ^ 16 attempts (the pseudorandom property of the hash value can be estimated) to get the hash hash of 4 leading zeros.

3. POW implementation of Bitcoin.

A Bitcoin block consists of a block head and a list of transactions contained in the block. The block size is 80 bytes and consists of:

4 bytes: version number

32 bytes: hash value of the previous block

32 bytes: the Merkle root hash of the transaction list

4 bytes: current timestamp

4 bytes: current difficulty valu

4 bytes: random number Nonce value

The 80-byte block header is the input string of the bitcoin Pow algorithm.

The list of transactions is appended to the block, the first of which is a special transaction in which miners receive rewards and fees.

The process of workload proof is to continuously adjust the Nonce value and do a double SHA256 hash operation on the block head, so that the result satisfies the hash value of a given number of leading zeros, in which the number of leading zeros depends on the difficulty of mining. The more the number of leading zeros, the greater the difficulty of mining.

The new difficulty is calculated for every 2016 blocks created, and the next 2016 blocks use the new difficulty. The calculation steps are as follows:

Find the first block of the first 2016 blocks and calculate the time it takes to generate the 2016 blocks.

That is, the time difference between the last block and the first block. The time difference is not less than 3.5 days and not more than 56 days.

B. calculate the total difficulty of the first 2016 blocks, that is, the difficulty of a single block x the total time.

C. Calculate the new difficulty, that is, the total difficulty of 2016 blocks / 14 days, and get the difficulty value per second.

D, a new difficulty is required, and the difficulty is not lower than the minimum difficulty defined by the parameters.

4. Advantages and disadvantages of POW algorithm.

Advantages of POW algorithm:

A. complete decentralization

B, nodes go in and out freely

C, high security

Disadvantages of POW algorithm:

A. The right of bookkeeping is centralized to capital.

B. Mining causes a lot of waste of resources.

C, the network performance is too low, need to wait for multiple confirmations, it is easy to produce bifurcations, the period of block confirmation consensus is long, so it is not suitable for commercial applications.

5. Application scenario of POW algorithm

The POW consensus mechanism is used in most mining blockchain projects, such as BTC, ETH, and LTC.

4. POS algorithm 1. Brief introduction of POS algorithm

POS (Proof of Stake), that is, proof of interest, is an upgrade consensus mechanism of POW, which reduces the difficulty of mining in an equal proportion according to the proportion and time of tokens occupied by each node, so as to speed up the speed of finding random numbers.

In view of the shortcomings of POW, Sunny King proposed POS in 2012 and released dot coin PPCoin based on the hybrid mechanism of POW and POS.

2. The principle of POS algorithm

The core concept of POS principle is the age of money, that is, the time it takes to hold money. For example, if you have 10 coins and hold them for 90 days, you will have a currency age of 900 yuan days.

The use of currency means the destruction of the currency age.

There is a special transaction in POS called interest coin, in which the holder can consume the age of the currency to earn interest and gain priority to generate blocks for the network and POS coins.

POW proves that he is qualified to write blockchain through math, while PoS proves that he is qualified to write blockchain by the age of his currency.

3. Advantages and disadvantages of POS algorithm.

Advantages of POS algorithm:

A, do not consume a lot of power mining, save energy consumption.

B. shorten the time for reaching a consensus to a certain extent

C, prevent cheating.

Disadvantages of POS algorithm:

A. in essence, mining is still needed, and the pain point of commercial application has not been solved.

B. In extreme cases, it will lead to centralization.

4. POS implementation of Dot Coin

The POS proof formula for dot coins is as follows:

Proofhash < currency age x target value

Expand as follows:

Hash (nStakeModifier + txPrev.block.nTime + txPrev.offset + txPrev.nTime + txPrev.vout.n + nTime) < bnTarget x bnCoinDayWeight

Where proofhash corresponds to the hash value of a set of data, namely hash (nStakeModifier + txPrev.block.nTime + txPrev.offset + txPrev.nTime + txPrev.vout.n + nTime).

Currency age is bnCoinDayWeight, that is, currency days, that is, the number of currencies held multiplied by the number of days held, where the maximum number of days is 90 days.

The target value, bnTarget, is used to measure the difficulty of POS mining. The target value is inversely proportional to the difficulty, the larger the target value, the smaller the difficulty, and vice versa.

Therefore, the larger the amount of money held, the greater the chance of digging up the block.

5. DPOS algorithm 1. Brief introduction of DPOS algorithm

DPOS (Delegated Proof of Stake), the stock authorization proof algorithm, is a new consensus algorithm based on POW and POS, which was proposed and applied by Dan Larimer (now EOS CTO), the chief developer of Bitshares in April 2014. DPOS can not only solve the problem of excessive energy consumption of DPOS caused by POW in the process of mining, but also avoid the problem of biased trust balance under the distribution of POS rights and interests.

DPOS is created by trusted accounts elected by the community (super nodes, such as the top 101st votes can be). For example, 101 super nodes, that is, 101 mining pools, are selected, and the rights between the super nodes are exactly equal. Ordinary money holders can replace super nodes (mining pools) at any time by voting. The decentralization of DPOS does not mean that every currency holder has a direct equity interest, but requires indirect voting power to ensure that the elected super nodes do not do evil, and can also canvass their own votes to become super nodes or backup super nodes.

2. The principle of DPOS algorithm

The basic principles of DPOS algorithm:

A. shareholders exercise their voting rights according to their shares, rather than relying on mining for bookkeeping rights.

B. Maximize the profits of shareholders.

C, minimize the cost of maintaining network security.

D, maximize the efficiency of the network.

E, minimize the cost of running the network (bandwidth, CPU, etc.), and realize the security and stability of the network without wasting a lot of power.

In the DPOS consensus algorithm, the normal operation of the blockchain depends on the super node, and the main responsibilities of the super node are:

A. provide a server node to ensure the normal operation of the node

B, the node server collects transactions in the network

C, the node verifies the transaction and packages the transaction into a block

D, the node broadcasts the block, and other nodes add the block to their own database after verification

E. Lead and promote the development of block chain projects.

The operation mechanism of DPOS algorithm is as follows:

A. all money holders first select the super node to sign the block.

The rule of B and DPOS is that the longest chain wins, and each super node must take turns to generate blocks according to the production schedule. Assuming that the production blocks A, B and C are scheduled to take turns respectively, the blocks generated within a certain period of time are as follows:

C, if malicious nodes produce bifurcated blocks, assume that An and C are honest nodes, and only B nodes are malicious. Because the speed at which B generates blocks (producing one block per cycle) is slower than that at which An and C work together to produce blocks (one block per cycle), according to the rule of winning the longest chain, the honest node will still win.

D, the speed of duplicating two blocks of the same node is also slower than that of honest nodes working together to generate blocks, so the longest chain winning rule will ensure that the chain of honest nodes will win.

E, if the network of the three super nodes are fragmented and go their own way. It is true that three-chain parallelism is possible in the short term, but once the network connection is restored, the short chain will naturally return to the longest chain. The number of super nodes is odd, so the stalemate between the two factions will not last long, and eventually one of them will have a longer chain.

3. Punishment of malicious nodes by DPOS

Registration as a candidate supernode requires a deposit (about 10 XTS), which usually takes about two weeks to break even, thus promoting the stability of the trustee and ensuring that the mine will be dug for at least two weeks.

The penalty for malicious nodes by DPOS is that super nodes that do not produce blocks according to schedule will be voted out in the next round and the deposit will be confiscated.

Malicious nodes can do evil in a short time, and malicious blocks can only be retained for a short time. Soon, super nodes will return to the consensus reached by honest nodes, creating the longest chain and returning to the longest chain without evil blocks.

4. Advantages and disadvantages of DPOS algorithm.

Advantages of DPOS algorithm:

A, the energy consumption advantage, the network operation cost is low.

B, more decentralized in theory.

C, faster confirmation speed. The speed of the block is seconds and the transaction is confirmed in minutes.

Disadvantages of DPOS algorithm:

A, bad nodes can not be dealt with in time, only after the election can remove bad nodes.

B. Xiaosan's enthusiasm for voting is not high.

C, relying on tokens

5. Application scenario of DPOS algorithm

Bitshare is a kind of cryptographic currency using DPOS mechanism, which reduces the negative impact of centralization by introducing a techno-democratic layer.

There is still centralization in the DPOS system, but the centralization is controlled and the super nodes are elected by ordinary money holders. DPOS decentralizes the system by retaining control of the money holder.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.