This article walks through the basic theories behind distributed systems in web development. It goes into a fair amount of detail; if the topic interests you, read on, and I hope you find it helpful.
I. The relationship between distributed systems and Zookeeper
1.1 Centralized services
Let's start with how service deployment architectures developed. Broadly there are only two options, centralized and distributed: centralized means one machine does everything, distributed means several servers complete the work together. In the beginning we usually start with a single server and deploy our services on it, following the old routine: the web application runs on Tomcat exposing port 8080, and the database it needs listens on port 3306. The advantage is that the structure, the deployment, and the project architecture are all simple.
Then you scale as the business grows, and scaling comes in two flavors: horizontal and vertical. If one machine cannot cope, either make that machine more powerful or put a few more machines together. But think about it: a single server will not stay obedient forever. The moment that one machine goes down, everything dies with it. And buying mainframes, plus the R&D and maintenance staff to run them, costs a great deal of money. An extension of Moore's Law applies here.
Anyway, to put it simply: spending twice the money does not buy twice the performance. Horizontal scaling is different: if one person cannot win the fight, just invite a few more people to join in.
1.2 The "de-IOE" movement
Alibaba raised this slogan: get rid of IBM minicomputers, Oracle databases, and EMC high-end storage; look it up if you are interested. The problem at the time was that raising a single machine's processing power cost a great deal for a very poor price-performance ratio, and you lived in fear of this and that all day, since the whole service stopped the moment the machine went down. Gradually many companies in China responded, and the move to distributed systems began.
1.3 Distributed services
A distributed system has a specific definition: a system whose hardware or software components are distributed across networked computers and which communicate and coordinate with each other only by passing messages. So it is a bunch of computers jointly providing a service, while to the user it looks like a single machine doing the work.
It has many characteristics, roughly the following five:
Distribution: the machines are placed in different locations.
Peer-to-peer: the worker nodes in the cluster are identical and do the same job, and there is the concept of replicas.
Concurrency: multiple machines may operate on the same piece of data at the same time, which can cause inconsistencies.
No global clock: the ordering of events across multiple hosts affects the result, and this is a genuinely hard problem in distributed scenarios.
Failures of every kind: nodes go down, the network misbehaves, and the unexpected happens.
1.4 Problems frequently encountered in distributed scenarios
Communication failures: essentially network problems, which lead to inconsistent state across nodes.
Network partitions: each sub-network is normal internally, but the network of the system as a whole is abnormal, leading to locally inconsistent data.
Node crashes.
The "three states" of a distributed call: success, failure, and timeout, and these three states cause all sorts of problems. Both the request and the response can be lost, so the sender cannot tell whether a message was delivered or processed successfully.
Data loss: usually addressed by reading from another node via the replica mechanism or, for stateful nodes, by restoring the state.
The principle of exception handling: any exception considered during the design phase must be assumed to actually occur in production.
1.5 Criteria for measuring a distributed system
Performance: mainly throughput, response latency, and concurrency. Throughput is the total amount of data the system can process in a given time, usually measured per second; response latency is the time it takes to complete one operation; concurrency is the ability to serve many operations at the same time, usually measured in QPS (queries per second).
Availability: the ability to keep serving correctly in the face of all kinds of failures. For example, the "five nines" we often talk about means only about five minutes of downtime per year; six nines is about 31 seconds.
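A quick sanity check on those figures, with plain arithmetic: a year has roughly 365 × 24 × 60 = 525,600 minutes, so

    five nines: (1 - 0.99999)  × 525,600 ≈ 5.26 minutes of downtime per year
    six nines:  (1 - 0.999999) × 525,600 ≈ 0.53 minutes ≈ 31.5 seconds of downtime per year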
Scalability: how much the system's performance improves as machines are added.
Consistency: replica management, i.e. keeping the copies of the data consistent.
These criteria trade off against one another: pushing one too hard degrades another. For example, high availability may require multiple replicas, but with multiple replicas it is hard to keep the data consistent; likewise, low latency is hard to achieve under high throughput. So we have to weigh them against our own business scenario.
1.6 Consistency, in more detail
Strong consistency: once a write completes, any subsequent read must return the latest data. This is very hard to achieve in distributed scenarios; Paxos, Quorum mechanisms, and the ZAB protocol are all aimed at it.
Weak consistency: no promise that a written value can be read back immediately, and no guarantee of exactly when the data will become consistent; the system merely does its best to reach a consistent state at some point (for example XX hours, XX minutes, or XX seconds later).
Weak consistency has an important special case called eventual consistency, where the system ensures the data becomes consistent as quickly as it can, though there is no exact definition of how quick that is. For example, your girlfriend wants fried chicken and you order takeout; the Meituan or Ele.me rider cannot say exactly when it will arrive, only promise to deliver as soon as possible. That is the idea.
Because eventual consistency really is that weak, there are stronger special cases, such as read-your-writes consistency: a user who writes something can always see their own update as soon as they read. It is like WeChat Moments: whatever we post, WeChat makes sure we can see it ourselves, but whether friends see it the instant it is posted is another matter.
There are also monotonic read consistency, causal consistency, and others that I will not explain here; interested readers can look them up.
In a word: to keep the system highly available, to prevent the problems a single point of failure causes, and to let the replicas sitting on different nodes serve users correctly, Zookeeper came into being. It helps us solve the data consistency problem in distributed systems.
To understand how it does that, we need to look at distributed transactions, distributed consensus algorithms, the Quorum mechanism, and the CAP and BASE theories; let's expand on each slowly.
II. Distributed transactions
Transaction: in a single-machine storage system, the mechanism used to keep the stored data in a consistent state. A bit of a mouthful? Fine, put another way: in the broad sense, a transaction is a set of operations on one piece of work that either all succeed or all fail, with no intermediate state; in the narrow sense it is what a database does. Its properties are simple too: the familiar ACID.
In a distributed system, each node knows only whether its own operation succeeded, not how the other nodes fared, so the nodes' states can diverge. To achieve ACID across multiple nodes and still guarantee the transaction, we introduce a coordinator; each node taking part in the transaction is then called a participant.
The classic approaches are 2PC and 3PC; let's work through them slowly.
2.1 What is 2PC?
Several roles take part in a transaction; for now, understand the coordinator as the one who initiates the transaction and the participants as the ones who execute it.
Assume the three roles above: one coordinator and two participants. We need A and B to execute a transaction that must either succeed on both or fail on both.
2PC phase I: execute the transaction
At this point the coordinator issues a command requiring both participant A and participant B to execute the transaction, but not commit it.
In a little more detail: each participant writes its redo and undo logs, locks the resources, and executes the transaction. Having executed, it reports back to the coordinator and asks: boss, may I commit?
You run into this constantly when writing everyday Java: a pile of operations has been written, but at the end something like conn.commit() still has to be called. That is the meaning of "execute but do not commit".
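To make that concrete, here is a minimal JDBC sketch of "execute but do not commit". The connection URL, credentials, table, and column names are placeholders invented for this illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class ExecuteWithoutCommit {
        public static void main(String[] args) throws SQLException {
            // Placeholder URL and credentials; any JDBC-compliant database behaves the same way.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/demo", "user", "password")) {
                conn.setAutoCommit(false); // take manual control of the transaction
                try (PreparedStatement ps = conn.prepareStatement(
                        "UPDATE account SET balance = balance - ? WHERE id = ?")) {
                    ps.setInt(1, 100);
                    ps.setLong(2, 1L);
                    ps.executeUpdate(); // executed: logs written, rows locked, nothing visible yet
                    conn.commit();      // only now does the work become permanent
                } catch (SQLException e) {
                    conn.rollback();    // on failure, undo the executed-but-uncommitted work
                    throw e;
                }
            }
        }
    }

In 2PC terms, everything before conn.commit() is the phase-one work, and the commit() call is what phase two authorizes.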
2PC phase II: commit transaction
When the coordinator has received the phase-one feedback from all transaction participants (A and B here), feedback that simply tells the coordinator phase one executed successfully, it sends every participant the command to commit the transaction.
More precisely: if the coordinator receives feedback and all participants respond that they can commit, it notifies them to commit; otherwise it notifies them to roll back.
So 2PC, two-phase commit, really is just two steps: one step to execute, one step to commit.
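Putting the two steps together, here is a simplified coordinator sketch. The Participant interface and all method names are hypothetical, invented purely for illustration; a real 2PC implementation also needs logging, retries, and timeouts:

    import java.util.List;

    // Hypothetical participant API, for illustration only.
    interface Participant {
        boolean prepare();  // phase one: execute the transaction, hold locks, do NOT commit
        void commit();      // phase two, on unanimous agreement
        void rollback();    // phase two, if anyone said no
    }

    class TwoPhaseCommitCoordinator {
        boolean runTransaction(List<Participant> participants) {
            // Phase one: ask every participant to execute and report readiness.
            boolean allPrepared = true;
            for (Participant p : participants) {
                if (!p.prepare()) { allPrepared = false; break; }
            }
            // Phase two: commit only if every single participant said yes.
            if (allPrepared) {
                for (Participant p : participants) p.commit();
            } else {
                for (Participant p : participants) p.rollback();
            }
            return allPrepared;
        }
    }

Note how the loop in phase two simply assumes every command arrives; the four disadvantages below all grow out of that assumption.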
Four disadvantages of 2PC: performance
The whole flow is plainly synchronous and blocking: every node operating on the database holds its database resources for the duration. Only when the coordinator has received "ready" feedback from all nodes does it notify them to commit or roll back, and only when a participant finishes its commit or rollback does it release its resources.
Four disadvantages of 2PC: single point of failure
As we just saw, the coordinator is the heart of the protocol. If the coordinator goes down at this moment, its notifications can no longer reach the participants; for example, if they never receive the commit or rollback, the whole transaction stalls.
Four disadvantages of 2PC: data inconsistency
In phase two the coordinator sends the commit or rollback, but nothing guarantees that every node receives the command. It may happen that participant A receives the order and commits the transaction while participant B does not. Network flakiness is an eternal topic, and you can never fully avoid it.
Four disadvantages of 2PC: there is no fault-tolerant mechanism
The coordinator must receive "ready to complete" feedback from every node before it issues the commit instruction. If any participant's response is missing, the coordinator just waits, and a single downed node makes the whole transaction fail and roll back.
2.2 What is 3PC?
3PC improves on 2PC by splitting 2PC's preparation phase in two, yielding three phases: CanCommit, PreCommit, and DoCommit.
It also introduces a timeout mechanism: if a transaction participant receives neither a commit nor a rollback instruction from the coordinator within a given period, it automatically commits locally, which mitigates the coordinator's single point of failure.
3PC phase I: CanCommit
The coordinator first asks: hey, can you folks do this or not? Each participant answers yes or no according to its actual situation.
3PC phase II: PreCommit
If every participant answers yes, the coordinator sends a pre-commit request to all of them and the protocol enters the preparation stage: participants lock resources and execute the transaction without committing, exactly like 2PC's execute-but-don't-commit, and then wait for the coordinator's instruction. If no instruction arrives for long enough, they commit locally once the timeout expires.
But this has drawbacks of its own. Suppose the coordinator successfully delivers a rollback to participants 1 and 2, but participant 3 never receives it; participant 3 then commits automatically. So the timeout mechanism does not fully guarantee data consistency.
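Here is a small sketch of that timeout rule from the participant's side, assuming its PreCommit already succeeded. Every class and method name is hypothetical:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    class ThreePhaseParticipant {
        enum Decision { COMMIT, ROLLBACK }

        private final BlockingQueue<Decision> fromCoordinator = new LinkedBlockingQueue<>();

        // Called after a successful PreCommit: resources locked, transaction executed, not committed.
        void awaitDoCommit(long timeoutMillis) throws InterruptedException {
            Decision d = fromCoordinator.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (d == Decision.ROLLBACK) {
                rollbackLocally();
            } else {
                // Either an explicit COMMIT arrived, or we timed out waiting (d == null).
                // 3PC's rule: after a successful PreCommit, assume commit on timeout.
                commitLocally();
            }
        }

        void onMessage(Decision d) { fromCoordinator.offer(d); }

        private void commitLocally()   { System.out.println("committed locally"); }
        private void rollbackLocally() { System.out.println("rolled back locally"); }
    }

The hole is visible in the else branch: if the coordinator's ROLLBACK message is lost, poll() returns null and the participant commits anyway, which is exactly the inconsistency just described.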
III. Distributed consensus algorithms
3.1 Paxos algorithm
You may have seen my article from last year, High Concurrency from Scratch (3); if you want to know more about building a Zookeeper cluster and leader election, jump over to that article.
Paxos is a consensus algorithm proposed by Leslie Lamport; it is based on message passing and is highly fault-tolerant.
Sounds like a tongue-twister? No matter. All we need to know is that node crashes, message delay, duplication, and loss inevitably occur in a distributed system, and the Paxos algorithm still keeps the data consistent under these failures. What does this have to do with Zookeeper? Zookeeper runs on the ZAB protocol, and underneath, ZAB builds on the Paxos algorithm.
3.2 Roles in Paxos and how they map to a Zookeeper cluster
Proposer: as the name implies, the one who initiates a proposal.
Acceptor: votes on proposals, accepting or rejecting them.
Learner: when a proposal has been accepted by more than half of the Acceptors, learns (adopts) that proposal.
Mapped onto a Zookeeper cluster these become leader, follower, and observer, a bit like the relationship between the chairman, the NPC deputies, and the general public: the chairman puts forward a proposal, the deputies vote on it, and the public passively accepts the result. Compared with 2PC and 3PC there is one big difference: a proposal needs only more than half of the votes to be committed, so this is a weaker form of agreement, whereas 2PC and 3PC demand unanimity and belong to strong consistency.
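The "more than half" rule is the crux of the difference. A toy sketch of just that counting step (this is not a real Paxos implementation, which also needs proposal numbers, promises, and two phases of its own):

    import java.util.List;

    class MajorityVote {
        // A proposal is chosen once a strict majority of acceptors accept it.
        static boolean isChosen(List<Boolean> acceptorVotes) {
            long accepts = acceptorVotes.stream().filter(v -> v).count();
            return accepts > acceptorVotes.size() / 2;
        }

        public static void main(String[] args) {
            // 5 acceptors, 3 accept: 3 > 5/2, so the proposal is chosen and learners adopt it.
            System.out.println(isChosen(List.of(true, true, true, false, false))); // true
        }
    }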
3.3 Raft algorithm
Please click this link: http://thesecretlivesofdata.com/raft/ and I am sure you will grasp it quickly. It is a slide-style walkthrough of what Raft is, and very easy to follow. I will skip the preliminaries here and get straight to the point.
As it says there, Raft is a protocol for implementing distributed consensus.
A node is assumed to be in one of three states:
First, the follower state (drawn with no outline in the visualization)
Second, the candidate state (dotted outline)
Third, the leader state (solid outline). Remember that the leader is elected from among the candidates.
At the start, every node is in the follower state.
Next, the followers look for a leader. If they cannot find one (which means the leader must be dead), they spontaneously become candidates and call a vote, asking the others whether they approve of them becoming leader.
A candidate sends a vote proposal to the other nodes, the other nodes send back their answers, and once it has received feedback from more than half of the nodes it naturally becomes the leader.
After that, write requests go straight to the leader, which broadcasts them to the other followers. As long as more than half of the nodes return positive feedback, the transaction is executed; the leader then sends them a commit command, and the transaction is done.
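A minimal sketch of the election rule just described, including the randomized election timeout the visualization uses to break ties. Every name here is invented for illustration; a real node also needs log replication, term checks on every message, and persistent state:

    import java.util.concurrent.ThreadLocalRandom;

    class RaftNode {
        enum State { FOLLOWER, CANDIDATE, LEADER }

        State state = State.FOLLOWER;
        int term = 0;
        final int clusterSize;

        RaftNode(int clusterSize) { this.clusterSize = clusterSize; }

        // Called when no heartbeat from a leader arrives within the election timeout.
        void onElectionTimeout() {
            state = State.CANDIDATE;
            term++;                          // start a new term
            int votes = 1;                   // vote for ourselves
            votes += requestVotesFromPeers(term);
            if (votes > clusterSize / 2) {
                state = State.LEADER;        // a majority elected us
            }
            // Otherwise remain a candidate; a fresh randomized timeout triggers a retry.
        }

        // Randomized timeout (for example 150-300 ms) keeps candidates from tying forever.
        long nextElectionTimeoutMillis() {
            return ThreadLocalRandom.current().nextLong(150, 300);
        }

        // Stub: a real node sends RequestVote messages here and counts the yes replies.
        int requestVotesFromPeers(int forTerm) { return 0; }
    }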
3.4 ZAB protocol
(Related article: High Concurrency from Scratch (4), on Zookeeper distributed queues.)
The bottom layer of Zookeeper is the ZAB protocol, which implements crash recovery (for when the leader crashes) and message broadcast (so that a client write to Zookeeper lands successfully on multiple nodes). Its main guarantees: a transaction committed on the leader server is eventually committed by all servers, and a transaction that was only ever proposed on the leader is discarded.
3.5 Quorum NWR mechanism
Quorum NWR: the Quorum mechanism is a voting algorithm commonly used in distributed settings to keep data safe and to reach eventual consistency in a distributed environment. Its main principle derives from the pigeonhole principle. Its greatest strength is that it can deliver strong consistency while letting you customize the consistency level.
The pigeonhole principle is also known as the Dirichlet drawer principle.
One simple form: if there are n cages and n + 1 pigeons, and every pigeon is in a cage, then at least one cage holds at least 2 pigeons.
Another form: if there are n cages and kn + 1 pigeons, and every pigeon is in a cage, then at least one cage holds at least k + 1 pigeons.
Why start with the drawer principle? Partly because it is familiar and easy to grasp, and partly because it shares its logic with the Quorum mechanism. The drawer version: with 2 drawers and 3 apples, however you distribute them, one drawer must end up holding at least 2 apples. Now modify it: put 2 red apples in one drawer and 2 green apples in the other; take out any 3 apples, and at least one of them must be red. Treat the red apples as updated, valid data and the green apples as stale, invalid data, and you can see that we do not need to update every copy (not all apples need to be red) in order to read valid data; we merely need to read several copies (take out several apples).
What exactly does the NWR of the Quorum NWR mechanism mean?
N: the number of replicas, i.e. how many copies of a piece of data are kept.
W: the number of nodes on which a write must succeed, i.e. how many copies each write must reach. W must be less than or equal to N.
R: the minimum number of nodes a read must contact to obtain the latest version of the data, i.e. how many copies each successful read must consult.
Summary: these three values determine availability, consistency, and partition fault tolerance. As long as you guarantee W + R > N, a read is certain to see the latest data, and by constraining the numbers of read and write copies the consistency level can be raised all the way to strong consistency!
Consider the following three cases, with N held fixed.
W = 1, R = N: Write Once, Read All
In a distributed environment a write lands on a single copy, so to read the latest data you must read every node and take the newest version. Writes are efficient but reads are inefficient. Consistency is high, but partition fault tolerance is poor and availability is low.
R = 1, W = N: Read One, Write All
In a distributed environment, every node must be synchronized before the write completes, so reading any single node returns the latest data. Reads are efficient but writes are inefficient. Partition fault tolerance is good, consistency is harder to maintain, implementation difficulty is higher, and availability is high.
W = Q, R = Q where Q = N / 2 + 1
Simply put: write to more than half the nodes and read from more than half the nodes, balancing read and write performance. This suits general applications and strikes a balance among partition fault tolerance, availability, and consistency, for example N = 3, W = 2, R = 2. And that is exactly how Zookeeper is designed.
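To see why W + R > N guarantees a fresh read, take N = 3, W = 2, R = 2: the 2 replicas a read consults must overlap the 2 the write reached, so picking the highest version among the R responses always yields the latest value. A sketch under that assumption (versioned replicas; all names invented for illustration):

    import java.util.Comparator;
    import java.util.List;

    class QuorumRead {
        record Replica(long version, String value) {}

        // Read R replicas and keep the value carrying the highest version number.
        static String latest(List<Replica> rResponses) {
            return rResponses.stream()
                    .max(Comparator.comparingLong(Replica::version))
                    .orElseThrow()
                    .value();
        }

        public static void main(String[] args) {
            // N = 3 copies in total; the last write (version 2) reached W = 2 of them.
            // Any R = 2 copies we read must include at least one version-2 copy,
            // because W + R = 4 > N = 3.
            List<Replica> twoOfThree = List.of(
                    new Replica(1, "stale"), new Replica(2, "fresh"));
            System.out.println(latest(twoOfThree)); // prints "fresh"
        }
    }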
One caveat: Zookeeper does not force the client to read more than half of the nodes, so a client is allowed to read data that has not yet been synchronized, although this is unlikely in practice. A client usually cannot connect to a node whose data is behind: whether the cause is a network problem or a machine problem, if the leader cannot push data to a node, a client generally cannot reach that node either. For the corner case where a client reads a node that is mid-synchronization, there are ways to handle it that you can look up yourself.
3.6 CAP theory
CAP theory was first proposed in July 2000. It tells us that a distributed system cannot satisfy all three of C, A, and P at the same time.
C: Consistency, meaning strong consistency across replicas.
A: Availability, meaning the service the system provides must always be usable, and every user request always receives a result.
P: Partition tolerance, meaning the system must continue to provide service that satisfies consistency and availability even when any network partition failure occurs.
Since a distributed system cannot satisfy C, A, and P all at once, we have to choose according to our needs.
Give up P: the simplest, most extreme approach is to put everything on a single node, i.e. keep only one copy of the data with all reads and writes concentrated on one server, which leaves a single point of failure. Giving up P means giving up the system's scalability, so distributed systems generally guarantee P.
Give up A: once the system hits a network partition or another failure, the service waits for some period, and during that wait it cannot serve normally, i.e. it is unavailable.
Give up C: giving up consistency means giving up strong consistency while retaining eventual consistency; how long the data takes to converge depends on the design of the storage system.
So CAP is a pick-2-of-3, and since partition tolerance P is a must for a distributed system, there are really only two choices when the network misbehaves: return possibly stale data, or block and wait; the former sacrifices consistency, the latter sacrifices availability. For example, HBase leans toward data consistency, while Cassandra leans toward availability.
Summary of experience:
Don't waste energy designing a distributed system that satisfies all of CAP at once.
Partition tolerance is a problem a distributed system must face and handle, so focus on finding the right balance between A and C for the characteristics of the business.
Single-machine software need not consider P, so it is necessarily CA, MySQL for example.
Distributed software must consider P and cannot have both A and C in full, so it can only trade A against C, HBase and Redis for example: keep the service basically available and the data eventually consistent.
Therefore, the BASE theory came into being.
3.7 BASE theory
In most cases we do not actually need strong consistency; some businesses can tolerate a certain amount of delayed consistency. So, to balance consistency against efficiency, the eventual-consistency theory BASE was developed, proposed by an architect at eBay. BASE stands for three phrases: Basically Available, Soft state, and Eventually consistent. Its core idea: even when strong consistency is unattainable, each application can, in a way suited to its own business, bring the system to eventual consistency. In a word, don't go to extremes; BASE is the result of trading off C against A in CAP theory.
Not strong consistency, but eventual consistency; not high availability, but basic availability.
Basically Available: when the distributed system fails, it is allowed to lose part of its availability, i.e. it guarantees the availability of the core.
For example, during Taobao's Double 11, to protect the stability of the system, placing orders keeps working normally while edge services may be temporarily unavailable; downtime of some non-core services is acceptable at that point.
Soft State: the system is allowed to have intermediate states that do not affect its overall availability. Distributed storage typically keeps at least three copies of each piece of data, and the lag permitted while replicas synchronize across nodes is the soft state made visible. Put plainly: delays are allowed when nodes synchronize data, and the intermediate state that exists during such a delay does not affect the system as a whole.
Eventually Consistent: after some period of time, all copies of the data in the system reach a consistent state. Weak consistency is the opposite of strong consistency, and eventual consistency is a special case of weak consistency: it requires consistency eventually, not strong consistency in real time.
To sum up: we walked through centralized and distributed service deployment architectures and the central difficulty in designing a distributed system: the data consistency problem.
2PC and 3PC are the classic approaches, but both have drawbacks. Paxos, Raft, and ZAB can achieve consistency even in the face of thorny problems such as abnormal network communication between nodes.
The Quorum NWR mechanism: W + R > N ==> the minority yields to the majority.
The conflict between consistency and availability, CAP and BASE: a distributed system must satisfy P and can only trade off between C and A.
Most systems in practice are BASE systems (basic availability + eventual consistency).
That covers the basic theories of distributed systems in web development; I hope it has been helpful and that you learned something new.