
What are the knowledge points of ZooKeeper?


This article walks through the main knowledge points of ZooKeeper. The content is concise and easy to understand, and I hope you get something out of this detailed introduction.

First, from centralized to distributed

1. Centralized systems

Characteristics of a centralized system: you do not have to deal with network partitions, multi-node downtime, or coordination problems, but it is expensive.

2. Distributed systems

A distributed system is composed of multiple machines communicating over a network; with many machines and nodes spread out, the frequency of failures is high.

Causes of failure: network problems; communication between many machines easily times out, and a link in the middle may be cut off and cause a partition.

Network partition: commonly known as split-brain.

The three-state problem of the network: a request either succeeds, fails, or times out.

So distributed systems mainly face machine and network problems, and the biggest one is the network.

3. ACID

For example, transactions in a database

Atomicity: a transaction either succeeds completely or fails completely.

Consistency: a transaction takes the database from one valid state to another; no abnormal intermediate data is left behind.

Isolation: concurrent sessions do not interfere with each other.

Durability: committed data is stored permanently on disk and is not lost.

4. CAP

Consistency: all machines hold the same copy of the data.

Availability: the system responds to client requests within a reasonable time.

Partition tolerance: the system keeps working even when a network partition occurs.

5. BASE

The main problem in a distributed system is the network, so partition tolerance is guaranteed first, and then a trade-off is made between consistency and availability. BASE was proposed as exactly this trade-off between the C and A of CAP.

BASE stands for Basically Available, Soft state, and Eventually consistent.

Eventual consistency: you commit a change on one machine, and the other machines will eventually synchronize that change to themselves.

Second, consistency

The controlling machine is called the coordinator, and the replica machines are called participants.

1. 2PC

Two phases: committing the transaction (voting) and executing the transaction.

Committing the transaction

1) Inquiry: the coordinator asks each participant whether it can execute the transaction.

2) Initialization: the participants prepare (pre-execute) the commit.

Executing the transaction: the coordinator tells the participants to commit or roll back.

Failure scenarios in 2PC

Single-machine failure

1. If the coordinator fails, a backup coordinator takes over, asks each participant how far it has executed, and then continues from there.

2. If a participant fails, the coordinator waits for it to restart before proceeding.

In both cases there is only blocking; no inconsistency arises.

Simultaneous failure:

Coordinator and participant fail at the same time

For example, there are machines 1, 2, 3, and 4, where 4 is the coordinator and 1, 2, and 3 are participants.

Suppose 4 fails right after sending the commit request to 1 and 2, and participant 3 goes down at the same moment; note that 3 has not received the commit request. Now the backup coordinator starts and asks the participants, but 3 is dead, and nobody knows what state it is in (whether it received the commit request, or what it answered about being able to execute).

In this situation, 1 and 2 are told to abort the transaction, and when 3 recovers it aborts and rolls back no matter what state it was in, which preserves consistency.

This is a weakness of 2PC: when the coordinator and a participant fail at the same time, everything is rolled back, which is inefficient and costly.

Shortcomings

When the coordinator and a participant both fail, the two phases cannot guarantee that the transaction completes consistently.

Consider the case where the coordinator goes down after sending the commit message and the only participant that received it also goes down. Even if a new coordinator is produced through an election protocol, the status of the transaction is uncertain; nobody knows whether it has been committed.
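To make the two phases above concrete, here is a minimal coordinator sketch; the Participant interface and its method names are hypothetical, used only to illustrate the flow described in this section.

```java
import java.util.List;

// Hypothetical participant interface for illustration; not a real library API.
interface Participant {
    boolean canCommit(long txId);   // phase 1: vote yes/no on the transaction
    void doCommit(long txId);       // phase 2: really commit
    void doAbort(long txId);        // phase 2: roll back
}

class TwoPhaseCommitCoordinator {
    // Returns true only if every participant voted yes and was told to commit.
    boolean run(long txId, List<Participant> participants) {
        // Phase 1: ask every participant whether it can execute the transaction.
        for (Participant p : participants) {
            if (!p.canCommit(txId)) {
                // Any "no" vote aborts the whole transaction.
                participants.forEach(x -> x.doAbort(txId));
                return false;
            }
        }
        // Phase 2: all voted yes, so order everyone to commit. If the coordinator
        // crashes right here together with a participant, the outcome is unknown:
        // exactly the 2PC weakness discussed above.
        participants.forEach(x -> x.doCommit(txId));
        return true;
    }
}
```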

2. 3PC

2PC only considered single failures; it did not consider the coordinator and a participant failing at the same time. 3PC is a supplementary protocol that fixes this 2PC vulnerability: it splits the commit-request phase of 2PC into two, adding an extra state (the prepare phase), so that even simultaneous failures do not block and consistency is still guaranteed.

Transaction inquiry phase (can commit)

Transaction preparation phase (pre commit, including pre-executing the commit)

Transaction execution phase (do commit, which changes the state)

The following examples illustrate this:

1. 4 is the coordinator and 1, 2, and 3 are participants. During the can phase, coordinator 4 and participant 3 die at the same time. The backup coordinator starts, asks 1 and 2 for their status, and sees that both are still in the can phase, where the rollback cost is low, so it rolls back directly. When 3 restarts and finds itself in the can phase, it also rolls back and aborts the transaction at that low cost. The three nodes thus remain consistent. Compared with 2PC, 3PC reduces the rollback cost.

2. When all three participants have entered the pre (prepare) phase, it means they have all been asked and have voted yes (i.e., they have all received the data). Suppose coordinator 4 and participant 3 then die suddenly. The backup coordinator asks the remaining participants for their status and sees that they are in the pre phase, which means they have received the data, and so has the dead 3, because it also voted to enter the prepare phase. So the backup coordinator tells everyone to commit. When 3 restarts, it reads its log, finds that it died in the prepare phase, and can also commit, which keeps the data consistent.
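The recovery rule in the two examples above can be summarized in a few lines; the Phase enum and the decide method below are purely illustrative, not part of any real library.

```java
import java.util.List;

// Illustrative phase names for the three 3PC stages described above.
enum Phase { CAN_COMMIT, PRE_COMMIT, DO_COMMIT }

class ThreePhaseRecovery {
    // What the backup coordinator tells the surviving participants to do after
    // the coordinator and one participant crash at the same time.
    static String decide(List<Phase> survivingPhases) {
        boolean reachedPreCommit = survivingPhases.stream()
                .anyMatch(p -> p != Phase.CAN_COMMIT);
        if (!reachedPreCommit) {
            // Everyone is still in the can phase: rollback is cheap, so abort.
            // The crashed participant also aborts when it restarts in the can phase.
            return "ABORT";
        }
        // A survivor reached pre-commit, so all participants (including the dead
        // one) already voted yes and received the data: commit.
        return "COMMIT";
    }
}
```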

3. Paxos

Paxos is named after a small island, where every decision must go through a proposal and takes effect once more than half of the members accept it. Each proposal has a globally unique number, which can only increase and never go backward.

A proposal is accepted only when its id is larger than the largest id each member has recorded.

The first stage: the proposer sends a proposal to each member and then waits for the members to reply with agree or disagree.

The second stage: if more than half agree, the transaction will be executed, otherwise it will not be executed.

If more than half agree, the proposal is passed, and the proposer tells the remaining members to synchronize the data and update their largest recorded id.

Problem: concurrency is common in a distributed system. Suppose there are two proposers, p1 and p2, that put forward proposals at the same time with the same id, say 3 for both (and assume the id in the members' hands is 2). p1 reaches the members first, so they agree to its proposal with id 3; when p2 arrives with the same id 3, they refuse because it is not larger than the id they have recorded. p2 then goes back, increases its id, and asks again, and this time the members agree. Now p1, having earlier gathered a majority, tells the members to update the id and synchronize the data, but finds that the members' recorded id is larger than its own, so p1 also increases its id and retries. In this extreme case the two proposers keep outbidding each other and no progress is made (a livelock).

The solution is to allow only one proposer, i.e., the "president" in Paxos.
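Below is a bare-bones sketch of the two stages and of how competing proposers can keep outbidding each other; the Acceptor and Proposer classes are illustrative only and omit the value/accept half of real Paxos.

```java
import java.util.List;

// Each "member" promises only proposals whose id exceeds everything seen so far.
class Acceptor {
    private long promisedId = 0;

    synchronized boolean prepare(long proposalId) {        // stage 1
        if (proposalId > promisedId) {
            promisedId = proposalId;
            return true;                                    // agree
        }
        return false;                                       // disagree
    }
}

class Proposer {
    private long id;
    Proposer(long startId) { this.id = startId; }

    // Keeps raising its id until a majority of members promise (stage 2 would
    // then execute the transaction). With two proposers doing this at once,
    // this retry loop is exactly where the livelock described above appears;
    // electing a single proposer (the "president") avoids it.
    boolean propose(List<Acceptor> members) {
        while (true) {
            long agreed = members.stream().filter(m -> m.prepare(id)).count();
            if (agreed > members.size() / 2) return true;
            id++;
        }
    }
}
```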

Third, ZooKeeper

1. What is ZooKeeper?

ZooKeeper is an engineering solution to the distributed consistency problem.

ZooKeeper does not use the Paxos protocol directly. On the basis of Paxos, it proposes a highly available consistency protocol that fits its own practical application scenario: ZAB, the ZooKeeper Atomic Broadcast protocol.

Features of ZooKeeper's distributed consistency:

Sequential consistency: a client connects to one ZooKeeper node and initiates transactions; the transactions are queued to the leader, which proposes them one by one in order.

Single view: the data on every node is the same, so a client sees the same data no matter which node it connects to.

Reliability: if the server acknowledges a client's request, the request has really been applied.

Timeliness: ZooKeeper guarantees that a client can read the latest data within a bounded period of time, say 5 seconds. This is a consequence of eventual consistency.

2. Zookeeper design goals

Simple data model: a folder-like tree structure.

Cluster support: ZooKeeper can be deployed as a cluster.

Sequential access: each client transaction request receives a unique, increasing id that records the order of operations.

High performance: mainly for read operations.

3. Several roles of zookeeper

ZooKeeper has several roles: leader, follower, and observer. The observer is usually not configured; it does not take part in voting, and it can improve the cluster's read performance without affecting write performance.

The nodes in ZooKeeper include physical machine nodes and znode data nodes. A znode data node is a directory-like folder. Data nodes come in two kinds: persistent nodes and ephemeral (temporary) nodes.

4. The watcher (listening) mechanism

ZooKeeper has a watcher (listening) mechanism. Take an ephemeral data node as an example: if the client session is interrupted, the ephemeral node is deleted and the registered watcher is notified. This is the basis of Hadoop's HA mechanism: ZKFC uses ZooKeeper's watcher mechanism to switch over automatically.
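A minimal sketch of this mechanism with the standard org.apache.zookeeper client is shown below; the connect string and znode path are placeholders, and in a real deployment the node creator and the watcher would be different processes (as ZKFC and the NameNodes are).

```java
import org.apache.zookeeper.*;

public class WatcherDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, e -> { });

        // The "active" client creates an ephemeral node tied to its session.
        zk.create("/active-demo", "host-a".getBytes(),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // A standby client registers a watcher; when the active's session dies,
        // the ephemeral node is deleted and a NodeDeleted event fires, which is
        // the hook that automatic failover logic reacts to.
        zk.exists("/active-demo", event -> {
            if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                System.out.println("active is gone, take over here");
            }
        });

        Thread.sleep(60_000);   // keep the session alive for the demo
    }
}
```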

5. Permissions of zookeeper

A ZooKeeper data node is a folder-like directory, and ZooKeeper has its own permission mechanism for it: ACL.

Concretely, the permissions cover reading, writing, deleting, setting permissions (admin), and creating child directories.
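As an illustration of the ACL mechanism, the sketch below creates a znode readable and writable only by one digest user; the credentials, path, and connect string are placeholders.

```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.data.Id;
import org.apache.zookeeper.server.auth.DigestAuthenticationProvider;
import java.util.Collections;

public class AclDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, e -> { });
        zk.addAuthInfo("digest", "admin:secret".getBytes());

        // Grant read, write, delete and admin only to the "admin" digest user.
        ACL acl = new ACL(
                ZooDefs.Perms.READ | ZooDefs.Perms.WRITE
                        | ZooDefs.Perms.DELETE | ZooDefs.Perms.ADMIN,
                new Id("digest", DigestAuthenticationProvider.generateDigest("admin:secret")));

        zk.create("/protected-config", "value".getBytes(),
                  Collections.singletonList(acl), CreateMode.PERSISTENT);
    }
}
```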

Fourth, the ZAB protocol

1. The three states of ZAB:

Looking (election): the node is in the election state, either when the system first starts or after the Leader crashes.

Following: the state of a Follower node; the Follower is in the data synchronization phase with the Leader.

Leading: the state of the Leader; the cluster currently has a Leader main process.

2. The three macro stages of zab:

Crash recovery

Quick election

Atomic broadcast

3. The four micro stages of zab:

Election

Discovery

Synchronization

Broadcast

The client submits a request to the leader, the leader initiates a proposal, and the followers vote on it; all of this belongs to atomic broadcast.

4. The node states of ZAB explained

When ZooKeeper starts, all nodes are initially in the Looking state; the cluster then elects a Leader node, and the elected node switches to the Leading state.

When a node finds that a Leader has been elected in the cluster, the node will switch to the Following state and then keep synchronized with the Leader node

When the Following node loses contact with the Leader, the Follower node will switch to the Looking state and start a new round of election.

Each node transitions between Looking, Following, and Leading states throughout the life cycle of ZooKeeper.

An election takes place at startup and whenever the Leader crashes.

After the Leader is elected, ZAB enters the atomic broadcast phase, in which the Leader keeps every node synchronized with itself; a Follower can only synchronize with one Leader. The Leader and the Follower nodes use heartbeats to sense each other's existence.

As long as the Leader receives heartbeats from a Follower within the timeout, that Follower stays connected to it. If the Leader does not receive heartbeats from more than half of the Followers within the timeout, or the TCP connections are broken, the Leader gives up its leadership and switches to the Looking state; all Follower nodes likewise abandon that Leader, switch to the Looking state, and start a new round of election.

5. Explain the four stages of zab in detail:

The first stage: election phase (producing a prospective leader)

Conditions for becoming a leader:

Choose the one with the largest epoch

If the epochs are equal, choose the one with the largest zxid

If both the epoch and the zxid are equal, choose the one with the largest server id (the server's myid, kept in the myid file under the dataDir configured in zoo.cfg).

At the start of the election, every node votes for itself by default. When it receives ballots from other nodes, it updates its own ballot according to the conditions above and resends it to the other nodes. Once a node obtains more than half of the votes, it sets its own state to Leading, and the other nodes set theirs to Following.
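The comparison rule above can be written down directly; the Vote class below is an illustrative stand-in for ZooKeeper's internal ballot, not its actual implementation.

```java
// Ballot comparison: epoch first, then zxid, then server id (myid).
class Vote {
    final long epoch, zxid, serverId;
    Vote(long epoch, long zxid, long serverId) {
        this.epoch = epoch; this.zxid = zxid; this.serverId = serverId;
    }

    // Returns true if the incoming ballot should replace our current one.
    static boolean wins(Vote incoming, Vote current) {
        if (incoming.epoch != current.epoch) return incoming.epoch > current.epoch;
        if (incoming.zxid  != current.zxid)  return incoming.zxid  > current.zxid;
        return incoming.serverId > current.serverId;
    }
}
```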

What is epoch and what is zxid?

First of all, a ZooKeeper transaction consists of two parts: an id and the data. The id is the globally unique transaction id (each new id is the last id plus 1), and the data is the concrete operation being performed.

Every ZooKeeper request is executed in strict sequential order.

Epoch is the leader identity and zxid is the transaction identity.

Epoch means the "era" of a leader: when one leader dies and another takes office, a new era of leadership begins. Whenever a new leader is created, the transaction number starts again from 0.

The zxid is the combined identifier: the first (high) 32 bits are the leader number (epoch), and the last (low) 32 bits are the transaction number under that leader.
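A small worked example of this layout, assuming the 32/32-bit split just described:

```java
public class ZxidLayout {
    static long zxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xFFFFFFFFL);   // high 32 bits: epoch, low 32: counter
    }
    static long epochOf(long zxid)   { return zxid >>> 32; }
    static long counterOf(long zxid) { return zxid & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        long z = zxid(3, 7);   // 3rd leader era, 7th transaction under that leader
        System.out.printf("zxid=0x%016x epoch=%d counter=%d%n", z, epochOf(z), counterOf(z));
        // prints: zxid=0x0000000300000007 epoch=3 counter=7
    }
}
```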

The second stage: discovery phase

The discovery phase is mainly to find the largest epoch and the largest transaction number.

As mentioned earlier, the fast election produces a prospective leader, and the other nodes become followers. In the discovery phase each follower reports its own epoch and transaction number to the leader; the leader sorts them, selects the largest epoch and the largest transaction number, and then tells the followers to update their epoch.

The third stage: synchronization phase

From the previous stage the leader knows the largest transaction number; it then tells the other followers to synchronize data from it. Their transaction numbers may differ, so they synchronize in order to keep the data eventually consistent.

The fourth stage: atomic broadcast phase

At this point the leader truly provides service to the outside world: it accepts client requests, generates a proposal, and once more than half of the nodes agree, it commits the transaction. The remaining nodes then synchronize the data directly from the leader.

What about the transactions of the original, crashed leader?

When it restarts, it finds that its era (epoch) is out of date, discards its uncommitted transactions, discovers the current leader, becomes a follower, and then synchronizes the data.

In the election, the node with the most up-to-date proposal history (the largest lastZxid) is elected leader, which removes the need for a separate step to discover the latest proposal. The premise is that the node with the latest proposal also has the latest commit record.

6. The difference between zab and paxos

Paxos does not specify a leader election protocol; it only defines the consistency protocol.

Fifth, application scenarios of ZooKeeper in big data

ZooKeeper directories (znodes) have several features: ephemeral nodes, persistent nodes, sequential nodes, strong consistency (sequential access), and the watcher mechanism.

Taking advantage of these characteristics, we can achieve:

Publish/subscribe, e.g., configuration information

Load balancing, e.g., balancing Kafka production and consumption

Master election, e.g., HBase uses it for HMaster election

Master/standby switchover, e.g., HDFS HA uses it for failover.

1. Publish and subscribe

For example, our database configuration information file can be placed on ZooKeeper.

By using ZooKeeper's watcher mechanism to propagate configuration changes, a running program can pick up the latest configuration without being stopped and restarted.
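A minimal subscriber sketch with the standard client is shown below; the znode path /config/db is a placeholder, and the one-shot watcher is re-registered on every change so the program keeps seeing the latest configuration.

```java
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ConfigSubscriber {
    private final ZooKeeper zk;
    private volatile String dbConfig;

    ConfigSubscriber(ZooKeeper zk) { this.zk = zk; }

    // Reads the config and re-registers the watcher, so every later change
    // triggers another read without restarting the program.
    void subscribe() throws Exception {
        byte[] data = zk.getData("/config/db", event -> {
            if (event.getType() == Watcher.Event.EventType.NodeDataChanged) {
                try { subscribe(); } catch (Exception ignored) { }
            }
        }, new Stat());
        dbConfig = new String(data);
        System.out.println("current db config: " + dbConfig);
    }
}
```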

2. ZooKeeper in HA (active/standby switching)

HDFS HA uses ZooKeeper to switch between Active and Standby.

General steps:

1) Multiple NameNodes register the same lock znode with ZooKeeper at the same time. Because ZooKeeper is strongly consistent, only one registration can succeed, and the one that succeeds becomes Active.

2) Those that fail to register become Standby and register a watcher on the lock znode.

Note: the lock znode is an ephemeral node, so when the Active dies, the znode disappears. The lock znode is also protected by an ACL, which prevents split-brain if a falsely dead Active reconnects.

3) Active/standby switchover: when the Active goes down, its session ends and the ephemeral znode is deleted automatically. The Standbys are notified through their watchers that the znode is gone, and each one tries to re-create the permission-protected ephemeral znode at the same time. The one that succeeds becomes Active; the ones that fail remain Standby.

4) When the crashed Active restarts and finds that the lock znode is no longer under its control (it has no permission on it), it automatically switches to the Standby state.

This is ZooKeeper's active/standby switching application, implemented with ephemeral nodes, ACLs, and the watcher mechanism; a minimal sketch follows.
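A simplified sketch of steps 1) to 3) is shown below; the lock path and data are placeholders, the ACL is left open for brevity (a real setup would protect it as described above), and real HDFS HA does this inside ZKFC rather than in user code.

```java
import org.apache.zookeeper.*;

public class FailoverCandidate {
    private final ZooKeeper zk;
    private static final String LOCK = "/ha-active-lock";

    FailoverCandidate(ZooKeeper zk) { this.zk = zk; }

    void tryToBecomeActive() throws Exception {
        try {
            // Every candidate races to create the same ephemeral znode;
            // ZooKeeper's strong consistency guarantees only one succeeds.
            zk.create(LOCK, "me".getBytes(),
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            System.out.println("I am Active");
        } catch (KeeperException.NodeExistsException e) {
            // Lost the race: stay Standby and watch the lock znode, retrying
            // when the current Active's session dies and the znode disappears.
            System.out.println("I am Standby");
            zk.exists(LOCK, event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                    try { tryToBecomeActive(); } catch (Exception ignored) { }
                }
            });
        }
    }
}
```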

3. Master election

Using ZooKeeper for master election is actually very simple; it works just like the active/standby switching above.

Thanks to strong consistency, when several nodes try to create the same directory at the same time, only one can succeed.

The node that succeeds gets a success status and the others get an exception; the successful node becomes the master and the others become slaves.

4. Application of ZooKeeper in HBase

HMaster monitors whether a RegionServer is down.

First, each RS (short for RegionServer) registers an ephemeral znode under the rs/[hostname] directory in ZooKeeper, and then the HMaster registers a watcher on the rs directory to monitor changes, i.e., whether an RS server has gone down.

Metadata storage (publish/subscribe): the location and status of every region are stored in ZooKeeper, so anyone can subscribe to the current state of the regions, e.g., whether a region is being merged or split, and how many regions are on which RegionServer. A client therefore first reads ZooKeeper to get the location information and then reads the data directly, without going through the HMaster.
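A sketch of the master-side monitoring is shown below; the path /rs is illustrative (real HBase keeps its znodes under a configurable base path such as /hbase), and the child watcher is re-registered on every change.

```java
import org.apache.zookeeper.ZooKeeper;
import java.util.List;

public class RegionServerMonitor {
    private final ZooKeeper zk;

    RegionServerMonitor(ZooKeeper zk) { this.zk = zk; }

    // Lists the live RegionServers (one ephemeral child znode per live RS) and
    // re-registers the watcher so the master is notified when one disappears.
    void watchRegionServers() throws Exception {
        List<String> live = zk.getChildren("/rs", event -> {
            try { watchRegionServers(); } catch (Exception ignored) { }
        });
        System.out.println("live regionservers: " + live);
    }
}
```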

5. Application of ZooKeeper in Kafka

Information Kafka registers in ZooKeeper:

First, Kafka creates an ephemeral znode for each broker in ZooKeeper.

/brokers/ids/[0...N] indicates that the broker is still alive. Topic information is also created under ZooKeeper, e.g., /brokers/topics/[topicname] with its partition information. If there are consumers, they also record their consumption offset information in ZooKeeper, again as ephemeral znodes.

Kafka registers broker and topic information so that production and consumption can be load-balanced, which leverages ZooKeeper. Producers and consumers watch the relationships between brokers and topics, and between topics and partitions, and rebalance when they change.

Those are the ZooKeeper knowledge points. Have you learned any new knowledge or skills? If you want to learn more skills or enrich your knowledge, you are welcome to follow the industry information channel.
