What is the principle and mechanism of ZooKeeper in the development of Java big data

2025-07-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains the principles and mechanisms behind ZooKeeper as used in Java big data development: the main configuration parameters, Leader election, node types, watches, and the data write path.

Let's start with the configuration parameters.

1.1 Parameter interpretation

The parameters in ZooKeeper's zoo.cfg configuration file are interpreted as follows:

① tickTime =2000: heartbeat interval between ZooKeeper servers and clients, in milliseconds

tickTime is the base time unit ZooKeeper uses. It is the interval at which heartbeats are exchanged between servers, and between clients and servers: one heartbeat is sent every tickTime milliseconds. It also determines the minimum session timeout, which is twice the heartbeat interval (2 * tickTime).

② initLimit =10: Leader-Follower initial communication time limit

The maximum number of heartbeats (in units of tickTime) that the Leader tolerates while a Follower makes its initial connection to it. It bounds the time a ZooKeeper server in the cluster may take to connect to the Leader; with the values shown here that is 10 * 2000 ms = 20 seconds.

③ syncLimit =5: Leader-Follower synchronization time limit

The maximum response time between the Leader and a Follower in the cluster, in units of tickTime. If a Follower does not respond within syncLimit * tickTime, the Leader considers the Follower dead and removes it from the server list.

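④ dataDir: data file directory and data persistence path

The directory in which ZooKeeper saves its data: the snapshots of its in-memory database and, unless dataLogDir is set separately, the transaction logs as well.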

⑤ clientPort =2181: client connection port

The port on which the server listens for client connections.
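Because initLimit and syncLimit are counted in ticks, the effective time limits only become visible once they are multiplied by tickTime. The following is a minimal sketch of that arithmetic, assuming the values quoted above. The class name ZooCfgTimeouts and the dataDir path are placeholders chosen for illustration, and the parsing relies on java.util.Properties rather than on any ZooKeeper class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ZooCfgTimeouts {
    // Sample zoo.cfg content with the values quoted above; the dataDir path is a placeholder.
    static final String SAMPLE_CFG = String.join("\n",
            "tickTime=2000",
            "initLimit=10",
            "syncLimit=5",
            "dataDir=/tmp/zookeeper",
            "clientPort=2181");

    // zoo.cfg consists of key=value lines, so java.util.Properties can read it.
    static Properties parse(String cfg) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(cfg));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    static int intValue(Properties p, String key) {
        return Integer.parseInt(p.getProperty(key).trim());
    }

    // initLimit is a number of ticks: multiply by tickTime to get milliseconds.
    static int initTimeoutMs(Properties p) {
        return intValue(p, "initLimit") * intValue(p, "tickTime");
    }

    // syncLimit is a number of ticks as well.
    static int syncTimeoutMs(Properties p) {
        return intValue(p, "syncLimit") * intValue(p, "tickTime");
    }

    // The minimum session timeout is twice the tick time.
    static int minSessionTimeoutMs(Properties p) {
        return 2 * intValue(p, "tickTime");
    }

    public static void main(String[] args) {
        Properties p = parse(SAMPLE_CFG);
        System.out.println("initial connection limit (ms): " + initTimeoutMs(p));
        System.out.println("sync limit (ms): " + syncTimeoutMs(p));
        System.out.println("minimum session timeout (ms): " + minSessionTimeoutMs(p));
    }
}
```

With tickTime=2000, a Follower therefore has 20 seconds for its initial connection and 10 seconds to answer the Leader afterwards.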

1.2 Internals

1.2.1 Election mechanism

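Majority mechanism: the cluster is available as long as more than half of its machines are alive. This is why ZooKeeper is best installed on an odd number of servers: adding one server to an odd-sized cluster raises the required majority without letting the cluster survive any additional failure.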
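The majority rule is plain integer arithmetic, and writing it out shows why an even cluster size buys nothing. This is a minimal sketch; the class QuorumMath and its method names are illustrative and not part of ZooKeeper.

```java
class QuorumMath {
    // Smallest number of servers that is "more than half" of a cluster of size n.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // Number of servers that can fail while a majority is still alive.
    static int toleratedFailures(int n) {
        return n - quorum(n);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 6; n++) {
            System.out.printf("servers=%d majority=%d tolerated failures=%d%n",
                    n, quorum(n), toleratedFailures(n));
        }
    }
}
```

Clusters of 3 and 4 servers both tolerate a single failure, and clusters of 5 and 6 both tolerate two, so the extra even-numbered server adds cost but no availability.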

ZooKeeper does not designate a Master and Slaves in the configuration file. At runtime, however, one node acts as the Leader and the other machines are Followers. The Leader is chosen dynamically through an internal election mechanism.

Consider a ZK cluster composed of five servers with ids 1 to 5. They are all newly started, so none of them has historical data and they are identical in terms of the amount of data stored. Assuming these servers start up in sequence, let's see what happens:

(1) Server 1 starts. It is the only server running at this point, so the messages it sends out get no response and its election state remains LOOKING;

(2) Server 2 starts and communicates with server 1, which started first, to exchange election results. Since neither of them has historical data, server 2 wins because it has the larger id. However, it has not yet been elected by more than half of the servers (a majority in this example is 3), so servers 1 and 2 both remain in the LOOKING state;

(3) Server 3 starts. By the same reasoning, server 3 has the largest id among servers 1, 2 and 3 and wins their votes. Unlike in the previous step, three servers have now elected it, which is a majority, so it becomes the Leader of this election;

(4) Server 4 starts. By the previous reasoning, server 4 has the largest id among servers 1, 2, 3 and 4, but since more than half of the servers have already elected server 3, it can only become a Follower;

(5) Server 5 starts and, like server 4, becomes a Follower.
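The start-up sequence in steps (1) to (5) can be replayed with a few lines of code. This is a deliberately simplified sketch under the same assumptions as the walkthrough (no historical data, servers start strictly one after another, every running server votes for the largest id it knows). It is not ZooKeeper's actual election algorithm, and the class StartupElection is invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class StartupElection {
    /**
     * Starts servers 1..clusterSize in order and records each server's state
     * right after it starts. Returns the Leader's id, or -1 if none was elected.
     */
    static int simulate(int clusterSize, List<String> log) {
        int majority = clusterSize / 2 + 1;
        int leader = -1;
        for (int id = 1; id <= clusterSize; id++) {
            if (leader != -1) {
                // A Leader already holds a majority, so a newcomer can only follow it.
                log.add("server " + id + ": FOLLOWING (leader is " + leader + ")");
                continue;
            }
            // Servers 1..id are running and, having no history, all vote for the
            // largest id, which is the server that has just started.
            int votes = id;
            if (votes >= majority) {
                leader = id;
                log.add("server " + id + ": LEADING with " + votes + " votes");
            } else {
                log.add("server " + id + ": LOOKING (has " + votes
                        + " votes, needs " + majority + ")");
            }
        }
        return leader;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        int leader = simulate(5, log);
        log.forEach(System.out::println);
        System.out.println("elected leader: " + leader);
    }
}
```

For five servers the sketch elects server 3, matching the walkthrough: servers 1 and 2 are LOOKING when they start, and servers 4 and 5 join as Followers.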

1.2.2 Node types

① There are two basic types of Znode:

Ephemeral: when the client's session with the server ends (the client disconnects), the nodes it created are deleted automatically

Persistent: after the client disconnects from the server, the nodes it created are not deleted

② Combined with sequential numbering, this gives four types of Znode directory nodes (persistent by default)

First, persistent directory node (PERSISTENT)

After the client disconnects from ZooKeeper, the node still exists.

Second, persistent sequential numbering directory node (PERSISTENT_SEQUENTIAL)

After the client disconnects from ZooKeeper, the node still exists, and ZooKeeper appends a sequence number to the node name;

Third, temporary directory node (EPHEMERAL)

After the client disconnects from ZooKeeper, the node is deleted;

Fourth, temporary sequential numbering directory node (EPHEMERAL_SEQUENTIAL)

After the client disconnects from ZooKeeper, the node is deleted, and ZooKeeper appends a sequence number to the node name.
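The four modes differ along two independent axes: whether the node outlives its creator, and whether its name receives a sequence number. The following toy in-memory model makes that concrete. It is a sketch for illustration only and not the ZooKeeper client API: the class ZnodeModel, its methods and the sample paths are all hypothetical, although the mode names and the 10-digit zero-padded suffix mirror what ZooKeeper uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class ZnodeModel {
    enum Mode {
        PERSISTENT(false, false),
        PERSISTENT_SEQUENTIAL(false, true),
        EPHEMERAL(true, false),
        EPHEMERAL_SEQUENTIAL(true, true);

        final boolean ephemeral;
        final boolean sequential;

        Mode(boolean ephemeral, boolean sequential) {
            this.ephemeral = ephemeral;
            this.sequential = sequential;
        }
    }

    // path -> owning session id; 0 means "no owner", i.e. a persistent node.
    private final Map<String, Long> nodes = new TreeMap<>();
    // One sequence counter per parent path.
    private final Map<String, Integer> counters = new HashMap<>();

    /** Creates a node for the given (non-zero) session and returns its actual path. */
    String create(long sessionId, String path, Mode mode) {
        String actual = path;
        if (mode.sequential) {
            String parent = path.substring(0, path.lastIndexOf('/') + 1);
            int seq = counters.merge(parent, 1, Integer::sum) - 1;
            actual = path + String.format("%010d", seq); // zero-padded suffix
        }
        nodes.put(actual, mode.ephemeral ? sessionId : 0L);
        return actual;
    }

    /** Ending a session deletes every ephemeral node that session created. */
    void closeSession(long sessionId) {
        nodes.values().removeIf(owner -> owner == sessionId);
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    public static void main(String[] args) {
        ZnodeModel zk = new ZnodeModel();
        System.out.println(zk.create(1L, "/app/config", Mode.PERSISTENT));
        System.out.println(zk.create(1L, "/app/job-", Mode.PERSISTENT_SEQUENTIAL));
        System.out.println(zk.create(1L, "/app/worker", Mode.EPHEMERAL));
        System.out.println(zk.create(1L, "/app/lock-", Mode.EPHEMERAL_SEQUENTIAL));
        zk.closeSession(1L);
        System.out.println("worker still exists: " + zk.exists("/app/worker"));
        System.out.println("config still exists: " + zk.exists("/app/config"));
    }
}
```

After closeSession only the two persistent nodes remain, while the sequential suffixes keep counting per parent regardless of whether the node was ephemeral.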

1.2.3 Principle of monitoring

(1) When working with ZooKeeper's API, first create the main() method, that is, the main thread;

(2) Create the ZooKeeper client (zkClient) in the main thread, which in turn creates two threads:

The connect thread is responsible for network communication, that is, for connecting to the servers;

The listener thread is responsible for listening;

(3) The client connects to the server through the connect thread. In the call getChildren("/", true), "/" means that the root directory is watched, true means that a watch is registered, and false means that no watch is registered;

(4) The registered watch is added to ZooKeeper's list of registered watches, indicating that the path / on this server, i.e. the root directory, is being watched by the client;

(5) Once the data or the children under the watched path change on the server, ZooKeeper sends this event to the listener thread;

(6) The listener thread calls the process method internally, which takes appropriate measures, such as updating the server list.

Monitoring types:

(1) Listen for changes in node data: get path [watch]

(2) Listen for child nodes being added or removed: ls path [watch]
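The registration and callback flow in steps (1) to (6) can be modelled without a running server. The sketch below is a toy model and not the ZooKeeper client: WatchModel, its Watcher interface and the method childrenChanged are invented for illustration, and here getChildren takes a callback instead of the boolean flag described above. It mirrors two things: the callback runs on a separate listener thread rather than on the main thread, and, as with standard ZooKeeper watches, a watch fires once and must be registered again to receive later changes.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class WatchModel {
    interface Watcher {
        void process(String eventType, String path);
    }

    // The "registered watch list": watched path -> watchers waiting on it.
    private final Map<String, List<Watcher>> childWatches = new ConcurrentHashMap<>();
    // A single dedicated thread plays the role of the listener thread.
    private final ExecutorService listener =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "listener"));

    /** Registers a watch on the children of the given path. */
    void getChildren(String path, Watcher watcher) {
        childWatches.computeIfAbsent(path, k -> new CopyOnWriteArrayList<>()).add(watcher);
    }

    /** Simulates the server noticing that the children of a path changed. */
    void childrenChanged(String path) {
        List<Watcher> fired = childWatches.remove(path); // one-time trigger
        if (fired == null) {
            return; // nobody is watching this path any more
        }
        for (Watcher w : fired) {
            listener.execute(() -> w.process("NodeChildrenChanged", path));
        }
    }

    /** Stops the listener thread after pending events have been delivered. */
    void close() {
        listener.shutdown();
        try {
            listener.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static List<String> demo() {
        List<String> events = new CopyOnWriteArrayList<>();
        WatchModel zk = new WatchModel();
        zk.getChildren("/", (type, path) ->
                events.add(Thread.currentThread().getName() + " got " + type + " on " + path));
        zk.childrenChanged("/"); // delivered to the watcher
        zk.childrenChanged("/"); // not delivered: the watch was already consumed
        zk.close();
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

To keep receiving notifications in this model, the process callback would call getChildren again, which is also the usual pattern with real one-time watches.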

1.2.4 Data Writing

(1) The Client sends a write request to one of the ZooKeeper servers (Server1 in this example);

(2) If that Server is not the Leader, it forwards the request to the Leader. The Leader broadcasts the write request to every Server, and each Server notifies the Leader once it has written the data successfully;

(3) When the Leader has received success notifications from a majority of the Servers, the write is considered successful. For example, with three nodes, the write succeeds as soon as two of them have written the data;

(4) Server1 then notifies the Client that the data was written successfully, and the whole write operation is complete.
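The majority rule in step (3) can be sketched as a small simulation. This is a simplified model under stated assumptions, not ZooKeeper's replication protocol: the class QuorumWrite, the Server type and its up flag are hypothetical, and network forwarding is reduced to a loop over the servers.

```java
import java.util.ArrayList;
import java.util.List;

class QuorumWrite {
    static class Server {
        final int id;
        boolean up = true; // whether the server can be reached
        String value;      // last committed value

        Server(int id) {
            this.id = id;
        }
    }

    static List<Server> ensemble(int size) {
        List<Server> servers = new ArrayList<>();
        for (int id = 1; id <= size; id++) {
            servers.add(new Server(id));
        }
        return servers;
    }

    /**
     * The Leader broadcasts the write and counts acknowledgements.
     * The write succeeds only if a majority of all servers acknowledged it.
     */
    static boolean write(List<Server> servers, String value) {
        int majority = servers.size() / 2 + 1;
        int acks = 0;
        for (Server s : servers) {
            if (s.up) {
                acks++; // a reachable server acknowledges the request
            }
        }
        if (acks < majority) {
            return false; // no majority: nothing is committed
        }
        for (Server s : servers) {
            if (s.up) {
                s.value = value; // commit on the servers that acknowledged
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Server> servers = ensemble(3);
        servers.get(2).up = false; // one of three servers is down
        System.out.println("write with 2 of 3 up: " + write(servers, "v1"));
        servers.get(1).up = false; // now two of three are down
        System.out.println("write with 1 of 3 up: " + write(servers, "v2"));
    }
}
```

With three servers the first write succeeds because two acknowledgements form a majority, and the second fails because a single server does not.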
