What is the zookeeper framework like?

2025-07-11 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

What is the ZooKeeper framework like? Many newcomers are unsure how to answer that. This article summarizes what ZooKeeper is, which problem it solves, and how it is used, in the hope that it clears things up.

In summary, ZooKeeper is by far the most widely used distributed component. It has a single, narrow responsibility, but that responsibility is very important.

What exactly is ZooKeeper?

1) ZooKeeper is a framework, originally developed at Yahoo, for handling consistency in distributed systems.

2) Background: ZooKeeper began as a by-product of Hadoop development. Because consistency is hard to get right in distributed systems, and there is no reason for every distributed system to reinvent the wheel, most later distributed systems adopted ZooKeeper. It has become a basic building block for them, which shows how important it is. (Compare Netty, which I studied earlier: an excellent wrapper around socket network programming that serves as a communication component framework. If you don't know Netty, you can bookmark this article and spend five minutes learning it first. Netty is simple: it is just a JAR used as a communication component.)

3) Application scenarios: the well-known Hadoop, Kafka, and Dubbo all build on ZooKeeper.

4) Benefit: ensuring the eventual consistency of data in a distributed environment is the problem ZooKeeper solves.

5) Consistency has come up several times above, so what exactly is it? Here is a brief explanation of the concept:

Consistency really revolves around visibility: who can see the data, whether they can see it, and when. For example, a seller on Taobao lists a heavily promoted product in the back office, and the update is written to the master database through server A. Suppose that immediately afterward a buyer queries the product through application server B. The seller's update has succeeded, but the buyer cannot see it yet; only after the master database's data has been synchronized to the slave database can the buyer find the product.

If the buyer can see the seller's update immediately after it succeeds, that is strong consistency.

If there is no guarantee that the buyer can see the seller's update after it succeeds, that is weak consistency.

If the buyer can see the seller's update after some period of time has passed, that is eventual consistency.
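The difference between these levels can be illustrated with a small simulation. This is only a sketch: the `ReplicatedStore` class, its method names, and the explicit `sync()` step are assumptions made up for illustration, standing in for a master database, a slave database, and the replication delay between them.

```python
class ReplicatedStore:
    """Toy master/slave pair; replication happens only when sync() is called."""

    def __init__(self):
        self.master = {}
        self.slave = {}
        self.pending = []  # writes accepted by the master but not yet replicated

    def write(self, key, value):
        # The seller's update lands on the master first.
        self.master[key] = value
        self.pending.append((key, value))

    def read_master(self, key):
        # Strongly consistent read: the master holds every acknowledged write.
        return self.master.get(key)

    def read_slave(self, key):
        # Weak/eventual read: the slave may lag behind the master.
        return self.slave.get(key)

    def sync(self):
        # Stands in for the replication delay elapsing.
        for key, value in self.pending:
            self.slave[key] = value
        self.pending.clear()


store = ReplicatedStore()
store.write("item-1", "on sale")
before_sync = store.read_slave("item-1")    # buyer cannot see the update yet
strong_read = store.read_master("item-1")   # a read from the master sees it
store.sync()
after_sync = store.read_slave("item-1")     # buyer sees it once replication catches up
```

Here the read from the slave before `sync()` returns nothing, while the same read afterward returns the seller's update, which is the eventual-consistency behavior described above.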

6) Some common ways to handle consistency problems:

Query, retry, and compensation. When the outcome of a call in a distributed application is uncertain, first query the current state through a query interface. If the state is inconsistent, either retry through a compensation interface or undo the business operation through a rollback interface. A typical scenario is the interaction between a bank and Alipay: Alipay sends a transfer request to the bank and, if no response arrives, checks the transaction's status through the bank's query interface. If the bank never received the transaction, Alipay pushes it again as compensation.

Scheduled task push. In the situation above, a single push may not succeed, so two or three pushes may be needed. Alipay's order loss rate on the first attempt is very high, and the success rate depends on scheduled tasks that keep pushing.

TCC (Try-Confirm-Cancel). This is essentially a two-phase protocol: the second phase either commits the operation or reverses it.
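The query-then-push-again pattern can be sketched roughly as follows. `FlakyBank`, `push_with_compensation`, and the `max_attempts` parameter are hypothetical names invented for this example; a real scheduled task would also wait between attempts, which is left out here.

```python
class FlakyBank:
    """Hypothetical remote service that silently drops the first few requests."""

    def __init__(self, drop_first):
        self.drop_first = drop_first
        self.received = {}  # tx_id -> amount; keyed by id so repeats are harmless

    def transfer(self, tx_id, amount):
        if self.drop_first > 0:
            self.drop_first -= 1      # request lost, the caller gets no response
            return None
        self.received[tx_id] = amount
        return "OK"

    def query(self, tx_id):
        return "DONE" if tx_id in self.received else "UNKNOWN"


def push_with_compensation(bank, tx_id, amount, max_attempts=3):
    """Push, query the state, and push again until the bank reports the transfer."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        bank.transfer(tx_id, amount)
        if bank.query(tx_id) == "DONE":
            return attempts
    raise RuntimeError(f"transfer {tx_id} still unconfirmed after {attempts} attempts")


bank = FlakyBank(drop_first=2)
attempts_used = push_with_compensation(bank, "tx-42", 100)
```

Re-pushing is only safe because the transfer is keyed by `tx_id`, so a repeated request cannot be applied twice.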
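A minimal sketch of the try/confirm/cancel flow, assuming each participant is a distinct account that can hold funds during the first phase. The `Account` class and `run_tcc` function are illustrative names, not part of any particular TCC framework.

```python
class Account:
    """Toy TCC participant: try reserves funds, confirm/cancel settle the reservation."""

    def __init__(self, balance):
        self.balance = balance
        self.reserved = {}  # tx_id -> amount held by a successful try

    def try_debit(self, tx_id, amount):
        if self.balance < amount:
            return False
        self.balance -= amount
        self.reserved[tx_id] = amount
        return True

    def confirm(self, tx_id):
        # Commit: the held amount leaves the account for good.
        self.reserved.pop(tx_id, None)

    def cancel(self, tx_id):
        # Reverse: return the held amount. Safe even if try never succeeded.
        self.balance += self.reserved.pop(tx_id, 0)


def run_tcc(tx_id, steps):
    """steps is a list of (account, amount) pairs with distinct accounts."""
    # Phase 1: try every participant, stopping at the first failure.
    ok = all(account.try_debit(tx_id, amount) for account, amount in steps)
    # Phase 2: confirm everywhere on success, cancel everywhere on failure.
    for account, _ in steps:
        if ok:
            account.confirm(tx_id)
        else:
            account.cancel(tx_id)
    return ok


a, b = Account(100), Account(10)
committed = run_tcc("tx-1", [(a, 50), (b, 5)])      # both tries succeed
rolled_back = run_tcc("tx-2", [(a, 30), (b, 50)])   # b cannot reserve, so a is released
```

After the second transaction fails, account `a` gets its reserved 30 back, so both balances are the same as they were after the first transaction.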

7) So what can ZooKeeper actually do? As mentioned earlier, Hadoop, Kafka, and Dubbo are all built on ZooKeeper. Here I use Dubbo to explain ZooKeeper in more detail.

Dubbo is a well-known distributed SOA framework, and its main service registration and discovery function is provided by ZooKeeper.

For a service framework, the registry is the core of the core. A brief outage will not break the services as a whole, but if the registry goes down for good, the overall risk is very high. With a single-machine registry, the implementation is easy: every machine registers its services with the registry on startup, and every caller keeps a long-lived connection to it, through which the registry notifies callers when a service changes. As the service cluster grows, however, this stops being easy: a single machine can maintain only a limited number of connections and is prone to failure.

As a stable service framework, Dubbo supports and recommends ZooKeeper as its registry. Under the hood, it wraps zkclient and Curator, two commonly used ZooKeeper clients, behind ZookeeperClient.

Service consumers subscribe to changes under the providers' parent node. A provider starting or stopping shows up as a node being created or deleted, and when a provider disconnects abnormally, its ephemeral node is deleted automatically once its session ends.

Service consumers also record the services they subscribe to in ZooKeeper by creating nodes.

This yields a mapping of who provides each service and who subscribes to it. Monitoring built on that mapping makes it easy to see the state of the whole system.
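The registration and subscription flow can be modeled in memory as below. This is a toy stand-in, not Dubbo's or ZooKeeper's actual API: `ToyRegistry`, its methods, the service name, and the addresses are all made up for illustration. It also simplifies watches, since a standard ZooKeeper watch fires once and must be re-registered, whereas the callbacks here stay subscribed.

```python
from collections import defaultdict


class ToyRegistry:
    """In-memory stand-in for a ZooKeeper-backed registry (not the real client API)."""

    def __init__(self):
        self.providers = defaultdict(dict)   # service -> {address: session_id}
        self.watchers = defaultdict(list)    # service -> consumer callbacks

    def register(self, service, address, session_id):
        # Like creating an ephemeral child node under the service's parent node.
        self.providers[service][address] = session_id
        self._notify(service)

    def subscribe(self, service, callback):
        # Like watching the children of the parent node.
        self.watchers[service].append(callback)
        callback(sorted(self.providers[service]))

    def close_session(self, session_id):
        # When a session ends, every node it owned disappears.
        for service, nodes in self.providers.items():
            gone = [addr for addr, sid in nodes.items() if sid == session_id]
            for addr in gone:
                del nodes[addr]
            if gone:
                self._notify(service)

    def _notify(self, service):
        for callback in self.watchers[service]:
            callback(sorted(self.providers[service]))


registry = ToyRegistry()
seen = []   # every provider list the consumer has been told about
registry.subscribe("OrderService", seen.append)
registry.register("OrderService", "10.0.0.1:20880", session_id="s1")
registry.register("OrderService", "10.0.0.2:20880", session_id="s2")
registry.close_session("s1")   # provider 1 crashes; its node vanishes
```

The consumer never polls: it learns about both registrations and about the crash of the first provider purely through notifications.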

Basic data model of ZooKeeper:

In short, it resembles the hierarchical node model of the Linux file system.

Its nodes have the following interesting and important features:

If multiple machines try to create the same node at the same time, only one of them succeeds. This feature can be used to build distributed locks.

An ephemeral node has the same life cycle as the session that created it: when the session closes, the node is deleted. This feature is often used for heartbeats, dynamic monitoring, load reporting, and so on.

Sequential nodes guarantee that node names are globally unique and ordered. This feature can be used to generate globally auto-incrementing IDs in a distributed environment.
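The three features can be demonstrated with a small in-memory model. `ToyZk` and `NodeExists` are hypothetical names, not a real client library. The model keeps one counter per path prefix, a simplification of how ZooKeeper numbers sequential nodes, although the zero-padded ten-digit suffix follows ZooKeeper's naming.

```python
class NodeExists(Exception):
    pass


class ToyZk:
    """Minimal in-memory model of the three node features (not a real client)."""

    def __init__(self):
        self.nodes = {}      # path -> owning session id, or None for persistent nodes
        self.counters = {}   # path prefix -> next sequence number

    def create(self, path, session=None, ephemeral=False, sequential=False):
        if sequential:
            n = self.counters.get(path, 0)
            self.counters[path] = n + 1
            path = f"{path}{n:010d}"          # zero-padded, increasing suffix
        if path in self.nodes:
            raise NodeExists(path)            # only the first creator wins
        self.nodes[path] = session if ephemeral else None
        return path

    def close(self, session):
        # Ephemeral nodes die with the session that created them.
        self.nodes = {p: s for p, s in self.nodes.items() if s != session}


zk = ToyZk()

# 1) Exclusive creation: the second client loses the race for the lock node.
zk.create("/lock", session="A", ephemeral=True)
try:
    zk.create("/lock", session="B", ephemeral=True)
    b_got_lock = True
except NodeExists:
    b_got_lock = False

# 2) Ephemeral lifetime: closing A's session removes /lock, so B can take it.
zk.close("A")
lock_path = zk.create("/lock", session="B", ephemeral=True)

# 3) Sequential names: each create receives the next number.
ids = [zk.create("/id-", sequential=True) for _ in range(3)]
```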

Looking at the primitive services ZooKeeper provides gives an accurate and intuitive picture of what it can do.

Primitive services provided by ZooKeeper:

Create a node

Delete a node

Update a node

Get node information

Access control

Event monitoring

In other words, ZooKeeper offers create, delete, read, and update operations on nodes, plus access control and event monitoring. Combining these primitives in different scenarios enables many uses.
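To make the six primitives concrete, here is a rough in-memory sketch. The class `ToyNodeStore`, its method names, and the per-node set of allowed writers are assumptions for illustration only; ZooKeeper's real API and ACL model are richer than this.

```python
class ToyNodeStore:
    """In-memory sketch of the six primitives; names are illustrative only."""

    def __init__(self):
        self.data = {}       # path -> (value, version)
        self.acl = {}        # path -> set of users allowed to modify the node
        self.watches = {}    # path -> list of one-shot callbacks

    def create(self, path, value, user, writers=None):
        if path in self.data:
            raise KeyError(f"{path} already exists")
        self.data[path] = (value, 0)
        self.acl[path] = set(writers or [user])
        self._fire(path, "created")

    def get(self, path, watch=None):
        # Reading can optionally leave a watch behind for the next change.
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data[path]

    def set_data(self, path, value, user):
        self._check(path, user)
        _, version = self.data[path]
        self.data[path] = (value, version + 1)
        self._fire(path, "changed")

    def delete(self, path, user):
        self._check(path, user)
        del self.data[path]
        del self.acl[path]
        self._fire(path, "deleted")

    def _check(self, path, user):
        if user not in self.acl[path]:
            raise PermissionError(f"{user} may not modify {path}")

    def _fire(self, path, event):
        # Watches are one-shot: each callback is removed before it runs.
        for callback in self.watches.pop(path, []):
            callback(path, event)


store = ToyNodeStore()
events = []
store.create("/config", "v1", user="admin")
store.get("/config", watch=lambda p, e: events.append((p, e)))
store.set_data("/config", "v2", user="admin")   # fires the watch once
store.set_data("/config", "v3", user="admin")   # no watch left, nothing fires
try:
    store.set_data("/config", "hacked", user="guest")
    denied = False
except PermissionError:
    denied = True
```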

Data publication and subscription. This is the registry; see the Dubbo usage above. Publishing is done by managing nodes, and subscribing is done through event monitoring.

Load balancing. Kafka, mentioned above, is one system that uses ZooKeeper this way.

Naming service. ZooKeeper's node structure naturally supports a naming service: information is stored centrally and organized as a tree, which makes it easy to reference in a uniform way.

Distributed coordination and notification. This is similar to publish/subscribe: introducing ZooKeeper as a third party decouples the parties that need to coordinate with or notify each other.

Cluster management and master election. The second feature above (ephemeral nodes) makes it easy to know which machines in the cluster are alive, and therefore to manage the cluster. The first feature (only one creator succeeds) can be used for master contention.

Distributed locks. This is an application of the first feature.

Distributed queues. This is an application of the third feature.

Distributed concurrent waiting. This is like join in multithreading: the main task can run only after its subtasks have completed. On a single machine, threads can use join, but how can this be done in a distributed environment? With ZooKeeper, create a master task node; each subtask, when it finishes, creates a child node under the master task node. Once enough child nodes exist, the main task can run.
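The distributed concurrent waiting idea can be sketched in memory as follows. `ToyBarrier`, the path `/jobs/report`, and the worker names are hypothetical; in a real deployment the child nodes would live in ZooKeeper and the callback would be driven by a watch on the master task node's children.

```python
class ToyBarrier:
    """In-memory model of a master task node that collects child nodes from subtasks."""

    def __init__(self, master_path, expected):
        self.master_path = master_path
        self.expected = expected       # how many subtasks must report in
        self.children = set()          # child node paths under the master task node
        self.on_ready = []             # callbacks standing in for a children watch

    def subtask_done(self, name):
        # A finished subtask "creates a child node" under the master task node.
        self.children.add(f"{self.master_path}/{name}")
        if self.is_ready():
            callbacks, self.on_ready = self.on_ready, []
            for callback in callbacks:
                callback()

    def is_ready(self):
        return len(self.children) >= self.expected

    def wait(self, callback):
        # Run immediately if every subtask is already done, otherwise defer.
        if self.is_ready():
            callback()
        else:
            self.on_ready.append(callback)


log = []
barrier = ToyBarrier("/jobs/report", expected=3)
barrier.wait(lambda: log.append("main task runs"))
for worker in ("extract", "transform", "load"):
    barrier.subtask_done(worker)
    log.append(f"{worker} done")
```

The main task runs exactly once, at the moment the third child node appears.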

As you can see, the primitives are the foundation of ZooKeeper, and the other usages are simply the primitives applied to different scenarios.
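As one example of combining primitives, master election can be modeled with nodes that are both ephemeral and sequential: each candidate creates one, and the candidate holding the lowest number leads. This is one common recipe and may differ from the contention approach the article has in mind. `ToyElection` and the node names are invented for this sketch.

```python
class ToyElection:
    """In-memory model: each candidate owns one ephemeral sequential node."""

    def __init__(self):
        self.next_seq = 0
        self.candidates = {}   # sequence number -> candidate name

    def join(self, name):
        # Stands in for creating an ephemeral sequential node.
        seq = self.next_seq
        self.next_seq += 1
        self.candidates[seq] = name
        return seq

    def leave(self, seq):
        # Stands in for the session ending and the ephemeral node being deleted.
        self.candidates.pop(seq, None)

    def leader(self):
        # The candidate with the lowest sequence number is the master.
        if not self.candidates:
            return None
        return self.candidates[min(self.candidates)]


election = ToyElection()
a = election.join("node-a")
b = election.join("node-b")
c = election.join("node-c")
first_leader = election.leader()      # node-a holds the lowest sequence number
election.leave(a)                     # node-a's session ends
second_leader = election.leader()     # leadership passes to node-b
```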
