An Analysis of ZooKeeper

2025-07-19 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31 report

This article gives a concise overview of ZooKeeper (zk): what it is, its design goals, its data model, the guarantees it makes, its API, how it is implemented, and how it performs.

Zk, a coordination service for distributed applications

Zk is an open-source distributed coordination service. It provides a small set of simple primitives on which distributed applications can build higher-level services such as synchronization, group management, configuration management, and naming. It is designed to be easy to program against and uses a data model similar to the directory tree of a file system. It is implemented in Java and can serve both C and Java applications.

Coordination is notoriously hard to get right. It easily leads to race conditions and deadlocks. The motivation behind zk is to relieve distributed applications of having to solve their coordination problems from scratch.

Design goals of zk

Zk is simple. It lets distributed processes coordinate with each other through a shared hierarchical namespace that resembles a file system. The namespace is made up of data registers, called znodes in zk terminology, which are similar to files and directories. Unlike a traditional file system, which is designed for storage, zk keeps its data in memory, which gives it high throughput and low latency. The zk implementation emphasizes high performance, high availability, and strictly ordered access. Its performance means it can be used in large distributed systems. Its availability means it is not a single point of failure. Its strict ordering means that quite complex synchronization operations can be implemented on the client side.

Zk is replicated. Like the applications it coordinates, zk itself is replicated over a set of hosts that together act as a single service. The figure shows the zk service.

All the servers that make up the zk service must know about each other. They maintain an in-memory image of the state, and keep transaction logs and snapshots on disk. As long as a majority of the servers are available, the zk service is available.
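The majority rule above is simple arithmetic, and a short sketch makes the consequences concrete. The function names below are illustrative only and are not part of any zk API.

```python
def has_quorum(available: int, ensemble_size: int) -> bool:
    """Return True when strictly more than half of the servers are available."""
    if not 0 <= available <= ensemble_size:
        raise ValueError("available must be between 0 and ensemble_size")
    return available > ensemble_size // 2


def tolerated_failures(ensemble_size: int) -> int:
    """Largest number of failed servers that still leaves a majority."""
    return (ensemble_size - 1) // 2


if __name__ == "__main__":
    for size in (3, 4, 5, 7):
        print(size, "servers tolerate", tolerated_failures(size), "failure(s)")
```

Under this rule a service of four hosts tolerates no more failures than a service of three, which is one reason odd sizes are a common choice.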

A client connects to a single server. Over a TCP connection it sends requests, receives responses, receives watch events, and sends heartbeats. If the server it is connected to dies, the client connects to a different server.

Zk is ordered. Each update is stamped with a number that reflects the order of all zk transactions. Subsequent operations can use this order to build higher-level abstractions, such as synchronization primitives.

Zk is fast, especially for read-dominated workloads. Zk applications run on thousands of machines, and it performs best when reads are more common than writes, at a read-write ratio of around 10:1.

Data model and naming hierarchy

The namespace provided by zk is much like that of a standard file system. A name is a sequence of path elements separated by a slash (/), and each node in zk is uniquely identified by its path. The figure shows the hierarchical namespace of zk.

Nodes and ephemeral nodes

Unlike a traditional file system, every node in zk can have data associated with it as well as children. It is like a file system in which every file can also be a directory. (Zk nodes store coordination data such as status information, configuration, and location information, so the amount of data is usually small, from a few bytes to a few kilobytes.) To make it clear that we mean zk data nodes, we call them znodes. A znode maintains a stat structure that includes version numbers for data changes and ACL changes, along with timestamps, which allow cache validation and coordinated updates. Whenever the data of a znode changes, its version number increases. For example, when a client reads a znode's data, it also receives the version of that data. The data stored in a znode is read and written atomically: a read returns all the data associated with the znode, and a write replaces all of it, never stopping halfway. Each znode has an access control list (ACL) that restricts who can do what.
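To illustrate how the version number supports coordinated updates, here is a minimal in-memory sketch. It is a toy model, not zk's implementation: `ToyZnode` and `BadVersionError` are hypothetical names, and the check shown (reject a write whose expected version is stale) is an assumption about how a client would typically use the version it read.

```python
from dataclasses import dataclass
from typing import Tuple


class BadVersionError(Exception):
    """Raised when the caller's expected version is stale (hypothetical name)."""


@dataclass
class ToyZnode:
    data: bytes = b""
    version: int = 0  # bumped on every successful data change

    def get_data(self) -> Tuple[bytes, int]:
        # A read returns all of the data together with its version.
        return self.data, self.version

    def set_data(self, new_data: bytes, expected_version: int) -> int:
        # A write replaces the data wholesale, but only if nobody else
        # changed the node since the caller read it.
        if expected_version != self.version:
            raise BadVersionError(
                f"expected version {expected_version}, node is at {self.version}"
            )
        self.data = new_data
        self.version += 1
        return self.version
```

Two clients that both read version 0 and then both try to write will see exactly one write succeed; the loser gets `BadVersionError`, re-reads, and decides whether to retry.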

Zk also has the concept of ephemeral nodes. These znodes live only as long as the session that created them. When the session ends, the znode is deleted automatically. Ephemeral nodes are very useful when you implement [tbd].
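The lifetime rule for ephemeral nodes can be sketched with a small toy model. Everything here is hypothetical (`ToySessionStore`, integer session ids, no parent checks); it only shows the bookkeeping idea of tying a node to the session that created it.

```python
from typing import Dict, List, Optional


class ToySessionStore:
    """Toy model: maps each path to the session that owns it (None = persistent)."""

    def __init__(self) -> None:
        self._owner: Dict[str, Optional[int]] = {}

    def create(self, path: str, session_id: Optional[int] = None) -> None:
        if path in self._owner:
            raise KeyError(f"{path} already exists")
        self._owner[path] = session_id

    def exists(self, path: str) -> bool:
        return path in self._owner

    def close_session(self, session_id: int) -> List[str]:
        # Deleting every node owned by the session models what happens
        # to ephemeral nodes when their session ends.
        if session_id is None:
            raise ValueError("a real session id is required")
        removed = sorted(p for p, s in self._owner.items() if s == session_id)
        for path in removed:
            del self._owner[path]
        return removed
```

For example, a worker that creates `/workers/w1` under its own session disappears from `/workers` as soon as `close_session` runs for that session, while the persistent `/workers` node stays.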

Conditional updates and watches

Zk supports the concept of watches. A client can set a watch on a znode. When the znode changes, the watch is triggered and removed. When a watch is triggered, the client receives a packet saying that the znode has changed. If the connection between the client and the server is broken, the client also receives a notification. This is very useful in [tbd].
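The "triggered and removed" behaviour means a watch fires at most once. The sketch below models that with plain callbacks in a single process; `ToyWatchedStore` and its method signatures are assumptions made for illustration and do not mirror a real client library.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


class ToyWatchedStore:
    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._watches: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def get_data(self, path: str,
                 watch: Optional[Callable[[str], None]] = None) -> bytes:
        data = self._data[path]  # raises KeyError if the node is missing
        if watch is not None:
            self._watches[path].append(watch)
        return data

    def set_data(self, path: str, data: bytes) -> None:
        self._data[path] = data
        # pop() removes the watches before firing, so each one fires once.
        for callback in self._watches.pop(path, []):
            callback(path)


if __name__ == "__main__":
    events: List[str] = []
    store = ToyWatchedStore()
    store.set_data("/config", b"v1")
    store.get_data("/config", watch=events.append)
    store.set_data("/config", b"v2")  # fires the watch
    store.set_data("/config", b"v3")  # no watch left to fire
    print(events)
```

Note that the callback only learns which node changed. A client that wants to keep following the node reads it again and sets a new watch in the same call.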

Guarantees

Zk is simple and fast. Because its goal is to be a foundation for building more complex services, such as synchronization, it provides the following guarantees:

Sequential consistency: updates from a client are applied in the order in which the client sent them.

Atomicity: an update either succeeds or fails. There are no partial results.

Single system image: a client sees the same view of the service no matter which server it connects to.

Reliability: once an update has been applied, it stays in effect until another update overwrites it.

Timeliness: the view of the system seen by a client is guaranteed to be up to date within a certain time bound.

For more information on these guarantees and how they can be used, see [tbd].

Simple API

Zk was designed to provide a simple programming interface, so it supports only the following operations:

Create: create a node at a location in the tree

Delete: delete a node

Exists: check whether a node exists

Get data: read the data from a node

Set data: write data to a node

Get children: get the list of children of a node

Sync: wait for data to be propagated
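To give a feel for how small this interface is, the operations listed above can be modelled over a dictionary keyed by path. This is a single-process stand-in with hypothetical names, not a zk client; sync is left out because a single in-process dictionary has nothing to propagate.

```python
from typing import Dict, List


class ToyTree:
    """In-memory stand-in that borrows the operation names from the list above."""

    def __init__(self) -> None:
        self._nodes: Dict[str, bytes] = {"/": b""}  # the root always exists

    @staticmethod
    def _parent(path: str) -> str:
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path: str, data: bytes = b"") -> None:
        if path in self._nodes:
            raise KeyError(f"{path} already exists")
        if self._parent(path) not in self._nodes:
            raise KeyError(f"parent of {path} does not exist")
        self._nodes[path] = data

    def delete(self, path: str) -> None:
        if path == "/":
            raise ValueError("the root cannot be deleted")
        if self.get_children(path):
            raise ValueError(f"{path} has children")
        del self._nodes[path]

    def exists(self, path: str) -> bool:
        return path in self._nodes

    def get_data(self, path: str) -> bytes:
        return self._nodes[path]

    def set_data(self, path: str, data: bytes) -> None:
        if path not in self._nodes:
            raise KeyError(path)
        self._nodes[path] = data

    def get_children(self, path: str) -> List[str]:
        if path not in self._nodes:
            raise KeyError(path)
        return sorted(
            p.rsplit("/", 1)[1]
            for p in self._nodes
            if p != "/" and self._parent(p) == path
        )
```

A typical sequence would be `create("/app")`, `create("/app/config", b"x")`, then `get_children("/app")`, which returns the child names rather than full paths.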

For a more in-depth technical discussion, and for how these operations are used to build higher-level applications, see [tbd].

Implementation

The following figure shows the components of the zk service. With the exception of the request processor, every server that makes up the zk service keeps its own copy of each component.

The in-memory database on each server contains the entire data tree. Updates are logged to disk so that they can be recovered, and a write is serialized to disk before it is applied to the in-memory database.
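The "log first, then apply" order is what makes recovery possible: after a crash, the in-memory copy can be rebuilt by replaying the log. Below is a minimal sketch of that idea using a JSON-lines file. The class name, the file format, and the `demo_recovery` helper are all assumptions for illustration; zk's actual log and snapshot formats are different.

```python
import json
import os
import tempfile
from typing import Dict


class ToyLoggedStore:
    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.data: Dict[str, str] = {}
        # Recovery: replay whatever the log already contains, in order.
        if os.path.exists(log_path):
            with open(log_path, "r", encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    self.data[record["path"]] = record["value"]

    def set(self, path: str, value: str) -> None:
        # 1. Append the update to the log and force it to disk ...
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"path": path, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. ... and only then apply it to the in-memory copy.
        self.data[path] = value


def demo_recovery() -> Dict[str, str]:
    """Write two updates, 'crash', and rebuild the state from the log alone."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "txn.log")
        store = ToyLoggedStore(log_path)
        store.set("/config", "v1")
        store.set("/config", "v2")
        del store  # simulate losing the in-memory state
        return ToyLoggedStore(log_path).data
```

A real system would also take periodic snapshots so that recovery does not have to replay the whole log; the sketch leaves that out.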

Every zk server serves clients, and a client connects to exactly one server to submit requests. Read requests are served from the in-memory database of the server the client is connected to. Write requests, which change the state of the service, are processed through an agreement protocol.

As part of the agreement protocol, all write requests are forwarded to a single server called the leader. The rest of the zk servers, called followers, receive message proposals from the leader. The messaging layer takes care of replacing the leader when it fails and of synchronizing followers with the leader.

Zk uses an atomic messaging protocol. Because the messaging layer is atomic, zk can guarantee that the local replicas never diverge. When the leader receives a write request, it calculates what the state of the system will be when the write is applied, and then turns the request into a transaction that captures this new state.
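The flow described above (one leader orders the writes, turns each into a transaction that carries the resulting state, and every replica applies the same transactions in the same order) can be sketched as follows. This is a heavily simplified toy under stated assumptions: all class names are hypothetical, delivery is a direct method call, and the agreement step, failure handling, and leader election are all omitted. The field name `zxid` is used here simply as a label for the transaction's position in the global order.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Txn:
    zxid: int         # position in the global order of updates
    path: str
    data: bytes
    new_version: int  # resulting state, computed once by the leader


@dataclass
class ToyReplica:
    state: Dict[str, Tuple[bytes, int]] = field(default_factory=dict)
    last_zxid: int = 0

    def apply(self, txn: Txn) -> None:
        # Transactions must arrive in order with no gaps.
        if txn.zxid != self.last_zxid + 1:
            raise ValueError(f"out of order: got {txn.zxid} after {self.last_zxid}")
        self.state[txn.path] = (txn.data, txn.new_version)
        self.last_zxid = txn.zxid

    def read(self, path: str) -> Tuple[bytes, int]:
        return self.state[path]  # served from the local copy


class ToyLeader(ToyReplica):
    def __init__(self, followers: List[ToyReplica]) -> None:
        super().__init__()
        self.followers = followers

    def write(self, path: str, data: bytes) -> Txn:
        # Work out the resulting state first, then ship that result.
        _, current_version = self.state.get(path, (b"", -1))
        txn = Txn(self.last_zxid + 1, path, data, current_version + 1)
        self.apply(txn)
        for follower in self.followers:
            follower.apply(txn)
        return txn
```

Because a transaction states the outcome ("this path now holds this data at this version") rather than the request, applying it gives the same result on every replica, which is why the copies stay identical.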

Applications

The zk programming interface is deliberately simple. With it you can implement higher-level operations such as synchronization and group management. Many distributed applications use it. To learn more, see [tbd].

Performance

Zk is designed for high performance, but does it deliver? Results from the development team at Yahoo show that it does. Performance is especially high when reads far outnumber writes, because writes involve synchronizing the state of all servers. (Reads far outnumbering writes is the typical case for a coordination service.) The following chart shows zk throughput as the read-write ratio varies.

The test data in this chart comes from zk release 3.2 running on servers with a dual-core 2 GHz Xeon processor and two 15K RPM SATA drives, one dedicated to the zk log and one used for data snapshots. Read and write requests were 1 KB in size. "Servers" is the number of hosts that make up the zk service. Nearly 30 other hosts were used to simulate clients. The zk service was configured so that the leader does not accept client connections. In addition, read and write throughput in release 3.2 is about twice that of release 3.1.

The test results also show the reliability of zk. The following figure shows how the service behaves under various failures, namely:

1. Failure and recovery of a follower.

2. Failure and recovery of a different follower.

3. Failure of the leader.

4. Failure of two followers at the same time.

5. Failure of another leader.

Availability

For this figure we ran a zk service made up of seven hosts and measured system performance over time while injecting failures. We used the same saturation benchmark as before, but this time we kept the write ratio constant at 30%, which is a conservative estimate of our expected load.

The chart shows a few important points. First, when followers fail and recover quickly, zk maintains high throughput. Second, and more importantly, the leader election algorithm lets the system recover quickly enough to keep throughput high: zk took less than 200 ms to elect a new leader. Third, once followers recover and start processing requests again, zk throughput returns to a high level.

The ZooKeeper project

Zk has been used successfully in many industrial systems. At Yahoo, for example, it provides coordination and failure recovery for a highly scalable publish-subscribe messaging service that manages thousands of topics and handles data delivery. It also coordinates failure recovery for the fetching service of Yahoo's crawler. Several of Yahoo's advertising systems use zk to achieve high availability as well.

