Shulou (Shulou.com), updated 2025-04-05
This article explores how ZooKeeper is used and how it works. Many newcomers are unclear on both, so the walkthrough below covers installation, the data model, the Java API, and common use cases in detail.
Introduction
ZooKeeper is software that provides a consistency service for distributed applications. It originated as a sub-project of the open-source Hadoop project, and its design follows papers published by Google. We will first install and use it, and then look at the more important topic of its consistency algorithm.
Installation and use
Installation can largely be completed by following the steps on the http://hadoop.apache.org/zookeeper/docs/current/zookeeperStarted.html page. Here we focus on deploying a cluster, because the official page is not very detailed on that point (see its Running Replicated ZooKeeper section).
For lack of machines, I deployed three servers on a single machine; you can do the same if you are short on hardware. I created three folders, as follows:
server1 server2 server3
Then extract a ZooKeeper download package into each folder and create a few more folders alongside it. The overall structure of each server folder is as follows; the last entry is the unpacked download.
data dataLog logs zookeeper-3.3.2
Next, go into the data directory and create a file named myid containing a single number. For server1 write 1, for server2 write 2, and for server3 write 3.
Then go to the zookeeper-3.3.2/conf directory. A fresh download contains three files: configuration.xsl, log4j.properties, and zoo_sample.cfg. Create a zoo.cfg configuration file in this directory (you can simply rename or copy zoo_sample.cfg to zoo.cfg). The configuration is as follows:
tickTime=2000
initLimit=5
syncLimit=2
dataDir=xxxx/zookeeper/server1/data
dataLogDir=xxx/zookeeper/server1/dataLog
clientPort=2181
server.1=127.0.0.1:2888:3888
server.2=127.0.0.1:2889:3889
server.3=127.0.0.1:2890:3890
Most of these settings are explained clearly on the official website, but a few need attention. If you deploy several servers on one machine, each server needs its own clientPort: here server1 uses 2181, server2 uses 2182, and server3 uses 2183. dataDir and dataLogDir must also be different for each server.
In the last few lines, the only thing to note is that the X in server.X must match the number in that server's data/myid file. If the three myid files contain 1, 2, and 3, then every zoo.cfg lists server.1, server.2, and server.3. Because all three servers run on the same machine, the two ports after each address must differ between servers, otherwise they will conflict. The first port is used for exchanging information among cluster members, and the second is used only for electing a new leader when the current leader fails.
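The per-server differences described above can be captured in a few lines of Java. This is only a sketch: the class name ZooCfgSketch, the method zooCfg, and the baseDir argument (here /tmp/zookeeper) are made up for illustration, and the port arithmetic simply reproduces the values used in this article (client ports 2181 to 2183, first ports 2888 to 2890, election ports 3888 to 3890).

```java
/** Illustrative helper (not part of ZooKeeper) that renders zoo.cfg for local servers. */
final class ZooCfgSketch {

    /**
     * Builds the zoo.cfg text for server number id (1-based).
     * baseDir is a hypothetical parent folder that holds server1, server2, ...
     */
    static String zooCfg(String baseDir, int id, int serverCount) {
        String home = baseDir + "/server" + id;
        StringBuilder cfg = new StringBuilder();
        cfg.append("tickTime=2000\n");
        cfg.append("initLimit=5\n");
        cfg.append("syncLimit=2\n");
        cfg.append("dataDir=").append(home).append("/data\n");
        cfg.append("dataLogDir=").append(home).append("/dataLog\n");
        // Each server on the same machine needs its own client port: 2181, 2182, ...
        cfg.append("clientPort=").append(2180 + id).append('\n');
        // The server list is the same in every file; the ports differ per entry.
        for (int i = 1; i <= serverCount; i++) {
            cfg.append("server.").append(i).append("=127.0.0.1:")
               .append(2887 + i).append(':').append(3887 + i).append('\n');
        }
        return cfg.toString();
    }

    public static void main(String[] args) {
        for (int id = 1; id <= 3; id++) {
            System.out.println("# zoo.cfg for server" + id);
            System.out.println(zooCfg("/tmp/zookeeper", id, 3));
        }
    }
}
```

Each generated file would go into that server's zookeeper-3.3.2/conf directory, next to a data/myid file containing the same id.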
Enter the zookeeper-3.3.2/bin directory and run ./zkServer.sh start to start one server. Does it report a large number of errors? That is harmless. With only one server running, ZooKeeper tries to start a leader election with the servers listed in zoo.cfg and logs errors because it cannot connect to the others. Once the second instance is started, a leader is elected and the consistency service becomes usable. This works because a three-server cluster only needs two servers (a majority) to elect a leader and serve requests: a cluster of 2n+1 machines can tolerate n failures.
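The 2n+1 rule is plain majority arithmetic, sketched below. The class and method names are invented for this illustration and are not part of the ZooKeeper API.

```java
/** Majority arithmetic behind the "2n+1 machines tolerate n failures" rule. */
final class QuorumMath {

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that may fail while a majority still remains. */
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    /** True if the live servers can still elect a leader and serve requests. */
    static boolean canServe(int ensembleSize, int alive) {
        return alive >= quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int size = 1; size <= 7; size++) {
            System.out.println(size + " servers: quorum " + quorumSize(size)
                    + ", tolerates " + tolerableFailures(size) + " failure(s)");
        }
    }
}
```

With three servers the quorum is two, which is why the second instance is enough to get a leader. An even-sized ensemble tolerates no more failures than the next smaller odd size (four servers still tolerate only one), which is one reason odd sizes are the usual choice.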
Now we can use it. The interactive client that ships with ZooKeeper gives a quick feel for what ZooKeeper does. From zookeeper-3.3.2/bin (in any of the three servers), run ./zkCli.sh -server 127.0.0.1:2182; here I connect to the server listening on port 2182.
First, type an arbitrary command. Because ZooKeeper does not recognize it, it prints the command help (shown as a screenshot in the original article).
ls (list the children of a node)
ls2 (list the children of a node together with stat data such as the number of updates)
create (create a node)
get (get a node's data along with stat data such as the number of updates)
set (modify a node's data)
delete (delete a node)
Trying these commands shows that ZooKeeper uses a tree structure similar to a file system: data can be attached to a node, and nodes can be modified and deleted. We also find that when a node changes, all live servers in the cluster are updated to the same data.
Data model
After this brief use of ZooKeeper, we can see that its data model resembles an operating system's file hierarchy, as the figure in the original article shows.
(1) Each node is called a znode and is uniquely identified by its path. For example, the SERVER2 node under APP3 is identified as /APP3/SERVER2.
(2) A znode can have child znodes and can store data, but znodes of the EPHEMERAL type cannot have children.
(3) The data in a znode is versioned. Each update increments the znode's version number, and an update or delete can carry the version it expects, so that it only succeeds if the data has not changed in the meantime.
(4) A znode can be ephemeral (temporary): once the client that created it loses contact with the server, the znode is deleted automatically. Clients and servers communicate over a persistent connection kept alive by heartbeats; this connection state is called a session. When the session of an ephemeral znode's creator expires, the znode is deleted.
(5) Znode names can be numbered automatically. Conceptually, if App1 already exists, the next node created becomes App2; in practice the server appends a zero-padded, increasing ten-digit counter to the requested name.
(6) A znode can be watched, including changes to the data stored in it and changes to its children. When a change occurs, the watching clients are notified. This is the most important ZooKeeper feature for applications; it makes it possible to implement centralized configuration management, cluster management, distributed locks, and more.
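Rules (2), (4), and (5) can be made concrete with a toy in-memory model. This is a deliberately simplified sketch and not how the ZooKeeper server is implemented: the class ToyZnodeTree and its methods are hypothetical, a session is represented by a plain long (0 meaning "persistent"), and the sequence counter here only counts sequential creates under each parent.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a few znode rules; an illustration only, not ZooKeeper code. */
final class ToyZnodeTree {

    private static final class Node {
        final long ownerSession; // 0 = persistent, otherwise the creating session
        int nextSequence = 0;    // counter for sequential children of this node
        Node(long ownerSession) { this.ownerSession = ownerSession; }
    }

    private final Map<String, Node> nodes = new TreeMap<>();

    ToyZnodeTree() {
        nodes.put("/", new Node(0L)); // the root always exists
    }

    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    /** Creates a node and returns its actual path (with a suffix if sequential). */
    String create(String path, long ephemeralOwner, boolean sequential) {
        Node parent = nodes.get(parentOf(path));
        if (parent == null) {
            throw new IllegalStateException("parent does not exist: " + path);
        }
        if (parent.ownerSession != 0L) {
            throw new IllegalStateException("ephemeral nodes cannot have children");
        }
        String actual = sequential
                ? path + String.format("%010d", parent.nextSequence++)
                : path;
        if (nodes.containsKey(actual)) {
            throw new IllegalStateException("node already exists: " + actual);
        }
        nodes.put(actual, new Node(ephemeralOwner));
        return actual;
    }

    /** Simulates session expiry: every ephemeral node of that session vanishes. */
    void closeSession(long session) {
        nodes.values().removeIf(n -> n.ownerSession == session);
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    public static void main(String[] args) {
        ToyZnodeTree tree = new ToyZnodeTree();
        tree.create("/APP3", 0L, false);
        String mine = tree.create("/APP3/SERVER", 42L, true);
        System.out.println("created " + mine);
        tree.closeSession(42L);
        System.out.println("still exists after session close? " + tree.exists(mine));
    }
}
```

Because ephemeral nodes never have children in this model, removing them on session close cannot orphan anything, which mirrors why rule (2) and rule (4) fit together.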
Using ZooKeeper from Java code
ZooKeeper is mainly used by creating a ZooKeeper instance from its jar and calling its API methods. The main operations are creating, deleting, and modifying znodes, and watching znodes for changes and reacting to them.
The main API calls, with explanations:
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

// Create a ZooKeeper instance. The first parameter is the target server address and port,
// the second is the session timeout, and the third is the callback invoked when a node changes.
ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 500000, new Watcher() {
    // Receives all triggered events
    public void process(WatchedEvent event) {
        // do something
    }
});
// Create the node /root with data "mydata", no ACL permission control, persistent
// (it does not disappear when the client shuts down)
zk.create("/root", "mydata".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
// Create the child znode childone under /root with data "childone", no ACL permission control, persistent
zk.create("/root/childone", "childone".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
// Get the names of the children of /root; returns List<String>
zk.getChildren("/root", true);
// Get the data stored at /root/childone; returns byte[]
zk.getData("/root/childone", true, null);
// Modify the data at /root/childone. The third parameter is the version; -1 ignores the version and updates directly
zk.setData("/root/childone", "childonemodify".getBytes(), -1);
// Delete /root/childone. The second parameter is the version; -1 ignores the version and deletes directly
zk.delete("/root/childone", -1);
// Close the session
zk.close();
Implementation ideas for common ZooKeeper use cases (excluding the official examples)
(1) Configuration management
Centralized configuration management is very common in application clusters. Companies typically build a configuration management center in-house so that different application clusters can share their own configuration, and so that every machine in a cluster is notified when the configuration changes.
ZooKeeper makes this easy. For example, store all of APP1's configuration under the /APP1 znode. When each APP1 machine starts, it watches this node (zk.exists("/APP1", true)) and implements the Watcher callback. Whenever the data on /APP1 changes, every machine is notified and its Watcher method runs, and the application can then fetch the data again (zk.getData("/APP1", false, null)). Note that a ZooKeeper watch fires only once, so the application must set the watch again (for example by passing true instead of false) to receive the next change.
This example is only simple, coarse-grained configuration monitoring. Finer-grained data can be watched level by level; all of this can be designed and controlled as needed.
(2) Cluster management
In an application cluster, each machine often needs to know which machines in the cluster (or in another cluster it depends on) are alive, and to be notified quickly, without human intervention, when a machine becomes unavailable because of a crash, a network disconnection, or similar.
ZooKeeper makes this easy as well. Suppose there is a znode called /APP1SERVERS on the ZooKeeper servers. When each machine in the cluster starts, it creates an EPHEMERAL node under it: server1 creates /APP1SERVERS/SERVER1 (you can use the IP address to guarantee uniqueness), server2 creates /APP1SERVERS/SERVER2, and both then watch the parent node /APP1SERVERS. Clients watching the parent are notified whenever its data or its children change. Because an EPHEMERAL node disappears when the connection between its client and the server is broken or the session expires, the node of a machine that dies or is disconnected vanishes, and every client watching /APP1SERVERS receives a notification and fetches the latest list.
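Once a client has fetched the latest child list, it usually wants to know what changed. A minimal sketch of that comparison follows, using only the Java standard library; the class MembershipDiff and its methods are hypothetical helpers, and the two lists stand in for the results of two successive getChildren calls on /APP1SERVERS.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Compares two snapshots of a parent's child list (illustrative helper). */
final class MembershipDiff {

    /** Servers present in the new snapshot but not in the old one. */
    static Set<String> joined(Collection<String> before, Collection<String> after) {
        Set<String> result = new TreeSet<>(after);
        result.removeAll(before);
        return result;
    }

    /** Servers present in the old snapshot but missing from the new one. */
    static Set<String> left(Collection<String> before, Collection<String> after) {
        return joined(after, before);
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("SERVER1", "SERVER2");
        List<String> after = Arrays.asList("SERVER2", "SERVER3");
        System.out.println("joined: " + joined(before, after));
        System.out.println("left:   " + left(before, after));
    }
}
```

In a real client this comparison would run inside the Watcher callback, after re-reading the children of the parent node.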
Another use case is master election in a cluster: as soon as the master goes down, a new master is chosen from the slaves. The steps are the same as above, except that each machine creates its node under /APP1SERVERS with the EPHEMERAL_SEQUENTIAL type at startup, so that every node is numbered automatically. For example:
zk.create("/testRootPath/testChildPath2", "1".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
zk.create("/testRootPath/testChildPath3", "2".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
zk.create("/testRootPath/testChildPath4", "3".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
// create a subdirectory node
zk.create("/testRootPath/testChildPath5", "4".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
System.out.println(zk.getChildren("/testRootPath", false));
Print result: [testChildPath20000000000, testChildPath30000000001, testChildPath50000000003, testChildPath40000000002]
For comparison, creating the same children with the plain EPHEMERAL type leaves the names unnumbered:
zk.create("/testRootPath", "testRootData".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
// create a subdirectory node
zk.create("/testRootPath/testChildPath2", "1".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
zk.create("/testRootPath/testChildPath3", "2".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
zk.create("/testRootPath/testChildPath4", "3".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
// create a subdirectory node
zk.create("/testRootPath/testChildPath5", "4".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
System.out.println(zk.getChildren("/testRootPath", false));
Print result: [testChildPath3, testChildPath2, testChildPath5, testChildPath4]
By convention, we designate the lowest-numbered node as the master. When we watch the /APP1SERVERS node we obtain the server list, and as long as every machine in the cluster applies the same rule, the master is determined. When the master goes down, its znode disappears, the clients are notified and fetch the new server list, and each machine again treats the lowest-numbered node as the master. This yields dynamic master election.
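The "lowest number wins" rule can be sketched in plain Java. The helper below is hypothetical (it is not a ZooKeeper API) and assumes every child name ends with the ten-digit suffix seen in the printed result above. It compares the numeric suffix rather than the whole name, because children with different prefixes, as in the example, would not necessarily sort correctly as plain strings.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Picks the master as the child with the lowest sequence suffix (illustrative). */
final class LowestSequenceLeader {

    // Assumption: sequential znode names end with a ten-digit, zero-padded counter.
    private static final int SEQUENCE_DIGITS = 10;

    /** Extracts the numeric suffix of a sequential child name. */
    static long sequenceOf(String childName) {
        return Long.parseLong(childName.substring(childName.length() - SEQUENCE_DIGITS));
    }

    /** Returns the child with the lowest sequence number, or empty if there is none. */
    static Optional<String> leader(List<String> children) {
        return children.stream()
                .min(Comparator.comparingLong(LowestSequenceLeader::sequenceOf));
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "testChildPath20000000000", "testChildPath30000000001",
                "testChildPath50000000003", "testChildPath40000000002");
        System.out.println("master: " + leader(children).orElse("(none)"));
    }
}
```

Every machine would run the same selection on the list it receives, so all of them reach the same answer without talking to each other.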