Installation and deployment of ZooKeeper

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

1. Install JDK

ZooKeeper runs on Java, so install a JDK first. Refer to the JDK installation guide.

2. ZooKeeper installation

2.1. Stand-alone mode

2.1.1. Download and install ZooKeeper

Steps:

1) Download the stable release of ZooKeeper from http://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.9/.

2) Extract the ZooKeeper installation package:

$ tar -zxvf zookeeper-3.4.9.tar.gz

3) Copy the extracted directory to the installation directory:

$ cp -R zookeeper-3.4.9 /software/

4) Change to the installation directory and create a symbolic link for ZooKeeper:

$ cd /software

$ ln -s zookeeper-3.4.9/ zk

5) Add the ZooKeeper installation path to the operating system environment variables.

Edit the environment variable file:

$ sudo vim /etc/profile

Add the following to the file:

export ZOOKEEPER_HOME=/software/zk

export PATH=$ZOOKEEPER_HOME/bin:$PATH

$ source /etc/profile    # the environment variables take effect immediately

2.1.2. Create the zoo.cfg configuration file

1) Change to the conf directory under the ZooKeeper installation directory and copy the sample configuration:

$ cd /software/zk/conf

$ cp zoo_sample.cfg zoo.cfg

2) Modify the content of zoo.cfg (parameter names are case-sensitive):

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/home/hadoop/zookeeper/data

clientPort=2181

Note: the ZooKeeper configuration parameters are described in detail below (most of them are set in the $ZOOKEEPER_HOME/conf/zoo.cfg file).

Parameter name

Description

clientPort

The port on which the server listens for client connections, that is, the external service port. It is generally set to 2181.

dataDir

The directory where snapshot files are stored. By default, transaction logs are also stored here. It is recommended to configure dataLogDir as well, because the write performance of the transaction log directly affects ZK performance.

tickTime

The basic time unit, in milliseconds. It is the heartbeat interval between ZooKeeper servers and between clients and servers: one heartbeat is sent every tickTime.

dataLogDir

Transaction log output directory. Where possible, put the transaction log on a dedicated disk or mount point, which can greatly improve ZK performance. (No Java system property)

globalOutstandingLimit

Maximum number of queued requests. The default is 1000. While ZK is running, even when the server has no spare capacity to handle more client requests, it still allows clients to submit requests, which are queued to improve throughput. The queue has to be bounded to keep the server from running out of memory. (Java system property: zookeeper.globalOutstandingLimit)

preAllocSize

Preallocates disk space for subsequent writes to the transaction log. The value is given in kilobytes. The default is 64 MB, so each transaction log file is 64 MB in size. If ZK takes snapshots frequently, it is recommended to reduce this parameter appropriately. (Java system property: zookeeper.preAllocSize)

snapCount

After every snapCount transactions written to the transaction log, a snapshot is triggered: ZK generates a snapshot.* file and creates a new transaction log file, log.*. The default is 100000. (The actual implementation adds some randomization so that the servers do not all take snapshots at the same time and hurt performance.) (Java system property: zookeeper.snapCount)

traceFile

A log that records all requests. It can be used during debugging, but it is not recommended in production because it can seriously affect performance. (Java system property: requestTraceFile)

maxClientCnxns

Limits the number of concurrent connections from a single client, identified by IP address, to a single server. The default is 60. If set to 0, there is no limit. Note the scope of this limit: it applies only between one client machine and one ZK server. It is not a policy for specific client IPs, not a connection limit for the whole ZK cluster, and not a limit on the total connections from all clients to one server. For a restriction policy that targets specific client IPs, there is a patch you can try: http://rdc.taobao.com/team/jm/archives/1334 (No Java system property)

clientPortAddress

On machines with multiple network cards, specifies the address on which the server listens for client connections. By default, the server listens on clientPort on all addresses. New in 3.3.0

minSessionTimeout, maxSessionTimeout

Session timeout limits. If the timeout requested by the client is outside this range, it is forced to the minimum or maximum value. The default range is 2 * tickTime to 20 * tickTime. New in 3.3.0

fsync.warningthresholdms

When the transaction log is written, a warning message is logged if the fsync call takes longer than this threshold. The default is 1000 ms. (Java system property: fsync.warningthresholdms) New in 3.3.4

autopurge.purgeInterval

In version 3.4.0 and later, ZK can automatically purge old transaction logs and snapshot files. This parameter specifies the purge interval in hours. Set it to an integer of 1 or more to enable purging. The default is 0, which means automatic purging is disabled. (No Java system property) New in 3.4.0

autopurge.snapRetainCount

Used together with autopurge.purgeInterval. It specifies the number of most recent snapshots (and their transaction logs) to retain. The default is 3. (No Java system property) New in 3.4.0

electionAlg

In earlier versions, this parameter selected the leader election algorithm. Because only the TCP-based fast leader election algorithm is to remain in future versions, the parameter is of little use now and is not covered further here. (No Java system property)

initLimit

During startup, a Follower synchronizes the latest data from the Leader and only then starts serving clients. The Leader allows the Follower initLimit ticks to complete this work. In general this parameter needs little attention. If the amount of data in the ZK cluster is very large, a Follower takes longer to synchronize from the Leader at startup, and in that case the parameter should be increased. (No Java system property)

syncLimit

While running, the Leader communicates with all machines in the ZK cluster, for example to check through heartbeats whether they are alive. If the Leader sends a heartbeat and receives no response from a Follower within syncLimit ticks, that Follower is considered offline. Note: do not set this parameter too large, or real problems may be masked. (No Java system property)

leaderServes

By default, the Leader accepts client connections and provides normal read and write services. If you want the Leader to focus on coordinating the machines in the cluster, set this parameter to no, which can improve the performance of write operations. (Java system property: zookeeper.leaderServes)

server.x=[hostname]:nnnnn[:nnnnn]

The x here is a number that matches the id in the myid file. Two ports can be configured on the right: the first is used for data synchronization and other communication between Followers and the Leader, and the second for voting during the Leader election. (No Java system property)

group.x=nnnnn[:nnnnn], weight.x=nnnnn

Used for machine grouping and weight settings. (No Java system property)

cnxTimeout

The timeout for opening a connection during the Leader election. The default is 5 s. (Java system property: zookeeper.cnxTimeout)

zookeeper.DigestAuthenticationProvider.superDigest

Related to ZK permission settings. For more information, see "manipulating authorized nodes using super identity" and "ZooKeeper permission Control".

skipACL

Skips the ACL check for all client requests. If permission restrictions were previously set on a node, they no longer take effect once this switch is turned on at the server. (Java system property: zookeeper.skipACL)

forceSync

Determines whether FileChannel.force is called when the transaction log is committed, to ensure that the data is fully synchronized to disk. (Java system property: zookeeper.forceSync)

jute.maxbuffer

The maximum amount of data per node. The default is 1 MB. This restriction must be set on both the server and the client to take effect. (Java system property: jute.maxbuffer)
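To make the tick-based parameters concrete, the sketch below converts the sample zoo.cfg values (tickTime=2000, initLimit=10, syncLimit=5) into milliseconds and applies the default session timeout range of 2 * tickTime to 20 * tickTime. The helper clamp_session_timeout is a hypothetical name used only for illustration. It mimics the clamping described in the table and is not part of ZooKeeper.

```shell
#!/bin/sh
# Sample values taken from the zoo.cfg shown earlier.
tick_time=2000   # milliseconds per tick
init_limit=10    # ticks
sync_limit=5     # ticks

# initLimit and syncLimit are counted in ticks, so multiply by tickTime.
init_ms=$((init_limit * tick_time))
sync_ms=$((sync_limit * tick_time))

# Default session timeout range: 2 * tickTime to 20 * tickTime.
min_session_ms=$((2 * tick_time))
max_session_ms=$((20 * tick_time))

# Hypothetical helper: clamp a client-requested timeout (ms) into the range.
clamp_session_timeout() {
    requested=$1
    if [ "$requested" -lt "$min_session_ms" ]; then
        echo "$min_session_ms"
    elif [ "$requested" -gt "$max_session_ms" ]; then
        echo "$max_session_ms"
    else
        echo "$requested"
    fi
}

echo "initLimit window: ${init_ms} ms"
echo "syncLimit window: ${sync_ms} ms"
echo "session timeout range: ${min_session_ms}..${max_session_ms} ms"
echo "client asks for 1000 ms, gets $(clamp_session_timeout 1000) ms"
echo "client asks for 60000 ms, gets $(clamp_session_timeout 60000) ms"
```

With these values a Follower has 20 seconds to finish its initial synchronization, and a heartbeat response must arrive within 10 seconds.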

2.1.3. Start the ZooKeeper service

$ zkServer.sh start

2.1.4. Start the ZooKeeper client

$ zkCli.sh

2.2. Pseudo-distributed mode

2.2.1. Create a new myid file

Each zk server is assigned a unique id, which is stored in the myid file. The value must be between 1 and 255.

The myid file is stored in the directory named by the dataDir item in the zoo.cfg configuration file, for example dataDir=/home/hadoop/zookeeper/data.

Here, three zk services are created, with three configuration files: zoo1.cfg, zoo2.cfg and zoo3.cfg. Three dataDir paths are required, one per file, for example:

dataDir=/home/hadoop/zookeeper/data1

dataDir=/home/hadoop/zookeeper/data2

dataDir=/home/hadoop/zookeeper/data3
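A minimal sketch of creating the three data directories and their myid files follows. ZK_BASE is an assumed variable standing in for /home/hadoop/zookeeper. It defaults to a scratch directory so the sketch can be tried without touching real data.

```shell
#!/bin/sh
# Assumption: ZK_BASE stands in for /home/hadoop/zookeeper. When unset, a
# scratch directory is used instead.
ZK_BASE=${ZK_BASE:-$(mktemp -d)}

for id in 1 2 3; do
    data_dir="$ZK_BASE/data$id"
    mkdir -p "$data_dir"
    # myid holds only the server id, matching server.<id> in the config.
    echo "$id" > "$data_dir/myid"
done

# Show what was written.
for id in 1 2 3; do
    printf '%s: %s\n' "$ZK_BASE/data$id/myid" "$(cat "$ZK_BASE/data$id/myid")"
done
```

To write into the real location, run it with ZK_BASE=/home/hadoop/zookeeper set in the environment.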

2.2.2. Add configuration items to zoo.cfg

Syntax: server.n=hostname:port1:port2

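Here n is the server ID, that is, the value in myid; hostname is the server name; port1 is the port through which followers connect to the leader; and port2 is used for leader election. Because all three servers run on the same host in pseudo-distributed mode, each one needs its own clientPort and its own port1 and port2 values.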
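The three configuration files could be generated along these lines. This is only a sketch: the client ports 2181 to 2183 and the port pairs 2888:3888, 2889:3889 and 2890:3890 are illustrative choices, picked so that three servers on one host do not collide. CONF_DIR and DATA_BASE are assumed variables standing in for /software/zk/conf and /home/hadoop/zookeeper, and they default to scratch directories.

```shell
#!/bin/sh
# Assumptions: CONF_DIR and DATA_BASE default to scratch directories. In the
# layout used here they would be /software/zk/conf and /home/hadoop/zookeeper.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}
DATA_BASE=${DATA_BASE:-$(mktemp -d)}

for id in 1 2 3; do
    client_port=$((2180 + id))    # 2181, 2182, 2183 (illustrative)
    {
        echo "tickTime=2000"
        echo "initLimit=10"
        echo "syncLimit=5"
        echo "dataDir=$DATA_BASE/data$id"
        echo "clientPort=$client_port"
        # The same server list goes into every file. The ports differ per
        # server because all three run on one host.
        for n in 1 2 3; do
            echo "server.$n=localhost:$((2887 + n)):$((3887 + n))"
        done
    } > "$CONF_DIR/zoo$id.cfg"
done

cat "$CONF_DIR/zoo1.cfg"
```

Each file differs only in dataDir and clientPort, while the server.n lines are identical in all three.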

2.2.3. Start the ZooKeeper service

$> zkServer.sh start /software/zk/conf/zoo1.cfg

$> zkServer.sh start /software/zk/conf/zoo2.cfg

$> zkServer.sh start /software/zk/conf/zoo3.cfg

2.2.4. View zk service status

1) Use the four-letter-word commands:

$> echo ruok | nc localhost 2181

$> echo conf | nc localhost 2181

$> echo envi | nc localhost 2181

$> echo stat | nc localhost 2181

2) Use the zkServer.sh status command:

$ zkServer.sh status /software/zk/conf/zoo1.cfg
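The ruok check above can be wrapped in a small script to test several servers at once. This is a sketch rather than a standard tool: zk_is_ok and zk_check_ports are hypothetical helper names, and it assumes nc is installed and that the server answers ruok with imok.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the server at host:port answers "imok".
zk_is_ok() {
    host=$1
    port=$2
    reply=$(echo ruok | nc "$host" "$port" 2>/dev/null)
    [ "$reply" = "imok" ]
}

# Hypothetical helper: print one status line per client port.
zk_check_ports() {
    host=$1
    shift
    for port in "$@"; do
        if zk_is_ok "$host" "$port"; then
            echo "$host:$port imok"
        else
            echo "$host:$port no response"
        fi
    done
}
```

For three local servers, a call such as zk_check_ports localhost 2181 2182 2183 prints one line per port. The port numbers are an assumption and must match the clientPort values in the three configuration files.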

2.2.5. Connect to the zk service through zk client

$ zkCli.sh    # connects to localhost:2181 by default

$ zkCli.sh -server s200:2182    # connects to the specified zk service

2.3. Fully distributed mode

2.3.1. Distribute ZooKeeper

Distribute the ZooKeeper installation files to every server on which the ZooKeeper service is to be installed.

2.3.2. Distribute the environment variables of ZooKeeper

Distribute the configured environment variables to all servers, then either restart each server or run the source command so that the environment variables take effect.

2.3.3. Create a new myid file

Refer to "Create a new myid file" in the pseudo-distributed mode section.

2.3.4. Add configuration items to zoo.cfg

Add the server configuration items to the $ZOOKEEPER_HOME/conf/zoo.cfg file on every zk server. For the syntax, refer to the pseudo-distributed mode section.

Example:

server.201=s201:2888:3888

server.202=s202:2888:3888

server.203=s203:2888:3888
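As a sketch, the server lines in the example above can be generated from the host list, which keeps the ids in zoo.cfg and in each myid file consistent. It assumes the numeric suffix of each host name is used as the server id, as in the example. server_id and server_lines are hypothetical helper names.

```shell
#!/bin/sh
# Hosts from the example above. Assumption: the numeric suffix of each host
# name doubles as its server id, for instance s201 has id 201.
HOSTS="s201 s202 s203"

# Hypothetical helper: derive the id from a host name such as s201.
server_id() {
    echo "${1#s}"
}

# Hypothetical helper: print the server.N lines shared by every zoo.cfg.
server_lines() {
    for host in $HOSTS; do
        echo "server.$(server_id "$host")=$host:2888:3888"
    done
}

server_lines

# On each host, the myid file must contain that host's own id, e.g. on s202:
#   server_id s202 > /home/hadoop/zookeeper/data/myid
```

The same three lines go into zoo.cfg on every server, while the myid file differs per host. All three ids stay within the 1 to 255 range required for myid.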

2.3.5. Start the zk service

Run the following on each server:

$ zkServer.sh start
