Step 1: install the Hadoop cluster
1. Prepare the media needed to build the environment
Enterprise-R5-U4-Server-x86_64-dvd.iso
hadoop-1.1.1.tar.gz
jdk-6u26-linux-x64-rpm.bin
2. Create five virtual machine nodes
192.168.0.202 hd202 # NameNode
192.168.0.203 hd203 # SecondaryNameNode
192.168.0.204 hd204 # DataNode
192.168.0.205 hd205 # DataNode
192.168.0.206 hd206 # DataNode
When installing the virtual machines, make sure the sshd service is installed. If disk space allows, install as complete a set of system packages as possible.
3. Install the JDK on all five virtual machines (as the root user)
[root@hd202 ~]# mkdir /usr/java
[root@hd202 ~]# mv jdk-6u26-linux-x64-rpm.bin /usr/java
[root@hd202 ~]# cd /usr/java
[root@hd202 java]# chmod 744 jdk-6u26-linux-x64-rpm.bin
[root@hd202 java]# ./jdk-6u26-linux-x64-rpm.bin
[root@hd202 java]# ln -s jdk1.6.0_26 default
4. Create the Hadoop administrative user (the user must be created on all five virtual machines)
[root@hd202 ~]# useradd cbcloud   # add the user directly, without creating a group first; by default a group with the same name as the user is created, i.e. cbcloud:cbcloud
[root@hd202 ~]# passwd cbcloud    # set the password for user cbcloud; in a test environment it can be set to 111111
5. Edit the /etc/hosts file (as root, on all five virtual machines)
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.0.202 hd202
192.168.0.203 hd203
192.168.0.204 hd204
192.168.0.205 hd205
192.168.0.206 hd206
6. Edit the /etc/sysconfig/network file (as root, on all five virtual machines)
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=hd202   # hostname (on 192.168.0.203 change this to hd203, and so on; all five machines should use their own names)
GATEWAY=192.168.0.1
7. Configure SSH user equivalence among the five machines (log in as the cbcloud user created earlier)
[cbcloud@hd202 ~]$ mkdir .ssh
[cbcloud@hd202 ~]$ chmod 700 .ssh
[cbcloud@hd202 ~]$ ssh-keygen -t rsa
[cbcloud@hd202 ~]$ ssh-keygen -t dsa
[cbcloud@hd202 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd202 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd203 ~]$ mkdir .ssh
[cbcloud@hd203 ~]$ chmod 700 .ssh
[cbcloud@hd203 ~]$ ssh-keygen -t rsa
[cbcloud@hd203 ~]$ ssh-keygen -t dsa
[cbcloud@hd203 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd203 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd204 ~]$ mkdir .ssh
[cbcloud@hd204 ~]$ chmod 700 .ssh
[cbcloud@hd204 ~]$ ssh-keygen -t rsa
[cbcloud@hd204 ~]$ ssh-keygen -t dsa
[cbcloud@hd204 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd204 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd205 ~]$ mkdir .ssh
[cbcloud@hd205 ~]$ chmod 700 .ssh
[cbcloud@hd205 ~]$ ssh-keygen -t rsa
[cbcloud@hd205 ~]$ ssh-keygen -t dsa
[cbcloud@hd205 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd205 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd206 ~]$ mkdir .ssh
[cbcloud@hd206 ~]$ chmod 700 .ssh
[cbcloud@hd206 ~]$ ssh-keygen -t rsa
[cbcloud@hd206 ~]$ ssh-keygen -t dsa
[cbcloud@hd206 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd206 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[cbcloud@hd202 ~] $cd .ssh
[cbcloud@hd202 .ssh]$ scp authorized_keys cbcloud@hd203:/home/cbcloud/.ssh/authorized_keys2   # remotely copy the authorized_keys file from hd202 to the /home/cbcloud/.ssh/ directory on hd203, renaming it authorized_keys2
[cbcloud@hd203 ~]$ cd .ssh
[cbcloud@hd203 .ssh]$ cat authorized_keys2 >> authorized_keys   # i.e. append the keys from hd202 to the authorized_keys file on hd203
Then copy the merged authorized_keys file to hd204 and merge it with the authorized_keys file there, and so on. Once the keys of all five nodes have been merged, copy the final authorized_keys file (containing the keys of all five nodes) back over the copies on the other four nodes.
Note: the permissions of the authorized_keys file must be 644, otherwise user equivalence will not work.
Execute the following command on all five nodes:
[cbcloud@hd202 ~]$ cd .ssh
[cbcloud@hd202 .ssh]$ chmod 644 authorized_keys
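To confirm that user equivalence works before moving on, a quick check like the following can be run from hd202 as cbcloud (a sketch; the first connection to each host may still ask you to accept its host key):
[cbcloud@hd202 ~]$ for h in hd202 hd203 hd204 hd205 hd206; do ssh $h hostname; done
Each hostname should be printed without a password prompt.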
8. Start installing the hadoop cluster
8.1 Create directories (execute the following commands on all five virtual machines as the root user)
[root@hd202 ~]# mkdir /home/cbcloud/hdtmp
[root@hd202 ~]# mkdir /home/cbcloud/hddata
[root@hd202 ~]# mkdir /home/cbcloud/hdconf
[root@hd202 ~]# chown -R cbcloud:cbcloud /home/cbcloud/hdtmp
[root@hd202 ~]# chown -R cbcloud:cbcloud /home/cbcloud/hddata
[root@hd202 ~]# chown -R cbcloud:cbcloud /home/cbcloud/hdconf
[root@hd202 ~]# chmod -R 755 /home/cbcloud/hddata   # hddata is where the DataNodes store their data; Hadoop checks its permissions strictly, and this directory must be 755, otherwise the DataNodes will fail to start later because of incorrect permissions
8.2 Extract hadoop-1.1.1.tar.gz into the /home/cbcloud directory (only needs to be done on hd202)
[root@hd202 ~]# mv hadoop-1.1.1.tar.gz /home/cbcloud
[root@hd202 ~]# cd /home/cbcloud
[root@hd202 cbcloud]# tar -xzvf hadoop-1.1.1.tar.gz
[root@hd202 cbcloud]# mv hadoop-1.1.1 hadoop
[root@hd202 cbcloud]# chown -R cbcloud:cbcloud hadoop/
8.3 Configure system environment variables in /etc/profile (execute on all five virtual machines as the root user)
[root@hd202 ~]# vi /etc/profile
Add the following at the end of the file
export JAVA_HOME=/usr/java/default
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/lib:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
export HADOOP_HOME=/home/cbcloud/hadoop
export HADOOP_DEV_HOME=/home/cbcloud/hadoop
export HADOOP_COMMON_HOME=/home/cbcloud/hadoop
export HADOOP_HDFS_HOME=/home/cbcloud/hadoop
export HADOOP_CONF_DIR=/home/cbcloud/hdconf
export HADOOP_HOME_WARN_SUPPRESS=1
export PATH=$PATH:$HADOOP_HOME/bin
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
8.4 configure user environment variables
[cbcloud@hd202 ~]$ vi .bash_profile
Add the following at the end of the file
export JAVA_HOME=/usr/java/default
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/lib:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
export HADOOP_HOME=/home/cbcloud/hadoop
export HADOOP_DEV_HOME=/home/cbcloud/hadoop
export HADOOP_COMMON_HOME=/home/cbcloud/hadoop
export HADOOP_HDFS_HOME=/home/cbcloud/hadoop
export HADOOP_CONF_DIR=/home/cbcloud/hdconf
export HADOOP_HOME_WARN_SUPPRESS=1
export PATH=$PATH:$HADOOP_HOME/bin
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
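The new variables only take effect in new login shells. To apply them to the current session and do a quick sanity check, something like the following can be used (a sketch, assuming the two files above have been saved):
[cbcloud@hd202 ~]$ source /etc/profile
[cbcloud@hd202 ~]$ source ~/.bash_profile
[cbcloud@hd202 ~]$ echo $HADOOP_HOME $HADOOP_CONF_DIR
[cbcloud@hd202 ~]$ java -version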
8.5 Modify the Hadoop configuration files (as the cbcloud user, only on hd202)
[cbcloud@hd202 ~]$ cp $HADOOP_HOME/conf/* $HADOOP_CONF_DIR/
# As the environment variables set in the previous step show, Hadoop now reads its configuration from /home/cbcloud/hdconf, so all configuration files in /home/cbcloud/hadoop/conf must be copied to /home/cbcloud/hdconf.
8.5.1 Edit core-site.xml
[cbcloud@hd202 ~]$ cd /home/cbcloud/hdconf
[cbcloud@hd202 hdconf]$ vi core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hd202:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/cbcloud/hdtmp</value>
  </property>
</configuration>
8.5.2 Edit hdfs-site.xml
[cbcloud@hd202 hdconf]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/cbcloud/hddata</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
8.5.3 Edit mapred-site.xml
[cbcloud@hd202 hdconf]$ vi mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hd202:9001</value>
  </property>
</configuration>
8.5.4 Edit masters
[cbcloud@hd202 hdconf]$ vi masters
Add the following
hd203   # hd203 is the SecondaryNameNode, so only hd203 needs to be listed here, not hd202
8.5.5 Edit slaves
[cbcloud@hd202 hdconf]$ vi slaves
Add the following
hd204
hd205
hd206
8.6 Copy the /home/cbcloud/hadoop and /home/cbcloud/hdconf directories to the other four virtual machines
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hadoop hd203:/home/cbcloud   # because user equivalence was configured earlier, this command no longer prompts for a password
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hadoop hd204:/home/cbcloud
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hadoop hd205:/home/cbcloud
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hadoop hd206:/home/cbcloud
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hdconf hd203:/home/cbcloud
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hdconf hd204:/home/cbcloud
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hdconf hd205:/home/cbcloud
[cbcloud@hd202 hdconf]$ scp -r /home/cbcloud/hdconf hd206:/home/cbcloud
8.7 Format the namespace on the NameNode (hd202)
[cbcloud@hd202 ~]$ cd $HADOOP_HOME/bin
[cbcloud@hd202 bin]$ hadoop namenode -format
If no ERROR lines appear in the output printed to the console, the namespace has been formatted successfully.
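As an additional hedged check: in Hadoop 1.x the NameNode metadata directory dfs.name.dir defaults to ${hadoop.tmp.dir}/dfs/name, which in this setup is /home/cbcloud/hdtmp/dfs/name, so after a successful format a listing like the following should show a current/ subdirectory (the exact layout may vary by version):
[cbcloud@hd202 ~]$ ls /home/cbcloud/hdtmp/dfs/name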
8.8 start hadoop
[cbcloud@hd202 ~]$ cd $HADOOP_HOME/bin
[cbcloud@hd202 bin]$ ./start-dfs.sh
starting namenode, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-namenode-hd202.out
hd204: starting datanode, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-datanode-hd204.out
hd205: starting datanode, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-datanode-hd205.out
hd206: starting datanode, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-datanode-hd206.out
hd203: starting secondarynamenode, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-secondarynamenode-hd203.out
8.9 start mapred
[cbcloud@hd202 bin]$ ./start-mapred.sh
starting jobtracker, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-jobtracker-hd202.out
hd204: starting tasktracker, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-tasktracker-hd204.out
hd205: starting tasktracker, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-tasktracker-hd205.out
hd206: starting tasktracker, logging to /home/cbcloud/hadoop/libexec/../logs/hadoop-cbcloud-tasktracker-hd206.out
8.10 View the processes
[cbcloud@hd202 bin]$ jps
4335 JobTracker
4460 Jps
4153 NameNode
[cbcloud@hd203 hdconf]$ jps
1142 Jps
1078 SecondaryNameNode
[cbcloud@hd204 hdconf]$ jps
1783 Jps
1575 DataNode
1706 TaskTracker
[cbcloud@hd205 hdconf]$ jps
1669 Jps
1461 DataNode
1590 TaskTracker
[cbcloud@hd206 hdconf]$ jps
1494 DataNode
1614 TaskTracker
1694 Jps
8.11 View cluster status
[cbcloud@hd202 bin]$ hadoop dfsadmin -report
Configured Capacity: 27702829056 (25.8 GB)
Present Capacity: 13044953088 (12.15 GB)
DFS Remaining: 13044830208 (12.15 GB)
DFS Used: 122880 (120 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Datanodes available: 3 (3 total, 0 dead)
Name: 192.168.0.205:50010
Decommission Status: Normal
Configured Capacity: 9234276352 (8.6GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 4885942272 (4.55 GB)
DFS Remaining: 4348293120 (4.05GB)
DFS Used%: 0%
DFS Remaining%: 47.09%
Last contact: Wed Jan 30 18:02:17 CST 2013
Name: 192.168.0.206:50010
Decommission Status: Normal
Configured Capacity: 9234276352 (8.6GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 4885946368 (4.55 GB)
DFS Remaining: 4348289024 (4.05GB)
DFS Used%: 0%
DFS Remaining%: 47.09%
Last contact: Wed Jan 30 18:02:17 CST 2013
Name: 192.168.0.204:50010
Decommission Status: Normal
Configured Capacity: 9234276352 (8.6GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 4885987328 (4.55 GB)
DFS Remaining: 4348248064 (4.05GB)
DFS Used%: 0%
DFS Remaining%: 47.09%
Last contact: Wed Jan 30 18:02:17 CST 2013
Note: if the error "INFO ipc.Client: Retrying connect to server" is reported, it is usually caused by a problem in core-site.xml. Stop Hadoop, fix the configuration, format the NameNode again, and restart Hadoop.
In addition, turn off the firewall every time the virtual machines are started; one way to do this is shown below.
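On an Oracle Linux 5 (RHEL 5 based) guest the firewall is typically iptables, so one possible way to turn it off (assuming iptables is the active firewall service) is to run the following as root on each node:
[root@hd202 ~]# service iptables stop     # stop the firewall immediately
[root@hd202 ~]# chkconfig iptables off    # keep it disabled across reboots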
8.12 View the status of Hadoop in a web browser
Open http://192.168.0.202:50070 to view the HDFS (NameNode) status
8.13 View job execution in a web browser
Open http://192.168.0.202:50030 to view job execution on the JobTracker
9. List the directories that exist in the HDFS file system
[cbcloud@hd202 logs]$ hadoop dfs -ls
ls: Cannot access .: No such file or directory.
The error above occurs because the current user's home directory in HDFS (/user/cbcloud) does not exist yet, so there is nothing for the relative path to list.
You can execute hadoop fs -ls / instead
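Alternatively, the missing home directory can simply be created so that relative paths (and a plain hadoop fs -ls) work; a possible one-off command is:
[cbcloud@hd202 logs]$ hadoop fs -mkdir /user/cbcloud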
[cbcloud@hd202 logs]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x   - cbcloud supergroup          0 2013-01-30 15:52 /home
You can see that there is one entry, the /home directory.
Execute hadoop fs -mkdir hello   # hello is the name of the folder to create
[cbcloud@hd202 logs]$ hadoop fs -mkdir hello
[cbcloud@hd202 logs]$ hadoop fs -ls
Found 1 items
drwxr-xr-x   - cbcloud supergroup          0 2013-01-30 21:16 /user/cbcloud/hello
10. HDFS usage test
[cbcloud@hd202 logs]$ hadoop dfs -rmr hello
Deleted hdfs://hd202:9000/user/cbcloud/hello   # delete the folder created earlier
[cbcloud@hd202 logs]$ hadoop dfs -mkdir input
[cbcloud@hd202 logs]$ hadoop dfs -ls
Found 1 items
drwxr-xr-x   - cbcloud supergroup          0 2013-01-30 21:18 /user/cbcloud/input
11. Run the wordcount example that ships with Hadoop
11.1. Create the data files
Create two files, input1 and input2, on the 192.168.0.202 (hd202) virtual machine
[cbcloud@hd202 hadoop]$ echo "Hello Hadoop in input1" > input1
[cbcloud@hd202 hadoop]$ echo "Hello Hadoop in input2" > input2
11.2. Publish data files to Hadoop cluster
1. Set up an input directory in HDFS
[cbcloud@hd202 hadoop]$ hadoop dfs -mkdir input
2. Copy the files input1 and input2 to the input directory in HDFS
[cbcloud@hd202 hadoop]$ hadoop dfs -copyFromLocal /home/cbcloud/hadoop/input* input
3. Check whether the copy is successful in the input directory.
[cbcloud@hd202 hadoop]$ hadoop dfs -ls input
Found 2 items
-rw-r--r--   3 cbcloud supergroup         23 2013-01-30 21:28 /user/cbcloud/input/input1
-rw-r--r--   3 cbcloud supergroup         23 2013-01-30 21:28 /user/cbcloud/input/input2
11.3. Execute the wordcount program and view the results   # make sure there is no existing output directory in HDFS (see the note after the results below)
[cbcloud@hd202 hadoop]$ hadoop jar hadoop-examples-1.1.1.jar wordcount input output
13-01-30 21:33:05 INFO input.FileInputFormat: Total input paths to process: 2
13-01-30 21:33:05 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13-01-30 21:33:05 WARN snappy.LoadSnappy: Snappy native library not loaded
13-01-30 21:33:07 INFO mapred.JobClient: Running job: job_201301302110_0001
13-01-30 21:33:08 INFO mapred.JobClient: map 0% reduce 0%
13-01-30 21:33:32 INFO mapred.JobClient: map 50% reduce 0%
13-01-30 21:33:33 INFO mapred.JobClient: map 100% reduce 0%
13-01-30 21:33:46 INFO mapred.JobClient: map 100% reduce 100%
13-01-30 21:33:53 INFO mapred.JobClient: Job complete: job_201301302110_0001
13-01-30 21:33:53 INFO mapred.JobClient: Counters: 29
13-01-30 21:33:53 INFO mapred.JobClient: Job Counters
13-01-30 21:33:53 INFO mapred.JobClient: Launched reduce tasks=1
13-01-30 21:33:53 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=29766
13-01-30 21:33:53 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13-01-30 21:33:53 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13-01-30 21:33:53 INFO mapred.JobClient: Launched map tasks=2
13-01-30 21:33:53 INFO mapred.JobClient: Data-local map tasks=2
13-01-30 21:33:53 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=13784
13-01-30 21:33:53 INFO mapred.JobClient: File Output Format Counters
13-01-30 21:33:53 INFO mapred.JobClient: Bytes Written=40
13-01-30 21:33:53 INFO mapred.JobClient: FileSystemCounters
13-01-30 21:33:53 INFO mapred.JobClient: FILE_BYTES_READ=100
13-01-30 21:33:53 INFO mapred.JobClient: HDFS_BYTES_READ=262
13-01-30 21:33:53 INFO mapred.JobClient: FILE_BYTES_WRITTEN=71911
13-01-30 21:33:53 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=40
13-01-30 21:33:53 INFO mapred.JobClient: File Input Format Counters
13-01-30 21:33:53 INFO mapred.JobClient: Bytes Read=46
13-01-30 21:33:53 INFO mapred.JobClient: Map-Reduce Framework
13-01-30 21:33:53 INFO mapred.JobClient: Map output materialized bytes=106
13-01-30 21:33:53 INFO mapred.JobClient: Map input records=2
13-01-30 21:33:53 INFO mapred.JobClient: Reduce shuffle bytes=106
13-01-30 21:33:53 INFO mapred.JobClient: Spilled Records=16
13-01-30 21:33:53 INFO mapred.JobClient: Map output bytes=78
13-01-30 21:33:53 INFO mapred.JobClient: CPU time spent (ms)=5500
13-01-30 21:33:53 INFO mapred.JobClient: Total committed heap usage (bytes)=336928768
13-01-30 21:33:53 INFO mapred.JobClient: Combine input records=8
13-01-30 21:33:53 INFO mapred.JobClient: SPLIT_RAW_BYTES=216
13-01-30 21:33:53 INFO mapred.JobClient: Reduce input records=8
13-01-30 21:33:53 INFO mapred.JobClient: Reduce input groups=5
13-01-30 21:33:53 INFO mapred.JobClient: Combine output records=8
13-01-30 21:33:53 INFO mapred.JobClient: Physical memory (bytes) snapshot=417046528
13-01-30 21:33:53 INFO mapred.JobClient: Reduce output records=5
13-01-30 21:33:53 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1612316672
13-01-30 21:33:53 INFO mapred.JobClient: Map output records=8
[cbcloud@hd202 hadoop]$ hadoop dfs -ls output
Found 2 items
-rw-r--r--   3 cbcloud supergroup          0 2013-01-30 21:33 /user/cbcloud/output/_SUCCESS
-rw-r--r--   3 cbcloud supergroup         40 2013-01-30 21:33 /user/cbcloud/output/part-r-00000
[cbcloud@hd202 hadoop]$ hadoop dfs -cat output/part-r-00000
Hadoop  2
Hello   2
in      2
input1  1
input2  1
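Note on rerunning the job: MapReduce refuses to write into an output directory that already exists, so before running wordcount again you must delete the old output directory, for example:
[cbcloud@hd202 hadoop]$ hadoop dfs -rmr output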
Step 2: build a ZooKeeper cluster environment
Building the Hadoop cluster on Oracle Linux 5.4 64-bit was documented in detail in Step 1 above. Continuing from there, we now install ZooKeeper and HBase.
1. Install ZooKeeper (install on hd202)
1.1. Prepare the installation media zookeeper-3.4.5.tar.gz
1.2. As the cbcloud user, upload the media to the /home/cbcloud/ directory on the hd202 virtual machine
1.3. Extract zookeeper-3.4.5.tar.gz
[cbcloud@hd202 ~]$ tar zxvf zookeeper-3.4.5.tar.gz
1.4. Create a data directory on the hd204, hd205 and hd206 machines
[cbcloud@hd204 ~]$ mkdir /home/cbcloud/zookeeperdata
[cbcloud@hd205 ~]$ mkdir /home/cbcloud/zookeeperdata
[cbcloud@hd206 ~]$ mkdir /home/cbcloud/zookeeperdata
1.5. Execute the following on hd202
[cbcloud@hd202 ~]$ mv zookeeper-3.4.5 zookeeper
[cbcloud@hd202 ~]$ cd zookeeper/conf
[cbcloud@hd202 conf]$ mv zoo_sample.cfg zoo.cfg
[cbcloud@hd202 conf]$ vi zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/cbcloud/zookeeperdata
# the port at which the clients will connect
clientPort=2181
server.1=hd204:2888:3888
server.2=hd205:2888:3888
server.3=hd206:2888:3888
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
1.6. Copy the zookeeper folder to the hd204, hd205 and hd206 virtual machines
[cbcloud@hd202 ~]$ scp -r zookeeper hd204:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r zookeeper hd205:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r zookeeper hd206:/home/cbcloud/
1.7. Create a new myid file under the /home/cbcloud/zookeeperdata directory on the hd204, hd205 and hd206 virtual machines, containing the numbers 1, 2 and 3 respectively (a one-line alternative is shown after these steps)
[cbcloud@hd204 ~]$ cd zookeeperdata
[cbcloud@hd204 zookeeperdata]$ touch myid
[cbcloud@hd204 zookeeperdata]$ vi myid
Add the following
1   # corresponds to the number in server.1=hd204:2888:3888 in the configuration file above
[cbcloud@hd205 ~]$ cd zookeeperdata
[cbcloud@hd205 zookeeperdata]$ touch myid
[cbcloud@hd205 zookeeperdata]$ vi myid
Add the following
2   # corresponds to the number in server.2=hd205:2888:3888 in the configuration file above
[cbcloud@hd206 ~]$ cd zookeeperdata
[cbcloud@hd206 zookeeperdata]$ touch myid
[cbcloud@hd206 zookeeperdata]$ vi myid
Add the following
3   # corresponds to the number in server.3=hd206:2888:3888 in the configuration file above
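An equivalent shortcut (the same result without opening an editor) is to write the number directly, for example on hd204:
[cbcloud@hd204 ~]$ echo 1 > /home/cbcloud/zookeeperdata/myid
and likewise echo 2 on hd205 and echo 3 on hd206.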
1.8. Start ZooKeeper by executing zkServer.sh start in the /home/cbcloud/zookeeper/bin directory on the hd204, hd205 and hd206 machines
[cbcloud@hd204 ~]$ cd zookeeper
[cbcloud@hd204 zookeeper]$ cd bin
[cbcloud@hd204 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[cbcloud@hd205 ~]$ cd zookeeper
[cbcloud@hd205 zookeeper]$ cd bin
[cbcloud@hd205 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[cbcloud@hd206 ~]$ cd zookeeper
[cbcloud@hd206 zookeeper]$ cd bin
[cbcloud@hd206 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
1.9 View the process status of zookeeper
[cbcloud@hd204 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Mode: follower   # hd204 is currently running as a follower
[cbcloud@hd205 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Mode: leader   # hd205 is currently running as the leader
[cbcloud@hd206 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Mode: follower   # hd206 is currently running as a follower
2. View the process details of zookeeper
[cbcloud@hd204 bin]$ echo stat | nc localhost 2181
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Clients:
 /127.0.0.1:41205[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/0
Received: 2
Sent: 1
Connections: 1
Outstanding: 0
Zxid: 0x0
Mode: follower
Node count: 4
[cbcloud@hd205 bin]$ echo stat | nc localhost 2181
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Clients:
 /127.0.0.1:38712[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/0
Received: 2
Sent: 1
Connections: 1
Outstanding: 0
Zxid: 0x100000000
Mode: leader
Node count: 4
[cbcloud@hd206 bin]$ echo stat | nc localhost 2181
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Clients:
 /127.0.0.1:39268[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/0
Received: 2
Sent: 1
Connections: 1
Outstanding: 0
Zxid: 0x100000000
Mode: follower
Node count: 4
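ZooKeeper also answers the ruok four-letter command, which gives a quick per-node health check; a healthy server replies imok:
[cbcloud@hd204 bin]$ echo ruok | nc localhost 2181
imok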
Step 3: build an HBase cluster
1. Prepare the installation media hbase-0.94.4.tar.gz
2. As the cbcloud user, upload the installation media to the /home/cbcloud/ directory on the hd202 virtual machine
3. As the cbcloud user, log in to the hd202 virtual machine and extract hbase-0.94.4.tar.gz
[cbcloud@hd202 ~]$ tar zxvf hbase-0.94.4.tar.gz
[cbcloud@hd202 ~]$ mv hbase-0.94.4 hbase
4. Create HBase's configuration directory hbconf on all five virtual machines (as the cbcloud user)
[cbcloud@hd202 ~]$ mkdir /home/cbcloud/hbconf
[cbcloud@hd203 ~]$ mkdir /home/cbcloud/hbconf
[cbcloud@hd204 ~]$ mkdir /home/cbcloud/hbconf
[cbcloud@hd205 ~]$ mkdir /home/cbcloud/hbconf
[cbcloud@hd206 ~]$ mkdir /home/cbcloud/hbconf
5. Configure system environment variables (as the root user)
[root@hd202 ~]# vi /etc/profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[root@hd203 ~]# vi /etc/profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[root@hd204 ~]# vi /etc/profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[root@hd205 ~]# vi /etc/profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[root@hd206 ~]# vi /etc/profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
6. Configure user environment variables (as the cbcloud user)
[cbcloud@hd202 ~]$ vi .bash_profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[cbcloud@hd203 ~]$ vi .bash_profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[cbcloud@hd204 ~]$ vi .bash_profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[cbcloud@hd205 ~]$ vi .bash_profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
[cbcloud@hd206 ~]$ vi .bash_profile
Add the following at the end of the file
export HBASE_CONF_DIR=/home/cbcloud/hbconf
export HBASE_HOME=/home/cbcloud/hbase
7. Copy all files from the conf subdirectory of $HBASE_HOME to the $HBASE_CONF_DIR directory (on hd202 only)
[cbcloud@hd202 ~]$ cp /home/cbcloud/hbase/conf/* /home/cbcloud/hbconf/
8. Edit hbase-env.sh in the $HBASE_CONF_DIR directory (on hd202 only)
Find the line export HBASE_OPTS="-XX:+UseConcMarkSweepGC", comment it out, and then add the following
export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC"
export JAVA_HOME=/usr/java/default
export HBASE_HOME=/home/cbcloud/hbase
export HADOOP_HOME=/home/cbcloud/hadoop
export HBASE_MANAGES_ZK=true   # the ZooKeeper processes are managed automatically by HBase
9. Edit hbase-site.xml in the $HBASE_CONF_DIR directory (on hd202 only)
Add the following
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hd202:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>hd202:60000</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
    <description>The port master should bind to.</description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hd204,hd205,hd206</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/cbcloud/zookeeperdata</value>
  </property>
</configuration>
10. Edit the regionservers file
Delete localhost, and then add the following
hd204
hd205
hd206
11. Copy the $HBASE_HOME directory and $HBASE_CONF_DIR directory to the other four virtual machines
[cbcloud@hd202 ~]$ scp -r hbase hd203:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r hbase hd204:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r hbase hd205:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r hbase hd206:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r hbconf hd203:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r hbconf hd204:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r hbconf hd205:/home/cbcloud/
[cbcloud@hd202 ~]$ scp -r hbconf hd206:/home/cbcloud/
12. Start HBase
[cbcloud@hd202 ~]$ cd hbase
[cbcloud@hd202 hbase]$ cd bin
[cbcloud@hd202 bin]$ ./start-hbase.sh   # start HBase on the primary node
starting master, logging to /home/cbcloud/hbase/logs/hbase-cbcloud-master-hd202.out
hd204: starting regionserver, logging to /home/cbcloud/hbase/logs/hbase-cbcloud-regionserver-hd204.out
hd205: starting regionserver, logging to /home/cbcloud/hbase/logs/hbase-cbcloud-regionserver-hd205.out
hd206: starting regionserver, logging to /home/cbcloud/hbase/logs/hbase-cbcloud-regionserver-hd206.out
[cbcloud@hd202 bin]$ jps
3779 JobTracker
4529 HMaster
4736 Jps
3633 NameNode
[cbcloud@hd203 ~]$ cd hbase
[cbcloud@hd203 hbase]$ cd bin
[cbcloud@hd203 bin]$ ./hbase-daemon.sh start master   # start a backup HMaster on the SecondaryNameNode
starting master, logging to /home/cbcloud/hbase/logs/hbase-cbcloud-master-hd203.out
[cbcloud@hd203 bin]$ jps
3815 Jps
3618 SecondaryNameNode
3722 HMaster
[cbcloud@hd204 hbconf]$ jps
3690 TaskTracker
3614 DataNode
4252 Jps
3845 QuorumPeerMain
4124 HRegionServer
[cbcloud@hd205 hbconf]$ jps
3826 QuorumPeerMain
3612 DataNode
3688 TaskTracker
4085 HRegionServer
4256 Jps
[cbcloud@hd206 ~]$ jps
3825 QuorumPeerMain
3693 TaskTracker
4091 HRegionServer
4279 Jps
3617 DataNode
13. View the HMaster web interface at http://192.168.0.202:60010
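As a quick functional test of the cluster, you can also open the HBase shell on hd202 and create a small table. A minimal smoke test looks like the following (the table name t1 and column family cf are arbitrary examples; the shell prompt is abbreviated here):
[cbcloud@hd202 ~]$ hbase shell
hbase> status
hbase> create 't1', 'cf'
hbase> put 't1', 'row1', 'cf:a', 'value1'
hbase> scan 't1'
hbase> disable 't1'
hbase> drop 't1'
hbase> exit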
14. How to shut down HBase
Step 1: shut down the HMaster service on SecondaryNameNode
[cbcloud@hd203 ~]$ cd hbase
[cbcloud@hd203 hbase]$ cd bin
[cbcloud@hd203 bin]$ ./hbase-daemon.sh stop master
stopping master.
[cbcloud@hd203 bin]$ jps
4437 Jps
3618 SecondaryNameNode
Step 2: shut down HBase (the HMaster and the region servers) from the NameNode
[cbcloud@hd202 ~]$ cd hbase
[cbcloud@hd202 hbase]$ cd bin
[cbcloud@hd202 bin]$ ./stop-hbase.sh
stopping hbase.
[cbcloud@hd202 bin]$ jps
5620 Jps
3779 JobTracker
3633 NameNode
Step 3: shut down the zookeeper service
[cbcloud@hd204 ~]$ cd zookeeper/bin
[cbcloud@hd204 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[cbcloud@hd204 bin]$ jps
3690 TaskTracker
3614 DataNode
4988 Jps
[cbcloud@hd205 hbconf]$ cd ..
[cbcloud@hd205 ~]$ cd zookeeper/bin
[cbcloud@hd205 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[cbcloud@hd205 bin]$ jps
3612 DataNode
3688 TaskTracker
4920 Jps
[cbcloud@hd206 ~]$ cd zookeeper
[cbcloud@hd206 zookeeper]$ cd bin
[cbcloud@hd206 bin]$ ./zkServer.sh stop
JMX enabled by default
Using config: /home/cbcloud/zookeeper/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[cbcloud@hd206 bin]$ jps
4931 Jps
3693 TaskTracker
3617 DataNode
Step 4: stop Hadoop
[cbcloud@hd202 bin]$ ./stop-all.sh
stopping jobtracker
hd205: stopping tasktracker
hd204: stopping tasktracker
hd206: stopping tasktracker
stopping namenode
hd205: stopping datanode
hd206: stopping datanode
hd204: stopping datanode
hd203: stopping secondarynamenode
15. The order in which the cluster is started is exactly the opposite of the shutdown order above (a consolidated sketch of the commands is shown after this list)
Step 1: start Hadoop
Step 2: start ZooKeeper on each DataNode node
Step 3: start HMaster on the NameNode
Step 4: start HMaster on the SecondaryNameNode