#
# install a hadoop 2.6.0 fully distributed cluster
#
# File and system version:
#
Hadoop-2.6.0
Java version 1.8.0_77
CentOS 64-bit
# preparation
#
Under /home/hadoop/: mkdir Cloud
Put the Java and Hadoop installation packages under /home/hadoop/Cloud
# configure static ip
#
master   192.168.116.100
slave1   192.168.116.110
slave2   192.168.116.120
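The article gives no commands for this step. On CentOS a static address is usually set in the interface configuration file; the following is a minimal sketch for the master node, in which the device name eth0, the netmask and the gateway are assumptions that must match your own network:
su root
vim /etc/sysconfig/network-scripts/ifcfg-eth0
BOOTPROTO=static              # static addressing instead of DHCP
ONBOOT=yes
IPADDR=192.168.116.100        # use .110 on slave1 and .120 on slave2
NETMASK=255.255.255.0
GATEWAY=192.168.116.1         # assumed gateway
Then restart networking, e.g. service network restart.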
# modify the machine hostnames (all as root)
#
su root
vim /etc/hosts
Append the following below the existing entries (IP and hostname separated by a space or tab):
192.168.116.100 master
192.168.116.110 slave1
192.168.116.120 slave2
On master: vim /etc/hostname and enter:
master
shutdown -r now   (restart the machine)
On slave1: vim /etc/hostname and enter:
slave1
shutdown -r now
On slave2: vim /etc/hostname and enter:
slave2
shutdown -r now
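After the restarts, a quick sanity check (plain commands, nothing beyond what was configured above) confirms the hostnames and the /etc/hosts resolution:
hostname                 # should print master, slave1 or slave2
ping -c 1 slave1         # verifies that the names in /etc/hosts resolve
ping -c 1 slave2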
# install openssh
#
su root
yum install openssh
Then, as the hadoop user on each node:
ssh-keygen -t rsa
Press Enter at every prompt to accept the defaults.
Send the public keys of slave1 and slave2 to master:
On slave1: scp /home/hadoop/.ssh/id_rsa.pub hadoop@master:~/.ssh/slave1.pub
On slave2: scp /home/hadoop/.ssh/id_rsa.pub hadoop@master:~/.ssh/slave2.pub
On master: cd ~/.ssh/
cat id_rsa.pub >> authorized_keys
cat slave1.pub >> authorized_keys
cat slave2.pub >> authorized_keys
Send the merged key file back to slave1 and slave2:
scp authorized_keys hadoop@slave1:~/.ssh/
scp authorized_keys hadoop@slave2:~/.ssh/
Test the logins:
ssh slave1
ssh slave2
ssh master
Answer yes when prompted to accept each host key.
At this point, passwordless SSH login is configured.
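If ssh still prompts for a password, the usual culprit is permissions: sshd refuses keys whose directory or file is too open. A minimal fix, run as the hadoop user on every node:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys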
#
# set JAVA_HOME and HADOOP_HOME
#
su root
vim /etc/profile
Append:
export JAVA_HOME=/home/hadoop/Cloud/jdk1.8.0_77
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/home/hadoop/Cloud/hadoop-2.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then run: source /etc/profile
(configure this on all three nodes)
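To confirm the variables took effect on each node, a few simple checks (standard commands, nothing specific to this setup is assumed):
source /etc/profile
java -version            # should report 1.8.0_77
hadoop version           # should report Hadoop 2.6.0
echo $HADOOP_HOME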
#
# configure the hadoop files
#
Under /home/hadoop/Cloud/hadoop-2.6.0/sbin:
vim hadoop-daemon.sh
Modify the pid path.
vim yarn-daemon.sh
Modify the pid path.
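The pid path these two scripts use comes from an environment variable, so one way to change it is to export the variable near the top of each script (or in hadoop-env.sh / yarn-env.sh). A sketch, where the workspace/pids directory is an assumption and any writable path outside /tmp works:
export HADOOP_PID_DIR=/home/hadoop/Cloud/workspace/pids    # read by hadoop-daemon.sh
export YARN_PID_DIR=/home/hadoop/Cloud/workspace/pids      # read by yarn-daemon.sh
mkdir -p /home/hadoop/Cloud/workspace/pids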
Under /home/hadoop/Cloud/hadoop-2.6.0/etc/hadoop/:
vim slaves and enter:
master
slave1
slave2
vim hadoop-env.sh and add:
export JAVA_HOME=/home/hadoop/Cloud/jdk1.8.0_77
export HADOOP_HOME_WARN_SUPPRESS="TRUE"
vim core-site.xml and enter:
# core
<configuration>
  <property>
    <name>io.native.lib.available</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/Cloud/workspace/temp</value>
  </property>
</configuration>
# core
vim hdfs-site.xml and enter:
# hdfs
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/Cloud/workspace/hdfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/Cloud/workspace/hdfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
# hdfs
vim mapred-site.xml and enter:
## mapred
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>
## mapred
Send the configured Hadoop to slave1 and slave2:
scp -r hadoop-2.6.0 hadoop@slave1:~/Cloud/
scp -r hadoop-2.6.0 hadoop@slave2:~/Cloud/
Send the Java package to slave1 and slave2:
scp -r jdk1.8.0_77 hadoop@slave1:~/Cloud/
scp -r jdk1.8.0_77 hadoop@slave2:~/Cloud/
At this point, the Hadoop cluster configuration is complete.
#
# you can now start hadoop
#
Format the NameNode first:
hadoop namenode -format   (because of the hadoop-env.sh and environment settings made earlier, this can be run from any directory; in Hadoop 2.x the equivalent hdfs namenode -format is the preferred form)
If the log shows no errors, continue:
start-all.sh
Then verify the daemons with jps:
[hadoop@master ~]$ jps
42306 ResourceManager
42407 NodeManager
42151 SecondaryNameNode
41880 NameNode
41979 DataNode
[hadoop@slave1 ~]$ jps
21033 NodeManager
20926 DataNode
[hadoop@slave2 ~]$ jps
20568 NodeManager
20462 DataNode
At this point, the hadoop-2.6.0 fully distributed configuration is complete.
The Hadoop browser ports are:
master:50070   (HDFS NameNode web UI)
master:8088    (YARN ResourceManager web UI)
#
# configure the C API (libhdfs) connection to HDFS
#
find / -name libhdfs.so.0.0.0
vi /etc/ld.so.conf
Add:
/home/hadoop/Cloud/hadoop-2.6.0/lib/native/
/home/hadoop/Cloud/jdk1.8.0_77/jre/lib/amd64/server/
Then refresh the dynamic loader cache:
/sbin/ldconfig -v
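You can confirm that the loader now finds both libraries by querying the cache:
/sbin/ldconfig -p | grep libhdfs
/sbin/ldconfig -p | grep libjvm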
Then configure the environment variables.
Find and print the jar paths:
find /home/hadoop/Cloud/hadoop-2.6.0/share/ -name "*.jar" | awk '{printf("export CLASSPATH=%s:$CLASSPATH\n", $0);}'
You will see output such as:
export CLASSPATH=/home/hadoop/Cloud/hadoop-2.6.0/share/hadoop/common/lib/activation-1.1.jar:$CLASSPATH
export CLASSPATH=/home/hadoop/Cloud/hadoop-2.6.0/share/hadoop/common/lib/jsch-0.1.42.jar:$CLASSPATH
...
Add everything printed to the environment variables with vim /etc/profile.
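Instead of pasting every line by hand, the same output can be appended directly; a sketch, assuming you edit /etc/profile as root and accept a long literal CLASSPATH:
find /home/hadoop/Cloud/hadoop-2.6.0/share/ -name "*.jar" | awk '{printf("export CLASSPATH=%s:$CLASSPATH\n", $0);}' >> /etc/profile
source /etc/profile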
Then write some C code to verify that the configuration works:
vim above_sample.c
The code is as follows:
#
#include "hdfs.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv) {
    hdfsFS fs = hdfsConnect("192.168.116.100", 9000);   /* a small modification was made here */
    const char *writePath = "/tmp/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY | O_CREAT, 0, 0, 0);
    if (!writeFile) {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }
    char *buffer = "Hello, World!";
    tSize num_written_bytes = hdfsWrite(fs, writeFile, (void *)buffer, strlen(buffer) + 1);
    if (hdfsFlush(fs, writeFile)) {
        fprintf(stderr, "Failed to 'flush' %s\n", writePath);
        exit(-1);
    }
    hdfsCloseFile(fs, writeFile);
    return 0;
}
#
Compile the C code:
gcc above_sample.c -I /home/hadoop/Cloud/hadoop-2.6.0/include/ -L /home/hadoop/Cloud/hadoop-2.6.0/lib/native/ -lhdfs /home/hadoop/Cloud/jdk1.8.0_77/jre/lib/amd64/server/libjvm.so -o above_sample
Run the generated above_sample binary:
./above_sample
Check the program output and whether testfile.txt now exists in HDFS.
At this point, the C API connection to HDFS is configured.
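To confirm the write from the C program, the file can also be inspected with the ordinary HDFS shell while the cluster is running:
hadoop fs -ls /tmp
hadoop fs -cat /tmp/testfile.txt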
#
# File operations on the cluster
#
# auto.sh (automatic distribution script)
vim auto.sh
chmod +x auto.sh
Usage example: ./auto.sh jdk1.8.0_77 ~/Cloud/
The distribution script itself:
##
#!/bin/bash
nodes=(slave1 slave2)
num=${#nodes[@]}
file=$1
dst_path=$2
for ((i=0; i<num; i++)); do
    scp -r $file hadoop@${nodes[$i]}:$dst_path
done
##
#
# HDFS file operations
#
Create a test file, e.g.: echo "hello hadoop" > test1.txt
Create an input directory and import files from the current directory into HDFS's /in directory:
hadoop dfs -mkdir /in
hadoop dfs -put test1.txt /in
hadoop dfs -ls /in/*
hadoop dfs -cp /in/test1.txt /in/test1.txt.bak
hadoop dfs -ls /in/*
hadoop dfs -rm /in/test1.txt.bak
Export from /in to a local directory dir_from_hdfs:
mkdir dir_from_hdfs
hadoop dfs -get /in/* dir_from_hdfs
Count the words (separated by spaces) in all text files in the /in directory (note that the output directory /output/wordcount must not already exist):
cd /home/hadoop/Cloud/hadoop-2.6.0
hadoop jar hadoop-examples-2.6.0.jar wordcount /in /output/wordcount
View the result:
hadoop fs -cat /output/wordcount/part-r-00000
#
# Management
#
1. Cluster management basics:
Edit log: when a file system client performs a write, the operation is first recorded in the edit log. After the edit log has been written, the NameNode updates its in-memory data structures. The edit log is synced to the file system before each write is reported as successful.
fsimage: the namespace image, a checkpoint of the in-memory metadata on disk. When the NameNode fails, the metadata of the latest checkpoint is loaded from the fsimage into memory, and the operations recorded in the edit log are then replayed. The Secondary NameNode helps the NameNode checkpoint the in-memory metadata to disk.
2. Cluster characteristics:
Advantages:
1) It can handle very large files.
2) Streaming data access. HDFS handles "write once, read many times" workloads well: once a dataset is generated, it is copied to different storage nodes and then serves a variety of analysis requests. In most cases an analysis task touches most of the data in the dataset, so reading the entire dataset is more efficient for HDFS than reading a single record.
Disadvantages:
1) Not suitable for low-latency data access: HDFS is designed for large-scale data analysis, so latency may be high.
2) Cannot store a large number of small files efficiently: because the NameNode keeps the file system metadata in memory, the number of files the file system can hold is limited by the NameNode's memory size.
3) Multiple writers and arbitrary modification of files are not supported: an HDFS file has only one writer at a time, and writes can only append at the end of the file; multiple users writing to the same file, or modifying it at arbitrary positions, is not supported.