Environment preparation
Supported platforms
GNU/Linux is supported as a development and production platform. Hadoop has been demonstrated on GNU/Linux clusters of up to 4000 nodes.
The Win32 platform is supported as a development platform. Distributed operation has not been fully tested on Win32, so it is not supported as a production platform.
Required software
The software required for both Linux and Windows includes:
Java 1.5.x or later must be installed; the Java version distributed by Sun is recommended.
ssh must be installed and sshd must be kept running so that the Hadoop scripts can manage the remote Hadoop daemons.
Additional software requirements under Windows:
Cygwin - provides shell support in addition to the software listed above.
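Before installing anything, you can quickly check what is already present; a minimal sketch for Ubuntu (the package and service names are assumptions for this environment):
$ java -version                   # should report 1.5+ (this article uses 1.7)
$ dpkg -l | grep openssh-server   # is the ssh server package installed?
$ service ssh status              # is sshd running?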
Installation steps
This article uses Ubuntu as the test environment. Since this is only a test setup, no separate Hadoop user is created; everything is deployed under the current user.
Install software
If the required software is not yet installed on your cluster, install it first.
Update the apt-get sources
$ sudo apt-get update
Install the Java environment
This article uses JDK 1.7.
There are two options. The first is to use OpenJDK and install it directly with apt-get:
$ sudo apt-get install -y openjdk-7-jdk
$ export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
The second is to download the JDK from Oracle's website and extract it:
http://www.oracle.com/technetwork/java/javase/archive-139210.html
Then set JAVA_HOME.
In this article, JAVA_HOME=/usr/local/jdk:
lrwxrwxrwx 1 root root   22 Jun 22 10:20 jdk -> /usr/local/jdk1.7.0_80/
drwxr-xr-x 8 uucp      4096 Apr 11  2015 jdk1.7.0_80/
The environment variables can also be added to ~/.bash_profile.
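If you take the Oracle JDK route, a minimal sketch of extracting the archive and persisting the variables, assuming the jdk-7u80 tarball and the /usr/local paths shown above:
$ sudo tar xzvf jdk-7u80-linux-x64.tar.gz -C /usr/local
$ sudo ln -s /usr/local/jdk1.7.0_80 /usr/local/jdk
# append the variables to ~/.bash_profile so they survive new shells
$ echo 'export JAVA_HOME=/usr/local/jdk' >> ~/.bash_profile
$ echo 'export PATH=$JAVA_HOME/bin:$PATH' >> ~/.bash_profile
$ source ~/.bash_profile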
Configure the SSH environment
Install the SSH server and client:
$ sudo apt-get install -y openssh-server
Start the SSH service
$ sudo service ssh start
Configure passwordless login:
$ ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys
Test passwordless login:
$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:8PGiorJvZpfFOJkMax6qVaSG8KyRRNnVJGjhNqVqh/k.
Are you sure you want to continue connecting (yes/no)? yes
$ exit
Install Hadoop
$ cd /usr/local
$ sudo wget http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz
$ sudo tar xzvf hadoop-2.6.4.tar.gz
$ sudo ln -s hadoop-2.6.4 hadoop
# change the directory owner to the current user
$ sudo chown -R XXXXX hadoop*
Configuration
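Before starting any daemons, Hadoop also needs to know where Java lives. A minimal sketch of pointing etc/hadoop/hadoop-env.sh at the JDK installed earlier (the /usr/local/jdk path is the assumption carried over from above):
$ cd /usr/local/hadoop
# replace the JAVA_HOME line in hadoop-env.sh with the JDK path used in this article
$ sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/jdk|' etc/hadoop/hadoop-env.sh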
Configure pseudo-distributed:
Modify etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Modify etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Start Hadoop
$ bin/hdfs namenode -format
$ sbin/start-dfs.sh
# view the running processes
$ jps
429 SecondaryNameNode
172 NameNode
1523 Jps
286 DataNode
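If the daemons are up, a quick sanity check of HDFS (a sketch; the exact report depends on your machine):
# the report should show one live datanode and non-zero configured capacity
$ bin/hdfs dfsadmin -report
$ bin/hdfs dfs -ls /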
Namenode web address: http://localhost:50070/
You can run the following commands to test it:
# create input files
$ mkdir input
$ echo "Hello Docker" > input/file2.txt
$ echo "Hello Hadoop" > input/file1.txt
# create input directory on HDFS
$ hadoop fs -mkdir -p input
# put input files to HDFS
$ hdfs dfs -put ./input/* input
# run wordcount
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount input output
# print the input files
$ echo -e "\ninput file1.txt:"
$ hdfs dfs -cat input/file1.txt
$ echo -e "\ninput file2.txt:"
$ hdfs dfs -cat input/file2.txt
# print the output of wordcount
$ echo -e "\nwordcount output:"
$ hdfs dfs -cat output/part-r-00000
The following message may appear while debugging:
WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
According to the available information, this warning occurs when a file is closed while it is still being read ahead; it can also be caused by other bugs. It is ignored here.
You can also temporarily disable readahead with mapreduce.ifile.readahead=false:
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount -D mapreduce.ifile.readahead=false input output
MapReduce is currently running in local mode. To run jobs on YARN instead, you need to configure and start the YARN service.
Single-node YARN
Modify etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Modify etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Start the service:
$ sbin/start-yarn.sh
You can now rerun the example from the previous step.
You can check the ResourceManager web UI at http://localhost:8088/.
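To confirm that the job really went through YARN rather than local mode, you can also list applications from the command line; a small sketch (a fresh output directory name is used because the previous run already created output):
# rerun wordcount against a new output directory, then list YARN applications
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount input output2
$ yarn application -list -appStates ALL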
Configure a test environment with Docker
This assumes a Docker environment is already available.
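A quick check that Docker itself is working before building the images (a sketch):
$ docker version
$ docker info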
Download jdk-7u80-linux-x64.tar.gz and place it in the jdk folder, and put the hadoop-2.6.4.tar.gz package in the dist folder.
Directory structure:
./
├── Dockerfile
├── dist
│   └── hadoop-2.6.4.tar.gz
└── jdk
    └── jdk-7u80-linux-x64.tar.gz
Dockerfile for hainiubl/hadoop-node:apache:
FROM ubuntu:latest
MAINTAINER sandy

# install software
RUN apt-get update
RUN apt-get install -y ssh vim openssh-server

ADD jdk/jdk-7u80-linux-x64.tar.gz /usr/local
RUN ln -s /usr/local/jdk1.7.0_80 /usr/local/jdk && rm -rf /usr/local/jdk-7u80-linux-x64.tar.gz

# install hadoop
ADD dist/hadoop-2.6.4.tar.gz /usr/local/
RUN ln -s /usr/local/hadoop-2.6.4 /usr/local/hadoop

ENV JAVA_HOME=/usr/local/jdk
ENV HADOOP_HOME=/usr/local/hadoop
ENV HADOOP_MAPRED_HOME=$HADOOP_HOME
ENV HADOOP_COMMON_HOME=$HADOOP_HOME
ENV HADOOP_HDFS_HOME=$HADOOP_HOME
ENV HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
ENV YARN_HOME=$HADOOP_HOME
ENV YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
ENV PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# ssh without key
RUN ssh-keygen -t rsa -f ~/.ssh/id_rsa -P '' && \
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Build:
$ docker build -t hainiubl/hadoop-node:apache ./
Directory structure:
./
├── Dockerfile
├── config
│   ├── core-site.xml
│   ├── hadoop-env.sh
│   ├── hdfs-site.xml
│   ├── mapred-site.xml
│   ├── run-wordcount.sh
│   └── yarn-site.xml
Dockerfile for hainiubl/hadoop-pseudo:
FROM hainiubl/hadoop-node:apache
MAINTAINER sandy

ADD config/* /root/
RUN mv /root/core-site.xml $HADOOP_HOME/etc/hadoop/ && mv /root/hadoop-env.sh $HADOOP_HOME/etc/hadoop/
RUN chmod +x ~/run-wordcount.sh && \
    chmod +x $HADOOP_HOME/sbin/start-dfs.sh && \
    chmod +x $HADOOP_HOME/sbin/start-yarn.sh

CMD ["sh", "-c", "/etc/init.d/ssh start; bash"]
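The run-wordcount.sh script copied in above is not reproduced in the source; one possible sketch, simply mirroring the manual wordcount steps from the pseudo-distributed section (its actual contents are an assumption):
#!/bin/bash
# hypothetical run-wordcount.sh: prepare input, run the example job, print the result
mkdir -p input
echo "Hello Docker" > input/file2.txt
echo "Hello Hadoop" > input/file1.txt
hadoop fs -mkdir -p input
hdfs dfs -put ./input/* input
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar wordcount input output
echo -e "\nwordcount output:"
hdfs dfs -cat output/part-r-00000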
Build:
$ docker build -t hainiubl/hadoop-pseudo:apache ./
Start the docker node:
$ docker run -itd -p 50070:50070 -p 8088:8088 --name hadoop-pseudo hainiubl/hadoop-pseudo:apache
$ docker exec -it hadoop-pseudo sh -c "/usr/local/hadoop/bin/hdfs namenode -format && /usr/local/hadoop/sbin/start-dfs.sh && bash"
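Once HDFS is formatted and started inside the container, the example can be run there as well; a sketch, assuming the hypothetical run-wordcount.sh above was shipped to /root by the Dockerfile:
# run the wordcount example inside the running container
$ docker exec -it hadoop-pseudo sh -c "cd /root && bash run-wordcount.sh"
# the NameNode UI is then reachable from the host at http://localhost:50070/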