This article shows you how to install HDFS. The content is easy to follow and clearly organized; we hope it helps resolve your doubts as you study it.
1. System environment
Install Java (version 1.6 or later)
Install ssh and rsync and start the ssh service
Download the pre-compiled Hadoop binary package from: http://www.apache.org/dyn/closer.cgi/hadoop/common/ (a command sketch follows the environment description below)
Environment description:
Linux distribution: CentOS 6.5 (64-bit; newer Hadoop releases only provide 64-bit packages)
Hadoop version: 2.5.1
Java version: 1.7.0_67
Three virtual machines are used: namenode (192.168.59.103), datanode1 (192.168.59.104) and datanode2 (192.168.59.105).
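The following is a minimal sketch of preparing one node, assuming CentOS 6.5, a hadoop user, the ~/local layout used throughout this article, and the JDK (Oracle 1.7.0_67 here) unpacked separately to ~/local/jdk; the mirror URL and package names may need adjusting.
# install ssh and rsync, and start the ssh service
$ sudo yum install -y openssh-server openssh-clients rsync
$ sudo service sshd start
# download and unpack the pre-built Hadoop 2.5.1 package into ~/local/hadoop
$ mkdir -p ~/local && cd ~/local
$ wget http://archive.apache.org/dist/hadoop/common/hadoop-2.5.1/hadoop-2.5.1.tar.gz
$ tar -xzf hadoop-2.5.1.tar.gz && mv hadoop-2.5.1 hadoop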
2. Configure SSH
1. Generate ssh key (both namenode and datanode machines need to be configured as follows)
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 644 ~/.ssh/authorized_keys   # this step is needed on CentOS, but not on Ubuntu
2. If you can log in directly with the following command, password-free login has been set up successfully.
$ ssh localhost
3. Generate an auto-login ssh key on the two datanode machines in the same way, then copy the namenode's id_dsa.pub to the .ssh directory of datanode1 and datanode2.
# on the namenode machine, in the ~/.ssh directory:
$ scp id_dsa.pub hadoop@datanode1:~/.ssh/id_dsa.pub.namenode
$ scp id_dsa.pub hadoop@datanode2:~/.ssh/id_dsa.pub.namenode
# in the ~/.ssh directory of datanode1 and datanode2:
$ cat id_dsa.pub.namenode >> authorized_keys
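As an alternative to copying and appending the key by hand, the ssh-copy-id helper (shipped with openssh-clients on CentOS) can achieve the same thing; a sketch, run on the namenode as the hadoop user:
$ ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@datanode1
$ ssh-copy-id -i ~/.ssh/id_dsa.pub hadoop@datanode2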
4. Verify that you can log in to datanode1 and datanode2 via ssh without a password; output like the following indicates that the configuration is successful.
[hadoop@namenode .ssh]$ ssh datanode1
Last login: Sun Nov 30 11:03:52 2014 from 192.168.59.103
[hadoop@datanode1 ~]$ exit
logout
Connection to datanode1 closed.
[hadoop@namenode .ssh]$ ssh datanode2
Last login: Sun Nov 30 11:03:15 2014 from localhost.localdomain
[hadoop@datanode2 ~]$ exit
logout
Connection to datanode2 closed.

3. Configure basic environment variables
On namenode, datanode1 and datanode2, configure the following in ~/.bash_profile (in each user's home directory):
# java environment variables
export JAVA_HOME=/home/hadoop/local/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

# hadoop environment variables
export HADOOP_DEV_HOME=/home/hadoop/local/hadoop
export PATH=$PATH:$HADOOP_DEV_HOME/bin
export PATH=$PATH:$HADOOP_DEV_HOME/sbin
export HADOOP_MAPARED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export CLASSPATH=.:$JAVA_HOME/lib:$HADOOP_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
After the configuration is complete, log in again to load the environment variables (or run source ~/.bash_profile); you can check whether they are configured properly with the following commands.
[hadoop@namenode ~]$ java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
[hadoop@namenode ~]$ hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

4. Configure hadoop
After the namenode is configured, you can scp the configuration directly to the other datanodes, which keeps the machine configurations consistent.
$ cd ~/local/hadoop/etc/hadoop
All the configuration files are here.
Open hadoop-env.sh and configure the jdk environment variable:
# replace export JAVA_HOME=${JAVA_HOME} with:
export JAVA_HOME=/home/hadoop/local/jdk
Configure yarn-env.sh:
# uncomment export JAVA_HOME=/home/y/libexec/jdk1.6.0/ and replace it with:
export JAVA_HOME=/home/hadoop/local/jdk
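If you prefer to make these two edits non-interactively, a sed sketch like the following could be used (the patterns assume the files as shipped with Hadoop 2.5.1 and the JDK path used above):
$ cd ~/local/hadoop/etc/hadoop
# point hadoop-env.sh at the local JDK
$ sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/home/hadoop/local/jdk|' hadoop-env.sh
# uncomment and point yarn-env.sh at the local JDK
$ sed -i 's|^# export JAVA_HOME=.*|export JAVA_HOME=/home/hadoop/local/jdk|' yarn-env.sh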
Configure the slaves file (the list of datanode machines for HDFS):
[hadoop@namenode hadoop]$ cat slaves
namenode
datanode1
datanode2
Configure core-site.xml (Hadoop's core configuration file):
[hadoop@namenode hadoop]$ cat core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
Configure hdfs-site.xml (the configuration file for HDFS):
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>namenode:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hadoop/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>storage copy number</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
At this point, the HDFS distributed file system is configured; synchronize the hadoop-env.sh, yarn-env.sh, core-site.xml and hdfs-site.xml edited above to the datanode machines.
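For example, the edited files could be pushed from the namenode with scp; a sketch, assuming the host names and directory layout used above:
$ cd ~/local/hadoop/etc/hadoop
$ for host in datanode1 datanode2; do scp hadoop-env.sh yarn-env.sh core-site.xml hdfs-site.xml hadoop@$host:~/local/hadoop/etc/hadoop/; done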
5. Start the service
Use the following commands to do so:
[hadoop@namenode hadoop]$ cd /home/hadoop/local/hadoop
[hadoop@namenode hadoop]$ bin/hdfs namenode -format
[hadoop@namenode hadoop]$ sbin/start-dfs.sh
The second command formats the namenode of the entire HDFS cluster. After formatting, the following directory structure appears under the name directory configured for hadoop (dfs.namenode.name.dir):
[hadoop@namenode hadoop]$ tree name
name
├── current
│   ├── edits_inprogress_0000000000000000001
│   ├── fsimage_0000000000000000000
│   ├── fsimage_0000000000000000000.md5
│   ├── seen_txid
│   └── VERSION
└── in_use.lock
In this directory, there are two important files: fsimage and edits.
The fsimage file is an image containing the inode information for every directory and file in the HDFS file system. For files it includes block descriptions, modification time, access time, and so on; for directories it includes modification time, access control information (owner, group, etc.), and so on.
The edits file records the update operations applied to HDFS since the NameNode was started; every write operation performed by an HDFS client is recorded in the edits file.
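To peek inside these files, Hadoop ships offline viewers: hdfs oiv for the fsimage and hdfs oev for the edits log. A sketch, with input file names taken from the tree above and output paths chosen arbitrarily:
$ cd /home/hadoop/hadoop/name/current
# dump the fsimage as XML
$ hdfs oiv -p XML -i fsimage_0000000000000000000 -o /tmp/fsimage.xml
# dump the edits log as XML
$ hdfs oev -i edits_inprogress_0000000000000000001 -o /tmp/edits.xml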
After starting DFS, you can see the following directory structure in the data directory we configured (dfs.datanode.data.dir):
[hadoop@namenode hadoop]$ tree data
data
├── current
│   ├── BP-441758184-192.168.59.103-1417330891399
│   │   ├── current
│   │   │   ├── finalized
│   │   │   ├── rbw
│   │   │   └── VERSION
│   │   ├── dncp_block_verification.log.curr
│   │   └── tmp
│   └── VERSION
└── in_use.lock
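Before putting data in, it can also be worth confirming that the daemons are running; jps and hdfs dfsadmin -report are the usual checks (the process names in the comments are what this layout would be expected to show, not captured output):
$ jps                    # on the namenode: NameNode, SecondaryNameNode, and DataNode (namenode is also listed in slaves)
$ ssh datanode1 jps      # on each datanode: DataNode
$ hdfs dfsadmin -report  # should list the live datanodes and their capacity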
After executing a put command, you can see how a file is stored under data, as follows:
$ hadoop fs -put etc/hadoop/core-site.xml /data/input
[hadoop@namenode hadoop]$ tree data
data
├── current
│   ├── BP-441758184-192.168.59.103-1417330891399
│   │   ├── current
│   │   │   ├── finalized
│   │   │   │   ├── blk_1073741827
│   │   │   │   └── blk_1073741827_1003.meta
│   │   │   ├── rbw
│   │   │   │   ├── blk_1073741825
│   │   │   │   ├── blk_1073741825_1001.meta
│   │   │   │   ├── blk_1073741826
│   │   │   │   └── blk_1073741826_1002.meta
│   │   │   └── VERSION
│   │   ├── dncp_block_verification.log.curr
│   │   ├── dncp_block_verification.log.prev
│   │   └── tmp
│   └── VERSION
└── in_use.lock

That is all the content of "how to install HDFS". Thank you for reading! We hope the material shared here has been helpful; if you want to learn more, welcome to follow the industry information channel!