

Hadoop Installation


1. Hadoop installation steps

Copy the Hadoop archive to the /usr/local directory and extract it: tar -zxvf hadoop-3.0.0.tar.gz

Rename the extracted directory to hadoop: mv hadoop-3.0.0 hadoop
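A minimal sketch of this step as one shell session, assuming the tarball was downloaded to /root (that source location is hypothetical; adjust it to wherever your archive lives):

cp /root/hadoop-3.0.0.tar.gz /usr/local/
cd /usr/local
tar -zxvf hadoop-3.0.0.tar.gz   # unpacks into hadoop-3.0.0/
mv hadoop-3.0.0 hadoop          # rename the extracted directory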

1.1. Configure the hostname-to-IP mappings: vim /etc/hosts (a quick resolution check follows the list)

172.26.19.40 hmaster

172.26.19.41 hslave1

172.26.19.42 hslave2

172.26.19.43 hslave3
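Once the file is saved, a quick sanity check from any node confirms the mappings resolve (an optional check, not part of the required steps):

ping -c 1 hmaster      # should resolve to 172.26.19.40
getent hosts hslave1   # prints "172.26.19.41  hslave1"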

1.2. Configure the Java and Hadoop environment variables: vim /etc/profile

# set java environment
export JAVA_HOME=/usr/java/jdk1.8.0_151
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin
export PATH=$PATH:/usr/local/hive/bin
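The exports only apply to new shells; to use them in the current session, reload the profile and verify (a standard step left implicit above):

source /etc/profile
java -version    # should report 1.8.0_151
hadoop version   # confirms the new PATH entries are picked up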

2. Configure Hadoop itself; all of the following files are edited in the /usr/local/hadoop/etc/hadoop directory.

vim hadoop-env.sh (the Hadoop runtime depends on the Java JDK):

export JAVA_HOME=/usr/java/jdk1.8.0_151
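If you prefer not to open an editor, the same line can be appended non-interactively; this is equivalent to the vim edit above, assuming hadoop-env.sh does not already set a competing JAVA_HOME:

echo 'export JAVA_HOME=/usr/java/jdk1.8.0_151' >> /usr/local/hadoop/etc/hadoop/hadoop-env.sh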

Once the NameNode is configured in the files below, Hadoop is ready to start.

2.1. vim core-site.xml (every node needs this file configured)

- fs.defaultFS: the URL the cluster nodes use to communicate; required by all nodes
- hadoop.tmp.dir: the default directory where Hadoop stores its data (if unset, it defaults to the temporary directory /tmp)

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hmaster:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/hadoop</value>
  </property>
</configuration>
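Because every node needs the same core-site.xml, one way to distribute it from the master is a small scp loop; a sketch that assumes the passwordless SSH from step 6 is already in place:

for h in hslave1 hslave2 hslave3; do
  scp /usr/local/hadoop/etc/hadoop/core-site.xml $h:/usr/local/hadoop/etc/hadoop/
done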

3. Hadoop ships with 4 basic default configuration files, each overridden by its site counterpart:

core-default.xml corresponds to core-site.xml
hdfs-default.xml corresponds to hdfs-site.xml
mapred-default.xml corresponds to mapred-site.xml
yarn-default.xml corresponds to yarn-site.xml
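To see which value actually wins (built-in default vs. site file), the stock getconf tool prints the effective setting:

hdfs getconf -confKey fs.defaultFS      # hdfs://hmaster:9000
hdfs getconf -confKey dfs.replication   # 3, once hdfs-site.xml below is applied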

3.1. hdfs-site.xml overrides the default configuration (it only needs to be configured on the NameNode).

- dfs.replication: the number of copies HDFS keeps of each block (3 by default)
- dfs.namenode.heartbeat.recheck-interval: the interval between DataNode health rechecks, in milliseconds
- dfs.permissions.enabled: during testing you can turn off permission checking (otherwise clients without the right permissions cannot access HDFS)

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.heartbeat.recheck-interval</name>
    <value>20000</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
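For context on the recheck interval: by the stock HDFS timeout formula, a DataNode is declared dead after roughly 2 x dfs.namenode.heartbeat.recheck-interval + 10 x dfs.heartbeat.interval. With the 20000 ms recheck above and the default 3 s heartbeat, that is about 2 x 20 s + 10 x 3 s = 70 s.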

3.2. mapred-site.xml (only needs to be configured on the NameNode)

- mapreduce.framework.name associates MapReduce with the YARN resource-scheduling platform (that is, the MapReduce compute engine uses YARN as its scheduler)

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

3.3. yarn-site.xml (only needs to be configured on the NameNode)

- yarn.resourcemanager.hostname: the hostname of the NameNode host, which runs the ResourceManager
- yarn.nodemanager.aux-services and yarn.nodemanager.aux-services.mapreduce_shuffle.class configure the MapReduce shuffle service

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hmaster</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
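With sections 3.2 and 3.3 in place (and the cluster started as in step 4 below), the bundled example job gives a quick end-to-end test of the YARN wiring. The jar path here matches the 3.0.0 layout; adjust it for other versions:

yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar pi 2 10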

4. After the basic configuration is done, format the NameNode first (format only once; re-running it generates a new cluster ID that existing DataNodes will refuse to join):

hdfs namenode -format

start-dfs.sh starts all nodes of the Hadoop cluster.

stop-dfs.sh stops all nodes of the Hadoop cluster.
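The YARN daemons configured in section 3.3 have their own start/stop scripts in sbin (standard Hadoop scripts, not listed in the original steps):

start-yarn.sh   # starts the ResourceManager and the NodeManagers
stop-yarn.sh    # stops them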

Start or stop individual daemons on the NameNode (master) or on a DataNode:

hdfs --daemon start namenode
hdfs --daemon stop namenode
hadoop-daemon.sh start namenode
hadoop-daemon.sh stop namenode
hadoop-daemon.sh start datanode
hadoop-daemon.sh stop datanode

jps to see whether the relevant processes are up.

hdfs dfsadmin -report | more to view the status of the Hadoop cluster.
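A quick write/read round trip confirms HDFS is really accepting data (the /tmp/smoke path is an arbitrary example):

hdfs dfs -mkdir -p /tmp/smoke
echo hello | hdfs dfs -put - /tmp/smoke/hello.txt
hdfs dfs -cat /tmp/smoke/hello.txt   # should print "hello"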

http://172.26.19.40:9870/ is the Hadoop cluster web UI (port 9870 in Hadoop 3.x; 2.x releases used 50070).

hdfs://172.26.19.40:9000/ is the cluster's internal RPC communication endpoint (fs.defaultFS), not a web page.

5. On the NameNode only, in /usr/local/hadoop/etc/hadoop:

Add the hostname of every DataNode to the slaves file, so the cluster scripts can operate on all slave nodes in one batch. (In Hadoop 3.x this file was renamed workers; use whichever name your distribution ships.)

vim slaves

hslave1
hslave2
hslave3

6. Passwordless SSH setup

cd ~ to go to root's home directory, then ll -an to find the .ssh directory.

In the .ssh directory, run ssh-keygen -t rsa to generate the root user's private and public key pair.

Then copy the public key id_rsa.pub to every slave node's .ssh directory.

(When root on master logs in to a slave, the slave uses the copied public key to verify a challenge that only master's private key can answer, so the login succeeds without a password.)

Run ssh-copy-id hslave2 to copy the public key id_rsa.pub into the .ssh directory on hslave2 (this creates or appends to the file authorized_keys).

After doing the same for each slave, master can log in to every slave node over ssh without a password.
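Step 6 condensed into one script run on the master (hostnames from section 1.1; the empty passphrase is a convenience choice for unattended logins):

ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa   # empty passphrase for unattended login
for h in hslave1 hslave2 hslave3; do
  ssh-copy-id root@$h                      # appends the key to each slave's authorized_keys
done
ssh hslave1 hostname                       # should print "hslave1" with no password prompt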
