Hadoop common commands


Network configuration

hostname shows the current hostname.

vim /etc/sysconfig/network sets the hostname.

ifconfig checks the current IP configuration.

vim /etc/sysconfig/network-scripts/ifcfg-eth0 configures the network interface:

DEVICE= "eth0" interface name (device, network card)

Configuration method of BOOTPROTO=STATIC IP (static: fixed IP,dhcp:,none: manual)

Is the network port valid when the ONBOOT=yes system starts?

IPADDR=192.168.1.2 IP URL

GATEWAY=192.168.1.0 Gateway

DNS1=8.8.8.8 DNS server
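
Taken together, a minimal ifcfg-eth0 for the master might look like this (a sketch using the sample addresses above; NETMASK is an assumption, adjust it for your subnet):

DEVICE="eth0"
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.2
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=8.8.8.8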

service network restart restarts the network service.

service network start starts the network service.

service network stop stops the network service.

ifconfig eth0 up|down brings the given NIC up or down.

ifconfig checks whether the configured IP settings have taken effect.

vim /etc/hosts sets the hostname-to-IP mappings:

192.168.1.2 master

192.168.1.3 slave1

192.168.1.4 slave2

ping master verifies that the mapping works.

service iptables stop turns off the firewall.

chkconfig iptables off disables the firewall service at boot.

Configure SSH

rpm -qa | grep openssh checks whether the SSH service is installed.

rpm -qa | grep rsync checks whether rsync is installed.

yum install openssh-server installs the SSH service.

yum install rsync installs rsync, a remote data synchronization tool.

service sshd restart restarts the sshd service.

ssh-keygen -t rsa -P '' generates a key pair (stored under /home/hadoop/.ssh).

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys appends the public key to the authorized keys file.

chmod 600 ~/.ssh/authorized_keys restricts the file to owner read/write (sshd rejects authorized_keys files with looser permissions).

vim /etc/ssh/sshd_config modifies the sshd service configuration:

RSAAuthentication yes  # enable RSA authentication

PubkeyAuthentication yes  # enable public key authentication

AuthorizedKeysFile .ssh/authorized_keys  # public key file path (the same file generated above)

service sshd restart restarts the sshd service so the changes take effect.

ssh master verifies SSH login (a password is still required the first time).

One-to-many passwordless SSH login

ssh-keygen

ssh-copy-id storm@slave1 (the format is "ssh-copy-id username@hostname")

ssh-copy-id storm@slave2 copies the local public key into the remote machine's authorized_keys file.
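
With many slaves, the same ssh-copy-id call can be wrapped in a loop (a sketch, assuming the user storm and the hostnames slave1 through slave100 used in the examples above):

for i in $(seq 1 100)
do
    ssh-copy-id storm@slave$i
done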

Install JDK

Log in as the root user.

mkdir /usr/java creates the /usr/java directory.

cp /root/Downloads/jdk-6u31-linux-i586.bin /usr/java copies the installer.

chmod +x jdk-6u31-linux-i586.bin makes the installer executable.

./jdk-6u31-linux-i586.bin runs the self-extracting bin file.

rm -rf jdk-6u31-linux-i586.bin deletes the JDK installation file.

vim /etc/profile

Add the following at the end:

# set java environment
export JAVA_HOME=/usr/java/jdk1.6.0_31
export JRE_HOME=$JAVA_HOME/jre   # JRE_HOME must be defined for the two lines below
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

source /etc/profile makes the profile changes take effect.

java -version verifies that the JDK is installed successfully.
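
If the installation succeeded, the output should look roughly like this (the exact build string varies):

java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-bXX)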

Install on the remaining machines:

scp -r /usr/java/jdk1.6.0_31/ hadoop@slave1:/usr/java copies the JDK directory to a slave (scp needs -r to copy a directory).

Or install on all of them with a shell script:

for i in $(seq 1 100)
do
    echo slave$i
    scp -r /usr/java/jdk1.6.0_31/ hadoop@slave$i:/usr/java
done

The /etc/profile environment settings must likewise be configured on, or copied to, every machine in the cluster.
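
The same loop pattern can distribute it (a sketch; it assumes root access on the slaves, since /etc/profile is owned by root, and each machine still needs source /etc/profile or a re-login before the settings apply):

for i in $(seq 1 100)
do
    scp /etc/profile root@slave$i:/etc/profile
done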

Install the Hadoop cluster

Log in as the root user.

cp /root/Downloads/hadoop-1.0.0.tar.gz /usr

cd /usr

tar -zxvf hadoop-1.0.0.tar.gz unpacks the tar.gz installation package.

mv hadoop-1.0.0 hadoop renames the extracted folder (the rename applies to the unpacked directory, not the tarball).

chown -R hadoop:hadoop hadoop reassigns ownership of the hadoop folder; -R is recursive, assigning the folder to user hadoop in group hadoop.

rm -rf hadoop-1.0.0.tar.gz deletes the installation file.

Configure the environment variables for Hadoop

vim /etc/profile

export HADOOP_HOME=/usr/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

source /etc/profile makes the configuration take effect.

Configure hadoop

Configure hadoop-env.sh (the file is located in /usr/hadoop/conf)

vim /usr/hadoop/conf/hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.6.0_31

Configure the core-site.xml file

mkdir /usr/hadoop/tmp creates the tmp folder that holds Hadoop's temporary data.

vim /usr/hadoop/conf/core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.2:9000</value>
  </property>
</configuration>

(Note: create the tmp folder under /usr/hadoop first. By default hadoop.tmp.dir points to the system temporary directory /tmp/hadoop-hadoop, which is wiped on every reboot, after which the namenode must be re-formatted or errors will occur.)

Configure hdfs-site.xml. The default replication factor is 3.

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

(Note: dfs.replication is the number of copies kept of each data block. With the default of 3, HDFS reports an error when fewer than 3 datanodes hold the data, so it is set to 1 here.)

Configure mapred-site.xml

Modify Hadoop's MapReduce configuration file to set the JobTracker's address and port:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.1.2:9001</value>
  </property>
</configuration>

Configure masters

Modify the /usr/hadoop/conf/masters file to specify the hostname of the master machine.

vim /usr/hadoop/conf/masters

192.168.1.2 (or master)

Configure slave

vim /usr/hadoop/conf/slaves

slave1

slave2

Note: when starting a stand-alone machine, conf/slaves must not be empty. If there is no other machine, point it at the machine itself.

In a cluster environment, the slaves file does not need to be configured on the slave machines.

Repeat this configuration on the other machines in the cluster.

It is recommended to copy the files to the corresponding directories on the other machines via scp as the ordinary hadoop user. Step 6 applies only to the master machine.

Using a shell script:

for i in $(seq 1 100)
do
    echo slave$i
    scp -r /usr/hadoop hadoop@slave$i:/usr
    scp /etc/profile hadoop@slave$i:/etc
done

After copying the files, you may find that the hadoop directory is owned by root.

chown -R hadoop:hadoop /usr/hadoop reassigns it to the hadoop user.

Hadoop startup-related commands:

hadoop namenode -format formats the namenode; run it on the master machine.

It only needs to be executed once. If you need to run it again, be sure to first delete the files under the path that hadoop.tmp.dir in core-site.xml points to.
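
A sketch of that re-format sequence, using the hadoop.tmp.dir configured earlier:

stop-all.sh
rm -rf /usr/hadoop/tmp/*
hadoop namenode -format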

service iptables stop turns off the firewall; do this on every machine in the cluster:

for i in $(seq 1 100)
do
    ssh node$i "hostname
    service iptables stop
    chkconfig iptables off
    service iptables status"
done

start-all.sh starts all of Hadoop's services, including those related to HDFS and MapReduce.

As the startup log shows, the namenode starts first, then datanode1, datanode2, ..., then the secondarynamenode; after that the jobtracker starts, followed by tasktracker1, tasktracker2, ...

After Hadoop starts successfully, a dfs folder is generated under the tmp folder on the master, and dfs plus mapred folders are generated under tmp on the slaves.

jps lists the running Java processes.

The result on master is

JobTracker

NameNode

Jps

SecondaryNameNode

The result on slave is

TaskTracker

DataNode

Jps

hadoop dfsadmin -report shows the status of the Hadoop cluster.

hadoop dfsadmin -safemode leave turns off HDFS safe mode.

http://192.168.1.2:50030 is the MapReduce web interface.

http://192.168.1.2:50070 is the HDFS web interface.

The last-resort fix when the cluster refuses to start:

Delete the /usr/hadoop/tmp directory on every machine in the cluster.

Delete the Hadoop pid files on every machine in the cluster. They are stored in /tmp by default and should be owned by the hadoop user.

Run stop-all.sh again, shutting down whatever services can still be shut down.

Run ps -ef | grep java | grep hadoop to check whether any Hadoop-related processes are still running; if so, kill them with kill -9.
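
To run that check across the whole cluster, the earlier loop pattern works here as well (a sketch, assuming the slave1 through slave100 hostnames used above):

for i in $(seq 1 100)
do
    ssh slave$i "ps -ef | grep java | grep hadoop"
done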

Re-format the namenode on the master host.

Run start-all.sh to start Hadoop.

If no errors appear, run hadoop dfsadmin -report to check Hadoop's running state. If only one node shows up, HDFS may still be in safe mode.

Run hadoop dfsadmin -safemode leave on the master to turn off safe mode.

Run hadoop dfsadmin -report again.

Solve the "no datanode to stop" problem

Reason:

Each namenode format will re-create a namenodeId, and the / tmp/dfs/data contains the id,namenode format under the last format to clear the data under the namenode, but not the data under the datanode, so the startup will fail. Before each format, clear all the directories under the tmp.
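
The mismatch can be confirmed by comparing the namespaceID values in the two VERSION files (paths are a sketch, assuming the hadoop.tmp.dir configured earlier):

cat /usr/hadoop/tmp/dfs/name/current/VERSION   # on the master: note the namespaceID
cat /usr/hadoop/tmp/dfs/data/current/VERSION   # on a datanode: compare the namespaceID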

The first method:

Delete the tmp folder on master: rm -rf /usr/hadoop/tmp
