Hadoop is a software framework for distributed processing of large amounts of data; its storage layer is the Hadoop Distributed File System (HDFS). Hadoop processes data in a reliable, efficient, and scalable way: it assumes that compute and storage elements can fail, so it keeps multiple copies of working data and redistributes processing away from failed nodes. The framework itself is written in Java.
The Hadoop master node runs the NameNode, Secondary NameNode, and JobTracker daemons, along with the utilities and web interfaces used to manage the cluster. Slave nodes run the TaskTracker and DataNode daemons. The master daemons provide cluster management and coordination, while the slave daemons implement HDFS storage and the MapReduce data-processing functions.
The NameNode is the master server of HDFS. It usually runs on its own machine and manages the file system namespace and access to the files stored in the cluster. Each Hadoop cluster has one NameNode and one Secondary NameNode. When a client asks to create a file, the NameNode responds with the block ID and the IP address of the DataNode that will hold the first copy of the block; it also notifies the other DataNodes that will receive replicas of the block.
A Hadoop cluster contains one NameNode and a large number of DataNodes. DataNodes are usually organized in racks, with a switch connecting all the systems in a rack. DataNodes respond to read and write requests from HDFS clients, and to commands from the NameNode to create, delete, and replicate blocks.
JobTracker is a master service. Once started, it receives submitted jobs, schedules each subtask of a job onto a TaskTracker, monitors the tasks, and reruns any task that fails.
TaskTracker is a slave service that runs on multiple nodes. It actively communicates with the JobTracker, receives tasks, and executes each one directly. A TaskTracker must run on a node that is also an HDFS DataNode.
The NameNode, Secondary NameNode, and JobTracker therefore run on the master node, while each slave node runs a DataNode and a TaskTracker, so that the task running on a slave can process local data as directly as possible.
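As a quick reference, this is the daemon layout the rest of the walkthrough assumes; after a full start, jps on each node should list roughly the following (jps also reports its own process; pids omitted):
jps    ## on the master (server2): NameNode, SecondaryNameNode, JobTracker
jps    ## on each slave: DataNode, TaskTracker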
server2.example.com 172.25.45.2 (master)
server3.example.com 172.25.45.3 (slave)
server4.example.com 172.25.45.4 (slave)
server5.example.com 172.25.45.5 (slave)
1. Configuration of the classic (1.x) version of Hadoop:
On server2, server3, server4 and server5, add the hadoop user:
useradd -u 900 hadoop
echo westos | passwd --stdin hadoop
Server2:
sh jdk-6u32-linux-x64.bin    ## install the JDK
mv jdk1.6.0_32/ /home/hadoop/java
mv hadoop-1.2.1.tar.gz /home/hadoop/
su - hadoop
vim .bash_profile
export JAVA_HOME=/home/hadoop/java
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
source .bash_profile
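To confirm the environment took effect (a quick added check, not part of the original steps):
echo $JAVA_HOME    ## should print /home/hadoop/java
java -version    ## should report the JDK installed above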
tar zxf hadoop-1.1.2.tar.gz    ## configure a single-node hadoop
ln -s hadoop-1.1.2 hadoop
cd /home/hadoop/hadoop/conf
vim hadoop-env.sh
export JAVA_HOME=/home/hadoop/java
cd ..
mkdir input
cp conf/*.xml input/
bin/hadoop jar hadoop-examples-1.1.2.jar    ## with no arguments, lists the available example programs
bin/hadoop jar hadoop-examples-1.1.2.jar grep input output 'dfs[a-z.]+'
cd output/
cat *
1 dfsadmin
Set up passwordless login from the master to the slaves:
Server2:
su - hadoop
ssh-keygen
ssh-copy-id localhost
ssh-copy-id 172.25.45.3
ssh-copy-id 172.25.45.4
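To verify (an added check), each slave should now accept a login with no password prompt:
ssh 172.25.45.3 hostname    ## should print server3.example.com without asking for a password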
cd /home/hadoop/hadoop/conf
vim core-site.xml    ## specify the namenode (properties go inside the <configuration> block)
<property>
    <name>fs.default.name</name>
    <value>hdfs://172.25.45.2:9000</value>
</property>
vim mapred-site.xml    ## specify the jobtracker
<property>
    <name>mapred.job.tracker</name>
    <value>172.25.45.2:9001</value>
</property>
vim hdfs-site.xml    ## specify the number of copies kept of each file
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
cd ..
bin/hadoop namenode -format    ## format a new file system
ls /tmp
hadoop-hadoop  hsperfdata_hadoop  hsperfdata_root  yum.log
bin/start-dfs.sh    ## start the hadoop dfs processes
jps
bin/start-mapred.sh    ## start the mapreduce processes
jps
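At this point conf/slaves still contains the default localhost, so all the daemons run on server2; the second jps should list roughly the following (pids vary; an illustrative sketch, not the article's own output):
jps
## NameNode  SecondaryNameNode  DataNode  JobTracker  TaskTracker  Jps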
Open 172.25.45.2:50030 (the JobTracker web UI) in the browser.
Open 172.25.45.2:50070 (the NameNode web UI).
bin/hadoop fs -put input test    ## upload the input directory to the distributed file system as test
bin/hadoop jar hadoop-examples-1.2.1.jar wordcount test output    ## run the wordcount example on the uploaded directory
Meanwhile, the uploaded files can be viewed in the web UI:
bin/hadoop fs -get output test    ## download the output directory from HDFS
cat test/*
rm -fr test/    ## delete the downloaded files
2. Configure the fully distributed cluster.
Server2 (share /home/hadoop over NFS):
su - root
yum install nfs-utils -y
/etc/init.d/rpcbind start
/etc/init.d/nfs start
vim /etc/exports
/home/hadoop *(rw,anonuid=900,anongid=900)
exportfs -rv
exportfs -v
Server3 and server4:
yum install nfs-utils -y
/etc/init.d/rpcbind start
showmount -e 172.25.45.2    ## list the master's NFS exports
Export list for 172.25.45.2:
/home/hadoop *
mount 172.25.45.2:/home/hadoop/ /home/hadoop/
df
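If the mount succeeded, df should include a line like the following (sizes omitted, illustrative):
172.25.45.2:/home/hadoop   ...   /home/hadoop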
Server2:
su - hadoop
cd hadoop/conf
vim hdfs-site.xml
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
vim slaves    ## the IPs of the slave nodes
172.25.45.3
172.25.45.4
vim masters    ## the IP of the master node
172.25.45.2
Tip: if any of the earlier processes are still running, they must be stopped before the file system can be reformatted; make sure jps reports nothing running.
To stop the processes:
bin/stop-all.sh    ## even after this runs, tasktracker and datanode are sometimes still up, so stop them explicitly
bin/hadoop-daemon.sh stop tasktracker
bin/hadoop-daemon.sh stop datanode
Then, as the hadoop user, delete the files under /tmp (files the user has no permission on can stay).
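A quick added check that a node is clean before reformatting:
jps    ## only the Jps process itself should be listed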
su - hadoop
bin/hadoop namenode -format
bin/start-dfs.sh
bin/start-mapred.sh
bin/hadoop fs -put input test    ## upload the input directory as test
bin/hadoop jar hadoop-examples-1.2.1.jar grep test output 'dfs[a-z.]+'    ## run the grep example against it
If you open 172.25.45.2:50070 in the browser while the upload is running, you can watch the files being written.
su - hadoop
bin/hadoop dfsadmin -report
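The report prints cluster totals followed by a section per datanode; its shape is roughly as follows (values illustrative, not from the article):
Datanodes available: 2 (2 total, 0 dead)
Name: 172.25.45.3:50010
Decommission Status : Normal
DFS Used: ...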
dd if=/dev/zero of=bigfile bs=1M count=200
bin/hadoop fs -put bigfile test
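To see how the 200 MB file was split into blocks and where the replicas landed, hadoop fsck can be used (an added check; the path assumes the default /user/hadoop home directory):
bin/hadoop fsck /user/hadoop/test/bigfile -files -blocks -locations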
Open 172.25.45.2:50070 in the browser.
3. Add server5.example.com 172.25.45.5 as a new slave node:
On server5, as root:
yum install nfs-utils -y
/etc/init.d/rpcbind start
useradd -u 900 hadoop
echo westos | passwd --stdin hadoop
mount 172.25.45.2:/home/hadoop/ /home/hadoop/
su - hadoop
vim hadoop/conf/slaves
172.25.45.3
172.25.45.4
172.25.45.5
cd /home/hadoop/hadoop
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker
jps
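To confirm the new node joined the cluster (an added check), run the report from the master and look for 172.25.45.5:
bin/hadoop dfsadmin -report | grep 172.25.45.5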
Removing (decommissioning) a slave:
Server2:
su - hadoop
cd /home/hadoop/hadoop/conf
vim hdfs-site.xml    ## dfs.hosts.exclude is an HDFS property read by the namenode, so it belongs in hdfs-site.xml
<property>
    <name>dfs.hosts.exclude</name>
    <value>/home/hadoop/hadoop/conf/datanode-excludes</value>
</property>
vim /home/hadoop/hadoop/conf/datanode-excludes
172.25.45.3    ## decommission 172.25.45.3 so it is no longer a slave
cd /home/hadoop/hadoop
bin/hadoop dfsadmin -refreshNodes    ## refresh the node list
bin/hadoop dfsadmin -report    ## check node status; the data on server3 is migrated to server5
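While blocks are being moved off the node, the report marks it accordingly (an added check; the status becomes "Decommissioned" once migration finishes):
bin/hadoop dfsadmin -report | grep -A 1 172.25.45.3
## Decommission Status : Decommission in progress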
On server3:
su - hadoop
cd /home/hadoop/hadoop
bin/stop-all.sh    ## note: with the shared NFS home this would stop the whole cluster; to remove only this node, the two daemon stops below are enough
bin/hadoop-daemon.sh stop tasktracker
bin/hadoop-daemon.sh stop datanode
Server2:
vim /home/hadoop/hadoop/conf/slaves
172.25.45.4
172.25.45.5
4. Configure the new version of hadoop:
Server2:
su - hadoop
cd /home/hadoop
tar zxf jdk-7u79-linux-x64.tar.gz
ln -s jdk1.7.0_79/ java
tar zxf hadoop-2.6.4.tar.gz
ln -s hadoop-2.6.4 hadoop
cd /home/hadoop/hadoop/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/home/hadoop/java
export HADOOP_PREFIX=/home/hadoop/hadoop
cd /home/hadoop/hadoop
mkdir input
cp etc/hadoop/*.xml input
tar -tf hadoop-native-64-2.6.0.tar    ## list the contents of the native-library archive
tar -xf hadoop-native-64-2.6.0.tar -C hadoop/lib/native/    ## the -C path is relative, so this is presumably run from /home/hadoop
cd /home/hadoop/hadoop
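Hadoop 2.x can report whether the native libraries were picked up (an added check):
bin/hadoop checknative -a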
rm -fr output/
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar grep input output 'dfs[a-z.]+'
cd /home/hadoop/hadoop/etc/hadoop/
vim slaves
172.25.45.3
172.25.45.4
vim core-site.xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.25.45.2:9000</value>
</property>
vim mapred-site.xml
<property>
    <name>mapred.job.tracker</name>
    <value>172.25.45.2:9001</value>
</property>
vim hdfs-site.xml
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
cd /home/hadoop/hadoop
bin/hdfs namenode -format
sbin/start-dfs.sh
jps
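Since only the dfs scripts were started (no YARN), jps on the master should list roughly the following (illustrative; pids omitted); the DataNode processes appear on the slaves instead:
jps
## NameNode  SecondaryNameNode  Jps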
bin/hdfs dfs -mkdir -p /user/hadoop    ## a home directory must be created before files can be uploaded (-p also creates /user)
bin/hdfs dfs -put input/ test
rm -fr input/
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar grep test output 'dfs[a-z.]+'
bin/hdfs dfs -cat output/*
1 dfsadmin
Open 172.25.45.2:50070 in the browser.