1. Hadoop introduction
Hadoop is an open-source distributed computing platform under the Apache Software Foundation. Built on the Hadoop Distributed File System (HDFS) and MapReduce (an open-source implementation of Google's MapReduce), it provides users with a transparent distributed infrastructure.
The nodes of a Hadoop cluster fall into two roles: Master and Slave. An HDFS cluster consists of one NameNode and several DataNodes. The NameNode acts as the master server, managing the file system namespace and client access to the file system, while the DataNodes manage the data stored on their nodes. The MapReduce framework consists of a single JobTracker running on the master node and one TaskTracker running on each slave node. The master node schedules all the tasks that make up a job, distributes them across the slave nodes, monitors their execution, and re-runs any tasks that fail; each slave node only executes the tasks assigned to it by the master. When a job is submitted, the JobTracker receives the job and its configuration, distributes the configuration to the slave nodes, schedules the tasks, and monitors the TaskTrackers' execution.
As this introduction shows, HDFS and MapReduce together form the core of the Hadoop distributed system architecture. HDFS provides the distributed file system on the cluster, while MapReduce provides distributed computing and task processing. HDFS supplies file storage and access during MapReduce task processing; MapReduce handles task distribution, tracking, and execution on top of HDFS and collects the results. Working together, the two carry out the main work of a Hadoop distributed cluster.
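To make the HDFS/MapReduce interplay concrete, here is a minimal smoke test you can run once the cluster built in the sections below is up. It assumes the /usr/hadoop install path used later and the examples jar bundled in the Hadoop tarball (the exact jar name and path vary by release):
cd /usr/hadoop
./bin/hdfs dfs -mkdir -p /input                      # create an input directory in HDFS
./bin/hdfs dfs -put etc/hadoop/core-site.xml /input  # stage a small sample file
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /input /output
./bin/hdfs dfs -cat /output/part-r-00000             # MapReduce wrote its result back into HDFS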
1.2 Environment description
Master: 192.168.0.201
Slave: 192.168.0.220
Both nodes run CentOS 7.
1.3 Environment preparation
Permanently disable the firewall and SELinux:
systemctl disable firewalld
systemctl stop firewalld
setenforce 0
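Note that setenforce 0 only disables SELinux until the next reboot. To make the change permanent, as this step intends, also update /etc/selinux/config; a minimal sketch:
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config   # persist across reboots
getenforce    # reports Permissive now, Disabled after a reboot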
1.4 Network configuration
Set the hostname on each of the two nodes: master and slave.
Configure /etc/hosts so that the two nodes can resolve each other, as shown below.
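A minimal sketch of the entries, assuming the hostnames and IPs above; run on both nodes:
cat >> /etc/hosts <<EOF
192.168.0.201 master
192.168.0.220 slave
EOF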
1.5 Configure SSH mutual trust
On master:
yum -y install sshpass
ssh-keygen    # press Enter through all the prompts
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.220
On slave:
yum -y install sshpass
ssh-keygen    # press Enter through all the prompts
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.201
Test by ssh-ing to the other host; if you are not prompted for a password, it is OK.
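sshpass is installed above, but the commands as written still prompt for a password interactively. If a fully non-interactive key copy is preferred, sshpass can feed the password to ssh-copy-id; a sketch, with a placeholder password you must replace:
sshpass -p 'YOUR_PASSWORD' ssh-copy-id -i ~/.ssh/id_rsa.pub -o StrictHostKeyChecking=no root@192.168.0.220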
2. Install JDK
Install on both machines:
tar zxvf jdk-8u65-linux-x64.tar.gz
mv jdk1.8.0_65 /usr/jdk
2.1 Set environment variables
Append the following to /etc/profile on both machines:
export JAVA_HOME=/usr/jdk
export JRE_HOME=/usr/jdk/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
Then execute source /etc/profile.
3. Test JDK
java -version
3.1 Install Hadoop
Download Hadoop 2.6 (CDH) from the official archive: archive.cloudera.com/cdh5
tar zxvf hadoop-2.6.0-cdh5.4.8.tar.gz
mv hadoop-2.6.0-cdh5.4.8 /usr/hadoop
cd /usr/hadoop
mkdir -p dfs/name
mkdir -p dfs/data
mkdir -p tmp
3.2 Add the slave
cd /usr/hadoop/etc/hadoop
vim slaves
192.168.0.220    # add the slave IP
3.3 Modify hadoop-env.sh and yarn-env.sh
vim hadoop-env.sh / vim yarn-env.sh
export JAVA_HOME=/usr/jdk    # add the java variable
3.4 Modify core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.0.201:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131702</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
3.5 Modify hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.0.201:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
3.6 Modify mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>192.168.0.201:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>192.168.0.201:19888</value>
  </property>
</configuration>
3.7 Modify yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.0.201:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.0.201:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.0.201:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.0.201:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.0.201:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>768</value>
  </property>
</configuration>
4. Copy the configured Hadoop directory to the slave
scp -r /usr/hadoop root@192.168.0.220:/usr/
5. Format the namenode
./bin/hdfs namenode -format
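On success, the format output typically ends with a message reporting that the storage directory /usr/hadoop/dfs/name has been successfully formatted (exact wording varies by version); if an error is printed instead, re-check the paths in hdfs-site.xml before continuing.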
5.1 Start HDFS and YARN
./sbin/start-dfs.sh
./sbin/start-yarn.sh
5.2 Check startup
Open the URL 192.168.0.201:8088 (the YARN ResourceManager web UI).
Open the URL 192.168.0.201:9001 (the secondary NameNode web UI).
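As an additional check, jps (shipped with the JDK) lists the running Java daemons. With this configuration, roughly the following would be expected on each node, though exact process sets can vary:
jps    # on master: NameNode, SecondaryNameNode, ResourceManager
jps    # on slave: DataNode, NodeManager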
Detailed description of the configuration files:
core-site.xml
hadoop.tmp.dir is the base directory that the Hadoop file system depends on; many other paths derive from it. If the NameNode and DataNode storage locations are not configured in hdfs-site.xml, they default to subdirectories under this path. fs.defaultFS specifies the default HDFS path (the NameNode URI); with only one HDFS cluster, it is specified here.
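To confirm which values are actually in effect, query them with the hdfs getconf tool (a quick sanity check, assuming the /usr/hadoop install path used above):
cd /usr/hadoop
./bin/hdfs getconf -confKey fs.defaultFS      # expect hdfs://192.168.0.201:9000
./bin/hdfs getconf -confKey hadoop.tmp.dir    # expect /usr/hadoop/tmp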
hdfs-site.xml
dfs.replication specifies how many replicas of each block the DataNodes store; the default is 3. The value should not exceed the number of DataNodes in the cluster, which is why it is lowered to 2 in this setup.
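A sketch for inspecting replication on a real file once HDFS is running, using a hypothetical file name:
./bin/hdfs dfs -put /etc/hosts /rep-test.txt   # upload any small file (hypothetical name)
./bin/hdfs dfs -ls /                           # the second column shows the replication factor
./bin/hdfs dfs -setrep 1 /rep-test.txt         # override replication for a single file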