2025-01-19 Update From: SLTechnology News&Howtos
Environment description
As required, deploy the basic hadoop-3.0.0 architecture on three nodes. The operating system is CentOS 7 x64.
Create three virtual machines on OpenStack and begin the deployment:
IP address      Hostname
10.10.204.31    master
10.10.204.32    node1
10.10.204.33    node2
Functional node planning
master: NameNode, DataNode, HQuorumPeer, ResourceManager, HMaster
node1:  DataNode, NodeManager, SecondaryNameNode
node2:  DataNode, NodeManager
Perform the following initialization steps on all three nodes.
1. Update the system environment
yum clean all && yum makecache fast && yum update -y && yum install -y wget vim net-tools git ftp zip unzip
2. Modify the hostname of each node according to the plan
hostnamectl set-hostname master
hostnamectl set-hostname node1
hostnamectl set-hostname node2
3. Add hosts entries for name resolution
vim /etc/hosts
10.10.204.31 master
10.10.204.32 node1
10.10.204.33 node2
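Since the same three entries must go onto every node, they can also be appended non-interactively instead of editing with vim. A minimal sketch, writing to a throwaway file under /tmp so it is safe to run anywhere (on a real node you would target /etc/hosts):

```shell
# Append the three cluster entries to a hosts file (using /tmp for illustration).
hosts=/tmp/hosts.test
: > "$hosts"
for entry in "10.10.204.31 master" "10.10.204.32 node1" "10.10.204.33 node2"; do
  echo "$entry" >> "$hosts"
done
cat "$hosts"
```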
4. Use ping to verify that the three hostnames resolve to each other correctly.
ping master
ping node1
ping node2
5. Download and install the JDK environment
# hadoop 3.0 requires JDK 8
cd /opt/
# Normally you must log in to the Oracle website, register an account, and accept the license agreement before downloading. The command below passes the license cookie so wget can download directly from the link.
wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "https://download.oracle.com/otn-pub/java/jdk/8u202-b08/1961070e4c9b4e26a04e7f5a083f551e/jdk-8u202-linux-x64.tar.gz"
# create the JDK and hadoop installation paths
mkdir /opt/modules
cp /opt/jdk-8u202-linux-x64.tar.gz /opt/modules
cd /opt/modules
tar zxvf jdk-8u202-linux-x64.tar.gz
# configure environment variables for the current shell
export JAVA_HOME="/opt/modules/jdk1.8.0_202"
export PATH=$JAVA_HOME/bin:$PATH
source /etc/profile
# permanent configuration method
vim /etc/bashrc
# add these lines
export JAVA_HOME="/opt/modules/jdk1.8.0_202"
export PATH=$JAVA_HOME/bin:$PATH
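To confirm the shell now picks up JDK 8, the major version can be parsed out of the `java -version` output. A sketch, using a sample of the expected output string so it runs anywhere; on a real node capture it with `ver_line=$(java -version 2>&1 | head -n1)`:

```shell
# Parse the JDK major version from a "java -version" first line.
# ver_line is a sample of the expected output for jdk1.8.0_202.
ver_line='java version "1.8.0_202"'
major=$(echo "$ver_line" | sed -E 's/.*"1\.([0-9]+)\..*/\1/')
echo "JDK major version: $major"
```

For hadoop 3.0 the printed major version should be 8.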
6. Download and unpack the hadoop-3.0.0 installation package
cd /opt/
wget http://archive.apache.org/dist/hadoop/core/hadoop-3.0.0/hadoop-3.0.0.tar.gz
cp /opt/hadoop-3.0.0.tar.gz /opt/modules/
cd /opt/modules
tar zxvf hadoop-3.0.0.tar.gz
7. Turn off selinux and the firewalld firewall
systemctl disable firewalld
vim /etc/sysconfig/selinux
SELINUX=disabled
8. Restart the server
reboot
Master node operation
Note: this is a test environment, so the root account is used to install and run hadoop on all nodes.
1. Set up passwordless ssh login
cd
ssh-keygen
# press Enter three times to accept the defaults
# copy the key to master/node1/node2
ssh-copy-id master
ssh-copy-id node1
ssh-copy-id node2
2. Verify that passwordless login works
ssh master
ssh node1
ssh node2
3. Modify the hadoop configuration files
The following configuration files need to be modified:
hadoop-env.sh
yarn-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
workers
cd /opt/modules/hadoop-3.0.0/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/opt/modules/jdk1.8.0_202
vim yarn-env.sh
export JAVA_HOME=/opt/modules/jdk1.8.0_202
Configuration file reference:
https://blog.csdn.net/m290345792/article/details/79141336
vim core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <!-- value not given in the original -->
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <!-- value not given in the original -->
  </property>
</configuration>
# io.file.buffer.size: the read/write buffer size used in SequenceFiles
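To confirm the core-site.xml edit took effect, the NameNode URI can be grepped out of the file. A sketch against a throwaway copy under /tmp (on a real node, point `conf` at etc/hadoop/core-site.xml instead):

```shell
# Extract the fs.defaultFS value from a core-site.xml fragment.
conf=/tmp/core-site.test.xml
cat > "$conf" <<'EOF'
<property><name>fs.defaultFS</name><value>hdfs://master:9000</value></property>
EOF
grep -o 'hdfs://[^<]*' "$conf"
```

The printed URI should match the hdfs://master:9000 address configured above.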
vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node2:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <!-- the number of replicas; the default is 3, and it should not exceed the number of datanode machines -->
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/tmp</value>
  </property>
</configuration>
# namenode configuration
# dfs.namenode.name.dir: path on the local file system where the NameNode persists the namespace and transaction logs. If this is a comma-separated list of directories, the name table is replicated in all of them for redundancy.
# dfs.hosts / dfs.hosts.exclude: lists of included / excluded DataNodes. Use these files, if necessary, to control the set of permitted DataNodes.
# dfs.blocksize: the HDFS block size, 128MB by default, for large file systems.
# dfs.namenode.handler.count: number of NameNode server threads that handle RPCs from a large number of DataNodes.
# datanode configuration
# dfs.datanode.data.dir: comma-separated list of paths on the DataNode's local file system where blocks are stored. If this is a comma-separated list of directories, data is stored in all named directories, typically on different devices.
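As a worked example for dfs.blocksize: a 1 GiB file split into 128 MiB blocks occupies 8 HDFS blocks. The arithmetic (pure shell, with ceiling division so partial blocks count; the file size is illustrative, not read from a cluster):

```shell
# How many 128 MiB blocks a 1 GiB file occupies at the default dfs.blocksize.
filesize=$((1024 * 1024 * 1024))      # 1 GiB example file
blocksize=$((128 * 1024 * 1024))      # default dfs.blocksize
blocks=$(( (filesize + blocksize - 1) / blocksize ))  # ceiling division
echo "$blocks"
```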
vim mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>
      /opt/modules/hadoop-3.0.0/etc/hadoop,
      /opt/modules/hadoop-3.0.0/share/hadoop/common/,
      /opt/modules/hadoop-3.0.0/share/hadoop/common/lib/,
      /opt/modules/hadoop-3.0.0/share/hadoop/hdfs/,
      /opt/modules/hadoop-3.0.0/share/hadoop/hdfs/lib/,
      /opt/modules/hadoop-3.0.0/share/hadoop/mapreduce/,
      /opt/modules/hadoop-3.0.0/share/hadoop/mapreduce/lib/,
      /opt/modules/hadoop-3.0.0/share/hadoop/yarn/,
      /opt/modules/hadoop-3.0.0/share/hadoop/yarn/lib/
    </value>
  </property>
</configuration>
vim yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8040</value>
  </property>
</configuration>
# resourcemanager and nodemanager configuration
# yarn.acl.enable: enable ACLs. Default is false.
# yarn.admin.acl: sets the administrators on the cluster. ACLs have the form "comma-separated-users space comma-separated-groups". The default of * means anyone; a value of just a space means no one has access.
# yarn.log-aggregation-enable: enable or disable log aggregation.
# resourcemanager configuration
# yarn.resourcemanager.address value: ResourceManager host:port for clients to submit jobs. If host:port is set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.scheduler.address value: ResourceManager host:port for ApplicationMasters to obtain resources from the Scheduler. If host:port is set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.resource-tracker.address value: ResourceManager host:port for NodeManagers. If host:port is set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.admin.address value: ResourceManager host:port for administrative commands. If host:port is set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.webapp.address value: ResourceManager web-ui host:port. If host:port is set, it overrides the hostname set in yarn.resourcemanager.hostname.
# yarn.resourcemanager.hostname value: ResourceManager host. Can be set once in place of all the yarn.resourcemanager*address settings, resulting in the default ports for the ResourceManager components.
# yarn.resourcemanager.scheduler.class value: ResourceManager scheduler class. Capacity scheduling (recommended), Fair scheduling (also recommended), or Fifo scheduling. Use a fully qualified class name, such as org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.
# yarn.scheduler.minimum-allocation-mb value: minimum memory allocated to each container request at the ResourceManager.
# yarn.scheduler.maximum-allocation-mb value: maximum memory allocated to each container request at the ResourceManager.
# yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path value: lists of permitted / excluded NodeManagers. Use these files, if necessary, to control the list of permitted NodeManagers.
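To make the minimum-allocation setting concrete: the upper bound on containers a single NodeManager can host is its available memory divided by yarn.scheduler.minimum-allocation-mb. A sketch with hypothetical values (8192 MB of NodeManager memory, 1024 MB minimum allocation; neither number is read from this cluster's config):

```shell
# Upper bound on containers per node for example memory settings.
node_mem_mb=8192      # hypothetical yarn.nodemanager.resource.memory-mb
min_alloc_mb=1024     # hypothetical yarn.scheduler.minimum-allocation-mb
echo $(( node_mem_mb / min_alloc_mb ))
```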
vim workers
master
node1
node2
4. Modify the startup scripts
# because the test environment starts the hadoop services as root, the run-as-user variables must be added to the startup scripts
cd /opt/modules/hadoop-3.0.0/sbin
vim start-dfs.sh
# add these lines
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
vim stop-dfs.sh
# add these lines
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
vim start-yarn.sh
# add these lines
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
vim stop-yarn.sh
# add these lines
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
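Since the same six HDFS_*_USER lines go into both start-dfs.sh and stop-dfs.sh, they can be appended in one loop instead of two vim sessions. A sketch against throwaway copies under /tmp (on a real node, loop over the files in /opt/modules/hadoop-3.0.0/sbin instead):

```shell
# Append the run-as-root variables to both dfs scripts in one pass.
for f in /tmp/start-dfs.sh /tmp/stop-dfs.sh; do
  : > "$f"   # throwaway copies; drop this line when targeting the real scripts
  cat >> "$f" <<'EOF'
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
EOF
done
grep -c '^HDFS_' /tmp/stop-dfs.sh
```

Each file should end up containing all six variables.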
5. Push the hadoop configuration files to the other nodes
cd /opt/modules/hadoop-3.0.0/etc/hadoop
scp ./* root@node1:/opt/modules/hadoop-3.0.0/etc/hadoop/
scp ./* root@node2:/opt/modules/hadoop-3.0.0/etc/hadoop/
6. Format hdfs
# the hdfs storage path is set to /data/tmp in the configuration file
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -format
7. Start the hadoop services
# on all three nodes
cd /opt/modules/zookeeper-3.4.13
./bin/zkServer.sh start
cd /opt/modules/kafka_2.12-2.1.1
./bin/kafka-server-start.sh ./config/server.properties &
/opt/modules/hadoop-3.0.0/bin/hdfs journalnode &
# master node
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -format
/opt/modules/hadoop-3.0.0/bin/hdfs zkfc -formatZK
/opt/modules/hadoop-3.0.0/bin/hdfs namenode &
# node1
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -bootstrapStandby
/opt/modules/hadoop-3.0.0/bin/hdfs namenode &
/opt/modules/hadoop-3.0.0/bin/yarn resourcemanager &
/opt/modules/hadoop-3.0.0/bin/yarn nodemanager &
# node2
/opt/modules/hadoop-3.0.0/bin/hdfs namenode -bootstrapStandby
/opt/modules/hadoop-3.0.0/bin/hdfs namenode &
/opt/modules/hadoop-3.0.0/bin/yarn resourcemanager &
/opt/modules/hadoop-3.0.0/bin/yarn nodemanager &
# on all three nodes
/opt/modules/hadoop-3.0.0/bin/hdfs zkfc &
# master node
cd /opt/modules/hadoop-3.0.0/
./sbin/start-all.sh
cd /opt/modules/hadoop-3.0.0/hbase-2.0.4
./bin/start-hbase.sh
8. Check that the hadoop services started normally on each node
jps
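Beyond eyeballing the jps output, a short loop can flag any missing daemon. A sketch: jps_out below is a sample of what a healthy node might print (the PIDs are made up); on a real node capture it with `jps_out=$(jps)`:

```shell
# Check that the expected daemons appear in jps output.
jps_out='2481 NameNode
2602 DataNode
2890 ResourceManager
3011 NodeManager
3120 Jps'
missing=0
for d in NameNode DataNode ResourceManager NodeManager; do
  echo "$jps_out" | grep -q "$d" || { echo "missing: $d"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all expected daemons present"
```

Adjust the daemon list per node to match the functional node planning above.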
9. Run a test
cd /opt/modules/hadoop-3.0.0
# create a test path on hdfs
./bin/hdfs dfs -mkdir /testdir1
# create a test file
cd /opt
touch wc.input
vim wc.input
hadoop mapreduce hive
hbase spark storm
sqoop hadoop hive
spark hadoop
# upload wc.input to HDFS
/opt/modules/hadoop-3.0.0/bin/hdfs dfs -put /opt/wc.input /testdir1/wc.input
# run the mapreduce demo that ships with hadoop
/opt/modules/hadoop-3.0.0/bin/yarn jar /opt/modules/hadoop-3.0.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar wordcount /testdir1/wc.input /output
# view the output files
/opt/modules/hadoop-3.0.0/bin/hdfs dfs -ls /output
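For a small input like this, the wordcount result can be sanity-checked locally with a plain shell pipeline, which does the same count the MapReduce job performs (the sample words are lowercased here):

```shell
# Reproduce the wordcount locally: split on spaces, sort, count duplicates.
printf 'hadoop mapreduce hive\nhbase spark storm\nsqoop hadoop hive\nspark hadoop\n' \
  | tr ' ' '\n' | sort | uniq -c | sort -rn
```

In this sample, "hadoop" appears 3 times, and "hive" and "spark" appear twice each; the counts in the /output part file should match.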
10. Status screenshots
Screenshots taken after all services started normally:
zookeeper + kafka + namenode + journalnode + hbase