2025-01-17 Update From: SLTechnology News&Howtos (Shulou.com)
I built and tested this setup myself. As a beginner I took many detours along the way, so if any step is wrong or could be simpler, please point it out.
Environment: a virtual machine running on Windows 10, with CentOS 7 and a Linux desktop installed in it. (In an earlier attempt the network was reachable and the ports answered telnet, yet the service addresses inside the Docker containers could not be opened from the host. The desktop is installed this time so that the browser inside the virtual machine can be used if the host still cannot reach the services.) CentOS 7 commands differ from CentOS 6: the iptables service is not installed by default on CentOS 7, so install it yourself if you want to use it.
Virtual machine IP: 192.168.20.129
Spark master node IP: 172.17.0.2, Docker container name cloud1
Spark worker node IP: 172.17.0.3, Docker container name cloud2
Spark worker node IP: 172.17.0.4, Docker container name cloud3
Steps to install Docker:
1. As root, check the current kernel version. Docker requires a CentOS kernel version of 3.10 or higher.
Command: uname -r
2. Log in to CentOS as root and make sure the yum packages are up to date.
Command: yum update
3. If an old version of Docker is installed, uninstall it first.
Command: yum remove docker docker-common docker-selinux docker-engine
4. Install the required packages (dependencies).
Command: yum install -y yum-utils device-mapper-persistent-data lvm2
5. Set up the yum repository.
(Aliyun mirror) Command: yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
(Official site) Command: yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
6. List all Docker versions in the repository and choose one.
Command: yum list docker-ce --showduplicates | sort -r
7. Install Docker.
Command: yum install docker-ce # only the stable repository is enabled by default, so this installs the latest stable version (17.12.0 here)
To install a specific version, pass the full package name to yum install, for example: sudo yum install docker-ce-17.12.0.ce
8. Start Docker and enable it at boot.
Command: systemctl start docker
systemctl enable docker
9. Check that the installation succeeded.
Command: docker version
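The kernel check in step 1 can be automated. The following is a minimal sketch, not part of the original procedure; the function name kernel_at_least is made up for illustration, and it assumes the release string starts with "major.minor" as uname -r prints it on CentOS.

```shell
# Succeeds when kernel release $1 (e.g. "3.10.0-957.el7.x86_64") is at least $2.$3.
# kernel_at_least is a hypothetical helper name, not a standard command.
kernel_at_least() {
  ka_major=${1%%.*}             # text before the first dot, e.g. "3"
  ka_rest=${1#*.}               # text after the first dot, e.g. "10.0-957.el7.x86_64"
  ka_minor=${ka_rest%%[!0-9]*}  # leading digits of the remainder, e.g. "10"
  [ "$ka_major" -gt "$2" ] && return 0
  [ "$ka_major" -eq "$2" ] && [ "$ka_minor" -ge "$3" ]
}

# Usage on the CentOS host:
# kernel_at_least "$(uname -r)" 3 10 && echo "kernel is new enough for Docker"
```

The function only compares numbers; it does not inspect the running system unless it is given the output of uname -r.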
Next, set up a Spark cluster in Docker containers. The versions installed here are Hadoop 2.7, JDK 1.8, Spark 2.4, Scala 2.12.8 and ZooKeeper 3.4.12.
Steps to build the Spark cluster in Docker containers:
1. Pull an Ubuntu image.
Command: docker pull ubuntu
If the download is too slow, configure a registry mirror: register an account at https://www.daocloud.io, find the accelerator command for Linux on the accelerator page, and run it on the Linux host. For more information, see https://blog.51cto.com/14159501/2338377.
2. After the Ubuntu image has been downloaded, list the local images.
Command: docker images
On CentOS 7, create a java directory under /usr/local (command: mkdir java). It holds the installation packages for the JDK, Spark, Scala and so on. Upload the packages into this directory with an SSH file transfer tool; they are copied into the Docker container later. SSH is enabled by default after CentOS 7 is installed; use ps -ef | grep ssh to check that it is running. (Note: SSH must also be installed and started automatically inside the Docker containers, otherwise starting the Hadoop nodes fails later. This is covered in the container setup below.)
3. Run the image.
Command: docker run --name cloud1 -h cloud1 --add-host cloud1:172.17.0.2 -it ubuntu
This starts a container named cloud1 with the hostname cloud1 and adds a hosts entry that maps cloud1 to 172.17.0.2. The option --add-host does not assign the address itself; on the default bridge network the first container started normally receives 172.17.0.2.
docker network inspect bridge lists the running containers with their names and IP addresses under Containers.
To check the IP address of a single container: docker inspect container_name | grep IPAddress
4. Configure SSH in the container.
Check the SSH status: service ssh status
If it is not installed, install it:
apt-get update # refresh the package index
apt install net-tools # provides ifconfig, if it is missing
apt install iputils-ping # provides ping, if it is missing
apt-get install vim # install vim
apt-get install ssh # install SSH
After installation, add the line /usr/sbin/sshd to ~/.bashrc (vim ~/.bashrc) so that sshd is started with the shell.
If the default SSH configuration does not allow root to log in, set PermitRootLogin to yes in /etc/ssh/sshd_config.
Generate the access key:
cd ~ # switch to root's home directory
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cd .ssh # enter the .ssh directory
cat id_rsa.pub >> authorized_keys
service ssh start # start SSH
ssh localhost date # verify that SSH works
ssh root@cloud1 # test the connection
Check whether SSH is installed:
which ssh
which sshd
Check whether the SSH service is running:
ps aux | grep ssh
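Editing sshd_config by hand is easy to get wrong, so the PermitRootLogin change can also be scripted. This is a sketch under assumptions: permit_root_login is a hypothetical helper name, and the file is assumed to use the usual one-directive-per-line layout, with the directive possibly commented out.

```shell
# Ensure "PermitRootLogin yes" is set in the sshd config file given as $1.
# permit_root_login is a hypothetical helper, not part of OpenSSH.
permit_root_login() {
  prl_file=$1
  if grep -Eq '^[#[:space:]]*PermitRootLogin[[:space:]]' "$prl_file"; then
    # Rewrite the existing (possibly commented-out) directive in place.
    sed 's/^[#[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' \
      "$prl_file" > "$prl_file.tmp" && mv "$prl_file.tmp" "$prl_file"
  else
    # No directive present: append one.
    echo 'PermitRootLogin yes' >> "$prl_file"
  fi
}

# Usage inside the container, followed by a restart of the service:
# permit_root_login /etc/ssh/sshd_config && service ssh restart
```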
5. Create a java directory under /usr/local in the container to hold the packages to be installed.
On the Linux host (not in the container), run:
docker cp /usr/local/java/ container_ID:/usr/local/
This copies all the installation packages under java into the container at the same path. Alternatively, mount the host's /usr/local/java directory into the container when starting it: docker run -v /usr/local/java:/usr/local/java -it ubuntu, or docker run -i -t -v /usr/local/java:/usr/local/java image_ID /bin/bash
6. Install the JDK, ZooKeeper, Scala, Hadoop and Spark.
Change to /usr/local/java, where all the installation packages are located.
6.1: Install the JDK
Decompress the package: tar -xzvf jdk.xx.xx.tar.gz
Set permissions: chmod 777 on the decompressed JDK directory
Delete the archive: rm -rf jdk.xx.xx.tar.gz
Add the JDK variables to ~/.bashrc (vim ~/.bashrc):
export JAVA_HOME=/usr/local/java/jdk1.8.0_191
export PATH=$PATH:$JAVA_HOME/bin
Save and exit, then run source ~/.bashrc to apply the change.
Check that the installation succeeded: java -version
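The same pair of export lines is added to ~/.bashrc for every tool in this step, and repeating the edit by hand tends to leave duplicates. A small sketch of an idempotent alternative follows; add_export is a hypothetical helper name and the file path is passed in explicitly.

```shell
# Append "export NAME=VALUE" to the profile file $3 unless that exact line is already present.
# add_export is a hypothetical helper, not a shell builtin.
add_export() {
  ae_line="export $1=$2"
  touch "$3"
  grep -qxF -e "$ae_line" "$3" || printf '%s\n' "$ae_line" >> "$3"
}

# Usage (single quotes keep $PATH and $JAVA_HOME unexpanded in the file):
# add_export JAVA_HOME /usr/local/java/jdk1.8.0_191 ~/.bashrc
# add_export PATH '$PATH:$JAVA_HOME/bin' ~/.bashrc
```

Running the function twice with the same arguments leaves a single line in the file, so the setup can be repeated safely.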
6.2: Install Scala
Decompress: tar -zxvf scala-2.12.8.tgz
Set permissions: chmod 777 scala-2.12.8
Delete the archive: rm -rf scala-2.12.8.tgz
Add the Scala variables to ~/.bashrc (vim ~/.bashrc):
export SCALA_HOME=/usr/local/java/scala-2.12.8
export PATH=$PATH:$SCALA_HOME/bin
Save and exit, then run source ~/.bashrc.
Check that the installation succeeded: scala -version
6.3: Install ZooKeeper
Decompress: tar -zxvf zookeeper-3.4.12.tar.gz
Set permissions: chmod 777 zookeeper-3.4.12
Delete the archive: rm -rf zookeeper-3.4.12.tar.gz
Add the ZooKeeper variables to ~/.bashrc (vim ~/.bashrc):
export ZOOKEEPER_HOME=/usr/local/java/zookeeper-3.4.12
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Save and exit, then run source ~/.bashrc.
Generate the ZooKeeper configuration file from the sample in the conf directory:
cp /usr/local/java/zookeeper-3.4.12/conf/zoo_sample.cfg /usr/local/java/zookeeper-3.4.12/conf/zoo.cfg
Edit zoo.cfg:
# change the data directory to:
dataDir=/root/zookeeper/tmp
(Create the directories first: mkdir ~/zookeeper; mkdir ~/zookeeper/tmp)
# append the server list at the end:
server.1=cloud1:2888:3888
server.2=cloud2:2888:3888
server.3=cloud3:2888:3888
Create the myid file under /root/zookeeper/tmp: touch ~/zookeeper/tmp/myid
Write the node id into it: echo 1 > ~/zookeeper/tmp/myid
Opening the file (vim myid) shows that it contains 1.
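The server list in zoo.cfg follows a fixed pattern, so it can be generated from the host names instead of typed. A minimal sketch, assuming the ports 2888 and 3888 used above; zk_server_lines is a hypothetical helper name.

```shell
# Print the "server.N=host:2888:3888" lines for the given hosts, numbered from 1.
# zk_server_lines is a hypothetical helper; the ports are the ones used in this article.
zk_server_lines() {
  zs_id=1
  for zs_host in "$@"; do
    printf 'server.%d=%s:2888:3888\n' "$zs_id" "$zs_host"
    zs_id=$((zs_id + 1))
  done
}

# Usage:
# zk_server_lines cloud1 cloud2 cloud3 >> /usr/local/java/zookeeper-3.4.12/conf/zoo.cfg
```

The number in each server.N line must match the content of the myid file on that node.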
6.4: Install Hadoop
Decompress: tar -zxvf hadoop-2.7.7.tar.gz
Add the Hadoop variables to ~/.bashrc (command: vi ~/.bashrc):
export HADOOP_HOME=/usr/local/java/hadoop-2.7.7 (adjust to the actual directory)
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
6.4.1: Edit the Hadoop startup configuration file (/usr/local/java/hadoop-2.7.7/etc/hadoop/hadoop-env.sh):
# set JAVA_HOME
export JAVA_HOME=/usr/local/java/jdk1.8.0_191
6.4.2: Edit the core configuration file (/usr/local/java/hadoop-2.7.7/etc/hadoop/core-site.xml):
<configuration>
<property><name>fs.defaultFS</name><value>hdfs://ns1</value></property>
<property><name>hadoop.tmp.dir</name><value>/root/hadoop/tmp</value></property>
<property><name>ha.zookeeper.quorum</name><value>cloud1:2181,cloud2:2181,cloud3:2181</value></property>
</configuration>
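Hadoop configuration files store each setting as a property element with a name and a value inside a configuration element. The sketch below prints a core-site.xml with the three settings from 6.4.2; hadoop_property and core_site_xml are hypothetical helper names, and the values are assumed to contain no characters that need XML escaping.

```shell
# Print one Hadoop <property> element for name $1 and value $2 (no XML escaping is done).
hadoop_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a complete core-site.xml with the three settings listed in 6.4.2.
core_site_xml() {
  printf '<?xml version="1.0"?>\n<configuration>\n'
  hadoop_property fs.defaultFS hdfs://ns1
  hadoop_property hadoop.tmp.dir /root/hadoop/tmp
  hadoop_property ha.zookeeper.quorum cloud1:2181,cloud2:2181,cloud3:2181
  printf '</configuration>\n'
}

# Usage:
# core_site_xml > "$HADOOP_HOME/etc/hadoop/core-site.xml"
```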
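6.4.3: Edit the HDFS configuration file (/usr/local/java/hadoop-2.7.7/etc/hadoop/hdfs-site.xml):
<configuration>
<property><name>dfs.nameservices</name><value>ns1</value></property>
<property><name>dfs.ha.namenodes.ns1</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.ns1.nn1</name><value>cloud1:9000</value></property>
<property><name>dfs.namenode.http-address.ns1.nn1</name><value>cloud1:50070</value></property>
<property><name>dfs.namenode.rpc-address.ns1.nn2</name><value>cloud2:9000</value></property>
<property><name>dfs.namenode.http-address.ns1.nn2</name><value>cloud2:50070</value></property>
<property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://cloud1:8485;cloud2:8485;cloud3:8485/ns1</value></property>
<property><name>dfs.journalnode.edits.dir</name><value>/root/hadoop/journal</value></property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
<property><name>dfs.client.failover.proxy.provider.ns1</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
<property><name>dfs.ha.fencing.methods</name><value>sshfence
shell(/bin/true)</value></property>
<property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/root/.ssh/id_rsa</value></property>
<property><name>dfs.ha.fencing.ssh.connect-timeout</name><value>30000</value></property>
<property><name>dfs.http.address</name><value>0.0.0.0:50070</value></property>
</configuration>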
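Two of the values above are lists built from the same three hosts: the ZooKeeper quorum (comma-separated, port 2181) and the JournalNode URI (semicolon-separated, port 8485). A sketch of deriving both from one host list, so the host names are typed only once; join_hosts is a hypothetical helper name.

```shell
# Join hosts into "host:PORT<SEP>host:PORT...". Arguments: SEPARATOR PORT host...
# join_hosts is a hypothetical helper, not a Hadoop tool.
join_hosts() {
  jh_sep=$1
  jh_port=$2
  shift 2
  jh_out=
  for jh_host in "$@"; do
    # Add the separator only between entries, not before the first one.
    jh_out="${jh_out:+$jh_out$jh_sep}$jh_host:$jh_port"
  done
  printf '%s\n' "$jh_out"
}

# Usage:
# join_hosts , 2181 cloud1 cloud2 cloud3                          (value for ha.zookeeper.quorum)
# echo "qjournal://$(join_hosts ';' 8485 cloud1 cloud2 cloud3)/ns1"  (value for dfs.namenode.shared.edits.dir)
```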
6.4.4: Edit the YARN configuration file (/usr/local/java/hadoop-2.7.7/etc/hadoop/yarn-site.xml):
<configuration>
<property><name>yarn.resourcemanager.hostname</name><value>cloud1</value></property>
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
</configuration>
6.4.5: Edit the mapred-site.xml file (/usr/local/java/hadoop-2.7.7/etc/hadoop/mapred-site.xml):
mv mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<configuration>
<property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>
6.4.6: Edit the file that lists the DataNode and NodeManager hosts (/usr/local/java/hadoop-2.7.7/etc/hadoop/slaves):
cloud1
cloud2
cloud3
6.5: Install Spark
Decompress: tar -zxvf spark-2.4.0-bin-hadoop2.7.tgz
Add the Spark variables to ~/.bashrc (command: vi ~/.bashrc):
export SPARK_HOME=/usr/local/java/spark-2.4.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
Create the Spark startup configuration file:
cp /usr/local/java/spark-2.4.0-bin-hadoop2.7/conf/spark-env.sh.template /usr/local/java/spark-2.4.0-bin-hadoop2.7/conf/spark-env.sh
Edit spark-env.sh:
export SPARK_MASTER_IP=cloud1
export SPARK_WORKER_MEMORY=128m
export JAVA_HOME=/usr/local/java/jdk1.8.0_191
export SCALA_HOME=/usr/local/java/scala-2.12.8
export SPARK_HOME=/usr/local/java/spark-2.4.0-bin-hadoop2.7
export HADOOP_CONF_DIR=/usr/local/java/hadoop-2.7.7/etc/hadoop
export SPARK_LIBRARY_PATH=$SPARK_HOME/lib
export SCALA_LIBRARY_PATH=$SPARK_LIBRARY_PATH
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1
export SPARK_MASTER_PORT=7077
Edit the file that lists the Worker hosts (/usr/local/java/spark-2.4.0-bin-hadoop2.7/conf/slaves):
cloud1
cloud2
cloud3
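With this many hand-typed paths, a single misspelled directory name is enough to keep a daemon from starting. The following sketch checks that each configured directory actually exists before the container is committed; check_dirs is a hypothetical helper name.

```shell
# Print every given path that is not an existing directory; return 1 if any is missing.
# check_dirs is a hypothetical helper, not part of Hadoop or Spark.
check_dirs() {
  cd_missing=0
  for cd_path in "$@"; do
    if [ ! -d "$cd_path" ]; then
      echo "missing directory: $cd_path"
      cd_missing=1
    fi
  done
  return "$cd_missing"
}

# Usage after sourcing ~/.bashrc and spark-env.sh:
# check_dirs "$JAVA_HOME" "$SCALA_HOME" "$ZOOKEEPER_HOME" "$HADOOP_HOME" "$SPARK_HOME" "$HADOOP_CONF_DIR"
```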
Cluster deployment:
# commit the cloud1 container; the command returns the id of the new image
docker commit cloud1
# tag the new image with the id returned above; the docker run commands below refer to the image as cloud1
docker tag image_ID cloud1
The container is now saved as a new image. Use this image to run two more containers, cloud2 and cloud3.
# -h sets the hostname of the running container
docker run --name cloud2 -h cloud2 --add-host cloud2:172.17.0.3 --add-host cloud3:172.17.0.4 --add-host cloud1:172.17.0.2 -it cloud1
docker run --name cloud3 -h cloud3 --add-host cloud3:172.17.0.4 --add-host cloud1:172.17.0.2 --add-host cloud2:172.17.0.3 -it cloud1
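Each docker run command repeats the full list of --add-host options, and a typo in one of them leaves that container unable to resolve a peer. A sketch that builds the options from a single list; add_host_args and CLUSTER_HOSTS are hypothetical names, and the addresses are the ones assumed throughout this article.

```shell
# Print "--add-host name:ip" for every name:ip pair given.
# add_host_args is a hypothetical helper; --add-host itself is a real docker run option.
add_host_args() {
  ah_out=
  for ah_pair in "$@"; do
    ah_out="${ah_out:+$ah_out }--add-host $ah_pair"
  done
  printf '%s\n' "$ah_out"
}

# Assumed cluster layout from this article.
CLUSTER_HOSTS="cloud1:172.17.0.2 cloud2:172.17.0.3 cloud3:172.17.0.4"

# Usage (word splitting of the unquoted expansions is intended here):
# docker run --name cloud2 -h cloud2 $(add_host_args $CLUSTER_HOSTS) -it cloud1
```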
# manually change myid in cloud2 and cloud3
On cloud2: echo 2 > ~/zookeeper/tmp/myid (the file now contains 2)
On cloud3: echo 3 > ~/zookeeper/tmp/myid (the file now contains 3)
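Because every container is started from the same image, each one begins with the myid of cloud1. Instead of typing the number on each node, it can be looked up from the server list in zoo.cfg. A sketch, assuming simple host names without regular-expression metacharacters; myid_for_host is a hypothetical helper name.

```shell
# Print the id N from the "server.N=HOST:..." line for host $2 in the zoo.cfg file $1.
# Prints nothing when the host is not listed. myid_for_host is a hypothetical helper.
myid_for_host() {
  sed -n "s/^server\.\([0-9][0-9]*\)=$2:.*/\1/p" "$1"
}

# Usage on each node; check that the output is not empty before overwriting myid:
# myid_for_host /usr/local/java/zookeeper-3.4.12/conf/zoo.cfg "$(hostname)"
```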
# start the ZooKeeper cluster (start zk on cloud1, cloud2 and cloud3 in turn)
$ZOOKEEPER_HOME/bin/zkServer.sh start
# check with status whether it is running (the status is shown only once cloud1 to cloud3 have been started)
$ZOOKEEPER_HOME/bin/zkServer.sh status
# start the JournalNodes. To start all of them from cloud1, use hadoop-daemons.sh (note the plural s in the script name)
$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode # starts the JournalNode on the current node only; run it on every node
$HADOOP_HOME/sbin/hadoop-daemons.sh start journalnode # starts the JournalNodes on all nodes
# run jps to verify that a JournalNode process is now running on cloud1, cloud2 and cloud3
# format HDFS (run on cloud1):
$HADOOP_HOME/bin/hdfs namenode -format
# format ZK (run on cloud1):
$HADOOP_HOME/bin/hdfs zkfc -formatZK
# start HDFS (run on cloud1)
$HADOOP_HOME/sbin/start-dfs.sh
# start YARN (run on cloud1)
$HADOOP_HOME/sbin/start-yarn.sh
# start the Spark cluster
$SPARK_HOME/sbin/start-all.sh
$SPARK_HOME/sbin/start-master.sh # start the master node only
$SPARK_HOME/sbin/start-slaves.sh # start all worker nodes; start-slave.sh (without the s) starts only the local worker
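After each start command, jps shows which Java daemons are running on a node, one "PID Name" pair per line. The sketch below reports the expected names that are absent from that output; missing_daemons is a hypothetical helper name, and which daemons belong on which node depends on the configuration above.

```shell
# Read `jps` output ("PID Name" per line) on stdin and print each expected name that is absent.
# missing_daemons is a hypothetical helper; the expected names are passed as arguments.
missing_daemons() {
  md_seen=$(awk '{ print $2 }')
  for md_name in "$@"; do
    # -x matches whole lines only, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$md_seen" | grep -qxF -e "$md_name" || echo "$md_name"
  done
}

# Usage on a node, for example after starting ZooKeeper and the JournalNodes:
# jps | missing_daemons QuorumPeerMain JournalNode
```

An empty result means every expected process was found.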
After startup, open these addresses in a browser: Spark at cloud1:8080, YARN at cloud1:8088, HDFS at cloud1:50070.
Note: SSH must be running in all three containers (cloud1, cloud2 and cloud3), because the nodes communicate over it.
Make sure /etc/hosts in every container lists the IP addresses and names of all three nodes, and turn off the virtual machine's firewall.
Problems encountered during the build and their solutions:
Install curl: sudo apt-get install curl
If "Temporary failure resolving 'archive.ubuntu.com'" appears while installing curl, add the following lines to /etc/resolv.conf:
nameserver 202.96.134.133
nameserver 8.8.8.8
To reach the services through the virtual machine's own IP address, map the ports of the Docker container:
Add a port mapping (source: https://blog.csdn.net/hp_satan/article/details/77531794)
a. Get the container IP
docker inspect $container_name | grep IPAddress
b. Add a forwarding rule
iptables -t nat -A DOCKER -p tcp --dport $host_port -j DNAT --to-destination $docker_ip:$docker_port
Delete a port mapping rule
a. Get the rule number
iptables -t nat -nL --line-number
b. Delete the rule by number
iptables -t nat -D DOCKER $num
Example:
[root@localhost]# iptables -t nat -A DOCKER -p tcp --dport 8080 -j DNAT --to-destination 172.17.0.2:8080
[root@localhost]# iptables -t nat -A DOCKER -p tcp --dport 50070 -j DNAT --to-destination 172.17.0.2:50070
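The forwarding rule differs only in the port and the container address, so it can be generated for every web UI port at once. A sketch that only prints the commands for review rather than running them; dnat_rule is a hypothetical helper name.

```shell
# Print the iptables command that forwards host port $1 to $2:$3 in the DOCKER nat chain.
# dnat_rule is a hypothetical helper; it prints the command and does not change any rules.
dnat_rule() {
  printf 'iptables -t nat -A DOCKER -p tcp --dport %s -j DNAT --to-destination %s:%s\n' "$1" "$2" "$3"
}

# Usage: print the rules for the Spark, YARN and HDFS web UIs on cloud1.
# for port in 8080 8088 50070; do dnat_rule "$port" 172.17.0.2 "$port"; done
```

Once the printed commands look right, they can be run as root on the virtual machine.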
Linux process commands:
View processes:
1. ps lists the currently running processes; grep filters the output.
For example: ps -ef | grep java
This shows the processes whose command line (CMD) contains java.
2. ps aux | grep java
aux shows all processes of all users.
Terminate a process:
The kill command terminates a process.
For example: kill -9 [PID]
-9 forces the process to stop immediately.
Typically, find the PID with ps and then terminate the process with kill.
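The find-then-kill routine can be shortened by extracting the PID column directly. A sketch, assuming the ps -ef layout where the PID is the second column; pids_matching is a hypothetical helper name.

```shell
# Read `ps -ef` output on stdin and print the PID (column 2) of every line matching
# pattern $1, skipping the grep process itself. pids_matching is a hypothetical helper.
pids_matching() {
  awk -v pat="$1" '$0 ~ pat && $0 !~ /grep/ { print $2 }'
}

# Usage:
# ps -ef | pids_matching java
```

Review the printed PIDs before passing any of them to kill.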
Commands to turn off the firewall on CentOS 7:
systemctl stop firewalld.service # stop the firewall
systemctl disable firewalld.service # keep the firewall from starting at boot
firewall-cmd --state # show the firewall state (not running when off, running when on)
Docker network problems:
docker network ls lists the networks Docker uses.
docker network inspect bridge shows an empty Containers section while no container is running.
After docker run --name cloud1 -h cloud1 --add-host cloud1:172.17.0.2 -it ubuntu has been executed, cloud1 appears under Containers.
systemctl restart docker restarts the Docker service.
docker inspect container_name | grep IPAddress shows the IP address of the container.
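The grep above prints the whole matching lines, including empty and duplicate entries. A sketch that reduces the output to a single bare address for use in scripts; first_ip is a hypothetical helper name, and it assumes the "IPAddress": "x.x.x.x" layout that docker inspect prints.

```shell
# Read `docker inspect` output on stdin and print the first non-empty IPAddress value.
# first_ip is a hypothetical helper; it ignores keys such as SecondaryIPAddresses.
first_ip() {
  sed -n 's/.*"IPAddress": *"\([0-9.][0-9.]*\)".*/\1/p' | head -n 1
}

# Usage:
# docker inspect cloud1 | first_ip
```

docker inspect also accepts a Go template through -f, for example docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' cloud1, which avoids text parsing altogether.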
Containers created from Docker's Ubuntu image do not have the ifconfig and ping commands.
Solution:
apt-get update
apt install net-tools # ifconfig
apt install iputils-ping # ping
apt-get install vim # vim
Copy from the host to the container: docker cp host_path containerID:container_path
Copy from the container to the host: docker cp containerID:container_path host_path
Starting containers
Start a container with an interactive bash shell:
$ docker run -i -t image_name /bin/bash
Start a container in the background (the more common way):
$ docker run -d -it image_name
Note: image_name here includes the tag, for example hello.demo.kdemo:v1.0
docker start container_ID or container_name # this start is not interactive
docker run -d -p 80:12345 weba:v0.1 # weba:v0.1 is the image name; runs in the background and maps port 80 of the host to port 12345 in the container
docker run -d -p 80:12345 --name web weba:v0.1 # same as above, but names the container web
docker attach container_name or container_ID # generally not used in production; with some web services the session hangs without responding, presumably because the attached process blocks while listening
docker exec -it container_ID /bin/bash # if /bin/bash is not found, use docker exec -it container_ID sh
Attach to a running container:
docker attach container_name or container_ID
Open a shell inside a running container (better than attach):
docker exec -t -i container_ID /bin/bash