Prerequisites: JDK 1.8, passwordless SSH between the nodes, ZooKeeper, and Hadoop already installed.
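Before going further, it may be worth confirming the prerequisites on each node. A minimal check sketch, assuming the JDK, ZooKeeper and Hadoop binaries from the earlier setup are already on the PATH:
java -version          #should report 1.8.x
ssh slave1 hostname    #should return without prompting for a password
zkServer.sh status     #ZooKeeper should report leader or follower
hadoop version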
Server list
Host    IP
master  192.168.3.58
slave1  192.168.3.54
slave2  192.168.3.31
Daemons distributed across the three nodes: QuorumPeerMain (ZooKeeper), NameNode, DataNode, JournalNode, DFSZKFailoverController (HDFS HA), plus the Spark Master and Worker processes.
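The steps below refer to the nodes by hostname, so every node must be able to resolve master, slave1 and slave2. A minimal /etc/hosts sketch, assuming the IPs in the table above:
192.168.3.58 master
192.168.3.54 slave1
192.168.3.31 slave2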
Scala
What is Scala? (from Baidu Encyclopedia)
Scala is a multi-paradigm, Java-like programming language designed to be a scalable language, integrating features of object-oriented and functional programming.
Download and install Scala
Official download address: www.scala-lang.org/download/
Download
cd /data
wget https://downloads.lightbend.com/scala/2.12.4/scala-2.12.4.tgz
tar axf scala-2.12.4.tgz
Add environment variables
vim /etc/profile
#scala
export SCALA_HOME=/data/scala-2.12.4
export PATH=$PATH:${SCALA_HOME}/bin
source /etc/profile
Verify the installation
scala -version
If version information is displayed, the installation was successful.
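Beyond the version check, a one-line expression can confirm the REPL itself works (a hedged sketch, assuming scala is now on the PATH):
scala -e 'println("scala ok")'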
What is Spark?
The official description: "Apache Spark is a fast and general engine for large-scale data processing." In other words, Spark is a fast, general-purpose computing engine designed for large-scale data processing.
From Baidu Encyclopedia:
Spark is an open-source cluster computing environment similar to Hadoop, but with some useful differences that make it superior for certain workloads: in addition to providing interactive queries, Spark supports in-memory distributed datasets, which optimizes iterative workloads.
Spark is implemented in Scala and uses Scala as its application framework. Unlike Hadoop, Spark is tightly integrated with Scala, which makes it possible to manipulate distributed datasets as easily as native collection objects.
Download and install Spark
Official website: spark.apache.org/
Following the advice of those who went before, I use a version one or two releases behind the latest.
cd /data
wget https://archive.apache.org/dist/spark/spark-2.1.2/spark-2.1.2-bin-hadoop2.7.tgz
tar axf spark-2.1.2-bin-hadoop2.7.tgz
Add environment variables
vim /etc/profile
#spark
export SPARK_HOME=/data/spark-2.1.2-bin-hadoop2.7
export PATH=$PATH:${SPARK_HOME}/bin
source /etc/profile
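Before touching the configuration, a quick hedged check that the Spark binaries are now on the PATH:
spark-submit --version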
Modify the configuration files
cd ${SPARK_HOME}/conf
cp fairscheduler.xml.template fairscheduler.xml
cp log4j.properties.template log4j.properties
cp slaves.template slaves
cp spark-env.sh.template spark-env.sh
cp spark-defaults.conf.template spark-defaults.conf
vim slaves
#Delete localhost, add worker node information
master
slave1
slave2
vim spark-env.sh
#Add the following information: JAVA_HOME, SCALA_HOME, SPARK_MASTER_IP, SPARK_WORKER_MEMORY, HADOOP_CONF_DIR
export JAVA_HOME=/usr/local/jdk
export SCALA_HOME=/data/scala-2.12.4
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/data/hadoop/etc/hadoop/
vim spark-defaults.conf
spark.master spark://master:7077
Copy files to other nodes
scp -r /data/scala-2.12.4 slave1:/data
scp -r /data/scala-2.12.4 slave2:/data
scp -r /data/spark-2.1.2-bin-hadoop2.7 slave1:/data
scp -r /data/spark-2.1.2-bin-hadoop2.7 slave2:/data
scp -r /etc/profile slave1:/etc
scp -r /etc/profile slave2:/etc
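Because non-interactive ssh sessions do not read /etc/profile automatically, the copied environment can be verified on the slave nodes by sourcing it explicitly (a hedged sketch):
ssh slave1 'source /etc/profile; scala -version; spark-submit --version'
ssh slave2 'source /etc/profile; scala -version; spark-submit --version'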
Start cluster
cd ${SPARK_HOME}/sbin
./start-all.sh
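After start-all.sh, a small job can be submitted to the standalone master as a smoke test. The examples jar name below is an assumption based on the usual layout of the spark-2.1.2-bin-hadoop2.7 distribution:
${SPARK_HOME}/bin/spark-submit --master spark://master:7077 --class org.apache.spark.examples.SparkPi ${SPARK_HOME}/examples/jars/spark-examples_2.11-2.1.2.jar 100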
Starting the master daemon individually
cd ${SPARK_HOME}/sbin
./start-master.sh
Starting a worker (slave) daemon individually
./start-slave.sh spark://master:7077
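Whichever way the daemons are started, jps (shipped with the JDK) shows which of them are running on a node:
jps
#with the slaves file above, master should show Master and Worker; slave1/slave2 should show Worker (plus any Hadoop and ZooKeeper daemons)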
Multi-master, HA implementation
Make the following changes on all nodes.
Modify spark-defaults.conf
vim spark-defaults.conf
spark.master spark://master:7077,slave1:7077,slave2:7077
Modify spark-env.sh to point master recovery at the ZooKeeper cluster (note that spark.deploy.zookeeper.dir is a znode path inside ZooKeeper, not a local filesystem directory)
vim spark-env.sh
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=master:2181,slave1:2181,slave2:2181 -Dspark.deploy.zookeeper.dir=/data/spark-2.1.2-bin-hadoop2.7"
Start cluster
master node
cd ${SPARK_HOME}/sbin
./start-all.sh
slave1 node
cd ${SPARK_HOME}/sbin
./start-master.sh
slave2 node
cd ${SPARK_HOME}/sbin
./start-master.sh
View status
Open IP:8080 in a browser; the active master's web UI shows Status: ALIVE, while the standby masters show Status: STANDBY.
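The same check can be scripted; a hedged sketch that greps each master's web UI for its reported status (the exact page wording is an assumption):
for h in master slave1 slave2; do
  echo -n "$h: "
  curl -s http://$h:8080 | grep -o 'ALIVE\|STANDBY' | head -1
done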
Failover test
Kill the Master process on the master node and confirm that one of the standby masters takes over.
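A hedged sketch of the test, assuming the JDK's jps tool is available on the master node:
jps | grep -w Master                                #find the PID of the Spark Master JVM
kill -9 $(jps | grep -w Master | awk '{print $1}')
#refresh the 8080 UI on slave1/slave2: one standby should switch to ALIVE after a short delay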
Reference: www.cnblogs.com/liugh/p/6624923.html