2025-01-19 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/02 Report
Official documentation: spark.apache.org/docs/latest
Spark background
Limitations of MapReduce:
1) Complex to program
Everything must be expressed as Map and Reduce (a map-side join has no reduce phase at all)
Low-level API
Constrained programming model
Every code change requires repackaging, redeploying, and retesting.
2) Low execution efficiency
Processes: each MapTask/ReduceTask runs as a separate JVM process (JVM reuse only partially mitigates this)
IO: chained jobs pay network + disk IO between every stage
Sort: keys are always sorted (interview question: which interface must the key type implement? WritableComparable)
Memory:
...
Not suitable for iterative processing
Not suitable for real-time stream processing
Each workload needs its own separate framework; the frameworks do not integrate.
Overview and characteristics of Spark
spark.apache.org
Speed
Memory-based computation
Thread-based execution model (vs. MapReduce's per-task processes)
Sort only when needed (configurable)
DAG execution, e.g. rdd.map(...).filter(...).collect()
Ease of use
High-level operators: join, group, count, ...
Generality
Runs Everywhere
Summary:
Fast + general engine
Write code: Java/Scala/Python/R, or the interactive shell
Run: memory / DAG / thread model / ...
For version background and selection criteria, see the official documentation.
How to learn Spark
Mailing list: user@spark.apache.org (apache-spark-user-list archive)
Meetups / Summits
Sample source code: github.com/apache/spark
Source code
Environment:
CentOS 6
Hosts: hadoop000 (hadoop), hadoop001, hadoop002
Directory layout:
app: installed software
software: software tarballs
data: test data
lib: our own jars
source: source code
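The layout above can be created in one step; a minimal sketch, assuming the per-user base directory is $HOME (the article uses /home/hadoop):

```shell
# Create the directory layout described above. BASE defaults to $HOME;
# override it if your software lives elsewhere.
BASE="${BASE:-$HOME}"
mkdir -p "$BASE/app" "$BASE/software" "$BASE/data" "$BASE/lib" "$BASE/source"
```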
Spark installation
Download the source code from the official website and extract it.
Prerequisites for compiling the Spark source:
Java 8+, Python 2.7+/3.4+; Spark 2.3.0 is built against Scala 2.11.x
Install the JDK
Apache Maven installation
Extract, then configure ~/.bash_profile:
export MAVEN_HOME=/home/hadoop/app/apache-maven-3.3.9
export PATH=$MAVEN_HOME/bin:$PATH
Suggestion: change the Maven local repository location in $MAVEN_HOME/conf/settings.xml to
/home/hadoop/maven_repo
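The ~/.bash_profile additions can be sketched as below; the paths follow the article's layout (adjust to where you actually extracted the tarball), and the local-repository override is the `<localRepository>` element of settings.xml:

```shell
# Maven environment (paths assume the article's ~/app layout).
export MAVEN_HOME="$HOME/app/apache-maven-3.3.9"
export PATH="$MAVEN_HOME/bin:$PATH"

# The repository override goes into $MAVEN_HOME/conf/settings.xml as a
# <localRepository> element; the line is generated here for illustration:
mkdir -p "$HOME/maven_repo"
LOCAL_REPO_XML="<localRepository>$HOME/maven_repo</localRepository>"
echo "$LOCAL_REPO_XML"
```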
Install scala-2.11.9.tgz
Extract, then configure ~/.bash_profile:
export SCALA_HOME=/home/hadoop/app/scala-2.11.9
export PATH=$SCALA_HOME/bin:$PATH
source ~/.bash_profile
Verify: mvn -v (and scala -version for Scala)
Install git: yum install git
Compilation and installation
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
./build/mvn -DskipTests clean package
To change the default Hadoop version used by the build, edit the source's pom.xml.
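As an alternative to editing pom.xml, the Spark build accepts the Hadoop version on the command line. A sketch of the full build invocation; the profile and version values (-Phadoop-2.7, hadoop.version=2.7.3) are illustrative examples from the Spark build documentation, not from the article, and the command is meant to run from the Spark source root:

```shell
# Memory settings required for the Spark build (from the article above).
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

# Hypothetical Hadoop profile/version; adjust to your cluster.
BUILD_CMD="./build/mvn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package"

# Shown here rather than executed, since it needs a Spark source checkout:
echo "$BUILD_CMD"
```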