2025-03-10 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/03 Report
As the Internet has developed, big data has become the latest "Internet celebrity", and almost every industry now works with big data in some way. Spark is one of the most important frameworks in the big data field. Here is how to get started with Spark.
Apache Spark is among the most widely used in-memory computing frameworks in the big data industry. This course focuses on the features and applications of RDDs, which help in understanding Spark, its task submission process, and its caching mechanism.
Through these tutorials, you can learn how to set up a Spark environment, how tasks are scheduled, and how to write RDD code.
Course catalogue:
Chapter 1 Spark concepts explained
01_Why learn Spark
02_Comparison of Spark and MapReduce
03_Spark framework architecture
04_Spark download
05_Introduction to Spark run modes
06_Spark cluster installation
07_Spark program execution flow
08_Explanation of Spark-related terms
09_SparkShellLocal
10_SparkShellCluster
11_Comparison between the Spark 2.2 and Spark 1.6 shells
Chapter 2 Maven and IDEA
12_Maven and IDEA downloads
13_Maven installation
14_IDEA installation
15_Configure Maven in IDEA
16_Install the Scala environment and configure the Scala plug-in in IDEA
17_Create a Spark project in IDEA
18_Developing a WordCount program with Spark
19_Spark program packaging
20_Run the packaged program on the Spark cluster
Chapter 3 RDD concepts explained
21_RDD concept
22_RDD execution process
23_RDD properties
24_RDD resilience
25_Two ways to create an RDD
26_RDD programming API
Chapter 4 Transformation operators
27_Transformation operators
28_Action operators
29_map
30_filter
31_flatMap
32_sample
33_union
34_intersection
35_distinct
36_join
37_leftOuterJoin
38_rightOuterJoin
39_cartesian
40_groupBy
41_mapPartitions
42_mapPartitionsWithIndex
43_sortBy
44_sortByKey
45_repartition
46_coalesce
47_partitionBy
48_repartitionAndSortWithinPartitions
49_reduce
50_reduceByKey
51_aggregateByKey
52_combineByKey
Chapter 5 Action operators
53_collect
54_count
55_top
56_take
57_takeOrdered
58_first
59_saveAsTextFile
60_foreach
61_Other operators: countByKey
62_Other operators: countByValue
63_Other operators: filterByRange
64_Other operators: flatMapValues
65_Other operators: foreachPartition
66_Other operators: keyBy
67_Other operators: keys and values
68_Other operators: collectAsMap
69_Passing functions to RDD operations
70_RDD dependencies
71_RDD task division
72_Lineage
73_RDD caching (persistence)
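The WordCount program from lesson 18 is a common first exercise for the operators listed in Chapters 4 and 5. The following is a minimal sketch in plain Python, not Spark code: the helper names `flat_map`, `reduce_by_key`, and `word_count` are hypothetical and only model, on a single machine, what a flatMap, map, reduceByKey chain computes. In real Spark the same steps would run on an RDD distributed across a cluster.

```python
from itertools import chain


def flat_map(func, records):
    """Model of flatMap: apply func to each record and flatten the results lazily."""
    return chain.from_iterable(map(func, records))


def reduce_by_key(func, pairs):
    """Model of reduceByKey: merge all values that share a key using func."""
    merged = {}
    for key, value in pairs:
        merged[key] = func(merged[key], value) if key in merged else value
    return merged


def word_count(lines):
    """Hypothetical helper chaining the three steps of a WordCount job."""
    words = flat_map(str.split, lines)        # flatMap step: split lines on whitespace
    pairs = ((word, 1) for word in words)     # map step: pair each word with a count of 1
    return reduce_by_key(lambda a, b: a + b, pairs)  # reduceByKey step: sum counts per word


# Sample input invented for illustration only.
lines = ["spark is fast", "spark is general"]
counts = word_count(lines)
print(sorted(counts.items()))
```

The generators in `flat_map` and `pairs` do no work until `reduce_by_key` consumes them, which loosely mirrors how Spark transformations stay lazy until an action such as collect runs.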