Big data learning roadmap: how to quickly get started with Spark


Updated: 2025-04-17. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

With the growth of the Internet, big data has become a buzzword, and almost every industry now works with it in some form. Spark is one of the most important frameworks in the big data ecosystem. Here is how to get started with Spark.

Apache Spark is one of the most widely used in-memory computing frameworks in the big data industry. This course focuses in particular on the features and applications of RDDs, which help in understanding Spark itself, its task submission process, and its caching mechanism.

Through this tutorial series, you can learn how to set up a Spark environment, how task scheduling works, and how to write RDD code.
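To give a feel for the RDD code this course builds toward, here is a minimal sketch of the classic word count. It is written in plain Python rather than Spark so that it runs anywhere. The helper names `flat_map` and `reduce_by_key` are hypothetical stand-ins that only mimic the semantics of the RDD operators flatMap and reduceByKey; they are not Spark's API, and nothing here is distributed.

```python
from collections import defaultdict
from functools import reduce


def flat_map(func, items):
    """Apply func to each item and flatten the results (mimics RDD flatMap)."""
    return [out for item in items for out in func(item)]


def reduce_by_key(func, pairs):
    """Merge all values that share a key using func (mimics RDD reduceByKey)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(func, values) for key, values in groups.items()}


def word_count(lines):
    """Word count expressed as a flatMap -> map -> reduceByKey pipeline."""
    words = flat_map(lambda line: line.split(), lines)
    pairs = [(word, 1) for word in words]  # the "map" step: word -> (word, 1)
    return reduce_by_key(lambda a, b: a + b, pairs)


if __name__ == "__main__":
    sample = ["spark is fast", "spark is in memory"]  # hypothetical input
    print(word_count(sample))
```

In real Spark the same three steps run in parallel across partitions of the data; the sketch only shows the shape of the pipeline.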

Course catalogue:

Chapter 1 Spark fundamentals

01_Why learn Spark

02_Comparison of Spark and MapReduce

03_Spark framework architecture

04_Spark download

05_Introduction to Spark run modes

06_Spark cluster installation

07_Spark program execution flow

08_Explanation of Spark terminology

09_SparkShellLocal

10_SparkShellCluster

11_Comparison of the Spark 2.2 and Spark 1.6 shells

Chapter 2 Maven and IDEA

12_Maven and IDEA downloads

13_Maven installation

14_IDEA installation

15_Configure Maven in IDEA

16_Install the Scala environment and configure the Scala plug-in in IDEA

17_Create a Spark project in IDEA

18_Developing a WordCount program with Spark

19_Spark program packaging

20_Running the packaged program on a Spark cluster

Chapter 3 RDD fundamentals

21_RDD concept

22_RDD execution process

23_RDD properties

24_RDD resilience

25_Two ways to create an RDD

26_RDD programming API

Chapter 4 Transformation operators

27_Transformation operators

28_Action operators

29_map

30_filter

31_flatMap

32_sample

33_union

34_intersection

35_distinct

36_join

37_leftOuterJoin

38_rightOuterJoin

39_cartesian

40_groupBy

41_mapPartitions

42_mapPartitionsWithIndex

43_sortBy

44_sortByKey

45_repartition

46_coalesce

47_partitionBy

48_repartitionAndSortWithinPartitions

49_reduce

50_reduceByKey

51_aggregateByKey

52_combineByKey

Chapter 5 Action operators

53_collect

54_count

55_top

56_take

57_takeOrdered

58_first

59_saveAsTextFile

60_foreach

61_Other operators: countByKey

62_Other operators: countByValue

63_Other operators: filterByRange

64_Other operators: flatMapValues

65_Other operators: foreachPartition

66_Other operators: keyBy

67_Other operators: keys and values

68_Other operators: collectAsMap

69_Passing functions to RDDs

70_RDD dependencies

71_RDD task division

72_Lineage

73_RDD caching (persistence)

73_RDD caching (persistence)
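The catalogue separates transformation operators from action operators and ends with lineage and caching. These topics connect: in Spark, transformations are lazy and only record how a dataset is derived, actions trigger the actual computation, and caching keeps a computed result so later actions do not recompute it. The following is a toy, single-machine sketch of that idea in plain Python. The `ToyRDD` class and its `compute_calls` counter are hypothetical illustrations, not Spark's API or implementation.

```python
class ToyRDD:
    """A toy stand-in for an RDD (hypothetical; not Spark's API)."""

    def __init__(self, compute, parent=None, name="source"):
        self._compute = compute       # function that produces this dataset's elements
        self.parent = parent          # the dataset this one was derived from
        self.name = name
        self._cache_enabled = False
        self._cached = None
        self.compute_calls = 0        # how many times this dataset was (re)computed

    # Transformations: return a new ToyRDD and do no work yet.
    def map(self, func):
        return ToyRDD(lambda: [func(x) for x in self._materialize()], self, "map")

    def filter(self, pred):
        return ToyRDD(lambda: [x for x in self._materialize() if pred(x)], self, "filter")

    def cache(self):
        """Mark this dataset to be kept in memory after its first computation."""
        self._cache_enabled = True
        return self

    def _materialize(self):
        if self._cached is not None:
            return self._cached       # reuse the cached result
        self.compute_calls += 1
        data = self._compute()
        if self._cache_enabled:
            self._cached = data
        return data

    # Actions: force the computation and return a result.
    def collect(self):
        return list(self._materialize())

    def count(self):
        return len(self._materialize())

    def lineage(self):
        """Describe the chain of parents this dataset was derived from."""
        node, chain = self, []
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return " <- ".join(chain)


if __name__ == "__main__":
    numbers = ToyRDD(lambda: [1, 2, 3, 4, 5, 6])
    result = numbers.map(lambda x: x * 10).filter(lambda x: x % 20 == 0)
    print(result.lineage())         # the recorded derivation chain
    print(result.compute_calls)     # nothing has been computed yet
    print(result.collect())         # the action triggers the computation
```

Without `cache()`, every action on `result` walks the lineage and recomputes from the source; calling `cache()` on an intermediate dataset trades memory for avoiding that repeated work.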

