This lesson demonstrates two of the most important RDD operators, join and cogroup, through code practice.
Join operator code practice:
// Demonstrate the join operator in code.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("RDDDemo").setMaster("local")
val sc = new SparkContext(conf)
val arr1 = Array(Tuple2(1, "Spark"), Tuple2(2, "Hadoop"), Tuple2(3, "Tachyon"))
val arr2 = Array(Tuple2(1, 100), Tuple2(2, 70), Tuple2(3, 90))
val rdd1 = sc.parallelize(arr1)
val rdd2 = sc.parallelize(arr2)
val rdd3 = rdd1.join(rdd2) // inner join on the key
rdd3.collect().foreach(println)
Running result:
(1, (Spark,100))
(3, (Tachyon,90))
(2, (Hadoop,70))
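Note that join behaves as an inner join on the key: a key that appears in only one of the two RDDs is dropped from the result. A minimal sketch of that behavior, assuming the same SparkConf/SparkContext setup as above (the keys 4 and 5 and the "Storm" value are hypothetical, added only for illustration):

// Key 4 exists only on the left side, key 5 only on the right.
val left = sc.parallelize(Array((1, "Spark"), (4, "Storm")))
val right = sc.parallelize(Array((1, 100), (5, 60)))
left.join(right).collect().foreach(println)
// Prints only (1,(Spark,100)); the unmatched keys 4 and 5 are dropped.

If unmatched keys must be kept, leftOuterJoin and rightOuterJoin return Option-wrapped values for the missing side instead of dropping the key.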
Cogroup operator code practice:
First, the Java version:
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;

import scala.Tuple2;

SparkConf conf = new SparkConf().setMaster("local").setAppName("Cogroup");
JavaSparkContext sc = new JavaSparkContext(conf);
List<Tuple2<Integer, String>> nameList = Arrays.asList(
        new Tuple2<Integer, String>(1, "Spark"),
        new Tuple2<Integer, String>(2, "Tachyon"),
        new Tuple2<Integer, String>(3, "Hadoop"));
List<Tuple2<Integer, Integer>> scoreList = Arrays.asList(
        new Tuple2<Integer, Integer>(1, 100),
        new Tuple2<Integer, Integer>(2, 95),
        new Tuple2<Integer, Integer>(3, 80),
        new Tuple2<Integer, Integer>(1, 80),
        new Tuple2<Integer, Integer>(2, 110),
        new Tuple2<Integer, Integer>(2, 90));
JavaPairRDD<Integer, String> names = sc.parallelizePairs(nameList);
JavaPairRDD<Integer, Integer> scores = sc.parallelizePairs(scoreList);
// cogroup collects, for each key, all values from both RDDs.
JavaPairRDD<Integer, Tuple2<Iterable<String>, Iterable<Integer>>> nameAndScores = names.cogroup(scores);
nameAndScores.foreach(new VoidFunction<Tuple2<Integer, Tuple2<Iterable<String>, Iterable<Integer>>>>() {
    public void call(Tuple2<Integer, Tuple2<Iterable<String>, Iterable<Integer>>> t) throws Exception {
        System.out.println("ID:" + t._1());
        System.out.println("Name:" + t._2()._1());
        System.out.println("Score:" + t._2()._2());
    }
});
sc.close();
Running result:
ID:1
Name: [Spark]
Score: [100, 80]
ID:3
Name: [Hadoop]
Score: [80]
ID:2
Name: [Tachyon]
Score: [95, 110, 90]
And now in Scala:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("RDDDemo").setMaster("local")
val sc = new SparkContext(conf)
val arr1 = Array(Tuple2(1, "Spark"), Tuple2(2, "Hadoop"), Tuple2(3, "Tachyon"))
val arr2 = Array(Tuple2(1, 100), Tuple2(2, 70), Tuple2(3, 90), Tuple2(1, 95), Tuple2(2, 65), Tuple2(1, 110))
val rdd1 = sc.parallelize(arr1)
val rdd2 = sc.parallelize(arr2)
val rdd3 = rdd1.cogroup(rdd2) // groups all values for each key from both RDDs
rdd3.collect().foreach(println)
sc.stop()
Running result:
(1,(CompactBuffer(Spark),CompactBuffer(100, 95, 110)))
(3,(CompactBuffer(Tachyon),CompactBuffer(90)))
(2,(CompactBuffer(Hadoop),CompactBuffer(70, 65)))
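The two operators are closely related: a join can be viewed as a cogroup followed by flattening, emitting one output pair per combination of left and right values for each key. A rough sketch of that equivalence, reusing rdd1 and rdd2 from the snippet above (before sc.stop() is called); it illustrates the semantics rather than claiming to be Spark's exact implementation:

// join expressed through cogroup: for each key, pair every left value
// with every right value; a key whose buffer is empty on either side
// contributes nothing, which is exactly the inner-join behavior.
val joinedViaCogroup = rdd1.cogroup(rdd2).flatMapValues {
  case (names, scores) => for (n <- names; s <- scores) yield (n, s)
}
joinedViaCogroup.collect().foreach(println)
// Yields the same pairs as rdd1.join(rdd2)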
Note:
Source: DT_ Big Data DreamWorks (customized Spark distribution).
For more exclusive content, follow the WeChat official account: DT_Spark.
If you are interested in big data and Spark, you can listen to the free Spark open course offered by teacher Wang Jialin at 20:00 every evening, in YY room 68917580.