Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to run the Spark Streaming flow Computing Framework

2025-04-07 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly explains "how the Spark Streaming flow computing framework works". The content of the article is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought to study and learn how to run the Spark Streaming stream computing framework.

Post a case first

Import org.apache.spark.SparkConfimport org.apache.spark.streaming. {Durations, StreamingContext} object StreamingWordCountSelfScala {def main (args: Array [String]) {val sparkConf = new SparkConf (). SetMaster ("spark://master:7077"). SetAppName ("StreamingWordCountSelfScala") val ssc = new StreamingContext (sparkConf, Durations.seconds (5)) / / harvest data val lines = ssc.socketTextStream ("localhost") every 5 seconds 9999) / / listen to the local 9999 socket port val words = lines.flatMap (_ .split (")). Map ((_, 1)) .reduceByKey (_ + _) / / flat map after reduce words.print () / / print the result ssc.start () / / start ssc.awaitTermination () ssc.stop (true)}}

Let's go back to the trigger process.

The timer periodically triggers the execution of a method. Here is longTime = > eventLoop.post (GenerateJobs (new Time (longTime), which sends an event message of type GenerateJobs to the queue of eventLoop.

/ / JobGenerator.scala line 58 privateval timer = new RecurringTimer (clock, ssc.graph.batchDuration.milliseconds, longTime = > eventLoop.post (GenerateJobs (new Time (longTime), "JobGenerator")

Another convenience is that eventLoop always loops out event messages in the queue when fetching event messages of type GenerateJobs. OnReceive (event) is called.

/ / EventLoop.scala line 48 onReceive (event)

The onReceive (event) at this time is already override when JobGenerator instantiates eventLoop.

/ / JobGenerator.scala line 87 override protected def onReceive (event: JobGeneratorEvent): Unit = processEvent (event)

Call generatorJobs (time)

/ / JobGenerator.scala line 181case GenerateJobs (time) = > generateJobs (time)

Graph.generateJobs

/ / JobGenerator.scala line 248graph.generateJobs (time)

Restore the entire dependency of RDD through outputStream.generateJob and create a Job. This outputStream is ForEachDStream.

/ / DStreamGraph.scala line 115val jobOption = outputStream.generateJob (time) in this case, according to SocketInputDStream

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Servers

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report