2025-02-28 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
RDDs support two types of operations:
(1) Transformations, which create a new RDD from an existing RDD.
(2) Actions, which run a computation on an RDD and return the result to the driver.
Transformations are lazy: converting one RDD into another is not performed immediately, and the computation is only triggered when an action is invoked.
An action triggers Spark to submit a job and moves data out of the Spark system, either back to the driver or to external storage.
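The laziness described above can be illustrated with a toy model in plain Python. This is only a sketch: ToyRDD, double and calls are hypothetical names invented for the illustration, not part of Spark's API, and the real scheduler is far more involved.

```python
# Toy model of lazy evaluation; ToyRDD is hypothetical and NOT Spark's API.
class ToyRDD:
    def __init__(self, source, steps=()):
        self._source = source  # the original data
        self._steps = steps    # recorded transformations, not yet executed

    def map(self, fn):
        # Transformation: only record fn and return a new ToyRDD.
        return ToyRDD(self._source, self._steps + (fn,))

    def collect(self):
        # Action: the recorded functions are applied only now.
        result = []
        for item in self._source:
            for fn in self._steps:
                item = fn(item)
            result.append(item)
        return result


calls = []  # every real invocation of double() is logged here


def double(x):
    calls.append(x)
    return x * 2


rdd = ToyRDD([1, 2, 3])
doubled = rdd.map(double)          # transformation: nothing is computed yet
calls_before_action = list(calls)  # snapshot of the log before any action
result = doubled.collect()         # action: triggers the computation
```

In this sketch calls_before_action stays empty, because map only records the function; double runs for the first time inside collect.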
At a finer granularity, Spark operators fall into the following three categories:
(1) Transformation operators on Value-type data. These do not trigger job submission, and the data items they process are plain values.
(2) Transformation operators on Key-Value data. These do not trigger job submission either, and the data items they process are Key-Value pairs.
(3) Action operators, which trigger SparkContext to submit a job.
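The three categories can be sketched in plain Python, using generators to stand in for lazy evaluation. The helper names map_values, reduce_by_key and collect are hypothetical and chosen for this illustration; in Spark the corresponding operators would be map, reduceByKey and collect, whose implementations are distributed and differ from this single-process sketch.

```python
def map_values(items, fn):
    # Value-type transformation (sketch): lazily applies fn to each plain value.
    return (fn(x) for x in items)


def reduce_by_key(pairs, fn):
    # Key-Value transformation (sketch): merges the values that share a key.
    # generate() is a generator function, so nothing runs until it is consumed.
    def generate():
        merged = {}
        for key, value in pairs:
            merged[key] = fn(merged[key], value) if key in merged else value
        yield from merged.items()

    return generate()


def collect(items):
    # Action (sketch): forces the whole pipeline and returns a concrete list.
    return list(items)


words = ["spark", "rdd", "spark", "job", "rdd", "spark"]
pairs = map_values(words, lambda w: (w, 1))        # value -> (key, value) pair
counts = reduce_by_key(pairs, lambda a, b: a + b)  # still lazy
word_counts = collect(counts)                      # computation happens here
```

Only the last line does any work: the first two steps merely build up the pipeline, which mirrors how the two transformation categories defer to an action.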
For example, map is a transformation: it applies a function to each element of an RDD and produces a new RDD. reduce is an action: it aggregates all the elements of an RDD with a function and passes the final result to the driver.
All RDD transformations are lazy. Spark does not compute results immediately; it records the transformations applied to the dataset and only computes them when an action is encountered. This design makes Spark more efficient. For example, if a reduce follows a map on a dataset, only the result of the reduce is returned to the driver, rather than the much larger mapped dataset.
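The map-then-reduce case can be mimicked in plain Python as a rough analogy. The sample lines and variable names below are made up for the illustration, and a generator expression stands in for the lazy map; this is not Spark code.

```python
from functools import reduce

# Hypothetical sample data standing in for the contents of an RDD.
lines = ["a transformation is lazy", "an action triggers a job", "spark"]

# "map" step: a lazy generator of per-line lengths; nothing is computed yet.
lengths = (len(line) for line in lines)

# "reduce" step: consumes the generator and produces one aggregated value.
total_length = reduce(lambda a, b: a + b, lengths)
```

The intermediate lengths are never materialized as a full collection; only the single value total_length comes out of the pipeline, which is the point of returning the reduce result rather than the mapped data.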
1.1 Transformation
A transformation returns a new RDD, and there are many ways to obtain one, such as creating an RDD from a data source or deriving a new RDD from an existing one. All transformations are lazy: submitting a transformation alone does not execute it.
For more information, please see: http://spark.apache.org/docs/latest/rdd-programming-guide.html
1.2 Action
An action produces a value or a result. The computation is triggered only when an action is submitted.