2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
RDD features:
1. RDD is the core abstraction provided by Spark. The full name is Resilient Distributed Dataset.
2. Abstractly, an RDD is a collection of data elements. It is split into multiple partitions, each placed on a different node in the cluster, so the data in the RDD can be operated on in parallel (hence "distributed dataset").
3. An RDD is usually created from files on Hadoop, such as files in HDFS. It can also be created from a collection in the application.
4. The most important feature of RDD is fault tolerance: it can recover from node failures automatically. If a partition of an RDD is lost because its node fails, the RDD automatically recomputes that partition from its data source.
5. By default, each partition of an RDD is kept in memory on a Spark node. If memory cannot hold all the data, part of the partition data can be written to disk, depending on the configured storage level. This is transparent to users, who do not need to know whether the data is held in memory or on disk. This automatic switching between memory and disk is the "resilient" part of RDD.
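The recovery behaviour described in the fault-tolerance item above can be pictured with a toy model in plain Python. This is only a sketch under stated assumptions, not Spark's implementation or API: `ToyRDD`, `get_partition`, and `lose_partition` are hypothetical names, and the "data source" is an ordinary Python list.

```python
from typing import Callable, Dict, List


class ToyRDD:
    """Toy stand-in for an RDD: cached partitions plus a recipe to rebuild them."""

    def __init__(self, source: List[int], num_partitions: int,
                 transform: Callable[[int], int]) -> None:
        self.source = source                    # hypothetical data source
        self.num_partitions = num_partitions
        self.transform = transform              # the recorded computation
        self.cache: Dict[int, List[int]] = {}   # partition index -> computed data

    def _compute(self, index: int) -> List[int]:
        # Partition `index` takes every num_partitions-th element of the source.
        chunk = self.source[index::self.num_partitions]
        return [self.transform(x) for x in chunk]

    def get_partition(self, index: int) -> List[int]:
        # Recompute from the source whenever the cached copy is missing.
        if index not in self.cache:
            self.cache[index] = self._compute(index)
        return self.cache[index]

    def lose_partition(self, index: int) -> None:
        # Simulate a node failure that wipes one partition.
        self.cache.pop(index, None)


rdd = ToyRDD(source=list(range(10)), num_partitions=3, transform=lambda x: x * 2)
before = rdd.get_partition(1)
rdd.lose_partition(1)
after = rdd.get_partition(1)   # rebuilt from the source, not restored from a backup
```

The point of the sketch is that nothing is replicated: the lost partition is rebuilt only because the dataset remembers where its data came from and which computation produced it.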
Logically, an RDD can represent an HDFS file as a single abstraction. In practice it is partitioned, and its partitions are spread across different nodes of the Spark cluster.
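To make the partitioning idea concrete, here is a small plain-Python sketch. It is an illustration only: `split_into_partitions`, `assign_to_nodes`, and the node names are hypothetical, and real Spark decides partition boundaries and placement itself.

```python
from typing import Dict, List


def split_into_partitions(lines: List[str], num_partitions: int) -> List[List[str]]:
    """Cut one logical dataset into contiguous partitions of near-equal size."""
    size, extra = divmod(len(lines), num_partitions)
    partitions: List[List[str]] = []
    start = 0
    for i in range(num_partitions):
        # The first `extra` partitions each take one additional line.
        end = start + size + (1 if i < extra else 0)
        partitions.append(lines[start:end])
        start = end
    return partitions


def assign_to_nodes(partitions: List[List[str]], nodes: List[str]) -> Dict[str, List[int]]:
    """Place partition indexes on nodes round-robin (an assumed placement policy)."""
    placement: Dict[str, List[int]] = {node: [] for node in nodes}
    for i, _ in enumerate(partitions):
        placement[nodes[i % len(nodes)]].append(i)
    return placement


file_lines = [f"line-{i}" for i in range(7)]          # stands in for one file
partitions = split_into_partitions(file_lines, 3)
placement = assign_to_nodes(partitions, ["node-a", "node-b"])

# Each partition can be processed independently (here: counted), then combined.
total = sum(len(p) for p in partitions)
```

The caller still sees one dataset of seven lines, while the work is divided into three independent pieces that different nodes could process at the same time.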
The core programming model of Spark:
First: define the initial RDD, that is, define where the input data comes from.
Second: define the computations on the RDD, which are called operators in Spark.
Third: after the first round of computation completes, the data moves to a new set of nodes and becomes a new RDD. You then define operators on the new RDD, repeating this as many times as needed.
Fourth: obtain the final data and save it.
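The four steps above can be sketched as a word count in plain Python. This is an analogy rather than Spark code: ordinary lists and a dictionary stand in for RDDs, the input lines are made up for illustration, and the comments only name the Spark operators (`flatMap`, `map`, `reduceByKey`) that play the corresponding role.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Step 1: define the initial dataset (a hard-coded list stands in for the input source).
lines: List[str] = ["spark makes rdd", "rdd is resilient", "spark is fast"]

# Step 2: apply operators; each one produces a new dataset and leaves its input unchanged.
words: List[str] = [w for line in lines for w in line.split()]   # role of flatMap
pairs: List[Tuple[str, int]] = [(w, 1) for w in words]            # role of map

# Step 3: apply further operators to the new dataset.
counts: Dict[str, int] = defaultdict(int)
for word, n in pairs:                                              # role of reduceByKey
    counts[word] += n

# Step 4: obtain the final result and prepare it for saving (here: tab-separated lines).
output_lines = [f"{word}\t{count}" for word, count in sorted(counts.items())]
```

Each intermediate name (`words`, `pairs`, `counts`) corresponds to the "new RDD" of the third step: a transformation never modifies its input, it yields a new dataset for the next operator.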