2025-01-21 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Among big data tools, the best known are Hadoop and Spark. Hadoop was introduced in the previous article; in this issue, the editor introduces the rising star, Spark.
Spark is a lightning-fast Apache project; its developers describe it as "a fast general engine for large-scale data processing". [A1] Like Hadoop MapReduce, Spark is a general-purpose parallel computing framework, open-sourced by UC Berkeley's AMPLab. It performs distributed computing based on the MapReduce model and retains the advantages of Hadoop MapReduce. [A2]
It provides a fast, general data processing platform that can run programs up to 100 times faster than Hadoop MapReduce in memory, or up to 10 times faster on disk. In last year's Daytona GraySort competition, Spark finished more than three times faster than Hadoop while using only one tenth as many machines, making Spark the fastest open source tool for processing PB-level data. [A3]
The core concept of Spark is the Resilient Distributed Dataset (RDD). An RDD is an abstraction that lets you manipulate a distributed dataset the same way you would manipulate a local collection. It represents a set of data that is partitioned, immutable, and can be operated on in parallel. Different data set formats correspond to different RDD implementations. An RDD must be serializable and can be cached in memory: the result of each operation on an RDD can be kept in memory, and the next operation can read its input directly from memory, which avoids much of the disk I/O that MapReduce incurs. For iterative workloads such as common machine learning algorithms and interactive data mining, this greatly improves efficiency. [A4]
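The properties described above (partitioned, immutable, operated on like a local collection, cacheable in memory) can be illustrated with a small pure-Python sketch. This is a toy model only, not Spark's API: the class `ToyRDD`, its `computations` counter, and the chunking rule are all hypothetical names and choices made for illustration, and everything runs in a single process rather than on a cluster.

```python
from functools import reduce


class ToyRDD:
    """Toy stand-in for an RDD (NOT the Spark API): immutable, partitioned, lazy."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self._partitions = partitions  # source data; None for derived datasets
        self._parent = parent          # dataset this one was derived from
        self._fn = fn                  # per-partition function applied to the parent
        self._use_cache = False        # set by cache()
        self._cached = None            # materialized partitions, once cached
        self.computations = 0          # hypothetical counter: times this dataset was computed

    @classmethod
    def parallelize(cls, data, num_partitions):
        """Split a local collection into contiguous, immutable partitions."""
        data = list(data)
        size = max(1, -(-len(data) // num_partitions))  # ceiling division
        parts = tuple(tuple(data[i:i + size]) for i in range(0, len(data), size))
        return cls(partitions=parts)

    def map(self, f):
        # A transformation returns a NEW dataset; nothing is computed yet.
        return ToyRDD(parent=self, fn=lambda part: tuple(f(x) for x in part))

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda part: tuple(x for x in part if pred(x)))

    def cache(self):
        """Ask for the result to be kept in memory after the first computation."""
        self._use_cache = True
        return self

    def _compute(self):
        if self._cached is not None:
            return self._cached  # served from memory, no recomputation
        self.computations += 1
        if self._parent is None:
            result = self._partitions
        else:
            # Each partition is processed independently; a real engine would
            # do this in parallel on different machines.
            result = tuple(self._fn(part) for part in self._parent._compute())
        if self._use_cache:
            self._cached = result
        return result

    def collect(self):
        """Action: gather all partitions into one local list."""
        return [x for part in self._compute() for x in part]

    def reduce(self, f):
        """Action: reduce inside each partition, then combine the partial results."""
        partials = [reduce(f, part) for part in self._compute() if part]
        return reduce(f, partials)


numbers = ToyRDD.parallelize(range(1, 11), num_partitions=3)
squares = numbers.map(lambda x: x * x).cache()

total = squares.reduce(lambda a, b: a + b)               # 385
evens = squares.filter(lambda x: x % 2 == 0).collect()   # [4, 16, 36, 64, 100]
```

Both actions read `squares`, but because it was cached, the squares are computed only once; the second action takes them straight from memory. That is the mechanism the paragraph above credits for the speedup on iterative algorithms.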
[Figure: the architecture of Spark, with RDD at its core]
Spark is particularly strong in machine learning and is especially well suited to algorithms that require many iterations. It also has good fault tolerance and scheduling mechanisms that keep the system running stably [A5], and it is known for its ease of use. It comes with an easy-to-use API and supports Scala (its native language), Java, Python, and Spark SQL. Spark SQL is very similar to SQL-92, so it takes little learning to get started. [A6]
Spark simplifies large-scale data processing, seamlessly combines many complex capabilities (such as machine learning and graph algorithms), and is rapidly expanding its influence thanks to its fast computing speed. Given this performance, there is good reason to expect Spark to become even more prominent in the future.
[A1] Source: Stop comparing Hadoop and Spark, that's not the designer's intention.
[A2] Source: Popular science on Spark: what Spark is and how to use Spark; Baidu Encyclopedia
[A3] Source: Apache Spark introduction and case presentation
[A4] Source: Popular science on Spark: what the core of Spark is and how to use Spark (2) http://www.aboutyun.com/thread-6850-1-1.html
[A5] Source: Data Mining with me (22): introduction to Spark
[A6] Source: Stop comparing Hadoop and Spark, that's not the designer's intention.
Final source: Qichuang Ark WeChat official account