2025-04-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Spark is an open-source cluster computing framework similar to Hadoop: a fast, general-purpose engine designed for large-scale data processing. A rapidly growing and widely used ecosystem has formed around it. Its main application scenarios are as follows:
1. Spark is a memory-based iterative computing framework, so it suits applications that operate on the same data set many times. The more often the data is reused and the more data each pass has to read, the greater the benefit. When the data set is small but the computation is intensive, the benefit is relatively small.
2. Because of the nature of RDDs, which are read-only data sets transformed as a whole, Spark is not suitable for applications that make asynchronous, fine-grained state updates, such as the storage layer of a web service or an incremental web crawler and indexer. In other words, it is a poor fit for application models built on incremental modification.
3. Spark also suits workloads where the amount of data is not very large but real-time statistical analysis is required.
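The benefit described in point 1 can be illustrated with a toy model in plain Python. This is only a sketch and does not use Spark's API: CountingSource, iterate_mean and cached are hypothetical names invented for the illustration, and the "slow storage" is simulated by a counter. The assumption is that each pass of an iterative algorithm scans the whole data set, so keeping the data in memory after the first read replaces every later storage read with a memory lookup.

```python
from typing import Callable, List


class CountingSource:
    """Hypothetical stand-in for slow storage; counts full reads of the data."""

    def __init__(self, data: List[float]) -> None:
        self._data = list(data)
        self.reads = 0

    def load(self) -> List[float]:
        self.reads += 1
        return list(self._data)


def cached(load: Callable[[], List[float]]) -> Callable[[], List[float]]:
    """Wrap a loader so the data is read once and then served from memory."""
    memo: List[List[float]] = []

    def wrapper() -> List[float]:
        if not memo:
            memo.append(load())
        return memo[0]

    return wrapper


def iterate_mean(load: Callable[[], List[float]], passes: int) -> float:
    """Iteratively estimate the mean; every pass scans the whole data set."""
    estimate = 0.0
    for _ in range(passes):
        data = load()
        gradient = sum(estimate - x for x in data) / len(data)
        estimate -= 0.5 * gradient
    return estimate


values = [1.0, 2.0, 3.0, 4.0]

# Without caching: one storage read per pass.
uncached_source = CountingSource(values)
uncached_result = iterate_mean(uncached_source.load, passes=10)

# With caching: one storage read in total, however many passes run.
cached_source = CountingSource(values)
cached_result = iterate_mean(cached(cached_source.load), passes=10)

print(uncached_source.reads, cached_source.reads)
```

Both runs compute the same estimate; only the number of reads from the simulated storage differs, and the gap widens as the number of passes grows. In Spark itself the analogous step is persisting an RDD in memory (for example with its cache() method) before iterating over it.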
Workloads that fit the scenarios above can be handled with Spark. In practice, Internet companies mainly use big data in advertising, reporting, and recommendation systems. In advertising, it supports application analysis, effect analysis, and targeting optimization; in recommendation systems, it supports relevance ranking optimization, personalized recommendation, and hot-click analysis.
What these application scenarios have in common is a large amount of computation and high efficiency requirements, which Spark is well placed to meet. Since its launch, the project has attracted wide attention and praise from the open-source community, and over the past two years it has become one of the most popular open-source projects in big data processing.
Spark is implemented in Scala, an object-oriented, functional programming language, and it lets you manipulate distributed data sets as easily as local collection objects. It is fast, easy to use, general-purpose, and able to run in many environments. It suits most batch processing work and has become a preferred big data processing technology for enterprises. Representative users include Tencent, Yahoo, Taobao, and Youku Tudou.
© 2024 shulou.com SLNews company. All rights reserved.