What are the characteristics of Hadoop, Spark, Storm, and Hive?

2025-01-17 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

This article explains the main characteristics of Hadoop, Spark, Storm, and Hive, together with the related MapReduce and HBase components, and the scenarios each one suits.

Hadoop: a distributed system infrastructure. Application scenario: programs that process large amounts of data and need high reliability, high scalability, high efficiency, high fault tolerance, and low cost.

MapReduce: a programming model for parallel operations on large datasets (larger than 1 TB). Typical application scenarios today include log analysis and building search indexes; the machine learning library Mahout is another example. It can also be used for many other tasks, such as data mining and information extraction.
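The log-analysis scenario can be sketched in plain Python to show the shape of the model. This is a single-process illustration only, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the two-field log format are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    """Map: emit a (status_code, 1) pair for one log line."""
    parts = line.split()
    if len(parts) >= 2:
        yield parts[1], 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)

def run_job(lines):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical log lines in the form "<path> <status_code>".
LOG_LINES = [
    "/index.html 200",
    "/missing 404",
    "/index.html 200",
    "/login 500",
]

status_counts = run_job(LOG_LINES)
print(status_counts)
```

In a real cluster the map and reduce functions run in parallel on many nodes and the framework performs the shuffle; the sketch only shows how the work is divided between the two phases.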

Spark: it has the advantages of Hadoop MapReduce, but unlike MapReduce, the intermediate output of a job can be kept in memory, so it does not have to be written to and read back from HDFS between steps. Spark is therefore better suited to iterative algorithms such as those used in data mining and machine learning. Application scenario: complex computations that iterate over the same data many times, where keeping the data in memory greatly improves efficiency.
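The benefit of keeping data in memory across iterations can be illustrated with a small plain-Python simulation. It does not use Spark's API; `CountingSource` is a hypothetical stand-in for a dataset stored on HDFS, and the update rule is an arbitrary iterative computation chosen for the example.

```python
class CountingSource:
    """Hypothetical stand-in for a dataset on HDFS; counts full reads."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._values)

def step(data, estimate, rate=0.5):
    """One iteration: move the estimate toward the mean of the data."""
    gradient = sum(estimate - x for x in data) / len(data)
    return estimate - rate * gradient

def iterate_rereading(source, iterations):
    """MapReduce-style: every iteration reads the data from storage again."""
    estimate = 0.0
    for _ in range(iterations):
        estimate = step(source.load(), estimate)
    return estimate

def iterate_cached(source, iterations):
    """Spark-style: read once, keep the data in memory, iterate on it."""
    data = source.load()
    estimate = 0.0
    for _ in range(iterations):
        estimate = step(data, estimate)
    return estimate

disk_source = CountingSource([1.0, 2.0, 3.0, 4.0])
mem_source = CountingSource([1.0, 2.0, 3.0, 4.0])

rereading_result = iterate_rereading(disk_source, 10)
cached_result = iterate_cached(mem_source, 10)

print(disk_source.reads, mem_source.reads)
```

Both variants compute the same estimate, but the first touches storage once per iteration while the second touches it once in total, which is the difference the paragraph above describes.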

Storm: a distributed real-time computing system; a task-parallel continuous computation engine. Storm itself does not typically run on a Hadoop cluster. It uses Apache ZooKeeper and its own master and worker processes to coordinate topologies, host and worker state, and message-delivery semantics. Even so, Storm can still consume files from HDFS and write files to HDFS.
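Continuous computation means each record is processed as it arrives rather than in a batch after all data has been collected. A minimal sketch with Python generators is shown here; it borrows Storm's spout and bolt vocabulary but is not Storm's API, and the function names and sample sentences are assumptions.

```python
from collections import Counter

def sentence_spout(sentences):
    """Spout: the source of the stream, emitting one sentence at a time."""
    for sentence in sentences:
        yield sentence

def split_bolt(stream):
    """Bolt: split each incoming sentence into lowercase words."""
    for sentence in stream:
        for word in sentence.lower().split():
            yield word

def count_bolt(stream):
    """Bolt: keep running counts and emit an update for every word seen."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]

# Hypothetical input; a real spout would read from a queue or socket.
events = ["storm processes streams", "storm runs continuously"]

# Wire spout -> split bolt -> count bolt, like a small linear topology.
updates = list(count_bolt(split_bolt(sentence_spout(events))))
for word, running_total in updates:
    print(word, running_total)
```

Because the stages are generators, every word flows through the whole chain before the next sentence is read, which mirrors how a streaming topology produces results continuously instead of at the end of a job.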

Hive: a Hadoop-based data warehouse tool that maps structured data files to database tables and provides simple SQL queries, converting SQL statements into MapReduce tasks for execution. Application scenario: well suited to statistical analysis in a data warehouse.
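How a SQL aggregation corresponds to map and reduce steps can be sketched as follows. Python's built-in `sqlite3` module is used purely as a stand-in for running the SQL; it is not Hive, and the `employees` table, its rows, and the function names are assumptions for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical structured data: (name, dept) rows.
ROWS = [("alice", "eng"), ("bob", "eng"), ("carol", "ops")]

QUERY = "SELECT dept, COUNT(*) FROM employees GROUP BY dept ORDER BY dept"

def run_sql(rows):
    """Run the aggregation as SQL (sqlite3 stands in for a SQL engine)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
    conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

def run_as_mapreduce(rows):
    """The same aggregation written as explicit map and reduce steps."""
    groups = defaultdict(list)
    for _name, dept in rows:      # map: emit (dept, 1) for every row
        groups[dept].append(1)    # shuffle: group the emitted values by dept
    # reduce: sum the values collected for each dept
    return sorted((dept, sum(ones)) for dept, ones in groups.items())

sql_result = run_sql(ROWS)
mr_result = run_as_mapreduce(ROWS)
print(sql_result)
print(mr_result)
```

The GROUP BY column plays the role of the map output key and the aggregate function plays the role of the reducer, which is roughly the kind of translation the paragraph above attributes to Hive.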

HBase: application scenarios: data volumes too large for a traditional RDBMS to handle, online business feature development, and offline data analysis (data warehouse).

At this point, you should have a better understanding of the characteristics of Hadoop, Spark, Storm, and Hive. The best next step is to try them out in practice.
