This article explains the common application scenarios of Hadoop. The content is simple and clear and should be easy to follow; read on to learn where Hadoop is typically used.
What is Hadoop?
As data volumes grow rapidly, the two most immediate problems are data storage and computation (analysis and utilization).
Hadoop is a distributed infrastructure framework implemented in Java and developed by the Apache Software Foundation. It can also be seen as a platform for developing and running distributed applications on large clusters built from commodity hardware. Its two most important components, HDFS and MapReduce, solve the distributed storage of massive data and the distributed computation over massive data, respectively. Users can develop distributed programs without knowing the underlying details of the distribution, and can make full use of the cluster's power for high-speed computation and storage.
Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is well suited to applications with very large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
HDFS has two kinds of nodes: NameNodes and DataNodes. DataNodes store the actual data blocks, while the NameNode manages the file system namespace and metadata and mediates client access. Compared with an ordinary file system, HDFS is characterized by distributed mass storage and a built-in replication (backup) mechanism.
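To make that division of labor concrete, here is a minimal sketch of writing and then reading a file through Hadoop's Java FileSystem API. The NameNode address hdfs://namenode:8020 and the path /demo/hello.txt are placeholder assumptions; substitute the values for your own cluster.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; replace with your cluster's.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/demo/hello.txt"); // placeholder path

        // Write: the client asks the NameNode for metadata, then
        // streams the block data directly to DataNodes.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.writeUTF("hello hdfs");
        }

        // Read: block locations come from the NameNode; the bytes
        // are read from the DataNodes holding the replicas.
        try (FSDataInputStream in = fs.open(path)) {
            System.out.println(in.readUTF());
        }

        fs.close();
    }
}
```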
The core of the Hadoop framework is the combination of HDFS and MapReduce: HDFS provides storage for massive data, while MapReduce provides computation over it. MapReduce is a parallel computing framework, a distributed computing model in which many machines compute in parallel and jointly accomplish a single job.
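The canonical illustration of this model is WordCount. The sketch below uses the standard org.apache.hadoop.mapreduce API: the map phase emits a (word, 1) pair for every word, and the reduce phase sums the counts per word. Input and output paths come from the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map: split each input line into words and emit (word, 1).
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce: sum the counts for each word across all mappers.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

You would submit such a job with, for example, hadoop jar wordcount.jar WordCount /input /output (jar name and paths are hypothetical).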
Application scenarios of Hadoop:
Now that we have a basic idea of what Hadoop is, let's look at the scenarios to which it is generally suited.
Hadoop is mainly used in offline scenarios involving large volumes of data; its two defining characteristics are large data volume and offline (batch) processing.
Large data volume: in real production deployments, Hadoop clusters typically range from hundreds to thousands of machines. At that scale, even terabyte-level data sets are considered small.
Offline: the MapReduce framework is ill suited to real-time computation, so jobs are mainly offline batch jobs such as log analysis. In addition, a cluster usually has a large backlog of jobs waiting to be scheduled, which keeps resources fully utilized.
In addition, because of how HDFS is designed, Hadoop is well suited to processing large files stored as large blocks; handling large numbers of small files with Hadoop is inefficient.
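A common workaround for the small-files problem is to pack many small files into a single container file, for example a SequenceFile keyed by filename. The following sketch assumes a hypothetical local directory small-files and a hypothetical target path /demo/packed.seq.

```java
import java.io.File;
import java.nio.file.Files;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path out = new Path("/demo/packed.seq"); // hypothetical target path

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(out),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            // Hypothetical local directory of small input files.
            for (File f : new File("small-files").listFiles()) {
                byte[] data = Files.readAllBytes(f.toPath());
                // One record per small file: key = name, value = raw bytes.
                writer.append(new Text(f.getName()), new BytesWritable(data));
            }
        }
    }
}
```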
Common scenarios for Hadoop include:
- Massive data storage: distributed storage (cloud disk services and cloud platforms, e.g. at Baidu and 360, all run Hadoop)
- Log processing
- Massive computation, parallel computing
- Data mining (e.g. advertising recommendation)
- Behavior analysis, user modeling, etc.
Thank you for reading. That covers the common application scenarios of Hadoop. After studying this article, you should have a deeper understanding of where Hadoop is commonly used; the specifics still need to be verified in practice.