What does Hadoop mean?

2026-02-11 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)05/31 Report--

This article explains what Hadoop means. The content is clear and easy to follow, and should help resolve your questions about it.

Hadoop is a software platform for developing and running applications that process large-scale data. It is an open source software framework from Apache that implements distributed computing over massive data on a cluster composed of a large number of computers.

The core of the Hadoop framework is HDFS and MapReduce. HDFS provides storage for massive data, and MapReduce provides computation over that data.

Data processing in Hadoop can be understood simply as follows: data goes into the Hadoop cluster, the cluster processes it, and the result comes out.

HDFS: the Hadoop Distributed File System, Hadoop's distributed file system.

Large files are divided into blocks (64 MB by default in Hadoop 1.x; later versions default to 128 MB) that are distributed and stored across the machines in the cluster.

The file data1 in the following figure is divided into three blocks, which are stored on different machines as redundant replicas.
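The splitting and replication described above can be sketched in a few lines of Python. This is only an illustration, not HDFS code: the function names, the DataNode names, the 150 MB file size, and the replication factor of 3 are assumptions made for the example, and the round-robin placement is a stand-in for the rack-aware policy that real HDFS uses.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=64 * MB):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(num_blocks, datanodes, replication=3):
    """Assign each block to `replication` distinct DataNodes, round-robin.

    Round-robin is an assumption for illustration only.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    placement = {}
    for b in range(num_blocks):
        placement[b] = [
            datanodes[(b + r) % len(datanodes)] for r in range(replication)
        ]
    return placement


# A hypothetical 150 MB file on a hypothetical four-node cluster.
blocks = split_into_blocks(150 * MB)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])

for index, (offset, length) in enumerate(blocks):
    print(index, offset // MB, length // MB, placement[index])
```

With these inputs the file yields two full 64 MB blocks and one 22 MB block, and every block lives on three different machines, so losing a single machine does not lose any block.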

MapReduce: Hadoop creates one map task for each input split. The task processes the records in the split one by one, and the map function outputs its results as key-value pairs. Hadoop sorts the map output by key and passes it to Reduce as input. The output of the reduce tasks is the output of the whole job, and it is saved on HDFS.
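The map, sort, and reduce flow described above can be imitated in plain Python with a word count. This is a single-process sketch under stated assumptions, not the Hadoop API: `map_fn`, `reduce_fn`, `run_job`, and the sample records are all made up for the illustration, and the result is returned as a list instead of being written to HDFS.

```python
from itertools import groupby
from operator import itemgetter


def map_fn(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.split():
        yield (word.lower(), 1)


def reduce_fn(key, values):
    """Sum all counts that share the same key."""
    return (key, sum(values))


def run_job(splits):
    # Map phase: one "map task" per split, records processed in order.
    map_output = []
    for split in splits:
        for record in split:
            map_output.extend(map_fn(record))

    # Sort the map output by key so equal keys become adjacent.
    map_output.sort(key=itemgetter(0))

    # Reduce phase: call reduce_fn once per distinct key.
    return [
        reduce_fn(key, (value for _, value in group))
        for key, group in groupby(map_output, key=itemgetter(0))
    ]


# Two hypothetical input splits, each a list of text records.
splits = [
    ["hadoop stores data", "hadoop processes data"],
    ["data is big"],
]
result = run_job(splits)
print(result)
```

Sorting before reducing is what lets each reduce call see every value for its key in one pass, which is the role the sort step plays between Map and Reduce in the description above.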

A Hadoop cluster is mainly composed of NameNode, DataNode, Secondary NameNode, JobTracker, and TaskTracker.

As shown in the following figure:

NameNode records how each file is split into blocks and which DataNode nodes those blocks are stored on.

NameNode also saves the running status information of the file system.

DataNode stores the blocks that the files are split into.
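The division of labor between NameNode and DataNode can be pictured as two lookup tables held by the NameNode: one from file to blocks, one from block to DataNodes. The sketch below is a toy model of that idea, not the real NameNode. The class name, method names, block IDs, and node names are hypothetical, and it assumes every block except the last is exactly `block_size` bytes.

```python
MB = 1024 * 1024


class NameNodeMetadata:
    """Toy model of the metadata a NameNode keeps (no file contents)."""

    def __init__(self, block_size=64 * MB):
        self.block_size = block_size
        self.file_blocks = {}      # file path -> ordered list of block IDs
        self.block_locations = {}  # block ID -> list of DataNode names

    def add_file(self, path, blocks):
        """Register a file given (block_id, [datanodes]) pairs in file order."""
        self.file_blocks[path] = [block_id for block_id, _ in blocks]
        for block_id, nodes in blocks:
            self.block_locations[block_id] = list(nodes)

    def locate(self, path, offset):
        """Return the block ID and DataNodes that hold the given byte offset."""
        index = offset // self.block_size
        block_id = self.file_blocks[path][index]
        return block_id, self.block_locations[block_id]


meta = NameNodeMetadata()
meta.add_file(
    "/data1",
    [
        ("blk_1", ["dn1", "dn2"]),
        ("blk_2", ["dn2", "dn3"]),
        ("blk_3", ["dn3", "dn1"]),
    ],
)

# Which block holds byte 70 MB of /data1, and where is it stored?
block_id, nodes = meta.locate("/data1", 70 * MB)
print(block_id, nodes)
```

In this model a reader asks the metadata where a block lives and then fetches the bytes from one of the listed DataNodes; the metadata object itself never holds file data.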

Secondary NameNode helps NameNode collect status information about the running file system.

JobTracker is responsible for running a job when it is submitted to the Hadoop cluster, and for scheduling tasks across the TaskTrackers.

Each TaskTracker is responsible for running map or reduce tasks.
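As a rough illustration of the scheduling role described above, the sketch below hands one map task per split to a TaskTracker. It is a simplification under several assumptions that do not come from the article: the function name, the slot count, and the node names are invented, the preference for a tracker on a node that already holds the data is modeled in the simplest possible way, and the sketch raises an error when no slot is free, whereas a real scheduler would leave the task waiting.

```python
def assign_map_tasks(split_locations, trackers, slots_per_tracker=2):
    """Assign each split's map task to a TaskTracker.

    split_locations: dict of split ID -> list of nodes holding its block.
    trackers: list of node names that run a TaskTracker.
    Returns a dict of split ID -> chosen tracker.
    """
    free = {tracker: slots_per_tracker for tracker in trackers}
    assignment = {}
    for split_id, nodes in split_locations.items():
        # Prefer trackers that sit on a node holding the split's data.
        local = [node for node in nodes if free.get(node, 0) > 0]
        candidates = local or [t for t in trackers if free[t] > 0]
        if not candidates:
            raise RuntimeError("no free task slots")
        # Among the candidates, pick the one with the most free slots.
        chosen = max(candidates, key=lambda t: free[t])
        free[chosen] -= 1
        assignment[split_id] = chosen
    return assignment


# Hypothetical cluster: three nodes, one map slot each.
split_locations = {
    "split0": ["dn1", "dn2"],
    "split1": ["dn2", "dn3"],
    "split2": ["dn3", "dn1"],
}
assignment = assign_map_tasks(
    split_locations, ["dn1", "dn2", "dn3"], slots_per_tracker=1
)
print(assignment)
```

In this run every task lands on a node that already stores its block, so no block has to be copied over the network before the map task starts.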

That is all for this article, "What does Hadoop mean?" Thank you for reading! We hope it has given you a basic understanding of Hadoop and that the content is helpful.
