Hadoop Mapreduce architecture analysis

2025-04-02 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31

This article introduces the Hadoop MapReduce architecture, the classic design built around a JobTracker and TaskTrackers. It walks through the four main components in turn: Client, JobTracker, TaskTracker, and Task.

1. Client

A user-written MapReduce program is submitted to the JobTracker through the Client, and the user can check the running status of a job through interfaces the Client provides. Inside Hadoop, a MapReduce program is represented as a "job" (Job). One MapReduce program can correspond to several jobs, and each job is broken down into several Map/Reduce tasks (Task).
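The job-to-task breakdown can be pictured with a small model. This is only an illustrative sketch, not Hadoop's API: the `Job` class, its fields, and the task ID format are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a submitted MapReduce job."""
    job_id: str
    num_maps: int      # one Map Task per input split
    num_reduces: int   # chosen by the user

    def tasks(self):
        """Break the job down into its Map and Reduce task IDs."""
        maps = [f"{self.job_id}_m_{i:06d}" for i in range(self.num_maps)]
        reduces = [f"{self.job_id}_r_{i:06d}" for i in range(self.num_reduces)]
        return maps + reduces


job = Job("job_0001", num_maps=3, num_reduces=1)
task_ids = job.tasks()
```

Here one job yields three Map Tasks and one Reduce Task, which are the units the rest of the architecture schedules and runs.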

2. JobTracker

JobTracker is mainly responsible for job scheduling and resource monitoring.

Resource monitoring: JobTracker monitors the health of all TaskTrackers and jobs. Once it detects a failure, it moves the affected tasks to other nodes.

Job scheduling: JobTracker also tracks task progress, resource usage, and related information, and passes it to the task scheduler, which selects suitable tasks to run whenever resources become idle. In Hadoop, the task scheduler is a pluggable module, so users can design their own.
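To make the idea of a pluggable scheduler concrete, here is a minimal sketch. The `TaskScheduler` interface and `FifoScheduler` class are hypothetical and written for this example; they do not mirror Hadoop's actual scheduler classes. The point is only that the JobTracker can call one interface while the selection policy behind it is swapped out.

```python
from abc import ABC, abstractmethod
from collections import deque


class TaskScheduler(ABC):
    """Hypothetical plug-in interface: choose tasks for idle resources."""

    @abstractmethod
    def assign_tasks(self, free_slots, pending):
        """Return the tasks to start, given the number of free slots."""


class FifoScheduler(TaskScheduler):
    """One possible policy: hand out pending tasks in submission order."""

    def assign_tasks(self, free_slots, pending):
        assigned = []
        while free_slots > 0 and pending:
            assigned.append(pending.popleft())
            free_slots -= 1
        return assigned


pending_tasks = deque(["job1-map-0", "job1-map-1", "job2-map-0"])
scheduler = FifoScheduler()
first_batch = scheduler.assign_tasks(2, pending_tasks)
```

A different policy, for example one that favors small jobs, would only need another subclass of the same interface.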

3. TaskTracker

TaskTracker periodically reports its node's resource usage and task progress to JobTracker through a heartbeat. It also receives commands from JobTracker and carries them out (for example, starting new tasks or killing tasks).

TaskTracker divides the resources on its node into equal units called "slots". A slot represents computing resources. A Task can run only after it obtains a slot, and the job of the Hadoop scheduler is to allocate the free slots on each TaskTracker to Tasks. Slots come in two kinds, Map slots and Reduce slots, used by Map Tasks and Reduce Tasks respectively. The number of slots therefore caps the Task concurrency on a TaskTracker.
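The slot and heartbeat mechanics can be sketched as follows. This is a simplified model under stated assumptions: the `TaskTracker` class here, its `heartbeat` report fields, and the `launch` method are hypothetical names for illustration, not Hadoop's real classes or wire format.

```python
from dataclasses import dataclass, field


@dataclass
class TaskTracker:
    """Hypothetical model of a node with a fixed number of slots."""
    name: str
    map_slots: int
    reduce_slots: int
    running_maps: list = field(default_factory=list)
    running_reduces: list = field(default_factory=list)

    def heartbeat(self):
        """Report the free slots of each kind, as sent to the JobTracker."""
        return {
            "tracker": self.name,
            "free_map_slots": self.map_slots - len(self.running_maps),
            "free_reduce_slots": self.reduce_slots - len(self.running_reduces),
        }

    def launch(self, kind, task_id):
        """Start a task only if a slot of the matching kind is free."""
        if kind == "map":
            running, capacity = self.running_maps, self.map_slots
        else:
            running, capacity = self.running_reduces, self.reduce_slots
        if len(running) >= capacity:
            return False
        running.append(task_id)
        return True


tracker = TaskTracker("node1", map_slots=2, reduce_slots=1)
tracker.launch("map", "m0")
tracker.launch("map", "m1")
rejected = not tracker.launch("map", "m2")  # both Map slots are taken
report = tracker.heartbeat()
```

Because Map and Reduce slots are counted separately, the third Map Task is refused even though a Reduce slot is still free.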

4. Task

Tasks are divided into Map Tasks and Reduce Tasks, both of which are started by TaskTracker. HDFS stores data in fixed-size blocks, whereas MapReduce processes data in units of splits. The relationship between a split and a block is as follows.

A split is a logical concept. It contains only metadata, such as the starting position of the data, its length, and the nodes that hold it. How the input is divided into splits is entirely up to the user. The number of splits determines the number of Map Tasks: each split corresponds to exactly one Map Task.
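A minimal sketch of split computation, assuming the simplest possible policy of cutting a file into fixed-size ranges. The `compute_splits` function and the sizes used are assumptions for illustration; Hadoop's real input formats apply additional rules. Note that each split is only metadata (an offset and a length), not a copy of the data.

```python
def compute_splits(file_length, split_size):
    """Cut a file of file_length bytes into logical (start, length) ranges."""
    splits = []
    offset = 0
    while offset < file_length:
        length = min(split_size, file_length - offset)
        splits.append({"start": offset, "length": length})
        offset += length
    return splits


MB = 1024 * 1024
splits = compute_splits(300 * MB, 128 * MB)
num_map_tasks = len(splits)  # one Map Task per split
```

With these example sizes the file yields three splits, so the job would run three Map Tasks.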

The execution process of a Map Task is as follows:

The Map Task first iterates over its split and parses it into key/value pairs, calling the map function on each pair in turn. It stores the results on the local disk, where this intermediate data is divided into several partitions. Each partition will be processed by one Reduce Task.
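The Map Task flow can be sketched in a few lines. Everything here is a simplified assumption: `run_map_task`, `partition`, and `word_count_map` are hypothetical helpers, in-memory lists stand in for files on the local disk, and the byte-sum partitioner is just a deterministic stand-in for a hash partitioner.

```python
def word_count_map(offset, line):
    """Example map function: emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word, 1


def partition(key, num_reduces):
    """Pick the partition (and so the Reduce Task) for a key."""
    return sum(key.encode("utf-8")) % num_reduces


def run_map_task(records, map_fn, num_reduces):
    """Apply map_fn to each input pair and bucket the output by partition."""
    partitions = [[] for _ in range(num_reduces)]
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            index = partition(out_key, num_reduces)
            partitions[index].append((out_key, out_value))
    return partitions


# Input pairs parsed from one split: (byte offset, line of text).
records = [(0, "a b a"), (6, "c")]
map_output = run_map_task(records, word_count_map, num_reduces=2)
```

All pairs with the same key land in the same partition, which is what lets a single Reduce Task see every value for that key.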

The execution process of a Reduce Task is as follows:

The process has three stages: (1) the shuffle stage, which reads the intermediate results of Map Tasks from remote nodes; (2) the sort stage, which sorts the key/value pairs by key; and (3) the reduce stage, which reads the sorted pairs in order, calls the reduce function on them, and stores the results in HDFS.
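The three stages can be sketched as one function. This is an in-memory illustration under stated assumptions: `run_reduce_task` and `sum_reduce` are hypothetical names, nested lists stand in for the Map Task outputs fetched from remote nodes, and the returned list stands in for the output written to HDFS.

```python
from itertools import groupby
from operator import itemgetter


def sum_reduce(key, values):
    """Example reduce function: total the values seen for a key."""
    return key, sum(values)


def run_reduce_task(map_outputs, partition_id, reduce_fn):
    # 1. Shuffle: collect this task's partition from every Map Task output.
    fetched = []
    for partitions in map_outputs:
        fetched.extend(partitions[partition_id])
    # 2. Sort: order the pairs by key so equal keys become adjacent.
    fetched.sort(key=itemgetter(0))
    # 3. Reduce: call reduce_fn once per key with all of its values.
    results = []
    for key, group in groupby(fetched, key=itemgetter(0)):
        results.append(reduce_fn(key, [value for _, value in group]))
    return results


# Output of two Map Tasks, each split into two partitions.
map_outputs = [
    [[("b", 1)], [("a", 1), ("c", 1)]],
    [[("b", 1)], [("a", 1)]],
]
result = run_reduce_task(map_outputs, partition_id=1, reduce_fn=sum_reduce)
```

The sort step matters because the grouping only merges adjacent equal keys, so the reduce function would otherwise be called more than once for the same key.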

This concludes the analysis of the Hadoop MapReduce architecture. Combining theory with practice is the best way to learn, so try it out yourself.
