This article walks through how the Hadoop MapReduce execution process works, from input splitting through the Shuffle phase to the final reduce output. I hope it serves as a practical reference.
The general flow of MapReduce is shown in the figure; its execution mainly consists of the following steps:
1. The input data source is first divided into splits.
2. The master schedules workers to execute map tasks.
3. Each worker reads its assigned input split.
4. Each worker executes its map task and saves the task output locally.
5. The master schedules workers to execute reduce tasks, and each reduce worker reads the output files of the map tasks.
6. The reduce tasks run and their output is saved to HDFS.
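To make these steps concrete, here is a minimal, hedged sketch of a WordCount job driver using the org.apache.hadoop.mapreduce API (exact API details vary slightly between Hadoop 1.x and 2.x; the class names WordCountMapper and WordCountReducer and the command-line paths are illustrative assumptions, not from the original article):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(WordCountMapper.class);      // steps 2-4: map tasks run over the input splits
            job.setReducerClass(WordCountReducer.class);    // steps 5-6: reduce tasks pull and merge map output
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // step 1: the framework splits this input
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // step 6: final output lands on HDFS
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }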
If you delve into the details of the process, you get a flow chart like the following.
Role descriptions:
JobClient: the client that submits the job
JobTracker: the job scheduler
TaskTracker: the tracker that runs tasks on a worker node
Task: a concrete task (Map or Reduce)
Map-shuffle-reduce process
As you can see from the figure above, the Shuffle process spans both the map and reduce sides, so I describe it below in two parts.
First, take a look at the situation on the map side, as shown below:
The figure above shows what might happen inside a single map task. Comparing it with the left half of the official diagram, you will notice many inconsistencies: the official diagram does not clearly indicate at which stage partition, sort and combiner take effect. I drew this figure in the hope of giving you a clear picture of the whole process, from map data input to the map side having its data ready.
I divided the whole process into four steps. Simply put, each map task has a memory buffer that stores the map output. When the buffer is nearly full, its contents need to be written to disk as a temporary file. When the entire map task ends, all the temporary files the map task produced on disk are merged into the final output file, which then waits for the reduce tasks to pull the data.
Of course, each step here may contain several sub-steps and details. Let me explain them one by one:
1. When a map task runs, its input data comes from HDFS blocks; in MapReduce terms, however, a map task reads splits. The correspondence between split and block may be many-to-one, and the default is one-to-one. In the WordCount example, assume the map input contains a string such as "aaa".
2. After the mapper runs, we know its output is a key/value pair like this: the key is "aaa" and the value is the number 1. The map side only ever adds 1; the counts are merged in the reduce tasks. Given that this job has three reduce tasks, it must now be decided which reduce task should handle the current "aaa".
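As a hedged sketch, a WordCount mapper that emits such ("aaa", 1) pairs could look like the following (assuming the default TextInputFormat, so the input key is a byte offset and the value a line of text; the class name is an illustrative assumption):

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);   // emits e.g. ("aaa", 1); partition, sort and spill happen after this call
            }
        }
    }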
MapReduce provides the Partitioner interface. Its job is to decide, based on the key (or value) and the number of reduce tasks, which reduce task should handle the current output pair. By default, the key is hashed and taken modulo the number of reduce tasks; this default merely spreads the load evenly across the reduce tasks. If users need different partitioning behaviour, they can write a custom Partitioner and set it on the job.
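As a hedged sketch, a custom Partitioner that reproduces the default hash-modulo behaviour might look like this (the class name is illustrative; Hadoop's built-in HashPartitioner already does exactly this, and a custom class would be registered with job.setPartitionerClass):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class WordPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numReduceTasks) {
            // hash the key, mask off the sign bit, then take it modulo the number of reduce tasks
            return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
        }
    }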
In our example, the Partitioner returns 0 for "aaa", meaning this pair should be handled by the first reducer. Next, the data needs to be written into the memory buffer, whose purpose is to collect map results in batches and reduce the impact of disk IO. Both the key/value pair and the partition result are written into the buffer; of course, the key and value are serialized into byte arrays before being written.
The whole memory buffer is a byte array; I have not studied its byte indexing and key/value storage structure in detail. If any reader has looked into it, please share a general description of it.
3. This memory buffer has a limited size, 100MB by default. When the map task produces a lot of output, the buffer may fill up, so under certain conditions the data in the buffer must be temporarily written to disk and the buffer reused. This process of writing data from memory to disk is called Spill, which can be translated as "overflow write"; the literal meaning is quite intuitive. The spill is performed by a separate thread and does not affect the thread that writes map results into the buffer. Since the map output should not be blocked when the spill thread starts, the buffer has a spill ratio, spill.percent, which defaults to 0.8. That is, when the data in the buffer reaches the threshold (buffer size * spill percent = 100MB * 0.8 = 80MB), the spill thread starts, locks that 80MB of memory, and performs the spill, while the map task output can still be written into the remaining 20MB; the two do not interfere with each other.
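Both the buffer size and the spill threshold are configurable. A hedged sketch using the classic Hadoop 1.x property names (newer releases rename them to mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent); the chosen values are only examples:

    // In the job driver's main(), before the Job object is created:
    Configuration conf = new Configuration();
    conf.setInt("io.sort.mb", 200);                  // in-memory sort buffer for map output, in MB (default 100)
    conf.setFloat("io.sort.spill.percent", 0.80f);   // buffer fill ratio at which the spill thread starts (default 0.80)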
When the spill thread starts, it needs to sort the keys in this 80MB space. Sorting is the default behaviour of the MapReduce framework, and the sort here operates on the serialized bytes.
It is worth pausing here: the output of the map task has to be sent to different reduce sides, and the memory buffer does not group the data destined for the same reduce side, so this grouping has to be reflected in the disk file. As the official diagram shows, the spill files written to disk group the values by their destination reduce side (partition). So an important detail of the spill process is that if many key/value pairs need to be sent to the same reduce side, these pairs are written out together, which reduces the number of partition-related index records.
While grouping the data for each reduce side, some of it may look like this: "aaa"/1, "aaa"/1. For WordCount we simply count the number of occurrences of each word, so if the same key, like "aaa", appears many times within the same map task, their values should be merged together. This process is called reduce, or combine. In MapReduce terminology, however, reduce refers only to the stage where the reduce side fetches data from multiple map tasks and computes over it; aside from that, merging data can only informally be called combine. In fact, MapReduce treats the Combiner as a Reducer: the combiner class set on a job is itself a Reducer implementation.
If the client has set a Combiner, now is the time to use it: add up the values of key/value pairs with the same key, reducing the amount of data spilled to disk. The Combiner optimizes the intermediate results of MapReduce, so it is applied several times within the model. In which scenarios can a Combiner be used? From the analysis above, the Combiner's output becomes the Reducer's input, and the Combiner must not change the final result. So in my view, a Combiner should only be used when the Reducer's input key/value types exactly match its output key/value types and the operation does not affect the final result, such as accumulation or taking a maximum. The Combiner must be used with care: used well, it helps job efficiency; used badly, it can corrupt the final reduce result.
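For WordCount these conditions are met by the sum reducer itself, so the same class can serve as both Reducer and Combiner. Below is a hedged sketch of that class (WordCountReducer is the illustrative name referenced in the driver above); it would be registered with job.setCombinerClass(WordCountReducer.class) in addition to job.setReducerClass:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Summing partial counts is associative, so running it as a combiner does not change the final result.
    public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable sum = new IntWritable();

        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable c : counts) {
                total += c.get();          // e.g. three ("aaa", 1) records collapse into ("aaa", 3)
            }
            sum.set(total);
            context.write(word, sum);
        }
    }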
4. Each spill produces a spill file on disk. If the map output is really large, there will be several such spills and therefore several spill files on disk. When the map task finishes, any data remaining in the memory buffer is also spilled to disk to form one more spill file, so in the end there is at least one spill file on disk (if the map output is small, only a single spill file exists by the time the map finishes). Because the final result must be a single file, these spill files need to be merged together, a process called Merge. What does Merge look like? Continuing the earlier example, suppose "aaa" has the value 5 when read from one spill file and 8 when read from another. Because they share the same key, they have to be merged into a group. What is a group? For "aaa" it looks like this: {"aaa", [5, 8, 2, ...]}, where the values in the array were read from different spill files. Note that because merge combines multiple spill files into one, the same key may appear several times; if the client has set a Combiner, it is applied during the merge to combine (for WordCount, add up) the values of identical keys.
At this point, all the work on the map side is done, and the resulting file sits in a local directory that the TaskTracker can reach. Each reduce task continuously asks the JobTracker via RPC whether the map tasks have completed; once a reduce task is notified that the map task on some TaskTracker has finished, the second half of the Shuffle process starts.
Simply put, the work of a reduce task before it runs is to keep pulling the final output of every map task in the current job, continually merging the data pulled from different places, and finally forming one file that serves as the reduce task's input. See the figure below:
Like the map-side detail diagram, the reduce-side Shuffle process can be summarized by the three points marked on the diagram. The precondition for a reduce task to start copying data is that it learns from the JobTracker which map tasks have finished; that step is not shown here, and interested readers can dig into it. Before the Reducer actually runs, all of its time is spent pulling data and merging, over and over again. As before, I describe the reduce-side Shuffle details in stages:
1. The Copy phase, which simply pulls data. The reduce process starts some data copy threads (Fetcher) and requests, via HTTP, the map task output files from the TaskTrackers on which the map tasks ran. Because the map tasks have already finished, these files are managed by the TaskTracker on its local disk.
2. The Merge phase. The merge here is like the merge action on the map side, except that the array now holds values copied from different map sides. The copied data is first put into a memory buffer, whose size is more flexible than on the map side: it is based on the JVM heap size, because the Reducer does not run during the Shuffle phase, so most of the memory can be given to Shuffle. It should be emphasized that merge comes in three forms: 1) memory to memory, 2) memory to disk, 3) disk to disk. The first form is not enabled by default, which is confusing, isn't it? When the amount of data in memory reaches a certain threshold, the memory-to-disk merge starts. Similar to the map side, this is also a spill process; if a Combiner has been set, it is applied here as well, and many spill files are generated on disk. This second merge mode keeps running until no more data arrives from the map side, and then the third mode, disk-to-disk merge, starts and generates the final file.
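Several of these reduce-side knobs are configurable. As a hedged sketch using the classic Hadoop 1.x property names (newer releases use the mapreduce.reduce.shuffle.* equivalents); the values shown are examples, not recommendations:

    // In the job driver's main(), on the same Configuration used to create the Job:
    conf.setInt("mapred.reduce.parallel.copies", 10);                 // number of copy (Fetcher) threads, default 5
    conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);  // share of the reduce JVM heap used as the copy buffer
    conf.setFloat("mapred.job.shuffle.merge.percent", 0.66f);         // buffer fill ratio that triggers the memory-to-disk merge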
3. The Reducer's input file. After constant merging, a "final file" is eventually generated. Why the quotation marks? Because this file may live on disk or in memory. Naturally we would like it to stay in memory as the Reducer's input, but by default it is stored on disk. How to make this file stay in memory is a matter of performance tuning (see the note below). Once the Reducer's input file has been decided, the whole Shuffle is finally finished. Then the Reducer runs and writes its results to HDFS.
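As a hedged performance-tuning note: in classic Hadoop the property mapred.job.reduce.input.buffer.percent (mapreduce.reduce.input.buffer.percent in newer releases) controls what fraction of the reduce JVM heap may hold map output in memory while the Reducer runs; it defaults to 0.0, which is why the "final file" normally ends up on disk. Raising it, as in the sketch below, keeps more of the merged input in memory (the value 0.6 is only an example):

    // Keep up to 60% of the reduce heap's worth of map output in memory during the reduce phase
    // instead of spilling everything to disk (default 0.0, i.e. all on disk).
    conf.setFloat("mapred.job.reduce.input.buffer.percent", 0.6f);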