This article shares a sample analysis of the Hadoop execution path. The material is quite practical, so it is offered here as a reference; follow along below for a closer look.
Introduction to Hadoop
Hadoop is a distributed system infrastructure developed by the Apache Foundation. It lets users write distributed programs without knowing the underlying details of the distributed layer, harnessing the power of a cluster for high-speed computation and storage.
In a nutshell, Hadoop is a software platform that makes it easier to develop and run software that processes large-scale data.
Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data, which makes it suitable for applications with very large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
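Streaming access is easy to see from the client side. Below is a minimal sketch using the FileSystem API; the namenode address hdfs://namenode:9000 and the path /data/sample.txt are placeholder assumptions, not values from this article.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsStreamRead {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:9000"); // placeholder cluster address

        FileSystem fs = FileSystem.get(conf);

        // Open a file and consume it as a byte stream, the access
        // pattern HDFS is designed around.
        try (FSDataInputStream in = fs.open(new Path("/data/sample.txt"))) {
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) > 0) {
                System.out.write(buffer, 0, bytesRead);
            }
        }
        System.out.flush();
    }
}
```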
Hadoop execution path
Usually we start the actual execution of a task by calling the JobClient.runJob(job) method in the job code we write. Our walkthrough starts with this call (before invoking this API, we have already designed and specified our own mapper and reducer in the program).
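For context, a minimal driver in the classic org.apache.hadoop.mapred API (the API this walkthrough describes) might look like the sketch below. IdentityMapper and IdentityReducer are used only so the example is self-contained; in practice you would substitute the mapper and reducer you have written yourself.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class MyJobDriver {
    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(MyJobDriver.class);
        job.setJobName("my-job");

        job.setMapperClass(IdentityMapper.class);   // substitute your mapper
        job.setReducerClass(IdentityReducer.class); // substitute your reducer
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // The call that kicks off the execution path traced below.
        JobClient.runJob(job);
    }
}
```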
(1) The static method JobClient.runJob(job) instantiates a JobClient instance and then submits the job to the master with that instance's submitJob(job) method. submitJob returns a RunningJob object, which is used to track the status of the job. Once the job has been submitted, JobClient polls the job's progress.
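Roughly what runJob() is doing on your behalf can be sketched as follows: submit the job, keep the RunningJob handle, and poll it until completion. This is an illustrative sketch, not the actual runJob() source; it assumes `job` is a configured JobConf as in the driver above.

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class SubmitAndPoll {
    public static void track(JobConf job) throws Exception {
        JobClient client = new JobClient(job);
        RunningJob running = client.submitJob(job); // returns immediately

        // Poll the job's progress, roughly what JobClient.runJob()
        // does internally while it waits for completion.
        while (!running.isComplete()) {
            System.out.printf("map %.0f%%  reduce %.0f%%%n",
                    running.mapProgress() * 100, running.reduceProgress() * 100);
            Thread.sleep(1000);
        }
        System.out.println(running.isSuccessful() ? "Job succeeded" : "Job failed");
    }
}
```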
(2) The actual job submission is completed through JobSubmitter's submitJobInternal(job) method. submitJobInternal first uploads three files to the Hadoop file system: job.jar, job.split, and job.xml. Their location is determined by the mapred.system.dir property, which points at the MapReduce system directory. After writing these three files, the method calls the master node's JobTracker.submitJob(job) method over RPC.
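Where that staging directory lives depends entirely on your cluster configuration. A small sketch, assuming the classic mapred.system.dir key and a reachable file system, that reads the property and lists what was uploaded there:

```java
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class ShowSystemDir {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // Fallback shown here is an assumption; old releases defaulted to
        // a path under hadoop.tmp.dir.
        Path systemDir = new Path(conf.get("mapred.system.dir",
                "/tmp/hadoop/mapred/system"));

        FileSystem fs = systemDir.getFileSystem(conf);
        for (FileStatus status : fs.listStatus(systemDir)) {
            // Under a per-job subdirectory you would find the job.jar,
            // job.split and job.xml for each submitted job.
            System.out.println(status.getPath());
        }
    }
}
```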
(3) JobTracker receives the job submitted by JobClient. In the JobTracker.submitJob() method, it first creates a JobInProgress object, which represents one job; its role is to maintain all of the job's information, including the job profile (JobProfile) and status (JobStatus), and to register all of the job's tasks in the task list. JobTracker then adds the JobInProgress object to the job scheduling queue through the listener's jobAdded(job) method, and keeps a member named jobs that tracks all jobs.
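A toy model of that bookkeeping might look like the following. The classes here are simplified stand-ins, not Hadoop's own source; they only illustrate the wrap-register-notify pattern just described.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyJobTracker {
    // Stand-in for Hadoop's JobInProgress: one record per submitted job.
    static class JobInProgress {
        final String jobId;
        JobInProgress(String jobId) { this.jobId = jobId; }
    }

    // Stand-in for Hadoop's JobInProgressListener interface.
    interface JobInProgressListener {
        void jobAdded(JobInProgress job);
    }

    private final Map<String, JobInProgress> jobs = new HashMap<>(); // all known jobs
    private final List<JobInProgressListener> listeners = new ArrayList<>();

    void addListener(JobInProgressListener listener) {
        listeners.add(listener);
    }

    void submitJob(String jobId) {
        JobInProgress jip = new JobInProgress(jobId);
        jobs.put(jobId, jip);            // register in the jobs member
        for (JobInProgressListener listener : listeners) {
            listener.jobAdded(jip);      // hand the job to the scheduler's listener
        }
    }
}
```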
(4) The FIFO scheduler. Hadoop's default scheduler is the FIFO scheduler. It has two listener members, JobQueueJobInProgressListener and EagerTaskInitializationListener; the latter is responsible for task initialization. It works as follows: when the listener is initialized, it starts a JobInitThread. When a job joins the initialization queue jobInitQueue via jobAdded(job), the queue is sorted by job priority, and the thread then calls JobInProgress's initTasks() to initialize all of the job's tasks.
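The queue-plus-thread pattern can be modeled in miniature. The following is a toy model, not Hadoop's EagerTaskInitializationListener itself; the Job interface and its priority() method are stand-ins introduced for illustration.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class ToyInitListener {
    // Stand-in for JobInProgress: exposes a priority and an init hook.
    interface Job {
        int priority();      // higher value = more urgent (an assumption here)
        void initTasks();    // expensive initialization, done off the caller's thread
    }

    // Jobs wait here ordered by priority, mirroring jobInitQueue.
    private final PriorityBlockingQueue<Job> jobInitQueue =
            new PriorityBlockingQueue<>(16,
                    Comparator.comparingInt(Job::priority).reversed());

    public ToyInitListener() {
        // Mirrors JobInitThread: drain the queue, initializing each job.
        Thread initThread = new Thread(() -> {
            try {
                while (true) {
                    jobInitQueue.take().initTasks(); // blocks until a job arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "JobInitThread");
        initThread.setDaemon(true);
        initThread.start();
    }

    public void jobAdded(Job job) {
        jobInitQueue.offer(job); // the queue keeps jobs sorted by priority
    }
}
```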
(5) The initTasks() process is complex. In it, based on the decomposition of the input (the splits computed at submission time), a corresponding number of map-task management objects, TaskInProgress, are created.
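As a toy model of that idea (simplified stand-in classes, not Hadoop's own source), initTasks() would read the precomputed splits and create one map-task bookkeeping object per split:

```java
import java.util.ArrayList;
import java.util.List;

public class ToyInitTasks {
    // Stand-in for an input split produced at job submission time.
    static class InputSplit {
        final String location;
        InputSplit(String location) { this.location = location; }
    }

    // Stand-in for Hadoop's TaskInProgress: manages one map task.
    static class TaskInProgress {
        final InputSplit split;
        TaskInProgress(InputSplit split) { this.split = split; }
    }

    static List<TaskInProgress> initMapTasks(List<InputSplit> splits) {
        // One management object per split: numMapTasks == splits.size().
        List<TaskInProgress> maps = new ArrayList<>(splits.size());
        for (InputSplit split : splits) {
            maps.add(new TaskInProgress(split));
        }
        return maps;
    }
}
```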
Thank you for reading! This concludes this article on "A sample analysis of the Hadoop execution path". I hope the content above has been of some help and lets you learn something new. If you found the article useful, feel free to share it so more people can see it!