Example Analysis of hadoop-mapreduce

2025-04-24 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article gives a sample analysis of hadoop-mapreduce. It covers the parameters that configure a job and the methods that Job provides for monitoring and controlling it.

If you think of Hadoop as a whole as a container, then Mapper and Reducer are components in that container. The *Context classes store a component's configuration information and are also the mechanism through which the component communicates with the container.
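To make the container/component idea concrete, here is a minimal sketch in plain Java. It does not use Hadoop: MiniContext, its methods, and the property name sketch.ignore.case are all hypothetical stand-ins invented for illustration. The point is only that the component never touches the container directly. It reads its settings from the context and hands its results back through the context. In Hadoop's own API the corresponding calls on a Mapper's context are context.getConfiguration() and context.write(key, value).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ContextSketch {

    /** Hypothetical stand-in for a *Context object; this is not a Hadoop class. */
    static class MiniContext {
        private final Map<String, String> configuration = new HashMap<>();
        private final List<String> emitted = new ArrayList<>();

        /** The "container" side fills in configuration before the component runs. */
        MiniContext set(String name, String value) {
            configuration.put(name, value);
            return this;
        }

        /** The component reads its configuration through the context. */
        String get(String name, String defaultValue) {
            return configuration.getOrDefault(name, defaultValue);
        }

        /** The component hands a result back to the container through the context. */
        void write(String key, int value) {
            emitted.add(key + "\t" + value);
        }

        List<String> emitted() {
            return emitted;
        }
    }

    /** Hypothetical map-style component: emits (word, 1) for every word in the line. */
    static void map(String line, MiniContext context) {
        boolean ignoreCase = Boolean.parseBoolean(context.get("sketch.ignore.case", "false"));
        for (String word : line.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            context.write(ignoreCase ? word.toLowerCase() : word, 1);
        }
    }

    public static void main(String[] args) {
        MiniContext context = new MiniContext().set("sketch.ignore.case", "true");
        map("Hadoop hadoop MapReduce", context);
        context.emitted().forEach(System.out::println);
    }
}
```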

| Parameter | Action | Default value | Other implementations |
| --- | --- | --- | --- |
| InputFormat | Cuts the input data set into smaller data sets called InputSplits, each of which is processed by one Mapper. InputFormat also provides a RecordReader implementation that parses an InputSplit into key/value pairs and supplies them to the map function. | TextInputFormat (for text files: the file is cut into InputSplits, and LineRecordReader parses each InputSplit line by line into key/value pairs, where the key is the position of the line in the file and the value is the content of the line) | SequenceFileInputFormat |
| OutputFormat | Provides a RecordWriter implementation that is responsible for writing the final result. | TextOutputFormat (uses LineRecordWriter to write the final result as a plain text file, one line per key/value pair, with key and value separated by a tab) | SequenceFileOutputFormat |
| OutputKeyClass | Type of the key in the final output. | LongWritable | |
| OutputValueClass | Type of the value in the final output. | Text | |
| MapperClass | The Mapper class; implements the map function, which maps the input to intermediate results. | IdentityMapper (passes the input through unchanged as the intermediate result) | LogRegexMapper, InverseMapper |
| CombinerClass | Implements the combine function, which merges duplicate keys in the intermediate results. | None (duplicate keys in the intermediate results are not merged) | |
| ReducerClass | The Reducer class; implements the reduce function, which merges the intermediate results into the final result. | IdentityReducer (writes the intermediate results directly to the final result) | AccumulatingReducer, LongSumReducer |
| InputPath | Sets the input directory of the job; at run time the job processes all files under this directory. | None | |
| OutputPath | Sets the output directory of the job; the final result of the job is written to this directory. | None | |
| MapOutputKeyClass | Sets the type of the key in the intermediate results output by the map function. | OutputKeyClass, if not set by the user | |
| MapOutputValueClass | Sets the type of the value in the intermediate results output by the map function. | OutputValueClass, if not set by the user | |
| OutputKeyComparator | The comparator used when sorting the keys of the results. | WritableComparable | |
| PartitionerClass | Divides the intermediate results by key into R partitions with this partition function; each partition is processed by one Reducer. | HashPartitioner (partitions with a hash function) | KeyFieldBasedPartitioner, PipesPartitioner |
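The way the parameters listed above fit together can be sketched as a word count in plain Java. This is a single-process simulation, not Hadoop code: the class MiniMapReduce and its methods are hypothetical names chosen for illustration, and each input line is treated as the input of one map task so that the combiner has something to merge. The only part taken from Hadoop itself is the partitioning rule, which is the formula HashPartitioner uses. A TreeMap stands in for the key comparator, and the final step writes one key, a tab, and one value per line in the style of TextOutputFormat.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MiniMapReduce {

    // MapperClass role: one record (byte offset, line) becomes intermediate (word, 1) pairs.
    // The offset key is not needed for word count, so it is ignored here.
    static List<Map.Entry<String, Long>> map(long offset, String line) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<String, Long>(word, 1L));
            }
        }
        return pairs;
    }

    // CombinerClass and ReducerClass role: for word count both simply sum the values of a key.
    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // PartitionerClass role: the same rule as Hadoop's HashPartitioner.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups values by key in sorted key order (the TreeMap plays the comparator's role).
    static TreeMap<String, List<Long>> group(List<Map.Entry<String, Long>> pairs) {
        TreeMap<String, List<Long>> grouped = new TreeMap<>();
        for (Map.Entry<String, Long> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Runs the whole pipeline and returns one tab-separated output "file" per reducer.
    static List<String> run(String text, int numReduceTasks) {
        List<List<Map.Entry<String, Long>>> shuffled = new ArrayList<>();
        for (int r = 0; r < numReduceTasks; r++) {
            shuffled.add(new ArrayList<>());
        }

        long offset = 0;
        for (String line : text.split("\n")) {
            // InputFormat role: each line is a record keyed by its byte offset.
            List<Map.Entry<String, Long>> mapped = map(offset, line);
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;

            // Combiner: merge duplicate keys locally, then route each key to its partition.
            for (Map.Entry<String, List<Long>> e : group(mapped).entrySet()) {
                int target = partition(e.getKey(), numReduceTasks);
                shuffled.get(target).add(new SimpleEntry<String, Long>(e.getKey(), sum(e.getValue())));
            }
        }

        List<String> outputs = new ArrayList<>();
        for (List<Map.Entry<String, Long>> partitionPairs : shuffled) {
            StringBuilder out = new StringBuilder();
            // Reducer plus OutputFormat role: sum per key, write "key<TAB>value" lines.
            for (Map.Entry<String, List<Long>> e : group(partitionPairs).entrySet()) {
                out.append(e.getKey()).append('\t').append(sum(e.getValue())).append('\n');
            }
            outputs.add(out.toString());
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> parts = run("a b a\nb c\na", 2);
        for (int r = 0; r < parts.size(); r++) {
            System.out.print("partition " + r + ":\n" + parts.get(r));
        }
    }
}
```

With two reducers, the key "b" lands in partition 0 and the keys "a" and "c" land in partition 1, because their hash codes are 98, 97 and 99. With one reducer, all keys end up in a single sorted output.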

Job inherits from JobContext and adds a series of set methods for setting the properties of the job (Job updates properties, JobContext only reads them). Job also provides methods for monitoring and controlling the job, as follows:

- mapProgress: progress of the map phase (0 to 1.0)

- reduceProgress: progress of the reduce phase (0 to 1.0)

- isComplete: whether the job has finished

- isSuccessful: whether the job succeeded

- killJob: kills a running job

- getTaskCompletionEvents: gets the completion events of tasks (success / failure)

- killTask: kills a task
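A typical use of the methods listed above is a polling loop that reports progress until the job finishes and kills it if it takes too long. The sketch below shows that loop in plain Java. JobHandle is a hypothetical interface that only mirrors the method names from the list, and FakeJob is a made-up job used so the loop can run without a cluster. With Hadoop on the classpath the same calls would be made on a Job object directly, where they can throw IOException.

```java
import java.util.function.Consumer;

class JobMonitorSketch {

    /** Hypothetical view of the Job methods listed above; not a Hadoop type. */
    interface JobHandle {
        float mapProgress();
        float reduceProgress();
        boolean isComplete();
        boolean isSuccessful();
        void killJob();
    }

    /**
     * Polls the job until it completes or maxPolls is reached.
     * A job that does not finish in time is killed. Returns true only on success.
     */
    static boolean waitFor(JobHandle job, int maxPolls, long sleepMillis, Consumer<String> log) {
        for (int poll = 0; poll < maxPolls; poll++) {
            if (job.isComplete()) {
                return job.isSuccessful();
            }
            log.accept(String.format("map %.0f%% reduce %.0f%%",
                    job.mapProgress() * 100, job.reduceProgress() * 100));
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Stop monitoring but leave the job running; restore the interrupt flag.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        job.killJob();
        return false;
    }

    /** Made-up job for the demo: reports completion after a fixed number of isComplete() calls. */
    static class FakeJob implements JobHandle {
        private final int pollsUntilDone;
        private int polls = 0;
        boolean killed = false;

        FakeJob(int pollsUntilDone) {
            this.pollsUntilDone = pollsUntilDone;
        }

        public boolean isComplete() {
            return polls++ >= pollsUntilDone;
        }

        public boolean isSuccessful() {
            return !killed;
        }

        // Map runs during the first half of the fake job, reduce during the second half.
        public float mapProgress() {
            return Math.min(1f, 2f * polls / pollsUntilDone);
        }

        public float reduceProgress() {
            return Math.min(1f, Math.max(0f, 2f * polls / pollsUntilDone - 1f));
        }

        public void killJob() {
            killed = true;
        }
    }

    public static void main(String[] args) {
        boolean ok = waitFor(new FakeJob(3), 10, 0L, System.out::println);
        System.out.println("successful: " + ok);
    }
}
```

For the common case of simply blocking until the job ends, Hadoop's Job also offers waitForCompletion(boolean verbose), so a hand-written loop like this is mainly useful when custom reporting or a timeout is needed.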
