
How to use combiner, a component of MR program

2025-03-26 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article explains how to use the combiner, an optional component of a MapReduce (MR) program. Many people are unsure how the combiner works and when it is safe to use, so the notes below collect a simple, practical approach, followed by a complete WordCount example.

In one sentence, the combiner's role is to shrink the output of each map task before it is shuffled, which reduces the amount of data transferred to the reduce tasks and therefore the network load. It does not change the number of reduce tasks.

Working mechanism:

A map task may run a local aggregation step on its own output before that output is sent to the reduce tasks; this step is the combiner. A combiner behaves like a reducer: it receives a key together with that key's values and emits key/value pairs.
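The local aggregation described above can be sketched in plain Java, without any Hadoop classes. This is only an illustration of the idea, not how the framework implements it: the class name LocalCombineSketch and the methods mapTask and combine are hypothetical, and a single input line stands in for the whole input of one map task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of one map task; no Hadoop classes are involved.
// All names here are hypothetical and exist only for this illustration.
class LocalCombineSketch {

    // Map phase: emit one (word, 1) record per token, as a WordCount mapper does.
    static List<Map.Entry<String, Long>> mapTask(String line) {
        List<Map.Entry<String, Long>> records = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                records.add(Map.entry(word, 1L));
            }
        }
        return records;
    }

    // Combine phase: sum the counts per key on the map side,
    // so that at most one record per distinct word leaves this task.
    static Map<String, Long> combine(List<Map.Entry<String, Long>> records) {
        Map<String, Long> combined = new TreeMap<>();
        for (Map.Entry<String, Long> record : records) {
            combined.merge(record.getKey(), record.getValue(), Long::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> mapOutput = mapTask("to be or not to be");
        Map<String, Long> combined = combine(mapOutput);
        System.out.println("records without combiner: " + mapOutput.size()); // 6
        System.out.println("records with combiner:    " + combined.size());  // 4
        System.out.println(combined); // {be=2, not=1, or=1, to=2}
    }
}
```

In this toy input the saving is small, but for real text with many repeated words the number of records sent across the network can drop substantially.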

Notes:

1. The output of the combiner is the input of reduce.

2. The combiner is pluggable: the framework may run it zero, one, or several times on a map task's output, so adding or removing it must not change the final result.

3. The combiner is an optimization, but it cannot be used everywhere. It is suitable only when its input and output key/value types are identical to the map output types (so a reducer can double as the combiner only if its own input and output types match) and when applying it does not affect the final result.
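The requirement that the combiner must not change the final result is easiest to see by comparing a sum with an average. The sketch below is plain Java with hypothetical names (CombinerSafetySketch and its methods); the two inner lists are assumed to stand for the values that two separate map tasks emit for the same key.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration; all names are hypothetical and no Hadoop classes are used.
class CombinerSafetySketch {

    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    static double average(List<Long> values) {
        return (double) sum(values) / values.size();
    }

    // All raw values in one list, as reduce would see them without a combiner.
    static List<Long> flatten(List<List<Long>> tasks) {
        List<Long> all = new ArrayList<>();
        for (List<Long> task : tasks) {
            all.addAll(task);
        }
        return all;
    }

    // Sum used as combiner: each task sends one partial sum, reduce adds them up.
    static long sumWithCombiner(List<List<Long>> tasks) {
        List<Long> partials = new ArrayList<>();
        for (List<Long> task : tasks) {
            partials.add(sum(task));
        }
        return sum(partials);
    }

    // Average used (incorrectly) as combiner: reduce averages the per-task averages.
    static double averageWithCombiner(List<List<Long>> tasks) {
        double total = 0;
        for (List<Long> task : tasks) {
            total += average(task);
        }
        return total / tasks.size();
    }

    public static void main(String[] args) {
        List<List<Long>> tasks = List.of(List.of(1L, 2L, 3L), List.of(10L));
        List<Long> all = flatten(tasks);

        System.out.println("sum, no combiner:       " + sum(all));                    // 16
        System.out.println("sum, with combiner:     " + sumWithCombiner(tasks));      // 16
        System.out.println("average, no combiner:   " + average(all));                // 4.0
        System.out.println("average, with combiner: " + averageWithCombiner(tasks));  // 6.0
    }
}
```

The sum survives the extra aggregation step because addition is associative and commutative, while the average of averages gives a different answer. A common way around this is to have the combiner pass along a (sum, count) pair and compute the average only in reduce.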

Example: in the WordCount program, which counts how many times each word appears, each map task can first aggregate its own output locally (the combiner) and then pass the partial counts to reduce. Reduce then sums the partial counts that share the same key across all map tasks to produce the final totals.

[Figure: WordCount data flow with a combiner]

Combiner code:

Combiner class: a combiner is written as a subclass of Reducer (Job.setCombinerClass accepts a Reducer subclass), so we extend Reducer directly and then register our class as the combiner when submitting the job.

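package com.itheima.hadoop.mapreduce.combiner;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Input and output types are both (Text, LongWritable), matching the map output,
// so this class can serve as the combiner and as the reducer.
public class WordCountCombiner extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum all counts received for this word
        long count = 0;
        for (LongWritable value : values) {
            count += value.get();
        }
        context.write(key, new LongWritable(count));
    }
}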

Mapper class:

package com.itheima.hadoop.mapreduce.mapper;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input: (byte offset, line of text); output: (word, 1)
public class WordCountCombinerMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws java.io.IOException, InterruptedException {
        String line = value.toString();      // Get a line of data
        String[] words = line.split(" ");    // Split it into individual words
        for (String word : words) {
            // Emit each word with a count of 1
            context.write(new Text(word), new LongWritable(1));
        }
    }
}

Driver class:

package com.itheima.hadoop.drivers;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;

import com.itheima.hadoop.mapreduce.combiner.WordCountCombiner;
import com.itheima.hadoop.mapreduce.mapper.WordCountCombinerMapper;

public class WordCountCombinerDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        /*
         * Five steps to submit a job:
         * 1. Create the job
         * 2. Specify the map/reduce classes
         * 3. Specify the MapReduce output data types
         * 4. Specify the input and output paths
         * 5. Submit the job
         */
        // Use the configuration injected by ToolRunner so generic options are honored
        Configuration conf = getConf();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCountCombinerDriver.class);
        job.setMapperClass(WordCountCombinerMapper.class);

        // Plug in the combiner component here
        job.setCombinerClass(WordCountCombiner.class);

        // The reduce logic is identical to the combiner logic, and
        // WordCountCombiner is a Reducer subclass, so it is reused as the reducer
        job.setReducerClass(WordCountCombiner.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }
}

Main class:

package com.itheima.hadoop.runner;

import org.apache.hadoop.util.ToolRunner;

import com.itheima.hadoop.drivers.WordCountCombinerDriver;

public class WordCountCombinerRunner {

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic Hadoop options and then calls the driver's run()
        int res = ToolRunner.run(new WordCountCombinerDriver(), args);
        System.exit(res);
    }
}

Run Results:

That concludes this walkthrough of how to use the combiner component in an MR program. Theory is easier to absorb when paired with practice, so go and try it yourself.
