
How to Implement WordCount and Optimize It in MapReduce

2025-01-19 Update From: SLTechnology News&Howtos


Shulou (Shulou.com), 06/01

This article shows how to implement WordCount in MapReduce and how to optimize it. It walks through the Mapper, the Reducer, and the job driver, and then points out two tuning measures.

WordCount: a word count, that is, counting the number of times each word appears in a text file.
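As a point of reference before the MapReduce version, the same result can be computed on a single machine in plain Java. This is only a minimal sketch of the expected behavior, not part of the original job; the class name LocalWordCount and the method count are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-machine reference for what WordCount computes.
class LocalWordCount {
    // Counts how often each whitespace-separated word occurs in the text.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word
        for (String word : text.split("\\s+")) {
            if (!word.isEmpty()) { // skip the empty token produced by leading whitespace
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {hello=3, jerry=1, tom=2}
        System.out.println(count("hello tom hello jerry hello tom"));
    }
}
```

The MapReduce job described below distributes exactly this counting across map and reduce tasks.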

Define the Mapper class, which extends org.apache.hadoop.mapreduce.Mapper and overrides the map() method.

    // Imports used by this class (placed at the top of the source file)
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        // Static, immutable member: avoids creating a duplicate object
        // every time the map() method is called
        private final static IntWritable one = new IntWritable(1);

        // Mutable member: each call to map() only needs Text.set()
        // to assign the new value
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split the input line into words on single spaces
            String[] words = value.toString().split(" ");
            for (String item : words) {
                word.set(item);
                context.write(word, one); // emit (word, 1)
            }
        }
    }
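One detail of the mapper worth checking is how the line is split. Splitting on a single space yields an empty token whenever two spaces appear in a row, and that empty token would be counted as a word. The sketch below compares this with java.util.StringTokenizer, which treats any run of whitespace as one delimiter. It is an illustrative alternative, not the author's method, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical comparison of two ways to tokenize an input line.
class TokenizeSketch {
    // Splits on a single space, as the mapper does.
    static List<String> bySingleSpace(String line) {
        return Arrays.asList(line.split(" "));
    }

    // Splits on runs of whitespace (space, tab, newline, ...).
    static List<String> byWhitespace(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer it = new StringTokenizer(line);
        while (it.hasMoreTokens()) {
            tokens.add(it.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "hello  tom"; // note the two spaces
        System.out.println(bySingleSpace(line)); // prints [hello, , tom]
        System.out.println(byWhitespace(line));  // prints [hello, tom]
    }
}
```

For input that is known to use exactly one space between words, as in the test data of this article, both approaches give the same tokens.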

Define the Reducer class, which extends org.apache.hadoop.mapreduce.Reducer and overrides the reduce() method.

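    // Imports used by this class (placed at the top of the source file)
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        // Mutable member: each call to reduce() only needs IntWritable.set()
        // to assign the new value
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Add up all counts received for this word
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result); // emit (word, total)
        }
    }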
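Between the map and reduce phases, the framework groups all values that share a key, so that reduce() sees one word together with all of its counts. The following plain-Java sketch imitates that grouping and the summation without Hadoop. It is a simplified model under stated assumptions: the names ReduceSketch, groupByKey, and reduce are hypothetical, and real shuffle behavior (partitioning, sorting, spilling) is not represented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of "group by key, then reduce".
class ReduceSketch {
    // Stand-in for the shuffle: collects a 1 under its word for every emitted word.
    static Map<String, List<Integer>> groupByKey(List<String> emittedWords) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : emittedWords) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Mirrors the body of reduce(): sums all values that belong to one key.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped =
                groupByKey(Arrays.asList("hello", "tom", "hello", "jerry"));
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            System.out.println(entry.getKey() + " " + reduce(entry.getValue()));
        }
    }
}
```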

Test WordCount

    // Imports used by the driver (placed at the top of the source file)
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount.class);             // set the main class of the job
        job.setMapperClass(TokenizerMapper.class);      // set the Mapper class
        // Use a combiner to reduce the amount of data transferred during the shuffle
        job.setCombinerClass(IntSumReducer.class);      // set the Combiner class
        job.setReducerClass(IntSumReducer.class);       // set the Reducer class
        job.setMapOutputKeyClass(Text.class);           // key type of the map output
        job.setMapOutputValueClass(IntWritable.class);  // value type of the map output
        job.setOutputKeyClass(Text.class);              // key type of the reduce output
        job.setOutputValueClass(IntWritable.class);     // value type of the reduce output
        // Set the job input path (taken from the main method argument args[0])
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // Set the job output path (taken from the main method argument args[1])
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);                    // submit the job and wait for it to finish
    }

Input:

words:

    hello tom
    hello jerry
    hello kitty
    hello world
    hello tom

Output:

    hello 5
    jerry 1
    kitty 1
    tom 2
    world 1

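Creating fewer objects means less garbage collection (GC), which generally makes the job run faster. This is why the Mapper and Reducer keep their Text and IntWritable objects as members and reuse them instead of allocating new ones on every call.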
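Reusing a mutable object has one pitfall that is easy to overlook. It is generally safe in map() and reduce() because the framework serializes the key and value when context.write() is called, so later changes to the object do not affect what was already written. Keeping references to a reused object is a different matter: Hadoop also reuses the value object it hands to reduce() while iterating over the values, so storing those references in a collection leaves every element pointing at the last value. The sketch below reproduces the effect in plain Java. MutableInt is a hypothetical stand-in for IntWritable, and the other names are likewise invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demonstration of aliasing when a mutable holder is reused.
class ReuseSketch {
    // Stand-in for a mutable Writable such as IntWritable.
    static final class MutableInt {
        private int value;
        void set(int v) { value = v; }
        int get() { return value; }
    }

    // Pitfall: stores the same reused object several times,
    // so every list element ends up showing the last value.
    static List<MutableInt> keepReferences(int[] inputs) {
        MutableInt reused = new MutableInt();
        List<MutableInt> kept = new ArrayList<>();
        for (int v : inputs) {
            reused.set(v);
            kept.add(reused);
        }
        return kept;
    }

    // Safe: copies the primitive value out before the holder is overwritten.
    static List<Integer> copyValues(int[] inputs) {
        MutableInt reused = new MutableInt();
        List<Integer> copied = new ArrayList<>();
        for (int v : inputs) {
            reused.set(v);
            copied.add(reused.get());
        }
        return copied;
    }

    public static void main(String[] args) {
        System.out.println(keepReferences(new int[] {1, 2, 3}).get(0).get()); // prints 3
        System.out.println(copyValues(new int[] {1, 2, 3}));                  // prints [1, 2, 3]
    }
}
```

The WordCount code in this article only reads each value with get() and writes immediately, so it falls on the safe side.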

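Using a combiner to reduce the amount of data transferred during the shuffle is one of the key points of MapReduce job tuning. IntSumReducer can double as the combiner here because summation is associative and commutative, so pre-aggregating the counts on the map side does not change the final result.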
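The effect of the combiner can be estimated with a small plain-Java simulation on the test data from this article. The sketch assumes, purely for illustration, that the five input lines are processed by two map tasks (three lines and two lines); the real split depends on file and block sizes. The names CombinerSketch, mapSplit, sumByKey, and shuffleInput are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of shuffle volume with and without a combiner.
class CombinerSketch {
    // Map phase for one split: emits one (word, 1) record per word.
    static List<Map.Entry<String, Integer>> mapSplit(List<String> lines) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Sums values per key; serves as combiner (per split) and as reducer (globally).
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> record : records) {
            sums.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return sums;
    }

    // Returns the records that would cross the shuffle for the given splits.
    static List<Map.Entry<String, Integer>> shuffleInput(
            List<List<String>> splits, boolean useCombiner) {
        List<Map.Entry<String, Integer>> shuffled = new ArrayList<>();
        for (List<String> split : splits) {
            List<Map.Entry<String, Integer>> mapOutput = mapSplit(split);
            if (useCombiner) {
                shuffled.addAll(sumByKey(mapOutput).entrySet()); // pre-aggregate per split
            } else {
                shuffled.addAll(mapOutput);
            }
        }
        return shuffled;
    }

    public static void main(String[] args) {
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("hello tom", "hello jerry", "hello kitty"), // assumed split 1
                Arrays.asList("hello world", "hello tom"));               // assumed split 2
        // prints 10 records without combiner, 7 with combiner
        System.out.println(shuffleInput(splits, false).size() + " records without combiner, "
                + shuffleInput(splits, true).size() + " with combiner");
        // prints {hello=5, jerry=1, kitty=1, tom=2, world=1}
        System.out.println(sumByKey(shuffleInput(splits, true)));
    }
}
```

On this tiny input the saving is modest, but the reduction grows with the number of repeated words per split, while the final counts stay the same.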

That covers how to implement WordCount in MapReduce and how to optimize it. If you run into similar questions, the analysis above may serve as a reference.
