In this article, the editor walks you through how to implement WordCount in MapReduce and how to optimize it. The content is practical and analyzed step by step; I hope you get something out of it after reading.
WordCount: word count, counting the number of times each word appears in a text file
Define the Mapper class, which extends org.apache.hadoop.mapreduce.Mapper and overrides the map() method:

// Imports needed by the whole WordCount class:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Static, immutable member variable: avoids creating a duplicate
    // object every time map() is called
    private final static IntWritable one = new IntWritable(1);

    // Mutable member variable: each call to map() only needs to call
    // Text.set() to assign the new value
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] words = value.toString().split(" ");
        for (String item : words) {
            word.set(item);
            context.write(word, one);
        }
    }
}
Define the Reducer class, which extends org.apache.hadoop.mapreduce.Reducer and overrides the reduce() method:

public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    // Mutable member variable: each call to reduce() only needs to call
    // IntWritable.set() to assign the new value
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
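A related detail worth knowing (not stated in the original): Hadoop reuses the same IntWritable instance across iterations of values, so summing val.get() as above is safe, but any code that buffers the val references must copy the values out first. A minimal sketch, assuming java.util.List and java.util.ArrayList are also imported:

// Pitfall: Hadoop hands back one reused IntWritable instance for every
// element of `values`, so buffer copies, never the references themselves.
List<IntWritable> buffered = new ArrayList<>();
for (IntWritable val : values) {
    buffered.add(new IntWritable(val.get())); // copy the value out
}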
Test WordCount
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf);
    job.setJarByClass(WordCount.class);            // set the main class of the job
    job.setMapperClass(TokenizerMapper.class);     // set the Mapper class
    // Use a combiner to reduce the amount of data transferred during shuffle
    job.setCombinerClass(IntSumReducer.class);     // set the Combiner class
    job.setReducerClass(IntSumReducer.class);      // set the Reducer class
    job.setMapOutputKeyClass(Text.class);          // key type of the map output
    job.setMapOutputValueClass(IntWritable.class); // value type of the map output
    job.setOutputKeyClass(Text.class);             // key type of the reduce output
    job.setOutputValueClass(IntWritable.class);    // value type of the reduce output
    // Input and output paths are taken from the main method arguments
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);                   // submit the job and wait
}
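With the driver in place, the job can be packaged and submitted in the usual way. The jar name and HDFS paths below are illustrative, not from the original article:

$ hadoop jar wordcount.jar WordCount /user/hadoop/input /user/hadoop/output
$ hdfs dfs -cat /user/hadoop/output/part-r-00000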
Input (file words):

hello tom
hello jerry
hello kitty
hello world
hello tom

Output:

hello 5
jerry 1
kitty 1
tom 2
world 1
Creating fewer objects means less garbage collection, which makes the job faster.
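For contrast, a minimal sketch (not from the original article) of the anti-pattern those member variables avoid, allocating fresh Writable objects on every map() call:

// Anti-pattern: allocates two new objects for every input word,
// creating garbage proportional to the input size.
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    for (String item : value.toString().split(" ")) {
        context.write(new Text(item), new IntWritable(1)); // new objects each time
    }
}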
Using a combiner to reduce the amount of data transferred during the shuffle is one of the key points of MapReduce job tuning.
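To make the effect concrete on the sample input above, assuming the whole file is processed by a single mapper: without the combiner, ten (word, 1) pairs cross the shuffle; with IntSumReducer running as the combiner, only five pre-summed pairs do:

(hello, 5)  (jerry, 1)  (kitty, 1)  (tom, 2)  (world, 1)

Reusing IntSumReducer as the combiner is safe here because integer addition is commutative and associative; a combiner without these properties could change the final result.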
The above is how to implement WordCount and its optimization in MapReduce. If you happen to have similar doubts, you can refer to the analysis above. If you want to know more, you are welcome to follow the industry information channel.