2025-01-16 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 05/31 Report --
This article explains how to compile and run wordcount on hadoop 2.2.0. It is quite detailed and has real reference value; interested friends should read on!
1. First, a word about Hadoop versions. The current version landscape is confusing and leaves many users at a loss, but in fact there are only two major lines: Hadoop 1.0 and Hadoop 2.0. Hadoop 1.0 consists of the distributed file system HDFS and the offline computing framework MapReduce, while Hadoop 2.0 includes an HDFS that supports NameNode scale-out, the resource management system YARN, and an offline MapReduce framework that runs on YARN. Compared with Hadoop 1.0, Hadoop 2.0 is more powerful, scales better, performs better, and supports multiple computing frameworks. Because Hadoop 2.0 does not use the Hadoop 1.0 API, upgrading from Hadoop 1.0 to 2.0 requires rewriting your MapReduce programs.
Reference on upgrading from Hadoop 1.0 to 2.0: http://dongxicheng.org/mapreduce-nextgen/hadoop-upgrade-to-version-2/
Introduction to the new features in hadoop 2.2.0: http://docs.aws.amazon.com/zh_cn/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-2.2.0-features.html
2. Next, prepare the program WordCount.java under /root/test/:
package com.du.simple; // needed so the class lands under com/du/simple, as the later jar and run commands expect

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // value is already a single line of the input
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
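Before touching a cluster, the map/reduce logic above can be sanity-checked in plain Java. The sketch below is hypothetical and has no Hadoop dependencies; it mirrors what TokenizerMapper and IntSumReducer compute together: tokenize each line on whitespace, then sum the 1s emitted for each word.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Local stand-in (no Hadoop) for TokenizerMapper + IntSumReducer:
// the "map" step tokenizes each line on whitespace and emits (word, 1);
// the "reduce" step sums the 1s per word. Map.merge does both at once here.
public class LocalWordCount {
    public static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> sums = new LinkedHashMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line); // same tokenizer the mapper uses
            while (itr.hasMoreTokens()) {
                sums.merge(itr.nextToken(), 1, Integer::sum); // reducer: sum per key
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        // prints {hello=2, world=1, hadoop=1}
        System.out.println(count(new String[] {"hello world", "hello hadoop"}));
    }
}
```

On the cluster, the combiner performs the same per-key summing on each mapper's local output before the shuffle, which is why IntSumReducer can double as the combiner.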
3. Create a new bin folder under /root/test/, and compile WordCount into class files with the following command:
root@ubuntupc:/home/ubuntu/software/cdh5-hadoop/share/hadoop# javac -classpath common/hadoop-common-2.2.0-cdh5.0.0-beta-2.jar:common/lib/commons-cli-1.2.jar:common/lib/hadoop-annotations-2.2.0-cdh5.0.0-beta-2.jar:mapreduce/hadoop-mapreduce-client-core-2.2.0-cdh5.0.0-beta-2.jar -d /root/test/bin/ /root/test/WordCount.java
4. Package the class files into a jar with the following command:
root@ubuntupc:~/test/bin# jar -cvf WordCount.jar com/du/simple/*.class
5. Run the jar file
root@ubuntupc:~/test# hadoop jar WordCount.jar com.du.simple.WordCount /user/root/input /user/root/output
6. View the running results
root@ubuntupc:~/hadoop/WordCount# hadoop fs -cat output/part-r-00000
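Each line of part-r-00000 is one "word<TAB>count" pair, sorted by key, because MapReduce sorts the keys handed to the reducer. A small local sketch of that format, assuming a hypothetical two-line input of "hello world" and "hello hadoop":

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local sketch (no Hadoop) of the part-r-00000 format:
// one "word<TAB>count" line per key, in sorted key order.
public class OutputFormatSketch {
    public static String formatted(String[] lines) {
        Map<String, Integer> counts = new TreeMap<>(); // TreeMap = sorted keys
        for (String line : lines) {
            for (String w : line.split("\\s+")) {
                counts.merge(w, 1, Integer::sum);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sb.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "hadoop", "hello", "world" with counts 1, 2, 1, one per line
        System.out.print(formatted(new String[] {"hello world", "hello hadoop"}));
    }
}
```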
All right, that is the end of the walkthrough!
That is the full content of "how to compile and run wordcount in hadoop 2.2.0". Thank you for reading! I hope it has been helpful; for more related knowledge, please follow the industry information channel!