2025-04-01 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to implement WordCount on Hadoop. Many readers have questions about running WordCount on a Hadoop cluster, so this guide walks through a simple, reproducible procedure based on the official example. I hope it helps resolve your doubts; follow along and try it yourself.
Official example:
WordCount2.java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.StringUtils;

public class WordCount2 {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    static enum CountersEnum { INPUT_WORDS }

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    private boolean caseSensitive;
    private Set<String> patternsToSkip = new HashSet<String>();

    private Configuration conf;
    private BufferedReader fis;

    @Override
    public void setup(Context context) throws IOException,
        InterruptedException {
      conf = context.getConfiguration();
      caseSensitive = conf.getBoolean("wordcount.case.sensitive", true);
      // The official example defaults this to true; if no patterns file is
      // configured that causes an error, while false works normally.
      // See: https://issues.apache.org/jira/browse/MAPREDUCE-6038
      if (conf.getBoolean("wordcount.skip.patterns", false)) {
        URI[] patternsURIs = Job.getInstance(conf).getCacheFiles();
        for (URI patternsURI : patternsURIs) {
          Path patternsPath = new Path(patternsURI.getPath());
          String patternsFileName = patternsPath.getName().toString();
          parseSkipFile(patternsFileName);
        }
      }
    }

    private void parseSkipFile(String fileName) {
      try {
        fis = new BufferedReader(new FileReader(fileName));
        String pattern = null;
        while ((pattern = fis.readLine()) != null) {
          patternsToSkip.add(pattern);
        }
      } catch (IOException ioe) {
        System.err.println("Caught exception while parsing the cached file '"
            + StringUtils.stringifyException(ioe));
      }
    }

    @Override
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      String line = (caseSensitive) ?
          value.toString() : value.toString().toLowerCase();
      for (String pattern : patternsToSkip) {
        line = line.replaceAll(pattern, "");
      }
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
        Counter counter = context.getCounter(CountersEnum.class.getName(),
            CountersEnum.INPUT_WORDS.toString());
        counter.increment(1);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
    String[] remainingArgs = optionParser.getRemainingArgs();
    if ((remainingArgs.length != 2) && (remainingArgs.length != 4)) {
      System.err.println("Usage: wordcount <in> <out> [-skip skipPatternFile]");
      System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount2.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    List<String> otherArgs = new ArrayList<String>();
    for (int i = 0; i < remainingArgs.length; ++i) {
      if ("-skip".equals(remainingArgs[i])) {
        job.addCacheFile(new Path(remainingArgs[++i]).toUri());
        job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
      } else {
        otherArgs.add(remainingArgs[i]);
      }
    }
    FileInputFormat.addInputPath(job, new Path(otherArgs.get(0)));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs.get(1)));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Compile the class against the Hadoop jars, package it, and submit the job:

cd /data/program
javac -classpath /home/hadoop/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar:/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.1.jar:/home/hadoop/hadoop-2.7.1/share/hadoop/common/lib/commons-cli-1.2.jar WordCount2.java
jar cf wc.jar WordCount*.class
/home/hadoop/hadoop-2.7.1/bin/hadoop jar wc.jar WordCount2 /program/input /program/output

That concludes this walkthrough of WordCount on Hadoop. Pairing the theory with hands-on practice is the best way to learn, so give it a try! To keep learning related topics, please continue to follow this site for more practical articles.
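Before submitting to a cluster, the core tokenize-and-count logic can be sanity-checked locally. The sketch below mimics what TokenizerMapper and IntSumReducer do together (case-sensitivity switch, skip patterns, then summing counts) using a plain HashMap instead of the MapReduce runtime; the class and method names here are illustrative, not part of the Hadoop example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Local sketch of the WordCount logic, without the MapReduce runtime.
public class LocalWordCount {

    // Tokenize each line, apply the case switch and skip patterns,
    // then accumulate counts in a map (mapper + reducer in one pass).
    static Map<String, Integer> count(String[] lines, boolean caseSensitive,
                                      String[] patternsToSkip) {
        Map<String, Integer> counts = new HashMap<>();
        for (String raw : lines) {
            String line = caseSensitive ? raw : raw.toLowerCase();
            for (String pattern : patternsToSkip) {
                line = line.replaceAll(pattern, "");
            }
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = { "Hello World", "hello hadoop, World" };
        // Case-insensitive, stripping punctuation via a skip pattern.
        Map<String, Integer> counts =
            count(lines, false, new String[] { "[,.]" });
        System.out.println(counts); // hello=2, world=2, hadoop=1
    }
}
```

With `wordcount.case.sensitive` behaving the same way, the real job run on the same two lines would produce the same three counts.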
© 2024 shulou.com SLNews company. All rights reserved.