

How to Use the WordCount MapReduce Example

2025-02-25 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article introduces how to use the WordCount MapReduce example. Many people run into problems with it in practice, so the editor will walk you through the complete listing, WordCount v2.0, written against the classic org.apache.hadoop.mapred API: it supports case-insensitive counting, skips user-supplied patterns distributed through the DistributedCache, and reports progress with counters and status messages. I hope you read it carefully and get something out of it!

package org.myorg;

import java.io.*;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount extends Configured implements Tool {

  public static class Map extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {

    static enum Counters { INPUT_WORDS }

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    private boolean caseSensitive = true;
    private Set<String> patternsToSkip = new HashSet<String>();

    private long numRecords = 0;
    private String inputFile;

    // Read job configuration: case sensitivity and the optional patterns files
    // that were distributed via the DistributedCache.
    public void configure(JobConf job) {
      caseSensitive = job.getBoolean("wordcount.case.sensitive", true);
      inputFile = job.get("map.input.file");

      if (job.getBoolean("wordcount.skip.patterns", false)) {
        Path[] patternsFiles = new Path[0];
        try {
          patternsFiles = DistributedCache.getLocalCacheFiles(job);
        } catch (IOException ioe) {
          System.err.println("Caught exception while getting cached files: "
              + StringUtils.stringifyException(ioe));
        }
        for (Path patternsFile : patternsFiles) {
          parseSkipFile(patternsFile);
        }
      }
    }

    // Load one pattern per line from a cached patterns file.
    private void parseSkipFile(Path patternsFile) {
      try {
        BufferedReader fis = new BufferedReader(new FileReader(patternsFile.toString()));
        String pattern = null;
        while ((pattern = fis.readLine()) != null) {
          patternsToSkip.add(pattern);
        }
        fis.close();
      } catch (IOException ioe) {
        System.err.println("Caught exception while parsing the cached file '"
            + patternsFile + "': " + StringUtils.stringifyException(ioe));
      }
    }

    public void map(LongWritable key, Text value,
        OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      String line = (caseSensitive) ? value.toString() : value.toString().toLowerCase();

      // Strip every configured pattern from the line before tokenizing.
      for (String pattern : patternsToSkip) {
        line = line.replaceAll(pattern, "");
      }

      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
        reporter.incrCounter(Counters.INPUT_WORDS, 1);
      }

      if ((++numRecords % 100) == 0) {
        reporter.setStatus("Finished processing " + numRecords + " records "
            + "from the input file: " + inputFile);
      }
    }
  }

  public static class Reduce extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  public int run(String[] args) throws Exception {
    JobConf conf = new JobConf(getConf(), WordCount.class);
    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    // Any argument following -skip is a patterns file to distribute via the cache;
    // everything else is treated as an input or output path.
    List<String> other_args = new ArrayList<String>();
    for (int i = 0; i < args.length; ++i) {
      if ("-skip".equals(args[i])) {
        DistributedCache.addCacheFile(new Path(args[++i]).toUri(), conf);
        conf.setBoolean("wordcount.skip.patterns", true);
      } else {
        other_args.add(args[i]);
      }
    }

    FileInputFormat.setInputPaths(conf, new Path(other_args.get(0)));
    FileOutputFormat.setOutputPath(conf, new Path(other_args.get(1)));

    JobClient.runJob(conf);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new WordCount(), args);
    System.exit(res);
  }
}
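To run the job, compile the class against the Hadoop libraries, package it into a jar, and submit it with the hadoop jar command. Because WordCount implements Tool and is launched through ToolRunner, generic options such as -D can be passed before the job's own arguments; the -skip flag and the input/output paths are handled by the run() method above. The following is a minimal sketch: the jar path, HDFS directories, and patterns file are placeholders chosen for illustration, not values required by the code.

  # patterns.txt - one regular expression per line; each match is removed from the input
  #   \.
  #   \,
  #   to

  $ bin/hadoop jar /path/to/wordcount.jar org.myorg.WordCount \
        -D wordcount.case.sensitive=false \
        /user/joe/wordcount/input /user/joe/wordcount/output \
        -skip /user/joe/wordcount/patterns.txt

Each line of the patterns file is applied with String.replaceAll, so it is treated as a regular expression and literal punctuation should be escaped as shown. Omitting -D wordcount.case.sensitive=false keeps the default case-sensitive behaviour, and omitting -skip disables pattern filtering entirely.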
This concludes "How to Use the WordCount MapReduce Example". Thank you for reading. If you want to learn more about the industry, keep following the website; the editor will keep publishing practical articles for you!



