What is the basic content of MapReduce?

2025-02-24 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/01 Report

This article explains the basics of MapReduce in detail through a WordCount example. It is shared as a practical reference; I hope you get something out of reading it.

1. WordCount program

1.1 WordCount source program

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
            System.err.println("Usage: wordcount <in> [<in>...] <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // All arguments except the last are input paths; the last is the output path.
        for (int i = 0; i < otherArgs.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Split each input line into whitespace-separated tokens and emit (token, 1).
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                this.word.set(itr.nextToken());
                context.write(this.word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        // Sum all counts for a given word and emit (word, total).
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            this.result.set(sum);
            context.write(key, this.result);
        }
    }
}

1.2 Run the program: Run As -> Java Application
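To see what the Mapper/Reducer pair above computes without a Hadoop cluster, the same logic can be sketched as a single-process Java program. This is only an illustration (the class name LocalWordCount and the count method are my own, not part of the article's code): it uses the same StringTokenizer splitting as TokenizerMapper and the same per-key summation as IntSumReducer, but a real MapReduce job shuffles the intermediate (word, 1) pairs across nodes between the two phases.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Local, single-process sketch of the map -> shuffle -> reduce pipeline.
public class LocalWordCount {

    // "Map" emits (token, 1); "reduce" sums the ones per key.
    // Here both collapse into a single in-memory merge per token.
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        StringTokenizer itr = new StringTokenizer(text); // same tokenizer as TokenizerMapper
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum); // same summation as IntSumReducer
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("Spark Hadoop Big Data Spark Hadoop Big Cloud"));
    }
}
```

The combiner configured in the job (job.setCombinerClass) runs this same summation early, on each mapper's local output, which shrinks the data shuffled to the reducers.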

1.3 Compile and package the program to generate the JAR file

2. Run the program

2.1 Create text files whose word frequencies will be counted

wordfile1.txt

Spark Hadoop

Big Data

wordfile2.txt

Spark Hadoop

Big Cloud
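Given these two files, the expected WordCount result can be predicted by tallying the whitespace-separated words across both. The following sketch does exactly that (the class name ExpectedOutput and the tally helper are my own, for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Predicts the WordCount result for the two sample files above.
public class ExpectedOutput {

    // Tally whitespace-separated words across all given lines.
    public static Map<String, Integer> tally(String... lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Contents of wordfile1.txt followed by wordfile2.txt.
        Map<String, Integer> counts =
                tally("Spark Hadoop", "Big Data", "Spark Hadoop", "Big Cloud");
        counts.forEach((w, c) -> System.out.println(w + "\t" + c));
    }
}
```

So "Spark", "Hadoop", and "Big" should each appear twice, while "Data" and "Cloud" appear once; the real job additionally sorts the output by key.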

2.2 Start HDFS, create an input directory, and upload the word files

cd /usr/local/hadoop/
./sbin/start-dfs.sh
./bin/hadoop fs -mkdir input
./bin/hadoop fs -put /home/hadoop/wordfile1.txt input
./bin/hadoop fs -put /home/hadoop/wordfile2.txt input

2.3 View the uploaded word files:

hadoop@dblab-VirtualBox:/usr/local/hadoop$ ./bin/hadoop fs -ls .
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2019-02-11 15:40 input
-rw-r--r--   1 hadoop supergroup          5 2019-02-10 20:22 test.txt
hadoop@dblab-VirtualBox:/usr/local/hadoop$ ./bin/hadoop fs -ls ./input
Found 2 items
-rw-r--r--   1 hadoop supergroup         27 2019-02-11 15:40 input/wordfile1.txt
-rw-r--r--   1 hadoop supergroup         29 2019-02-11 15:40 input/wordfile2.txt

2.4 Run WordCount

./bin/hadoop jar /home/hadoop/WordCount.jar input output

A large amount of log output will scroll by on the screen.

You can then view the results of the run:

hadoop@dblab-VirtualBox:/usr/local/hadoop$ ./bin/hadoop fs -cat output/*

Big 2
Cloud 1
Data 1
Hadoop 2
Spark 2

That concludes this walkthrough of the basics of MapReduce. I hope the content above is of some help to you and aids further learning. If you found the article useful, please share it so more people can see it.
