This article explains the Java MapReduce programming method. The material is simple and clear, and easy to learn and understand; follow along to study how MapReduce programs are written in Java.
Experimental topic:
MapReduce programming
Content of the experiment:
In this experiment, the Java API provided by Hadoop is used to write MapReduce programs.
Experimental objectives:
Master MapReduce programming.
Understand the principles of MapReduce.
[Experimental assignment] Simple traffic statistics
There is a log file like this:
13726230503 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
13726230513 00-FD-07-A4-72-B8:CMCC 120.196.40.8 i02.c.aliimg.com 2480 200
13826230523 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
13726230533 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
13726230543 00-FD-07-A4-72-B8:CMCC 120.196.100.82 Video website 1527 2106
13926230553 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
13826230563 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
13926230573 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
18912688533 00-FD-07-A4-72-B8:CMCC 220.196.100.82 Integrated portal 1938 2910
18912688533 00-FD-07-A4-72-B8:CMCC 220.196.100.82 i02.c.aliimg.com 3333 21321
13726230503 00-FD-07-A4-72-B8:CMCC 120.196.100.82 Search Engines 9531 9531
13826230523 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
13726230503 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681
The log file records the network traffic of each mobile phone user over a period of time. The fields are:
Mobile phone number | MAC address | IP address | Domain name | Uplink traffic (bytes) | Downlink traffic (bytes) | Package type
From the log above, calculate each mobile phone user's total traffic (uplink + downlink) over this period. The statistical results should have the following format:
Mobile phone number	Total number of bytes
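For example, the phone number 13726230503 appears three times in the sample log above. Summing its uplink and downlink traffic, (2481 + 24681) + (9531 + 9531) + (2481 + 24681) = 73386 bytes, so its line in the statistics would be:
13726230503	73386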
Experimental results:
Lab code:
WcMap.java
import java.io.IOException;
import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WcMap extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String str = value.toString();
        // Split the log line into at most 10 space-separated fields
        String[] words = StringUtils.split(str, " ", 10);
        int i = 0;
        for (String word : words) {
            // The uplink and downlink traffic are the 3rd- and 2nd-to-last fields
            if (i == words.length - 2 || i == words.length - 3) {
                // Emit (phone number, one traffic figure)
                context.write(new Text(words[0]), new LongWritable(Integer.parseInt(word)));
            }
            i++;
        }
    }
}
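To see what the mapper's split logic does to a single record, here is a minimal standalone sketch (a hypothetical demo class, not part of the lab code; it assumes the 7-field layout described above, where a package-type field follows the two traffic fields):

import org.apache.commons.lang.StringUtils;

public class SplitDemo {
    public static void main(String[] args) {
        // Hypothetical record using the 7-field layout (package type last)
        String line = "13726230503 00-FD-07-A4-72-B8:CMCC 120.196.100.82 i02.c.aliimg.com 2481 24681 200";
        String[] words = StringUtils.split(line, " ", 10);
        System.out.println(words[0]);                // phone number: 13726230503
        System.out.println(words[words.length - 3]); // uplink bytes: 2481
        System.out.println(words[words.length - 2]); // downlink bytes: 24681
    }
}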
WcReduce.java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WcReduce extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        long count = 0;
        // Sum every traffic figure emitted for this phone number
        for (LongWritable value : values) {
            count += value.get();
        }
        context.write(key, new LongWritable(count));
    }
}
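The shuffle phase groups all traffic values emitted for the same phone number before reduce() runs. Below is a hypothetical local simulation (not part of the lab code) of what the reducer computes for the key 13726230503 from the sample log:

import java.util.Arrays;
import java.util.List;

public class ReduceDemo {
    public static void main(String[] args) {
        // The six traffic values the mapper emits for 13726230503 in the sample log
        List<Long> values = Arrays.asList(2481L, 24681L, 9531L, 9531L, 2481L, 24681L);
        long count = 0;
        for (long v : values) {
            count += v; // same summation as WcReduce.reduce()
        }
        System.out.println("13726230503\t" + count); // prints 13726230503	73386
    }
}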
WcRunner.java
import java.io.IOException;
import java.net.URI;
import java.util.Scanner;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WcRunner {
    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WcRunner.class);
        job.setMapperClass(WcMap.class);
        job.setReducerClass(WcReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        Scanner sc = new Scanner(System.in);
        System.out.print("inputPath:");
        String inputPath = sc.next();
        System.out.print("outputPath:");
        String outputPath = sc.next();

        try {
            FileSystem fs0 = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
            Path hdfsPath = new Path(outputPath);
            // Upload the local log file to HDFS
            fs0.copyFromLocalFile(
                    new Path("/headless/Desktop/workspace/mapreduce/WordCount/data/1.txt"),
                    new Path("/mapreduce/WordCount/input/1.txt"));
            // Remove a stale output directory so the job can run again
            if (fs0.delete(hdfsPath, true)) {
                System.out.println("Directory " + outputPath + " has been deleted successfully!");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        FileInputFormat.setInputPaths(job, new Path("hdfs://master:9000" + inputPath));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000" + outputPath));
        job.waitForCompletion(true);

        try {
            // Read the job output back from HDFS and print it
            FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
            Path srcPath = new Path(outputPath + "/part-r-00000");
            FSDataInputStream is = fs.open(srcPath);
            System.out.println("Results:");
            while (true) {
                String line = is.readLine();
                if (line == null) {
                    break;
                }
                System.out.println(line);
            }
            is.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

[Experimental assignment] Inverted index with line number output
The basic inverted index experiment reports, for each word, which files it is distributed across and how many times it appears in each file. Modify that implementation so that the output inverted index also records the specific line numbers on which each word appears in each file. The format of the output is as follows:
word	file name: line numbers, file name: line numbers, file name: line numbers
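For instance, a word appearing twice on line 1 of 3.txt and once on line 2 of 1.txt would produce a line like the following (a hypothetical sample; the code below separates files with semicolons and line numbers with spaces):
MapReduce	3.txt:1 1 ;1.txt:2 ;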
Experimental results:
MapReduce appears twice in the first line of 3.txt, so there are two 1s.
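The implementation below reuses the WordCount skeleton; its key idea is a composite map key of the form "word filename", which the combiner splits back apart so the reducer can group by word alone. A minimal standalone sketch of that re-keying step (a hypothetical demo class, not part of the lab code):

public class RekeyDemo {
    public static void main(String[] args) {
        String record = "MapReduce 3.txt";    // composite key emitted by the mapper
        String[] str = record.split(" ");
        String word = str[0];                 // new key: the word alone
        String info = str[1] + ":" + "1 1 ";  // new value: "filename:line numbers"
        System.out.println(word + "\t" + info); // MapReduce	3.txt:1 1
    }
}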
Lab code:
MyMapper.java
import java.io.*;
import java.util.StringTokenizer;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class MyMapper extends Mapper<Object, Text, Text, Text> {
    private Text keyInfo = new Text();
    private Text valueInfo = new Text();
    private FileSplit split;
    int num = 0; // line number within the current split

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        num++;
        split = (FileSplit) context.getInputSplit();
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            // Composite key "word filename"; value is this line number
            keyInfo.set(itr.nextToken() + " " + split.getPath().getName().toString());
            valueInfo.set(num + "");
            context.write(keyInfo, valueInfo);
        }
    }
}

MyCombiner.java
import java.io.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Reducer;

public class MyCombiner extends Reducer<Text, Text, Text, Text> {
    private Text info = new Text();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Collect the line numbers for this (word, file) pair
        String sum = "";
        for (Text value : values) {
            sum += value.toString() + " ";
        }
        String record = key.toString();
        String[] str = record.split(" ");
        // Re-key by the word alone; value becomes "filename:line numbers"
        key.set(str[0]);
        info.set(str[1] + ":" + sum);
        context.write(key, info);
    }
}

MyReducer.java
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReducer extends Reducer<Text, Text, Text, Text> {
    private Text result = new Text();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Join the "filename:line numbers" entries for this word
        String value = new String();
        for (Text value1 : values) {
            value += value1.toString() + ";";
        }
        result.set(value);
        context.write(key, result);
    }
}

MyRunner.java
import java.io.IOException;
import java.net.URI;
import java.util.Scanner;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyRunner {
    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(MyRunner.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        job.setCombinerClass(MyCombiner.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        Scanner sc = new Scanner(System.in);
        System.out.print("inputPath:");
        String inputPath = sc.next();
        System.out.print("outputPath:");
        String outputPath = sc.next();

        try {
            FileSystem fs0 = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
            Path hdfsPath = new Path(outputPath);
            // Remove a stale output directory so the job can run again
            if (fs0.delete(hdfsPath, true)) {
                System.out.println("Directory " + outputPath + " has been deleted successfully!");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        FileInputFormat.setInputPaths(job, new Path("hdfs://master:9000" + inputPath));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000" + outputPath));
        job.waitForCompletion(true);

        try {
            // Read the job output back from HDFS and print it
            FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), new Configuration());
            Path srcPath = new Path(outputPath + "/part-r-00000");
            FSDataInputStream is = fs.open(srcPath);
            System.out.println("Results:");
            while (true) {
                String line = is.readLine();
                if (line == null) {
                    break;
                }
                System.out.println(line);
            }
            is.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Thank you for reading; this concludes "what is the Java MapReduce programming method". After studying this article, you should have a deeper understanding of Java MapReduce programming, and the specifics are best verified in practice.