MapReduce Programming Practice 2: Inverted Index (jar Package)

2025-04-10 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03 Report

Task requirements:

// Input file format (caller number, then the number called)

18661629496 110

13107702446 110

1234567 120

2345678 120

987654 110

2897839274 18661629496

// Output file format (number called, then every caller of that number)

110 18661629496 | 13107702446 | 987654 | 18661629496 | 13107702446 | 987654 |

120 1234567 | 2345678 | 1234567 | 2345678 |

18661629496 2897839274 | 2897839274 |
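
The transformation described above can be tried out without a cluster. The following is a minimal sketch in plain Java, not the article's MapReduce job: the class name InvertedIndexSketch and the method invert are hypothetical, and a TreeMap stands in for the shuffle phase that groups values by key. It uses "|" as the separator and feeds each input line only once, so each caller appears once per key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class for illustration; it is not part of the original job.
class InvertedIndexSketch {

    // Groups caller numbers by the number they called.
    static Map<String, String> invert(List<String> lines) {
        // TreeMap keeps the keys in sorted string order, standing in for the shuffle.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) {
                continue; // skip lines that lack two fields
            }
            // "Map" step: emit (number called, caller).
            grouped.computeIfAbsent(parts[1], k -> new ArrayList<>()).add(parts[0]);
        }

        // "Reduce" step: join all callers of one number into a single string.
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            StringBuilder out = new StringBuilder();
            for (String caller : entry.getValue()) {
                out.append(caller).append("|");
            }
            result.put(entry.getKey(), out.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "18661629496 110",
                "13107702446 110",
                "1234567 120",
                "2345678 120",
                "987654 110",
                "2897839274 18661629496");
        // Print one "key<TAB>callers" line per number called.
        invert(input).forEach((key, callers) -> System.out.println(key + "\t" + callers));
    }
}
```

In this sketch the callers keep their input order. In a real MapReduce job the order in which values reach the reducer is not guaranteed.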

MapReduce program:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Test2 {

    enum Counter {
        LINESKIP, // lines skipped because of errors
    }

    public static class Map extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString(); // read source data
            try {
                // data processing
                String[] lineSplit = line.split(" "); // e.g. "18661629496 110"
                String anum = lineSplit[0]; // caller
                String bnum = lineSplit[1]; // number called
                // output format: 110 18661629496
                context.write(new Text(bnum), new Text(anum));
            } catch (ArrayIndexOutOfBoundsException e) {
                context.getCounter(Counter.LINESKIP).increment(1); // counter + 1
                return;
            }
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String valueString;
            String out = "";
            // concatenate every caller of this number
            for (Text value : values) {
                valueString = value.toString();
                out += valueString + "|";
            }
            context.write(key, new Text(out));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Please configure input and output paths");
            System.exit(2);
        }

        // various configurations
        Job job = new Job(conf, "telephone"); // job name

        // class configuration
        job.setJarByClass(Test2.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        // map output format configuration
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        // job output format configuration
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // add input and output paths
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // exit when the task is completed
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
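
The mapper above skips malformed lines by catching ArrayIndexOutOfBoundsException and incrementing the LINESKIP counter. One possible alternative is to check the field count explicitly. The sketch below shows that idea in plain Java so it runs without Hadoop. The class LineParseSketch and its methods are hypothetical names, and splitting on any run of whitespace (rather than a single space) is an assumption about the input, not something the article states.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; it is not part of the original job.
class LineParseSketch {

    // Returns {caller, numberCalled}, or null when the line has fewer than two fields.
    static String[] parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 2) {
            return null; // malformed: the caller decides how to count it
        }
        return new String[] { fields[0], fields[1] };
    }

    // Counts the lines that parse() rejects, playing the role of the LINESKIP counter.
    static int countSkipped(List<String> lines) {
        int skipped = 0;
        for (String line : lines) {
            if (parse(line) == null) {
                skipped++;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("18661629496 110", "", "1234567");
        System.out.println("skipped lines: " + countSkipped(lines));
    }
}
```

Inside a real mapper, the null case would be the place to call context.getCounter(Counter.LINESKIP).increment(1) and return, as the listing does in its catch block.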

Package the MapReduce program as a jar file:

1. Right-click the project name -> Export -> Java -> JAR file

2. Configure the jar file storage location

3. Select the main class

4. Run the jar file

[liuqingjie@master hadoop-0.20.2]$ bin/hadoop jar /home/liuqingjie/test2.jar /user/liuqingjie/in /user/liuqingjie/out

15-05-14 01:46:47 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.

15-05-14 01:46:47 INFO input.FileInputFormat: Total input paths to process: 2

15-05-14 01:46:48 INFO mapred.JobClient: Running job: job_201505132004_0005

15-05-14 01:46:49 INFO mapred.JobClient: map 0% reduce 0%

15-05-14 01:46:57 INFO mapred.JobClient: map 100% reduce 0%

15-05-14 01:47:09 INFO mapred.JobClient: map 100% reduce 100%

...

View the result

[liuqingjie@master hadoop-0.20.2]$ bin/hadoop dfs -cat ./out/*

cat: Source must be a file.

110 18661629496 | 13107702446 | 987654 | 18661629496 | 13107702446 | 987654 |

120 1234567 | 2345678 | 1234567 | 2345678 |

18661629496 2897839274 | 2897839274 |
