How to realize grouping in Hadoop

2025-03-18 Update From: SLTechnology News & Howtos

This article demonstrates how to implement custom grouping in Hadoop. The material is straightforward and clearly organized, and should help clear up any doubts; follow along below as we work through how grouping is achieved in Hadoop.
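A brief note on what the grouping comparator does before the full listing. In MapReduce, the grouping comparator decides which map-output keys count as equal when records are handed to a single reduce() call. The code below builds a composite key, NewKey, holding two long fields (first and second), but groups only on first, so one reduce() call receives every second value for a given first and emits the maximum. With the sample input pairs 3 3, 3 2, 3 1, 2 2, 2 1 and 1 1, the job therefore outputs 1 1, 2 2 and 3 3. As a point of comparison, a grouping comparator is often written by extending WritableComparator instead of implementing RawComparator by hand; the following minimal sketch shows that idiom. It is not part of the article's code (the class name FirstGroupComparator is hypothetical), and it assumes the NewKey class defined in the full listing below.

// Alternative sketch, not the article's implementation.
// Uses org.apache.hadoop.io.WritableComparator and WritableComparable,
// which are already imported in the full listing below.
class FirstGroupComparator extends WritableComparator {
    protected FirstGroupComparator() {
        // "true" lets WritableComparator instantiate NewKey objects so the
        // object-based compare() below can be applied to deserialized keys.
        super(NewKey.class, true);
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        NewKey k1 = (NewKey) a;
        NewKey k2 = (NewKey) b;
        // Group solely on the first field; the second field is ignored here.
        return Long.compare(k1.first, k2.first);
    }
}

With this variant, the driver would register it via job.setGroupingComparatorClass(FirstGroupComparator.class), mirroring the call used in the article's listing. The article's own code, shown in full next, instead implements RawComparator directly and compares the raw bytes of the first field.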

package grounp;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.RawComparator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Custom grouping.
 * Initial data:
 *   3 3
 *   3 2
 *   3 1
 *   2 2
 *   2 1
 *   1 1
 * Output:
 *   1 1
 *   2 2
 *   3 3
 * @author Xr
 */
public class groupApp {

    public static final String INPUT_PATH = "hdfs://hadoop:9000/data";
    public static final String OUTPUT_PATH = "hdfs://hadoop:9000/datas";

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        existsFile(conf);
        Job job = new Job(conf, groupApp.class.getName());

        FileInputFormat.setInputPaths(job, INPUT_PATH);
        job.setMapperClass(MyMapper.class);
        // Custom key
        job.setMapOutputKeyClass(NewKey.class);
        job.setMapOutputValueClass(LongWritable.class);
        // Custom grouping
        job.setGroupingComparatorClass(NewGroupCompator.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(LongWritable.class);
        FileOutputFormat.setOutputPath(job, new Path(OUTPUT_PATH));
        job.waitForCompletion(true);
    }

    // Delete the output directory if it already exists, so the job can be rerun.
    private static void existsFile(Configuration conf) throws IOException, URISyntaxException {
        FileSystem fs = FileSystem.get(new URI(OUTPUT_PATH), conf);
        if (fs.exists(new Path(OUTPUT_PATH))) {
            fs.delete(new Path(OUTPUT_PATH), true);
        }
    }
}

class MyMapper extends Mapper<LongWritable, Text, NewKey, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input line holds two tab-separated numbers.
        String string = value.toString();
        String[] split = string.split("\t");
        NewKey k2 = new NewKey();
        k2.set(Long.parseLong(split[0]), Long.parseLong(split[1]));
        context.write(k2, new LongWritable(Long.parseLong(split[1])));
    }
}

class MyReducer extends Reducer<NewKey, LongWritable, LongWritable, LongWritable> {
    @Override
    protected void reduce(NewKey key2, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        // Grouping only looks at the first field, so this call sees every
        // "second" value for one "first" value; emit the maximum.
        long max = Long.MIN_VALUE;
        for (LongWritable v2 : values) {
            long l = v2.get();
            if (l > max) {
                max = l;
            }
        }
        context.write(new LongWritable(key2.first), new LongWritable(max));
    }
}

class NewKey implements WritableComparable<NewKey> {
    long first;
    long second;

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(this.first);
        out.writeLong(this.second);
    }

    public void set(long first, long second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        this.first = in.readLong();
        this.second = in.readLong();
    }

    @Override
    public int compareTo(NewKey o) {
        // Sort by first, then by second.
        if (this.first == o.first) {
            if (this.second < o.second) {
                return -1;
            } else if (this.second == o.second) {
                return 0;
            } else {
                return 1;
            }
        } else {
            if (this.first < o.first) {
                return -1;
            } else {
                return 1;
            }
        }
    }
}

class NewGroupCompator implements RawComparator<NewKey> {

    @Override
    public int compare(NewKey o1, NewKey o2) {
        return 0;
    }

    /**
     * Compare the specified byte ranges of two serialized keys.
     * @param b1 the first byte array in the comparison
     * @param s1 the start position within the first byte array
     * @param l1 the length in bytes of the first key
     * @param b2 the second byte array in the comparison
     * @param s2 the start position within the second byte array
     * @param l2 the length in bytes of the second key
     */
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Compare only the first 8 bytes (the serialized "first" field), so keys
        // that share the same first value fall into the same group.
        return WritableComparator.compareBytes(b1, s1, 8, b2, s2, 8);
    }
}

The above is the full content of "how to achieve grouping in Hadoop". Thank you for reading! Hopefully it has given you a clearer understanding; if you would like to learn more, you are welcome to follow the industry information channel.
