This article shares how to implement a custom sort in MapReduce. The technique is practical and worth working through; hopefully you will take something away from it.
The test data for this article:
tom 20 8000
nancy 22 8000
ketty 22 9000
stone 19 10000
green 19 11000
white 39 29000
socrates 30 40000
In MapReduce, partitioning, sorting, and grouping are all performed by key.
MapReduce sorts keys of the built-in types, such as IntWritable (wrapping int), LongWritable (wrapping long), and Text, in ascending order by default.
Why customize the sort order? Some requirements call for a custom key type with its own comparison rules. For example: sort people by salary in descending order, and if salaries are equal, sort by age in ascending order.
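Before building the Hadoop key type, the ordering rule itself can be sketched with a plain java.util.Comparator (an illustrative snippet only, not part of the original program; the {salary, age} array layout and the sample values are assumptions):

import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch: the target ordering expressed with a plain Comparator.
public class SortRuleDemo {
    public static void main(String[] args) {
        // Each entry is {salary, age}; the values are made-up samples.
        int[][] people = { {8000, 22}, {9000, 22}, {8000, 20} };
        Arrays.sort(people, Comparator
                .<int[]>comparingInt(p -> p[0]).reversed()   // salary descending
                .thenComparingInt(p -> p[1]));                // age ascending on ties
        for (int[] p : people) {
            System.out.println(p[0] + " " + p[1]);           // prints: 9000 22, 8000 20, 8000 22
        }
    }
}

In MapReduce, however, the comparison logic has to travel with the key itself, which is why the key type below implements WritableComparable.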
Take the built-in Text type as an example:
The Text class implements the WritableComparable interface, which requires the write(), readFields(), and compareTo() methods.
readFields(): deserializes the key's fields from the input stream.
write(): serializes the key's fields to the output stream.
compareTo(): defines the sort order of the keys.
So a custom key type that needs to take part in sorting must implement these same methods.
Custom class code:
import org.apache.hadoop.io.WritableComparable;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

public class Person implements WritableComparable<Person> {

    private String name;
    private int age;
    private int salary;

    public Person() {
    }

    public Person(String name, int age, int salary) {
        this.name = name;
        this.age = age;
        this.salary = salary;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public int getSalary() {
        return salary;
    }

    public void setSalary(int salary) {
        this.salary = salary;
    }

    @Override
    public String toString() {
        return this.salary + " " + this.age + " " + this.name;
    }

    // Compare by salary first: the higher salary sorts first.
    // If salaries are equal, the smaller age sorts first.
    @Override
    public int compareTo(Person o) {
        int compareResult1 = this.salary - o.salary;
        if (compareResult1 != 0) {
            return -compareResult1;
        } else {
            return this.age - o.age;
        }
    }

    // Serialize the key fields to binary using the output stream.
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        dataOutput.writeUTF(name);
        dataOutput.writeInt(age);
        dataOutput.writeInt(salary);
    }

    // Read the fields back in the same order they were written in write().
    @Override
    public void readFields(DataInput dataInput) throws IOException {
        this.name = dataInput.readUTF();
        this.age = dataInput.readInt();
        this.salary = dataInput.readInt();
    }
}
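As a quick local sanity check (a hedged sketch, not part of the original article; it assumes the Person class above and the Hadoop client classes are on the classpath), compareTo() and the write()/readFields() pair can be exercised without a cluster:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.util.Arrays;

public class PersonLocalCheck {
    public static void main(String[] args) throws Exception {
        Person[] people = {
                new Person("nancy", 22, 8000),
                new Person("tom", 20, 8000),
                new Person("socrates", 30, 40000)
        };

        // compareTo(): salary descending, then age ascending on equal salary
        Arrays.sort(people);
        for (Person p : people) {
            System.out.println(p);   // expected: 40000 30 socrates, 8000 20 tom, 8000 22 nancy
        }

        // write()/readFields() round trip: fields come back in the order they were written
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        people[0].write(new DataOutputStream(bytes));
        Person copy = new Person();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(copy);    // expected: 40000 30 socrates
    }
}

If the printed order matches the expected job output at the end of this article, the key type is behaving as intended.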
MapReduce program:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;
import java.net.URI;

public class SecondarySort {

    public static void main(String[] args) throws Exception {
        System.setProperty("HADOOP_USER_NAME", "hadoop2.7");
        Configuration configuration = new Configuration();
        // Point the job at the locally built jar containing this MapReduce program
        configuration.set("mapreduce.job.jar",
                "C:\\Users\\tanglei1\\IdeaProjects\\Hadooptang\\target\\com.kaikeba.hadoop-1.0-SNAPSHOT.jar");
        Job job = Job.getInstance(configuration, SecondarySort.class.getSimpleName());

        // Delete the output directory if it already exists
        FileSystem fileSystem = FileSystem.get(new URI(args[1]), configuration);
        if (fileSystem.exists(new Path(args[1]))) {
            fileSystem.delete(new Path(args[1]), true);
        }

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        job.setMapperClass(MyMap.class);
        job.setMapOutputKeyClass(Person.class);
        job.setMapOutputValueClass(NullWritable.class);

        // Set the number of reduce tasks
        job.setNumReduceTasks(1);

        job.setReducerClass(MyReduce.class);
        job.setOutputKeyClass(Person.class);
        job.setOutputValueClass(NullWritable.class);
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }

    // LongWritable/Text: input key/value types; Person/NullWritable: output key/value types.
    // The map output key carries all the data, so NullWritable is used as the value.
    public static class MyMap extends Mapper<LongWritable, Text, Person, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // key is the byte offset of the line; Hadoop reads the text line by line.
            // value is one line of data, e.g. "nancy 22 8000"
            String[] fields = value.toString().split(" ");
            String name = fields[0];
            // Convert the string fields to int
            int age = Integer.parseInt(fields[1]);
            int salary = Integer.parseInt(fields[2]);
            // The custom Person key carries the comparison logic used for sorting
            Person person = new Person(name, age, salary);
            context.write(person, NullWritable.get());
        }
    }

    public static class MyReduce extends Reducer<Person, NullWritable, Person, NullWritable> {
        @Override
        protected void reduce(Person key, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            context.write(key, NullWritable.get());
        }
    }
}
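A note on running it (the paths below are illustrative, not taken from the original article): args[0] is the input file and args[1] is the output directory, which the driver deletes first if it already exists. With the program packaged as a jar, a typical launch looks like:
hadoop jar com.kaikeba.hadoop-1.0-SNAPSHOT.jar SecondarySort /input/person.txt /output/secondarysort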
Running result:
40000 30 socrates
29000 39 white
11000 19 green
10000 19 stone
9000 22 ketty
8000 20 tom
8000 22 nancy

The above is how to implement custom sorting in MapReduce. The editor believes these are knowledge points you may see or use in daily work; hopefully you can learn more from this article.