2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
In MapReduce:
The shuffle phase sits between map and reduce; within it you can customize sorting, partitioning, and grouping.
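To illustrate what a custom sort in the shuffle changes, here is a plain-Java sketch (not the Hadoop API; the class and method names are invented for illustration). By default, map-output keys are sorted in ascending order before reaching the reducers; a custom comparator, which in a real job would be registered with job.setSortComparatorClass, can impose a different order, such as descending:

```java
import java.util.Arrays;
import java.util.Comparator;

// Plain-Java sketch of custom shuffle sorting: keys normally sort
// ascending; a custom comparator can reverse or otherwise change
// the order in which the reducer sees them.
public class CustomSortSketch {
    public static String[] sortDescending(String[] mapOutputKeys) {
        String[] sorted = mapOutputKeys.clone();
        // Reverse lexicographic order, as a custom comparator might impose.
        Arrays.sort(sorted, Comparator.reverseOrder());
        return sorted;
    }

    public static void main(String[] args) {
        String[] keys = {"banana", "apple", "cherry"};
        // prints [cherry, banana, apple]
        System.out.println(Arrays.toString(sortDescending(keys)));
    }
}
```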
The data emitted by the map phase is a set of key-value pairs. By default, HashPartitioner is used to distribute them across reducers.
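The default partitioning rule is simple: mask the sign bit of the key's hash and take the remainder modulo the number of reducers. The sketch below mirrors that logic in plain Java (no Hadoop dependency; the class name is invented for illustration):

```java
// Plain-Java sketch of the default HashPartitioner logic: each
// map-output key is assigned to a reducer by clearing the sign bit
// of its hash code and taking the modulus of the reducer count.
public class HashPartitionSketch {
    static int getPartition(Object key, int numReduceTasks) {
        // & Integer.MAX_VALUE clears the sign bit so the result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4;
        for (String k : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(k + " -> reducer " + getPartition(k, reducers));
        }
    }
}
```

Note that this gives a roughly uniform spread of keys but says nothing about order across reducers, which is why total sorting needs a different partitioner.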
For total ordering, Hadoop provides several samplers that can supply the partition boundaries used by TotalOrderPartitioner:
InputSampler.RandomSampler sampler = new InputSampler.RandomSampler(1, 3000, 10); // (freq, numSamples, maxSplitsSampled)
InputSampler.IntervalSampler sampler2 = new InputSampler.IntervalSampler(0.333, 10);
InputSampler.SplitSampler sampler3 = new InputSampler.SplitSampler(reduceNumber);
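To make the idea concrete, here is a plain-Java sketch of what SplitSampler does conceptually (not the Hadoop implementation; the class and method names are invented for illustration): take the first n keys from each input split and pool them as the sample from which partition boundaries are later derived.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of SplitSampler's strategy: keep at most
// perSplit keys from the front of each split and pool them.
public class SplitSamplerSketch {
    public static List<String> sample(List<List<String>> splits, int perSplit) {
        List<String> sampled = new ArrayList<>();
        for (List<String> split : splits) {
            sampled.addAll(split.subList(0, Math.min(perSplit, split.size())));
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<List<String>> splits = List.of(List.of("a", "b", "c"), List.of("d", "e"));
        // prints [a, b, d, e]
        System.out.println(sample(splits, 2));
    }
}
```

RandomSampler instead samples records with a given frequency up to a maximum count, and IntervalSampler takes records at regular intervals; all three produce a key sample from which the partition file is written.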
Implementation and details
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

public class TotalSortMR {

    @SuppressWarnings("deprecation")
    public static int runTotalSortJob(String[] args) throws Exception {
        Path inputPath = new Path(args[0]);
        Path outputPath = new Path(args[1]);
        Path partitionFile = new Path(args[2]);
        int reduceNumber = Integer.parseInt(args[3]);

        // Three samplers
        InputSampler.RandomSampler sampler = new InputSampler.RandomSampler(1, 3000, 10);
        InputSampler.IntervalSampler sampler2 = new InputSampler.IntervalSampler(0.333, 10);
        InputSampler.SplitSampler sampler3 = new InputSampler.SplitSampler(reduceNumber);

        // Job initialization
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJobName("Total-Sort");
        job.setJarByClass(TotalSortMR.class);
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setNumReduceTasks(reduceNumber);

        // Use the total-order partitioner
        job.setPartitionerClass(TotalOrderPartitioner.class);
        // Partition file referenced by the partitioner (set on the job's
        // own configuration, since Job.getInstance copies conf)
        TotalOrderPartitioner.setPartitionFile(job.getConfiguration(), partitionFile);
        // Write the partition file using the chosen sampler
        InputSampler.writePartitionFile(job, sampler);

        // Input and output paths of the job
        FileInputFormat.setInputPaths(job, inputPath);
        FileOutputFormat.setOutputPath(job, outputPath);
        outputPath.getFileSystem(conf).delete(outputPath, true);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(runTotalSortJob(args));
    }
}
The default input format for a job is TextInputFormat, which produces key-value pairs: the key is the byte offset of each line and the value is the line's content. This can be changed with:
job.setInputFormatClass(...);
Generally, you should also set the mapper's output key and value classes, so the shuffle knows how to serialize and compare them for the later stages.
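The job above uses KeyValueTextInputFormat instead, which splits each line into a key and a value at the first separator (a tab by default). A plain-Java sketch of that parsing rule (not the Hadoop implementation; the class and method names are invented for illustration):

```java
// Plain-Java sketch of how KeyValueTextInputFormat derives a key-value
// pair from a line of text: split at the first tab; if there is no
// tab, the whole line becomes the key and the value is empty.
public class KeyValueLineSketch {
    public static String[] parse(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] {line, ""};
        }
        return new String[] {line.substring(0, tab), line.substring(tab + 1)};
    }

    public static void main(String[] args) {
        String[] kv = parse("apple\t3");
        // prints apple / 3
        System.out.println(kv[0] + " / " + kv[1]);
    }
}
```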