2025-03-01 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces how to use a map-side join (Map Join) in MapReduce. Many people run into questions about this technique in day-to-day work, so the material below has been organized into a simple, easy-to-follow walkthrough. Hopefully it resolves those doubts; read on to study it.
1. Sample data
Weather station file (station id, station name; tab-separated):

011990-99999	SIHCCAJAVRI
012650-99999	TYNSET-HANSMOEN

Weather record file (station id, timestamp, temperature; tab-separated):

012650-99999	194903241200	111
012650-99999	194903241800	78
011990-99999	195005150700	0
011990-99999	195005151200	22
011990-99999	195005151800	-11
2. Requirement

Join each weather record with the name of the station that produced it, using the station id as the join key.
3. Approach and code
Place the sufficiently small side table (here, the weather station file) in the distributed cache. Each Mapper then reads the full station list from its local cache into memory during setup, and joins it against the weather records as they stream through map().
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class MapJoin {

    static class RecordMapper extends Mapper<LongWritable, Text, Text, Text> {

        private Map<String, String> stationMap = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            // Preprocess: load the file to be joined from the distributed cache.
            // The newer API for retrieving cache files is context.getCacheFiles();
            // context.getLocalCacheFiles() is deprecated. But getCacheFiles() returns
            // HDFS paths, while getLocalCacheFiles() returns local paths.
            Path[] paths = context.getLocalCacheFiles();
            // Only one file is cached here, so take the first one.
            BufferedReader reader = new BufferedReader(new FileReader(paths[0].toString()));
            String line = null;
            try {
                while ((line = reader.readLine()) != null) {
                    String[] vals = line.split("\\t");
                    if (vals.length == 2) {
                        stationMap.put(vals[0], vals[1]);
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                reader.close();
            }
            super.setup(context);
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] vals = value.toString().split("\\t");
            if (vals.length == 3) {
                String stationName = stationMap.get(vals[0]); // the join itself
                stationName = stationName == null ? "" : stationName;
                context.write(new Text(vals[0]),
                        new Text(stationName + "\t" + vals[1] + "\t" + vals[2]));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 3) {
            System.err.println("Wrong number of parameters; please supply three: "
                    + "<input path> <station file path> <output path>");
            System.exit(-1);
        }
        Path inputPath = new Path(otherArgs[0]);
        Path stationPath = new Path(otherArgs[1]);
        Path outputPath = new Path(otherArgs[2]);

        Job job = Job.getInstance(conf, "MapJoin");
        job.setJarByClass(MapJoin.class);
        FileInputFormat.addInputPath(job, inputPath);
        FileOutputFormat.setOutputPath(job, outputPath);
        job.addCacheFile(stationPath.toUri()); // call repeatedly to cache multiple files
        job.setMapperClass(RecordMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setNumReduceTasks(0); // map-only job: the join happens entirely on the map side
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
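Stripped of the Hadoop plumbing, the map-side join is just a hash lookup against the in-memory side table. A minimal standalone sketch of that core logic, using the sample data above (the class name MapJoinSketch and the helper joinRecord are illustrative, not part of the job):

```java
import java.util.HashMap;
import java.util.Map;

public class MapJoinSketch {

    // Join one tab-separated weather record (station id \t timestamp \t temperature)
    // against the in-memory station table. The returned line matches what
    // TextOutputFormat would write for the mapper's (key, value) pair.
    static String joinRecord(Map<String, String> stationMap, String record) {
        String[] vals = record.split("\t");
        String stationName = stationMap.getOrDefault(vals[0], "");
        return vals[0] + "\t" + stationName + "\t" + vals[1] + "\t" + vals[2];
    }

    public static void main(String[] args) {
        // The small side table that the real job ships via the distributed cache
        Map<String, String> stationMap = new HashMap<>();
        stationMap.put("011990-99999", "SIHCCAJAVRI");
        stationMap.put("012650-99999", "TYNSET-HANSMOEN");

        // Records stream through one at a time, just as in map()
        System.out.println(joinRecord(stationMap, "012650-99999\t194903241200\t111"));
        System.out.println(joinRecord(stationMap, "011990-99999\t195005150700\t0"));
    }
}
```

Because the side table is fully resident in each mapper's memory, no shuffle or reduce phase is needed; this is why the technique only works when the side table is small enough to fit in a task's heap.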
4. Running result
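The original result listing did not survive extraction. Based on the sample data in section 1 and the mapper logic above, the output part file of the map-only job should contain tab-separated lines of this shape (a reconstruction, not a captured run):

```
012650-99999	TYNSET-HANSMOEN	194903241200	111
012650-99999	TYNSET-HANSMOEN	194903241800	78
011990-99999	SIHCCAJAVRI	195005150700	0
011990-99999	SIHCCAJAVRI	195005151200	22
011990-99999	SIHCCAJAVRI	195005151800	-11
```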
This concludes the study of how to use a MapReduce map-side join. Hopefully it has resolved your doubts; pairing the theory with practice is the best way to learn, so give it a try. If you want to keep learning related topics, please continue to follow this site for more practical articles.