What is the principle of MapReduce implementation?

2025-04-06 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article explains how MapReduce is implemented. It outlines the programming model, lists the execution steps of the map and reduce tasks, and then walks through a word count example.

Overview of MapReduce

◆ MapReduce is a distributed computing model proposed by Google. It was used mainly in the search field to solve the problem of computing over massive data sets.

◆ MR consists of two stages, Map and Reduce. Users only need to implement the map() and reduce() functions to get distributed computation.

◆ The formal parameters of both functions are key-value pairs, which represent the function's input.
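The shape of the two functions can be sketched in plain Java, without any Hadoop types. This is only an illustration under assumptions: the class name MapReduceFunctionsSketch and the use of String and Long in place of Hadoop's Writable types are hypothetical, and the word count logic is borrowed from the example later in this article.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Plain-Java sketch; the class and method names are hypothetical, not Hadoop API.
class MapReduceFunctionsSketch {

    // map: one input pair <k1, v1> (line offset, line text) -> zero or more <k2, v2> pairs.
    static List<Map.Entry<String, Long>> map(long k1, String v1) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (String word : v1.split(" ")) {
            out.add(new SimpleEntry<>(word, 1L));
        }
        return out;
    }

    // reduce: one key k2 together with all of its grouped values -> one <k3, v3> pair.
    static Map.Entry<String, Long> reduce(String k2, List<Long> v2s) {
        long sum = 0L;
        for (long v : v2s) {
            sum += v;
        }
        return new SimpleEntry<>(k2, sum);
    }

    public static void main(String[] args) {
        System.out.println(map(0L, "Hello you"));                    // [Hello=1, you=1]
        System.out.println(reduce("Hello", Arrays.asList(1L, 1L)));  // Hello=2
    }
}
```

Everything between the two calls (partitioning, sorting, grouping, copying data between nodes) is the framework's job, which is why the user only writes these two functions.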

MR execution process

Principle of MapReduce implementation

◆ Execution steps:

1. Map task processing

1.1 Read the contents of the input file and parse them into key-value pairs. Each line of the input file becomes one key-value pair, and the map function is called once for each pair.

1.2 Apply your own logic to the input key and value and convert them into new key-value output.

1.3 Partition the output key-value pairs.

1.4 Within each partition, sort and group the data by key. The values of the same key are put into one collection.

1.5 (Optional) Reduce the grouped data locally on the map side (in Hadoop, a combiner).

2. Reduce task processing

2.1 The outputs of the map tasks are copied over the network, partition by partition, to the corresponding reduce nodes.

2.2 Merge and sort the outputs of the multiple map tasks. Apply the reduce function's own logic to the input key and values and convert them into new key-value output.

2.3 Save the output of reduce to a file.
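The framework-side work in steps 1.3 and 1.4 can be approximated in memory as follows. This is a minimal sketch, not Hadoop's actual shuffle: the class ShuffleSketch and its methods are hypothetical, and everything runs in one process instead of across nodes. The partition formula is the one used by Hadoop's default HashPartitioner, although Hadoop applies it to the hash code of the Text key, so real partition numbers may differ from those computed on a Java String.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory sketch of partitioning, sorting and grouping; names are hypothetical.
class ShuffleSketch {

    // Step 1.3: mask the sign bit of the hash, then take the modulus of the reducer count.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Step 1.4: within each partition, keep keys sorted (TreeMap) and
    // collect all values of the same key into one list.
    static List<TreeMap<String, List<Long>>> shuffle(
            List<Map.Entry<String, Long>> pairs, int numReduceTasks) {
        List<TreeMap<String, List<Long>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new TreeMap<>());
        }
        for (Map.Entry<String, Long> pair : pairs) {
            partitions.get(partition(pair.getKey(), numReduceTasks))
                    .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                    .add(pair.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>("Hello", 1L));
        pairs.add(new SimpleEntry<>("you", 1L));
        pairs.add(new SimpleEntry<>("Hello", 1L));
        pairs.add(new SimpleEntry<>("me", 1L));
        // With a single reduce task every key lands in partition 0.
        System.out.println(shuffle(pairs, 1)); // [{Hello=[1, 1], me=[1], you=[1]}]
    }
}
```

Each inner map corresponds to the input of one reduce task: the reduce function is then called once per key with that key's list of values.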

The example runs MapReduce on a text file named hello, whose content is as follows:

Hello you

Hello me

The code is implemented as follows

package MapReduce;

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountApp {
    static final String INPUT_PATH = "hdfs://hadoop:9000/hello";
    static final String OUT_PATH = "hdfs://hadoop:9000/out";

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        final FileSystem fileSystem = FileSystem.get(new URI(INPUT_PATH), conf);
        final Path outPath = new Path(OUT_PATH);
        // Delete the output directory if it already exists
        if (fileSystem.exists(outPath)) {
            fileSystem.delete(outPath, true);
        }

        // Deprecated in Hadoop 2 and later, where Job.getInstance(conf, name) is preferred
        final Job job = new Job(conf, WordCountApp.class.getSimpleName());

        // 1.1 Specify where the input file is located
        FileInputFormat.setInputPaths(job, INPUT_PATH);
        // Specify how to format the input file; each line is parsed into a key-value pair
        // job.setInputFormatClass(TextInputFormat.class);

        // 1.2 Specify the custom map class
        job.setMapperClass(MyMapper.class);
        // Map output <k2,v2> types. If they are the same as the <k3,v3> types, these can be omitted
        // job.setMapOutputKeyClass(Text.class);
        // job.setMapOutputValueClass(LongWritable.class);

        // 1.3 Partition
        // job.setPartitionerClass(HashPartitioner.class);
        // Run one reduce task
        // job.setNumReduceTasks(1);

        // 1.4 TODO sort, group
        // 1.5 TODO combine (local reduce)

        // 2.2 Specify the custom reduce class
        job.setReducerClass(MyReducer.class);
        // Specify the reduce output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // 2.3 Specify where to write the output
        FileOutputFormat.setOutputPath(job, outPath);
        // Specify the formatting class of the output file
        // job.setOutputFormatClass(TextOutputFormat.class);

        // Submit the job to the JobTracker and wait for it to finish
        job.waitForCompletion(true);
    }

    /**
     * KEYIN    is k1, the offset of the line
     * VALUEIN  is v1, the text content of the line
     * KEYOUT   is k2, a word that appears in the line
     * VALUEOUT is v2, the number of times the word appears in the line, fixed value 1
     */
    static class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        @Override
        protected void map(LongWritable k1, Text v1, Context context)
                throws java.io.IOException, InterruptedException {
            // The separator was lost in the source; a single space is assumed here
            // because it matches the sample input shown above.
            final String[] splited = v1.toString().split(" ");
            for (String word : splited) {
                context.write(new Text(word), new LongWritable(1));
            }
        }
    }

    /**
     * KEYIN    is k2, a word that appears in a line
     * VALUEIN  is v2, the count emitted for that word
     * KEYOUT   is k3, a distinct word that appears in the text
     * VALUEOUT is v3, the total number of times that word appears in the text
     */
    static class MyReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text k2, java.lang.Iterable<LongWritable> v2s, Context ctx)
                throws java.io.IOException, InterruptedException {
            long times = 0L;
            for (LongWritable count : v2s) {
                times += count.get();
            }
            ctx.write(k2, new LongWritable(times));
        }
    }
}

To run the program above, first create the hello file with the content shown earlier and upload it to the HDFS file system at the input path, then submit the job.
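As a rough local cross-check of what the job should write, the same word count can be computed in plain Java with streams. This is not the Hadoop job itself: the class ExpectedWordCount is hypothetical, and it assumes the words in hello are separated by a single space and capitalized exactly as shown above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Local cross-check only; the class name is hypothetical.
class ExpectedWordCount {

    // Split every line on spaces and count occurrences per word, with keys kept in sorted order.
    static Map<String, Long> count(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = count(Arrays.asList("Hello you", "Hello me"));
        // Print one tab-separated "word count" line per key:
        // Hello	2
        // me	1
        // you	1
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Under those assumptions, the file that the reduce task writes under the output directory should contain the same three tab-separated lines, since the default text output format writes the key and the value separated by a tab.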

This concludes the walkthrough of how MapReduce is implemented. Combining theory with practice is the best way to learn, so try running the example yourself.
