
What are the characteristics of MapReduce

2025-01-19 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article describes the characteristics of MapReduce. The editor finds the material practical and shares it here for reference.

Characteristics of MapReduce

- Easy to program: a job consists mainly of two parts, map and reduce, so it is quick to learn. Hive and Pig make MapReduce easier still.
- Good scalability: capacity can be increased simply by adding machines.
- High fault tolerance: when some tasks in a job fail, they can be re-executed.
- Suitable for offline processing of massive data sets at the PB scale and above.

What MapReduce is not good at

- Real-time computing
  - It cannot return results in milliseconds or seconds the way MySQL does. (Consider Spark or HBase; HBase offers good random read/write performance but is weak at statistical aggregation.)
- Streaming computing
  - The input data set of a MapReduce job is static and cannot change dynamically.
  - MapReduce's design requires the data source to be static. (Consider Storm.)
- DAG computing
  - Multiple applications depend on one another, with the output of one serving as the input of the next. (Consider Tez.)

MapReduce divides the entire running process of a job into two stages: the Map phase and the Reduce phase.

- The Map phase consists of a certain number of Map Tasks
  - Input data format parsing: InputFormat
  - Input data processing: Mapper
  - Data grouping: Partitioner
- The Reduce phase consists of a certain number of Reduce Tasks
  - Remote copy of data from the Map Tasks
  - Sorting of data by key
  - Data processing: Reducer
  - Data output format: OutputFormat
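The two phases listed above can be illustrated with a minimal single-process sketch of a word count. It only mimics the flow (map, partition, sort by key, reduce) in plain Python and does not use Hadoop; the names `map_fn`, `partition`, `reduce_fn`, and `run_job` are hypothetical, and `crc32` stands in for whatever hash a real Partitioner would use.

```python
from collections import defaultdict
from zlib import crc32


def map_fn(offset, line):
    """Mapper: emit (word, 1) for every word in the input line."""
    for word in line.split():
        yield word, 1


def partition(key, num_reducers):
    """Partitioner: pick a Reduce Task with a deterministic hash mod R."""
    return crc32(key.encode("utf-8")) % num_reducers


def reduce_fn(key, values):
    """Reducer: sum all counts collected for one key."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    # Map phase: turn the input into (offset, line) records, run the Mapper,
    # and group each output pair into the bucket of its Reduce Task.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    offset = 0
    for line in lines:
        for key, value in map_fn(offset, line):
            buckets[partition(key, num_reducers)][key].append(value)
        offset += len(line.encode("utf-8")) + 1  # +1 for the newline

    # Reduce phase: each Reduce Task sorts its keys, then runs the Reducer.
    output = []
    for bucket in buckets:
        for key in sorted(bucket):
            output.append(reduce_fn(key, bucket[key]))
    return output


result = dict(run_job(["a b a", "b c"]))
print(result)
```

In a real cluster the buckets would live on different machines and the grouping step would involve the remote copy mentioned above; here everything stays in memory to keep the sketch short.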

By default, TextInputFormat divides the input files into Splits and processes each Split, providing a RecordReader that generates key/value pairs.

TextInputFormat: the key is the offset of the line in the file, and the value is the line content. If a line is truncated at the end of a Split, the reader continues into the next block and reads the first few characters there to complete the line.
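The key/value rule and the handling of truncated lines can be sketched as follows. This is an illustration in plain Python, not Hadoop's actual code: `read_records` is a hypothetical helper, and the boundary policy (a non-first split skips its first line, and a reader may run past the end of its split to finish a line) is an assumption chosen so that every line is produced exactly once.

```python
def read_records(data: bytes, start: int, length: int):
    """Yield (byte offset, line) pairs for the split [start, start + length)."""
    end = start + length
    pos = start
    if start != 0:
        # Not the first split: skip the first (possibly partial) line,
        # because the reader of the previous split is responsible for it.
        newline = data.find(b"\n", pos)
        pos = len(data) if newline == -1 else newline + 1
    # Keep reading while a line starts at or before the end of the split,
    # even if that line runs into the next block.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        line_end = len(data) if newline == -1 else newline
        yield pos, data[pos:line_end].decode("utf-8")
        pos = line_end + 1


data = b"hello world\nfoo bar\nbaz\n"
# Three splits whose boundaries fall in the middle of lines.
splits = [(0, 10), (10, 10), (20, 4)]
records = [rec for start, length in splits for rec in read_records(data, start, length)]
print(records)
```

The first split ends at byte 10, inside "hello world", yet its reader still returns the whole line by reading past the boundary; the second split starts by discarding the tail of that line.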

Design concepts

Block

- The smallest unit of data storage in HDFS; the default size is 64 MB (128 MB in Hadoop 2.x and later).

Split

- The smallest unit of computation in MapReduce; by default, one Split corresponds to one Block.

Block and Split

- The correspondence between Splits and Blocks is arbitrary and can be controlled by the user.
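How a user might influence the Split-to-Block correspondence can be sketched with a small calculation. The rule used here, split size = max(min size, min(max size, block size)), follows the way Hadoop's file-based input formats are commonly described, but treat it as an assumption; `compute_splits` and its parameters are hypothetical names, and real implementations add details (such as a tolerance for a slightly oversized last split) that are left out.

```python
MB = 1024 * 1024


def compute_splits(file_size, block_size=64 * MB, min_size=1, max_size=2**63 - 1):
    """Return a list of (offset, length) splits for a file of file_size bytes."""
    # Assumed rule: clamp the block size between the user's min and max sizes.
    split_size = max(min_size, min(max_size, block_size))
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


default_splits = compute_splits(200 * MB)                    # one split per block
larger_splits = compute_splits(200 * MB, min_size=128 * MB)  # one split spans two blocks
smaller_splits = compute_splits(200 * MB, max_size=32 * MB)  # two splits per block
print(len(default_splits), len(larger_splits), len(smaller_splits))
```

Raising the minimum size makes one Split cover several Blocks (fewer Map Tasks), while lowering the maximum size cuts a Block into several Splits (more Map Tasks).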

Map stage

- InputFormat (default: TextInputFormat)
- Mapper
- Partitioner
- Sort (optional)
- Combiner (local reducer) (optional)

Reduce stage

- Sort
- Reducer
- OutputFormat (default: TextOutputFormat)

Combiner

A Combiner can be regarded as a local Reducer: it merges the values that correspond to the same key within a Map Task's output (as in the wordcount example). Its logic is usually the same as the Reducer's.

Benefits:

- Reduces the amount of data output by each Map Task (disk IO)
- Reduces the amount of data transmitted over the network between Map and Reduce (network IO)

A Combiner can be used only when partial results can be superimposed:

- Sum (yes), Average (no)
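A small numeric sketch shows why Sum can be combined locally while Average cannot. The data and variable names are made up for illustration; the last step, carrying (sum, count) pairs, is a common workaround and is not something the original text prescribes.

```python
# Values for a single key, as produced by two different Map Tasks.
map_outputs = [[1, 2, 3], [10]]


def mean(values):
    return sum(values) / len(values)


all_values = [v for task in map_outputs for v in task]

# Sum: combining per Map Task first gives the same answer as summing everything.
sum_direct = sum(all_values)
sum_combined = sum(sum(task) for task in map_outputs)

# Average: averaging the per-task averages gives a different (wrong) answer.
mean_direct = mean(all_values)
mean_of_means = mean([mean(task) for task in map_outputs])

# Workaround: the combiner emits (sum, count); the reducer divides at the end.
partials = [(sum(task), len(task)) for task in map_outputs]
mean_from_partials = sum(s for s, _ in partials) / sum(c for _, c in partials)

print(sum_direct, sum_combined)
print(mean_direct, mean_of_means, mean_from_partials)
```

The per-task averages lose the information about how many values each task saw, which is why the plain average cannot be superimposed.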

Partitioner

- The Partitioner determines which Reduce Task processes each record output by a Map Task. The default is hash(key) mod R, where R is the number of Reduce Tasks.
- Users may customize it, and in many cases a custom Partitioner is required.
  - For example, "hash(hostname(URL)) mod R" ensures that web pages with the same domain name are handed to the same Reduce Task.
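The hostname-based rule can be sketched in a few lines of Python. `host_partition` is a hypothetical function, not part of any Hadoop API, and `crc32` is used as the hash only because Python's built-in `hash` for strings varies between processes.

```python
from urllib.parse import urlparse
from zlib import crc32


def host_partition(url: str, num_reducers: int) -> int:
    """Return the Reduce Task index for a URL: hash(hostname(URL)) mod R."""
    host = urlparse(url).hostname or ""
    return crc32(host.encode("utf-8")) % num_reducers


R = 4
p1 = host_partition("http://example.com/a", R)
p2 = host_partition("http://example.com/b?x=1", R)
p3 = host_partition("http://example.org/", R)
print(p1, p2, p3)
```

Both example.com pages map to the same index, so a single Reduce Task sees every page of that domain; pages from another domain may or may not share that index, depending on the hash.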
