The contents of this issue:
1. Demystifying the MapReduce architecture
2. Examining how MapReduce runs on a cluster
3. Hands-on MapReduce through Java programming
Since Hadoop 2.0, MapReduce has run on Yarn from the start; Hadoop 1.0 did not involve Yarn at all.
Today, talking about MR means talking about Yarn, and this is now basic, entry-level material: starting from zero is a thing of the past.
Starting tomorrow: a collection of about 20 MapReduce code examples.
One: MapReduce architecture based on Yarn
1. An MR program is built on implementations of Mapper and Reducer: the Mapper splits a computing task into many small tasks that run in parallel, and the Reducer performs the final aggregation (see the sketch just below).
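As a concrete illustration of the Mapper/Reducer split just described, here is a minimal word-count sketch in Java; the class names (WordCountMapper, WordCountReducer) and the whitespace tokenization are illustrative assumptions, not code from this article.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: splits each input line into words and emits (word, 1) pairs,
// so the work can be spread across many parallel map tasks.
class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

// Reducer: performs the final aggregation, summing the counts for each word.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}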
2. Since Hadoop 2.x, MapReduce runs on Yarn. Yarn manages all of the cluster's resources (such as memory and CPU) through the ResourceManager; each node runs a JVM process called NodeManager, which receives requests and wraps that node's resources into Containers. This is the machinery the ResourceManager relies on when a job request arrives.
3. When the ResourceManager receives a request submitted by a Client, it looks at the state of the cluster's resources and orders a NodeManager to start the program's first Container on that NodeManager's node. This Container hosts the program's ApplicationMaster, which is responsible for scheduling and driving the execution of the program's tasks. The ApplicationMaster registers itself with the ResourceManager and, once registered, applies to the ResourceManager for the specific Container resources it needs for computation.
4. How does the ApplicationMaster know how many Containers a program needs?
When the application starts, it runs the program's Main method, which contains the data input and related configuration; from these, the number of Containers required can be determined.
(A Container is a unit of cluster resources. Based on the computation requested by the Client, the cluster parses the job, and the result of that analysis includes the Container resources required.)
The application runs the Main method to learn how many input splits there are, with each split corresponding to one Container, and then allocates some additional resources for other work such as Shuffle (see the driver sketch below).
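A minimal driver sketch to go with point 4, assuming the WordCountMapper and WordCountReducer classes from the earlier sketch; the input path handed to FileInputFormat is what determines the number of splits, and therefore how many map-task Containers the ApplicationMaster will request, while the number of reduce tasks is set explicitly.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // The input path decides how many splits (and hence map Containers) are needed;
        // the number of reduce Containers is set explicitly here.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setNumReduceTasks(1);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}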
5. Summary of MapReduce running on Yarn
Master-slave structure:
Master node (only one): ResourceManager
Control node (one per Job): MRAppMaster
Slave nodes (many): YarnChild
ResourceManager is responsible for:
Receiving computing tasks submitted by clients
Assigning each Job to an MRAppMaster for execution
Monitoring the execution of each MRAppMaster
MRAppMaster is responsible for:
Scheduling the tasks of a single Job
Assigning the Job's tasks to YarnChild processes for execution
Monitoring the execution of the YarnChild processes
YarnChild is responsible for:
Executing the computing tasks assigned by the MRAppMaster
In production, the ResourceManager should be deployed in HA (high availability) mode; a configuration sketch follows below.
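As a rough illustration of the HA point above, the sketch below sets the standard ResourceManager HA properties through the Java Configuration API; in a real cluster these properties normally live in yarn-site.xml, and the RM ids, hostnames, and ZooKeeper quorum shown here are placeholders.

import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHaConfigSketch {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();

        // Enable ResourceManager HA with two RM instances (placeholder ids and hosts).
        conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
        conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
        conf.set("yarn.resourcemanager.hostname.rm1", "master1.example.com");
        conf.set("yarn.resourcemanager.hostname.rm2", "master2.example.com");

        // ZooKeeper quorum used for leader election and RM state storage (placeholder hosts).
        conf.set("yarn.resourcemanager.zk-address",
                 "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");

        System.out.println("Configured RM ids: " + conf.get("yarn.resourcemanager.ha.rm-ids"));
    }
}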
6. The MRAppMaster in Hadoop MapReduce is the counterpart of the Driver in Spark, and the YarnChild processes in Hadoop MapReduce are the counterpart of CoarseGrainedExecutorBackend in Spark.
(Compared with Spark, Hadoop MapReduce consumes considerably more resources.)