2. Basic concepts of Spark

2025-01-18 Update From: SLTechnology News&Howtos

Application

A Spark program written by the user. The main method of the Application is its entry point, and in it the user defines RDDs and the operations on them through the Spark API.

Job

A unit of work submitted to Spark for execution. An Application usually produces multiple Jobs, with each Action marking a Job boundary. Spark evaluates lazily: creating or transforming an RDD does not execute anything immediately. Only when an Action is encountered is a Job generated, which is then scheduled and executed as a whole.
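The lazy mechanism can be illustrated with a small model in plain Python. This is only a sketch: `ToyRDD` and its `jobs_run` counter are hypothetical names invented for illustration and are not part of Spark's API. Transformations merely record what should be done, and each Action triggers one Job that replays the recorded operations.

```python
class ToyRDD:
    """Toy stand-in for an RDD; NOT Spark's API, only a model of lazy evaluation."""

    jobs_run = 0  # counts how many Jobs have been triggered by Actions

    def __init__(self, data, ops=()):
        self._data = data  # source records
        self._ops = ops    # recorded transformations (the lineage)

    # Transformations: record the operation and return a new ToyRDD; nothing runs.
    def map(self, f):
        return ToyRDD(self._data, self._ops + (("map", f),))

    def filter(self, pred):
        return ToyRDD(self._data, self._ops + (("filter", pred),))

    # Actions: each call triggers one Job that replays the recorded lineage.
    def collect(self):
        ToyRDD.jobs_run += 1
        out = list(self._data)
        for kind, fn in self._ops:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out

    def count(self):
        return len(self.collect())


rdd = ToyRDD(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
jobs_before_action = ToyRDD.jobs_run  # still 0: only transformations so far
squares = rdd.collect()               # first Action, first Job
total = rdd.count()                   # second Action, second Job
```

In this model two Actions produce two Jobs, and the second Job recomputes the whole lineage because nothing was cached.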

Stage

Each Job is divided into phases called Stages, with Shuffle operations as the boundaries. There are two kinds of Stage: the non-final Stage (Shuffle Map Stage) and the final Stage (Result Stage).

The operations in a Job are generally divided into Stages in reverse order, starting from the Action. Walking backward, an operation with a narrow dependency stays in the current Stage, while an operation with a wide dependency starts a new Stage that becomes the parent of the current one; the process then continues recursively. A child Stage can run only after all of its parent Stages have finished, so the Stages form a coarse-grained DAG according to their dependencies. Within a Stage, all operations run as a serial pipeline, and a set of Tasks carries out the computation.
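The reverse-order division can be sketched for the simplest case, a linear chain of operations. The function below is a toy, not Spark's actual algorithm: `split_into_stages` and the "narrow"/"wide" labels are assumptions made for illustration, the source operation is labeled "narrow" for simplicity, and a wide operation is placed at the head of the Stage that reads the shuffled data.

```python
def split_into_stages(lineage):
    """Split a linear lineage into stages, walking backward from the Action.

    `lineage` is a list of (operation_name, dependency) pairs in program order,
    where dependency is "narrow" or "wide". Returns the stages in execution
    order (parents first); the last one plays the role of the Result Stage.
    """
    stages = []
    current = []
    for name, dep in reversed(lineage):
        current.append(name)
        if dep == "wide":
            # Shuffle boundary: close this stage; earlier operations
            # will form a parent stage.
            stages.append(list(reversed(current)))
            current = []
    if current:
        stages.append(list(reversed(current)))
    return list(reversed(stages))


lineage = [
    ("textFile", "narrow"),
    ("flatMap", "narrow"),
    ("map", "narrow"),
    ("reduceByKey", "wide"),
    ("mapValues", "narrow"),
    ("repartition", "wide"),
]
stages = split_into_stages(lineage)
# stages == [["textFile", "flatMap", "map"],
#            ["reduceByKey", "mapValues"],
#            ["repartition"]]
```

Two wide dependencies yield three Stages here: two Shuffle Map Stages followed by the Result Stage.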

Task

The unit of work that is actually executed: a computing task that applies the serial operations of one Stage to the RDD. A Stage is made up of multiple Tasks.

Tasks are divided into ShuffleMapTask and ResultTask. Tasks in the last Stage are ResultTasks, and Tasks in all other Stages are ShuffleMapTasks.
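How a Stage breaks down into Tasks can be pictured with a toy model, assuming one Task per partition. `run_task` and `run_stage` are hypothetical helpers written for this sketch, not Spark functions: each Task pushes one partition through the Stage's pipeline of operations, and the Task type depends only on whether the Stage is the final one.

```python
def run_task(partition, pipeline):
    """One Task: apply the Stage's operations, in order, to one partition."""
    records = iter(partition)
    for op in pipeline:
        records = op(records)  # each op wraps the previous iterator (pipelining)
    return list(records)


def run_stage(partitions, pipeline, is_final):
    """Run one Task per partition and label it by the Stage's position."""
    kind = "ResultTask" if is_final else "ShuffleMapTask"
    return [(kind, i, run_task(p, pipeline)) for i, p in enumerate(partitions)]


# A pipeline of two narrow operations: double each record, keep those above 4.
pipeline = [
    lambda it: (x * 2 for x in it),
    lambda it: (x for x in it if x > 4),
]
partitions = [[1, 2, 3], [4, 5], [6]]
results = run_stage(partitions, pipeline, is_final=True)
# results == [("ResultTask", 0, [6]),
#             ("ResultTask", 1, [8, 10]),
#             ("ResultTask", 2, [12])]
```

Because the operations are chained as iterators, each record flows through the whole pipeline without an intermediate collection being built between operations.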

Cluster Manager

An external service that acquires resources on the cluster. The Cluster Manager can be Spark's built-in Standalone manager or a third-party one such as YARN or Mesos.

A Cluster Manager generally uses a master-slave structure. Taking YARN as an example, the node running the ResourceManager service is the master, which is responsible for the unified management and allocation of all computing resources in the cluster. The nodes running the NodeManager service are the slaves, each responsible for creating one or more JVM instances with independent computing capability on its own node. In Spark, these nodes are also called Workers.

Executor

A process launched for an Application on a worker node. It runs Tasks and returns their results to the Driver, and it provides storage for RDDs that need to be cached.

Driver

The process that prepares the running environment of the Spark application. It executes the main method of the user's Application, submits Jobs, converts Jobs into Tasks, and coordinates the scheduling of Tasks across the Executor processes.

Spark has two deployment modes: Client and Cluster. In Client mode the Driver runs on the client node. In Cluster mode the Driver runs on a worker node and, like the Executors, is started by the Cluster Manager.

DAGScheduler

Builds the DAG of Stages for a Job, splitting the Job into multiple Stages, and submits each Stage to the TaskScheduler as a set of Tasks.
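Since a child Stage must wait for all of its parent Stages, the order in which Stages can be submitted follows from the DAG. The sketch below is a simplified, sequential illustration: `submission_order` and the `parents` mapping are hypothetical names, and an actual scheduler may run independent Stages concurrently rather than one after another.

```python
def submission_order(parents, final_stage):
    """Return stages in an order where every parent precedes its child.

    `parents` maps a stage id to the list of stage ids it depends on.
    """
    order = []
    visited = set()

    def visit(stage):
        if stage in visited:
            return
        visited.add(stage)
        for parent in parents.get(stage, []):
            visit(parent)  # parents are submitted first
        order.append(stage)

    visit(final_stage)
    return order


# Stage 3 is the final Stage; it depends on Stages 1 and 2,
# which both depend on Stage 0.
parents = {3: [1, 2], 2: [0], 1: [0]}
order = submission_order(parents, final_stage=3)
# order == [0, 1, 2, 3]
```

The walk starts from the final Stage and recurses into parents, mirroring the reverse-order construction of Stages from the Action.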

TaskScheduler

Takes the Tasks of each submitted Stage and sends them to Executors on the worker nodes to run. It decides which Task each Executor runs.
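The assignment of Tasks to Executors can be pictured with a deliberately simple model. Everything here is an assumption for illustration: `assign_tasks`, the executor ids, and the wave-by-wave policy are invented for this sketch, and it ignores factors such as data locality that a real scheduler takes into account.

```python
from collections import deque


def assign_tasks(task_ids, executor_cores):
    """Assign tasks to executors in waves, at most one task per free core.

    `executor_cores` maps a hypothetical executor id to its number of cores.
    Returns a list of waves; each wave maps task id -> executor id.
    """
    if sum(executor_cores.values()) <= 0:
        raise ValueError("at least one executor core is required")
    pending = deque(task_ids)
    waves = []
    while pending:
        wave = {}
        for executor, cores in executor_cores.items():
            for _ in range(cores):
                if not pending:
                    break
                wave[pending.popleft()] = executor
        waves.append(wave)
    return waves


waves = assign_tasks(range(5), {"exec-1": 2, "exec-2": 1})
# waves == [{0: "exec-1", 1: "exec-1", 2: "exec-2"},
#           {3: "exec-1", 4: "exec-1"}]
```

With three cores in total and five Tasks, the Stage in this model needs two waves to finish, which shows why the number of partitions relative to available cores affects how long a Stage takes.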
