
A Full Walkthrough of BlockManager in Spark's Distributed Storage System

Updated 2025-03-01. From: SLTechnology News&Howtos (shulou)


This article explains in detail how BlockManager, the core of Spark's distributed storage system, works.

BlockManager is one of the most important components of Spark and shows up throughout Spark's execution. Only by understanding its principles and mechanisms can you gain a deeper understanding of Spark.

What is BlockManager?

What is the role of BlockManager? As I understand it, it is responsible for storing RDD data and keeping it available for subsequent tasks to use.

The internal module diagram is as follows:

The figure shows a MemoryStore and a DiskStore: a block can be stored either in memory or on disk, and once stored it is managed through the corresponding store.

Storage is organized in units of blocks, so BlockManager keeps a mapping from each block ID to that block's metadata.

It holds a reference used to communicate with the BlockManagerMaster on the driver.

There is also a shuffleClient, which is responsible for backup and download: executors transfer blocks to one another through the shuffleClient.
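To make the module layout concrete, here is a minimal sketch in Python. It is not Spark's code: `BlockStoreSketch`, `BlockInfo`, and the two string levels `"MEMORY"` and `"DISK"` are names assumed for illustration, and plain dictionaries stand in for the real stores.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BlockInfo:
    """Per-block metadata (hypothetical): where the block lives and its size."""
    level: str  # "MEMORY" or "DISK" in this toy model
    size: int


@dataclass
class BlockStoreSketch:
    """Toy stand-in for the modules named above; not Spark's implementation."""
    memory_store: Dict[str, bytes] = field(default_factory=dict)
    disk_store: Dict[str, bytes] = field(default_factory=dict)
    block_info: Dict[str, BlockInfo] = field(default_factory=dict)

    def _store_for(self, level: str) -> Dict[str, bytes]:
        # Pick the store that matches the requested level.
        return self.memory_store if level == "MEMORY" else self.disk_store

    def put(self, block_id: str, data: bytes, level: str) -> None:
        # Write the data to one store and record its metadata in the map.
        self._store_for(level)[block_id] = data
        self.block_info[block_id] = BlockInfo(level, len(data))

    def get(self, block_id: str) -> Optional[bytes]:
        # The metadata map tells us which store to read from.
        info = self.block_info.get(block_id)
        if info is None:
            return None
        return self._store_for(info.level).get(block_id)
```

The point of the sketch is only that every lookup goes through the block-ID map first, and the map decides which store is consulted.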

§The relationship between BlockManager, the driver, and executors

The relationship is shown in the figure:

From the figure you can see:

BlockManagerMaster (BMM) is created on the driver side.

A BlockManager is created in each executor and is responsible for registering itself with the BMM.

Registration messages in Spark are sent through the Akka ActorSystem.
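The registration relationship can be sketched as follows. This is a simplified model, not Spark's API: the class names, the `max_memory` field, and `peers_of` are assumptions made for illustration, and a direct method call stands in for the message that Spark sends through the ActorSystem.

```python
from typing import Dict, List


class ToyBlockManagerMaster:
    """Driver-side registry of executor BlockManagers (hypothetical model)."""

    def __init__(self) -> None:
        # executor id -> memory the executor reported at registration
        self.registered: Dict[str, int] = {}

    def register(self, executor_id: str, max_memory: int) -> None:
        self.registered[executor_id] = max_memory

    def peers_of(self, executor_id: str) -> List[str]:
        # Every registered BlockManager except the one asking.
        return sorted(e for e in self.registered if e != executor_id)


class ToyExecutorBlockManager:
    """Executor-side BlockManager that registers itself on creation."""

    def __init__(self, executor_id: str, max_memory: int,
                 master: ToyBlockManagerMaster) -> None:
        self.executor_id = executor_id
        # In Spark this is a message; here it is a plain call.
        master.register(executor_id, max_memory)
```

Because every BlockManager registers with the single master on the driver, the master is the one place that knows which peers exist, which matters for the download and backup operations described later.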

§The process of storing a block into BlockManager

Two points deserve special attention:

On a put, BlockManager first checks whether a blockInfo is already cached for the blockId. If so, it uses that one directly; otherwise it creates a new blockInfo.

When storing, it first checks whether there is enough memory. If there is, the block is written to the MemoryStore; if not, memory is released first and the put is then attempted again.
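The two points above can be sketched as a small memory-only model. Everything here is assumed for illustration: the class name, the returned status strings, and in particular the oldest-first eviction order, which is a simplification and not a statement of Spark's actual eviction policy.

```python
from collections import OrderedDict


class EvictingBlockManager:
    """Hypothetical memory-only model of the put path; not Spark's code."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        # block id -> data; insertion order doubles as eviction order
        self.memory_store: "OrderedDict[str, bytes]" = OrderedDict()

    def put(self, block_id: str, data: bytes) -> str:
        # Point 1: an entry already cached for this id is reused as-is.
        if block_id in self.memory_store:
            return "already-cached"
        size = len(data)
        if size > self.capacity:
            # No amount of eviction can make this block fit.
            return "too-large"
        # Point 2: if memory is short, release older blocks, then store.
        while self.used + size > self.capacity:
            _, evicted = self.memory_store.popitem(last=False)
            self.used -= len(evicted)
        self.memory_store[block_id] = data
        self.used += size
        return "stored"
```

For example, with a capacity of 10 bytes, storing two 6-byte blocks in a row evicts the first to make room for the second, and a repeated put of the second block leaves the stored data untouched.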

§Deleting blocks from BlockManager

The deletion operation is nothing special: it mainly checks the storage level of the block and removes the block from the corresponding store.
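A minimal sketch of that dispatch, assuming a storage level is modeled as a set of the strings `"MEMORY"` and `"DISK"` and the stores are plain dictionaries (both are assumptions for illustration, as is the function name):

```python
from typing import Dict, Set


def remove_block(block_id: str,
                 block_info: Dict[str, Set[str]],
                 memory_store: Dict[str, bytes],
                 disk_store: Dict[str, bytes]) -> bool:
    """Remove a block from whichever stores its level says it lives in.

    Returns False when the block id is unknown.
    """
    levels = block_info.pop(block_id, None)
    if levels is None:
        return False
    if "MEMORY" in levels:
        memory_store.pop(block_id, None)
    if "DISK" in levels:
        disk_store.pop(block_id, None)
    return True
```

Removing the metadata entry together with the data keeps the map and the stores consistent, so a later lookup of the same ID reports the block as absent.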

§Downloading blocks with shuffleClient

BMMAC stands for BlockManagerMasterActor; it is an abbreviation I made up.

Note: when the blocks to be fetched come from several BlockManagers, the fetch order is shuffled randomly, to prevent several BlockManagers from downloading data from the same BlockManager at the same time.
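The note above can be illustrated with a short sketch. The function name `plan_fetches` and the host and block names are hypothetical; the only technique shown is randomizing the order of fetch requests so that different downloaders do not all start with the same source.

```python
import random
from typing import Dict, List, Tuple


def plan_fetches(blocks_by_host: Dict[str, List[str]],
                 rng: random.Random) -> List[Tuple[str, str]]:
    """Flatten (host, block) requests and shuffle their order."""
    requests = [(host, block)
                for host, blocks in blocks_by_host.items()
                for block in blocks]
    # Without this shuffle every downloader would walk the hosts in the
    # same order and hit the first host simultaneously.
    rng.shuffle(requests)
    return requests


# Usage: each downloader uses its own random generator.
plan = plan_fetches({"bm-1": ["b1", "b2"], "bm-2": ["b3"]}, random.Random())
```

The set of requests is unchanged; only the order differs between downloaders, which spreads the load across the source BlockManagers.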

§Backup operation of shuffleClient

Why would a BlockManager back up its blocks? The author of the book does not explain. My understanding is that it guards against a node crashing or being lost, which would otherwise leave intermediate tasks unable to proceed.

Because the blocks that another BlockManager can accept may be limited, and a backup may involve multiple blocks, each backup picks a random BlockManager from the BlockManagerMaster as the target, to avoid always backing up to the same BlockManager.
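A minimal sketch of random backup-target selection, assuming the list of peers has already been obtained from the BlockManagerMaster. The function name, the `replicas` parameter, and the peer IDs are hypothetical.

```python
import random
from typing import List


def pick_replica_peers(peers: List[str], self_id: str, replicas: int,
                       rng: random.Random) -> List[str]:
    """Choose up to `replicas` distinct random peers, never ourselves."""
    candidates = [p for p in peers if p != self_id]
    # sample() returns distinct elements, so no peer is chosen twice.
    return rng.sample(candidates, min(replicas, len(candidates)))


# Usage: bm-1 picks one random peer to hold a copy of its block.
targets = pick_replica_peers(["bm-1", "bm-2", "bm-3"], "bm-1", 1, random.Random())
```

Choosing targets at random on every backup spreads replicas across the cluster instead of concentrating them on one BlockManager.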

