What is Hadoop's DataNode?

2025-01-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains what Hadoop's DataNode is by following its startup sequence and its main work loop.

The class diagram is as follows:

public class DataNode extends Configured
    implements InterDatanodeProtocol, ClientDatanodeProtocol, FSConstants, Runnable

The declaration above shows DataNode's inheritance relationships. DataNode implements two communication interfaces: ClientDatanodeProtocol is used to interact with the Client, and InterDatanodeProtocol is the interface for communication between DataNodes mentioned earlier. ipcServer (lower left of the class diagram) is a member variable of DataNode that starts an IPC service, which lets DataNode provide the capabilities of ClientDatanodeProtocol and InterDatanodeProtocol.

Let's start with the main function. It is simple: it calls createDataNode and then waits for the DataNode thread to finish. createDataNode first calls instantiateDataNode to initialize the DataNode and then executes runDatanodeDaemon. runDatanodeDaemon registers with the NameNode; if registration succeeds, the DataNode thread is started and the DataNode begins to work.
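
The call order described above can be sketched as follows. This is a simplified illustration, not Hadoop's code: the class SketchDataNode and the helper startAndWait are hypothetical, registration is assumed to succeed, and only the method names createDataNode, instantiateDataNode, and runDatanodeDaemon come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DataNode; it only records the order of the startup steps.
class SketchDataNode implements Runnable {
    final List<String> steps = new ArrayList<>();
    Thread worker;

    // Mirrors instantiateDataNode: build the object without starting the worker thread.
    static SketchDataNode instantiateDataNode() {
        SketchDataNode dn = new SketchDataNode();
        dn.steps.add("instantiate");
        return dn;
    }

    // Mirrors runDatanodeDaemon: register first, then start the worker thread.
    static void runDatanodeDaemon(SketchDataNode dn) {
        dn.steps.add("register"); // registration is assumed to succeed in this sketch
        dn.worker = new Thread(dn, "datanode-sketch");
        dn.worker.start();
    }

    static SketchDataNode createDataNode() {
        SketchDataNode dn = instantiateDataNode();
        runDatanodeDaemon(dn);
        return dn;
    }

    @Override
    public void run() {
        steps.add("run"); // the real worker loop would go here
    }

    // What main does: create the DataNode, then wait for its thread to finish.
    static List<String> startAndWait() {
        SketchDataNode dn = createDataNode();
        try {
            dn.worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return dn.steps;
    }
}
```

Calling SketchDataNode.startAndWait() returns the recorded steps in the order in which they ran.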

instantiateDataNode, the method that initializes the DataNode, reads the configuration the DataNode needs along with the configured storage directories (there may be more than one; see the discussion of storage). It passes these two parameters to makeInstance, which first checks each directory (exists, is a directory, readable, writable) and then calls:

new DataNode(conf, dirs)
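
The directory check that precedes this call can be sketched with plain java.io.File tests. The class DataDirCheck and its methods are hypothetical names for illustration, and skipping unusable directories rather than failing outright is an assumption about the behavior, not something the text states.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the four checks named in the text.
class DataDirCheck {
    // True only if the path exists, is a directory, and is readable and writable.
    static boolean isUsable(File dir) {
        return dir.exists() && dir.isDirectory() && dir.canRead() && dir.canWrite();
    }

    // Keeps only the usable directories from the configured list (assumed behavior).
    static List<File> usableDirs(List<String> configured) {
        List<File> ok = new ArrayList<>();
        for (String path : configured) {
            File dir = new File(path);
            if (isUsable(dir)) {
                ok.add(dir);
            }
        }
        return ok;
    }
}
```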

Control then passes to the constructor, which calls startDataNode to complete the DataNode's initialization (note that the DataNode worker thread is not started in this function). The first step initializes a set of configuration parameters: the NameNode address, socket parameters, and so on. Next, it requests configuration information from the NameNode with DatanodeProtocol.versionRequest and checks that the returned NamespaceInfo is consistent with the local version.

In the normal case, the next step checks the state of the file system, performs any necessary recovery, and initializes FSDataset (at this point, the storage and data member variables in the figure above have been initialized).
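
The version check against the returned NamespaceInfo might look roughly like the sketch below. NamespaceInfoSketch is a hypothetical, cut-down stand-in for the real class, and comparing a build version string and a layout version number is an assumption about which fields are involved; the text only says the result must be consistent with the local version.

```java
// Hypothetical sketch of a version consistency check; not Hadoop's actual types.
class VersionCheckSketch {
    // Simplified stand-in for the NamespaceInfo returned by the NameNode.
    static final class NamespaceInfoSketch {
        final String buildVersion;
        final int layoutVersion;

        NamespaceInfoSketch(String buildVersion, int layoutVersion) {
            this.buildVersion = buildVersion;
            this.layoutVersion = layoutVersion;
        }
    }

    // True only when both the build version and the layout version match the local values.
    static boolean isConsistent(NamespaceInfoSketch remote, String localBuild, int localLayout) {
        return localBuild.equals(remote.buildVersion) && localLayout == remote.layoutVersion;
    }
}
```

A DataNode that finds an inconsistency would presumably refuse to continue starting up rather than proceed with mismatched versions.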

Then it finds a port and creates a DataXceiverServer (started in the run method), creates a DataBlockScanner (started in offerService when needed, and only once), creates an HttpServer on the DataNode, and starts ipcServer. This completes the DataNode's initialization.

Before starting the DataNode worker thread, the DataNode must register with the NameNode. The registration information was built during initialization and includes the DataXceiverServer port, the ipcServer port, the file layout version number, and other important information. After registration succeeds, the DataNode thread can be started.

The loop in DataNode's run method has two branches: upgrading (not discussed for now) and normal operation. Let's look at offerService, the method for normal operation. offerService is also a loop, in which it sends regular heartbeats to the NameNode, reports changes in the status of blocks, and reports the status of the blocks the DataNode currently manages. In response to heartbeats and block status reports, the NameNode returns commands for the DataNode to execute.

Heartbeat processing is relatively simple: a heartbeat is sent every heartBeatInterval.

The block status change report uses the information saved in the receivedBlockList and delHints lists. receivedBlockList records blocks that this DataNode has successfully created, and delHints records the node from which the corresponding block can be deleted. For example, replaceBlock in DataXceiver calls:

datanode.notifyNamenodeReceivedBlock(block, sourceID)

This indicates that the DataNode has received the block from sourceID and that the copy of the block on sourceID can be deleted (this scenario arises when the system performs load balancing and blocks are copied between DataNodes).
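
One way to picture the two lists is as parallel lists that are appended together and drained together. The sketch below is an illustration under that assumption: the class ReceivedBlocksSketch and the method drainForReport are hypothetical, and blocks are reduced to plain numeric ids.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: entry i of delHints belongs to entry i of receivedBlockList.
class ReceivedBlocksSketch {
    private final List<Long> receivedBlockList = new ArrayList<>();
    private final List<String> delHints = new ArrayList<>();

    // Records a newly received block together with the node that may delete its copy.
    synchronized void notifyNamenodeReceivedBlock(long blockId, String delHint) {
        receivedBlockList.add(blockId);
        delHints.add(delHint);
    }

    // Pairs up both lists as "blockId:delHint" and clears them,
    // standing in for the point where the report is sent to the NameNode.
    synchronized List<String> drainForReport() {
        List<String> report = new ArrayList<>();
        for (int i = 0; i < receivedBlockList.size(); i++) {
            report.add(receivedBlockList.get(i) + ":" + delHints.get(i));
        }
        receivedBlockList.clear();
        delHints.clear();
        return report;
    }
}
```

Adding to both lists inside one synchronized method keeps the two lists the same length, so an index into one is always valid for the other.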

The block status change report is sent through NameNode.blockReceived.

Block status reports are also relatively simple and are sent every blockReportInterval.
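
The two periodic reports can be sketched as interval checks that a single loop evaluates on each pass. ReportTimerSketch and its methods are hypothetical; only the names heartBeatInterval and blockReportInterval come from the text, and times are passed in as plain millisecond values so the logic is easy to follow.

```java
// Hypothetical sketch of interval-driven reporting inside a loop such as offerService.
class ReportTimerSketch {
    final long heartBeatInterval;   // milliseconds
    final long blockReportInterval; // milliseconds
    long lastHeartbeat;
    long lastBlockReport;

    ReportTimerSketch(long heartBeatInterval, long blockReportInterval, long now) {
        this.heartBeatInterval = heartBeatInterval;
        this.blockReportInterval = blockReportInterval;
        this.lastHeartbeat = now;
        this.lastBlockReport = now;
    }

    // Returns true and resets the timer when a heartbeat is due at time 'now'.
    boolean heartbeatDue(long now) {
        if (now - lastHeartbeat >= heartBeatInterval) {
            lastHeartbeat = now;
            return true;
        }
        return false;
    }

    // Returns true and resets the timer when a block report is due at time 'now'.
    boolean blockReportDue(long now) {
        if (now - lastBlockReport >= blockReportInterval) {
            lastBlockReport = now;
            return true;
        }
        return false;
    }
}
```

A loop would call both methods with the current time on every iteration and send whichever report is due.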

Responses to heartbeats and block status reports can carry commands, which is the only way for the NameNode to initiate a request to a DataNode. Let's look at the commands:

DNA_TRANSFER: copy blocks to other DataNodes

DNA_INVALIDATE: delete blocks (simple)

DNA_SHUTDOWN: shut down the DataNode (simple)

DNA_REGISTER: re-register the DataNode (simple)

DNA_FINALIZE: finalize the upgrade (simple)

DNA_RECOVERBLOCK: recover blocks
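
Handling these commands amounts to a dispatch on the action code. The sketch below uses a hypothetical enum and returns a description instead of doing real work; the DNA_* names come from the list above, while the enum representation and the class CommandDispatchSketch are assumptions made for illustration.

```java
// Hypothetical sketch of dispatching on a command's action code.
class CommandDispatchSketch {
    enum Action {
        DNA_TRANSFER, DNA_INVALIDATE, DNA_SHUTDOWN,
        DNA_REGISTER, DNA_FINALIZE, DNA_RECOVERBLOCK
    }

    // Returns a description of the work each command triggers.
    static String handle(Action action) {
        switch (action) {
            case DNA_TRANSFER:
                return "copy blocks to other DataNodes";
            case DNA_INVALIDATE:
                return "delete blocks";
            case DNA_SHUTDOWN:
                return "shut down the DataNode";
            case DNA_REGISTER:
                return "re-register the DataNode";
            case DNA_FINALIZE:
                return "finalize the upgrade";
            case DNA_RECOVERBLOCK:
                return "recover blocks";
            default:
                throw new IllegalArgumentException("unknown action: " + action);
        }
    }
}
```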

Copying blocks to other DataNodes is performed by the transferBlocks method. Note that the returned command can contain multiple blocks, and each block can have more than one destination address. The transferBlocks method starts a DataTransfer thread for each block to transfer the data.

DataTransfer is an inner class of DataNode that uses the OP_WRITE_BLOCK write-block operation to send data to multiple targets.
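
The one-thread-per-block pattern can be sketched as below. TransferSketch is hypothetical and only records which block would go to which target; it does not model OP_WRITE_BLOCK or how the data actually reaches each destination.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: one thread per block, each covering all of that block's targets.
class TransferSketch {
    // Thread-safe log of "blockId->target" entries written by the transfer threads.
    final List<String> sent = new CopyOnWriteArrayList<>();

    // targets[i] holds the destination addresses for blockIds[i].
    List<Thread> transferBlocks(long[] blockIds, String[][] targets) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < blockIds.length; i++) {
            final long blockId = blockIds[i];
            final String[] blockTargets = targets[i];
            Thread t = new Thread(() -> {
                for (String target : blockTargets) {
                    sent.add(blockId + "->" + target); // stands in for the real transfer
                }
            });
            t.start();
            threads.add(t);
        }
        return threads;
    }

    // Waits for every transfer thread to finish.
    static void joinAll(List<Thread> threads) {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }
}
```

Because the threads run concurrently, the order of entries across different blocks is not fixed; only the order of targets within one block is.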
