Four mechanisms: (1) heartbeat mechanism:
Introduction: HDFS is a master-slave architecture, so for the nameNode to know in real time whether each dataNode is alive, a heartbeat mechanism is needed. While HDFS is running, every dataNode periodically sends a heartbeat report to the nameNode to tell it its status.
Heartbeat content:
- Report its own liveness; the count information the nameNode maintains for it is updated after each report
- Report the list of blocks stored on the dataNode to the nameNode
Heartbeat report interval:
dfs.heartbeat.interval = 3 (seconds)
Criterion for the nameNode to declare a dataNode dead: 10 consecutive heartbeats missed plus 2 check intervals.
Check interval: when the nameNode stops receiving heartbeats from a dataNode, it actively sends checks to that dataNode.
dfs.namenode.heartbeat.recheck-interval = 300000 (milliseconds)
Calculation: timeout = 2 × dfs.namenode.heartbeat.recheck-interval + 10 × dfs.heartbeat.interval = 2 × 300 s + 10 × 3 s = 630 s = 10.5 min
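As a small illustration, here is a minimal sketch (assuming the Hadoop client libraries and an hdfs-site.xml on the classpath) that reads the same two properties and reproduces the 10.5-minute timeout:

import org.apache.hadoop.conf.Configuration;

public class HeartbeatTimeout {
    public static void main(String[] args) {
        Configuration conf = new Configuration(); // picks up hdfs-site.xml if it is on the classpath
        // fall back to the defaults quoted above if the properties are not set
        long recheckMs = conf.getLong("dfs.namenode.heartbeat.recheck-interval", 300000L); // milliseconds
        long heartbeatSec = conf.getLong("dfs.heartbeat.interval", 3L);                    // seconds
        long timeoutSec = 2 * (recheckMs / 1000) + 10 * heartbeatSec;
        System.out.println("dataNode declared dead after " + timeoutSec + " s"); // 630 s = 10.5 min
    }
}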
(2) Safe mode mechanism:
Introduction: when HDFS starts it first enters safe mode, and it exits safe mode once the required conditions are met. In safe mode, no operation that modifies metadata can be performed.
HDFS metadata (three parts):
- The abstract directory tree
- The mapping between files and blocks (how many blocks each file is split into)
- The location information of each block
Where HDFS metadata is stored:
- Memory: a complete copy of the metadata (abstract directory tree, file-to-block mapping, block locations)
- Disk: only the abstract directory tree and the file-to-block mapping
Note: the block location information in memory is obtained from the heartbeats that dataNodes report to the nameNode. The metadata in memory would disappear if the machine went down, so it has to be persisted to disk; the on-disk metadata, however, contains only the abstract directory tree and the file-to-block mapping, with no block location information.
What nameNode does at startup:
Cluster startup order: nameNode -> dataNode -> secondaryNameNode
The nameNode loads the metadata from disk into memory (when the cluster is started for the first time it generates an fsimage file locally instead), then receives the heartbeats reported by the dataNodes and loads the block location information from those reports into memory. During this time HDFS is in safe mode.
Conditions for exiting safe mode:
- If dfs.namenode.safemode.min.datanodes (the number of dataNodes that must be up when the cluster starts) is 0 and the minimum number of replicas per block, dfs.namenode.replication.min, is 1, safe mode is exited as soon as the cluster reaches the minimum replica count and enough dataNodes are running
- When the required number of live dataNodes is 0 and the fraction of blocks that meet the minimum replication reaches 0.999f, the cluster exits safe mode (the replica count meets the requirement)
dfs.namenode.safemode.threshold-pct = 0.999f
Manually entering or leaving safe mode:
hdfs dfsadmin -safemode enter (enter safe mode)
hdfs dfsadmin -safemode leave (leave safe mode)
hdfs dfsadmin -safemode get (query the current state)
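The same three actions are also available programmatically; a minimal sketch, assuming fs.defaultFS points at an HDFS nameNode and the caller has the required privileges for enter/leave:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;

public class SafeModeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // only valid when the default filesystem is an hdfs:// URI
        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
        boolean inSafeMode = dfs.setSafeMode(SafeModeAction.SAFEMODE_GET); // like: hdfs dfsadmin -safemode get
        System.out.println("in safe mode: " + inSafeMode);
        // dfs.setSafeMode(SafeModeAction.SAFEMODE_ENTER); // like: hdfs dfsadmin -safemode enter
        // dfs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE); // like: hdfs dfsadmin -safemode leave
        dfs.close();
    }
}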
(3) Rack awareness (replica placement policy):
The first replica is placed on any node of the rack closest to the client; if the client itself runs on a dataNode, the replica is stored on that local machine. The second replica is placed on any node of a rack different from the first replica's rack. The third replica is placed on a different node in the same rack as the second replica.
How to change the replication factor:
1. Modify the configuration file: dfs.replication = 1
2. Set it with a command: hadoop fs -setrep -R 2 /dir
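Both approaches also have a client-API equivalent; a hedged sketch where the path /demo/a.txt and the factor 2 are just example values:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.replication", "2"); // configuration-file style: applies to files this client creates
        FileSystem fs = FileSystem.get(conf);
        // command style: change the replication factor of an existing file
        boolean changed = fs.setReplication(new Path("/demo/a.txt"), (short) 2); // like: hadoop fs -setrep 2 /demo/a.txt
        System.out.println("replication changed: " + changed);
        fs.close();
    }
}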
(4) load balancing:
HDFS load balancing: the amount of data stored on each dataNode should match its hardware, i.e. the disk usage ratios should be roughly equal.
How to adjust load balancing manually:
- Bandwidth the cluster allows the balancer to use (default 1 MB/s):
dfs.datanode.balance.bandwidthPerSec = 1048576 (1 MB)
- Tell the cluster to rebalance: start-balancer.sh -t 10%, where 10% is the allowed difference between the highest and the lowest dataNode disk usage. Even when the difference exceeds 10% the cluster does not rebalance immediately; it does so when the cluster is not busy.
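For reference, the balancer bandwidth can also be raised at runtime from a client (the CLI equivalent is hdfs dfsadmin -setBalancerBandwidth); a sketch assuming the default filesystem is HDFS, with 10 MB/s as an example value:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class BalancerBandwidthDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        long configured = conf.getLong("dfs.datanode.balance.bandwidthPerSec", 1048576L);
        System.out.println("configured balancer bandwidth: " + configured + " bytes/s");
        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
        dfs.setBalancerBandwidth(10L * 1048576L); // temporarily allow 10 MB/s for the next balancer run
        dfs.close();
    }
}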
Two cores: (1) File upload:
The client uses the HDFS client library to send an RPC request to the remote nameNode. The nameNode checks whether the file to be created already exists and whether the creator has permission; if the checks pass it creates a record for the file, otherwise it throws an exception back to the client.
When the client starts writing, it splits the file into packets, manages them internally in a "data queue", and asks the nameNode for blocks, obtaining a list of suitable dataNodes to hold the replicas; the size of that list is determined by the replication setting on the nameNode.
After getting the block list, the client writes the packets to all replicas through a pipeline: it streams each packet to the first dataNode, which stores it and forwards it to the next dataNode in the pipeline, and so on until the last dataNode. When the last dataNode has stored the packet successfully, an ack packet is passed back through the pipeline to the client. The client library maintains an "ack queue"; when the ack returned by the dataNodes is received successfully, the corresponding packet is removed from the "data queue".
If a dataNode fails during the transfer, the current pipeline is closed, the failed node is removed from it, the remaining blocks continue to be transferred through the remaining dataNodes in pipeline fashion, and the nameNode allocates a new dataNode so that the replica count set by replication is maintained.
After the client has written all the data it calls close() on the stream. As soon as dfs.replication.min replicas (the minimum number of replicas for a successful write, default 1) have been written, the write is reported as successful; the block can then be replicated asynchronously in the cluster until it reaches its target replica count (dfs.replication, default 3), because the nameNode already knows which blocks the file consists of and only has to wait for the blocks to reach the minimum replication before returning success.
Finally, when the upload has succeeded, the nameNode syncs the operation it pre-wrote to the log (edits) into the metadata in memory.
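For reference, a minimal client-side write sketch; the nameNode URI hdfs://localhost:9000 and the path /demo/a.txt are placeholder assumptions, and all of the pipeline handling described above happens inside the client library:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUpload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // placeholder nameNode address; use the cluster's fs.defaultFS in practice
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
        // create() triggers the RPC to the nameNode described above
        try (FSDataOutputStream out = fs.create(new Path("/demo/a.txt"))) {
            out.writeBytes("hello hdfs\n"); // data is split into packets and pushed down the pipeline
        } // close() returns once dfs.replication.min replicas have been written
        fs.close();
    }
}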
(2) File download:
The client sends a download request to the nameNode, e.g. hadoop fs -get /a.txt. The nameNode performs a series of checks (permissions, whether the file exists, and so on) and then sends the block location information, some or all of it, to the client. The client works out the nearest dataNode for each block, establishes a connection and downloads the file. The client runs a CRC check on every block it downloads; if a block fails, the client reports it to the nameNode and copies that block from another dataNode, and the nameNode records the possibly faulty dataNode and tries to avoid it on subsequent uploads and downloads. When all blocks have been downloaded successfully, the client reports success to the nameNode.
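Similarly, a minimal read sketch with the same placeholder URI; block-location lookup and per-block CRC verification happen inside the client library:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsDownload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf); // placeholder URI
        // open() asks the nameNode for block locations; reads go to the nearest dataNode
        try (FSDataInputStream in = fs.open(new Path("/demo/a.txt"))) {
            IOUtils.copyBytes(in, System.out, 4096, false); // CRC checks run per block as data is read
        }
        fs.close();
    }
}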
Supplement: merging of metadata. This is how the secondaryNameNode merges metadata in a fully distributed cluster:
Timing of the merge:
A: How often a merge happens: dfs.namenode.checkpoint.period = 3600 (seconds)
B: How many operation-log records accumulate before a merge: dfs.namenode.checkpoint.txns = 1000000
Merge procedure:
- When the cluster starts, the fsimage file is loaded into memory; on the very first start of the cluster, or on a normal restart, the nameNode merges an fsimage on disk.
- The secondaryNameNode periodically (every 1 minute) asks the nameNode whether a merge is needed.
- When the nameNode needs a metadata merge, the secondaryNameNode sends a merge request to the nameNode.
- The nameNode rolls the current edits_inprogress_000095 according to seen_txid and generates a new empty edits_inprogress_000096 to continue recording the operation log.
- The secondaryNameNode pulls the rolled edits files and the latest fsimage to its local disk.
- The secondaryNameNode merges the edits with the latest fsimage, applying the edits to the fsimage in memory.
- The secondaryNameNode pushes the merged fsimage back to the nameNode and keeps a copy locally.
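As a small illustration, a sketch (assuming the Hadoop client libraries and an hdfs-site.xml on the classpath) that reads the two checkpoint triggers described above:

import org.apache.hadoop.conf.Configuration;

public class CheckpointSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration(); // picks up hdfs-site.xml if present
        long periodSec = conf.getLong("dfs.namenode.checkpoint.period", 3600L); // seconds between checkpoints
        long txns = conf.getLong("dfs.namenode.checkpoint.txns", 1000000L);     // edits count that forces a checkpoint
        System.out.println("checkpoint every " + periodSec + " s or every " + txns + " transactions");
    }
}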