Saturday, 2019-2-16
Basic concepts of HDFS (design ideas, features, working mechanism, upload/download flows, and the namenode metadata storage mechanism)
1. The general design idea of HDFS:
Design goal: improve the efficiency of distributed, concurrent data processing (increase concurrency and move the computation to the data).
Divide and conquer: large files and large numbers of files are distributed across many independent servers, so that massive data can be processed and analyzed in a divide-and-conquer fashion.
Key concepts: file splitting, replica storage, metadata, location lookup, data read/write streams.
2. Shell operation of HDFS // see the corresponding separate document.
3. Some concepts of HDFS
The basic working mechanism and related concepts of the HDFS distributed file system // see drawing
First, it is a file system with a unified namespace, the directory tree; a client accesses a file in HDFS by specifying a path within this directory tree.
Second, it is distributed: its functionality is provided by many servers working together.
The HDFS file system presents clients with a single abstract directory tree, and the files in HDFS are stored as blocks. The block size can be specified by the configuration parameter dfs.blocksize; the default is 128 MB in the hadoop 2.x versions and 64 MB in older versions.
Who actually stores the blocks of the files? They are distributed across the datanode service nodes, and each block can be stored as multiple replicas (the replica count is set by the parameter dfs.replication; the default value is 3).
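As a concrete illustration, here is a minimal Java sketch of a client overriding these two parameters when writing a file; it assumes the Hadoop client libraries and a core-site.xml pointing at your cluster are on the classpath, and the path /demo/a.txt is just an example:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockConfigDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // per-client overrides of the cluster defaults discussed above
            conf.set("dfs.blocksize", "134217728"); // 128 MB, the hadoop 2.x default
            conf.set("dfs.replication", "2");       // override the default replica count of 3
            FileSystem fs = FileSystem.get(conf);
            try (FSDataOutputStream out = fs.create(new Path("/demo/a.txt"))) {
                out.writeBytes("hello hdfs\n");
            }
            fs.close();
        }
    }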
There is one crucially important role in HDFS: the namenode. It maintains the directory tree of the entire HDFS file system, as well as the block information for each path (file): the ids of the blocks and the datanode servers holding them.
HDFS is designed for write-once, read-many scenarios and does not support modifying files in place.
(HDFS is not suitable for network-disk applications: modification is inconvenient, latency is high, and network overhead and cost are large.)
The definition and concept of HDFS slices (input splits)
1: define a slice size: it can be adjusted by parameter; by default it equals the blocksize configured in HDFS, usually 128 MB.
2: get the List of all pending files in the input data directory
3: traverse the file List and slice the files one by one (see the sketch after this list)
for (file : fileList)
cut the file into one slice per 128 MB, starting from offset 0. For example, a.txt (200 MB) is cut into two slices, a.txt: 0-128MB and a.txt: 128MB-200MB,
while b.txt (80 MB) becomes a single slice, b.txt: 0-80MB.
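A minimal, self-contained Java sketch of this slicing rule (an illustrative toy, not the actual MapReduce FileInputFormat code):

    import java.util.ArrayList;
    import java.util.List;

    public class SliceDemo {
        static class Split {
            final String file; final long start; final long length;
            Split(String file, long start, long length) {
                this.file = file; this.start = start; this.length = length;
            }
            public String toString() { return file + ": " + start + "-" + (start + length); }
        }

        // cut a file into slices of splitSize bytes, starting from offset 0;
        // the last slice may be shorter than splitSize
        static List<Split> slice(String file, long fileLen, long splitSize) {
            List<Split> splits = new ArrayList<>();
            for (long off = 0; off < fileLen; off += splitSize) {
                splits.add(new Split(file, off, Math.min(splitSize, fileLen - off)));
            }
            return splits;
        }

        public static void main(String[] args) {
            long MB = 1L << 20;
            System.out.println(slice("a.txt", 200 * MB, 128 * MB)); // two slices: 0-128 MB, 128-200 MB (printed in bytes)
            System.out.println(slice("b.txt", 80 * MB, 128 * MB));  // one slice: 0-80 MB
        }
    }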
HDFS block replication strategy (a toy sketch follows this list)
- the first replica is placed on the node where the client is located; if the client is outside the cluster, a node is picked at random, and the system prefers an idle DataNode
- the second replica is placed on a node on a different rack
- the third replica is placed on a different machine on the same rack as the second replica
- this gives good stability, load balancing, good write bandwidth and read performance, and an even block distribution
- rack awareness: spreading replicas across racks gives the data high fault tolerance
- the node is the unit of replica placement
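The following toy Java sketch illustrates these placement rules. It is purely illustrative (Hadoop's real BlockPlacementPolicyDefault also weighs load, free space and more), assumes at least two racks, and omits the preference for idle nodes:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    public class PlacementSketch {
        static Map<String, String> rackOf = new HashMap<>(); // node -> rack

        static List<String> choose(String client, List<String> nodes, Random rnd) {
            // 1st replica: the client's node if it is a datanode, else a random node
            String first = nodes.contains(client)
                    ? client : nodes.get(rnd.nextInt(nodes.size()));
            // 2nd replica: any node on a different rack than the first
            List<String> offRack = new ArrayList<>();
            for (String n : nodes)
                if (!rackOf.get(n).equals(rackOf.get(first))) offRack.add(n);
            String second = offRack.get(rnd.nextInt(offRack.size()));
            // 3rd replica: a different node on the same rack as the second
            List<String> sameRack = new ArrayList<>();
            for (String n : nodes)
                if (!n.equals(second) && rackOf.get(n).equals(rackOf.get(second)))
                    sameRack.add(n);
            String third = sameRack.get(rnd.nextInt(sameRack.size()));
            return Arrays.asList(first, second, third);
        }

        public static void main(String[] args) {
            rackOf.put("dn1", "rack1"); rackOf.put("dn2", "rack1");
            rackOf.put("dn3", "rack2"); rackOf.put("dn4", "rack2");
            System.out.println(choose("dn1", new ArrayList<>(rackOf.keySet()), new Random()));
        }
    }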
4. Characteristics:
Capacity can be scaled out linearly.
Data is stored with high reliability.
Distributed computation over the stored data is convenient.
Data access latency is high, and data modification is not supported.
It suits write-once, read-many application scenarios.
5. The working mechanism of HDFS
An HDFS cluster has two major roles: NameNode and DataNode.
The NameNode is responsible for managing the metadata of the entire file system.
The DataNodes are responsible for managing the users' file blocks.
6. The working mechanism of namenode
Namenode responsibilities:
1. Respond to client requests // when a client accesses HDFS, it always goes to the namenode first
2. Maintain the directory tree // when a client reads or writes a file, it specifies a path; that path is an HDFS directory entry, managed by the namenode
3. Manage the metadata (query, modify) *
// what is metadata?
A description of a file: how many blocks does the file under a given path have, on which datanodes is each block stored, and how many replicas does the file have? This information is the metadata. It is extremely important and must not be lost or corrupted, otherwise the data cannot be served when a client requests it.
Tip: a complete copy of the metadata is kept in memory, including the directory tree structure and the mapping from files to data blocks and replica locations (a toy model is sketched below).
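To make the shape of this in-memory metadata concrete, here is a toy Java model; the names BlockMeta and fileToBlocks are hypothetical, and the real namenode uses far more elaborate structures:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class MetadataToyModel {
        // one entry per block: the block id plus the datanodes holding its replicas
        static class BlockMeta {
            final long blockId;
            final List<String> datanodes = new ArrayList<>();
            BlockMeta(long id) { blockId = id; }
            public String toString() { return "blk_" + blockId + "@" + datanodes; }
        }

        public static void main(String[] args) {
            // path -> ordered block list, mirroring "file -> blocks -> replica locations"
            Map<String, List<BlockMeta>> fileToBlocks = new HashMap<>();
            BlockMeta b0 = new BlockMeta(1073741825L);
            b0.datanodes.addAll(Arrays.asList("dn1", "dn3", "dn4")); // 3 replicas
            fileToBlocks.put("/demo/a.txt", Arrays.asList(b0));
            System.out.println(fileToBlocks);
        }
    }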
7. The working mechanism of datanode
Datanode responsibilities:
1. Store and manage the users' file block data
2. Report its block information to the namenode periodically (through heartbeat messages)
Exercise: upload a file and observe the physical storage of its blocks in this directory on each datanode machine (a snippet for querying a file's block locations follows):
/home/hadoop/app/hadoop-2.4.1/tmp/dfs/data/current/BP-193442119-192.168.2.120-1432457733977/current/finalized
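Besides inspecting that directory directly, you can ask the namenode which datanodes hold each block of the uploaded file. A minimal sketch using the standard Hadoop Java client API (assumes the cluster configuration is on the classpath; /demo/a.txt is an example path):

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocationsDemo {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus st = fs.getFileStatus(new Path("/demo/a.txt"));
            // one BlockLocation per block: offset, length, and the datanodes holding replicas
            for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
                System.out.println("offset=" + b.getOffset()
                        + " len=" + b.getLength()
                        + " hosts=" + Arrays.toString(b.getHosts()));
            }
            fs.close();
        }
    }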
Monday, 2019-2-18
HDFS write data flow (put)
1. The client communicates with the namenode, requesting to upload a file; the namenode checks whether the target file already exists and whether its parent directory exists.
2. The namenode answers whether the file can be uploaded.
3. The client asks which datanode servers the first block should be transferred to.
4. The namenode returns 3 datanode servers: A, B and C.
5. The client asks one of the three datanodes, A, to upload the data (essentially an RPC call that establishes a pipeline). On receiving the request, A calls B, and B in turn calls C, completing the setup of the real pipeline; the result is then returned to the client step by step.
6. The client starts uploading the first block to A (first reading the data from disk into a local memory cache), packet by packet. When A receives a packet it passes it on to B, and B passes it on to C; every packet sent is placed in a reply queue to wait for acknowledgment.
7. When one block has been transferred completely, the client again asks the namenode for the servers to upload the second block to. (A minimal client-side example follows.)
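In client code the whole handshake above is hidden behind a single call. A minimal sketch with the standard Java client API (local and HDFS paths are illustrative; assumes the cluster configuration is on the classpath):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PutDemo {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // drives steps 1-7 above: namenode checks, pipeline setup, packet streaming
            fs.copyFromLocalFile(new Path("file:///tmp/a.txt"), new Path("/demo/a.txt"));
            fs.close();
        }
    }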
HDFS read data flow (get)
1. The client communicates with the namenode to query the metadata and find the datanode servers holding the file's blocks.
2. It selects one datanode (nearest first, then at random) and requests a socket stream to it.
3. The datanode starts sending the data (reading it from disk into the stream, verified packet by packet with checksums).
4. The client receives the data packet by packet, caches it locally first, and then writes it to the target file. (A minimal client-side example follows.)
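The corresponding read path in client code, again a minimal sketch under the same assumptions:

    import java.io.FileOutputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class GetDemo {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            try (FSDataInputStream in = fs.open(new Path("/demo/a.txt"));
                 FileOutputStream out = new FileOutputStream("/tmp/a.txt")) {
                // steps 1-4 above happen inside: locate blocks, stream packets, verify
                IOUtils.copyBytes(in, out, 4096, false);
            }
            fs.close();
        }
    }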
Summary:
The read and write flows described here are the smooth, error-free paths; exceptions can occur at every one of the stages above. HDFS handles each of these exceptions thoroughly and its fault tolerance is very high. The logic for handling them is quite complex, so we will not go into the details for now; understanding the normal read and write flows is enough.
The mechanism by which the namenode manages metadata in HDFS // the CheckPoint of metadata
As shown in the figure:
How is HDFS metadata stored?
A. A complete copy of the metadata (in a specific data structure) is kept in memory.
B. The disk holds a "quasi-complete" mirror file of the metadata (the fsimage).
C. When a client adds or modifies files in HDFS, the operation is first logged to the edits file; once the client's operation succeeds, the corresponding metadata is updated in memory. Every so often, the secondary namenode downloads all the edits accumulated on the namenode, together with the latest fsimage, to its local disk, loads them into memory and merges them (this process is called a checkpoint).
D. Configuration parameters for the trigger conditions of the checkpoint operation:
dfs.namenode.checkpoint.check.period=60  # how often to check whether the trigger conditions are met, in seconds
dfs.namenode.checkpoint.dir=file://${hadoop.tmp.dir}/dfs/namesecondary  # the secondary namenode's local working directory used during checkpoints
dfs.namenode.checkpoint.edits.dir=${dfs.namenode.checkpoint.dir}
dfs.namenode.checkpoint.max-retries=3  # maximum number of retries
dfs.namenode.checkpoint.period=3600  # interval between two checkpoints, 3600 seconds
dfs.namenode.checkpoint.txns=1000000  # maximum number of operation records between two checkpoints
E. The working directory storage structures of the namenode and the secondary namenode are exactly the same. So when the namenode fails and its metadata must be recovered, the fsimage can be copied from the secondary namenode's working directory into the namenode's working directory.
F. You can view the information in an edits file with an hdfs tool:
bin/hdfs oev -i edits -o edits.xml