2025-04-01 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article walks through the end-to-end process by which HDFS stores a file, viewed at a macro level through its replica and block mechanisms. An HDFS cluster comprises four roles: the Client, the NameNode (NN), the DataNode (DN), and the SecondaryNameNode (SN). The Client initiates requests, the NN manages metadata, the DNs store the data, and the SN assists the NN with metadata housekeeping.
Let's first look at the architecture diagram from the official website.
#Figure 0 -Architecture of HDFS
HDFS places replicas according to the following rules:
1. The client writes the first replica to the nearest DN (the client's own node, if it is also a DN).
2. The second replica is placed on a DN in a different rack.
3. The third replica is placed on a different DN in the same rack as the second; any further replicas are spread across racks as evenly as possible.
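The placement rules above can be sketched as a small selection function. This is a minimal illustration, not Hadoop's actual implementation; the rack and node names are made up for the example.

```python
import random

def choose_replica_nodes(client_rack, racks):
    """Sketch of HDFS's default rack-aware placement for 3 replicas.
    racks: dict mapping rack name -> list of DataNode names."""
    # 1st replica: a node on the client's own (nearest) rack.
    first = random.choice(racks[client_rack])
    # 2nd replica: a node on a different rack.
    remote_rack = random.choice([r for r in racks if r != client_rack])
    second = random.choice(racks[remote_rack])
    # 3rd replica: a different node on the same remote rack as the 2nd.
    third = random.choice([n for n in racks[remote_rack] if n != second])
    return [first, second, third]

racks = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
print(choose_replica_nodes("rack1", racks))
```

Keeping the second and third replicas on one remote rack bounds cross-rack traffic to a single transfer while still surviving the loss of an entire rack.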
Because of the replica and block mechanisms, uploading a file from the local file system to HDFS involves a fairly complex internal process, which the following figure and step-by-step description explain.
#Figure 1-1 -HDFS replica storage mechanism (3 replicas)
A. For a small file that fits in a single block:
1. The client sends a write request to the NN (NameNode).
2. The NN checks whether the file already exists.
3. If not, the NN returns a DN1 (DataNode) path to the client.
4. The client sends the replica to DN1.
5. DN1 streams the replica to DN2 through the pipeline.
6. DN2 streams the replica to DN3 through the pipeline.
7. DN3 notifies DN2 that reception is complete.
8. DN2 notifies DN1 that reception is complete.
9. DN1 notifies the NN that reception is complete.
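The steps above form a chain: data flows down the pipeline DN1 -> DN2 -> DN3, and acknowledgements flow back up in reverse. A minimal sketch of that pattern (the DN names and event strings are illustrative only):

```python
def send_to_pipeline(block, pipeline, events):
    """Forward `block` to the first DataNode; it forwards downstream,
    then acknowledgements return upstream (steps 4-9 above)."""
    if not pipeline:
        return
    head, rest = pipeline[0], pipeline[1:]
    events.append(f"{head} received block")
    send_to_pipeline(block, rest, events)  # forward to the next DN
    if rest:
        # The downstream DN acks back to this one on the return path.
        events.append(f"{rest[0]} -> {head}: ack")

events = []
send_to_pipeline(b"data", ["dn1", "dn2", "dn3"], events)
print(events)
```

Note how the recursion mirrors the protocol: every forward hop unwinds into a matching ack, so the client (caller) only returns once the whole chain has confirmed.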
B. For a large file that must be split into blocks:
The overall flow is the same, except that in step 3 the NN divides the file into blocks and assigns DNs for each block; the client then sends each block to its assigned DN in step 4, and steps 4 through 9 are repeated for every block.
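The block division itself is simple arithmetic over the file size. A sketch, assuming the HDFS default block size of 128 MB (configurable via `dfs.blocksize`); the tuple layout is made up for illustration:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # HDFS default block size (128 MB)

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (block_index, offset, length) tuples covering the file."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)  # last block may be short
        blocks.append((len(blocks), offset, length))
        offset += length
    return blocks

# A 300 MB file needs three blocks: 128 MB + 128 MB + 44 MB.
print(split_into_blocks(300 * 1024 * 1024))
```

The final block only occupies as much DN disk space as its actual length, so a 44 MB tail block does not waste a full 128 MB.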
As the above shows, the NameNode is critical when transferring files into HDFS. The NN manages all metadata; its role is analogous to the file allocation table (FAT) on a physical hard disk. If the NN's data is lost, the data stored on the DNs becomes meaningless.
#Figure 1-2 -NN metadata storage mechanism
1. The client sends a write request to the NN.
2. The NN records the block allocation in the editslog file.
3. The NN responds to the client.
4. The client writes the file to the DNs.
5. The client notifies the NN that the write is complete.
6. The NN applies the editslog entry to the metadata in memory.
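The key ordering in these steps is that the NN logs the allocation durably (step 2) before responding, then updates its in-memory metadata, which is the classic write-ahead-log pattern. A toy model of that ordering (class and field names are hypothetical, not Hadoop's):

```python
class NameNodeSketch:
    """Toy model of the NN write path: log to the edit log first,
    then update the in-memory namespace (steps 1-6 above)."""

    def __init__(self):
        self.editslog = []   # stands in for the on-disk edit log
        self.namespace = {}  # in-memory metadata

    def allocate(self, path, blocks):
        # Step 2: append the allocation to the edit log before acking,
        # so a crash after this point can be replayed from the log.
        self.editslog.append(("OP_ADD", path, blocks))
        # Step 6: apply the edit to the in-memory metadata.
        self.namespace[path] = blocks
        return blocks  # step 3: the response sent back to the client

nn = NameNodeSketch()
nn.allocate("/data/a.txt", ["blk_0", "blk_1"])
```

Because every change reaches the log before memory, restarting the NN can rebuild the in-memory namespace by replaying fsimage plus the editslog.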
Note: the current, complete metadata is kept in memory; the latest edits are recorded in the editslog, while older metadata lives in fsimage. Once the editslog fills up, its entries (the new metadata) are merged into fsimage.
#Figure 1-3 -edits log merge mechanism
When the editslog is full:
1. The NN notifies the SecondaryNameNode (SN) to perform a checkpoint operation.
2. The NN stops writing to the full editslog.
3. The NN creates a new editslog to keep accepting writes.
4. The SN downloads the NN's fsimage and the full editslog.
5. The SN merges them to generate fsimage.checkpoint.
6. The SN uploads fsimage.checkpoint to the NN.
7. The NN renames fsimage.checkpoint to fsimage.
8. The NN deletes the full editslog.
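The merge in step 5 is conceptually just replaying the logged operations onto a copy of fsimage. A minimal sketch, assuming the same hypothetical `OP_ADD`/`OP_DELETE` record shape as above:

```python
def checkpoint(fsimage, editslog):
    """SecondaryNameNode checkpoint sketch: replay the edit log onto a
    copy of fsimage to produce fsimage.checkpoint (steps 4-5 above)."""
    merged = dict(fsimage)  # work on a copy; the NN keeps serving reads
    for op, path, blocks in editslog:
        if op == "OP_ADD":
            merged[path] = blocks
        elif op == "OP_DELETE":
            merged.pop(path, None)
    return merged  # becomes the new fsimage after upload and rename

fsimage = {"/old.txt": ["blk_0"]}
edits = [("OP_ADD", "/new.txt", ["blk_1"]), ("OP_DELETE", "/old.txt", None)]
print(checkpoint(fsimage, edits))
```

Offloading this merge to the SN is the whole point of the checkpoint: the NN never has to pause client writes to compact its own log.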
#Figure 3 -Metadata format: the file's full path, the replication count, the block IDs, and the block-to-DN mapping.
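To make the four fields in the figure concrete, here is one hypothetical metadata record laid out as a plain dictionary (the paths, block IDs, and DN names are invented for the example; the NN's real on-disk encoding is binary):

```python
# One NN metadata entry, following the fields listed above:
# full path, replication factor, block IDs, and block-to-DN mapping.
meta_entry = {
    "path": "/user/logs/app.log",
    "replication": 3,
    "blocks": ["blk_1001", "blk_1002"],
    "block_locations": {
        "blk_1001": ["dn1", "dn2", "dn3"],
        "blk_1002": ["dn2", "dn3", "dn4"],
    },
}
print(meta_entry)
```

Each block's location list has exactly `replication` entries, which is how the NN later answers read requests and detects under-replicated blocks.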
© 2024 shulou.com SLNews company. All rights reserved.