
Example Analysis of Hadoop File Reading

2025-02-27 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article walks through an example analysis of how a file is read in Hadoop (HDFS). The explanation is meant to be clear and easy to follow, and we hope it answers your questions on the topic.

The client opens the file it wants to read by calling the open() method on the FileSystem object, which for HDFS is an instance of DistributedFileSystem (step 1). DistributedFileSystem calls the namenode over RPC to determine the locations of the first few blocks of the file (step 2). For each block, the namenode returns the addresses of the datanodes that hold a copy of that block, sorted by their distance from the client. If the client is itself a datanode (for example, in a MapReduce task) and holds a copy of the block, it reads the data from the local datanode.

DistributedFileSystem returns an FSDataInputStream object (an input stream that supports file seeks) to the client, which reads data from it. FSDataInputStream in turn wraps a DFSInputStream object, which manages the I/O with the datanodes and the namenode.

The client then calls read() on the input stream (step 3). DFSInputStream, which has stored the datanode addresses for the first few blocks of the file, connects to the nearest datanode that holds the first block. Data is streamed from the datanode to the client as the client calls read() repeatedly on the stream (step 4). When the end of the block is reached, DFSInputStream closes the connection to that datanode and finds the best datanode for the next block (step 5). All of this is transparent to the client, which simply sees one continuous stream.

Blocks are read in order, with DFSInputStream opening new connections to datanodes as the client reads through the stream. When needed, it also asks the namenode for the datanode locations of the next batch of blocks. Once the client has finished reading, it calls close() on the FSDataInputStream (step 6).

If DFSInputStream encounters an error while communicating with a datanode during a read, it tries the next nearest datanode that holds that block. It also remembers the failed datanode so that it does not needlessly retry it for later blocks. DFSInputStream also verifies the integrity of the data it receives from a datanode by checking checksums. If it finds a corrupted block, it reports the block to the namenode before trying to read a copy from another datanode.

A key point of this design is that the namenode tells the client the best datanode for each block, and the client then contacts that datanode directly to retrieve the data. Because the data traffic is spread across all the datanodes in the cluster, this design lets HDFS scale to a large number of concurrent clients. Meanwhile, the namenode only has to serve block location requests (this information is held in memory, so it is very efficient) and never serves the data itself; otherwise it would quickly become a bottleneck as the number of clients grew.
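
The failure handling described above can be illustrated with a small, self-contained Java sketch. It is a toy in-memory model and not Hadoop code: the class BlockReadSketch, the Replica type, and the fields failedNodes and corruptReports are all hypothetical names invented for this illustration. The sketch also assumes that each replica carries a single CRC32 value and that the replica list arrives already sorted nearest-first. It shows only the order of decisions: skip datanodes that failed earlier, remember new failures, report a checksum mismatch, and then move on to the next replica.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.CRC32;

// Toy, in-memory model of the client-side read loop.
// Every name here is hypothetical; none of these are Hadoop classes.
final class BlockReadSketch {

    // One copy of a block held by a simulated datanode.
    static final class Replica {
        final String datanode;   // address of the datanode holding this copy
        final byte[] data;       // bytes the datanode would send
        final long storedCrc;    // checksum recorded when the block was written
        final boolean reachable; // false simulates a connection failure

        Replica(String datanode, byte[] data, long storedCrc, boolean reachable) {
            this.datanode = datanode;
            this.data = data;
            this.storedCrc = storedCrc;
            this.reachable = reachable;
        }
    }

    // Datanodes that failed earlier; they are not retried for later blocks.
    final Set<String> failedNodes = new HashSet<>();
    // Datanodes whose replica failed verification (stand-in for the namenode report).
    final List<String> corruptReports = new ArrayList<>();

    static long crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    // Reads one block; the replica list is assumed to be sorted nearest-first.
    byte[] readBlock(List<Replica> replicasNearestFirst) {
        for (Replica r : replicasNearestFirst) {
            if (failedNodes.contains(r.datanode)) {
                continue; // known-bad datanode, do not retry it
            }
            if (!r.reachable) {
                failedNodes.add(r.datanode); // remember the failure
                continue;
            }
            if (crc(r.data) != r.storedCrc) {
                corruptReports.add(r.datanode); // report before trying another copy
                continue;
            }
            return r.data;
        }
        throw new IllegalStateException("no usable replica for block");
    }

    // Reads the blocks in order and returns them as one continuous byte array.
    byte[] readFile(List<List<Replica>> blocks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (List<Replica> block : blocks) {
            byte[] bytes = readBlock(block);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }
}
```

A caller would build one replica list per block, pass the lists to readFile in file order, and then inspect failedNodes and corruptReports to see which datanodes were skipped or reported. Real HDFS keeps checksums per chunk of data rather than one value per block, so this sketch should be read only as an outline of the control flow.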

That concludes "Example Analysis of Hadoop File Reading". Thank you for reading! We hope it has given you a working understanding of the topic. To learn more, follow the industry information channel.




© 2024 shulou.com SLNews company. All rights reserved.
