2025-01-21 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
In this article, the editor shares the design features of HDFS. Many readers may not know much about them, so this overview is offered for your reference; I hope you learn a lot from reading it.
The design features of HDFS are:
1. Large files: HDFS is designed to store terabyte-scale files and large collections of big-data files; it offers little benefit for files of only a few gigabytes or smaller.
2. Block storage: HDFS splits a large file into blocks and distributes them evenly across different machines. When reading a file, blocks can be fetched from several hosts at the same time, which is far more efficient than reading from a single host.
3. Streaming data access: write once, read many times. Unlike traditional file systems, HDFS does not support changing a file's contents in place; once written, a file may only be modified by appending data at its end.
4. Commodity hardware: HDFS runs on ordinary PCs, which lets companies support a big-data cluster with dozens of inexpensive machines.
5. Hardware failure: HDFS assumes any machine can fail. To keep a block readable when its host goes down, HDFS places copies of the same block on several other hosts; if one host fails, a replica can quickly be found elsewhere.
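To make points 2 and 5 concrete, here is a minimal, self-contained sketch in plain Java (not the real HDFS code; all names and the round-robin placement policy are illustrative) of how a file might be split into fixed-size blocks and each block replicated on several hosts:

```java
import java.util.*;

public class BlockPlacementSketch {
    static final long BLOCK_SIZE = 64L * 1024 * 1024; // 64 MB, as in classic HDFS
    static final int REPLICATION = 3;                  // replicas per block

    // How many blocks does a file of this size occupy? (round up)
    static int numBlocks(long fileSize) {
        return (int) ((fileSize + BLOCK_SIZE - 1) / BLOCK_SIZE);
    }

    // Assign each block to REPLICATION distinct hosts, round-robin.
    static Map<Integer, List<String>> placeBlocks(long fileSize, List<String> hosts) {
        Map<Integer, List<String>> placement = new HashMap<>();
        int n = numBlocks(fileSize);
        for (int b = 0; b < n; b++) {
            List<String> replicas = new ArrayList<>();
            for (int r = 0; r < REPLICATION; r++) {
                replicas.add(hosts.get((b + r) % hosts.size()));
            }
            placement.put(b, replicas);
        }
        return placement;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("node1", "node2", "node3", "node4");
        long fileSize = 200L * 1024 * 1024; // a 200 MB file -> 4 blocks
        Map<Integer, List<String>> p = placeBlocks(fileSize, hosts);
        System.out.println(numBlocks(fileSize)); // 4
        System.out.println(p.get(0));            // [node1, node2, node3]
    }
}
```

Because every block lives on three hosts, losing any single host still leaves two readable copies of each of its blocks, and different blocks of the same file can be read from different hosts in parallel.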
Key elements of HDFS:
Block: the unit into which a file is split, typically 64 MB in classic HDFS (later Hadoop versions default to 128 MB).
NameNode: stores the directory, file, and block metadata for the entire file system, held on a single dedicated host. If that host fails, the NameNode becomes unavailable; Hadoop 2.* supports an active-standby mode, in which a standby host takes over the NameNode role when the primary fails.
DataNode: runs on inexpensive machines and stores the actual Block files.
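A toy model of the NameNode/DataNode split, in plain Java (illustrative names only, not the real Hadoop API): the NameNode keeps only metadata, namely which blocks make up each file and which DataNodes hold each block, while the DataNodes hold the bytes themselves.

```java
import java.util.*;

public class NameNodeSketch {
    // file name -> ordered list of block IDs (metadata only, no file bytes)
    private final Map<String, List<Long>> fileToBlocks = new HashMap<>();
    // block ID -> DataNodes that hold a replica of that block
    private final Map<Long, List<String>> blockLocations = new HashMap<>();
    private long nextBlockId = 0;

    // Register a file made of `blocks` blocks, replicated on the given DataNodes.
    public void addFile(String name, int blocks, List<String> dataNodes) {
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < blocks; i++) {
            long id = nextBlockId++;
            ids.add(id);
            blockLocations.put(id, new ArrayList<>(dataNodes));
        }
        fileToBlocks.put(name, ids);
    }

    // For each block of the file, list the DataNodes it can be read from.
    public List<List<String>> locate(String name) {
        List<List<String>> result = new ArrayList<>();
        for (long id : fileToBlocks.get(name)) {
            result.add(blockLocations.get(id));
        }
        return result;
    }

    public static void main(String[] args) {
        NameNodeSketch nn = new NameNodeSketch();
        nn.addFile("/logs/a.log", 2, Arrays.asList("dn1", "dn2", "dn3"));
        System.out.println(nn.locate("/logs/a.log").size()); // 2 blocks
    }
}
```

A client first asks the NameNode where the blocks are, then fetches the bytes directly from the DataNodes; this is why losing the single NameNode host makes the whole file system unusable even though all the data still sits on the DataNodes.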
MapReduce
Put simply, MapReduce is a programming model for extracting and analyzing elements from massive source data and returning a result set. Storing files in a distributed way across hard disks is the first step; extracting and analyzing what we need from that massive data is what MapReduce does.
Take computing the maximum over a massive data set as an example: a bank has hundreds of millions of depositors and wants to find the largest deposit amount. With a traditional single-machine computation, we would write something like:
long[] moneys = ...;  // deposit amounts for all depositors
long max = 0L;
for (int i = 0; i < moneys.length; i++) {
    if (moneys[i] > max) {
        max = moneys[i];
    }
}
If the array is small, this implementation works fine, but it breaks down in the face of massive amounts of data that a single machine cannot hold or scan in reasonable time.
MapReduce does it differently: the numbers are first stored in different blocks; each Map takes several blocks and computes the maximum within them; a Reduce operation is then applied to the per-Map maxima, and the maximum that Reduce produces is returned to the user.
The basic principle of MapReduce is to split a large data analysis into small pieces, analyze each piece separately, then summarize the extracted results to obtain the answer we want. Of course, how to partition the analysis and how to perform the Reduce operation can be quite complex; Hadoop provides the data-analysis framework, so we only need to write simple requirement code to obtain the data we want.
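The bank-deposit example above can be sketched in plain Java (no Hadoop required; this only simulates the idea): each "map" computes the maximum of its own block of numbers, and a single "reduce" takes the maximum of those partial maxima.

```java
import java.util.*;

public class MaxMapReduceSketch {
    // Map phase: compute the local maximum of one block of deposits.
    static long mapMax(long[] block) {
        long max = Long.MIN_VALUE;
        for (long m : block) {
            if (m > max) max = m;
        }
        return max;
    }

    // Reduce phase: combine the per-block maxima into the global maximum.
    static long reduceMax(List<Long> partialMaxima) {
        long max = Long.MIN_VALUE;
        for (long m : partialMaxima) {
            if (m > max) max = m;
        }
        return max;
    }

    static long globalMax(List<long[]> blocks) {
        List<Long> partial = new ArrayList<>();
        for (long[] b : blocks) {
            partial.add(mapMax(b)); // in real MapReduce, each map runs on a different host
        }
        return reduceMax(partial);
    }

    public static void main(String[] args) {
        List<long[]> blocks = Arrays.asList(
            new long[]{100, 2500, 47},
            new long[]{9000, 12},
            new long[]{300, 8800}
        );
        System.out.println(globalMax(blocks)); // 9000
    }
}
```

The key point is that the maps never need to see each other's data, so they can run on the hosts that already store each block; only the tiny per-block maxima travel to the reducer.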
That is the full content of the article "What are the design features of HDFS?". Thank you for reading! I hope the content shared here has been helpful to you.
© 2024 shulou.com SLNews company. All rights reserved.