What are the advantages of using 64 MB blocks in Hadoop

2025-01-28 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31

This article explains the advantages of using 64 MB blocks in Hadoop. Many readers may not be familiar with the topic, so it is shared here for reference. Hopefully you will find it useful.

Reduce hard disk seek time: HDFS is designed for streaming access to large volumes of data, so even ordinary read and write operations involve a relatively large amount of data. If the block size is set too small, more blocks have to be read for the same amount of data. Because blocks are not stored contiguously on the hard disk, a conventional hard disk has to move its head between them, and this random access is slow. The more blocks are read, the more total time is spent seeking. Once the seek time exceeds the I/O (transfer) time, seeking becomes the bottleneck of the system. An appropriate block size reduces seek time and improves system throughput.

Reduce NameNode memory consumption: HDFS has only one NameNode, and its memory is very limited compared with the storage capacity of the DataNodes. The NameNode nevertheless has to keep the information about the blocks stored on the DataNodes in memory, together with the file system image loaded from the FSImage file. If the block size is set too small, there is too much block information to maintain, and the NameNode's memory may not be able to hold it.

Why not make blocks much larger than 64 MB (or 128 MB or 256 MB)?

This is discussed mainly from the perspective of the MapReduce framework that runs on top of HDFS.

Map crash problem: when a map task crashes, the system has to restart it, and the restart has to reload the data. The larger the block, the longer the data takes to load and the longer the recovery takes.

Monitoring time: the master node monitors the other nodes, and each node periodically reports its completed work and status updates. If a node stays silent for longer than a preset time interval, the master node marks it as dead and reassigns its data to other nodes. This "preset time interval" is roughly estimated from the block size. For a 64 MB block, I can assume that a node will finish within 10 minutes in any case, so a node that has not responded for more than 10 minutes is considered dead. But how long should the estimate be for 640 MB, or for more than 1 GB of data? If the estimate is too short, healthy nodes are wrongly judged dead, and in the worst case nodes are declared dead over and over again. If the estimate is too long, the system waits too long before reacting to a real failure. For large blocks, therefore, this "preset time interval" is hard to estimate.
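The trade-off described so far can be put in rough numbers. The sketch below is a back-of-the-envelope model, not a measurement: the 10 ms seek time, the 100 MB/s sequential transfer rate, the 150 bytes of NameNode memory per block and the 1 PB data set are all assumed values, and block_size_tradeoff is a hypothetical helper introduced only for illustration. For a given block size it estimates the share of read time lost to seeking, the memory needed for block metadata, and the time needed to re-read one block after a failed map task.

```python
SEEK_MS = 10.0               # assumed average seek time per block read
TRANSFER_MB_PER_S = 100.0    # assumed sequential disk throughput
BYTES_PER_BLOCK_META = 150   # assumed NameNode memory per block (rule of thumb)
DATASET_MB = 1024 ** 3       # assumed data set size: 1 PB, expressed in MB


def block_size_tradeoff(block_mb):
    """Return rough cost estimates for one block size (hypothetical model)."""
    seek_s = SEEK_MS / 1000.0
    transfer_s = block_mb / TRANSFER_MB_PER_S
    num_blocks = DATASET_MB // block_mb
    return {
        "block_mb": block_mb,
        # fraction of the time per block that is spent seeking
        "seek_share": seek_s / (seek_s + transfer_s),
        # memory the NameNode would need for block metadata, in MB
        "namenode_mb": num_blocks * BYTES_PER_BLOCK_META / 1024 ** 2,
        # time to reload one block when a map task is restarted
        "reread_s": transfer_s,
    }


if __name__ == "__main__":
    for size in (1, 64, 128, 640, 1024):
        r = block_size_tradeoff(size)
        print(
            f"{r['block_mb']:>5} MB  "
            f"seek {r['seek_share']:7.2%}  "
            f"NameNode {r['namenode_mb']:10.1f} MB  "
            f"re-read {r['reread_s']:6.2f} s"
        )
```

Under these assumptions, very small blocks are dominated by seek time and metadata, while very large blocks mainly increase the time needed to reload a block, which is the quantity the "preset time interval" has to account for.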

Problem decomposition: the cost of solving a problem grows with the amount of data. For the same algorithm, the more data is processed, the longer it takes to run.

Constrain Map output:

In the MapReduce framework, the output of the Map phase is sorted before the Reduce phase runs. Recall how merge sort works: small files are sorted first and then merged into larger sorted files. Seen this way, it is clear why keeping the input of each Map task reasonably small helps.
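The merge-sort idea mentioned above can be sketched as follows: sort small chunks independently, then merge the sorted runs into one output. This is a plain in-memory illustration, not Hadoop's actual shuffle implementation; sort_in_chunks and chunk_size are hypothetical names.

```python
import heapq


def sort_in_chunks(records, chunk_size):
    """Sort small chunks independently, then merge the sorted runs."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step 1: sort each small chunk on its own (cheap, fits in memory).
    runs = [
        sorted(records[i:i + chunk_size])
        for i in range(0, len(records), chunk_size)
    ]
    # Step 2: merge the already sorted runs into one sorted sequence.
    return list(heapq.merge(*runs))


if __name__ == "__main__":
    print(sort_in_chunks([5, 3, 8, 1, 9, 2, 7], chunk_size=3))
```

Each chunk plays the role of one Map task's output: the smaller the chunk, the cheaper the individual sort, while the final merge only has to combine runs that are already ordered.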

That is all for this article on the advantages of using 64 MB blocks in Hadoop. Thank you for reading. Hopefully it has given you a clearer understanding of the topic.
