2025-03-31 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/03 Report--
For background, see the earlier post:
Why HDFS is not good at storing a large number of small files
https://blog.51cto.com/12445535/2354951
Overview: NameNode replication
The default block replication factor (dfs.replication) is 3. Replication affects disk space but not memory consumption: it changes the amount of storage required for each block, but not the number of blocks. If a block file on a DataNode, represented by one block on the NameNode, is replicated three times, the number of block files triples, but not the number of blocks that represent them.
With replication turned off, a 192 MB file takes up 192 MB of disk space and approximately 450 bytes of memory.
// (calculated as: 192 MB splits into a 128 MB block and a 64 MB block, so one file inode + 2 blocks = 3 namespace objects, and 3 objects * ~150 bytes ≈ 450 bytes of memory)
If you have a million such files, or 192 TB of data, you need 192 TB of disk space and, ignoring RPC workload, 450 MB of memory: (1 million inodes + 2 million blocks) * 150 bytes. With default replication enabled, you need 576 TB of disk space (192 TB * 3), but memory usage stays the same at 450 MB. Once you account for bookkeeping and RPC, and follow the recommendation of 1 GB of heap memory per million blocks, a safer estimate for this scenario is 2 GB of memory (with or without replication).
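The arithmetic above can be sketched in a few lines. This is a rough illustration only, using the ~150 bytes per namespace object rule of thumb quoted throughout this article; the constant names are ours, not part of HDFS:

```python
# Rough NameNode heap estimate for 1 million 192 MB files.
BLOCK_SIZE_MB = 128
BYTES_PER_OBJECT = 150          # approximate cost of one inode or one block

files = 1_000_000
file_size_mb = 192
blocks_per_file = -(-file_size_mb // BLOCK_SIZE_MB)   # ceil: 192 MB -> 2 blocks

inodes = files                              # 1,000,000 file inodes
blocks = files * blocks_per_file            # 2,000,000 blocks (replicas do not
                                            # add namespace blocks on the NameNode)
heap_bytes = (inodes + blocks) * BYTES_PER_OBJECT
print(heap_bytes / 1_000_000, "MB")         # -> 450.0 MB

disk_no_replication_tb = files * file_size_mb / 1_000_000   # 192 TB
disk_with_replication_tb = disk_no_replication_tb * 3       # 576 TB
```

Note that the heap total is unchanged by replication, while the disk total triples, which is the article's central point.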
Examples
Example 1: estimated NameNode heap memory used
Alice, Bob, and Carl each have 1 GB (1024 MB) of data on disk, but sliced into files of different sizes. Alice's and Bob's files are a multiple of the block size and require the least memory. Carl's are not, and his layout fills the heap with unneeded namespace objects.
Alice: 1 x 1024 MB file
1 file inode
8 blocks (1024 MB / 128 MB)
Total = 9 objects * 150 bytes = 1350 bytes of heap memory
Bob: 8 x 128 MB files
8 file inodes
8 blocks
Total = 16 objects * 150 bytes = 2400 bytes of heap memory
Carl: 1024 x 1 MB files
1024 file inodes
1024 blocks
Total = 2048 objects * 150 bytes = 307200 bytes of heap memory
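The three layouts can be checked with a small helper. The function name and the 150-bytes-per-object constant are our illustrative choices, not an HDFS API:

```python
BLOCK_SIZE_MB = 128
BYTES_PER_OBJECT = 150   # rule-of-thumb cost of one inode or one block

def heap_bytes(num_files: int, file_size_mb: int) -> int:
    """Namespace memory for num_files files of file_size_mb each:
    one inode per file plus ceil(size / block size) blocks per file."""
    blocks_per_file = -(-file_size_mb // BLOCK_SIZE_MB)   # ceiling division
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT

print(heap_bytes(1, 1024))    # Alice: 9 objects    -> 1350 bytes
print(heap_bytes(8, 128))     # Bob:   16 objects   -> 2400 bytes
print(heap_bytes(1024, 1))    # Carl:  2048 objects -> 307200 bytes
```

The same 1 GB of data costs Carl over 200 times more NameNode memory than Alice, which is exactly the small-files problem referenced at the top of the article.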
Example 2: estimate the required NameNode heap memory
// In production, this method lets you calculate how much NameNode heap memory is required from the known disk capacity.
This example estimates memory by considering the capacity of the cluster; values are rounded. Both clusters physically store 4800 TB, or approximately 36 million block files (at the default block size). Replication determines how many namespace blocks represent these block files.
Cluster A: 200 hosts of 24 TB each = 4800 TB.
Blocksize = 128 MB, replication = 1
Cluster capacity in MB: 200 * 24000000 MB = 4800000000 MB (4800 TB)
Disk space required per block: 128 MB * 1 = 128 MB storage per block
Cluster capacity in blocks: 4800000000 MB / 128 MB = 36000000 blocks (rounded)
At capacity, with the recommended allocation of 1 GB of memory per million blocks, Cluster A needs a maximum heap of 36 GB.
Cluster B: 200 hosts of 24 TB each = 4800 TB.
Blocksize = 128 MB, replication = 3
Cluster capacity in MB: 200 * 24000000 MB = 4800000000 MB (4800 TB)
Disk space required per block: 128 MB * 3 = 384 MB storage per block
Cluster capacity in blocks: 4800000000 MB / 384 MB = 12000000 blocks (rounded)
At capacity, with the recommended allocation of 1 GB of memory per million blocks, Cluster B needs a maximum heap of 12 GB.
Cluster A and Cluster B store the same number of block files. In Cluster A, however, each block file is unique and is represented by one block on the NameNode; in Cluster B, only one third are unique and the other two thirds are replicas.
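The cluster-level estimate can be sketched the same way. Note the exact quotients are 37.5 million and 12.5 million blocks; the article rounds these to 36 million and 12 million. Function names are illustrative only:

```python
def cluster_blocks(hosts: int, tb_per_host: int,
                   block_size_mb: int, replication: int) -> int:
    """Namespace blocks a cluster can hold at capacity.
    Replicas consume disk per block, but not extra namespace blocks."""
    capacity_mb = hosts * tb_per_host * 1_000_000
    disk_per_block_mb = block_size_mb * replication
    return capacity_mb // disk_per_block_mb

def heap_gb(blocks: int) -> float:
    """Recommended NameNode heap: ~1 GB per million namespace blocks."""
    return blocks / 1_000_000

a = cluster_blocks(200, 24, 128, 1)   # Cluster A: 37500000 blocks (~36M rounded)
b = cluster_blocks(200, 24, 128, 3)   # Cluster B: 12500000 blocks (~12M rounded)
print(heap_gb(a), heap_gb(b))         # max heap in GB for each cluster
```

Tripling replication cuts the number of namespace blocks (and therefore the recommended heap) to one third, even though both clusters hold the same raw disk capacity.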