
How to calculate the heap memory a NameNode needs in HDFS

2025-03-31 Update From: SLTechnology News&Howtos


For background, see the previous article:

Why HDFS is not good at storing a large number of small files

https://blog.51cto.com/12445535/2354951

Overview: NameNode and replication

Replication

The default block replication factor (dfs.replication) is 3. Replication affects disk space but not memory consumption: it changes the amount of storage required for each block, but not the number of blocks. If one block file on a DataNode (represented by one block on the NameNode) is replicated three times, the number of block files triples, but the number of blocks that represent them on the NameNode does not.

With replication turned off, one 192 MB file takes up 192 MB of disk space and approximately 450 bytes of memory.

(Calculation: 192 MB = 128 MB + 64 MB, so the file is 1 inode + 2 blocks = 3 objects, and 3 * 150 bytes = about 450 bytes of memory.)

If you have one million such files, or 192 TB of data, you need 192 TB of disk space and, leaving the RPC workload aside, 450 MB of memory: (1 million inodes + 2 million blocks) * 150 bytes. With default replication enabled, you need 576 TB of disk space (192 TB * 3), but memory usage stays the same at 450 MB. Once you account for bookkeeping and RPCs, and follow the recommendation of 1 GB of heap memory per million blocks, a safer estimate for this scenario is 2 GB of memory (with or without replication).
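The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb in the text (about 150 bytes per namespace object, 128 MB blocks); the function names are hypothetical and nothing here calls a Hadoop API.

```python
import math

BYTES_PER_OBJECT = 150  # rule-of-thumb figure from the text, not a measured value


def namespace_objects(file_size_mb, block_size_mb=128):
    """Objects the NameNode tracks for one file: 1 inode plus one per block."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    return 1 + blocks


def heap_bytes(num_files, file_size_mb, block_size_mb=128):
    """Estimated heap for the namespace only (no bookkeeping or RPC overhead)."""
    return num_files * namespace_objects(file_size_mb, block_size_mb) * BYTES_PER_OBJECT


def disk_mb(num_files, file_size_mb, replication=3):
    """Raw disk usage; replication multiplies disk space, not namespace objects."""
    return num_files * file_size_mb * replication


print(heap_bytes(1, 192))                          # one 192 MB file
print(heap_bytes(1_000_000, 192))                  # one million such files
print(disk_mb(1_000_000, 192, replication=1))      # replication off
print(disk_mb(1_000_000, 192, replication=3))      # default replication
```

Note that `replication` appears only in `disk_mb`, which mirrors the point of this section: the heap estimate is the same with or without replication.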

Examples

Example 1: estimated NameNode heap memory used

Alice, Bob and Carl each have 1 GB (1024 MB) of data on disk, but split into files of different sizes. Alice's and Bob's file sizes are whole multiples of the block size and require the least memory. Carl's are not, and they fill the heap with unnecessary namespace objects.

Alice: 1 x 1024 MB file

1 file inode

8 blocks (1024 MB / 128 MB)

Total = 9 objects * 150 bytes = 1350 bytes of heap memory

Bob: 8 x 128 MB files

8 file inodes

8 blocks

Total = 16 objects * 150 bytes = 2400 bytes of heap memory

Carl: 1024 x 1 MB files

1024 file inodes

1024 blocks

Total = 2048 objects * 150 bytes = 307200 bytes of heap memory
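The three totals can be checked with a short Python sketch. It assumes the same figures used in this example (128 MB blocks, about 150 bytes per object); the helper name `heap_for` is made up for illustration.

```python
import math

BYTES_PER_OBJECT = 150  # rule-of-thumb figure used in this example
BLOCK_SIZE_MB = 128     # default block size assumed in this example


def heap_for(num_files, file_size_mb):
    """Return (namespace objects, estimated heap bytes) for equally sized files."""
    blocks_per_file = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    objects = num_files * (1 + blocks_per_file)  # one inode plus its blocks, per file
    return objects, objects * BYTES_PER_OBJECT


# (number of files, size of each file in MB); every layout holds 1024 MB in total
layouts = {"Alice": (1, 1024), "Bob": (8, 128), "Carl": (1024, 1)}
results = {name: heap_for(*layout) for name, layout in layouts.items()}

for name, (objects, nbytes) in results.items():
    print(f"{name}: {objects} objects, {nbytes} bytes of heap")
```

The same 1 GB of data costs Carl more than 200 times the heap that it costs Alice, purely because of the file count.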

Example 2: estimated NameNode heap memory required

(Worked calculation: in production, use this to work out how much NameNode heap memory is required from a known disk capacity.)

In this example, memory is estimated from the capacity of the cluster. Values are rounded. Both clusters physically store 4800 TB, or approximately 36 million block files (at the default block size). Replication determines how many namespace blocks represent these block files.

Cluster A: 200 hosts of 24 TB each = 4800 TB.

Block size = 128 MB, replication = 1

Cluster capacity in MB: 200 * 24,000,000 MB = 4,800,000,000 MB (4800 TB)

Disk space required per block: 128 MB * 1 = 128 MB of storage per block

Cluster capacity in blocks: 4,800,000,000 MB / 128 MB = approximately 36,000,000 blocks (rounded; the exact quotient is 37,500,000)

At capacity, with the recommended allocation of 1 GB of memory per million blocks, Cluster A needs a maximum heap space of 36 GB.

Cluster B: 200 hosts of 24 TB each = 4800 TB.

Block size = 128 MB, replication = 3

Cluster capacity in MB: 200 * 24,000,000 MB = 4,800,000,000 MB (4800 TB)

Disk space required per block: 128 MB * 3 = 384 MB of storage per block

Cluster capacity in blocks: 4,800,000,000 MB / 384 MB = approximately 12,000,000 blocks (rounded; the exact quotient is 12,500,000)

At capacity, with the recommended allocation of 1 GB of memory per million blocks, Cluster B needs a maximum heap space of 12 GB.

Both Cluster A and Cluster B store the same number of block files. In Cluster A, however, each block file is unique and represented by one block on the NameNode; in Cluster B, only one-third are unique and two-thirds are replicas.
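The capacity-based estimate can be sketched in Python as follows. The function names are hypothetical, and the only inputs are the figures from this example plus the 1 GB per million blocks guideline. The script does not round, so its results come out slightly above the rounded 36 GB and 12 GB quoted above.

```python
GB_PER_MILLION_BLOCKS = 1  # sizing guideline quoted in the text


def namespace_blocks(hosts, mb_per_host, block_size_mb, replication):
    """Blocks the NameNode must track when the cluster is full."""
    capacity_mb = hosts * mb_per_host
    # Each namespace block occupies block_size * replication on disk.
    return capacity_mb // (block_size_mb * replication)


def heap_gb(blocks):
    """Heap estimate from the 1 GB per million blocks guideline."""
    return blocks / 1_000_000 * GB_PER_MILLION_BLOCKS


blocks_a = namespace_blocks(200, 24_000_000, 128, replication=1)
blocks_b = namespace_blocks(200, 24_000_000, 128, replication=3)

print(f"Cluster A: {blocks_a} blocks, about {heap_gb(blocks_a)} GB heap")
print(f"Cluster B: {blocks_b} blocks, about {heap_gb(blocks_b)} GB heap")
```

With the same raw disk capacity, tripling the replication factor cuts the number of namespace blocks, and therefore the heap estimate, to one-third.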
