Monday, 2019-2-25
Why is HDFS not good at storing a large number of small files?
Advantages and disadvantages of HDFS
Advantages:
1. Can be built on cheap machines
Reliability is improved through multiple replicas, which provide fault-tolerance and recovery mechanisms.
Server node failures are the norm and must be handled as expected events.
2. High fault tolerance
Multiple copies of the data are saved automatically, and lost replicas are restored automatically.
The core design idea of HDFS: distribute data uniformly across the cluster and keep redundant backup replicas.
3. Suitable for batch processing
Move computation to the data rather than moving the data; block locations are exposed to the computing framework.
Ultimately, a computation over a large amount of data is divided into many small tasks.
4. Suitable for big data
GB-, TB-, and even PB-scale data; file counts in the millions; clusters of 10K+ nodes.
5. Streaming file access
Write once, read many times, which keeps the data consistent.
Shortcomings of HDFS
1. Low-latency data access
HDFS is built for high throughput, not for millisecond-level low-latency access.
2. Access to a large number of small files
They occupy a large amount of NameNode memory: 150 bytes × 10 million files ≈ 1.5 GB, and seek time ends up exceeding read time.
3. Concurrent writes and random file modification
A file can have only one writer at a time, and that writer can only append; see the sketch below.
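As a quick illustration of the single-writer, append-only model, the following is a minimal sketch in Java using the standard Hadoop FileSystem API; the cluster URI and file path are hypothetical placeholders, not taken from the original article.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendOnlyDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // "hdfs://namenode:8020" is a placeholder for your cluster address.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        Path file = new Path("/tmp/append-demo.log");

        // Create the file once; the single writer holds the lease.
        try (FSDataOutputStream out = fs.create(file)) {
            out.writeBytes("first record\n");
        }

        // A later writer can only append to the end of the file; HDFS offers
        // no call to overwrite bytes in the middle of an existing file.
        try (FSDataOutputStream out = fs.append(file)) {
            out.writeBytes("appended record\n");
        }
    }
}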
The difference between FastDFS and Hadoop
The main difference lies in their positioning and intended applications.
Hadoop's file system, HDFS, mainly solves distributed data storage for parallel computation. Its individual data files are usually very large and are stored as blocks (split into chunks).
FastDFS is mainly used by large and medium-sized websites to provide online file upload and download services, so it has good support for load balancing, dynamic expansion, and so on; FastDFS does not split (chunk) files.
Why HDFS is not suitable for storing a large number of small files
The NameNode stores metadata in memory, and each small file's metadata takes up about 150 bytes, so with a very large number of small files memory becomes tight.
It is not that HDFS cannot store small files; rather, it is not suited to a sheer mass of small files. The number of files should not be too large.
Managing each file takes up a certain amount of memory on the master machine; too many files means too much memory, which slows the cluster down.
The more files there are, the more memory is occupied, which hurts the system's processing speed. A small number of large files occupies less memory, and processing is faster.
HDFS cannot efficiently store a large number of small files; how should small files be handled? https://blog.csdn.net/zyd94857/article/details/79946773
Hadoop SequenceFile details: https://blog.csdn.net/bitcarmanlee/article/details/78111289
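The second link above is about Hadoop's SequenceFile format, a common remedy for the small-file problem: pack many small files into one large container file so the NameNode tracks a single entry instead of millions. A minimal sketch in Java (the input directory and output path are hypothetical) might look like this:

import java.io.File;
import java.nio.file.Files;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SmallFilePacker {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        File inputDir = new File("/data/small-files");    // hypothetical local source directory
        Path output = new Path("/user/demo/packed.seq");  // hypothetical destination path

        SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(output),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class));
        try {
            for (File f : inputDir.listFiles()) {
                // Key = original file name, value = raw bytes of the small file.
                byte[] contents = Files.readAllBytes(f.toPath());
                writer.append(new Text(f.getName()), new BytesWritable(contents));
            }
        } finally {
            IOUtils.closeStream(writer);
        }
    }
}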
(1) HDFS is not suitable for storing a large number of small files, because the NameNode keeps the file system's metadata in memory, so the number of files it can store is limited by the NameNode's memory size. Each file, directory, and data block in HDFS occupies about 150 bytes; too many files consume a great deal of memory.
(2) HDFS is suited to high throughput rather than low-latency access. Storing a large number of small files at the same time takes a long time.
(3) Streaming reads are not suited to multi-user writes or arbitrary modification of files. Accessing many small files forces reads to jump from one DataNode to another, which greatly degrades read performance.
The reference link is: https://www.cnblogs.com/qingyunzong/p/8535995.html
This has to do with the underlying design and implementation of HDFS. HDFS was designed to store massive large files and is naturally suited to big-data processing. A large file stored in HDFS is split into many data blocks; any file, no matter how small, occupies at least one independent data block, and information about these blocks is recorded in the metadata. As introduced in the earlier HDFS basics post, this metadata is kept on the NameNode of the HDFS cluster and consists mainly of the following three parts:
1) the abstract directory tree
2) the mapping between files and data blocks; the metadata for a data block is about 150 bytes
3) the storage locations of each data block's replicas
Parts 1 and 2 of the metadata are persisted to disk, while all three parts are held in memory, and server memory has an upper limit. For example:
If 100 files of 1 MB each are stored in HDFS, there are 100 data blocks, so the metadata is 100 × 150 bytes = 15,000 bytes of memory consumed to store only 100 MB of data.
If a single 100 MB file is stored in HDFS, there is 1 data block, the metadata is 150 bytes, and only 150 bytes of memory are consumed to store the same 100 MB of data.
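The arithmetic above can be reproduced with a tiny calculation. This sketch simply applies the article's simplified assumption of roughly 150 bytes of NameNode metadata per data block; it is an illustration, not a real sizing tool.

public class NameNodeMemoryEstimate {
    static final long BYTES_PER_BLOCK_META = 150;  // approximate metadata cost per block

    static long metadataBytes(long numBlocks) {
        return numBlocks * BYTES_PER_BLOCK_META;
    }

    public static void main(String[] args) {
        // 100 files of 1 MB each -> 100 blocks -> 15,000 bytes of metadata for 100 MB of data.
        System.out.println("100 x 1 MB files: " + metadataBytes(100) + " bytes of metadata");
        // 1 file of 100 MB -> 1 block -> 150 bytes of metadata for the same 100 MB of data.
        System.out.println("1 x 100 MB file:  " + metadataBytes(1) + " bytes of metadata");
    }
}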
CDH's official explanation of why HDFS is not suitable for storing a large number of small files // very important
Files and blocks
In HDFS, data and metadata are stored separately. Data files are split into block files, which are stored and replicated on the DataNodes in the cluster. The file system namespace tree and the associated metadata are stored on the NameNode.
Namespace objects are file inodes and blocks that point to the block files on the DataNodes. These namespace objects are stored as a file system image (fsimage) in the NameNode's memory and are also persisted locally. Updates to the metadata are written to an edit log. When the NameNode starts, or when a checkpoint is taken, the edits are applied, the log is cleared, and a new fsimage is created.
Important: the NameNode keeps the entire namespace image in memory. The Secondary NameNode, in its own JVM, does the same when it creates an image checkpoint.
On average, each file consumes 1.5 blocks of storage. That is, the average file is split into two block files: one consumes the entire allocated block size and the other consumes half of it. On the NameNode, this same average file requires three namespace objects (1 file inode + 2 blocks).
Disk space and namespace
The CDH default block size (dfs.blocksize) is set to 128MB. Each namespace object on the NameNode consumes approximately 150 bytes.
On DataNodes, data files are measured by the disk space consumed (the actual data length), not necessarily the entire block size. For example, a file of 192 MB takes up 192 MB of disk space instead of an integral multiple of the block size. Using the default block size of 128 MB, a file of 192 MB is divided into two block files, a 128 MB file and a 64 MB file. On NameNode, namespace objects are measured by the number of files and blocks. The same 192 MB file is represented by three namespace objects (1 file inode + 2 blocks) and consumes about 450 bytes of memory.
Large files split into fewer blocks generally consume less memory than small files that generate many blocks. A 128 MB data file is represented by two namespace objects on the NameNode (1 file inode + 1 block) and consumes about 300 bytes of memory. By contrast, 128 files of 1 MB each are represented by 256 namespace objects (128 file inodes + 128 blocks) and consume approximately 38,400 bytes. The optimal split size is therefore some integral multiple of the block size, both for memory management and for data locality optimization.
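The same accounting can be written as a small formula: namespace objects = 1 file inode + ceil(fileSize / blockSize) blocks, at roughly 150 bytes per object. The sketch below only illustrates that estimate; it is not a Cloudera tool.

public class NamespaceObjectEstimate {
    static final long BLOCK_SIZE = 128L * 1024 * 1024;  // dfs.blocksize default, 128 MB
    static final long BYTES_PER_OBJECT = 150;           // approximate, per the CDH documentation

    static long memoryBytes(long fileSizeBytes, long fileCount) {
        long blocksPerFile = (fileSizeBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;  // ceiling division
        long objects = fileCount * (1 + blocksPerFile);                      // inode + blocks
        return objects * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // One 192 MB file: 1 inode + 2 blocks = 3 objects, about 450 bytes.
        System.out.println("1 x 192 MB : " + memoryBytes(192 * mb, 1) + " bytes");
        // 128 files of 1 MB each: 128 inodes + 128 blocks = 256 objects, about 38,400 bytes.
        System.out.println("128 x 1 MB : " + memoryBytes(1 * mb, 128) + " bytes");
    }
}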
By default, Cloudera Manager allocates a maximum NameNode heap space of 1 GB per million blocks (but never less than 1 GB). How much memory you actually need depends on your workload, especially the number of files, directories, and blocks generated in each namespace. If all of your files are split at the block size, you could allocate 1 GB per million files. But given the historical average of 1.5 blocks per file (2 block objects), a more conservative estimate is 1 GB of memory per million blocks.
Important: Cloudera recommends 1 GB of NameNode heap space per million blocks to account for the namespace objects, necessary bookkeeping data structures, and the remote procedure call (RPC) workload. In practice, your heap requirements will likely be lower than this conservative estimate.
Replication
The default block replication factor (dfs.replication) is 3. Replication affects disk space but not memory consumption: it changes the amount of storage required for each block, but not the number of blocks. If a block file on a DataNode, represented by one block on the NameNode, is replicated three times, the number of block files triples, but the number of blocks that represent them does not.
When replication is turned off, a 192 MB file takes up 192 MB of disk space and approximately 450 bytes of memory.
// (the 192 MB is split into a 128 MB block and a 64 MB block, i.e. 1 file inode + 2 blocks = 3 namespace objects at about 150 bytes each, roughly 450 bytes of memory in total)
If you have one million such files, or 192 TB of data, you need 192 TB of disk space and, ignoring the RPC workload, 450 MB of memory: (1 million inodes + 2 million blocks) × 150 bytes. With default replication enabled, you need 576 TB of disk space (192 TB × 3), but memory usage stays the same at 450 MB. When you account for bookkeeping and RPC, and follow the recommendation of 1 GB of heap memory per million blocks, a safer estimate for this scenario is 2 GB of memory (with or without replication).
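As a rough check of the 2 GB figure, the rule of thumb quoted above (about 1 GB of NameNode heap per million blocks, never less than 1 GB) can be expressed directly. This is only an estimate sketch, not an official sizing tool.

public class NameNodeHeapEstimate {
    static long recommendedHeapGb(long blockCount) {
        long gb = (blockCount + 999_999) / 1_000_000;  // ceil(blocks / 1,000,000)
        return Math.max(gb, 1);                        // never less than 1 GB
    }

    public static void main(String[] args) {
        // One million 192 MB files -> two million blocks -> about 2 GB of heap,
        // matching the scenario described in the paragraph above.
        long blocks = 2_000_000L;
        System.out.println("Recommended NameNode heap: " + recommendedHeapGb(blocks) + " GB");
    }
}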
Tip:
Many materials say that the HDFS file system is not suitable for storing small files. In my experience, HDFS is not unable to store small files; it is simply not good at storing a very large number of them.
Reference link: https://www.cloudera.com/documentation/enterprise/5-13-x/topics/admin_nn_memory_config.html