Why are the statistical results of du and df different?

2025-01-19 Update From: SLTechnology News&Howtos (shulou)


This article explains why du and df report different disk usage.

Both du and df report how much space a directory or file system is using, but their results are not always consistent. Most of the time they differ only slightly, but sometimes they differ greatly.

Statistical results of df

    [root@liangxu ~]# df -hT
    Filesystem           Type   Size  Used Avail Use% Mounted on
    /dev/sda2            ext4    18G  1.7G   15G  11% /
    tmpfs                tmpfs  491M     0  491M   0% /dev/shm
    /dev/sda1            ext4   239M   68M  159M  30% /boot
    //192.168.0.124/win  cifs   381G  243G  138G  64% /mnt

Statistical results of du for the root directory

    [root@liangxu ~]# du -sh / 2>/dev/null
    244G    /

df reports 1.7G used for "/", but du reports 244G. Here the result of du is larger than that of df. Now look at the /boot partition.

    [root@liangxu ~]# df -hT /boot; echo; du -sh /boot
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda1      ext4  239M   68M  159M  30% /boot

    66M     /boot

du reports 66M and df reports 68M. The difference is small, but this time the result of df is larger than that of du.

The underlying process of file storage and deletion

First, a brief description of the underlying file system mechanism, starting with how a file is stored. Suppose a.txt is to be stored in the /tmp directory:

(1) Find a free inode number in the inode table and assign it to a.txt, for example 2222. Then mark inode number 2222 as used in the inode map (imap).

(2) Add a record for a.txt to the data block of /tmp. The record includes a pointer to the inode number, such as "0x2222".

(3) Find free data blocks in the block map (bmap) and start writing the data of a.txt to them. Each time a piece of space has been written (ext4 allocates a piece of space at a time), find more free data blocks in the bmap, until all the data is saved.

(4) Set the data block pointers in inode table entry 2222. Through these pointers you can find out which data blocks a.txt uses.
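The relationship between the directory record and the inode can be observed from user space. The following is a minimal Python sketch, not part of the original article: the function name create_and_inspect is hypothetical, and it assumes a POSIX file system where the inode number stored in the directory entry matches the one returned by stat.

```python
import os
import tempfile


def create_and_inspect(directory, name, data):
    """Create a file, then return (inode_in_dir_entry, inode_from_stat, blocks)."""
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(data)
    # The parent directory's record maps the file name to an inode number.
    with os.scandir(directory) as entries:
        dirent_inode = next(e.inode() for e in entries if e.name == name)
    # stat() reads the inode itself: its number and how many blocks it uses.
    st = os.stat(path)
    return dirent_inode, st.st_ino, st.st_blocks


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(create_and_inspect(tmp, "a.txt", b"x" * 8192))
```

The first two values should be the same number: the directory record only points at the inode, and the inode in turn records which data blocks belong to the file.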

When a.txt is deleted:

(1) Delete the data block pointers of a.txt in the inode table. Once they are deleted, the outside world can no longer find the data of a.txt. The file still exists, but it is a "corrupted" file because nothing points to its data blocks.

(2) Mark inode number 2222 as unused in the imap. The inode number is then released and can be reused by later files.

(3) Delete the record for a.txt in the data block of the parent directory /tmp. Once it is deleted, the file can no longer be seen or found by the outside world.

(4) Mark the blocks occupied by a.txt as unused in the bmap. Once they are marked as unused, these data blocks can be overwritten and reused by later files.

Now consider a file that is deleted while a process is still using it. The file can no longer be seen or found by the outside world, so the deletion has completed step (3).

However, the process can still find and use the data of the file, because it obtained the data blocks occupied by the file when it opened the file. Although the file has been deleted, its data blocks have not been marked as unused in the bmap. That happens only once the process stops using the file.
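This situation is easy to reproduce. The sketch below is illustrative only (the function name read_after_unlink is hypothetical) and assumes POSIX unlink semantics, where removing a name does not invalidate file descriptors that are already open.

```python
import os
import tempfile


def read_after_unlink(payload):
    """Delete a file while it is open; return (path_exists, link_count, data)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                    # remove the directory record
        exists = os.path.exists(path)      # the name can no longer be found
        nlink = os.fstat(fd).st_nlink      # how many names still point at the inode
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))   # the data blocks are still readable
    finally:
        os.close(fd)                       # only now can the space be released
    return exists, nlink, data


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```

The path no longer exists after the unlink, yet the open descriptor still returns the full contents. Until the descriptor is closed, the blocks remain allocated.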

The principle of du Statistics

du calls stat (the system call, not the stat command) on every file and subdirectory and adds up the space each one occupies. Because every file involved has to be stat-ed, du is relatively slow.
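The idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual du implementation: the name du_bytes and the apparent parameter are hypothetical, and it assumes st_blocks is expressed in 512-byte units, as it is on Linux.

```python
import os
import stat


def du_bytes(root, apparent=False):
    """Sum the space used by root and everything below it, one lstat() per entry."""
    seen = set()      # (device, inode) pairs, so hard links are counted once
    total = 0
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            st = os.lstat(path)
        except OSError:
            continue  # entry vanished or is unreadable: skip it
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue
        seen.add(key)
        # st_size is the apparent length; st_blocks is the space actually allocated.
        total += st.st_size if apparent else st.st_blocks * 512
        if stat.S_ISDIR(st.st_mode):
            try:
                stack.extend(os.path.join(path, n) for n in os.listdir(path))
            except OSError:
                pass  # directory not readable
    return total


if __name__ == "__main__":
    print(du_bytes("."))
```

Every entry costs one system call, which is why walking a large tree takes noticeably longer than asking the file system for its totals.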

If other file systems are mounted below the directory being measured, du counts them as well. For example, "du -sh /" counts the files of every partition, including the mounted ones. This is why, for the "/" at the beginning of this article, du reports 244G, far more than df: another file system is mounted on the /mnt directory.

    ## Statistical results of df
    [root@liangxu ~]# df -hT
    Filesystem           Type   Size  Used Avail Use% Mounted on
    /dev/sda2            ext4    18G  1.7G   15G  11% /
    tmpfs                tmpfs  491M     0  491M   0% /dev/shm
    /dev/sda1            ext4   239M   68M  159M  30% /boot
    //192.168.0.124/win  cifs   381G  243G  138G  64% /mnt

    ## Statistical results of du for the root directory
    [root@liangxu ~]# du -sh / 2>/dev/null
    244G    /
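Whether a directory belongs to another file system can be checked by comparing device numbers, which is the idea behind the -x (--one-file-system) option of du. The helper names below are hypothetical, and the sketch assumes that a different st_dev means a different file system; bind mounts of the same file system would not be detected this way.

```python
import os


def on_other_filesystem(parent, child):
    """True if child lives on a different file system than parent."""
    return os.lstat(parent).st_dev != os.lstat(child).st_dev


def mount_points_below(root):
    """List directories directly under root that sit on another file system."""
    found = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False) and on_other_filesystem(root, entry.path):
                found.append(entry.path)
    return sorted(found)


if __name__ == "__main__":
    print(mount_points_below("/"))
```

On the machine shown above, a call on "/" would be expected to list directories such as /boot and /mnt, whose contents du adds to the total for "/" unless it is told to stay on one file system.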

If a file has been deleted, du cannot count it even if a process still references it, because stat can no longer find the file.

du can also add up the sizes of selected files across partitions, because stat can find and count all of them. For example, to count the size of all img files on a Linux system:

    [root@liangxu ~]# find / -type f -name "*.img" -print0 | xargs -0 du -csh
    19M     /boot/initramfs-2.6.32-504.el6.x86_64.img
    13M     /mnt/linux tool/cirros-0.3.4-x86_64-disk.img
    31M     total

The two img files counted here are in different partitions.

The principle of df Statistics

df reads the superblock of each partition (through the statfs system call) to get the number of free and used data blocks, and calculates free space and used space from them. This makes df extremely fast (the superblock only takes up 1024 bytes).
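The same counters are available to any program through statvfs. The following is a minimal sketch of the calculation, with a hypothetical function name df_stats; it assumes the usual df convention that Use% is used / (used + avail), rounded up.

```python
import math
import os


def df_stats(path):
    """Return (size, used, avail, use_percent) in bytes for the file system holding path."""
    vfs = os.statvfs(path)
    size = vfs.f_blocks * vfs.f_frsize
    used = (vfs.f_blocks - vfs.f_bfree) * vfs.f_frsize
    avail = vfs.f_bavail * vfs.f_frsize   # space available to unprivileged users
    denom = used + avail
    percent = math.ceil(100 * used / denom) if denom else 0
    return size, used, avail, percent


if __name__ == "__main__":
    print(df_stats("/"))
```

A single call answers the question for the whole file system, no matter how many files it contains. The path may be any file or directory; the result always describes the file system that holds it.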

When other partitions are mounted under a file system, df does not count them as part of that file system. This is easy to understand: df reads the superblock of each partition separately, so even if partition 1 is mounted on a directory of partition 0, df reads only the superblock of partition 0 when reporting on partition 0.

For example, the following / mnt and / boot are not counted in "/".

    [root@liangxu ~]# df -hT
    Filesystem           Type   Size  Used Avail Use% Mounted on
    /dev/sda2            ext4    18G  1.7G   15G  11% /
    tmpfs                tmpfs  491M     0  491M   0% /dev/shm
    /dev/sda1            ext4   239M   68M  159M  30% /boot
    //192.168.0.124/win  cifs   381G  243G  138G  64% /mnt

Because df always reads the superblock, asking df about a single file reports the statistics of the file system that contains the file instead.

    [root@liangxu ~]# df -hT /etc/fstab
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  1.7G   15G  11% /

df counts files that have been deleted but are still referenced by a process.

Normally, deleting a file immediately releases the relevant pointers and marks the relevant bits in the imap and bmap as unused. As soon as the bmap changes, the file system knows which blocks in each block group are free and which are used, and this information is updated in the superblock of the partition. So df can report up-to-date space information immediately.

However, if a process still references the file when it is deleted, then, according to the analysis above, the data blocks of the file are not marked as unused in the bmap, and the block usage in the superblock is not updated. Because df calculates free and used space from the numbers of free and used blocks in the superblock, df still counts the "deleted" file as used space.

For example, create a large file in the "/" directory, then use df and du to measure the used space of the root file system.

    [root@liangxu ~]# dd if=/dev/zero of=/my.iso bs=1M count=1000
    [root@liangxu ~]# df -hT /
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  2.7G   14G  17% /
    [root@liangxu ~]# du -sh --exclude="/mnt" / 2>/dev/null
    2.7G    /

Both report 2.7G. Now let a process reference the file, delete the file, and run du and df again.

    [root@liangxu ~]# tail -f /my.iso &
    [root@liangxu ~]# rm -rf /my.iso
    [root@liangxu ~]# ls /my.iso
    ls: cannot access /my.iso: No such file or directory
    [root@liangxu ~]# du -sh --exclude="/mnt" / 2>/dev/null
    1.8G    /
    [root@liangxu ~]# df -hT /
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  2.7G   14G  17% /

The my.iso file is no longer visible to the outside world, so du cannot count it. df, on the other hand, still counts its size, because the data blocks occupied by my.iso have not been marked as unused. Now kill the tail process and run df again: its result returns to normal and is in line with du.

    [root@liangxu ~]# jobs
    [1]+  Running                 tail -f /my.iso &
    [root@liangxu ~]# kill %1
    [root@liangxu ~]# df -hT /
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  1.7G   15G  11% /

If you don't know which files in a file system have been deleted but are still referenced by a process, you can use lsof to find them. It also shows the size of each file, so you can see which deleted file is still holding space. For example, run lsof before killing the tail process: the tail process is holding /my.iso, and the file size is 1048576000 bytes.

    [root@liangxu ~]# lsof | grep deleted
    php-fpm  12597 root    txt  REG  8,2     4058416  931143 /usr/sbin/php-fpm (deleted)
    php-fpm  12657 nobody  txt  REG  8,2     4058416  931143 /usr/sbin/php-fpm (deleted)
    php-fpm  12707 nobody  txt  REG  8,2     4058416  931143 /usr/sbin/php-fpm (deleted)
    php-fpm  12708 nobody  txt  REG  8,2     4058416  931143 /usr/sbin/php-fpm (deleted)
    tail     14437 root    3r   REG  8,2  1048576000    7171 /my.iso (deleted)
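On Linux, similar information can be read directly from /proc when lsof is not installed. The sketch below is an assumption-laden illustration rather than a replacement for lsof: the function name deleted_open_files is hypothetical, it relies on the kernel appending " (deleted)" to the link target of an unlinked file, and it inspects only open file descriptors, not memory-mapped files such as the php-fpm txt entries above.

```python
import os


def deleted_open_files(proc="/proc"):
    """Yield (pid, fd, size_bytes, target) for open files whose name was removed."""
    for pid in filter(str.isdigit, os.listdir(proc)):
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)
                if not target.endswith(" (deleted)"):
                    continue
                size = os.stat(link).st_size  # follows the link to the open inode
            except OSError:
                continue  # descriptor closed while we were looking
            yield int(pid), int(fd), size, target


if __name__ == "__main__":
    for pid, fd, size, target in deleted_open_files():
        print(pid, fd, size, target)
```

Run as root to see the descriptors of all processes. The reported sizes add up to space that df counts but du cannot see.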
