2025-01-17 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)05/31 Report--
This article explains why du and df report different disk usage figures.
We often use du and df to find out how much space a directory or file system occupies. Their results rarely match exactly. Most of the time the difference is small, but sometimes it is very large.
Statistical results of df:
    [root@xuexi ~]# df -hT
    Filesystem           Type   Size  Used Avail Use% Mounted on
    /dev/sda2            ext4    18G  1.7G   15G  11% /
    tmpfs                tmpfs  491M     0  491M   0% /dev/shm
    /dev/sda1            ext4   239M   68M  159M  30% /boot
    //192.168.0.124/win  cifs   381G  243G  138G  64% /mnt
Statistical results of du for the root directory:
    [root@xuexi ~]# du -sh / 2>/dev/null
    244G    /
df reports 1.7G used on "/", but du reports 244G, so here du's figure is far larger than df's. Now look at the results for the /boot partition.
    [root@xuexi ~]# df -hT /boot; echo; du -sh /boot
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda1      ext4  239M   68M  159M  30% /boot

    66M     /boot
du reports 66M and df reports 68M. The difference is small, but this time df's figure is the larger one.
The underlying process of file storage and deletion
Here is a brief description of the file system mechanisms involved, starting with how a file is stored.
Suppose a.txt is to be stored in the /tmp directory:
(1) Find a free inode number in the inode table and assign it to a.txt, for example 2222. Then mark inode number 2222 as used in the inode map (imap).
(2) Add a record for a.txt to the data block of /tmp. The record includes a pointer to the inode number, such as "0x2222".
(3) Find free data blocks in the block map (bmap) and start writing the data of a.txt into them. Each time a chunk of space has been filled (ext4 allocates a chunk of space at a time), look up more free data blocks in the bmap, until all the data is stored.
(4) Set the data block pointers in the inode table record for inode 2222. These pointers identify which data blocks a.txt uses.
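The outcome of these steps can be observed from user space. The following is a minimal sketch, assuming GNU coreutils stat; the helper name show_inode and the scratch directory (standing in for /tmp) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the inode number, the number of allocated
# blocks, and the size of each counted block for one file.
# Assumes GNU stat (the -c format option).
show_inode() {
    stat -c 'inode=%i blocks=%b block_size=%B' "$1"
}

workdir=$(mktemp -d)                  # scratch directory standing in for /tmp
printf 'hello\n' > "$workdir/a.txt"   # creating the file assigns it an inode
inode_line=$(show_inode "$workdir/a.txt")
echo "$inode_line"
ls -i "$workdir"                      # the directory record maps the name a.txt to that inode
rm -rf "$workdir"
```

The inode number printed by stat and the one printed by ls -i are the same, which is the name-to-inode record described in step (2).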
When a.txt is deleted:
(1) Delete the data block pointers for a.txt in the inode table. Once they are gone, the data of a.txt can no longer be located from outside. The file still exists at this point, but as a "corrupted" file, because nothing points to its data blocks.
(2) Mark inode number 2222 as unused in the imap. The inode number is then released and can be reused by later files.
(3) Delete the record for a.txt from the data block of the parent directory /tmp. Once it is gone, the file can no longer be seen or found from outside.
(4) Mark the blocks occupied by a.txt as unused in the bmap. From then on, these data blocks can be overwritten and reused by later files.
Now consider a file that is deleted while a process is still using it. The file can no longer be seen or found from outside, so the deletion has progressed as far as step (3).
The process, however, is still using the file's data and can still reach it, because it obtained the file's data blocks when it opened the file. Although the file has been deleted, its data blocks have not yet been marked as unused in the bmap, so step (4) has not happened.
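This behavior is easy to reproduce without root privileges. The sketch below uses only POSIX shell features; the scratch directory and the file content are made up for illustration, and the shell itself plays the role of the process that keeps the file open.

```shell
#!/bin/sh
# Sketch: the data of a deleted file stays readable through a descriptor
# that was opened before the deletion.
workdir=$(mktemp -d)
printf 'still here\n' > "$workdir/a.txt"

exec 3< "$workdir/a.txt"    # this shell now holds the file open on descriptor 3
rm -f "$workdir/a.txt"      # the name disappears from the directory

if [ -e "$workdir/a.txt" ]; then visible=yes; else visible=no; fi
content=$(cat <&3)          # the data can still be read through descriptor 3
exec 3<&-                   # closing the last open descriptor lets the space be released

echo "visible=$visible content=$content"
rmdir "$workdir"
```

The name is gone as soon as rm returns, yet the read through descriptor 3 still succeeds, which is the state described above.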
The principle of du statistics
du calls stat on every file (including those in subdirectories) and sums the space each one occupies. Because stat is called for every file involved, du is relatively slow.
1. If other file systems are mounted inside the directory being measured, they are counted as well. For example, "du -sh /" counts the files of every partition, including mounted ones. This is what happened with "/" at the beginning of this article: du reported 244G, far more than df, because another partition is mounted on the /mnt directory.
    ## Statistical results of df
    [root@xuexi ~]# df -hT
    Filesystem           Type   Size  Used Avail Use% Mounted on
    /dev/sda2            ext4    18G  1.7G   15G  11% /
    tmpfs                tmpfs  491M     0  491M   0% /dev/shm
    /dev/sda1            ext4   239M   68M  159M  30% /boot
    //192.168.0.124/win  cifs   381G  243G  138G  64% /mnt
    ## Statistical results of du for the root directory
    [root@xuexi ~]# du -sh / 2>/dev/null
    244G    /
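When only the file system that holds the starting directory should be measured, GNU du offers the -x (--one-file-system) option, which skips directories that live on other file systems. The following is a minimal sketch on a scratch directory (the directory and file names are made up for illustration); on the machine above the equivalent call would be "du -shx /".

```shell
#!/bin/sh
# Sketch: compare du with and without -x (--one-file-system).
# Assumes GNU du. In a tree without mount points both figures are expected
# to agree; they diverge only when another file system is mounted below
# the starting directory.
workdir=$(mktemp -d)
head -c 65536 /dev/urandom > "$workdir/sample.bin"

same_fs=$(du -skx "$workdir" | cut -f1)   # stay on the starting file system
all_fs=$(du -sk "$workdir" | cut -f1)     # follow every mount point below

echo "one file system: ${same_fs}K, all file systems: ${all_fs}K"
rm -rf "$workdir"
```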
two。 If a file is deleted, the du command cannot count it even if it is referenced by another process. This file cannot be found by the stat command.
3. You can count the sum of certain file sizes you want to count across partitions. Because they can all be found and counted by stat. For example: count the size of all img files under Linux.
# # Statistical results of df [root@xuexi ~] # find /-type f-name "* .img"-print0 | xargs-0 du-csh 19m / boot/initramfs-2.6.32-504.el6.x86_64.img 13m / mnt/linux tool / cirros-0.3.4-x86_64-disk.img 31m total
The two img files counted here are in different partitions.
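The per-file approach can be imitated by hand. The sketch below is not du's actual implementation; it is a rough equivalent, assuming GNU find and GNU stat, and the helper name sum_allocated is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: add up the allocated size of everything under a
# directory, one stat per entry, roughly the way du walks a tree.
# Assumes GNU stat, where %b is the number of allocated blocks and %B the
# size in bytes of each of those blocks. Hard links are not de-duplicated
# here, which du itself does.
sum_allocated() {
    find "$1" -exec stat -c '%b %B' {} + |
        awk '{ total += $1 * $2 } END { printf "%.0f\n", total }'
}

workdir=$(mktemp -d)
head -c 1048576 /dev/urandom > "$workdir/data.bin"   # 1 MiB of sample data

walked=$(sum_allocated "$workdir")
reported=$(du -s --block-size=1 "$workdir" | cut -f1)
echo "walked=$walked bytes, du=$reported bytes"
rm -rf "$workdir"
```

Both figures come from per-file metadata, so a file whose name has been removed contributes to neither of them.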
The principle of df statistics
df reads the superblock of each partition to obtain the number of free and used data blocks, and calculates free and used space from them. This makes df extremely fast (the superblock occupies only 1024 bytes).
1. When other partitions are mounted inside a file system, df does not count them as part of it. This is easy to understand: df reads the superblock of each partition separately, so even if partition 1 is mounted on a directory of partition 0, df reads only the superblock of partition 0 when reporting on partition 0.
For example, /mnt and /boot below are not counted as part of "/".
    [root@xuexi ~]# df -hT
    Filesystem           Type   Size  Used Avail Use% Mounted on
    /dev/sda2            ext4    18G  1.7G   15G  11% /
    tmpfs                tmpfs  491M     0  491M   0% /dev/shm
    /dev/sda1            ext4   239M   68M  159M  30% /boot
    //192.168.0.124/win  cifs   381G  243G  138G  64% /mnt
two。 Because df reads superblock every time, df automatically converts the statistics of a file in the file system to count the information of the file system.
    [root@xuexi ~]# df -hT /etc/fstab
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  1.7G   15G  11% /
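The same block counters can be read directly, without df. The sketch below assumes GNU stat, whose -f option reports file system status rather than file status; the helper name fs_used_bytes is made up for illustration. Whatever path is passed in, the result describes the whole file system that contains it, just as with "df /etc/fstab" above.

```shell
#!/bin/sh
# Hypothetical helper: compute the used bytes of the file system holding a
# path from its block counters alone, with no directory walk.
# Assumes GNU stat: with -f, %S is the fundamental block size, %b the total
# number of data blocks, and %f the number of free blocks.
fs_used_bytes() {
    stat -f -c '%S %b %f' "$1" |
        awk '{ printf "%.0f\n", ($2 - $3) * $1 }'
}

used=$(fs_used_bytes /)
echo "used bytes on the file system holding /: $used"
```

Because only a few counters are read, the cost does not depend on how many files the file system contains, which is why df returns almost instantly.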
3. df counts files that have been deleted but are still referenced by a process.
Normally, deleting a file immediately releases the relevant pointers and marks the corresponding bits in the imap and bmap as unused. As soon as the bmap changes, the file system knows which blocks in each block group are free and which are used, and this information is written to the partition's superblock. That is why df can report up-to-date space figures immediately.
However, if a process is referencing the file when it is deleted, then, as analyzed earlier, the file's data blocks are not marked as unused in the bmap and the block usage in the superblock is not updated. Because df calculates free and used space from the free and used block counts in the superblock, it still counts the "deleted" file as used space.
For example, create a large file in the "/" directory, then measure the used space of the root directory with both du and df.
    [root@xuexi ~]# dd if=/dev/zero of=/my.iso bs=1M count=1000
    [root@xuexi ~]# df -hT /
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  2.7G   14G  17% /
    [root@xuexi ~]# du -sh --exclude="/mnt" / 2>/dev/null
    2.7G    /
Expressed in GB, the two figures are equal. Now let a process reference the file, delete the file, and run du and df again.
    [root@xuexi ~]# tail -f /my.iso &
    [root@xuexi ~]# rm -rf /my.iso
    [root@xuexi ~]# ls /my.iso
    ls: cannot access /my.iso: No such file or directory
    [root@xuexi ~]# du -sh --exclude="/mnt" / 2>/dev/null
    1.8G    /
    [root@xuexi ~]# df -hT /
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  2.7G   14G  17% /
The my.iso file is no longer visible from outside, so du cannot count it. df, on the other hand, still counts its size, because the data blocks occupied by my.iso have not been marked as unused. Now stop the tail process and run df again: the result returns to normal and is consistent with du.
    [root@xuexi ~]# jobs
    [1]+  Running                 tail -f /my.iso &
    [root@xuexi ~]# kill %1
    [root@xuexi ~]# df -hT /
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda2      ext4   18G  1.7G   15G  11% /
If you do not know which files in a file system have been deleted but are still held open by a process, lsof can list them. It also shows each file's size, so you can see which file is holding on to space and how much. For example, running lsof before stopping the tail process shows that tail holds /my.iso, whose size is 1048576000 bytes.
    [root@xuexi ~]# lsof | grep deleted
    php-fpm 12597 root    txt  REG  8,2    4058416  931143 /usr/sbin/php-fpm (deleted)
    php-fpm 12657 nobody  txt  REG  8,2    4058416  931143 /usr/sbin/php-fpm (deleted)
    php-fpm 12707 nobody  txt  REG  8,2    4058416  931143 /usr/sbin/php-fpm (deleted)
    php-fpm 12708 nobody  txt  REG  8,2    4058416  931143 /usr/sbin/php-fpm (deleted)
    tail    14437 root    3r   REG  8,2 1048576000    7171 /my.iso (deleted)
That concludes the analysis of why du and df report different results.
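On a machine without lsof, similar information for a single process can be read from /proc. The sketch below assumes a Linux /proc file system; the helper name deleted_fds and the scratch directory are made up for illustration, and the shell itself acts as the process holding the file.

```shell
#!/bin/sh
# Hypothetical helper: list the descriptors of one process that point at
# deleted files. Assumes Linux /proc, where the link target of such a
# descriptor ends in " (deleted)".
deleted_fds() {
    for fd in /proc/"$1"/fd/*; do
        target=$(readlink "$fd") || continue
        case $target in
            *' (deleted)') echo "$fd -> $target" ;;
        esac
    done
}

workdir=$(mktemp -d)
printf 'data\n' > "$workdir/my.iso"
exec 3< "$workdir/my.iso"     # hold the file open in this shell
rm -f "$workdir/my.iso"

found=$(deleted_fds $$)       # $$ is this shell's own process ID
echo "$found"

exec 3<&-                     # release the file so its space can be freed
rmdir "$workdir"
```

Looping the helper over every numeric directory in /proc would approximate "lsof | grep deleted", although reading other users' processes requires root.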