Why is the FILE record size 0 after NTFS deletes a large file (over 4 GB) or a database file?
A: When NTFS deletes a file, all of the following steps must be completed before the deletion is finished (a simplified sketch of the bookkeeping follows the list):
1. Clear the file's clusters in the volume $Bitmap to free the space they occupied.
2. Mark the file's FILE record in the $MFT as deleted (not in use).
3. Clear the corresponding bit in $MFT:$Bitmap to release that FILE record slot.
4. Remove the entry for this file from its parent directory's index.
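To make the four steps concrete, here is a minimal Python sketch of the bookkeeping involved. It is a toy in-memory model, not real NTFS code; the class and field names (Volume, bitmap, mft_bitmap, and so on) are invented for illustration only.

```python
# Toy in-memory model of the four bookkeeping updates a delete performs.
# Names and structures are illustrative; they do not mirror the real
# on-disk NTFS layout.

class Volume:
    def __init__(self, total_clusters, mft_slots):
        self.bitmap = [0] * total_clusters   # $Bitmap: 1 = cluster in use
        self.mft_bitmap = [0] * mft_slots    # $MFT:$Bitmap: 1 = FILE record in use
        self.mft = {}                        # record number -> file metadata
        self.directory = {}                  # file name -> record number

    def delete_file(self, record_no):
        rec = self.mft[record_no]
        # 1. Clear the file's clusters in the volume bitmap ($Bitmap).
        for cluster in rec["clusters"]:
            self.bitmap[cluster] = 0
        # 2. Mark the FILE record itself as deleted (not in use).
        rec["in_use"] = False
        # 3. Clear the record's bit in $MFT:$Bitmap, freeing the slot.
        self.mft_bitmap[record_no] = 0
        # 4. Remove the file's entry from its parent directory index.
        self.directory.pop(rec["name"], None)
```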
That sequence is the ideal rule of processing. In practice, the biggest headache for the operating system is this question: if the four steps above are interrupted partway (by a sudden power failure, a crash, and so on), how can the operation be continued, or at least how can the file system be kept consistent? At the simplest level, it would clearly be wrong for a file's data to be gone while its directory entry remains, and checking the whole volume every time would be far too slow. To solve this, NTFS introduces $LogFile, which, simply put, records the state of a complete I/O operation that is in progress (such as deleting a file). If the operation does not complete successfully, the volume can simply be rolled back to the pre-failure state the next time it is used. A minimal write-ahead-log sketch of this idea follows.
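The sketch below reuses the toy Volume model from above. The log format, the commit flag, and the rollback helper are assumptions made for illustration; they are not the real $LogFile record format.

```python
# Minimal write-ahead-log sketch: record the operation before applying it,
# mark it committed when it finishes, and roll back anything uncommitted
# on the next mount.

log = []  # on a real volume this state lives in $LogFile

def rollback(vol, entry):
    # Hypothetical undo step: restore the metadata captured in the log
    # entry. The details depend on what was logged; omitted here.
    pass

def logged_delete(vol, record_no):
    entry = {"op": "delete", "record": record_no, "committed": False}
    log.append(entry)             # write the log record first
    vol.delete_file(record_no)    # then perform the four metadata updates
    entry["committed"] = True     # finally write the commit record

def recover(vol):
    # At mount time, any operation that was logged but never committed is
    # undone, returning the volume to a consistent state.
    for entry in log:
        if not entry["committed"]:
            rollback(vol, entry)
```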
But this raises a new problem: if a file is very large, or its storage run list is very long (that is, the file has many fragments), the metadata that must be recorded for it becomes very large. For example, for a 4 GB file on 4 KB clusters, roughly one million $Bitmap bits have to be cleared; for a 4 TB file it would be on the order of a billion. To avoid stuffing that much information into the log at once (far too slow, and the amount of state grows with the file), NTFS handles complex or large files in batches: the file is repeatedly shrunk, smaller and smaller, until it reaches size 0. The rough arithmetic is shown below.
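The back-of-the-envelope arithmetic behind those numbers, assuming 4 KB clusters and one $Bitmap bit per cluster (the figures are rough, for illustration only):

```python
# How many $Bitmap bits one delete has to clear, assuming 4 KB clusters.
CLUSTER_SIZE = 4 * 1024  # bytes per cluster

def bitmap_bits_to_clear(file_size_bytes):
    return file_size_bytes // CLUSTER_SIZE   # one bit per cluster

print(bitmap_bits_to_clear(4 * 1024**3))  # 4 GB -> 1,048,576 bits (~128 KB of $Bitmap)
print(bitmap_bits_to_clear(4 * 1024**4))  # 4 TB -> 1,073,741,824 bits (~128 MB)
```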
To keep each operation consistent, our guess is that NTFS distinguishes two cases. If a single log record is enough to cover one atomic I/O operation, there is no need to clear the FILE record's size and location information (the run list). But if the operation cannot be logged as one atomic I/O, it has to be split into several separate atomic I/O operations; each one is logged, and the FILE record is updated to its new, smaller state as it completes. This is why, when a large file or a heavily fragmented file is deleted, after the last atomic I/O operation the size has been cleared to 0 and the run list is empty. A sketch of this batched delete, built on the toy model above, follows.
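This hypothetical sketch reuses the toy Volume model from earlier. The batch size is an invented constant, not NTFS's real limit; the point is only that each slice is one logged atomic operation and that the record ends with size 0 and an empty run list.

```python
# Batched delete: shrink the file one logged slice at a time until the
# size is 0 and the run list is empty, then remove the record itself.
CLUSTERS_PER_BATCH = 1024 * 1024   # invented batch size, for illustration only

def batched_delete(vol, record_no):
    rec = vol.mft[record_no]
    while rec["clusters"]:
        # One atomic I/O operation: free a slice of clusters and record the
        # new, smaller size and run list. Each slice would get its own log
        # record and commit, as in the logged_delete sketch above.
        batch = rec["clusters"][-CLUSTERS_PER_BATCH:]
        del rec["clusters"][-len(batch):]
        for cluster in batch:
            vol.bitmap[cluster] = 0
        rec["size"] = len(rec["clusters"]) * 4096
    # After the last slice, size is 0 and the run list is empty; this is
    # exactly what a recovery tool later finds in the deleted FILE record.
    vol.delete_file(record_no)
```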
In this problem, 4 GB is not really the magic threshold; our guess is that it comes from the 4 KB cluster size combined with the roughly one million clusters the delete can release in one pass. Database files are often hard to recover even when they are smaller than 4 GB: a database grows incrementally and ends up heavily fragmented, and that fragmentation means a large amount of metadata scattered across the volume, so the release cannot be completed in a single operation.
- Zhang Yu, North Asia Data Recovery Center