Illustration | you call this a file system? 07/19 Update SLTechnology News&Howtos

Illustration | you call this a file system?

2025-07-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > IT Information >

Shulou(Shulou.com)11/24 Report--

This article comes from Weixin Official Accounts: Low Concurrent Programming (ID: dipingfa), original title: Illustration| You call this shit a file system?, Author: Flash

You have a hard drive in your hand, the size of a terabyte.

You have a stack of files.

These files look like binary data to the hard drive.

You are going to store these files on your hard drive and read them when you need them.

What software should be designed to make it easier to read and write these files on the hard drive?

First of all, I don't want to deal with complicated sectors, device drivers and other details, so I first implemented a simple function to logically divide the hard disk into blocks, and can read and write in blocks.

Each block is defined as the size of two physical sectors, i.e. 1024 bytes, or 1KB.

Hard disk is too big to analyze, we assume that your hard disk is only 1MB, then this hard disk has 1024 blocks.

OK, let's start saving files!

Prepare a document

Just pick a block and put it in. Block 3!

Success! First victory!

2 Save another file!

Huh? Found a problem, in case this file is also saved to block 3, isn't the original file overwritten? No, there has to be a place to record what blocks are currently available, like this.

Block 0: unused

Block 1: Not Used

Block 2: Not Used

Block 3: Used

Block 4: Not Used

...

Block 1023: Not Used

Then let's use Block 0 to record the usage of all blocks! How do you record it?

Bitmap!

Then we give block 0 a name, called block bitmap, after which block 0 is dedicated to recording the use of all blocks, no longer used to store specific files.

When we store a new file, we only need to find the first 0 bit in the block bitmap to find the first unused block and store the file. At the same time, don't forget to put the corresponding position in the block bitmap 1.

Perfect!

3 Below, we try to read the file we just read.

Huh? I've got another problem. How do I find the file? Based on block numbers? It's stupid. It's like when you go to a bookstore and the clerk asks you for the book number instead of the title. It doesn't make sense.

So we give each file a name, a filename, by which to find the file.

There has to be a place where file names correspond to block numbers, like this.

Tweet: Block 3

Math Final Review Materials. mp4: Block 5

Secrets of Low Concurrency Programming.pdf: Block 10

...

Don't worry, since you have to choose a place to record the file name, you may wish to record more information we care about, such as file size, file creation time, file permissions, etc.

And these things naturally have to be stored on the hard drive, so we choose to use a fixed size of space to represent this information. How much space? 128 bytes.

Why 128 bytes? I'd love to.

We call this 128-byte structure an inode.

After that, every time we store a new file, we not only need to occupy a block to store the file itself, but also occupy an inode to store the meta-information of the file, and the block number of the inode field points to the block number of the file.

If an inode is 128 bytes, then a block can hold 8 inode, we can number these inode.

If you don't think you have enough inode, you can also use two or more blocks to store inode information, but this will reduce the number of blocks used to store data, which depends on your own balance.

Similarly, as with block bitmaps managing block usage, we also need an inode bitmap to manage inode usage. Let's put the inode bitmap in block 1!

At the same time, we put the inode information in block 2, one coexistence of 8 inode, so that our block 2 is called inode table.

Now, our file system structure looks like this.

Note: The block bitmap manages the blocks available, with each bit representing whether a block is used or not. The inode bitmap manages inode by inode, not the block occupied by inode. For example, there are 8 inode in the above picture, so there are 8 bits in the inode bitmap to manage their use or not.

4 Now, our files are small enough to fit in one block.

But what if you need two blocks, three blocks, four blocks?

Simple, we just need to use sequential storage method, and inode only records the first block of the file and how many blocks are needed after that.

The disadvantage of this method is that it is easy to leave large and small holes, and after the arrival of new files, it is difficult to find suitable blank blocks, and space will be wasted.

It seems that this method is not working, then what should we do?

Since the block number of the file is recorded in the inode, why not expand it and record more blocks?

Originally, only one block number was recorded in the inode, but now it is expanded to record 8 block numbers! And the blocks don't need to be contiguous.

Well, that's a viable option!

But this can only represent 8 blocks. The maximum file that can be recorded is 8K (remember, a block is 1K). Now the file easily exceeds this limit. What should I do?

It's very simple. We can use one of these blocks as an indirect index.

This instantly leaves 263 blocks available (256 − 1 more blocks). This type of index is called a first-level indirect index.

If that's not enough, make another block a first-level indirect index, or a second-level indirect index (a second-level indirect index can add 256 * 256 - 1 blocks).

Our file system, for the time being only get a level 1 indirect index. The hard disk has only 1024 blocks, so 263 blocks is big enough for a file. No matter how big it is, it is not allowed to be so willful.

OK, now we can save very large files, and can read them accurately by file name and file size!

5 But we have to improve, let's think about what's wrong with this file system.

For example, how do we know when there are not enough inode numbers? Do you need to find it in the inode bitmap, can't find it before you know it's not enough?

The same is true when the number of blocks is insufficient.

If only there was a global place to record all this, it would be good and convenient to adjust at any time, such as this

number of inodes

Number of free inode

block number

Number of free blocks

Let's just take another chunk to store that data! Since they seem to describe the file system from God's perspective, we put it on top of the very first block and call it a superblock, which is now laid out as follows.

We continue to improve.

Now, block bitmap, inode bitmap, inode table, are fixed to occupy this block 1, block 2, block 3 these three positions.

What if the number of subsequent inode is so large that the inode table or inode bitmap needs to occupy multiple blocks?

Or, the number of blocks increases (the hard disk itself gets bigger, or each block gets smaller), and the block bitmap needs to occupy multiple blocks. What should I do?

The program is dead. It won't guess what blocks mean unless you tell it.

It was very simple. Just like super block recording information, this information was also recorded in a block, so there was no need to worry. Let's choose block 1 immediately after the superblock to record this information and call it a block descriptor.

Of course, these block numbers are only the starting block number of the record. Block bitmap, inode bitmap and inode table can occupy multiple blocks respectively.

All right, we're done!

6 Now let's try to save another batch of files.

Sunflower Scripture.txt

Mathematics Final Review Materials.mp4

1.mp4

2.mp4

3.mp4

4.mp4

Secrets of Low Concurrency Programming.pdf

Huh? This looks very uncomfortable, all the files are tiled open, can there be hierarchy? like this

Sunflower Scripture.txt

Mathematics Final Review Materials.mp4

duansuo

1.mp4

2.mp4

3.mp4

4.mp4

Secrets of Low Concurrency Programming.pdf

We will sunflower book.txt this called ordinary file, will be redundant this called directory file, if you want to access redundant 1.mp4, then the full file name should be written

1.mp4/1.mp4

How do you do that? Then we have to bring up the inode structure again.

An attribute is needed to distinguish whether the file is a normal file or a directory file.

What's missing is what's missing, we're already familiar with it, add a 4 byte specifically to indicate the file type.

If it is a normal file, the data block pointed to by this inode is still the same as before, that is, the intact content of the file itself.

However, if it is a directory file, the data block pointed to by this inode needs to be re-planned.

What should this data block look like? It could be a structure next to each other pointing to different inode, like this.

This way, first find the data block by adding this directory file. Then according to the data block in a structure with inode information, find all the files under this directory.

Perfect!

7 But think about it, if you want to see all the files in this directory (such as ll command), the file name and file type are displayed, what do you do?

You need to take the inode pointed to by each structure from the inode table, and then take out the file name and file type, which is a waste of time.

And letting users see all the files in a directory is an extremely common operation.

So, why not put common information like file names and file types in the structs of the data blocks.

At the same time, the file name in the inode structure seems to be useless. This kind of variable length thing is very annoying in this fixed-length structure. I have long wanted to remove it. It also saves space for other information, such as the array of blocks in which the file is located, so you can have a few more.

Great, get rid of it!

OK, done, now we can classify the files into different directories, you can also create directories under the directory, unlimited sets of dolls!

8 The file system is now more perfect, but there is still a little bit uncomfortable.

We access a directory, we can comfortably see the files in the directory, and then access the files or directories under this directory according to the name. The whole process is a routine.

However, all files under the topmost directory, that is, the root directory, still need to be obtained by traversing all inode. Can it be unified with the above routine?

The answer is very simple, we stipulate that the 0 inode in the inode table indicates the root directory, and all access starts from this root directory!

Okay, there's no "then" this time!

Let's finish by appreciating our file system architecture.

You don't think it's a big deal.

But this shit, it's called a file system.

This file system is basically the same as the classic file system ext2 on linux.

Here is the structure of the ext2 file system I drew (only the core fields are drawn in the fields section)

I guess you can't see clearly, let me say the main similarities and differences:

1. In front of the super block is the boot block, which is the PC Alliance's 1KB dedicated space for hard disks, which cannot be used by any file system.

2. ext2 file system first divides the entire hard disk into many block groups, but if there is only one block group, the overall structure of our file system is exactly the same, namely super block, block descriptor, block bitmap, inode bitmap, inode table, data block.

3. The ext2 file system uses 15 blocks in the inode table to locate files, of which the 13th block is a first-level indirect index, 14 are second-level indirect indexes, and 15 are third-level indirect indexes.

4. Ext2 file system file types are divided into more, there are common such as block device files, character device files, pipe files, socket files and so on.

5. The ext2 file system has more information in the superblock, block descriptor, and inode tables, but the core is the same as our file system, and these fields are added in subsequent ext3 and ext4 to maintain forward compatibility.

6. Ext2 file system inode 2 for the root directory, and our system is 0 inode for the root directory, this is very arbitrary, you design a file system set a 187 inode for the root directory no one stops you.

If you want to know all the details of the ext2 file system, there are three ways.

1. Look at the source code, linux 1.0 after the source code have ext2 file system implementation, the source code is the most accurate.

2. Look at the official documentation, here's a pdf link.

https://www.nongnu.org/ext2-doc/ext2.pdf

3. Check out this great blog, I recommend one here.

http://docs.linuxtone.org/ebooks/C&amp;CPP/c/ch29s02.html

4. Use linux mke2fs command to generate a disk image of ext2 file system, and then analyze its format byte by byte, you can get my image analysis file in public number low concurrent programming reply ext2.

If it is not difficult to read the source code and official documents, of course I mainly push these two, because after all, it is first-hand information.

But most people probably can't do it, and sometimes it's not necessary, so read some good blogs.

Introductory thinking, I think this article is even a very high quality article, it will take you from the designer's point of view to understand why the file system is designed this way.

To introduce the details, those who even write the format and fields of the file system incorrectly, don't read it, so I recommend an article here in conscience, that is, the above way three, you can rest assured that bold, word-for-word consumption.

This article comes from Weixin Official Accounts: Low Concurrent Programming (ID: dipingfa), Author: Flash Guest

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.