
What is the basic concept of Linux file system

Updated 2025-01-19. Source: SLTechnology News&Howtos (shulou), Servers section.


This article explains the basic concepts of the Linux file system: its history, directory layout, paths and links, mounting, locking, and the main file-related system calls.

Basic concepts of Linux file system

Linux originally used the MINIX 1 file system, which supports only 14-byte file names and a maximum file size of 64 MB. Its successor was the ext file system. Compared with MINIX 1, ext greatly increased the supported file name length and file size, but it was slower than MINIX 1, so ext2 was developed. ext2 supports long file names and large files and performs better than MINIX 1, which made it the main Linux file system. Linux also supports many other file systems through the Virtual File System (VFS) layer: users can dynamically mount different file systems under the VFS.

A Linux file is a sequence of bytes of arbitrary length and can contain any information. The system makes no distinction between ASCII text, binaries, and other kinds of files.

For convenience, files can be grouped into directories. Directories are themselves stored as files and can largely be handled as files. Directories can contain subdirectories, which yields a hierarchical file system. The root directory on Linux is /, and it usually contains several subdirectories. The / character also separates directory names in a path. For example, /usr/cxuan refers to the directory usr under the root directory, which in turn contains a subdirectory called cxuan.

The main directories under the root directory of a Linux system are as follows.

/bin, essential binaries: the commands used by all users of the system

/boot, files needed at startup, including the boot loader

/dev, device files, such as terminals, USB devices, and any other device attached to the system

/etc, configuration files for programs, plus the shell scripts that start and stop individual applications

/home, users' home directories, where each user stores personal files

/lib, shared library files needed by the binaries in /bin and /sbin

/lost+found, where the file system checker places recovered file fragments; only the root user can view its contents

/media, mount point for removable media

/mnt, mount point for temporarily mounted file systems

/opt, optional (add-on) application packages

/proc, a special dynamic directory that exposes system information and status, including information about running processes

/root, the root user's home directory

/sbin, essential system binaries, mostly used for system administration

/tmp, temporary files created by the system and users; on many systems this directory is cleared when the system restarts

/usr, applications and files that are accessible to most users

/var, files that change frequently, such as log files and databases

Linux has two kinds of path names. An absolute path locates a file starting from the root directory; its drawback is that it can be long and inconvenient to type. A relative path is interpreted starting from the working directory (also called the current directory).

If /usr/local/books is the working directory, then the shell command

cp books books-replica

uses relative paths, and

cp /usr/local/books/books /usr/local/books/books-replica

does exactly the same thing using absolute paths.
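The equivalence of the two forms can be checked from a program. The following is a minimal sketch, not part of the original article: same_file is a hypothetical helper name, and it assumes that comparing the device and i-node numbers reported by stat is an acceptable test for "same file".

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the relative path `rel`, resolved from
 * the working directory `dir`, names the same file as the absolute path
 * `abs`, 0 if it does not, and -1 on error.
 * Side effect: it changes the working directory of the calling process. */
int same_file(const char *dir, const char *rel, const char *abs)
{
    struct stat a, b;

    assert(abs[0] == '/');          /* an absolute path starts at the root */
    if (chdir(dir) == -1)           /* set the working directory */
        return -1;
    if (stat(rel, &a) == -1 || stat(abs, &b) == -1)
        return -1;
    /* Two names refer to the same file when device and i-node match. */
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

With the working directory from the text, same_file("/usr/local/books", "books", "/usr/local/books/books") would be expected to return 1, provided that file exists.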

In Linux, a user often needs to use a file that belongs to another user or that sits elsewhere in the file tree. When two users share a file that lives in one user's directory, the other user has to refer to it by its absolute path. Typing a long absolute path every time is tedious, so Linux provides a link mechanism.

For example, suppose there are two accounts, jianshe and cxuan, and jianshe wants to use directory A under cxuan's account. Without a link, jianshe has to type /usr/cxuan/A every time.

[Figure: directory tree before the link is created]

With a link, jianshe creates an entry in its own directory that points to the directory under cxuan and uses that entry instead.

[Figure: directory tree after the link is created]
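As a rough illustration of the link mechanism, the sketch below creates a symbolic link with the POSIX symlink call and reads it back with readlink. It is an assumption that a symbolic link is the kind of link the article has in mind. The function name demo_symlink and the scratch directory under /tmp (standing in for /usr/jianshe) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Sketch: inside a scratch directory, create a symbolic link named A that
 * points at /usr/cxuan/A, then read the link back.
 * Returns 0 if the link stores the expected target, -1 otherwise. */
int demo_symlink(void)
{
    const char *target = "/usr/cxuan/A";  /* path from the text; need not exist */
    char dir[] = "/tmp/linkdemoXXXXXX";
    char linkpath[64], buf[64];
    int ok = -1;

    assert(strlen(target) < sizeof buf);
    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(linkpath, sizeof linkpath, "%s/A", dir);

    if (symlink(target, linkpath) == 0) {
        /* readlink does not NUL-terminate its result, so do it here. */
        ssize_t n = readlink(linkpath, buf, sizeof buf - 1);
        if (n >= 0) {
            buf[n] = '\0';
            ok = strcmp(buf, target) == 0 ? 0 : -1;
        }
        unlink(linkpath);
    }
    rmdir(dir);
    return ok;
}
```

A symbolic link only stores a path, which is why the target does not have to exist when the link is created.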

When a directory is created, two directory entries are created in it at the same time: . and .. The former refers to the directory itself, and the latter refers to its parent, that is, the directory in which it is located. So from /usr/jianshe, a file in cxuan's directory can be reached as ../cxuan/xxx.

What does it mean that the Linux file system is independent of disks? Normally, the file systems on different disks are independent of each other. If a directory in one file system needs to reach a file system on another disk, the arrangement in Windows looks like this.

[Figure: two separate file systems, one per disk]

The two file systems stay independent of each other on different disks.

Linux supports mounting, which attaches the file system on one disk to a directory in the file system on another, so the relationship above becomes the following.

[Figure: the second disk's file system mounted into the first disk's directory tree]

After mounting, users no longer need to care which disk a file is on; both file systems are visible in a single tree.

Another feature of the Linux file system is support for locking. In some applications, two or more processes use the same file at the same time, which can lead to race conditions. One solution is to offer locks of different granularity, so that a process that needs to modify a single record does not make the whole file unavailable.

POSIX provides a flexible, fine-grained locking mechanism that allows a process to lock anything from a single byte to an entire file in one atomic operation. The process placing the lock must specify the file to lock, the starting position, and the number of bytes to lock.

Linux provides two kinds of locks: shared locks and exclusive locks. If part of a file already holds a shared lock, a second shared lock on it is allowed, but an attempt to place an exclusive lock on it fails. If part of a file holds an exclusive lock, every attempt to lock any of it fails until that lock is released. For a lock request to succeed, every byte in the requested region must be available.

When placing a lock, a process must decide what happens if the lock cannot be granted, that is, whether to block. If it chooses to block, it is suspended until the existing lock is removed, and then its own lock is placed. If it chooses not to block, the system call returns immediately with a status code indicating whether the lock succeeded, and the process can decide whether to try again later.
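The three pieces of information a lock request needs (file, start position, byte count) and the choice between blocking and not blocking map onto the POSIX fcntl call roughly as follows. This is a minimal sketch: lock_range and demo_lock are hypothetical names, and the scratch file under /tmp is an assumption made for the demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: lock `len` bytes of `fd` starting at offset `start`.
 * type is F_RDLCK (shared), F_WRLCK (exclusive) or F_UNLCK (release).
 * If wait is nonzero the call blocks until the lock can be placed;
 * otherwise it returns -1 at once when the range is unavailable. */
int lock_range(int fd, short type, off_t start, off_t len, int wait)
{
    struct flock fl = {0};

    assert(start >= 0 && len >= 0); /* len 0 would mean "to end of file" */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;         /* start is measured from the beginning */
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
}

/* Place a shared lock on bytes 4..7 of a scratch file, then release it.
 * Returns 0 on success, -1 on any failure. */
int demo_lock(void)
{
    char path[] = "/tmp/lockdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    int rc = lock_range(fd, F_RDLCK, 4, 4, 0);   /* non-blocking attempt */
    if (rc == 0)
        rc = lock_range(fd, F_UNLCK, 4, 4, 0);
    close(fd);
    unlink(path);
    return rc;
}
```

Passing a nonzero wait selects F_SETLKW, the blocking variant; with F_SETLK a refused request typically sets errno to EAGAIN or EACCES.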

Locked regions can overlap. The three cases below show locked regions under different conditions.

[Figure: a file with one, two, and three overlapping shared locks]

In the first case, process A holds a shared lock on bytes 4 through 8.

In the second case, processes A and B both hold shared locks, and bytes 6 through 8 are covered by both.

In the third case, processes A, B, and C all hold shared locks, so bytes 6 and 7 are covered by all three.

If a process now tries to place an exclusive lock on byte 6, the request fails, and the process blocks if it asked to wait. Because that byte is locked by A, B, and C at the same time, the request cannot succeed until A, B, and C have all released their locks.
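The interaction between shared and exclusive locks can be observed with two processes. The sketch below is illustrative only: the names try_lock_byte and demo_lock_conflict are hypothetical, and it uses one lock holder instead of the three processes A, B, and C from the text. The parent holds a shared lock on bytes 4 through 8, and a child process then requests locks on byte 6 without blocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to lock a single byte without blocking; returns 0 or -1. */
static int try_lock_byte(int fd, short type, off_t byte)
{
    struct flock fl = {0};

    assert(byte >= 0);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = byte;
    fl.l_len = 1;
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 0 if the child's shared request on byte 6 was granted and its
 * exclusive request was refused, -1 otherwise. */
int demo_lock_conflict(void)
{
    char path[] = "/tmp/lockdemoXXXXXX";
    int result = -1;
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_RDLCK;            /* parent: shared lock on bytes 4..8 */
    fl.l_whence = SEEK_SET;
    fl.l_start = 4;
    fl.l_len = 5;

    if (fcntl(fd, F_SETLK, &fl) == 0) {
        pid_t pid = fork();
        if (pid == 0) {
            /* These locks belong to a process and are not inherited
             * across fork, so the child competes with the parent. */
            int shared_ok = try_lock_byte(fd, F_RDLCK, 6) == 0;
            int excl_refused = try_lock_byte(fd, F_WRLCK, 6) == -1 &&
                               (errno == EAGAIN || errno == EACCES);
            _exit(shared_ok && excl_refused ? 0 : 1);
        }
        int status = 0;
        if (pid > 0 && waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            result = 0;
    }
    close(fd);
    unlink(path);
    return result;
}
```

Had the child used F_SETLKW for the exclusive request, it would have blocked until the parent released its lock instead of failing at once.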

Linux file system calls

Many system calls deal with files and the file system. Let's first look at the system calls that operate on individual files, and then at those that operate on whole directories and file systems.

To create a new file, use the creat system call. Note that there is no e at the end.

❝ An anecdote: Ken Thompson, one of the creators of UNIX, was once asked what he would do differently if he had the chance to redo UNIX. He replied that he would spell creat with an e. ❞

The two arguments to this system call are the file name and the protection mode:

fd = creat("aaa", mode);

This call creates a file called aaa and sets its protection bits according to mode. These bits determine which users may access the file and how.

The creat system call not only creates the file aaa but also opens it for writing. So that subsequent system calls can access the file, creat returns a non-negative integer called a file descriptor, which is the fd above.

If creat is called on an existing file, the file is truncated to length 0 and its previous contents are discarded. The open system call can also create files when given the appropriate flags.
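The truncating behavior of creat can be seen in a short program. The sketch below is a minimal illustration, with demo_creat as a hypothetical name and a scratch directory under /tmp assumed to be writable; the mode 0644 is just an example value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file with creat, write 5 bytes, then call creat on it again.
 * Returns the file size after the second creat (expected 0, because creat
 * truncates an existing file), or -1 on error. */
long demo_creat(void)
{
    char dir[] = "/tmp/creatdemoXXXXXX";
    char path[64];
    struct stat st;
    long size = -1;

    assert(sizeof path > sizeof dir + 4);   /* room for "/aaa" */
    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/aaa", dir);

    int fd = creat(path, 0644);     /* owner read/write, others read-only */
    if (fd >= 0) {
        ssize_t written = write(fd, "hello", 5);
        close(fd);

        fd = creat(path, 0644);     /* file exists: contents are discarded */
        if (fd >= 0) {
            if (written == 5 && fstat(fd, &st) == 0)
                size = (long)st.st_size;
            close(fd);
        }
        unlink(path);
    }
    rmdir(dir);
    return size;
}
```

The same effect is available through open with the flags O_WRONLY | O_CREAT | O_TRUNC, which is how creat is defined in POSIX.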

The main file-related system calls are listed in the following table.

System call                              Description
fd = creat(name, mode)                   Create a new file
fd = open(file, ...)                     Open a file for reading, writing, or both
s = close(fd)                            Close an open file
n = read(fd, buffer, nbytes)             Read data from a file into a buffer
n = write(fd, buffer, nbytes)            Write data from a buffer to a file
position = lseek(fd, offset, whence)     Move the file pointer
s = stat(name, &buf)                     Get file information
s = fstat(fd, &buf)                      Get file information
s = pipe(&fd[0])                         Create a pipe
s = fcntl(fd, ...)                       File locking and other operations

To read or write a file, you must first open it with creat or open. An argument to open specifies whether the file is opened read-only, write-only, or read-write. The open system call also returns a file descriptor. When the file is no longer needed, it should be closed with the close system call. creat and open always return the lowest-numbered file descriptor that is not currently in use.
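The "lowest unused descriptor" rule is easy to observe: after a descriptor is closed, the next open reuses its number. A minimal sketch follows, assuming a single-threaded process (another thread opening a file in between would change the result). The name reuses_lowest_fd is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path`, close it, and open it again.
 * Returns 1 if the second open returned the same descriptor number as the
 * first, 0 if it did not, and -1 on error. */
int reuses_lowest_fd(const char *path)
{
    assert(path != NULL);

    int first = open(path, O_RDONLY);
    if (first == -1)
        return -1;
    close(first);                   /* the number becomes free again */

    int second = open(path, O_RDONLY);
    if (second == -1)
        return -1;
    close(second);
    return first == second;
}
```

For example, reuses_lowest_fd("/dev/null") would be expected to return 1.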

❝ What is a file descriptor? A file descriptor is a number that identifies an open file within the operating system. It refers to a data resource and to how that resource is accessed. ❞

When a program asks to open a file, the kernel does the following:

Grants access

Creates an entry in the global file table

Gives the location of that entry to the program

A file descriptor is a non-negative integer that is unique within a process, and every open file on the system has at least one file descriptor. File descriptors originated in Unix and are used by modern operating systems including Linux, macOS, and BSD.

When a process successfully opens a file, the kernel returns a file descriptor that refers to an entry in the global file table. That file table entry holds the file's inode information, the byte offset, the access restrictions, and so on.

[Figure: file descriptors referring to entries in the global file table]

By default, the first three file descriptors are STDIN (standard input), STDOUT (standard output), and STDERR (standard error).

The file descriptor for standard input is 0. In a terminal, it defaults to the user's keyboard.

The file descriptor for standard output is 1. In a terminal, it defaults to the user's screen.

The file descriptor for standard error is 2. In a terminal, it also defaults to the user's screen.
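In C, these three numbers are available as the constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO from unistd.h. The sketch below writes directly to descriptors 1 and 2 with the write system call; say and demo_std_fds are hypothetical helper names.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Write msg to the given descriptor with a plain write call.
 * Returns the number of bytes written, or -1 on error. */
ssize_t say(int fd, const char *msg)
{
    assert(msg != NULL);
    return write(fd, msg, strlen(msg));
}

/* Send one line to standard output and one to standard error.
 * Returns the total number of bytes written, or -1 if a write failed. */
ssize_t demo_std_fds(void)
{
    ssize_t a = say(STDOUT_FILENO, "to standard output (fd 1)\n");
    ssize_t b = say(STDERR_FILENO, "to standard error (fd 2)\n");
    return (a < 0 || b < 0) ? -1 : a + b;
}
```

Running the program with its output redirected, for instance with 2>/dev/null in a shell, shows that the two streams are separate even though both normally reach the screen.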

After a brief chat about file descriptors, let's go back to the discussion of file system calls.

Of all the file system calls, read and write are the most heavily used. Both take three parameters:

File descriptor: which open file to read from or write to

Buffer address: where in memory the data is to be placed (read) or taken from (write)

Count: how many bytes to transfer

Those are all the parameters; the design is simple and lightweight.

Although most programs read and write files sequentially, some need to access arbitrary parts of a file at random. Each open file has an associated pointer that indicates the current position in the file. During sequential reading (or writing), it normally points to the next byte to be read (or written). If the pointer is at position 4096 before 1024 bytes are read, it moves automatically to position 5120 after a successful read call.

The lseek system call changes the value of this pointer, so that subsequent calls to read or write can start anywhere in the file, even beyond the end of the file.

❝ Note that the call is spelled lseek, in lowercase. ❞

It is called lseek rather than seek because the name seek was already taken by an older call used for seeking on 16-bit computers.

lseek has three parameters: the first is the file descriptor, the second is a file position, and the third tells whether that position is relative to the beginning of the file, the current position, or the end of the file:

off_t lseek(int fildes, off_t offset, int whence);

The value returned by lseek is the absolute position in the file after the pointer has been changed. lseek is the only file system call that never causes a real disk seek: all it does is update the current file position, which is just a number in memory.
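The arithmetic in the example above (4096 plus 1024 gives 5120) can be reproduced directly. The following is a minimal sketch using a scratch file; demo_lseek is a hypothetical name, and the 8192-byte file length is an arbitrary choice that simply leaves enough data to read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Seek to offset 4096 in a scratch file, read 1024 bytes, and ask where
 * the file pointer is now.
 * Returns the final offset (expected 5120), or -1 on error. */
off_t demo_lseek(void)
{
    char path[] = "/tmp/seekdemoXXXXXX";
    char buf[1024];
    off_t pos = -1;

    assert(sizeof buf == 1024);     /* the read size used in the text */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    /* Make the file 8192 bytes long so there is data to read at 4096. */
    if (ftruncate(fd, 8192) == 0 &&
        lseek(fd, 4096, SEEK_SET) == 4096 &&
        read(fd, buf, sizeof buf) == (ssize_t)sizeof buf)
        pos = lseek(fd, 0, SEEK_CUR);   /* offset 0 from "current": query */

    close(fd);
    unlink(path);
    return pos;
}
```

The three whence values are SEEK_SET (beginning of the file), SEEK_CUR (current position), and SEEK_END (end of the file).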

For each file, Linux keeps track of the file mode (regular, directory, or special file), size, time of last modification, and other information. Programs can see this information through the stat system call. The first parameter is the file name, and the second is a pointer to a structure in which the requested information is placed. The fields of this structure are listed below.

Device the file is stored on

I-node number

File mode (including protection bits)

Number of links to the file

Identity of the file's owner

Group the file belongs to

File size (in bytes)

Time of last status change (st_ctime, often loosely called the creation time)

Time of last modification

Time of last access

The fstat call is the same as stat, except that it operates on an open file identified by its file descriptor, whereas stat takes a path name.
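The sketch below queries the same scratch file once by name with stat and once by descriptor with fstat, and checks that both describe the same file. The function name demo_stat is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write 5 bytes to a scratch file, then query it by name (stat) and by
 * descriptor (fstat). Returns the file size in bytes if both calls
 * describe the same regular file, -1 otherwise. */
long demo_stat(void)
{
    char path[] = "/tmp/statdemoXXXXXX";
    struct stat by_name, by_fd;
    long size = -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;

    if (write(fd, "hello", 5) == 5 &&
        stat(path, &by_name) == 0 &&
        fstat(fd, &by_fd) == 0 &&
        by_name.st_dev == by_fd.st_dev &&   /* same device ...        */
        by_name.st_ino == by_fd.st_ino &&   /* ... and same i-node    */
        S_ISREG(by_fd.st_mode)) {           /* file mode: regular file */
        assert(by_name.st_size == by_fd.st_size);
        size = (long)by_fd.st_size;
    }
    close(fd);
    unlink(path);
    return size;
}
```

Other fields from the list above are read the same way, for example st_nlink for the link count and st_uid and st_gid for the owner and group.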

The pipe system call is used to create shell pipelines. It creates a kind of pseudofile that buffers the data passing between pipeline components, and it returns file descriptors for both reading and writing that buffer. A pipeline does something like this:

sort
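The buffering role of pipe can be sketched with two processes: a child writes into the pipe and the parent reads from it, much as a shell connects two commands. This is an illustrative sketch only; demo_pipe and the message text are hypothetical, and a real shell would additionally redirect the descriptors onto standard input and output before running each command.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a pipe, let a child process write a message into it, and read
 * the message in the parent. The result is stored NUL-terminated in out.
 * Returns the number of bytes received, or -1 on error. */
ssize_t demo_pipe(char *out, size_t outsize)
{
    const char *msg = "hello through the pipe";
    int fds[2];                     /* fds[0]: read end, fds[1]: write end */

    assert(outsize > strlen(msg));
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: the producer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    close(fds[1]);                  /* parent: the consumer */
    size_t total = 0;
    ssize_t n;
    /* Read until end of file, which arrives once the child has exited
     * and every write end of the pipe is closed. */
    while (total < outsize - 1 &&
           (n = read(fds[0], out + total, outsize - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Closing the unused end in each process matters: if the parent kept its copy of the write end open, its read loop would never see end of file.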
