How to understand the dependencies of the Linux kernel and its related architecture


Updated 2025-07-03. Source: SLTechnology News&Howtos, Shulou (Shulou.com), 06/01 report.

This article explains how the subsystems of the Linux kernel depend on one another and how the kernel's architecture is organized.

There are two reasons for the Linux kernel's success:

A flexible architecture makes it easy for a large number of volunteer developers to join the development process.

Each subsystem, especially those that most need improvement, is highly extensible.

These two properties are what allow the Linux kernel to keep evolving and improving.

I. The position of the Linux kernel in the whole computer system

Principles of hierarchical structure:

Dependencies between subsystems run only from the top down: subsystems pictured near the top of the figure depend on those below them, but subsystems nearer the bottom do not depend on higher layers.

II. The role of the kernel

Virtualization (abstraction): the kernel abstracts the computer hardware into a virtual machine for use by user processes. A running process does not need to know how the hardware works; it only calls the virtual interface provided by the Linux kernel.

Multitasking: multiple tasks use the computer's hardware resources concurrently. The kernel's job is to arbitrate the use of those resources and to create the illusion that each process has the system to itself.

PS: a process context switch means changing the program status word, the contents of the page-table base register, the task_struct instance that current points to, and the program counter (PC). As a consequence, the set of files opened by the process (found through the files field of task_struct) and the memory space the process executes in (found through the mm field of task_struct) are replaced as well.

III. The overall architecture of the Linux kernel

The central subsystem is the process scheduler (SCHED). All other subsystems depend on it because they need to block and resume processes. When a process must wait for a hardware action to complete, the responsible subsystem blocks the process; when the hardware action completes, the subsystem resumes the process. Both blocking and resuming are carried out through the process scheduler.

Each dependency arrow in the figure above has a reason:

The process scheduler depends on the memory manager: when a process resumes execution, the scheduler relies on the memory manager to provide the memory the process runs in.

The IPC subsystem depends on the memory manager: shared memory is an inter-process communication mechanism in which two processes exchange information through the same region of memory.

VFS depends on the network interface: this supports the Network File System (NFS).

VFS depends on the memory manager: this supports ramdisk devices.

The memory manager depends on VFS: to support swapping, processes that are temporarily not running can be swapped out to the swap partition on disk and placed in a suspended state.
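The arrows listed above can be written down as a small graph and queried. The following is only a toy sketch in Python, not kernel code: the names SCHED, MM, IPC, VFS and NET are shorthand chosen here, and the edges are exactly the ones stated in this section (every other subsystem depends on the scheduler, plus the five arrows explained above).

```python
# Toy model of the dependency arrows described in this section (not kernel code).
# Each key depends on every subsystem in its value set.
DEPENDS_ON = {
    "SCHED": {"MM"},                # resuming a process needs memory
    "MM": {"SCHED", "VFS"},         # blocks processes; swaps through VFS
    "IPC": {"SCHED", "MM"},         # shared memory
    "VFS": {"SCHED", "MM", "NET"},  # ramdisk, NFS
    "NET": {"SCHED"},               # waits on hardware
}


def reachable(start):
    """Return every subsystem that `start` depends on, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        for dep in DEPENDS_ON[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def mutually_dependent(a, b):
    """True when a depends on b and b depends on a (a dependency cycle)."""
    return b in reachable(a) and a in reachable(b)


if __name__ == "__main__":
    for name in sorted(DEPENDS_ON):
        print(name, "->", sorted(reachable(name)))
```

Querying the graph makes two points from the text visible: the memory manager and VFS depend on each other, and nothing in this list depends on the IPC subsystem.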

IV. The highly modular design of the system is conducive to division of labor and cooperation

Only a very small number of programmers need to work across multiple modules, and this happens only when one subsystem needs to rely on another.

Hardware device drivers, logical file system modules, network device drivers and network protocol modules are the most extensible parts of the system.

V. Data structures in the system

Task list

The process scheduler maintains a data structure, task_struct, for each process. All processes are managed in a linked list, the task list. The process scheduler also maintains a current pointer to the process that is currently occupying the CPU.
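As a rough illustration of the task list and the current pointer, here is a toy model in Python. The class and field names (Task, TaskList, pid, state) are invented for this sketch; the real task_struct is a C structure with many more fields. The ring-shaped doubly linked list is an assumption that mirrors how the kernel commonly links tasks.

```python
class Task:
    """Toy stand-in for task_struct: one node in a circular doubly linked list."""

    def __init__(self, pid, name):
        self.pid = pid
        self.name = name
        self.state = "RUNNABLE"
        self.next = self    # a single node links to itself
        self.prev = self


class TaskList:
    """The task list plus the `current` pointer kept by the scheduler."""

    def __init__(self, first_task):
        self.head = first_task       # assumed never to be removed
        self.current = first_task    # the task occupying the CPU

    def add(self, task):
        """Insert `task` just before head, i.e. at the tail of the ring."""
        tail = self.head.prev
        tail.next = task
        task.prev = tail
        task.next = self.head
        self.head.prev = task

    def remove(self, task):
        """Unlink a task other than head from the ring."""
        task.prev.next = task.next
        task.next.prev = task.prev
        task.next = task.prev = task

    def __iter__(self):
        node = self.head
        while True:
            yield node
            node = node.next
            if node is self.head:
                break
```

A caller would create the list with an initial task, add further tasks as processes are created, and move `current` whenever a different task is given the CPU.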

Memory map

The memory manager stores each process's mapping from virtual addresses to physical addresses. It also records how specific pages are swapped out and how page faults are handled. This information is stored in the data structure mm_struct. Each process has its own mm_struct, and the process's task_struct contains a pointer, mm, to it.

In mm_struct there is a pointer, pgd, to the process's page directory (that is, the base address of the page directory). When the process is scheduled, this pointer is converted to a physical address and written to the control register CR3 (the page-table base register on the x86 architecture).
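To make the role of pgd more concrete, the sketch below walks a toy two-level page table in Python. It assumes the classic 32-bit x86 layout without PAE (10-bit directory index, 10-bit table index, 12-bit offset for 4 KiB pages); the nested dictionaries and the PageFault exception are illustrative stand-ins, not how the hardware or the kernel stores these tables.

```python
PAGE_SHIFT = 12                     # 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
INDEX_BITS = 10                     # 1024 entries per directory or table
INDEX_MASK = (1 << INDEX_BITS) - 1


class PageFault(Exception):
    """Raised when the toy page table has no mapping for an address."""


def split(vaddr):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    offset = vaddr & (PAGE_SIZE - 1)
    table_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    dir_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    return dir_index, table_index, offset


def translate(pgd, vaddr):
    """Walk a toy page directory: pgd maps dir index -> {table index -> frame number}."""
    dir_index, table_index, offset = split(vaddr)
    try:
        frame = pgd[dir_index][table_index]
    except KeyError:
        raise PageFault(hex(vaddr)) from None
    return (frame << PAGE_SHIFT) | offset


if __name__ == "__main__":
    # Two processes map the same virtual address to different physical frames.
    pgd_a = {32: {72: 0x100}}
    pgd_b = {32: {72: 0x200}}
    print(hex(translate(pgd_a, 0x08048123)), hex(translate(pgd_b, 0x08048123)))
```

Switching from one process to another corresponds to switching which `pgd` the walk starts from, which is what loading CR3 achieves in hardware.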

I-nodes

VFS represents files on disk through inodes; an inode records the physical properties of a file. Each process has a files_struct structure that represents the files opened by the process, reached through the files pointer in task_struct. File sharing is achieved through inodes, and there are two ways to share a file: (1) through the same system open-file entry, which points to one inode; this is what happens between parent and child processes; (2) through different system open-file entries that point to the same inode, for example via hard links, or when two unrelated processes open the same file.
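Both sharing modes can be observed from user space. The following Python sketch is a minimal demonstration under the assumption of a Unix-like system whose temporary directory supports hard links; the file names are arbitrary. It uses `os.dup` to obtain a second descriptor for the same open-file entry, which is also what a child process inherits across `fork`.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number that `path` resolves to."""
    return os.stat(path).st_ino


def demo(directory):
    original = os.path.join(directory, "data.txt")
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as f:
        f.write("hello world")
    os.link(original, alias)        # hard link: a second name for the same inode

    # Case 2: two independent opens of the same inode keep separate offsets.
    fd_a = os.open(original, os.O_RDONLY)
    fd_b = os.open(alias, os.O_RDONLY)
    first_a = os.read(fd_a, 5)
    first_b = os.read(fd_b, 5)

    # Case 1: a duplicated descriptor shares one open-file entry, so it
    # continues from the offset that fd_a already reached.
    fd_c = os.dup(fd_a)
    next_c = os.read(fd_c, 6)

    for fd in (fd_a, fd_b, fd_c):
        os.close(fd)
    return {
        "same_inode": inode_of(original) == inode_of(alias),
        "link_count": os.stat(original).st_nlink,
        "first_a": first_a,
        "first_b": first_b,
        "next_c": next_c,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo(d))
```

The two independent opens both read the first five bytes, while the duplicated descriptor picks up where its sibling stopped, which shows the difference between sharing an inode and sharing an open-file entry.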

Data connection

All data structures in the kernel are rooted in the task list maintained by the process scheduler. The task_struct of each process contains a pointer, mm, to its memory mapping information, a pointer to the files it has opened (the per-process open file table), and a pointer to the network sockets opened by the process.

VI. Subsystem architecture

1. Process scheduler architecture

(1) Goals

The process scheduler is the most important subsystem in the Linux kernel. The system uses it to control access to the CPU: not only access by user processes, but also access by other kernel subsystems.

(2) Modules

The scheduling policy module determines which process gets access to the CPU. The policy should let all processes share the CPU as fairly as possible.

The architecture-specific module provides a unified abstract interface that hides the hardware details of a particular architecture. It interacts with the CPU to block and resume processes; this includes knowing which registers and state information must be saved for each process, and executing the assembly code that performs the block or resume operation.

The architecture-independent module interacts with the scheduling policy module to determine the next process to run, and then calls the architecture-specific code to resume that process. It also calls the memory manager's interface to make sure the memory mapping information of the blocked process is saved correctly.

The system call interface module lets user processes access only the resources that the Linux kernel explicitly exposes to them. A well-defined and largely unchanging set of interfaces (the POSIX standard) decouples user applications from the Linux kernel, so user processes are not affected by kernel changes.
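The split between the policy module and the architecture-independent module can be sketched in a few lines. This is a deliberately simplified Python model, not the Linux scheduler: processes are generators, "resuming" a process means calling `next` on it, and the round-robin policy is just one hypothetical choice that can be swapped for another function.

```python
from collections import deque


def round_robin(ready):
    """Scheduling policy: pick the task at the front of the ready queue."""
    return ready.popleft()


def run(tasks, policy=round_robin):
    """Architecture-independent loop: ask the policy who runs next, resume it,
    and requeue it when it gives up the CPU. `tasks` maps name -> generator."""
    ready = deque(tasks.items())
    trace = []
    while ready:
        name, task = policy(ready)
        try:
            next(task)                   # "resume" until the task yields
            trace.append(name)
            ready.append((name, task))   # still runnable: back of the queue
        except StopIteration:
            pass                         # the task has exited
    return trace


def worker(steps):
    """A toy process that needs `steps` time slices."""
    for _ in range(steps):
        yield


if __name__ == "__main__":
    print(run({"A": worker(2), "B": worker(1)}))
```

Because `run` only calls `policy(ready)`, the decision of who runs next can change without touching the loop that performs the switch, which is the point of keeping policy in its own module.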

(3) Data representation

The scheduler maintains one data structure, the task list, whose elements are the task_struct instances of every active process. Each task_struct contains not only the information needed to block and resume the process, but also additional accounting and state information. This data structure is accessible throughout the kernel layer.

(4) Dependencies, data flow and control flow

As mentioned earlier, the scheduler calls functions provided by the memory manager to set up the appropriate physical memory mapping for the process that is about to resume execution, which is why the process scheduler depends on the memory management subsystem. When other kernel subsystems need to wait for a hardware request to complete, they all rely on the process scheduler to block and resume processes. This dependency shows up as function calls and as access to the shared task list data structure. All kernel subsystems read or write the data structure that represents the currently running process, which creates a two-way data flow throughout the system.

In addition to the data flow and control flow inside the kernel layer, the OS service layer provides an interface for user processes to register timers. This creates a control flow from the scheduler to user processes. Waking up a sleeping process is usually not counted as control flow in the normal sense, because the user process cannot tell when it is woken. Finally, the scheduler interacts with the CPU to block and resume processes, which creates data flow and control flow between them: the CPU interrupts the currently running process and lets the kernel schedule another process to run.

2. Memory manager architecture

(1) Goals

The memory manager controls how processes access physical memory. The mapping between a process's virtual memory and the machine's physical memory is handled by the hardware memory management unit (MMU). Each process has its own independent virtual address space, so two processes may use the same virtual address while actually running in different areas of physical memory. The MMU provides memory protection so that the physical memory of two processes does not interfere. The memory manager also supports swapping: temporarily unused memory pages are moved to the swap partition on disk, which lets a process's virtual address space be larger than physical memory. The size of the virtual address space is determined by the machine word length.
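The claim that two processes can use the same virtual address for different physical memory can be checked from user space. The sketch below is a minimal Python demonstration, assuming a Unix-like system where `os.fork` is available; the function name and the returned dictionary keys are made up for illustration. After `fork`, the child starts with a copy of the parent's address space, so its variable lives at the same virtual address but changes to it are not seen by the parent.

```python
import ctypes
import os


def fork_demo():
    """Fork a child that overwrites a variable; return what each side observed."""
    cell = ctypes.c_int(1)
    parent_addr = ctypes.addressof(cell)
    read_end, write_end = os.pipe()

    pid = os.fork()
    if pid == 0:                         # child: same virtual address, own copy
        os.close(read_end)
        cell.value = 2
        report = f"{ctypes.addressof(cell)} {cell.value}".encode()
        os.write(write_end, report)
        os._exit(0)

    os.close(write_end)                  # parent
    child_addr, child_value = os.read(read_end, 64).decode().split()
    os.close(read_end)
    os.waitpid(pid, 0)
    return {
        "same_virtual_address": int(child_addr) == parent_addr,
        "child_value": int(child_value),
        "parent_value": cell.value,
    }


if __name__ == "__main__":
    print(fork_demo())
```

The child reports the same address as the parent but a different value, while the parent's value is unchanged: one virtual address, two separate pieces of physical memory.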

(2) Modules

The architecture-specific module provides a virtual interface for accessing physical memory.

The architecture-independent module is responsible for each process's address mapping and for virtual memory swapping. When a page fault occurs, this module decides which memory page should be swapped out. Because this page replacement algorithm rarely needs to change, no separate policy module was created for it.

The system call interface gives user processes restricted access: allocating and freeing memory (malloc and free, which the C library implements on top of system calls such as brk and mmap) and memory-mapped file operations (mmap and munmap).
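As a small usage sketch of the memory-mapping interface, the Python standard library's `mmap` module wraps the mmap system call. The helper below, whose name and file contents are arbitrary, maps a file, changes it with an ordinary memory write, and reads the result back through the normal file interface.

```python
import mmap
import os
import tempfile


def mmap_roundtrip(path):
    """Map a file into memory, modify it through the mapping, and read it back."""
    with open(path, "wb") as f:
        f.write(b"hello kernel")

    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as view:   # length 0 maps the whole file
            view[0:5] = b"HELLO"                 # a plain memory write
            view.flush()                         # write dirty pages back

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(mmap_roundtrip(os.path.join(d, "mapped.bin")))
```

No explicit write call is made on the file; the change reaches it because the memory manager backs the mapped pages with the file, which is the dependency on the file system described in this section.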

(3) Data representation

The memory manager stores the mapping from virtual memory to physical memory for each process. This mapping is stored in an instance of the mm_struct structure, and a pointer to that instance is stored in each process's task_struct. Besides the mapping itself, the structure also records how the memory manager should fetch and store pages. For example, executable code can use the executable image as its backing store, whereas dynamically allocated data must be backed by the system paging area (swap).

Finally, the memory manager also stores permission and accounting information to keep the system secure.

(4) Dependencies, data flow and control flow

The memory manager controls physical memory and receives a hardware notification (a page fault interrupt) when a page fault occurs. This means there is two-way data flow and control flow between the memory manager and the memory management hardware. The memory manager also relies on the file system to support swapping and memory mapping, so it needs to call procedures provided by the file system to store memory pages to disk and fetch them back. Because file system requests are very slow, the memory manager puts the process to sleep while it waits for a page to be swapped in, which requires it to call the process scheduler's interface. Because the memory mapping of each process is stored in the process scheduler's data structures, there is also two-way data flow and control flow between the memory manager and the process scheduler. A user process can set up a new address space and can be notified of page faults, which requires control flow from the memory manager. Generally there is no data flow from user processes to the memory manager, but a user process can obtain some information from the memory manager through selected system calls.
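The page-fault path described above (a miss, choosing a victim page to swap out, then bringing the wanted page in) can be sketched with a toy simulator. The Python class below uses least-recently-used eviction purely as an illustrative assumption; the article does not say which replacement algorithm Linux uses, and the real kernel's approach is more elaborate. The class and attribute names are invented for this sketch.

```python
from collections import OrderedDict


class LruMemory:
    """Toy physical memory with a fixed number of frames and LRU eviction."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()   # page -> None, least recently used first
        self.faults = 0
        self.evicted = []

    def access(self, page):
        if page in self.resident:
            self.resident.move_to_end(page)      # hit: mark most recently used
            return
        self.faults += 1                         # miss: a "page fault"
        if len(self.resident) >= self.frames:
            victim, _ = self.resident.popitem(last=False)
            self.evicted.append(victim)          # would be written to swap
        self.resident[page] = None               # would be read from disk


if __name__ == "__main__":
    memory = LruMemory(frames=3)
    for page in [1, 2, 3, 1, 4, 2]:
        memory.access(page)
    print(memory.faults, memory.evicted, list(memory.resident))
```

In the real system each miss is also the point where the faulting process is put to sleep through the scheduler while the file system fetches the page.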

3. Virtual file system architecture

(1) Goals

The virtual file system provides a unified access interface to data stored on hardware devices and is compatible with different file systems (ext2, ext4, NTFS, etc.). Almost all hardware devices in the computer are represented through a generic device driver interface. Logical file systems promote compatibility with other operating system standards and let developers implement file systems with different policies. The virtual file system goes a step further and lets system administrators mount any logical file system on any device. It encapsulates the details of physical devices and logical file systems, and lets user processes access files through one unified interface.

In addition to the traditional file system goals, VFS is also responsible for loading new executable programs. This task is carried out by the logical file system modules, which is what lets Linux support several executable formats.

(2) Modules

Device driver modules.

The device-independent interface module provides the same view of all devices.

Logical file system modules: one for each supported file system.

The system-independent interface provides interfaces that are independent of both the hardware and the logical file system. This module presents all resources through block device nodes or character device nodes.

The system call interface gives user processes unified, controlled access to the file system. The virtual file system hides all special features from user processes.

(3) Data representation

All files are represented by inodes. Each inode records where a file is located on the hardware device. An inode also stores function pointers into the logical file system module and the device driver that can perform the actual read and write operations. By storing function pointers this way (the same idea as virtual functions in object-oriented programming), specific logical file systems and device drivers can register themselves with the kernel without the kernel having to depend on the features of any particular module.
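The function-pointer idea can be illustrated with a toy registry in Python. Everything here is hypothetical: `register_filesystem`, `vfs_read`, and the two file systems "plainfs" and "reversefs" are names invented for the sketch, and a dictionary of callables stands in for the kernel's C structures of function pointers.

```python
REGISTERED = {}     # file system name -> operations table


def register_filesystem(name, ops):
    """Let a logical file system announce itself; the core never imports it."""
    REGISTERED[name] = ops


class Inode:
    """Toy inode: the file's data plus a pointer to its operations table."""

    def __init__(self, fs_name, data):
        self.ops = REGISTERED[fs_name]
        self.data = data


def vfs_read(inode):
    """Generic entry point: dispatch through the inode's function pointers."""
    return inode.ops["read"](inode)


# Two hypothetical file systems that store the same bytes differently.
register_filesystem("plainfs", {"read": lambda inode: inode.data})
register_filesystem("reversefs", {"read": lambda inode: inode.data[::-1]})


if __name__ == "__main__":
    print(vfs_read(Inode("plainfs", b"abc")), vfs_read(Inode("reversefs", b"abc")))
```

`vfs_read` contains no knowledge of either file system, so a third one could be registered without changing it, which mirrors how modules plug into the kernel.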

(4) Dependencies, data flow and control flow

One special device driver is the ramdisk, which sets aside an area of main memory and treats it as a persistent storage device. This driver uses the memory manager to do its work, so VFS has a dependency on the memory manager, along with data flow and control flow between them (note: the dependency in the figure is drawn reversed; it should show VFS depending on the memory manager).

The logical file system supports network file systems, which access files on another machine just as they would local files. To do this, a logical file system works through the network subsystem, which introduces a dependency of VFS on the network subsystem, along with control flow and data flow between them.

As mentioned earlier, the memory manager uses VFS for swapping and memory mapping. In addition, when VFS waits for a hardware request to complete, it uses the process scheduler to block the process; when the request completes, VFS wakes the process through the process scheduler. Finally, the system call interface lets user processes call in to access data. Unlike the previous subsystems, VFS provides no mechanism for users to register for implicit invocation, so there is no control flow from VFS to user processes.

4. Network interface architecture

(1) Goals

The network subsystem lets a Linux system connect to other systems over a network. It supports many hardware devices and many network protocols. The network subsystem hides the implementation details of hardware and protocols behind a simple, easy-to-use interface, so user processes and other subsystems do not need to know those details.

(2) Modules

Network device driver modules.

The device-independent interface module provides a consistent access interface to all hardware devices, so higher-level subsystems do not need to know hardware details.

The network protocol modules implement each network protocol, such as TCP, UDP, IP and ARP.

The protocol-independent interface provides a consistent interface that does not depend on a specific protocol or hardware device. This lets the other kernel subsystems use the network without relying on specific protocols or devices.

The system call interface module defines the network programming API that user processes can access.

(3) Data representation

Each network object is represented as a socket. Sockets are associated with processes in the same way as inodes. A socket can be shared by multiple processes when their task_struct structures point to the same socket.
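Socket sharing between processes can be seen from user space as well. The sketch below is a minimal Python demonstration that assumes a Unix-like system with `os.fork`; the function name and message are arbitrary. After the fork, parent and child both hold descriptors that refer to the same underlying sockets, and the child uses its inherited descriptor to send data to the parent.

```python
import os
import socket


def shared_socket_demo():
    """Parent and child both hold descriptors for the same socket pair."""
    parent_end, child_end = socket.socketpair()

    pid = os.fork()
    if pid == 0:                         # child inherits both descriptors
        parent_end.close()
        child_end.sendall(b"hello from child")
        child_end.close()
        os._exit(0)

    child_end.close()                    # parent keeps only its own end
    chunks = []
    while True:
        data = parent_end.recv(1024)
        if not data:                     # empty read: the child closed its end
            break
        chunks.append(data)
    parent_end.close()
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(shared_socket_demo())
```

Each process closes the end it does not use; until then, the same socket is reachable from two processes, just as one inode can be reachable from two open file tables.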

(4) Data flow, control flow and dependencies

When the network subsystem needs to wait for a hardware request to complete, it blocks and wakes processes through the process scheduler, which creates control flow and data flow between the network subsystem and the process scheduler. In addition, the virtual file system implements the network file system (NFS) through the network subsystem, which creates data flow and control flow between VFS and the network subsystem.

VII. Conclusion

1. The Linux kernel is a layer in the whole Linux system. Conceptually, the kernel consists of five main subsystems: process scheduler module, memory management module, virtual file system, network interface module and inter-process communication module. These modules interact with each other through function calls and shared data structures.

2. The architecture of the Linux kernel has contributed to its success: it lets a large number of volunteer developers work together effectively and makes each specific module easy to extend.

Extensibility 1: the Linux architecture makes these subsystems extensible through data abstraction. Each hardware device driver is implemented as a separate module that supports a unified interface provided by the kernel. In this way, individual developers can add new device drivers to the Linux kernel with minimal interaction with other kernel developers.

Extensibility 2: the Linux kernel supports a variety of architectures. In each subsystem, the architecture-specific code is separated into its own module. When a manufacturer launches a new chip, its kernel development team only needs to re-implement the machine-specific code so that the kernel can be ported to that chip.
