Detailed Introduction to Linux User Space and Kernel Address Space

2025-07-09 Update From: SLTechnology News&Howtos

The Linux operating system and its drivers run in kernel space, while applications run in user space. The two cannot exchange data simply by passing pointers: because Linux uses virtual memory, data in user space may have been swapped out, so when kernel code uses a user-space pointer the corresponding data may not be in memory.

Linux kernel address mapping model

The x86 CPU uses a segment-page address mapping model. Addresses in process code are logical addresses, and physical memory can be accessed only after segment and page address translation.
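To make the paging half of this translation concrete, here is a minimal sketch of how a 32-bit linear address is split on classic two-level (non-PAE) x86 paging. The 10/10/12 bit layout is standard for that mode; the helper names are chosen for illustration and are not copied from any kernel header.

```c
#include <stdint.h>

/* Classic non-PAE 32-bit x86 paging: 10 + 10 + 12 bits. */
#define PAGE_SHIFT   12
#define PGDIR_SHIFT  22
#define PTRS_PER_PTE 1024

/* Index into the page directory (top 10 bits). */
static inline uint32_t pgd_index(uint32_t linear)
{
    return linear >> PGDIR_SHIFT;
}

/* Index into the page table selected by the directory entry (middle 10 bits). */
static inline uint32_t pte_index(uint32_t linear)
{
    return (linear >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Byte offset inside the 4 KB page (low 12 bits). */
static inline uint32_t page_offset(uint32_t linear)
{
    return linear & ((1u << PAGE_SHIFT) - 1);
}
```

For the linear address 0xC0100ABC, for instance, the directory index is 768, the table index is 256 and the offset is 0xABC.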

The segment-page mechanism is shown in the following figure.

Linux kernel address space partition

Usually, the 32-bit Linux address space is divided into user space (the lower 3G) and kernel space (the upper 1G). Note that this is the partition used by the 32-bit kernel; the 64-bit kernel partitions its address space differently.
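The split can be expressed as a simple boundary test. The following is only a sketch that assumes the default 3G/1G split with the boundary at 0xC0000000; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u  /* start of kernel space in the 3G/1G split */

/* True if the virtual address lies in the kernel part of the address space. */
static inline bool is_kernel_address(uint32_t vaddr)
{
    return vaddr >= PAGE_OFFSET;
}

/* Size of user space in bytes: everything below PAGE_OFFSET. */
static inline uint64_t user_space_bytes(void)
{
    return PAGE_OFFSET;
}

/* Size of kernel space in bytes: from PAGE_OFFSET up to 4G. */
static inline uint64_t kernel_space_bytes(void)
{
    return (1ull << 32) - PAGE_OFFSET;
}
```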

The Origin of High-end memory in Linux Kernel

When kernel module code or a kernel thread accesses memory, the addresses in the code are logical addresses, which must be mapped one-to-one onto real physical addresses. For example, logical address 0xc0000003 corresponds to physical address 0x3, logical address 0xc0000004 corresponds to physical address 0x4, and so on. The relationship between the logical address and the physical address is:

Physical address = logical address - 0xC0000000

Suppose the kernel used only this simple mapping. The kernel logical address space is 0xc0000000 ~ 0xffffffff, so the corresponding physical memory range is 0x0 ~ 0x40000000, that is, only 1 GB of physical memory can be accessed. If 8 GB of physical memory is installed in the machine, the kernel can reach only the first 1 GB, and the remaining 7 GB is inaccessible, because the whole kernel address space has already been mapped to the physical range 0x0 ~ 0x40000000. How could the kernel access the memory at physical address 0x40000001? The code would need a logical address for it, but the range 0xc0000000 ~ 0xffffffff is already used up, so memory beyond physical address 0x40000000 cannot be accessed.
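The arithmetic behind this limit can be checked with a few lines of C. This is an illustrative sketch of the simple mapping described above, not kernel code; the function names are made up for the example.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u

/* physical = logical - 0xC0000000; meaningful only for kernel logical addresses. */
static inline uint32_t logical_to_physical(uint32_t logical)
{
    return logical - PAGE_OFFSET;
}

/* Bytes of physical memory reachable if the whole kernel range used this mapping. */
static inline uint64_t directly_mappable_bytes(void)
{
    return 0x100000000ull - PAGE_OFFSET;
}

/* Bytes of installed RAM that the simple mapping could never reach. */
static inline uint64_t unreachable_bytes(uint64_t installed)
{
    uint64_t reach = directly_mappable_bytes();
    return installed > reach ? installed - reach : 0;
}
```

With 8 GB installed, unreachable_bytes() reports 7 GB, which matches the example in the text.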

Obviously, the kernel address space 0xc0000000 ~ 0xffffffff cannot be used entirely for this simple address mapping. Therefore, on the x86 architecture, the kernel divides memory into three zones: ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM. ZONE_HIGHMEM is high-end memory, and this is the origin of the concept.

In the x86 architecture, the three zones are as follows:

ZONE_DMA: the first 16MB of memory

ZONE_NORMAL: 16MB ~ 896MB

ZONE_HIGHMEM: 896MB ~ end of memory
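The zone boundaries listed above can be written as a small classifier. This is a sketch that uses only the 16MB and 896MB limits from the text; zone_of() and the MB() helper are hypothetical names.

```c
#include <stdint.h>

enum zone { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM };

#define MB(x) ((uint64_t)(x) << 20)

/* Classify a physical address using the 32-bit x86 zone limits from the text. */
static inline enum zone zone_of(uint64_t phys)
{
    if (phys < MB(16))
        return ZONE_DMA;
    if (phys < MB(896))
        return ZONE_NORMAL;
    return ZONE_HIGHMEM;
}
```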

Understanding of High-end memory in Linux Kernel

Earlier we explained the origin of high-end memory. Linux uses three zones, ZONE_DMA, ZONE_NORMAL and ZONE_HIGHMEM, and the high-end memory (HIGH_MEM) address space ranges from 0xF8000000 to 0xFFFFFFFF (896MB ~ 1024MB of the kernel's 1G). So how does the kernel access all physical memory with only 128MB of high-end memory address space?

When the kernel wants to access memory above the 896MB physical address, it finds a free logical address range of the required size within 0xF8000000 ~ 0xFFFFFFFF and borrows it for a while. It creates a mapping from this borrowed range to the physical memory it wants to access (that is, it fills in the kernel PTE page table), uses it temporarily, and returns it afterwards. Other code can then use the same address range to access other physical memory, so a limited address space is enough to reach all physical memory. This is shown in the following figure.

For example, suppose the kernel wants to access 1MB of physical memory starting at 2G, that is, the physical address range 0x80000000 ~ 0x800FFFFF. Before the access, it finds a free 1MB range of address space. Assume the free range found is 0xF8700000 ~ 0xF87FFFFF; this 1MB of logical address space is then mapped to the physical range 0x80000000 ~ 0x800FFFFF. The mapping relationship is as follows:

Once the kernel has finished accessing the physical memory at 0x80000000 ~ 0x800FFFFF, it releases the kernel linear range 0xF8700000 ~ 0xF87FFFFF. Other processes or code can then use the addresses 0xF8700000 ~ 0xF87FFFFF to access other physical memory.

From the above description, we can know the most basic idea of high-end memory: borrow a section of address space, establish a temporary address mapping, and release it after use, so that this address space can be recycled to access all physical memory.
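The borrow, map, use and release cycle can be modeled in user space with a toy slot allocator. This is only a sketch of the idea: the four 1MB slots, the bookkeeping arrays and the toy_map()/toy_unmap() names are assumptions for illustration, and no real page tables are touched.

```c
#include <stdint.h>

#define WINDOW_BASE 0xF8000000u   /* start of the borrowed window from the text */
#define SLOT_SIZE   0x00100000u   /* 1 MB per slot, purely illustrative */
#define NUM_SLOTS   4             /* tiny on purpose, to show exhaustion */

static uint32_t slot_phys[NUM_SLOTS];   /* physical base "mapped" by each slot */
static int      slot_used[NUM_SLOTS];   /* 1 while the slot is borrowed */

/* Borrow a free slot and record the mapping to phys.
 * Returns the slot's virtual base, or 0 if every slot is taken. */
static inline uint32_t toy_map(uint32_t phys)
{
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = 1;
            slot_phys[i] = phys;
            return WINDOW_BASE + (uint32_t)i * SLOT_SIZE;
        }
    }
    return 0;
}

/* Return the slot so that other code can reuse the same virtual range. */
static inline void toy_unmap(uint32_t vaddr)
{
    uint32_t i = (vaddr - WINDOW_BASE) / SLOT_SIZE;
    if (vaddr >= WINDOW_BASE && i < NUM_SLOTS)
        slot_used[i] = 0;
}
```

If every caller keeps its slot, toy_map() starts returning 0, which is the exhaustion scenario discussed next; after a toy_unmap() the same virtual range can serve a different physical address.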

At this point some readers will ask: what if a kernel process or module keeps occupying a certain logical address range and never releases it? If that happens, the kernel's high-end memory address space becomes tighter and tighter, and physical memory that can no longer get a mapping cannot be accessed.

In some office buildings in Tsim Sha Tsui, Hong Kong, toilets are scarce and kept locked. A customer who wants to use the bathroom gets the key from the front desk and returns it after use. Although there is only one bathroom, it can serve all customers. If one customer keeps occupying the bathroom and never returns the key, no other customer can use it. The idea behind high-end memory management in the Linux kernel is similar.

Division of High-end memory in Linux Kernel

The kernel divides high-end memory into three parts: VMALLOC_START ~ VMALLOC_END, PKMAP_BASE ~ FIXADDR_START, and FIXADDR_START ~ 4G.

For high-end memory, you can obtain the corresponding page through alloc_page() or similar functions, but to access the actual physical memory you must convert the page to a linear address. (Why? Think about how the MMU accesses physical memory.) In other words, a linear address range has to be found for the page in high-end memory, a process called high-end memory mapping.

Corresponding to the three parts of high-end memory, there are three ways to map high-end memory:

Map to the Kernel dynamic Mapping Space (noncontiguous memory allocation)

This approach is simple: when vmalloc() requests memory in the "kernel dynamic mapping space", the pages it obtains may come from high-end memory (see the implementation of vmalloc), so high-end memory may end up mapped into the "kernel dynamic mapping space".

Persistent kernel mapping (permanent kernel mapping)

If you get the page corresponding to the high-end memory through alloc_page (), how do you find a linear space for it?

The kernel sets aside a linear address range for this purpose, from PKMAP_BASE to FIXADDR_START, to map high-end memory. On the 2.6 kernel, this range runs from 4G-8M to 4G-4M. It is called the "kernel permanent mapping space" or "permanent kernel mapping space". This space uses the same page directory as other spaces: for the kernel that is swapper_pg_dir, and for ordinary processes it is the one pointed to by the CR3 register. The space is typically 4M in size, so it needs only one page table, which the kernel locates through pkmap_page_table. kmap() maps a page into this space. Because the space is 4M, it can map at most 1024 pages at the same time. A page that is no longer needed should therefore be unmapped from this space: kunmap() releases the linear address that corresponds to the page.
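The figure of 1024 pages follows directly from the window size. The sketch below uses the 4G-8M base and 4M size quoted above and assumes 4 KB pages; the lower-case helpers are illustrative and only loosely modeled on the kernel's PKMAP macros.

```c
#include <stdint.h>

#define PAGE_SHIFT 12            /* 4 KB pages (assumed) */
#define PKMAP_BASE 0xFF800000u   /* 4G - 8M, per the text */
#define PKMAP_SIZE 0x00400000u   /* 4 MB window, per the text */
#define LAST_PKMAP (PKMAP_SIZE >> PAGE_SHIFT)   /* number of slots: 1024 */

/* Virtual address of the nr-th slot in the permanent mapping window. */
static inline uint32_t pkmap_addr(uint32_t nr)
{
    return PKMAP_BASE + (nr << PAGE_SHIFT);
}

/* Inverse: which slot a virtual address inside the window belongs to. */
static inline uint32_t pkmap_nr(uint32_t vaddr)
{
    return (vaddr - PKMAP_BASE) >> PAGE_SHIFT;
}
```

With 4-byte page table entries, 1024 slots fill exactly one 4 KB page table, which is why a single table is enough for this window.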

Temporary mapping (temporary kernel mapping)

The kernel reserves some linear space between FIXADDR_START and FIXADDR_TOP for special needs. This space is called "fixed mapping space". In this space, part of it is used for temporary mapping of high-end memory.

This space has the following characteristics:

(1) Each CPU has its own piece of this space.

(2) The space belonging to each CPU is divided into several small slots. Each slot is 1 page in size and serves one purpose; the purposes are defined by km_type in kmap_types.h.

When you want to make a temporary mapping, you need to specify the purpose of the mapping. The purpose selects the corresponding slot, and the address of that slot is used as the mapping address. This means that a temporary mapping overwrites the previous mapping in the same slot. Temporary mapping is done through kmap_atomic().
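How a purpose and a CPU number select one fixed slot can be sketched as below. The FIXADDR_TOP and FIX_KMAP_BEGIN values and the shortened km_type list are assumptions made for this example; real kernels define more types and configuration-dependent constants.

```c
#include <stdint.h>

#define PAGE_SHIFT     12
#define FIXADDR_TOP    0xFFFFF000u  /* assumed top of the fixed mapping area */
#define FIX_KMAP_BEGIN 1u           /* assumed index of the first kmap slot */

/* Shortened, hypothetical list of purposes; KM_TYPE_NR counts them. */
enum km_type { KM_BOUNCE_READ, KM_USER0, KM_USER1, KM_TYPE_NR };

/* Fixed-map indexes grow downward from FIXADDR_TOP, one page per index. */
static inline uint32_t fix_to_virt(uint32_t idx)
{
    return FIXADDR_TOP - (idx << PAGE_SHIFT);
}

/* Each CPU owns KM_TYPE_NR consecutive slots; the purpose picks one of them. */
static inline uint32_t kmap_atomic_slot(uint32_t cpu, enum km_type type)
{
    uint32_t idx = (uint32_t)type + KM_TYPE_NR * cpu;
    return fix_to_virt(FIX_KMAP_BEGIN + idx);
}
```

Because the address depends only on the CPU and the purpose, mapping a second page for the same purpose on the same CPU reuses the slot and replaces the earlier mapping.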

Frequently asked questions:

1. Is there a high-end memory concept in user space (process)?

User processes have no concept of high-end memory; high-end memory exists only in kernel space. On a 32-bit system a user process can address at most 3G of memory, while kernel code can access all physical memory.

2. Is there high-end memory in the 64-bit kernel?

In practice, there is no high-end memory in the 64-bit Linux kernel, because the 64-bit kernel can support more than 512GB of memory. High-end memory would be needed only if the physical memory installed on the machine exceeded the range of the kernel address space.

3. How much physical memory can the user process access? How much physical memory can kernel code access?

On a 32-bit system, a user process can access up to 3GB, and kernel code can access all physical memory.

On a 64-bit system, a user process can access more than 512GB, and kernel code can access all physical memory.

4. What is the relationship between high-end memory and physical addresses, logical addresses and linear addresses?

High-end memory is related only to linear addresses; it is not directly related to logical addresses or physical addresses.

5. Why not allocate all the address space to the kernel?

If all the address space were given to the kernel, how would user processes use memory? And how could we ensure that the kernel's use of memory does not conflict with that of user processes?

(1) Let's ignore Linux's support for segmented memory mapping. In protected mode, whether the CPU runs in user mode or kernel mode, every address issued by the code it executes is a virtual address. The MMU reads the value of the control register CR3 as the pointer to the current page directory and then converts the virtual address to the real physical address according to the paging mechanism (see the relevant documentation), so that the CPU can access the real physical address.

(2) For 32-bit Linux, each process has a 4G address space. When a process accesses an address in its virtual memory space, how is it kept from being confused with the virtual space of other processes? Each process has its own page directory (PGD), and Linux stores the pointer to that directory in the memory structure reachable from the process's task_struct, as (struct mm_struct) mm->pgd. Whenever a process is scheduled (schedule()) and is about to enter the running state, the Linux kernel loads CR3 with the process's PGD pointer (switch_mm()).

(3) When a new process is created, a new page directory PGD is created for it, and the kernel page directory entries are copied from the kernel page directory swapper_pg_dir to the corresponding location in the new process's PGD. The call path is as follows:

do_fork() -> copy_mm() -> mm_init() -> pgd_alloc() -> set_pgd_fast() -> get_pgd_slow() -> memcpy(&PGD + USER_PTRS_PER_PGD, swapper_pg_dir + USER_PTRS_PER_PGD, (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t))

In this way, the page directory of each process is divided into two parts. The first part is "user space", which maps the process's own space (0x0000 0000 ~ 0xBFFF FFFF), that is, 3G bytes of virtual addresses. The second part is "system space", which maps the 1G bytes of virtual addresses at 0xC000 0000 ~ 0xFFFF FFFF. The second part of the page directory is therefore the same for every process in a Linux system. From the process's point of view, each process has 4G bytes of virtual space: the lower 3G bytes are its own user space, and the highest 1G bytes are system space shared with all other processes and the kernel.
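The effect of that memcpy() can be reproduced in user space with plain arrays. This is a simplified model: pgd_t is reduced to a 32-bit integer, the directory has the 1024 entries of non-PAE x86, and toy_pgd_init() is a hypothetical stand-in for the allocation path named above.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t pgd_t;                 /* one 4-byte directory entry (simplified) */

#define PTRS_PER_PGD      1024
#define PGDIR_SHIFT       22
#define PAGE_OFFSET       0xC0000000u
#define USER_PTRS_PER_PGD (PAGE_OFFSET >> PGDIR_SHIFT)   /* 768 user entries */

/* Build a new process directory: empty user part, kernel part copied
 * from the kernel's own directory so that all processes share it. */
static inline void toy_pgd_init(pgd_t *pgd, const pgd_t *swapper_pg_dir)
{
    memset(pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
    memcpy(pgd + USER_PTRS_PER_PGD,
           swapper_pg_dir + USER_PTRS_PER_PGD,
           (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
}
```

Entries 0 to 767 cover 0 ~ 3G and stay private to the process, while entries 768 to 1023 cover 3G ~ 4G and are identical in every directory.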

(4) Now suppose we have the following scenario:

In process A, the hostname of the computer on the network is set through the system call sethostname(const char *name, size_t len).

This scenario necessarily involves passing data from user space to kernel space. name is an address in user space, and the system call copies the data it points to into an address in the kernel. Let's look at some details of this process.

The system call is made by storing its parameters in the registers ebx, ecx, edx, esi and edi (up to five parameters; this scenario has two, name and len), storing the system call number in the register eax, and then entering system space through the interrupt instruction "int 0x80". Because the CPU privilege level of the process is less than or equal to the access level 3 of the trap gate set up for system calls, process A can enter system space without hindrance and execute system_call(), the function installed for int 0x80. Because system_call() belongs to kernel space, its privilege level DPL is 0.

The CPU switches the stack to the kernel stack, which is the system space stack of process A. When the kernel creates the task_struct structure for a new process, it allocates two consecutive pages, that is, 8K, and uses about 1K at the bottom for the task_struct (for example, #define alloc_task_struct() ((struct task_struct *) __get_free_pages(GFP_KERNEL,1))). The rest is used as the system space stack, so when the process moves from user space to system space the stack pointer esp becomes (alloc_task_struct() + 8192). This is why system space usually uses the macro current (see its implementation) to get the task_struct address of the current process.

Each time the process enters system space from user space, the user stack SS, user stack pointer ESP, EFLAGS, user-space CS and EIP are pushed onto the system stack. Then system_call() pushes eax, SAVE_ALL pushes ES, DS, EAX, EBP, EDI, ESI, EDX, ECX and EBX, and finally the handler at sys_call_table+4*%EAX is called, in this case sys_sethostname().
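Because the task_struct sits at the bottom of an 8K-aligned block that also holds the kernel stack, the structure's address can be recovered from the stack pointer by masking. The sketch below assumes that layout; the function names are illustrative rather than the kernel's own definition of current.

```c
#include <stdint.h>

#define THREAD_SIZE 8192u   /* two consecutive 4 KB pages, per the text */

/* Round a kernel stack pointer down to the start of its 8K block,
 * which is where the task_struct lives in this layout. */
static inline uint32_t task_struct_base(uint32_t esp)
{
    return esp & ~(THREAD_SIZE - 1);
}

/* Initial kernel stack pointer: the very top of the 8K block. */
static inline uint32_t initial_kernel_esp(uint32_t task_base)
{
    return task_base + THREAD_SIZE;
}
```

The mask gives the right block only once something has been pushed, since an empty stack points exactly at the start of the next 8K block.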

(5) In sys_sethostname(), after some protection checks, copy_from_user(to, from, n) is called, where to points to system_utsname.nodename in kernel space (for example 0xE625A000) and from points to user space (for example 0x8010FE00). Process A has now entered the kernel and is running in system space. The MMU maps virtual addresses to physical addresses according to A's PGD, and the data is finally copied from user space to system space. Before copying, the kernel checks the validity of the user-space address and length, but it does not check whether the whole interval of that length starting at the user-space address has actually been mapped. If an address in the interval is not mapped, or a read/write permission problem arises, it is treated as a bad address: a page fault is raised and handled by the page fault handler. The call path is as follows: copy_from_user() -> generic_copy_from_user() -> access_ok() + __copy_user_zeroing().
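The range check that precedes the copy can be approximated as follows. This is a simplified sketch that assumes the user/kernel boundary is fixed at 0xC0000000; the real access_ok() compares against a per-task limit, and toy_access_ok() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u   /* assumed fixed user/kernel boundary */

/* Accept a user buffer only if [addr, addr + size) stays below the boundary.
 * The sum is done in 64 bits so that a wrapping range cannot slip through.
 * Note that this says nothing about whether the pages are actually mapped. */
static inline bool toy_access_ok(uint32_t addr, uint32_t size)
{
    uint64_t end = (uint64_t)addr + size;
    return end <= PAGE_OFFSET;
}
```

With the example addresses from the text, the user-space source 0x8010FE00 passes the check, while a kernel address such as 0xE625A000 would be rejected if a process passed it in as a user pointer.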

(6) Summary:

* the process address space is 0~4G

* the process can access only 0~3G in user mode, and can access 3G~4G only after entering kernel mode

* the process enters the kernel state through the system call

* the 3G~4G part of the virtual space of each process is the same

* the process moving from the user mode to the kernel state will not cause a change in CR3, but will cause a change in the stack

Linux simplifies the segmentation mechanism so that the virtual address is always the same as the linear address, so the virtual address space of Linux is also 0~4G. The Linux kernel divides these 4G bytes into two parts. The highest 1G bytes (virtual addresses 0xC0000000 to 0xFFFFFFFF) are used by the kernel and are called "kernel space". The lower 3G bytes (virtual addresses 0x00000000 to 0xBFFFFFFF) are used by the individual processes and are called "user space". Because every process can enter the kernel through system calls, the Linux kernel is shared by all processes in the system. From the point of view of any one process, therefore, each process has 4G bytes of virtual space.

Linux uses two levels of protection: level 0 for the kernel and level 3 for user programs. As the figure shows (it cannot be reproduced here), each process has its own private user space (0~3G), which is not visible to other processes in the system. The highest 1G bytes of virtual kernel space are shared by all processes and the kernel.

1. Mapping from virtual kernel space to physical space

What is stored in the kernel space is the kernel code and data, while the user space of the process stores the code and data of the user program. Both kernel space and user space are in virtual space. Readers will ask, when the system starts, isn't the kernel code and data loaded into physical memory? Why are they also in virtual memory? This has something to do with the compiler, which we will understand later through a specific discussion.

Although kernel space occupies the highest 1GB bytes in each virtual space, mapping to physical memory always starts at the lowest address (0x00000000). For kernel space, the address mapping is a very simple linear mapping. 0xC0000000 is the displacement between the physical address and the linear address, which is called PAGE_OFFSET in Linux code.

Let's take a look at the description and definition of the kernel-space address mapping in include/asm-i386/page.h:

/*
 * This handles the memory map.. We could make this a config
 * option, but too many people screw it up, and too few need
 * it.
 * A __PAGE_OFFSET of 0xC0000000 means that the kernel has
 * a virtual address space of one gigabyte, which limits the
 * amount of physical memory you can use to about 950MB.
 * If you want more physical memory than this then see the CONFIG_HIGHMEM4G
 * and CONFIG_HIGHMEM64G options in the kernel configuration.
 */
#define __PAGE_OFFSET (0xC0000000)
……
#define PAGE_OFFSET ((unsigned long)__PAGE_OFFSET)
#define __pa(x) ((unsigned long)(x)-PAGE_OFFSET)
#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))

The comments in the source code state that if you have more than about 950MB of physical memory, you need to enable the CONFIG_HIGHMEM4G or CONFIG_HIGHMEM64G option when compiling the kernel, which we will not consider for the time being. If the physical memory is less than 950MB, then for kernel space, given a virtual address x, its physical address is "x - PAGE_OFFSET", and given a physical address x, its virtual address is "x + PAGE_OFFSET".

Again, the macro __pa() only maps a kernel-space virtual address to a physical address and never applies to user space, where address mapping is much more complex.

2. Kernel image

In the following description, we call the kernel code and data the kernel image. When the system boots, the Linux kernel image is loaded at physical address 0x00100000, that is, starting at 1MB (the first 1MB is reserved for other purposes). During normal operation, however, the entire kernel image should be in virtual kernel space, so the linker adds the offset PAGE_OFFSET to all symbol addresses when linking the kernel image. As a result, the starting address of the kernel image in kernel space is 0xC0100000.

For example, the process's page directory PGD is a kernel data structure and lives in kernel space. During a process switch, the register CR3 is set to point to the page directory PGD of the new process. The starting address of this directory is a virtual address in kernel space, but CR3 needs a physical address, so the address is translated with __pa(). There is a line like this in mmu_context.h:

asm volatile("movl %0,%%cr3": :"r" (__pa(next->pgd)));

This is a line of inline assembly that translates the page directory start address next->pgd of the next process into a physical address through __pa(), stores it in a register, and then writes it to the CR3 register with the mov instruction. After this line has executed, CR3 points to the page directory PGD of the new process next.
