What are the knowledge points of Linux memory management

2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

1 Preface

Memory management is a very important part of the Linux kernel. Many problems in computer science have prototypes in real life, which is probably why so many computer scientists are good at observing everyday life and generalizing from it. Human society is itself a complex machine full of mechanisms and rules, so instead of diving straight into an ocean of code, it is sometimes better to step back into daily life, find a prototype there, and then explore the code.

2 Why memory needs to be managed

Laozi famously advocated governing by non-action: put simply, a system that runs in an orderly way without much intervention, relying entirely on self-discipline. The ideal is beautiful, but reality is harsh.

Managing memory in a primitive, simplistic way causes real problems in a Linux system. Let's look at a few scenarios.

2.1 Memory management problems

Process space isolation problem

Suppose three processes A, B, and C are running in the memory of a Linux system, and the OS assigns process A the address range 0-20 MB, process B the range 30-80 MB, and process C the range 90-120 MB, as shown in the figure:

At some point a program may access memory that does not belong to it: process A touches the space belonging to process B, or process B touches the space belonging to process C, and may even modify the values stored there. That leads to confusion and errors, so in practice it must not be allowed to happen.

Memory efficiency and memory shortage problems

Machine memory is limited and the number of processes is not known in advance. If the processes already started occupy all of memory at some point, no new process can be started because there is nothing left to allocate. Yet we observe that started processes are sometimes asleep and not using their memory at all, so utilization is rather poor; we need a manager that can reclaim memory that is not in use. In addition, contiguous memory is precious, and it is often impossible to allocate a contiguous region promptly, so virtualizing memory and letting it be scattered across physical memory can effectively improve utilization.

Problems of program locating, debugging, compilation and running

If the location at which a program is loaded is uncertain, locating problems, debugging code, compiling and executing all become harder. We would like each process to have a consistent, complete address space, with the heap, stack and code segment placed at the same starting positions, which simplifies the work of the linker and the loader during compilation and execution.

2.2 Virtual address space

To solve the problems above, Linux introduces the concept of a virtual address space. Virtualization is closely tied to the hardware and is really the result of software and hardware working together. The virtual address space is a middle layer inserted between programs and physical memory, and it is the focus of memory management.

As large-capacity storage, the disk also takes part in running programs as a part of "memory". The memory management system swaps out less frequently used, inactive pages to disk, so memory can be viewed as a cache for the disk that keeps the active data. This indirectly extends the limited physical memory, and the result is called virtual memory in contrast to physical memory.

3. Segment-page management mechanism

This article does not go deep into segment-based and page-based memory management; there are many excellent articles on those details, and interested readers can find them easily with a search engine.

The segment-page mechanism was not achieved overnight. It went through the stages of simple physical segmentation, simple paging and simple logical segmentation before evolving into a memory management mode that combines segmentation and paging. The combination keeps the advantages of both and avoids the drawbacks of using either one alone, which makes it the better management mode.

This article only explains a few concepts of the segment-page mechanism. It combines segmented management and paged management: segmentation is a logical way of managing memory, while paging is closer to the physical side.

Many techniques and implementations in computing have counterparts in real life. That is probably what people mean when they say that art and technology both come from life.

For example:

In household registration there are districts, counties and cities, but none of these exists as a physical entity; they are logical units. Adding these administrative units makes address management more direct.

For us residents, the only real entity is our own house. It is a physical unit, and also the most basic one.

Compared with Linux segment-page management, the segment is the logical unit, equivalent to the district, county or city, and the page is the physical unit, equivalent to the community or house. Seen this way, the mechanism is much easier to grasp.

Multi-level page tables are also easy to understand. If 4 GB of memory is divided into 4 KB pages, there are 2^20 pages in total, which is still a very large number. Indexing all of them with one flat table is inconvenient (with 4-byte entries, for example, the table alone would take 4 MB), so multi-level page tables are introduced to reduce storage and simplify management.
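The arithmetic above can be checked with a short sketch. It assumes a 32-bit address split into two 10-bit table indexes and a 12-bit page offset (the classic two-level layout) and 4-byte table entries; the names `split_virtual_address`, `PAGE_SHIFT` and `TABLE_BITS` are made up for this illustration and are not taken from the kernel.

```python
PAGE_SHIFT = 12   # 4 KB pages leave 12 bits for the offset inside a page
TABLE_BITS = 10   # assumption: 10 index bits per level (two-level, 32-bit layout)


def split_virtual_address(vaddr):
    """Split a 32-bit address into (directory index, table index, page offset)."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    table_index = (vaddr >> PAGE_SHIFT) & ((1 << TABLE_BITS) - 1)
    dir_index = (vaddr >> (PAGE_SHIFT + TABLE_BITS)) & ((1 << TABLE_BITS) - 1)
    return dir_index, table_index, offset


# Number of 4 KB pages in 4 GB, and the size of one flat table covering them
# (assuming 4-byte entries).
total_pages = (4 * 1024 ** 3) // (4 * 1024)
flat_table_bytes = total_pages * 4

print(total_pages == 2 ** 20)                                # True
print(flat_table_bytes // (1024 * 1024), "MB")               # 4 MB
print([hex(part) for part in split_virtual_address(0x12345678)])
```

With two levels, only the second-level tables for regions that are actually in use need to exist, which is where the storage saving comes from.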

A diagram of the mapping relationship between logical addresses and physical addresses supported by the segment page mechanism, that is, the corresponding relationship between virtual addresses and physical addresses:

The picture is from the Internet.

The memory management unit (MMU) is a hardware component whose main job is to map virtual addresses to physical addresses.

MMU workflow: the CPU issues a logical address to the segmentation unit, which converts it to a linear address and hands it to the paging unit. The paging unit translates the linear address into a physical address according to the page table mapping, and a page fault may occur during this step.

A page fault occurs when software accesses a virtual address and the translation finds that the corresponding page is not currently in physical memory. The CPU then raises an exception, and the kernel loads the page from disk or allocates a new physical page for it. If the access itself is invalid, the process may instead be terminated.
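The translation and page-fault flow can be imitated with a toy model. This is only a sketch under simplifying assumptions: a single-level page table held in a dictionary, frames handed out in increasing order, and no swapping. `ToyAddressSpace` and `SegmentationFault` are hypothetical names invented for the example.

```python
PAGE_SIZE = 4096


class SegmentationFault(Exception):
    """Raised when an address outside the process's valid pages is accessed."""


class ToyAddressSpace:
    def __init__(self, valid_pages):
        self.valid_pages = set(valid_pages)  # virtual pages the process may touch
        self.page_table = {}                 # virtual page number -> physical frame
        self.next_free_frame = 0
        self.page_faults = 0

    def _handle_page_fault(self, vpn):
        # Invalid access: the toy "kernel" refuses instead of mapping a frame.
        if vpn not in self.valid_pages:
            raise SegmentationFault(hex(vpn * PAGE_SIZE))
        # Valid but unmapped: allocate a physical frame on first touch.
        self.page_faults += 1
        self.page_table[vpn] = self.next_free_frame
        self.next_free_frame += 1

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            self._handle_page_fault(vpn)
        return self.page_table[vpn] * PAGE_SIZE + offset


space = ToyAddressSpace(valid_pages=range(4))
print(hex(space.translate(0x1010)), space.page_faults)  # first touch of page 1 faults
print(hex(space.translate(0x1020)), space.page_faults)  # same page, no new fault
```

The second access hits a page that is already mapped, so the fault counter does not change.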

4. Physical memory and memory fragmentation

The segment-page mechanism described above belongs to the virtual side. The other important part of Linux memory management is physical memory: how to allocate and reclaim it, which involves memory allocation algorithms and allocators.

4.1 Physical memory allocator

Allocators and allocation algorithms are like a company's finance department, and memory is like the company's funds. Using funds sensibly is the job of finance; using physical memory sensibly is the job of the allocator.

4.2 Classification and mechanism of memory fragmentation

If memory fragmentation is unfamiliar, think of what we often call fragmented time: time that is free but cannot be put to use. Memory is the same.

Neither time nor memory can be used effectively once fragmented, so managing and reducing fragmentation is very important, and it is the research focus of physical memory allocation algorithms and allocators.

According to where they occur and why, memory fragments are divided into external fragments and internal fragments. Let's look at a visual illustration of both:

The picture is from the Internet.

As the figure shows, external fragments are unallocated gaps of memory between processes. They arise directly from processes frequently allocating and releasing memory: simulate processes that allocate regions of different sizes and release them at different times, and you can watch external fragments appear.

Internal fragmentation is mainly caused by the allocator's granularity and by address alignment restrictions: the memory actually allocated is larger than the memory requested, which leaves unused holes inside the process's memory.
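Internal fragmentation is easy to quantify. The sketch below assumes an allocator that only hands out whole units of a fixed granularity (4 KB by default); `internal_fragmentation` is a hypothetical helper written for this illustration.

```python
def internal_fragmentation(request_bytes, granularity=4096):
    """Return (bytes actually allocated, bytes wasted inside the allocation)."""
    units = -(-request_bytes // granularity)  # ceiling division
    allocated = units * granularity
    return allocated, allocated - request_bytes


print(internal_fragmentation(5000))     # a 5000-byte request needs two 4 KB units
print(internal_fragmentation(4096))     # an exact fit wastes nothing
print(internal_fragmentation(100, 32))  # finer granularity wastes less
```

The smaller the granularity relative to the request, the smaller the hole left inside the allocation.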

Although virtual addressing lets the memory used by a process be scattered across physical memory, a process often still needs a certain amount of contiguous physical memory. If there are many fragments, the request cannot be satisfied and the process cannot start. In the figure, Process7 needs one contiguous piece of physical memory that cannot be allocated:

The picture is from the Internet.

If this is still unclear, imagine going to the canteen or taking the bus with a few friends. If there are no three adjacent seats anywhere on the bus, you either sit separately or stand.
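The bus-seat situation can be reproduced in a few lines. In this sketch, memory is modelled as a string in which each character is one page frame, a letter marks a frame owned by a process and `.` marks a free frame; the layout and the helper names are invented for the example.

```python
from itertools import groupby


def free_runs(layout):
    """Lengths of each run of consecutive free frames ('.') in the layout."""
    return [len(list(group)) for char, group in groupby(layout) if char == "."]


def can_allocate_contiguous(layout, frames_needed):
    """True if some single free run is long enough for the request."""
    return any(run >= frames_needed for run in free_runs(layout))


layout = "AA.BB.C..D."
print(sum(free_runs(layout)))              # total number of free frames
print(max(free_runs(layout)))              # longest contiguous free run
print(can_allocate_contiguous(layout, 3))  # enough frames in total, but not adjacent
```

Five frames are free in total, yet a request for three contiguous frames fails because the longest free run is only two frames long.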

5. Basic principles of the buddy system algorithm

5.1 Some background knowledge

Physical page frame

Linux divides physical memory into pages. The page size can vary with hardware and software configuration; it is commonly 4 KB, and some kernels use a different size. Which size is chosen is a practical decision, just as bread is not always sliced the same way.

Page frame record structure

To track how physical pages are used, the kernel keeps data structures such as struct page that record the location and usage of each page. They serve as the kernel's ledger for managing memory pages.

Delayed allocation and immediate allocation

Linux distinguishes kernel mode from user mode. A memory request made in kernel mode is assumed to be reasonable and is satisfied immediately. A request from user mode, however, has its physical allocation deferred for as long as possible: the process first receives only a virtual memory region and obtains real physical memory through page faults at run time. This is why calling malloc yields only virtual memory, not physical memory, until the memory is actually touched.

5.2 Introduction to the buddy system

The first time I heard the name of this algorithm (it is also translated as "partner system"), I wondered why it is called the buddy system. Let's uncover the secret together.

What problem does the buddy system solve?

The buddy system algorithm is a powerful tool against external fragmentation. Put simply, when groups of contiguous page frames of different sizes are requested and released frequently, it provides a management mechanism that allocates and reclaims them efficiently while reducing external fragments.

Ideas for dealing with external fragments

The first idea: use additional techniques to map the existing external fragments into a contiguous linear space. This treats fragments after the fact rather than reducing how many are produced, and it does not help when physically contiguous memory is really needed.

The second idea: keep a record of the small, free, non-contiguous pieces of memory, and when a new allocation request arrives, search them for a suitable piece instead of carving memory out of a fresh area. It has the feel of turning waste into treasure, and the scene is familiar: when you want to open a bag of biscuits, your mother will say to finish the half-eaten bag first.

Based on some further considerations, the Linux kernel chose the second approach to deal with external fragmentation.

Definition of buddy memory blocks

In the buddy system, two memory blocks of the same size with contiguous physical addresses are called buddies. The contiguity requirement is strict (in practice the two blocks must also have been split from the same larger block), but it is the key to the algorithm, because two such blocks can be merged back into one larger block.
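One common way to express the partner (buddy) relationship is with page frame numbers: if a block of 2^order pages starts at a frame number aligned to its own size, its buddy is found by flipping bit `order` of that number. The sketch below illustrates the idea; `buddy_of` and `merged_block_start` are hypothetical helper names, and the alignment of the starting frame is an assumption of the example.

```python
def buddy_of(pfn, order):
    """Start frame of the buddy of the 2**order-page block starting at pfn.

    Assumes pfn is aligned to 2**order pages.
    """
    return pfn ^ (1 << order)


def merged_block_start(pfn, order):
    """Start frame of the 2**(order + 1)-page block formed by merging with the buddy."""
    return pfn & ~(1 << order)


# Blocks of 4 pages (order 2): frames 8-11 and 12-15 are buddies.
print(buddy_of(8, 2), buddy_of(12, 2))
print(merged_block_start(12, 2))
# Frames 4-7 and 8-11 are adjacent and equal in size, but they are not buddies,
# because they do not come from the same 8-page block.
print(buddy_of(4, 2))
```

This shows why adjacency alone is not enough: the block at frame 4 pairs with the block at frame 0, not with its neighbour at frame 8.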

The core idea of the buddy system

The buddy system manages runs of contiguous physical page frames of different sizes. A request is served from the closest suitable block size, the unused remainder is put back under management, and free buddy blocks are merged into larger blocks.

5.3 Basic process of the buddy system

The buddy system maintains 11 linked lists of blocks. Each block on list n consists of 2^n contiguous physical pages, where n is called the order. The order ranges from 0 to 10, so with 4 KB pages the smallest block is 4 KB and the largest is 1024 pages, that is, 4 MB of contiguous physical memory. Blocks of the same size are kept on a doubly linked list. The figure shows the two doubly linked lists for order=0 and order=2:

The picture is from the Internet.
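The relationship between order and block size can be worked out directly. This sketch assumes 4 KB pages and a maximum order of 10, as in the description above; `block_size` and `order_for` are names made up for the illustration.

```python
PAGE_SIZE = 4096
MAX_ORDER = 10   # orders run from 0 to 10, giving 11 lists


def block_size(order):
    """Size in bytes of a block of 2**order pages."""
    return PAGE_SIZE << order


def order_for(request_bytes):
    """Smallest order whose block is large enough for the request."""
    pages = -(-request_bytes // PAGE_SIZE)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    if order > MAX_ORDER:
        raise ValueError("request larger than the largest block")
    return order


print(block_size(0), block_size(MAX_ORDER))  # smallest and largest block in bytes
print(order_for(3 * PAGE_SIZE))              # 3 pages are rounded up to a 4-page block
```

Rounding 3 pages up to a 4-page block is also a small example of the internal fragmentation that power-of-two sizing can cause.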

Memory request process: suppose one page is requested. The buddy system first checks the order=0 list for a free block. If there is none, it looks in the order=1 list; if a free block is found there, its 2 page frames are split, 1 page frame is allocated and the other is added to the order=0 list. If the order=1 list has no free block either, the search continues to larger orders, and whatever block is found is split in the same way. If even the order=10 list has no free block, the algorithm reports an error.

Memory merging process: merging is where the notion of buddy blocks comes into play. Two blocks of the same size A with contiguous physical addresses are combined into a single block of size 2A, and the algorithm repeats this merging iteratively from the bottom up. The process resembles the compaction of SSTs in LevelDB, except that the buddy algorithm requires the memory blocks to be contiguous. It also shows how friendly the buddy system is to large blocks of memory.

The picture is from the Internet.
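The request and merge processes described above can be put together in a toy allocator. This is a minimal sketch, not the kernel's implementation: it assumes one initial free block of the maximum order, uses Python sets in place of doubly linked lists, and identifies blocks by their starting page number. `ToyBuddyAllocator` is a hypothetical name.

```python
class ToyBuddyAllocator:
    def __init__(self, max_order=10):
        self.max_order = max_order
        # free_lists[n] holds the start page numbers of free blocks of 2**n pages.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)  # assumption: one maximal free block

    def alloc(self, order):
        # Find the smallest order >= the requested one that has a free block.
        current = order
        while current <= self.max_order and not self.free_lists[current]:
            current += 1
        if current > self.max_order:
            raise MemoryError("no free block large enough")
        start = min(self.free_lists[current])
        self.free_lists[current].remove(start)
        # Split step by step, putting the upper half back on the smaller list.
        while current > order:
            current -= 1
            self.free_lists[current].add(start + (1 << current))
        return start

    def free(self, start, order):
        # Merge upward for as long as the buddy of the block is also free.
        while order < self.max_order:
            buddy = start ^ (1 << order)
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)


allocator = ToyBuddyAllocator(max_order=3)   # 8 pages in total
a = allocator.alloc(0)
b = allocator.alloc(0)
c = allocator.alloc(1)
print(a, b, c, [sorted(blocks) for blocks in allocator.free_lists])
for start, order in ((a, 0), (b, 0), (c, 1)):
    allocator.free(start, order)
print([sorted(blocks) for blocks in allocator.free_lists])
```

After all three blocks are freed, the merges cascade back up to a single block of the maximum order, which is the bottom-up iteration described in the text.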

5.4 Advantages and disadvantages of the buddy system

The buddy system handles external fragmentation well and is friendly to large allocations, but small requests may cause internal fragmentation. The definition of buddy blocks is also strict, merging buddy blocks involves many linked-list operations, and under frequent allocation a block may be split again right after it has been merged, which wastes the work. So the buddy system still has some problems.

6. Slab allocator

From the introduction above we know that the smallest unit the buddy system allocates is a 4 KB page frame, which is very wasteful for frequently requested objects as small as a few dozen bytes. A finer-grained allocator is needed, and that is the slab allocator.

The slab allocator is not separate from the buddy system but built on top of it. It can be regarded as a second-level allocator that sits closer to the user side, and for that reason its structure and implementation are more complex than the buddy system's.

In my view, the highlights of the slab allocator are that it uses the object as its smallest unit and that it returns memory lazily.

The slab allocator used by Linux is based on an algorithm first introduced by Jeff Bonwick for the SunOS operating system. Jeff's allocator revolves around object caching. In the kernel, a large amount of memory is allocated for a limited set of objects, such as file descriptors and other common structures. Jeff found that initializing a regular object in the kernel takes longer than allocating and releasing it. His conclusion was that memory should not be freed back to a global pool but should stay initialized for its intended purpose.

From "Analysis of linux slab allocator"

The theoretical basis for slab using the object as its minimum unit is that initializing a structure may take longer than allocating and releasing it.

The slab allocator can be seen as a memory pre-allocation mechanism: just as supermarkets put commonly used items where they are easiest to find, objects are prepared in advance so that they can be handed out as soon as they are requested.

slabs_full: the slabs on this list are fully allocated

slabs_partial: the slabs on this list are partially allocated

slabs_empty: the slabs on this list are idle, that is, they can be reclaimed

Objects are allocated from and released back to slabs, and the slabs of each kmem_cache move between these lists as their state changes. A reclaimed slab is not returned to the buddy system immediately, and recently released objects are allocated first, in order to exploit the locality of the CPU cache. The details of the slab allocator are well thought out, but implementing this logic means maintaining multiple queues, which makes it more complex than the buddy system.
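The object-cache idea and the full, partial and empty states can be imitated with a toy model. This is only a sketch under simplifying assumptions: each slab holds a fixed number of pre-initialised objects, freed objects are reused in last-in, first-out order, and slabs are never handed back to the page allocator. `ToySlab` and `ToyObjectCache` are hypothetical names and do not mirror the kernel's data structures.

```python
class ToySlab:
    def __init__(self, constructor, capacity):
        self.capacity = capacity
        # Objects are constructed once, up front, and kept initialised.
        self.free_objects = [constructor() for _ in range(capacity)]

    @property
    def state(self):
        if not self.free_objects:
            return "full"
        if len(self.free_objects) == self.capacity:
            return "empty"
        return "partial"


class ToyObjectCache:
    def __init__(self, constructor, objects_per_slab=2):
        self.constructor = constructor
        self.objects_per_slab = objects_per_slab
        self.slabs = []
        self.owner = {}  # id(object) -> slab it was taken from

    def _take(self, slab):
        obj = slab.free_objects.pop()  # LIFO: the most recently freed object first
        self.owner[id(obj)] = slab
        return obj

    def alloc(self):
        # Prefer partial slabs, then empty ones, and grow only as a last resort.
        for wanted in ("partial", "empty"):
            for slab in self.slabs:
                if slab.state == wanted:
                    return self._take(slab)
        slab = ToySlab(self.constructor, self.objects_per_slab)
        self.slabs.append(slab)
        return self._take(slab)

    def free(self, obj):
        # The object goes back to its slab; the slab itself is kept (lazy return).
        self.owner.pop(id(obj)).free_objects.append(obj)

    def states(self):
        return [slab.state for slab in self.slabs]


cache = ToyObjectCache(dict, objects_per_slab=2)
a, b, c = cache.alloc(), cache.alloc(), cache.alloc()
print(cache.states())
cache.free(b)
d = cache.alloc()
print(d is b, cache.states())
```

The object that was just freed is handed out again on the next request, which is the recently-released-first behaviour described above.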
