
How to analyze the kernel architecture of the Linux system

2025-01-16 Update From: SLTechnology News&Howtos


This article explains how to analyze the kernel architecture of the Linux system from a technical point of view. I hope you get something out of it.

The kernel is a central part of a Linux system, so what does the Linux kernel actually look like inside? The notes below walk through the kernel's memory management mechanisms in some depth, for anyone who needs a reference.

1: Before the kernel can use highmem (high-memory) pages, it must map them into the kernel's virtual address space with kmap; the mapping is released again with kunmap.

2: UMA (uniform memory access) machines organize the available memory contiguously.

3: In a NUMA (non-uniform memory access) machine, each CPU has local memory that it can access especially fast. The processors are connected by a bus so that each one can also access the local memory of the other CPUs.

4: The kernel distinguishes three memory-model configuration options: FLATMEM, DISCONTIGMEM and SPARSEMEM.

5: Memory is divided into nodes. Each node is associated with a processor in the system and is represented in the kernel as an instance of pg_data_t.

6: Each node is divided into memory domains (zones), which are a further subdivision of the node's memory.

7: Note: zonelist is a pointer to the zonelist data structure, which lists the memory domains suitable for an allocation in order of priority.

8: The memory domains are:

1) ZONE_DMA marks the memory domain that is suitable for DMA.

2) ZONE_DMA32 marks the memory domain that is addressable with 32-bit address words and is suitable for DMA.

3) ZONE_NORMAL marks normal memory that can be mapped directly into the kernel segment. It is the only memory domain guaranteed to exist on all architectures, but there is no guarantee that its address range is backed by actual physical memory.

4) ZONE_HIGHMEM marks physical memory that lies beyond the kernel segment.

5) ZONE_MOVABLE is a pseudo memory domain.

6) MAX_NR_ZONES acts as an end marker, used when the kernel wants to iterate over all memory domains in the system.

7) Each memory domain is associated with an array that organizes the physical memory pages (page frames) belonging to it. For each page frame, an instance of struct page is allocated together with the required management data.

8) Each node provides a fallback list (by means of struct zonelist). The list contains other nodes (and their memory domains) that can be used to allocate memory in place of the current node.
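Points 6) and 8) can be tried out in user space. The following is only a minimal sketch, not kernel code: struct toy_zone, total_free_pages and alloc_from_zonelist are hypothetical names invented for this illustration, and the zone list is reduced to three entries plus the end marker.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified zone types; MAX_NR_ZONES is the end marker, not a real zone. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical, heavily reduced stand-in for struct zone. */
struct toy_zone {
    const char *name;
    unsigned long free_pages;
};

/* Sum the free pages by iterating up to the MAX_NR_ZONES end marker. */
unsigned long total_free_pages(const struct toy_zone zones[MAX_NR_ZONES])
{
    unsigned long total = 0;

    for (int i = 0; i < MAX_NR_ZONES; i++)
        total += zones[i].free_pages;
    return total;
}

/*
 * Walk a NULL-terminated fallback list in priority order and take
 * `count` pages from the first zone that can satisfy the request.
 * Returns the zone used, or NULL if no zone has enough free pages.
 */
struct toy_zone *alloc_from_zonelist(struct toy_zone **zonelist,
                                     unsigned long count)
{
    assert(zonelist != NULL);
    for (size_t i = 0; zonelist[i] != NULL; i++) {
        if (zonelist[i]->free_pages >= count) {
            zonelist[i]->free_pages -= count;
            return zonelist[i];
        }
    }
    return NULL;
}
```

A caller would build the fallback list from the cheapest zone to the most precious one (for example HighMem, then Normal, then DMA), so that scarce DMA memory is only touched when nothing else can serve the request.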

3.2.2 Data structures

1) Node management

pg_data_t is the basic element used to represent a node.

typedef struct pglist_data {
    struct zone node_zones[MAX_NR_ZONES];
    struct zonelist node_zonelists[MAX_ZONELISTS];
    int nr_zones;
    struct page *node_mem_map;
    struct bootmem_data *bdata;
    unsigned long node_start_pfn;
    unsigned long node_present_pages; /* total number of physical memory pages */
    unsigned long node_spanned_pages; /* total size of the physical page range, including holes */
    int node_id;
    struct pglist_data *pgdat_next;
    wait_queue_head_t kswapd_wait;
    struct task_struct *kswapd;
    int kswapd_max_order;
} pg_data_t;

Note: 1) node_zones is an array that holds the data structures of the memory domains in the node.

2) node_zonelists specifies the fallback nodes and their memory domains, in the order in which they are tried, so that memory can be allocated on a fallback node when the current node has no free space.

3) nr_zones holds the number of different memory domains in the node.

4) node_mem_map is a pointer to an array of page instances that describes all physical memory pages of the node. It covers the pages of all memory domains in the node.

5) bdata points to an instance of the data structure that describes the boot memory allocator.

6) node_start_pfn is the logical number of the first page frame of the NUMA node. The page frames of all nodes are numbered consecutively, and each frame's number is globally unique. In UMA systems the value is always 0, because there is only one node.

7) node_present_pages gives the number of page frames in the node, and node_spanned_pages gives the length of the node in page frames. The two need not be equal, because the node's range may contain holes.

8) node_id is the global node ID.

9) pgdat_next links to the next memory node. All memory nodes in the system are connected in a singly linked list that ends with a null pointer.

10) kswapd_wait is the wait queue of the swap daemon, which is needed when page frames are swapped out of the node. kswapd points to the task_struct of the swap daemon responsible for this node. kswapd_max_order is used by the page-swapping subsystem to define the size of the area to be freed.

11) The memory domains of the node are stored in node_zones[MAX_NR_ZONES]. The array always has three entries, even if the node has fewer memory domains; in that case the remaining entries are filled with zeros.
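Notes 6), 7) and 9) can be illustrated with a small user-space model. This is a sketch under stated assumptions: struct toy_pgdat keeps only the fields discussed above, and node_hole_pages and node_of_pfn are hypothetical helpers that do not exist in the kernel under these names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced model of pg_data_t with only the fields used here. */
struct toy_pgdat {
    int node_id;
    unsigned long node_start_pfn;      /* first page frame number of the node */
    unsigned long node_present_pages;  /* page frames that really exist */
    unsigned long node_spanned_pages;  /* length of the range, holes included */
    struct toy_pgdat *pgdat_next;      /* next node, NULL at the end of the list */
};

/* Page frames inside the node's range that are not backed by memory. */
unsigned long node_hole_pages(const struct toy_pgdat *pgdat)
{
    assert(pgdat->node_spanned_pages >= pgdat->node_present_pages);
    return pgdat->node_spanned_pages - pgdat->node_present_pages;
}

/* Follow pgdat_next until NULL and return the node whose range contains pfn. */
const struct toy_pgdat *node_of_pfn(const struct toy_pgdat *first,
                                    unsigned long pfn)
{
    for (const struct toy_pgdat *p = first; p != NULL; p = p->pgdat_next) {
        if (pfn >= p->node_start_pfn &&
            pfn - p->node_start_pfn < p->node_spanned_pages)
            return p;
    }
    return NULL;
}
```

Because page frame numbers are globally unique, a single number is enough to find the owning node; the lookup only has to compare it against each node's start and spanned length.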

enum node_states {
    N_POSSIBLE,       /* the node could become online at some point */
    N_ONLINE,         /* the node is online */
    N_NORMAL_MEMORY,  /* the node has normal memory */
#ifdef CONFIG_HIGHMEM
    N_HIGH_MEMORY,    /* the node has normal or high memory */
#else
    N_HIGH_MEMORY = N_NORMAL_MEMORY,
#endif
    N_CPU,            /* the node has one or more CPUs */
    NR_NODE_STATES
};

Note: N_HIGH_MEMORY is used if the node has normal or high memory; N_NORMAL_MEMORY is set only if the node has normal (non-highmem) memory.
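One way to picture how such states are tracked is one bitmap per state, with one bit per node. The sketch below assumes exactly that; TOY_HIGHMEM, node_state_map, toy_node_set_state and toy_node_state are names made up for this illustration, and node numbers are assumed to be smaller than the number of bits in an unsigned long.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_HIGHMEM 1  /* stand-in for CONFIG_HIGHMEM; set to 0 to get the alias */

enum node_states {
    N_POSSIBLE,
    N_ONLINE,
    N_NORMAL_MEMORY,
#if TOY_HIGHMEM
    N_HIGH_MEMORY,
#else
    N_HIGH_MEMORY = N_NORMAL_MEMORY,
#endif
    N_CPU,
    NR_NODE_STATES
};

/* One bitmap per state; bit n set means node n is in that state. */
static unsigned long node_state_map[NR_NODE_STATES];

/* Mark a node as being in the given state. */
void toy_node_set_state(int node, enum node_states state)
{
    assert(state < NR_NODE_STATES);
    node_state_map[state] |= 1UL << node;
}

/* Test whether a node is in the given state. */
bool toy_node_state(int node, enum node_states state)
{
    assert(state < NR_NODE_STATES);
    return (node_state_map[state] >> node) & 1UL;
}
```

With TOY_HIGHMEM set to 0, N_HIGH_MEMORY and N_NORMAL_MEMORY share one value and therefore one bitmap, which mirrors the #else branch of the enum above.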

2) Memory domains

The kernel uses the zone structure to describe memory domains.

struct zone {

/* Fields commonly accessed by the page allocator */

unsigned long pages_min, pages_low, pages_high;

Note: 1) If more than pages_high pages are free, the memory domain is in an ideal state.

2) If the number of free pages falls below pages_low, the kernel begins to swap pages out to disk.

3) If the number of free pages falls below pages_min, the memory domain urgently needs free pages and page reclaim has to work harder.

4) The watermark values in the data structure are filled in by init_per_zone_pages_min.

5) setup_per_zone_pages_min sets the pages_min, pages_low and pages_high members of struct zone.
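The three watermarks amount to a simple classification of a memory domain by its free page count. A minimal sketch, assuming a reduced struct toy_zone with a free_pages counter; enum zone_pressure and classify_zone are hypothetical names, not kernel interfaces:

```c
#include <assert.h>

/* Reduced model: only the three watermarks and a free-page counter. */
struct toy_zone {
    unsigned long pages_min, pages_low, pages_high;
    unsigned long free_pages;
};

enum zone_pressure {
    ZONE_IDEAL,     /* free > pages_high */
    ZONE_OK,        /* pages_low <= free <= pages_high */
    ZONE_LOW,       /* pages_min <= free < pages_low: swapping out begins */
    ZONE_CRITICAL   /* free < pages_min: free pages are urgently needed */
};

/* Map the current free page count onto the watermark bands. */
enum zone_pressure classify_zone(const struct toy_zone *z)
{
    assert(z->pages_min <= z->pages_low && z->pages_low <= z->pages_high);
    if (z->free_pages > z->pages_high)
        return ZONE_IDEAL;
    if (z->free_pages >= z->pages_low)
        return ZONE_OK;
    if (z->free_pages >= z->pages_min)
        return ZONE_LOW;
    return ZONE_CRITICAL;
}
```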

unsigned long lowmem_reserve[MAX_NR_ZONES]; Note: this array specifies, for each memory domain, a number of pages reserved for critical allocations that must not fail under any circumstances.

struct per_cpu_pageset pageset[NR_CPUS]; Note: this array implements per-CPU lists of hot and cold page frames. The kernel uses these lists to keep "fresh" pages on hand that can be used to satisfy allocation requests.

Hot page frames: page frames that are in the CPU cache and can therefore be accessed quickly.

Cold page frames: page frames that are not in the cache.

NR_CPUS is a macro constant that can be configured at compile time.

Note: the array elements are of type per_cpu_pageset.

struct per_cpu_pageset {
    struct per_cpu_pages pcp[2]; /* index 0: hot pages, index 1: cold pages */
} ____cacheline_aligned_in_smp;

Note: this structure consists of an array with two entries. The first entry manages the hot pages, the second the cold pages.

The useful data is held in per_cpu_pages.

struct per_cpu_pages {
    int count;             /* number of pages on the list */
    int high;              /* high watermark: if count exceeds high, there are too many pages on the list and it is emptied */
    int batch;             /* chunk size when several pages are added or removed at once */
    struct list_head list; /* doubly linked list holding the hot or cold pages of the current CPU */
};
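The interplay of count, high and batch can be sketched as follows. This is a simplified user-space model, not the kernel's implementation: struct toy_pcp replaces the page list by its length, toy_pcp_free_page is a hypothetical helper, and draining as soon as count reaches high is an assumption of this sketch.

```c
#include <assert.h>

/* Reduced per_cpu_pages: the list itself is replaced by its length. */
struct toy_pcp {
    int count;  /* pages currently on the list */
    int high;   /* watermark: drain once count reaches it */
    int batch;  /* number of pages handed back per drain */
};

/*
 * Put one freed page on the per-CPU list. If the list has reached its
 * high watermark afterwards, hand up to `batch` pages back to the
 * buddy system. Returns the number of pages drained (0 if none).
 */
int toy_pcp_free_page(struct toy_pcp *pcp)
{
    assert(pcp->high > 0 && pcp->batch > 0);
    pcp->count++;
    if (pcp->count >= pcp->high) {
        int drained = pcp->batch < pcp->count ? pcp->batch : pcp->count;

        pcp->count -= drained;
        return drained;
    }
    return 0;
}
```

Handing pages back in batches rather than one at a time keeps the per-CPU list useful as a cache while limiting how often the zone-wide free lists have to be touched.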


/* Free areas of different sizes */

spinlock_t lock;

struct free_area free_area[MAX_ORDER]; Note: an array of data structures of the same name is used to implement the buddy system. Each array element stands for contiguous memory areas of one fixed size and manages the free pages contained in areas of that size; free_area is the starting point for this management.
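The arithmetic behind the buddy system is short enough to show directly: the entry for order n manages blocks of 2^n pages, and a block's buddy is found by flipping one bit of its page index. A minimal sketch; the helper names are hypothetical, and the value 11 for MAX_ORDER is an assumption (a common default) rather than something stated above.

```c
#include <assert.h>

#define MAX_ORDER 11  /* assumed value; free_area[] would have this many entries */

/* A block of the given order spans 2^order contiguous pages. */
unsigned long pages_in_order(unsigned int order)
{
    assert(order < MAX_ORDER);
    return 1UL << order;
}

/* Index of the buddy block: flip the bit that corresponds to the block size. */
unsigned long buddy_index(unsigned long page_idx, unsigned int order)
{
    assert(order < MAX_ORDER);
    return page_idx ^ (1UL << order);
}

/* Start index of the merged block of order + 1 once both buddies are free. */
unsigned long merged_index(unsigned long page_idx, unsigned int order)
{
    assert(order < MAX_ORDER);
    return page_idx & ~(1UL << order);
}
```

For example, the order-2 block starting at page 12 has its buddy at page 8, and the two merge into an order-3 block starting at page 8.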

ZONE_PADDING(_pad1_)

/* Fields commonly accessed by the page reclaim scanner */

spinlock_t lru_lock;

struct list_head active_list; Note: the collection of active pages.

struct list_head inactive_list; Note: the collection of inactive pages.

unsigned long nr_scan_active; Note: the number of active pages to be scanned when reclaiming memory.

unsigned long nr_scan_inactive; Note: the number of inactive pages to be scanned when reclaiming memory.

unsigned long pages_scanned; Note: the number of pages scanned since the last reclaim.

unsigned long flags; Note: describes the current state of the memory domain.

typedef enum {
    ZONE_ALL_UNRECLAIMABLE, /* all pages are pinned and cannot be reclaimed */
    ZONE_RECLAIM_LOCKED,    /* prevents concurrent reclaim */
    ZONE_OOM_LOCKED,        /* zone is in the OOM killer's zonelist */
} zone_flags_t;
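Each enumerator is a bit position inside the flags word. The sketch below shows how ZONE_RECLAIM_LOCKED could guard against concurrent reclaim in a single-threaded model. It is illustrative only: the toy_ helpers are hypothetical, and the plain bit operations used here are not atomic, which real concurrent code would need.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    ZONE_ALL_UNRECLAIMABLE,  /* all pages pinned */
    ZONE_RECLAIM_LOCKED,     /* prevents concurrent reclaim */
    ZONE_OOM_LOCKED,
} zone_flags_t;

/* Reduced zone model holding only the flags word. */
struct toy_zone {
    unsigned long flags;
};

/* Test a single flag bit. */
bool toy_test_flag(const struct toy_zone *z, zone_flags_t f)
{
    return (z->flags >> f) & 1UL;
}

/* Take the reclaim lock; returns false if reclaim is already in progress. */
bool toy_try_lock_reclaim(struct toy_zone *z)
{
    if (toy_test_flag(z, ZONE_RECLAIM_LOCKED))
        return false;
    z->flags |= 1UL << ZONE_RECLAIM_LOCKED;
    return true;
}

/* Release the reclaim lock taken by toy_try_lock_reclaim. */
void toy_unlock_reclaim(struct toy_zone *z)
{
    assert(toy_test_flag(z, ZONE_RECLAIM_LOCKED));
    z->flags &= ~(1UL << ZONE_RECLAIM_LOCKED);
}
```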

/* Memory domain statistics */

atomic_long_t vm_stat[NR_VM_STAT_ITEMS]; Note: maintains a large number of statistics about the memory domain. The helper function zone_page_state reads the information in vm_stat.

int prev_priority; Note: stores the priority with which the memory domain was scanned in the last scan operation. The scan is performed by try_to_free_pages until enough page frames have been freed.

ZONE_PADDING(_pad2_)

/* Rarely used or mostly read-only fields */

wait_queue_head_t *wait_table; Note: a wait queue for processes waiting for a page to become available. Processes line up in the queue to wait for some condition; when the condition becomes true, the kernel notifies them to resume work.

unsigned long wait_table_hash_nr_entries;

unsigned long wait_table_bits;

/* Fields that support the discontiguous memory model */

struct pglist_data *zone_pgdat; Note: links the memory domain to its parent node; zone_pgdat points to the corresponding pglist_data instance.

unsigned long zone_start_pfn; Note: the index of the first page frame in the memory domain.

unsigned long spanned_pages; /* total length, including holes */

unsigned long present_pages; /* total length, excluding holes */ Note: the number of pages that are actually usable.

/* Rarely used fields */

char *name; Note: a string holding the conventional name of the memory domain. Three names are available: Normal, DMA and HighMem.

} ____cacheline_maxaligned_in_smp;

3) Calculating the memory domain watermarks

Before calculating the watermarks, the kernel first determines the minimum amount of memory that must be kept free for critical allocations. This value grows nonlinearly with the amount of available memory and is stored in the global variable min_free_kbytes.

Note: 1) The lower bound for highmem memory domains is SWAP_CLUSTER_MAX.

2) SWAP_CLUSTER_MAX defines the size of a page cluster, that is, the group of pages handled together.

3) setup_per_zone_lowmem_reserve calculates lowmem_reserve.
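The nonlinear growth can be made concrete. As far as I know, 2.6-era kernels compute min_free_kbytes as the square root of sixteen times the low memory size in kilobytes, clamped to the range 128 to 65536, and derive pages_low and pages_high by adding a quarter and a half of pages_min. The sketch below assumes exactly these rules; the function names are hypothetical, and the way pages_min is distributed across memory domains is left out.

```c
#include <assert.h>

/* Integer square root by simple iteration; adequate for a sketch. */
static unsigned long long isqrt(unsigned long long x)
{
    unsigned long long r = 0;

    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Assumed rule: sqrt(lowmem_kbytes * 16), clamped to [128, 65536]. */
unsigned long long calc_min_free_kbytes(unsigned long long lowmem_kbytes)
{
    unsigned long long v = isqrt(lowmem_kbytes * 16);

    if (v < 128)
        v = 128;
    if (v > 65536)
        v = 65536;
    return v;
}

struct toy_watermarks {
    unsigned long pages_min, pages_low, pages_high;
};

/* Assumed rule: low = min + min/4, high = min + min/2. */
struct toy_watermarks calc_watermarks(unsigned long pages_min)
{
    struct toy_watermarks w;

    w.pages_min = pages_min;
    w.pages_low = pages_min + (pages_min >> 2);
    w.pages_high = pages_min + (pages_min >> 1);
    assert(w.pages_min <= w.pages_low && w.pages_low <= w.pages_high);
    return w;
}
```

Under these assumptions a machine with 1 GiB of low memory reserves 4096 KiB, while sixteen times as much memory only quadruples the reserve, which is the nonlinear behaviour described above.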
