2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article walks through how to analyze a Linux kernel crash, a topic many people find hard to follow. The material below is summarized to make it easier to understand, and I hope you get something useful out of it.
At work we regularly run into kernel crashes. This article analyzes the information the kernel prints after a crash. The kernel version used is Linux 2.6.32.
The lifetime of a process ranges from a few milliseconds to several months, and during that time it usually interacts with the kernel; for example, a user-space program enters kernel space through a system call. While in the kernel it does not use its user-space stack but a corresponding kernel stack. For each process, the Linux kernel keeps two different data structures in a single storage area allocated for that process: the process's kernel-mode stack, and the thread_info structure linked to the process descriptor, called the thread descriptor. The kernel stack is typically 8 KB (8192 bytes), which is two pages. In the Linux 2.6.32 kernel its size is defined in the thread_info.h file:
#define THREAD_SIZE 8192
The Linux kernel uses the following union to represent the thread descriptor and the kernel stack of a process. It is defined in include/linux/sched.h:
union thread_union {
    struct thread_info thread_info;
    unsigned long stack[THREAD_SIZE/sizeof(long)];
};
This type is a union. The C Programming Language describes a union as follows:
1) a union is, in effect, a structure
2) all of its members have offset 0 from the base address
3) the structure is big enough to hold the widest member
4) the alignment is appropriate for all of the members.
From this description, the size of thread_union is 8192 bytes, which is the size of the stack array of unsigned long elements. Because all members of a union occupy the same memory, a common rule of thumb when writing code is that only one member of a union instance may be used at a time, since writing one member overwrites the others. That rule holds strictly only when the members occupy the same number of bytes; when they differ in size, a write overwrites only the bytes that the written member covers. For thread_union we can therefore use both members at the same time, as long as we obtain the addresses of the two members correctly.
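The layout argument above can be checked with a small user-space sketch. The struct below is a hypothetical, heavily simplified stand-in for the kernel's struct thread_info, and the _demo names are invented for illustration; only THREAD_SIZE and the shape of the union come from the article.

```c
#include <assert.h>
#include <stddef.h>

#define THREAD_SIZE 8192

/* Hypothetical, simplified stand-in for the kernel's struct thread_info. */
struct thread_info_demo {
    unsigned long flags;
    int preempt_count;
    void *task;
};

/* Same shape as the kernel's union thread_union. */
union thread_union_demo {
    struct thread_info_demo thread_info;
    unsigned long stack[THREAD_SIZE / sizeof(long)];
};

/* One instance to experiment with. */
static union thread_union_demo demo_union;

/* The union is as large as its widest member, the stack array. */
static size_t thread_union_size(void)
{
    return sizeof(union thread_union_demo);
}

/* Both members start at offset 0 from the base address. */
static int members_share_base(union thread_union_demo *u)
{
    assert(u != NULL);
    return (void *)&u->thread_info == (void *)&u->stack[0];
}

/* A write at the high end of the stack array does not touch the bytes
 * that thread_info occupies at the low end. */
static int high_stack_write_preserves_info(void)
{
    demo_union.thread_info.flags = 0x5a5aUL;
    demo_union.stack[THREAD_SIZE / sizeof(long) - 1] = 0xdeadbeefUL;
    return demo_union.thread_info.flags == 0x5a5aUL;
}
```

In this sketch the two members coexist only because the stack is written far away from the low bytes; a write to stack[0] would clobber flags, which is exactly what a kernel stack overflow does.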
When code running in the kernel uses too much stack space, the kernel stack overflows into the thread_info area, which causes serious problems such as a system restart. Typical causes are recursive calls that nest too deeply and data structures defined inside a function that are too large.
Figure: relationship between thread_info, task_struct and the kernel stack of a process
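As a rough illustration of how close a stack pointer is to thread_info, the following sketch computes the remaining headroom. The function names, the example addresses and the 624-byte thread_info size used in the usage note are assumptions for illustration, not kernel code, and the mask trick only works while sp is still inside its own 8 KB region.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u

/* Bytes left between sp and the end of thread_info.
 * A negative result means the stack has already run into thread_info.
 * thread_info_size is supplied by the caller (an assumption here). */
static int32_t stack_headroom(uint32_t sp, uint32_t thread_info_size)
{
    uint32_t base, limit;

    assert(thread_info_size < THREAD_SIZE);
    base = sp & ~(THREAD_SIZE - 1u);      /* start of the 8 KB region */
    limit = base + thread_info_size;      /* first byte above thread_info */
    return (int32_t)(sp - limit);
}

static int stack_has_overflowed(uint32_t sp, uint32_t thread_info_size)
{
    return stack_headroom(sp, thread_info_size) < 0;
}
```

For a hypothetical region starting at 0xc0000000 with a 624-byte thread_info, an sp of 0xc0001f00 leaves 7312 bytes, while an sp of 0xc0000100 is already inside thread_info.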
Let's take a look at the structure of thread_info:
struct thread_info {
    unsigned long flags;                 /* low-level flags */
    int preempt_count;                   /* 0 => preemptible, <0 => bug */
    mm_segment_t addr_limit;             /* address space limit of the process */
    struct task_struct *task;            /* task_struct pointer of the current process */
    struct exec_domain *exec_domain;     /* execution domain */
    __u32 cpu;                           /* current cpu */
    __u32 cpu_domain;                    /* cpu domain */
    struct cpu_context_save cpu_context; /* cpu context */
    __u32 syscall;                       /* syscall number */
    __u8 used_cp[16];                    /* thread used copro */
    unsigned long tp_value;
    struct crunch_state crunchstate;
    union fp_state fpstate __attribute__((aligned(8)));
    union vfp_state vfpstate;
#ifdef CONFIG_ARM_THUMBEE
    unsigned long thumbee_state;         /* ThumbEE Handler Base register */
#endif
    struct restart_block restart_block;  /* used to implement the signal mechanism */
};
Note: (1) flags holds a variety of process-specific flags. The two most important are TIF_SIGPENDING, which is set when the process has a pending signal to handle, and TIF_NEED_RESCHED, which indicates that the scheduler should choose another process to replace the one currently running.
With that background, look at what the kernel prints when it dumps the stack. The output below comes from a crash encountered at work. The PC is in dev_get_by_flags and the inaccessible kernel virtual address is 45685516, whereas the addresses normally accessible in the kernel begin with 0xC (0xCXXXXXXX).
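The flags word is a plain bit mask, so testing and changing a flag comes down to shifts and masks. The sketch below is user-space C, not the kernel's own helpers; the bit numbers 0 and 1 are assumed from the ARM thread_info.h of this kernel generation and should be checked against your tree, and the helper names are invented for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumed bit numbers (ARM, 2.6.32 era); verify against your kernel tree. */
#define TIF_SIGPENDING   0
#define TIF_NEED_RESCHED 1

#define FLAG_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Returns 1 if the given flag bit is set in flags, otherwise 0. */
static int test_flag(unsigned long flags, int bit)
{
    assert(bit >= 0 && bit < FLAG_BITS);
    return (int)((flags >> bit) & 1UL);
}

/* Returns flags with the given bit set. */
static unsigned long set_flag(unsigned long flags, int bit)
{
    assert(bit >= 0 && bit < FLAG_BITS);
    return flags | (1UL << bit);
}

/* Returns flags with the given bit cleared. */
static unsigned long clear_flag(unsigned long flags, int bit)
{
    assert(bit >= 0 && bit < FLAG_BITS);
    return flags & ~(1UL << bit);
}
```

Setting TIF_NEED_RESCHED on an empty flags word, for instance, leaves TIF_SIGPENDING clear, so the two conditions can be tracked independently.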
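A quick sanity check on a faulting address is to ask whether it lies in the kernel's part of the address space at all. The sketch below assumes a 32-bit system with the common 3G/1G split, where kernel addresses start at 0xC0000000; the constant and function name are illustrative assumptions, and other configurations use a different boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 3G/1G split, so kernel virtual addresses start here. */
#define PAGE_OFFSET_ASSUMED 0xC0000000u

/* Returns 1 if addr falls in the assumed kernel range, otherwise 0. */
static int looks_like_kernel_address(uint32_t addr)
{
    return addr >= PAGE_OFFSET_ASSUMED;
}
```

Under that assumption the faulting address 0x45685516 is rejected, while the stack pointer 0xc07e9c28 from the same report is accepted.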
Unable to handle kernel paging request at virtual address 45685516
pgd = c65a4000
[45685516] *pgd=00000000
Internal error: Oops: 1 [#1]
last sysfs file: /sys/devices/form/tpm/cfg_l3/l3_rule_add
Modules linked in: splic mmp(P)
CPU: 0    Tainted: P  (2.6.32.11 #42)
PC is at dev_get_by_flags+0xfc/0x140
LR is at dev_get_by_flags+0xe8/0x140
pc : []    lr : []    psr: 20000013
sp : c07e9c28  ip : 00000000  fp : c07e9c64
r10: c6bcc560  r9 : c646a220  r8 : c66a0000
r7 : c6a00000  r6 : c0204e56  r5 : 30687461  r4 : 45685516
r3 : 00000000  r2 : 00000010  r1 : c0204e56  r0 : ffffffff
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005397f  Table: 065a4000  DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc07e8270)
Stack: (0xc07e9c28 to 0xc07ea000)
9c20:                   c0204e56 c6a00000 45685516 c69ffff0 c69ffff0 c69ffff0
9c40: c6a00000 30687461 c66a0000 c6a00000 00000007 c64b210c c07e9d24 c07e9c68
9c60: c071f764 c06bed38 c66a0000 c66a0000 c6a00000 c6a00000 c66a0000 c6a00000
9c80: c07e9cfc c07e9c90 c03350d4 c0334b2c 00000034 00000006 00000100 c64b2104
9ca0: 0000c4fb c0243ece c66a0000 c0beed04 c033436c c646a220 c07e9cf4 00000000
9cc0: c66a0000 00000003 c0bee8e8 c0beed04 c07e9d24 c07e9ce0 c06e4f5c 00004c68
9ce0: 00000000 faa9fea9 faa9fea9 00000000 00000000 c6bcc560 c0335138 c646a220
9d00: c66a0000 c64b2104 c085ffbc c66a0000 c0bee8e8 00000000 c07e9d54 c07e9d28
9d20: c071f9a0 c071ebc0 00000000 c071ebb0 80000000 00000007 c67fb460 c646a220
9d40: c0bee8c8 00000608 c07e9d94 c07e9d58 c002a100 c071f84c c0029bb8 80000000
9d60: c07e9d84 c0beee0c c0335138 c66a0000 c646a220 00000000 c4959800 c4959800
9d80: c67fb460 00000000 c07e9dc4 c07e9d98 c078f0f4 c0029bc8 00000000 c0029bb8
9da0: 80000000 c07e9dbc c6b8d340 c66a0520 00000000 c646a220 c07e9dec c07e9dc8
9dc0: c078f450 c078effc 00000000 c67fb460 c6b8d340 00000000 c67fb460 c64b20f2
9de0: c07e9e24 c07e9df0 c078fb60 c078f130 00000000 c078f120 80000000 c0029a94
9e00: 00000806 c6b8d340 c0bee818 00000001 00000000 c4959800 c07e9e64 c07e9e28
9e20: c002a030 c078f804 c64b2070 00000000 c64b2078 ffc45000 c64b20c2 c085c2dc
9e40: 00000000 c085c2c0 00000000 c0817398 00086c2e c085c2c4 c07e9e9c c07e9e68
9e60: c06c2684 c0029bc8 00000001 00000040 00000000 c085c2dc c085c2c0 00000001
9e80: 0000012c 00000040 c085c2d0 c0bee818 c07e9ed4 c07e9ea0 c00284e0 c06c2608
9ea0: bf00da5c 00086c30 00000000 00000001 c097e7d4 c07e8000 00000100 c08162d8
9ec0: 00000002 c097e7a0 c07e9f14 c07e9ed8 c00283d0 c0028478 56251311 00023c88
9ee0: c07e9f0c 00000003 c08187ac 00000018 00000000 01000000 c07ebc70 00023cbc
9f00: 56251311 00023c88 c07e9f24 c07e9f18 c03391e8 c0028348 c07e9f3c c07e9f28
9f20: c0028070 c03391b0 ffffffff 0000001f c07e9f94 c07e9f40 c002d4d0 c0028010
9f40: 00000000 00000001 c07e9f88 60000013 c07e8000 c07ebc78 c0868784 c07ebc70
9f60: 00023cbc 56251311 00023c88 c07e9f94 c07e9f98 c07e9f88 c025c3e4 c025c3f4
9f80: 60000013 ffffffff c07e9fb4 c07e9f98 c025c578 c025c3cc 00000000 c0981204
9fa0: c0025ca0 c0d01140 c07e9fc4 c07e9fb8 c0032094 c025c528 c07e9ff4 c07e9fc8
9fc0: c0008918 c0032048 c0008388 00000000 00000000 c0025ca0 00000000 00053975
9fe0: c0868834 c00260a4 00000000 c07e9ff8 00008034 c0008708 00000000 00000000
Backtrace:
[] (dev_get_by_flags+0x0/0x140) from [] (arp_process+0xbb4/0xc74)
 r7:c64b210c r6:00000007 r5:c6a00000 r4:c66a0000
(1) First, find where in the kernel this information is printed. It is the __do_kernel_fault function in the fault.c file. In the output above, "Unable to handle kernel paging request at virtual address 45685516" reports an address that cannot be accessed in kernel space.
static void __do_kernel_fault(struct mm_struct *mm, unsigned long addr, unsigned int fsr, struct pt_regs *regs)
{
    /* Are we prepared to handle this kernel fault? */
    if (fixup_exception(regs))
        return;

    /* No handler, we'll have to terminate things with extreme prejudice. */
    bust_spinlocks(1);
    printk(KERN_ALERT "Unable to handle kernel %s at virtual address %08lx\n",
           (addr < PAGE_SIZE) ? "NULL pointer dereference" : "paging request", addr);

    /* The page-table lines of the report are printed next: */
    printk(KERN_ALERT "pgd = %p\n", mm->pgd);
    pgd = pgd_offset(mm, addr);
    printk(KERN_ALERT "[%08lx] *pgd=%08lx", addr, pgd_val(*pgd));
    …
}
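The first line of the report can be reproduced in user space. This is a sketch under the assumption that the kernel picks its wording by comparing the faulting address with the page size (4096 bytes is assumed here); fault_kind, format_fault_line and fault_line are hypothetical helper names, not kernel symbols.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 4 KB pages. */
#define PAGE_SIZE_ASSUMED 4096UL

/* Scratch buffer for one formatted report line. */
static char fault_line[128];

/* Addresses inside the first page are reported as NULL dereferences,
 * everything else as a paging request. */
static const char *fault_kind(unsigned long addr)
{
    return (addr < PAGE_SIZE_ASSUMED) ? "NULL pointer dereference"
                                      : "paging request";
}

/* Formats the first oops line into buf; returns snprintf's result. */
static int format_fault_line(char *buf, size_t len, unsigned long addr)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "Unable to handle kernel %s at virtual address %08lx",
                    fault_kind(addr), addr);
}
```

Formatting the address 0x45685516 this way gives the same wording as the first line of the report in this article; an address such as 0x10 would be labeled a NULL pointer dereference instead.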
(3) The die function is called next. Inside die, the address of the thread_info structure is obtained as follows:
struct thread_info *thread = current_thread_info();

static inline struct thread_info *current_thread_info(void)
{
    register unsigned long sp asm("sp");
    return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
}
With sp = 0xc07e9c28, current_thread_info yields the address of thread_info:
(0xc07e9c28 & 0xffffe000) = 0xc07e8000 (the address of thread_info, which is also the bottom, that is the lowest address, of the stack region)
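The masking step can be tried outside the kernel with plain integers. This is a minimal sketch that treats addresses as 32-bit values; thread_info_base and stack_offset are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u

/* Clears the low 13 bits of sp, giving the start of the 8 KB region
 * that holds both thread_info and the kernel stack. */
static uint32_t thread_info_base(uint32_t sp)
{
    /* The mask trick only works when THREAD_SIZE is a power of two. */
    assert((THREAD_SIZE & (THREAD_SIZE - 1u)) == 0u);
    return sp & ~(THREAD_SIZE - 1u);
}

/* Offset of sp inside its 8 KB region. */
static uint32_t stack_offset(uint32_t sp)
{
    return sp & (THREAD_SIZE - 1u);
}
```

For the sp in the report, 0xc07e9c28, the base is 0xc07e8000 and the offset is 0x1c28, which only works because the region is aligned to its own size.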
(4) The following information is printed by the __die function:
Internal error: Oops: 1 [#1]
last sysfs file: /sys/devices/form/tpm/cfg_l2/l2_rule_add
Modules linked in: splic mmp(P)
CPU: 0    Tainted: P  (2.6.32.11 #42)
PC is at dev_get_by_flags+0xfc/0x140
LR is at dev_get_by_flags+0xe8/0x140
pc : []    lr : []    psr: 20000013
sp : c07e9c28  ip : 00000000  fp : c07e9c64
r10: c6bcc560  r9 : c646a220  r8 : c66a0000
r7 : c6a00000  r6 : c0204e56  r5 : 30687461  r4 : 30687461
r3 : 00000000  r2 : 00000010  r1 : c0204e56  r0 : ffffffff
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005397f  Table: 065a4000  DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc07e8270)
Stack: (0xc07e9c28 to 0xc07ea000)
Function call relationship: die("Oops", regs, fsr); -> __die(str, err, thread, regs)
The following is the definition of the __die function:
static void __die(const char *str, int err, struct thread_info *thread, struct pt_regs *regs)
{
    struct task_struct *tsk = thread->task;
    static int die_counter;

    /* Internal error: Oops: 1 [#1] */
    printk(KERN_EMERG "Internal error: %s: %x [#%d]" S_PREEMPT S_SMP "\n",
           str, err, ++die_counter);

    /* last sysfs file: /sys/devices/form/tpm/cfg_l2/l2_rule_add */
    sysfs_printk_last_file();

    /* modules loaded in the kernel: "Modules linked in: splic mmp(P)" */
    print_modules();

    /* print register information */
    __show_regs(regs);

    /*
     * Process swapper (pid: 0, stack limit = 0xc07e8270)
     * tsk->comm in task_struct is the name of the executable with the
     * path removed. Here swapper is the idle process, with pid 0, which
     * creates the kernel process init. stack limit = 0xc07e8270 points
     * to the end address of thread_info.
     */
    printk(KERN_EMERG "Process %.*s (pid: %d, stack limit = 0x%p)\n",
           TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk), thread + 1);

    /* dump_mem prints the stack contents from the current sp up to the
     * high end of the stack region (stack page + THREAD_SIZE) */
    if (!user_mode(regs) || in_interrupt()) {
        dump_mem(KERN_EMERG, "Stack: ", regs->ARM_sp,
                 THREAD_SIZE + (unsigned long)task_stack_page(tsk));
        dump_backtrace(regs, tsk);
        dump_instr(KERN_EMERG, regs);
    }
}
The function above relies on the relationships among thread_info, task_struct and sp. The stack member of task_struct holds the bottom (lowest address) of the stack region, which is also the address of the corresponding thread_info structure. Stack data is stored downward starting from the bottom address + 8 KB, and sp points to the current top of the stack. The expression (unsigned long)task_stack_page(tsk) relies on the following macros:
#define task_stack_page(task) ((task)->stack), which returns the bottom of the stack, that is, the thread_info address, from the task_struct.
#define task_thread_info(task) ((struct thread_info *)(task)->stack), which returns the thread_info pointer from the task_struct.
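The numbers in the report can be tied back to these expressions with a little arithmetic. In the sketch below the thread_info size of 0x270 bytes is not taken from a header; it is inferred from the report itself (stack limit 0xc07e8270 minus base 0xc07e8000) and is therefore an assumption, as are the function names.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE 8192u

/* Assumption: sizeof(struct thread_info) inferred from the report,
 * 0xc07e8270 - 0xc07e8000. */
#define THREAD_INFO_SIZE_ASSUMED 0x270u

/* Equivalent of "thread + 1": the first byte after thread_info. */
static uint32_t stack_limit(uint32_t base)
{
    return base + THREAD_INFO_SIZE_ASSUMED;
}

/* Equivalent of THREAD_SIZE + task_stack_page(tsk): end of the dump. */
static uint32_t dump_end(uint32_t base)
{
    return base + THREAD_SIZE;
}

/* Number of 32-bit words dump_mem walks from sp to the end of the region. */
static uint32_t words_dumped(uint32_t sp, uint32_t base)
{
    assert(sp >= base && sp <= dump_end(base));
    return (dump_end(base) - sp) / 4u;
}
```

With base 0xc07e8000 and sp 0xc07e9c28 this reproduces the "stack limit = 0xc07e8270" and "Stack: (0xc07e9c28 to 0xc07ea000)" values from the report, and the dump covers 246 words.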
(5) The dump_backtrace function
This function prints the function call chain. fp is the frame pointer, used to trace the path of the program backward through its callers. The function mainly checks fp to decide whether a backtrace is possible and, if so, calls c_backtrace, which is written in assembly in the arch/arm/lib/backtrace.S file.
static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
{
    unsigned int fp, mode;
    int ok = 1;

    printk("Backtrace: ");

    if (!tsk)
        tsk = current;

    if (regs) {
        fp = regs->ARM_fp;
        mode = processor_mode(regs);
    } else if (tsk != current) {
        fp = thread_saved_fp(tsk);
        mode = 0x10;
    } else {
        asm("mov %0, fp" : "=r" (fp) : : "cc");
        mode = 0x10;
    }

    if (!fp) {
        printk("no frame pointer");
        ok = 0;
    } else if (verify_stack(fp)) {
        printk("invalid frame pointer 0x%08x", fp);
        ok = 0;
    } else if (fp
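To see what following fp means in practice, the frame chain can be walked by hand over words taken from the stack dump earlier in this article. The sketch assumes the classic ARM APCS frame layout, in which fp points at the saved pc and the saved lr, sp and caller's fp sit at fp-4, fp-8 and fp-12; that layout, the stack_words table and the helper names are assumptions for illustration, not the kernel's c_backtrace implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A few {address, value} words copied from the stack dump above. */
static const uint32_t stack_words[][2] = {
    {0xc07e9c58u, 0xc07e9d24u}, /* caller's fp */
    {0xc07e9c5cu, 0xc07e9c68u}, /* saved sp */
    {0xc07e9c60u, 0xc071f764u}, /* saved lr */
    {0xc07e9c64u, 0xc06bed38u}, /* saved pc (fp points here) */
    {0xc07e9d18u, 0xc07e9d54u},
    {0xc07e9d1cu, 0xc07e9d28u},
    {0xc07e9d20u, 0xc071f9a0u},
    {0xc07e9d24u, 0xc071ebc0u},
};

/* One decoded stack frame (assumed APCS layout). */
struct apcs_frame {
    uint32_t pc, lr, sp, fp;
};

/* Scratch frame for experiments. */
static struct apcs_frame frame;

/* Looks up one captured word; returns 0 on success, -1 if not captured. */
static int read_word(uint32_t addr, uint32_t *out)
{
    size_t n = sizeof(stack_words) / sizeof(stack_words[0]);
    for (size_t i = 0; i < n; i++) {
        if (stack_words[i][0] == addr) {
            *out = stack_words[i][1];
            return 0;
        }
    }
    return -1;
}

/* Decodes the frame that fp points at; returns 0 on success, -1 otherwise. */
static int read_frame(uint32_t fp, struct apcs_frame *f)
{
    assert(f != NULL);
    if (read_word(fp, &f->pc) || read_word(fp - 4u, &f->lr) ||
        read_word(fp - 8u, &f->sp) || read_word(fp - 12u, &f->fp))
        return -1;
    return 0;
}
```

Starting from fp = 0xc07e9c64 as shown in the register dump, the first frame yields a saved lr of 0xc071f764 and a caller's fp of 0xc07e9d24; repeating the step from there gives the next frame, which is the loop a backtrace performs until the frame pointer stops being valid.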