How to find out the cause of a Linux crash

2025-02-24 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article covers how to find the cause of a Linux crash. Many people run into this problem in practice, so the sections below walk through how to handle each situation.

Although the Linux kernel has a reputation for being "undying" and rarely hangs or crashes, there is still some chance of downtime under special circumstances. Because Linux is widely used in production environments, every outage causes considerable losses. You may take hundreds of days of uptime for granted, but the moment the system is down for more than ten seconds, you will break into a sweat. It is hard to imagine what an outage at a stock exchange would be like; investors across the country might be up in arms. So we need some techniques for finding the cause of a crash, so that hangs and kernel crashes can be avoided.

Please note that the following methods may not apply to servers, because a desktop environment differs considerably from a server.

X Crash

In fact, the Linux kernel rarely fails. Most of the "crashes" we encounter only look like crashes because X has stopped responding. What should you do when X does not respond?

The usual routine is to press Ctrl + Alt + F1 through F6 to switch to a tty (on many distributions F7 or F8 is where X itself runs), log in as root, run top to see which program is eating the most resources, and then kill that program with pkill, kill, or killall. Alternatively, press Ctrl + Alt + Backspace to restart X (note from Black Sun and White Moon: this key combination is disabled by default in Ubuntu and Fedora).
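The top-then-kill routine above can be sketched as a small shell helper. This is only an illustration, not something from the original article: the function name rank_by_cpu is hypothetical, and the commented usage assumes a ps that accepts the -e and -o options, as the procps version does.

```shell
# rank_by_cpu: read "PID %CPU COMMAND" lines on stdin and print them
# sorted by CPU usage, highest first (hypothetical helper name).
# Optional argument: how many lines to show (default 5).
rank_by_cpu() {
    sort -k2,2nr | head -n "${1:-5}"
}

# Typical use from a tty or an SSH session (not run here):
#   ps -e -o pid= -o pcpu= -o comm= | rank_by_cpu 5
#   kill <PID>       # sends SIGTERM, giving the program a chance to exit
#   kill -9 <PID>    # SIGKILL, only if the program ignores SIGTERM
```

Trying a plain kill before kill -9 lets the program clean up after itself, which matters if it holds unsaved data.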

If switching to a tty fails or gets no response, try logging in to the machine over SSH and killing the program from there. Often only X is unresponsive while the kernel and the SSH daemon are still running, so this method works.

See also: configuring the SSH daemon on Arch.

What if X is stuck, none of these methods work, and you cannot log in to the PC over SSH either? Don't worry, there is still the "reisub" method. However, the kernel's SysRq function must be enabled before you can use it: run echo "1" > /proc/sys/kernel/sysrq, or edit /etc/sysctl.conf and set kernel.sysrq = 1 so that it is enabled when the system starts. When the system misbehaves, hold Alt+SysRq and press the keys r, e, i, s, u, b in turn, and the system will restart automatically. (For more on SysRq, see: What if Linux crashes?)
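Before relying on reisub, it helps to confirm that SysRq is actually enabled. The following is a minimal sketch, assuming the usual kernel convention that 0 means disabled, 1 means all functions enabled, and a larger value is a bitmask of allowed functions. The function name sysrq_state is hypothetical.

```shell
# sysrq_state: report whether magic SysRq is enabled, based on the value
# stored in the given file (defaults to the live kernel setting).
sysrq_state() {
    file="${1:-/proc/sys/kernel/sysrq}"
    value=$(cat "$file") || return 1
    case "$value" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "partial (bitmask $value)" ;;
    esac
}

# To enable until the next reboot (as root, not run here):
#   echo 1 > /proc/sys/kernel/sysrq
# To enable on every boot, add this line to /etc/sysctl.conf:
#   kernel.sysrq = 1
```

If the result is "partial", some of the six keys may be ignored, so reisub may not complete. For reference, the keys in order switch the keyboard out of raw mode (r), send SIGTERM to all processes (e), send SIGKILL to all processes (i), sync the disks (s), remount file systems read-only (u), and reboot (b).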

Unless absolutely necessary, do not press the Power button to force a shutdown. Doing so may damage hardware or lose data, or even lead to bad sectors on the disk!

X crashes and the kernel is intact

Common symptoms: programs do not respond, the screen is garbled, the pointer does not follow the mouse, keyboard input is not recognized, and so on. Yet background music keeps playing normally, or the corresponding LED still toggles when you press Caps Lock, Num Lock, or Scroll Lock. In this case, restarting X or the computer with the methods above returns things to normal.

Application Crash

This is quite common, but also quite hard to solve. Most applications on Linux are open source, and they are not always highly stable. A missing library, a wrong library version, or a bug in the code can each cause a program to fail.

When you run into this kind of problem, first check that the configuration file is correct, since a bad edit to the configuration file can make the program fail. If you are sure the configuration file is not at fault but the program still misbehaves, try deleting the configuration file (back it up first!) and then start the software again. A program's configuration file is usually in one of these locations:

~/.[APPNAME]

~/.config/[APPNAME]

/etc/[APPNAME].conf
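One way to honor the "back it up first" advice is to move the configuration aside rather than delete it. The helper below is a sketch under that assumption; the name reset_config and the .bak.TIMESTAMP suffix are illustrative choices, not a convention from the article.

```shell
# reset_config: move a configuration file or directory aside instead of
# deleting it, so it can be restored later (hypothetical helper name).
# Prints the path of the backup copy on success.
reset_config() {
    target="$1"
    [ -e "$target" ] || { echo "no such path: $target" >&2; return 1; }
    backup="$target.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$target" "$backup" && echo "$backup"
}

# Example (not run here), with APPNAME standing in for the real program:
#   reset_config ~/.config/APPNAME
```

If the program works again after the reset, the old configuration was the cause, and the backup can be compared against the freshly generated file to find the bad setting.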

It may also be a library error. Run the program from a terminal by typing its name or path, and debug based on the messages it prints. Because so many things can make a program crash, they cannot all be listed here; search Google for the error message to find a solution.
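For the missing-library case, the ldd tool lists the shared libraries a program needs and marks unresolved ones as "not found". The filter below is a small sketch built on that output format; the function name missing_libs is hypothetical, and ldd is not mentioned in the original article.

```shell
# missing_libs: read `ldd` output on stdin and print the names of shared
# libraries that the dynamic linker could not find.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example (not run here):
#   ldd /path/to/program | missing_libs
```

An empty result means every library resolved, so the problem lies elsewhere, such as a version mismatch or a bug in the program itself.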

Kernel Panic

X problems are easy to deal with, but if your luck runs out and you hit a Kernel Panic, you are truly stuck with nowhere to turn.

Kernel Panic has many possible causes, but all of them are rare: hardware problems (IRQ conflict, bad blocks, overheating), software problems (a faulty module, a kernel bug), missing file system support (for example, an ext4 root partition mounted by a kernel without built-in ext4 support), hardware changes (such as adding or replacing memory, or a CPU whose architecture is not supported), and faulty drivers.

A Kernel Panic also shows itself in many forms: failure to boot, abnormally long I/O operations, keyboard LEDs blinking abnormally, the wireless or other indicator lights flashing erratically, no response at all (be careful to distinguish this from an Xorg crash), a complete lockup, a black screen, reisub not working, and so on.

In general, the Linux kernel, which follows the KISS principle, does its best to recover from errors and keep running. If a panic does occur in an extreme case, it prints as much relevant information to the screen as it can. As for how much that is, don't ask; the kernel has done its best.

Because a Kernel Panic is such an extreme situation, some people never encounter one in all their time using Linux. To solve the problem, collect all the relevant information you can. The output printed right after the error is the most direct and useful source (the dump appears on a tty, so shut down X first). Because the kernel has crashed, you are unlikely to find a complete log. Try the following clues:

/var/log/messages: when your luck runs out, a lot of relevant information may have been recorded here. Search it by timestamp.

Retrace your steps: recall everything you did before the Kernel Panic and roll it back. (If you installed a program, the installation log is at /var/log/pacman.log.)

Dump information: the screen output is the system's "last words". Record it with a digital camera or pen and paper. (tty only)
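Searching a log by timestamp, as suggested for /var/log/messages above, can be sketched as a prefix match on each line. This assumes each log line begins with its timestamp, as in the traditional syslog format; the function name log_around is hypothetical.

```shell
# log_around: print the lines of a log file that begin with the given
# timestamp prefix. Uses a plain string comparison, so characters such
# as "[" in the prefix need no escaping.
log_around() {
    prefix="$1"
    file="${2:-/var/log/messages}"
    awk -v p="$prefix" 'index($0, p) == 1' "$file"
}

# Example (not run here): everything logged during one ten-minute window
#   log_around 'Feb 24 10:4'
# The same helper works on the install log mentioned above:
#   log_around '[2025-02-24' /var/log/pacman.log
```

Shortening the prefix widens the time window, which is handy when you only roughly remember when the machine locked up.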

Next, rule out the possible causes one by one. Keep the kernel boot parameters to a minimum, with no unnecessary parameters attached, and disable all non-essential hardware in the BIOS. Related log files:

/var/log/boot

/var/log/xorg: everything related (for reference only)

/var/log/messages

If you can, record all of the screen output and check /var/log/messages.

Possible problems and solutions:

IRQ conflict (luckily I have not run into this one): try changing the hardware IRQ in the BIOS, or upgrade the BIOS. If that has no effect, disable the conflicting hardware or replace the computer.

Bad blocks: try to repair them or mark off the partitions containing bad sectors. Replacing the disk is recommended.

I/O error: same as above. It may also be caused by missing built-in file system support; recompile the kernel or find a * * version of the kernel to install.

Module: remove kernel modules that may be causing the error (such as vboxdrv). The commands involved are:

lsmod: lists loaded modules

modprobe: loads a module (note from Black Sun and White Moon: the counterparts that match the other commands here would be insmod plus depmod; modprobe is more like an upgraded, integrated version of the XXXmod family of commands.)

rmmod: removes a module from the kernel, with the same effect as modprobe -r

modinfo: displays information about a module
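The module commands above can be combined into a quick check before unloading anything. The sketch below reads /proc/modules, where the first field of each line is a module name (the same list lsmod prints); the function name module_loaded is hypothetical.

```shell
# module_loaded: succeed (exit status 0) if the named module appears in
# the module list, which defaults to /proc/modules.
module_loaded() {
    name="$1"
    list="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Example (as root, not run here): unload vboxdrv only if it is loaded
#   module_loaded vboxdrv && modprobe -r vboxdrv
#   modinfo vboxdrv    # inspect version and dependencies afterwards
```

If the panic stops once the module is removed, that module is the likely culprit.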

Driver: AMD (ATI) or NVIDIA graphics card drivers can also cause problems.

Hardware fault: for problems with the hardware itself, test the hardware's reliability and compatibility (e.g. with memtest86+).

Kernel bug: if you have the skills, debug with KDB (the kernel debugger), or recompile the kernel.
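For the file system support case in the list above, the kernel's build configuration shows whether a driver is built in (=y), built as a module (=m), or left out. The following is a minimal sketch; the function name fs_support is hypothetical, and the config file location in the usage comment is an assumption that varies by distribution.

```shell
# fs_support: report how an option is set in a kernel config file:
# "builtin" (=y), "module" (=m) or "absent".
# The option name is passed in full, e.g. CONFIG_EXT4_FS.
fs_support() {
    option="$1"
    config="$2"
    case "$(grep "^$option=" "$config" | head -n 1)" in
        "$option=y") echo "builtin" ;;
        "$option=m") echo "module" ;;
        *) echo "absent" ;;
    esac
}

# Example (not run here; the path is an assumption, check your distribution):
#   fs_support CONFIG_EXT4_FS "/boot/config-$(uname -r)"
```

A result of "module" or "absent" for the root partition's file system matches the scenario described earlier, where the root partition is mounted by a kernel without built-in support for it.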

