2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces how watchdogs are used in the Linux kernel and how each of them is fed.
There are three watchdogs in the Linux kernel, and each of them needs to be fed regularly:
1. /dev/watchdog
2. Softlockup detection mechanism
3. Hardlockup detection mechanism
Start with 1, /dev/watchdog. How should this watchdog be fed? The Linux kernel ships a sample program:
samples/watchdog/watchdog-simple.c
    // SPDX-License-Identifier: GPL-2.0
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        /* Opening the device starts the watchdog countdown. */
        int fd = open("/dev/watchdog", O_WRONLY);
        int ret = 0;
        if (fd == -1) {
            perror("watchdog");
            exit(EXIT_FAILURE);
        }
        while (1) {
            /* Each successful write feeds the dog. */
            ret = write(fd, "\0", 1);
            if (ret != 1) {
                ret = -1;
                break;
            }
            sleep(10);
        }
        close(fd);
        return ret;
    }
This program writes a single NUL byte to /dev/watchdog every 10 seconds; that write is the act of feeding the dog. From the sample alone it is hard to see what the watchdog is good for, but in real projects it is extremely useful. For example:
Suppose the central bank of some country runs a database on a Linux server with 4 TB of memory and 320 CPU cores, and the database holds the bank account information of everyone in the country. If the database hits I/O read or write errors, or hangs because of a bug, nobody can deposit or withdraw money, and the national economy is paralyzed at once.
Does Linux offer a mechanism for this situation? This is where /dev/watchdog comes in.
You only need to add code like the sample above to the database program so that it feeds the dog once every 10 s.
If the database program hangs, it can no longer feed the dog. After the timeout (60 s by default), the dog bites and, by default, immediately restarts the server.
The restart reloads the database program. Alternatively, while the server is restarting it loses contact with its cluster, which triggers the cluster's split-brain detection and moves the database program to another machine in the cluster. Either way, much of the loss is avoided, which is why /dev/watchdog is so useful.
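One way to wire the feeding into an application is to feed the dog only when the program has demonstrably made progress since the last check. The following is a minimal sketch of that idea, not code from the kernel sample: the helper name feed_if_progress and the progress counter are hypothetical, and the caller is assumed to pass a descriptor obtained from open("/dev/watchdog", O_WRONLY).

```c
#include <unistd.h>

/*
 * Hypothetical helper (not part of the kernel sample): feed the watchdog
 * behind wd_fd only if the application's progress counter has advanced
 * since the previous call.
 *
 * Returns 1 if the dog was fed, 0 if feeding was skipped because no
 * progress was seen, and -1 if the write failed.
 */
static int feed_if_progress(int wd_fd, unsigned long progress,
                            unsigned long *last_seen)
{
    if (progress == *last_seen)
        return 0;               /* stalled: let the countdown continue */

    /* Any write counts as a keepalive; the sample writes a single NUL. */
    if (write(wd_fd, "\0", 1) != 1)
        return -1;

    *last_seen = progress;      /* remember what we last fed for */
    return 1;
}
```

A main loop could call this every 10 s with a counter that the worker code increments after each completed unit of work. If the worker hangs, the counter stops advancing, feeding stops, and the watchdog restarts the machine once its timeout expires.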
Let's look at how it works:
    # ps -ef | grep watchdog
    root       104     2  0  2020 ?        00:00:00 [watchdogd]
    # ls -l /dev/watchdog*
    crw------- 1 root root  10, 130 Dec 30 20:04 /dev/watchdog
    crw------- 1 root root 247,   0 Dec 30 20:04 /dev/watchdog0
The output shows a kernel thread, watchdogd, and two character device files: /dev/watchdog and /dev/watchdog0.
The watchdogd kernel thread runs with real-time scheduling priority and does the actual feeding on the kernel side. /dev/watchdog is the generic interface file that the kernel exposes to user space; it is used to start the dog, feed it, query its status, and so on. /dev/watchdog0 is one specific dog implementation. It can be backed by a physical watchdog device or by the softdog kernel module, which emulates the hardware in software (to load it: modprobe softdog).
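Besides write(), the device also accepts ioctl requests for querying and controlling the dog. The sketch below wraps three requests defined in the kernel's linux/watchdog.h header (WDIOC_GETTIMEOUT, WDIOC_SETTIMEOUT, WDIOC_KEEPALIVE). The wrapper names are hypothetical, and whether a given request is supported depends on the driver behind the device, so treat this as an illustration rather than a complete client.

```c
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/*
 * Hypothetical wrappers around the watchdog ioctl interface.
 * Each returns 0 on success and -1 on failure, for example when fd
 * does not refer to a watchdog device or the driver lacks the request.
 */

/* Read the current timeout, in seconds, into *seconds. */
static int wd_get_timeout(int fd, int *seconds)
{
    return ioctl(fd, WDIOC_GETTIMEOUT, seconds) == 0 ? 0 : -1;
}

/* Request a new timeout; the value actually set is written back. */
static int wd_set_timeout(int fd, int *seconds)
{
    return ioctl(fd, WDIOC_SETTIMEOUT, seconds) == 0 ? 0 : -1;
}

/* Feed the dog through the ioctl path instead of write(). */
static int wd_keepalive(int fd)
{
    return ioctl(fd, WDIOC_KEEPALIVE, 0) == 0 ? 0 : -1;
}
```

With a descriptor from open("/dev/watchdog", O_WRONLY), a program could call wd_get_timeout() once at startup and pick a feeding interval comfortably shorter than the reported value.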
Let's see how the softdog kernel module emulates hardware to implement this functionality:
    /* Abridged excerpt; elided code is marked with "...". */
    static int __init softdog_init(void)
    {
        /* ... */
        hrtimer_init(&softdog_ticktock, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
        softdog_ticktock.function = softdog_fire;  /* runs on expiry */
    }

    static enum hrtimer_restart softdog_fire(struct hrtimer *timer)
    {
        /* ... */
        emergency_restart();
    }

    static int softdog_ping(struct watchdog_device *w)
    {
        /* Re-arm the countdown; w->timeout is 60 s by default. */
        hrtimer_start(&softdog_ticktock, ktime_set(w->timeout, 0),
                      HRTIMER_MODE_REL);
        /* ... */
    }
The code makes the behavior easy to follow. Once the watchdog is started (by opening /dev/watchdog), the system restarts after 60 s by default. Feeding the dog once during the countdown (softdog_ping) resets the countdown to a full 60 s. So as long as the dog keeps being fed in time, emergency_restart() never runs and the system does not restart.
Now for 2, the softlockup detection mechanism, and 3, the hardlockup detection mechanism.
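The countdown logic can be modeled in user space with plain integers. This is only a sketch of the idea, not kernel code: the struct and function names are hypothetical, and times are whole seconds supplied by the caller rather than read from an hrtimer.

```c
#include <stdbool.h>

/* User-space model of the softdog countdown; times are plain seconds. */
struct soft_timer {
    long timeout;   /* seconds granted by each ping (softdog default: 60) */
    long deadline;  /* absolute time at which the timer would fire */
};

/* Plays the role of softdog_ping(): push the deadline out again. */
static void model_ping(struct soft_timer *t, long now)
{
    t->deadline = now + t->timeout;
}

/* Plays the role of the timer expiring: true means "would restart now". */
static bool model_expired(const struct soft_timer *t, long now)
{
    return now >= t->deadline;
}
```

For instance, with a 60 s timeout, a ping at t = 0 followed by another at t = 50 moves the deadline to t = 110, so the model does not report expiry at t = 100 but does at t = 110.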
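The softlockup detection mechanism feeds its dog from a per-CPU hrtimer: the timer periodically wakes the migration/N kernel thread on that CPU, and each time migration/N runs, it resets a timestamp. If the thread cannot get onto the CPU, the timestamp goes stale, and the kernel reports a soft lockup once the timestamp is older than the threshold.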
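The timestamp check behind soft lockup detection can be modeled in user space as follows. This is a simplified sketch with hypothetical names, not the kernel's implementation. It assumes a threshold of 20 s, which corresponds to twice the default watchdog_thresh of 10 s, and it takes the current time as a plain number of seconds.

```c
#include <stdbool.h>

/* Assumed threshold in seconds: twice the default watchdog_thresh (10). */
#define MODEL_SOFTLOCKUP_THRESH 20

/* Per-CPU state in this model: when the fed thread last got to run. */
struct cpu_watch {
    long touch_ts;
};

/* What the woken per-CPU thread does: record that it was scheduled. */
static void model_touch(struct cpu_watch *c, long now)
{
    c->touch_ts = now;
}

/* What the periodic timer checks: has the timestamp gone stale? */
static bool model_soft_lockup(const struct cpu_watch *c, long now)
{
    return now - c->touch_ts > MODEL_SOFTLOCKUP_THRESH;
}
```

As long as model_touch() keeps being called, model_soft_lockup() stays false; if the thread is starved for more than 20 s, the next check reports a lockup.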
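The hardlockup detection mechanism feeds its dog by incrementing a counter every time the hrtimer runs. If the counter has not changed between two successive checks, timer interrupts are no longer being serviced on that CPU, and the kernel reports a hard lockup.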
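The counter comparison used for hard lockup detection can be sketched in user space like this. The names are hypothetical and the model leaves out where the check runs (in the kernel it is typically made from a context that still fires when ordinary interrupts are blocked, such as an NMI); it only shows the compare-and-save logic.

```c
#include <stdbool.h>

/* Per-CPU counters in this model. */
struct cpu_counters {
    unsigned long interrupts;        /* bumped by every timer tick */
    unsigned long interrupts_saved;  /* value seen at the previous check */
};

/* What the timer does on each run: feed the dog by counting. */
static void model_timer_tick(struct cpu_counters *c)
{
    c->interrupts++;
}

/*
 * What the periodic check does: if the counter did not move since the
 * previous check, report a hard lockup; otherwise remember the new value.
 */
static bool model_hard_lockup(struct cpu_counters *c)
{
    if (c->interrupts_saved == c->interrupts)
        return true;
    c->interrupts_saved = c->interrupts;
    return false;
}
```

Two checks in a row with no tick in between make the second check return true, which corresponds to a CPU that has stopped handling timer interrupts.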
© 2024 shulou.com SLNews company. All rights reserved.