2025-01-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article shows how to write shell scripts that monitor a Linux server's hardware status and send email alerts when a fault is detected. The scripts are shared here for your reference.
Monitor hardware health
A shell script monitors CPU usage, memory usage, and load average, records the values to log files, and notifies the administrator by email when the server is under pressure.
Principle:
1. Get the CPU, memory, and load average values.
2. Determine whether a value exceeds the configured limit, for example CPU > 90%, memory > 90%, load average > 2.
3. If a value is over the limit, send an email to notify the administrator. Notifications are throttled: each type of alert is sent at most once per hour.
4. Write the values to the log.
5. Set crontab to run the script every 30 seconds. cron's finest schedule granularity is one minute, so this takes two entries for the same minute, the second one prefixed with sleep 30.
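The once-per-hour throttle in step 3 can be sketched on its own. This is a minimal illustration, not the script's exact code: the function names should_notify and mark_notified are hypothetical, and the current time is passed in as an argument so the logic is easy to exercise with fixed timestamps.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script).
# Succeeds (returns 0) when no notification has been recorded in the
# marker file within the last $2 seconds, measured against time $3.
should_notify() {
    local marker=$1 expire=$2 now=$3 last

    if [ -f "$marker" ] && [ -s "$marker" ]; then
        last=$(cat "$marker")
        if [ $((now - last)) -le "$expire" ]; then
            return 1    # still inside the quiet period
        fi
        rm -f "$marker" # record has expired, clear it
    fi
    return 0
}

# Hypothetical helper: record that a notification was sent at time $2.
mark_notified() {
    echo "$2" > "$1"
}
```

A caller would run something like `should_notify /tmp/example.remark 3600 "$(date +%s)" && send the mail`, then call mark_notified with the same file and timestamp. The marker path here is an example only.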
ServerMonitor.sh
#!/bin/bash
# System monitoring: record cpu, memory, and load average, and notify the
# administrator by email when a specified value is exceeded.

# *** config start ***

# current directory path
ROOT=$(cd "$(dirname "$0")"; pwd)

# current server name
HOST=$(hostname)

# log file paths (the logs directory must already exist)
CPU_LOG="${ROOT}/logs/cpu.log"
MEM_LOG="${ROOT}/logs/mem.log"
LOAD_LOG="${ROOT}/logs/load.log"

# notification email list
NOTICE_EMAIL='admin@admin.com'

# files recording the last notification time for cpu, memory, load average
CPU_REMARK='/tmp/servermonitor_cpu.remark'
MEM_REMARK='/tmp/servermonitor_mem.remark'
LOAD_REMARK='/tmp/servermonitor_loadaverage.remark'

# minimum interval between notification emails, in seconds
REMARK_EXPIRE=3600
NOW=$(date +%s)

# *** config end ***

# *** function start ***

# get CPU usage
function GetCpu() {
    # average the idle column ($15) over five one-second vmstat samples
    cpufree=$(vmstat 1 5 | sed -n '3,$p' | awk '{x = x + $15} END {print x/5}' | awk -F. '{print $1}')
    cpuused=$((100 - cpufree))
    echo "$cpuused"

    local remark
    remark=$(GetRemark "${CPU_REMARK}")

    # check whether CPU usage exceeds 90%
    if [ "$remark" = "" ] && [ "$cpuused" -gt 90 ]; then
        echo "Subject: ${HOST} CPU uses more than 90% $(date '+%Y-%m-%d %H:%M:%S')" | sendmail ${NOTICE_EMAIL}
        echo "$(date +%s)" > "$CPU_REMARK"
    fi
}

# get memory usage
function GetMem() {
    # Assumes line 3 of `free -m` is the "-/+ buffers/cache" row printed by
    # older procps versions. Newer versions print the Swap row on line 3,
    # so adjust the row and columns on such systems.
    mem=$(free -m | sed -n '3,3p')
    used=$(echo $mem | awk -F ' ' '{print $3}')
    free=$(echo $mem | awk -F ' ' '{print $4}')
    total=$((used + free))
    limit=$((total / 10))
    echo "${total} ${used} ${free}"

    local remark
    remark=$(GetRemark "${MEM_REMARK}")

    # check whether memory usage exceeds 90% (free is below 10% of total)
    if [ "$remark" = "" ] && [ "$limit" -gt "$free" ]; then
        echo "Subject: ${HOST} Memory uses more than 90% $(date '+%Y-%m-%d %H:%M:%S')" | sendmail ${NOTICE_EMAIL}
        echo "$(date +%s)" > "$MEM_REMARK"
    fi
}

# get load average
function GetLoad() {
    load=$(uptime | awk -F 'load average: ' '{print $2}')
    m1=$(echo $load | awk -F ', ' '{print $1}')
    m5=$(echo $load | awk -F ', ' '{print $2}')
    m15=$(echo $load | awk -F ', ' '{print $3}')
    echo "${m1} ${m5} ${m15}"

    # integer part of the one-minute load
    m1u=$(echo $m1 | awk -F '.' '{print $1}')

    local remark
    remark=$(GetRemark "${LOAD_REMARK}")

    # check whether the load is under pressure
    if [ "$remark" = "" ] && [ "$m1u" -gt "2" ]; then
        echo "Subject: ${HOST} Load Average more than 2 $(date '+%Y-%m-%d %H:%M:%S')" | sendmail ${NOTICE_EMAIL}
        echo "$(date +%s)" > "$LOAD_REMARK"
    fi
}

# get the time the last email was sent (empty if none or expired)
function GetRemark() {
    local remark

    if [ -f "$1" ] && [ -s "$1" ]; then
        remark=$(cat "$1")

        if [ $((NOW - remark)) -gt "$REMARK_EXPIRE" ]; then
            rm -f "$1"
            remark=""
        fi
    else
        remark=""
    fi

    echo "$remark"
}

# *** function end ***

cpuinfo=$(GetCpu)
meminfo=$(GetMem)
loadinfo=$(GetLoad)

echo "cpu: ${cpuinfo}" >> "${CPU_LOG}"
echo "mem: ${meminfo}" >> "${MEM_LOG}"
echo "load: ${loadinfo}" >> "${LOAD_LOG}"

exit 0
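One detail of the load check is worth spelling out: the script compares only the integer part of the one-minute load, so a load of 2.90 does not trigger the "more than 2" alert, and the alert first fires at 3.00. If a true decimal comparison is wanted, a sketch like the following could be used. The function name load_exceeds is hypothetical and not part of the original script; it relies only on awk's numeric comparison.

```shell
#!/bin/bash
# Hypothetical helper (an alternative to the integer comparison above).
# Succeeds (returns 0) when load value $1 is strictly greater than
# limit $2, comparing as decimal numbers.
load_exceeds() {
    # awk exits with status 0 when the comparison is true, 1 otherwise
    awk -v l="$1" -v m="$2" 'BEGIN { exit !(l > m) }'
}

if load_exceeds "2.90" "2"; then
    echo "load is above the limit"
fi
```

With this variant a load of 2.90 counts as exceeding a limit of 2, while exactly 2.00 does not.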
Monitor whether the website is abnormal
A shell script monitors websites for abnormal behavior and automatically emails the administrator when a problem is found.
Process:
1. Check whether the http_code returned by the website equals 200. If not, it is treated as an exception.
2. Check the website's load time. More than MAXLOADTIME (10 seconds) is treated as an exception.
3. After sending a notification email, record the sending time in /tmp/monitor_load.remark so that no further email is sent within one hour. Once the hour has passed, /tmp/monitor_load.remark is cleared.
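The two checks in steps 1 and 2 can be sketched as a small classifier. This is an illustrative variant rather than the article's script: the names check_site and probe_site are hypothetical, and probe_site fetches the status code and total time with a single curl request, using curl's -w write-out variables http_code and time_total. The 30-second --max-time cap is an assumed value.

```shell
#!/bin/bash
# Hypothetical helper: classify one probe result.
# $1 = HTTP status code, $2 = total time in seconds (e.g. 0.532),
# $3 = maximum acceptable load time in whole seconds
check_site() {
    local code=$1 total=$2 max=$3

    if [ "$code" != "200" ]; then
        echo "unreachable"
    elif [ "${total%%.*}" -ge "$max" ]; then
        # compare the integer part of the load time with the limit
        echo "slow"
    else
        echo "ok"
    fi
}

# Hypothetical wrapper: probe a URL once and classify the result.
probe_site() {
    local url=$1 max=$2 result

    result=$(curl -o /dev/null -s -m 30 -w '%{http_code} %{time_total}' "$url")
    check_site "${result%% *}" "${result##* }" "$max"
}
```

For example, `probe_site "http://web01.example.com" 10` prints ok, slow, or unreachable, and the caller can decide whether to send mail based on that word.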
#!/bin/bash

SITES=("http://web01.example.com" "http://web02.example.com") # websites to be monitored
NOTICE_EMAIL='me@example.com'          # administrator email
MAXLOADTIME=10                         # access timeout setting, in seconds
REMARKFILE='/tmp/monitor_load.remark'  # records whether a notification email was sent; if so, none is sent again within one hour
ISSEND=0                               # whether an email was sent in this run
EXPIRE=3600                            # number of seconds between emails
NOW=$(date +%s)

if [ -f "$REMARKFILE" ] && [ -s "$REMARKFILE" ]; then
    REMARK=$(cat "$REMARKFILE")

    # delete the expired record of the last email time
    if [ $((NOW - REMARK)) -gt "$EXPIRE" ]; then
        rm -f "${REMARKFILE}"
        REMARK=""
    fi
else
    REMARK=""
fi

# loop over each site
for site in "${SITES[@]}"; do

    printf 'start to load %s\n' "${site}"
    site_load_time=$(curl -o /dev/null -s -w "time_connect: %{time_connect}\ntime_starttransfer: %{time_starttransfer}\ntime_total: %{time_total}" "${site}")
    site_access=$(curl -o /dev/null -s -w '%{http_code}' "${site}")
    # keep only the value after the last "time_total: " label
    time_total=${site_load_time##*: }

    printf '%s\n' "$(date '+%Y-%m-%d %H:%M:%S')"
    printf 'site load time\n%s\n' "${site_load_time}"
    printf 'site access:%s\n\n' "${site_access}"

    # no email sent within the last hour
    if [ "$REMARK" = "" ]; then
        # check access
        if [ "$time_total" = "0.000" ] || [ "$site_access" != "200" ]; then
            echo "Subject: ${site} can't access $(date '+%Y-%m-%d %H:%M:%S')" | sendmail ${NOTICE_EMAIL}
            ISSEND=1
        else
            # check load time
            if [ "${time_total%%.*}" -ge ${MAXLOADTIME} ]; then
                echo "Subject: ${site} load time total:${time_total} $(date '+%Y-%m-%d %H:%M:%S')" | sendmail ${NOTICE_EMAIL}
                ISSEND=1
            fi
        fi
    fi

done

# record the sending time after an email has been sent
if [ "$ISSEND" = "1" ]; then
    echo "$(date +%s)" > "$REMARKFILE"
fi

exit 0
That is all for this article, "How to implement a monitoring script for Linux server hardware running status and fault email alerts". Thank you for reading! I hope the scripts shared here are helpful.