Detailed explanation of performance monitoring of designated processes in a Linux system based on Python


2025-07-11 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

There are many tools, components and programs for monitoring a Linux server. However, many processes run on one server at the same time, and during performance testing in particular, several services may be deployed on the same machine. If you only monitor the CPU and memory of the entire server, a performance problem in one service cannot be located quickly and accurately (although other tools can help with this). It is therefore necessary to monitor only the specified processes. With the requirement clear, I wrote a performance monitoring script.

I. Overall approach

1. To make it easy to start and stop monitoring and to view the results at any time, a service is started with Flask. Monitoring is started, stopped and inspected by sending GET requests.

2. Monitoring runs in multiple threads, which control whether CPU, memory and IO are monitored.

3. To reduce dependence on other components, the monitoring results are written to the log.

4. To make the results easy to view, they are returned directly as HTML.

II. Configuration file

config.py

IP = '127.0.0.1'
PORT = '5555'
LEVEL = 'INFO'       # log level
BACKUP_COUNT = 9     # number of log backups to keep
LOG_PATH = 'logs'    # log path
INTERVAL = 1         # interval (s) between two runs of the monitoring commands
SLEEPTIME = 3        # polling interval (s) while monitoring is stopped, used to restart monitoring once the condition is met
ERROR_TIMES = 5      # number of failed command runs; when reached, monitoring stops automatically
IS_JVM_ALERT = True  # whether to alert when the frequency of Full GC is too high
IS_MONITOR_SYSTEM = True  # whether to monitor the system's CPU and memory
IS_MEM_ALERT = True  # whether to alert (by email) when memory is too low
MIN_MEM = 2          # minimum memory, unit: G
# 0: don't clear cache, 1: clear page caches, 2: clear dentries and inodes caches, 3: include 1 and 2
# echo 1 > /proc/sys/vm/drop_caches
ECHO = 0
SMTP_SERVER = 'smtp.sina.com'      # SMTP server
SENDER_NAME = 'Zhang San'          # sender name
SENDER_EMAIL = 'zhangsan@qq.com'   # sender's email
PASSWORD = 'UjBWYVJFZE9RbFpIV1QwOVBUMDlQUT09'  # email password, base64 encoded
RECEIVER_NAME = 'baidu_all'        # receiver name
RECEIVER_EMAIL = ['zhangsan@qq.com', 'zhangsi@qq.com']  # receivers' emails
DISK = 'device1'                   # which disk your application runs on
START_TIME = 'startTime.txt'       # stores the time monitoring was started
FGC_TIMES = 'FullGC.txt'           # stores the time of every Full GC
# html
HTML = '{}'
ERROR = '{}'
HEADER = 'Performance Monitor (pid={})'
ANALYSIS = '{}'

IP and PORT: the IP and port on which the monitoring service listens. It must run on the same server as the monitored service.

BACKUP_COUNT: default is 9, that is, only the monitoring results of the last 9 days are retained
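The article does not show how the log is rotated. As a minimal sketch, assuming the standard library's TimedRotatingFileHandler with one rotation per day, a 9-day retention could look like this. The function name build_logger, the file name monitor.log and the log format are illustrative and not taken from the project.

```python
import logging
import os
from logging.handlers import TimedRotatingFileHandler

BACKUP_COUNT = 9  # same meaning as in config.py
LEVEL = 'INFO'


def build_logger(log_path, name='monitor'):
    """Return a logger that rotates at midnight and keeps BACKUP_COUNT old files.

    `build_logger` and the file name 'monitor.log' are hypothetical names.
    """
    os.makedirs(log_path, exist_ok=True)
    handler = TimedRotatingFileHandler(
        os.path.join(log_path, 'monitor.log'),
        when='midnight', interval=1, backupCount=BACKUP_COUNT, encoding='utf-8')
    handler.setFormatter(
        logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, LEVEL))
    logger.addHandler(handler)
    return logger
```

With daily rotation, backupCount=9 means that at most nine rotated files are kept next to the current one, which matches the "last 9 days" behaviour described above.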

INTERVAL: the interval between two samples. The default is 1s; it is mainly used for CPU and memory monitoring. When monitoring multiple ports or processes at the same time, set it to a smaller value.

ERROR_TIMES: the number of command execution failures. When the failure count exceeds this number, monitoring stops automatically. It is mainly used when monitoring a specified process: if the process is killed, monitoring stops automatically and must be started again manually. When monitoring a specified port, monitoring stops if the process on that port is killed, and starts again automatically once the port is back up.
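The failure counting described for ERROR_TIMES can be sketched as follows. This is only an illustration under assumptions: the function name and the max_rounds parameter are hypothetical, failures are counted cumulatively, and the pause of INTERVAL seconds between rounds is left out to keep the example short.

```python
import subprocess

ERROR_TIMES = 5  # same meaning as in config.py


def run_until_too_many_errors(command, max_rounds, error_times=ERROR_TIMES):
    """Run `command` repeatedly and stop early once failures exceed error_times.

    Returns the number of rounds that were actually executed.
    `run_until_too_many_errors` and `max_rounds` are hypothetical names.
    """
    errors = 0
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        result = subprocess.run(command, shell=True,
                                capture_output=True, text=True)
        if result.returncode != 0:
            errors += 1
            if errors > error_times:
                break  # the monitored process is probably gone
    return rounds
```

A command that always fails stops the loop after error_times + 1 rounds, while a command that always succeeds runs for all max_rounds rounds.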

IS_JVM_ALERT: only for Java applications. If Full GC is too frequent, an alert is sent by email. For general performance tests, the interval between two Full GCs should not be less than 3600 seconds.
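How the project measures Full GC frequency is not shown in the article. One simple way to express the 3600-second rule, assuming the times of the Full GCs are already available as timestamps in seconds, is sketched below; the function name is hypothetical.

```python
def fgc_too_frequent(fgc_timestamps, min_interval=3600):
    """Return True if any two consecutive Full GCs are closer than min_interval seconds.

    `fgc_timestamps` is an iterable of timestamps in seconds (an assumption).
    """
    ordered = sorted(fgc_timestamps)
    return any(later - earlier < min_interval
               for earlier, later in zip(ordered, ordered[1:]))
```

With fewer than two recorded Full GCs there is no interval to compare, so the function returns False.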

IS_MONITOR_SYSTEM: whether to monitor the total CPU utilization and remaining memory of the system

IS_MEM_ALERT: whether to remind you by email when the remaining memory of the system is too low

MIN_MEM: the minimum remaining memory the system is allowed to have, in G

ECHO: whether to release the cache when the system's remaining memory is too low; 0: no, 1: page cache, 2: dentries and inodes cache, 3: release 1 and 2
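The interplay of MIN_MEM and ECHO can be sketched like this. It assumes the remaining memory is read from the MemAvailable field of /proc/meminfo, which may differ from what the project does, and it only builds the shell command instead of running it, because writing to /proc/sys/vm/drop_caches requires root. Both function names are hypothetical.

```python
def available_gb(meminfo_text):
    """Extract MemAvailable from /proc/meminfo content and convert kB to G."""
    for line in meminfo_text.splitlines():
        if line.startswith('MemAvailable:'):
            return int(line.split()[1]) / 1024 / 1024
    raise ValueError('MemAvailable not found')


def cache_action(free_gb, min_mem=2, echo=0):
    """Return the shell command that releases caches, or None if nothing is to be done.

    min_mem and echo have the same meaning as MIN_MEM and ECHO in config.py.
    """
    if free_gb >= min_mem or echo not in (1, 2, 3):
        return None
    return f'echo {echo} > /proc/sys/vm/drop_caches'  # needs root to execute
```

On a real system the input would come from reading /proc/meminfo, for example with open('/proc/meminfo').read().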

DISK: the disk name. To monitor IO, enter the disk name here; run df -h followed by the file name to see which disk the file is mounted on.

START_TIME: record the time of each manual trigger to start monitoring

FGC_TIMES: record the time of each FullGC for troubleshooting

III. Interfaces and services

server.py

server = Flask(__name__)
permon = PerMon()

# enable multithreading
t = [threading.Thread(target=permon.write_cpu_mem, args=()),
     threading.Thread(target=permon.write_io, args=())]
for i in range(len(t)):
    t[i].start()

# start monitoring
# http://127.0.0.1:5555/runMonitor?isRun=1&type=pid&num=23121&totalTime=3600
@server.route('/runMonitor', methods=['get'])
def runMonitor():
    ...

# draw the monitoring result
# http://127.0.0.1:5555/plotMonitor?type=pid&num=23121
@server.route('/plotMonitor', methods=['get'])
def plotMonitor():
    ...

server.run(port=cfg.PORT, debug=True, host=cfg.IP)  # enable the service

You can start and stop monitoring and view the monitoring results by entering the corresponding url in the browser address bar.

URL parameters:

1. Start monitoring

http://127.0.0.1:5555/runMonitor?isRun=1&type=pid&num=23121&totalTime=3600

isRun: 1 starts monitoring; 0 stops monitoring

type and num: type=pid means that num is a process ID, and type=port means that num is a port number. Multiple ports or processes can be monitored at the same time; separate them with commas.

totalTime: the total monitoring time (in seconds). If totalTime is not passed, monitoring continues indefinitely by default.
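To make the parameter rules concrete, here is a small sketch that parses such a URL with the standard library only. It is independent of Flask and of the project's actual request handling; the function name and the keys of the returned dictionary are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_run_request(url):
    """Parse the runMonitor query string into a dictionary.

    `parse_run_request` and the dictionary keys are illustrative names.
    """
    query = parse_qs(urlsplit(url).query)
    return {
        'is_run': int(query['isRun'][0]),      # 1: start, 0: stop
        'type': query['type'][0],              # 'pid' or 'port'
        # several ports or processes are separated by commas
        'nums': [int(n) for n in query['num'][0].split(',')],
        # None means "monitor until stopped"
        'total_time': int(query['totalTime'][0]) if 'totalTime' in query else None,
    }
```

For example, a request with num=23121,23122 and no totalTime yields two process IDs and total_time set to None.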

2. View the monitoring results

http://127.0.0.1:5555/plotMonitor?type=port&num=23121&system=1&startTime=2019-08-03 08:08:08&duration=3600

type and num: type=pid means that num is a process ID, and type=port means that num is a port number

system: view the monitoring results of the system. If type and num are passed, only the results of the process are shown, whether or not system is passed. If only system is passed, without type and num, the results of the system are shown.

startTime: the start time of the monitoring results to view

duration: the length of time to view (in seconds)

If startTime and duration are not passed, all results since monitoring was last started are shown by default. To view the results for a specific period, pass both startTime and duration. The time range shown is from startTime to startTime+duration.
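The time range from startTime to startTime+duration can be computed with the standard datetime module. A minimal sketch, assuming startTime uses the format shown in the example URL; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def result_window(start_time, duration):
    """Return (start, end) for a startTime string and a duration in seconds."""
    start = datetime.strptime(start_time, '%Y-%m-%d %H:%M:%S')
    return start, start + timedelta(seconds=int(duration))


def in_window(timestamp, window):
    """Check whether a datetime falls inside the (start, end) window, inclusive."""
    start, end = window
    return start <= timestamp <= end
```

A log record is then included in the plot only if its timestamp passes in_window.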

Note: if the monitored service was restarted during the period you are viewing, its process ID will have changed. If you enter the process ID from before the restart, you will only see the results recorded for that process ID within the corresponding time period. The port number usually does not change, so it is recommended to enter the port number when viewing the monitoring results.

IV. Monitoring

performance_monitor.py

The top command is used to monitor CPU and memory, jstat to monitor JVM memory (Java applications only), iotop to monitor the disk reads and writes of a process, iostat to monitor disk IO, netstat to find the process that owns a port, and ps to check how long the service has been running. The server must therefore support all of these commands; install any that are missing.
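As an illustration of the top-based sampling, the sketch below runs top once in batch mode for a single process and reads %CPU and %MEM from the process line. It assumes the default procps column layout (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND); a customised top configuration or a different locale may change the columns. The function names are hypothetical and this is not the project's code.

```python
import subprocess


def parse_top_line(line):
    """Extract pid, %CPU and %MEM from one process line of `top -b` output."""
    fields = line.split()
    return {'pid': int(fields[0]),
            'cpu': float(fields[8]),   # %CPU column in the default layout
            'mem': float(fields[9])}   # %MEM column in the default layout


def sample_process(pid):
    """Run top once in batch mode for `pid`; return None if the process is not listed."""
    out = subprocess.run(['top', '-b', '-n', '1', '-p', str(pid)],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if line.split()[:1] == [str(pid)]:
            return parse_top_line(line)
    return None
```

Calling sample_process once every INTERVAL seconds and logging the result gives a CPU and memory time series for the process.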

Note: a process can start multiple threads. When you view the IO of the process itself, no IO is visible. IO is visible on the individual threads started by the process, but those threads change constantly. Therefore, monitoring the IO of a specified process is not supported at this time.

V. View the monitoring results

draw_performance.py

1. Draw the CPU chart, the memory and JVM chart, the IO chart and the handle count chart separately.

2. To summarize CPU and IO usage, calculate percentiles.

3. To summarize garbage collection, calculate the YGC and FGC counts of Java applications and their respective frequencies.
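The article does not say which percentile definition is used. A minimal sketch with the nearest-rank method, one common definition, is shown below; the function name is hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError('values must not be empty')
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Applied to the CPU samples of a test run, percentile(samples, 90) gives the usage that 90% of the samples do not exceed.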

The effect of the monitoring result is as follows:

VI. Extension functions

extern.py has two functions

1. Look up the process by port

try:
    result = os.popen(f'netstat -nlp | grep {port} | tr -s " "').readlines()
    res = [line.strip() for line in result if str(port) in line]
    p = res[0].split(' ')
    pp = p[3].split(':')[-1]
    if str(port) == pp:
        pid = p[-1].split('/')[0]
except Exception as err:
    logger.logger.error(err)
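The same lookup can be written as a pure function over the text printed by netstat -nlp, which makes it easy to test without a running service. This is a sketch under the assumption of the usual Linux netstat layout, with the local address in the fourth column and PID/Program name at the end; the function name is hypothetical.

```python
def pid_from_netstat(output, port):
    """Return the PID listening on `port`, or None if it cannot be determined."""
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith(('tcp', 'udp')):
            continue  # skip headers and unrelated lines
        # compare the whole port, so that 555 does not match 5555
        if fields[3].rsplit(':', 1)[-1] != str(port):
            continue
        for field in fields[5:]:
            pid, sep, _name = field.partition('/')
            if sep and pid.isdigit():
                return int(pid)
    return None
```

Comparing the full port number after the last colon avoids the partial matches that a plain grep on the port would produce.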

2. Find the logs containing the monitoring results

Overall approach:

(1) Based on the start time and end time entered, find all log files that cover this period.

(2) In the log files found, find all log entries that contain monitoring results.

(3) When drawing, traverse all the log entries found.
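Step (1) can be sketched as follows. The article does not describe how the log files are named, so this example assumes daily rotated files with a date suffix such as monitor.log.2019-08-03 plus an unsuffixed current file; the file names and the function name are hypothetical.

```python
from datetime import date, datetime


def select_log_files(file_names, start, end, today):
    """Return the log files whose day overlaps the period from start to end.

    A file without a date suffix is treated as the current file for `today`.
    """
    wanted = []
    for name in sorted(file_names):
        suffix = name.rsplit('.', 1)[-1]
        try:
            day = datetime.strptime(suffix, '%Y-%m-%d').date()
        except ValueError:
            day = today  # unrotated, currently active log file
        if start.date() <= day <= end.date():
            wanted.append(name)
    return wanted
```

Steps (2) and (3) then read only the selected files and keep the lines whose timestamps fall inside the requested period.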

Supplement

1. To make it easy to check when monitoring was last started, the time of each monitoring start is written to the startTime.txt file.

2. To help troubleshoot possible problems in Java applications, the time of each Full GC is written to the FullGC.txt file.

Project address: https://github.com/leeyoshinari/performance_monitor

Summary

This article introduced Python-based performance monitoring of designated processes on a Linux system. I hope it helps; if you have any questions, please leave a message.
