2025-04-03 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains why you would optimize performance on Linux and how to locate the code that is consuming CPU time. It is meant as a practical reference, so let's walk through it.
Why do you need performance optimization
Perhaps you want to support higher throughput, achieve lower latency, or improve resource utilization; each of these is a goal of performance optimization. One caveat, however: do not optimize prematurely. If there is no performance problem at the moment, why spend the effort? That said, coding habits that help performance are worth keeping at all times.
Goal
Comprehensive performance optimization is not an easy task. This series of articles is not intended to introduce performance optimization principles or specific algorithmic optimizations. It aims to share some techniques commonly used in practice, with a focus on the CPU.
How to find performance bottlenecks
The first step in solving a performance problem is finding it. How do you find performance problems quickly? For this article, the question is: how do you find the code that keeps the CPU busy? And why focus on that code?
For example, a task may need only one CPU time slice, but because the code is inefficient it ends up needing several. As a result, the CPU stays busy and throughput cannot improve any further.
top
I believe everyone has used this command, which shows the status of processes in real time. Its usage has been covered in many articles, so this article does not go into it. The top command shows how much CPU a process consumes, but high CPU usage does not by itself mean there is a performance problem: the CPU may be doing useful work at full speed rather than being occupied for nothing.
Quick discovery
We have all heard of the 80/20 rule; similarly, roughly 80% of performance problems are concentrated in 20% of the code. So if we can find that 20% of the code, we can effectively solve a good share of the performance problems.
This article uses the perf command. It is powerful and supports many options, but that does not matter here; this article does not cover them all.
perf may not be installed on the system. On Ubuntu it can be installed as follows:
    sudo apt install linux-tools-common
If perf then reports that it cannot be found for the running kernel, the linux-tools package matching that kernel (for example linux-tools-generic) usually needs to be installed as well.
Example
Let's go straight to the example. It is simple: it just converts the letters of a string to uppercase. Many readers may spot the performance problem at a glance, but that does not matter; the example is only meant to illustrate how to use perf.
    // toUpper.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>
    #include <time.h>
    #include <sys/time.h>

    #define MAX_LEN 1024

    // Print the elapsed time between start and end in milliseconds.
    void printCostTime(struct timeval *start, struct timeval *end)
    {
        if (NULL == start || NULL == end)
        {
            return;
        }
        long cost = (end->tv_sec - start->tv_sec) * 1000 + (end->tv_usec - start->tv_usec) / 1000;
        printf("cost time: %ld ms\n", cost);
    }

    int main(void)
    {
        srand(time(NULL));
        int min = 'a';
        int max = 'z';
        char *str = malloc(MAX_LEN);
        // exit if the allocation fails
        if (NULL == str)
        {
            printf("failed\n");
            return -1;
        }
        unsigned int i = 0;
        while (i < MAX_LEN) // fill the buffer with random lowercase letters
        {
            str[i] = (rand() % (max - min)) + min;
            i++;
        }
        str[MAX_LEN - 1] = 0;
        struct timeval start, end;
        gettimeofday(&start, NULL);
        // strlen(str) is re-evaluated on every iteration of this loop
        for (i = 0; i < strlen(str); i++)
        {
            str[i] = toupper(str[i]);
        }
        gettimeofday(&end, NULL);
        printCostTime(&start, &end);
        free(str);
        str = NULL;
        return 0;
    }

Compile it into an executable and run it:

    $ gcc -o toUpper toUpper.c
    $ ./toUpper

Note that with MAX_LEN set to 1024 the loop finishes almost instantly; for the process to stay alive long enough for top and perf to attach to it, as in the session below, a far larger MAX_LEN is presumably needed.

Checking with top at this point shows the toUpper program using 100% of a CPU:

    $ top -p `pidof toUpper`
      PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEM   TIME+ COMMAND
    24456 root  20   0  5248 2044  952 R 100.0  0.0 0:07.13 toUpper

Open another terminal and run:

    $ perf top -p `pidof toUpper`
    Samples: 1K of event 'cycles:ppp', Event count (approx.): 657599945
    Overhead  Shared Object  Symbol
      99.13%  libc-2.23.so   [.] strlen
       0.19%  [kernel]       [k] perf_event_task_tick
       0.11%  [kernel]       [k] prepare_exit_to_usermode
       0.10%  libc-2.23.so   [.] toupper
       0.09%  [kernel]       [k] rcu_check_callbacks
       0.09%  [kernel]       [k] reweight_entity
       0.09%  [kernel]       [k] task_tick_fair
       0.09%  [kernel]       [k] native_write_msr
       0.09%  [kernel]       [k] trigger_load_balance
       0.00%  [kernel]       [k] native_apic_mem_write
       0.00%  [kernel]       [k] __perf_event_enable
       0.00%  [kernel]       [k] intel_bts_enable_local

Here the pidof command returns the process ID for the given program name.

See the result? It is very clear that the strlen function accounts for 99% of the program's CPU time. Can that cost be optimized away? Clearly it can: the string length is recomputed for every character that is converted, which is unnecessary, so the call can be moved outside the loop.

We also notice many symbols that we may never have seen and whose meaning we do not know, such as reweight_entity. They are marked [kernel], though, so we know this is work done by the kernel, and that is all we need to know.

That is the real-time view. You can of course also save the data and inspect it afterwards:

    $ perf record -e cycles -p `pidof toUpper` -g -a

Let the command run for a while to collect performance and symbol information, then stop it with Ctrl+C. By default it writes perf.data to the current directory. That file is not easy to read directly, so run:

    $ perf report
    +  100.00%   0.00%  toUpper  [unknown]          [k] 0x03ee258d4c544155
    +  100.00%   0.00%  toUpper  libc-2.23.so       [.] __libc_start_main
    +   99.72%  99.34%  toUpper  libc-2.23.so       [.] strlen
         0.21%   0.02%  toUpper  [kernel.kallsyms]  [k] apic_timer_interrupt
         0.19%   0.00%  toUpper  [kernel.kallsyms]  [k] smp_apic_timer_interrupt
         0.16%   0.00%  toUpper  [kernel.kallsyms]  [k] ret_from_intr
         0.16%   0.00%  toUpper  [kernel.kallsyms]  [k] hrtimer_interrupt
         0.16%   0.00%  toUpper  [kernel.kallsyms]  [k] do_IRQ
         0.15%   0.15%  toUpper  libc-2.23.so       [.] toupper
         0.15%   0.00%  toUpper  [kernel.kallsyms]  [k] handle_irq
         0.15%   0.00%  toUpper  [kernel.kallsyms]  [k] handle_edge_irq
         0.15%   0.00%  toUpper  [kernel.kallsyms]  [k] handle_irq_event
         0.15%   0.00%  toUpper  [kernel.kallsyms]  [k] handle_irq_event_percpu
         0.14%   0.00%  toUpper  [kernel.kallsyms]  [k] __handle_irq_event_percpu
         0.14%   0.01%  toUpper  [kernel.kallsyms]  [k] __hrtimer_run_queues
         0.13%   0.00%  toUpper  [kernel.kallsyms]  [k] _rtl_pci_interrupt

The -g option records call chains, and -a collects data from all CPUs.

Now the sampling data is visible, and the hotspot is obvious. The entries marked with + can be expanded to show the call chain. For example, part of the expanded output looks like this:

    -  100.00%   0.00%  toUpper  libc-2.23.so  [.] __libc_start_main
       - __libc_start_main
            99.72% strlen

You can also redirect the report to a file to make it easier to read:

    $ perf report > result
    $ more result
    # Event count (approx.): 23881569776
    #
    # Children      Self  Command  Shared Object  Symbol
    # ........  ........  .......  .............  ......
    #
       100.00%     0.00%  toUpper  [unknown]      [k] 0x03ee258d4c544155
                |
                ---0x3ee258d4c544155
                   __libc_start_main
                   |
                    --99.72%--strlen

       100.00%     0.00%  toUpper  libc-2.23.so   [.] __libc_start_main
                |
                ---__libc_start_main
                   |
                    --99.72%--strlen

        99.72%    99.34%  toUpper  libc-2.23.so   [.] strlen
                |
                 --99.34%--0x3ee258d4c544155
© 2024 shulou.com SLNews company. All rights reserved.