2025-04-03 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/03 Report--
11.1 What is a built-in function
Built-in functions, as the name implies, are functions implemented inside the compiler. Like keywords, they can be used directly: unlike standard library functions, they do not require you to #include a corresponding header file.
The names of built-in functions usually start with __builtin. These functions are intended mainly for use by the compiler itself. Their main uses are as follows.
Handling variable-length parameter lists;
Handling program runtime exceptions;
Compilation optimization and performance optimization;
Viewing low-level information while a function runs, such as stack information;
Built-in versions of C standard library functions.
Because built-in functions are defined inside the compiler and are called mainly by compiler-related tools and programs, their documentation is limited and their behavior can change between compiler versions. Application developers are generally advised not to rely on them.
However, some of them are very helpful for understanding low-level program behavior and for compile-time optimization, and they are used often in the Linux kernel, so it is worth getting to know the built-in functions that the kernel commonly uses.
11.2 Commonly used built-in functions
__builtin_return_address(LEVEL)
This function returns the return address of the current function or of one of its callers. The parameter LEVEL selects a function at a given level of the call chain; each value has the following meaning.
0: returns the return address of the current function;
1: returns the return address of the current function's caller;
2: returns the return address of the caller's caller;
...
Next, let's write a test program.
    #include <stdio.h>

    void f(void)
    {
        void *p;   /* __builtin_return_address() returns void * */

        p = __builtin_return_address(0);
        printf("f return address:%p\n", p);
        p = __builtin_return_address(1);
        printf("func return address:%p\n", p);
        p = __builtin_return_address(2);
        printf("main return address:%p\n", p);
        printf("\n");
    }

    void func(void)
    {
        void *p;

        p = __builtin_return_address(0);
        printf("func return address:%p\n", p);
        p = __builtin_return_address(1);
        printf("main return address:%p\n", p);
        printf("\n");
        f();
    }

    int main(void)
    {
        void *p;

        p = __builtin_return_address(0);
        printf("main return address:%p\n", p);
        printf("\n");
        func();
        printf("goodbye!\n");
        return 0;
    }
During a C function call, context such as the return address and the registers of the current function is saved on the stack, and execution then jumps to the called function. When the called function finishes, the return address saved on the stack lets execution resume directly in the original function.
In this program, main() calls func(). Before jumping to func(), the program saves on the stack the address of the statement that follows the call (shown below), and only then starts executing func(). How does execution get back to main() after func() finishes? Quite simply: the return address saved on the stack is loaded into the PC, and execution continues in main() from that point.
    printf("goodbye!\n");
Each level of function call pushes the address of the next instruction in the calling function, that is, the return address, onto the stack. The successive calls form a function call chain. Inside each function, we can use the built-in function to print the return address of every function on the call chain. The running result of the program is as follows.
    main return address:0040124B

    func return address:004013C3
    main return address:0040124B

    f return address:00401385
    func return address:004013C3
    main return address:0040124B
__builtin_frame_address(LEVEL)
Function calls also involve the concept of a "stack frame". Each time a function is called, its context (return address, registers, and so on) is saved on the stack, and each level of the call chain saves this information in its own region of the stack. That region is the stack frame of the function; each stack frame has a start address and an end address that delimit the stack data belonging to the function. Nested function calls create multiple stack frames, and each stack frame stores the start address of the previous one, so the stack frames form a call chain. Many debuggers, including GDB, as well as the built-in functions discussed here, obtain low-level information about functions, such as return addresses and call relationships, by walking back along this chain of stack frames. On ARM, two registers, FP and SP, point to the start and end addresses of the current function's stack frame, respectively. When a function calls another function or returns, the values of both registers change so that they always point to the start and end of the current stack frame.
We can view the stack frame address of a function with the built-in function __builtin_frame_address(LEVEL).
0: returns the stack frame address of the current function;
1: returns the stack frame address of the current function's caller.
Write a program to print the stack frame address of the current function.
    #include <stdio.h>

    void func(void)
    {
        void *p;   /* __builtin_frame_address() returns void * */

        p = __builtin_frame_address(0);
        printf("func frame:%p\n", p);
        p = __builtin_frame_address(1);
        printf("main frame:%p\n", p);
    }

    int main(void)
    {
        void *p;

        p = __builtin_frame_address(0);
        printf("main frame:%p\n", p);
        printf("\n");
        func();
        return 0;
    }
The running result of the program is as follows.
    main frame:0028FF48

    func frame:0028FF28
    main frame:0028FF48
11.3 Built-in functions of the C standard library
The GNU C compiler also implements built-in versions of some C standard library functions. They behave like their library counterparts and have the same names, except for the added prefix __builtin_. If you do not want to use the C library function, you can add the prefix and call the corresponding built-in function directly.
Common standard library functions with built-in versions include:
Memory functions: memcpy, memset, memcmp
Mathematical functions: log, cos, abs, exp
String functions: strcat, strcmp, strcpy, strlen
Input/output functions: printf, scanf, putchar, puts
Next, we write a small program that uses the built-in versions of C standard library functions.
    int main(void)
    {
        char a[100];

        /* sizeof counts the terminating '\0', so the copy stays within the literal */
        __builtin_memcpy(a, "hello world!", sizeof("hello world!"));
        __builtin_puts(a);
        return 0;
    }
The running result of the program is as follows.
    hello world!
The output shows that the built-in versions can copy and print a string just as the corresponding C standard library functions do.
11.4 Built-in function: __builtin_constant_p(n)
The compiler also provides built-in functions that are mainly used for compilation optimization and performance optimization, such as __builtin_constant_p(n). This function determines whether the argument n is a compile-time constant: it returns 1 if so and 0 otherwise. It is often used in macro definitions for optimization purposes: a macro may be implemented differently depending on whether its argument is a constant or a variable. Macros like this are common in the kernel.
    #define _dma_cache_sync(addr, sz, dir)              \
    do {                                                \
        if (__builtin_constant_p(dir))                  \
            __inline_dma_cache_sync(addr, sz, dir);     \
        else                                            \
            __arc_dma_cache_sync(addr, sz, dir);        \
    }                                                   \
    while (0)
Many calculations or operations have a more optimized implementation when their arguments are constant. This macro provides two versions and selects between them depending on whether the argument dir is a constant.
11.5 Built-in function: __builtin_expect(exp, c)
The built-in function __builtin_expect is also commonly used for compilation optimization. It takes two parameters and returns the first one, exp. Its purpose is to tell the compiler that the value of exp is very likely to be c. The compiler can then use this hint to optimize the code for branch prediction.
The parameter c has no effect on the return value: whatever the value of c, the function returns exp.
    #include <stdio.h>

    int main(void)
    {
        int a;

        a = __builtin_expect(3, 1);
        printf("a = %d\n", a);
        a = __builtin_expect(3, 10);
        printf("a = %d\n", a);
        a = __builtin_expect(3, 100);
        printf("a = %d\n", a);
        return 0;
    }
The running result of the program is as follows.
    a = 3
    a = 3
    a = 3
The main purpose of this function is to support the compiler's branch prediction optimization. Modern CPUs contain a cache. The CPU runs very fast while external RAM is comparatively slow, so reading and writing data in RAM becomes a performance bottleneck. To improve execution efficiency, the CPU keeps certain instructions and data in its internal cache. When the CPU accesses memory, it first checks whether the data is in the cache. If it is, the CPU reads or writes the cache directly; if it is not, the cache loads the relevant part of memory. Because the CPU accesses the cache much faster than RAM, this improves system performance.
So how does the cache decide which memory contents to hold? Put simply, it relies on the principle of spatial locality. For example, while the CPU is executing an instruction, it will most likely execute the following instruction in the next instruction cycle. If the cache already holds the instructions that follow, the CPU can fetch, decode, and execute them straight from the cache in the next cycle, which greatly improves efficiency.
But things do not always go this way. When a program encounters a function call, an if branch, a goto jump, or a similar construct, it jumps to another address, and the instructions held in the cache are then not the ones the CPU needs. In this case we say that the cache has missed, and the cache reloads the correct instructions for the CPU to read. This is the basic way a cache works.
With this background, when we write a branching construct such as if/switch, we can place the most probable branch first. Because that branch is taken most of the time, the program rarely needs to jump and behaves almost like straight-line code, which improves the cache hit rate. The kernel provides related macros, such as likely and unlikely, that programmers can use to optimize their code in this way.
11.6 likely and unlikely in the kernel
In the Linux kernel, two macros are defined using the __builtin_expect built-in function.
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)
These two macros tell the compiler that a given branch is very likely to be taken, or so unlikely that it almost never happens. Based on this hint, the compiler performs some branch prediction optimizations. One detail in the definitions is that the macro parameter x is logically negated twice (!!) to convert it to a Boolean value, which is then compared with 1 or 0 to tell the compiler that x is very likely true or false.
Let's use an example to see how these two macros change the code that the compiler generates for a branch.
    // expect.c
    #include <stdio.h>

    int main(void)
    {
        int a;

        scanf("%d", &a);
        if (a == 0) {
            printf("%d", 1);
            printf("%d", 2);
            printf("\n");
        } else {
            printf("%d", 5);
            printf("%d", 6);
            printf("\n");
        }
        return 0;
    }
In this program, different branch code runs depending on the value we enter for the variable a. We then compile the program and disassemble it to obtain the corresponding assembly code.
    $ arm-linux-gnueabi-gcc expect.c
    $ arm-linux-gnueabi-objdump -D a.out
    00010558 <main>:
       10558: e92d4800    push {fp, lr}
       1055c: e28db004    add  fp, sp, #4
       10560: e24dd008    sub  sp, sp, #8
       10564: e59f308c    ldr  r3, [pc, #140]
       10568: e5933000    ldr  r3, [r3]
       1056c: e50b3008    str  r3, [fp, #-8]
       10570: e24b300c    sub  r3, fp, #12
       10574: e1a01003    mov  r1, r3
       10578: e59f007c    ldr  r0, [pc, #124]
       1057c: ebffffa5    bl   10418
       10580: e51b300c    ldr  r3, [fp, #-12]
       10584: e3530000    cmp  r3, #0
       10588: 1a000008    bne  105b0
       1058c: e3a01001    mov  r1, #1
       10590: e59f0068    ldr  r0, [pc, #104]
       10594: ebffff90    bl   103dc
       10598: e3a01002    mov  r1, #2
       1059c: e59f005c    ldr  r0, [pc, #92]
       105a0: ebffff8d    bl   103dc
       105a4: e3a0000a    mov  r0, #10
       105a8: ebffff97    bl   1040c
       105ac: ea000007    b    105d0
       105b0: e3a01005    mov  r1, #5
       105b4: e59f0044    ldr  r0, [pc, #68]
       105b8: ebffff87    bl   103dc
       105bc: e3a01006    mov  r1, #6
       105c0: e59f0038    ldr  r0, [pc, #56]
       105c4: ebffff84    bl   103dc
The disassembly of main shows that the assembly code follows the order of our if/else branches (see the jump at 10588: bne 105b0). Next, we modify the code and mark the if condition with unlikely, telling the compiler that the if branch is very unlikely to be taken.
    // expect.c
    #include <stdio.h>

    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    int main(void)
    {
        int a;

        scanf("%d", &a);
        if (unlikely(a == 0)) {
            printf("%d", 1);
            printf("%d", 2);
            printf("\n");
        } else {
            printf("%d", 5);
            printf("%d", 6);
            printf("\n");
        }
        return 0;
    }
Compile the program with the -O2 optimization option and disassemble the generated executable a.out.
    $ arm-linux-gnueabi-gcc -O2 expect.c
    $ arm-linux-gnueabi-objdump -D a.out
    00010438 <main>:
       10438: e92d4010    push {r4, lr}
       1043c: e59f4080    ldr  r4, [pc, #128]
       10440: e24dd008    sub  sp, sp, #8
       10444: e5943000    ldr  r3, [r4]
       10448: e1a0100d    mov  r1, sp
       1044c: e59f0074    ldr  r0, [pc, #116]
       10450: e58d3004    str  r3, [sp, #4]
       10454: ebfffff1    bl   10420
       10458: e59d3000    ldr  r3, [sp]
       1045c: e3530000    cmp  r3, #0
       10460: 0a000010    beq  104a8
       10464: e3a02005    mov  r2, #5
       10468: e59f105c    ldr  r1, [pc, #92]
       1046c: e3a00001    mov  r0, #1
       10470: ebffffe7    bl   10414
       10474: e3a02006    mov  r2, #6
       10478: e59f104c    ldr  r1, [pc, #76]
       1047c: e3a00001    mov  r0, #1
       10480: ebffffe3    bl   10414
       10484: e3a0000a    mov  r0, #10
       10488: ebffffde    bl   10408
       1048c: e59d2004    ldr  r2, [sp, #4]
       10490: e5943000    ldr  r3, [r4]
       10494: e3a00000    mov  r0, #0
       10498: e1520003    cmp  r2, r3
       1049c: 1a000007    bne  104c0
       104a0: e28dd008    add  sp, sp, #8
       104a4: e8bd8010    pop  {r4, pc}
       104a8: e3a02001    mov  r2, #1
       104ac: e59f1018    ldr  r1, [pc, #24]
       104b0: e1a00002    mov  r0, r2
       104b4: ebffffd6    bl   10414
       104b8: e3a02002    mov  r2, #2
       104bc: eaffffed    b    10478
We marked the if condition with unlikely to tell the compiler that this branch is rarely taken. With optimization enabled, the generated disassembly (10460: beq 104a8) shows that the compiler places the code of the low-probability if branch at the end and the code of the else branch first. As a result, most of the time the program does not need to jump and simply falls through into the high-probability branch.
In the Linux kernel you will find many conditions wrapped in the likely and unlikely macros, so it helps to know what they are for.
This tutorial is adapted from the C language embedded Linux Advanced programming Video tutorial No. 05. The electronic version of the book can be downloaded by joining the QQ group 475504428. For more embedded video tutorials, you can follow:
Official account of Wechat: Otaku tribe (armlinuxfun)
51CTO College-Mr. Wang Litao: http://edu.51cto.com/sd/d344f