What is the inline optimization method in Go

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains what inlining is in Go, why it matters, and how it behaves in practice.

What is inline?

Inlining is the act of expanding a short function at the point where it is called. In the early days of computing, programmers performed this optimization by hand. Today it is one of the basic optimizations that compilers apply automatically during compilation.

Why is inlining important?

There are two reasons. The first is that it eliminates the overhead of the function call itself. The second is that it enables the compiler to perform other optimization strategies more efficiently.

Cost of function calls

In any language, calling a function carries a cost. Parameters must be marshalled into registers or onto the stack (depending on the ABI), and the process must be reversed when the result is returned. A function call makes the program counter jump from one point in the instruction stream to another, which can cause a pipeline stall. Inside the function there is usually a preamble that prepares a new stack frame for the function to execute in, and a matching epilogue that releases the frame before returning to the caller.

In Go, a function call carries an additional cost to support dynamic stack growth. On entry to a function, the stack space available to the goroutine is compared with the amount the function needs. If the available space is insufficient, the preamble jumps into the runtime, which grows the stack by copying it to a new, larger location. When the copy is complete, the runtime jumps back to the start of the original function, the stack check is performed again and now passes, and the call continues. In this way a goroutine can start with a small stack and grow it only when necessary.

This check is cheap, only a few instructions, and because goroutine stacks grow geometrically it rarely fails. The branch prediction unit of a modern processor can therefore hide the cost of the stack check by assuming it will always succeed. When the processor mispredicts the check and has to discard its speculative work, the cost of the pipeline stall is small compared with the work the runtime must do to grow the goroutine's stack.
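As a rough illustration of stack growth, the sketch below runs a deeply recursive function inside a goroutine. The function name `depth`, the 1 KiB `pad` array, and the recursion depth are hypothetical choices made for this example. The padding is only intended to make each frame large enough that the goroutine has to outgrow its small initial stack; the exact frame size is up to the compiler.

```go
package main

import "fmt"

// depth recurses n levels deep and returns n.
// The pad array is an illustrative assumption: it is intended to make each
// frame roughly 1 KiB, so deep recursion needs far more stack than a
// goroutine starts with.
func depth(n int) int {
	var pad [1024]byte
	if n == 0 {
		return 0
	}
	pad[n%len(pad)] = 1
	// Each level contributes exactly 1 to the result.
	return depth(n-1) + int(pad[n%len(pad)])
}

func main() {
	done := make(chan int)
	// The goroutine starts with a small stack; the runtime grows it
	// transparently as the recursion deepens.
	go func() { done <- depth(10000) }()
	fmt.Println(<-done) // prints 10000
}
```

The program needs no explicit stack size anywhere: the check in each function preamble and the runtime's copy-and-grow step handle it.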

Although modern processors use speculative execution to hide much of the overhead of both the generic and the Go-specific parts of each function call, that overhead cannot be eliminated entirely, so every function call pays a performance cost on top of the useful work it does. Because the cost of the call itself is fixed, small functions pay proportionally more than large ones: they do less useful work per call.

The only way to eliminate this overhead is to eliminate the function call itself, which the Go compiler does, under certain conditions, by replacing the call with the body of the function. This process is called inlining because it expands the function body in line at the call site.

Improved optimization opportunities

Dr. Cliff Click describes inlining as the fundamental optimization performed by modern compilers, because it forms the basis for optimizations such as constant propagation and dead code elimination. In effect, inlining lets the compiler see further: it can observe the context in which a particular function is called and spot logic that can be simplified or eliminated entirely. Because inlining can be applied recursively, these optimization decisions can be made not only within each individual function but across an entire call chain.
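To make the link between inlining, constant propagation, and dead code elimination concrete, here is a minimal sketch. The names `area`, `plainArea`, and `plainAreaExpanded` are hypothetical and exist only for this example. Whether a given compiler version actually inlines `area` depends on its inlining budget; `go build -gcflags=-m` reports the decisions it makes.

```go
package main

import "fmt"

// area is a small function with a flag that selects an optional step.
func area(w, h int, doubled bool) int {
	a := w * h
	if doubled {
		a *= 2
	}
	return a
}

// plainArea always passes a constant false. Once area is inlined here,
// the flag is a known constant (constant propagation) and the
// "if doubled" branch can never run (dead code elimination).
func plainArea(w, h int) int {
	return area(w, h, false)
}

// plainAreaExpanded is the hand-simplified form that plainArea can
// reduce to after those two optimizations.
func plainAreaExpanded(w, h int) int {
	return w * h
}

func main() {
	fmt.Println(plainArea(6, 7), plainAreaExpanded(6, 7)) // prints 42 42
}
```

Without inlining, `area` has to keep the branch because it cannot know what its callers will pass; only in the caller's context does the flag become a constant.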

Inlining in practice

The following example demonstrates the impact of inlining:

package main

import "testing"

//go:noinline
func max(a, b int) int {
    if a > b {
        return a
    }
    return b
}

var Result int

func BenchmarkMax(b *testing.B) {
    var r int
    for i := 0; i < b.N; i++ {
        r = max(-1, i)
    }
    Result = r
}

Running this benchmark gives the following result:

% go test -bench=.
BenchmarkMax-4   530687617   2.24 ns/op

On my 2015 MacBook Air, max(-1, i) takes about 2.24 nanoseconds. Now remove the //go:noinline directive and look at the result again:

% go test -bench=.
BenchmarkMax-4   1000000000   0.514 ns/op

The time drops from 2.24 nanoseconds to 0.51 nanoseconds, or, as benchstat shows, an improvement of about 78%.

% benchstat {old,new}.txt
name   old time/op  new time/op  delta
Max-4  2.21ns ± 1%  0.49ns ± 6%  -77.96%  (p=0.000 n=18+19)

Where does this improvement come from? First, removing the function call and its associated preamble is the main factor. Expanding the body of max at the call site reduces the number of instructions the processor executes and eliminates several branches. Second, because the body of max is now visible to the compiler while it optimizes BenchmarkMax, it can make further improvements. After max is inlined, BenchmarkMax looks like this to the compiler:

func BenchmarkMax(b *testing.B) {
    var r int
    for i := 0; i < b.N; i++ {
        if -1 > i {
            r = -1
        } else {
            r = i
        }
    }
    Result = r
}

Run the benchmark again and compare the manually inlined version with the compiler-inlined version:

% benchstat {old,new}.txt
name   old time/op  new time/op  delta
Max-4  2.21ns ± 1%  0.48ns ± 3%  -78.14%
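The comparison can also be run as an ordinary program using testing.Benchmark from the standard library, which runs a benchmark function outside of go test. The sketch below is one way to set that up; the names `maxNoInline` and `maxInlinable` are hypothetical, chosen so that both variants can live in one file. The timings depend on the machine and the compiler version, so no particular numbers should be expected.

```go
package main

import (
	"fmt"
	"testing"
)

// maxNoInline carries the directive that forbids inlining.
//
//go:noinline
func maxNoInline(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// maxInlinable is the same function without the directive, so the
// compiler is free to inline it.
func maxInlinable(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// Result is a package-level sink so the compiler cannot discard the
// work done inside the benchmark loops.
var Result int

func main() {
	noInline := testing.Benchmark(func(b *testing.B) {
		var r int
		for i := 0; i < b.N; i++ {
			r = maxNoInline(-1, i)
		}
		Result = r
	})
	inlinable := testing.Benchmark(func(b *testing.B) {
		var r int
		for i := 0; i < b.N; i++ {
			r = maxInlinable(-1, i)
		}
		Result = r
	})
	// BenchmarkResult prints as iteration count and ns/op.
	fmt.Println("noinline: ", noInline)
	fmt.Println("inlinable:", inlinable)
}
```

For statistically meaningful comparisons, repeated go test runs summarized with benchstat, as in the article, remain the better tool; this program is only a quick check.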

Now the compiler can see the result of inlining max into BenchmarkMax, so it can apply optimizations that were not possible before. For example, the compiler notices that i starts at 0 and is only ever incremented, so any comparison with i can assume that i is never negative. The condition -1 > i is therefore never true.

Having proved that -1 > i is never true, the compiler can simplify the code to:

func BenchmarkMax(b *testing.B) {
    var r int
    for i := 0; i < b.N; i++ {
        if false {
            r = -1
        } else {
            r = i
        }
    }
    Result = r
}

And because the branch condition is now a constant, the compiler can eliminate the unreachable path, leaving:

func BenchmarkMax(b *testing.B) {
    var r int
    for i := 0; i < b.N; i++ {
        r = i
    }
    Result = r
}

Thus, through inlining and the optimizations that inlining unlocks, the compiler reduces the expression r = max(-1, i) to r = i.
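The simplification rests on one fact: for a non-negative i, max(-1, i) is always i. The small check below confirms this by brute force over a range of values. The names `maxInt` and `agreesWithIdentity` are hypothetical helpers for this sketch, not part of the original benchmark.

```go
package main

import "fmt"

// maxInt mirrors the max function from the benchmark.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// agreesWithIdentity reports whether maxInt(-1, i) == i for every i in [0, n).
func agreesWithIdentity(n int) bool {
	for i := 0; i < n; i++ {
		if maxInt(-1, i) != i {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(agreesWithIdentity(1000000)) // prints true
	// The rewrite is only valid because i is never negative in the loop:
	fmt.Println(maxInt(-1, -5)) // prints -1, not -5
}
```

The second line of output shows why the compiler needs the proof that i is non-negative: for negative arguments the two forms differ.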

Limitations of inlining

The inlining discussed in this article is called leaf inlining: expanding a function at the bottom of the call stack into its direct caller. Inlining is a recursive process: once a function has been inlined into its caller, the compiler may inline the resulting code into that caller's caller, and so on. For example, the following code:

func BenchmarkMaxMaxMax(b *testing.B) {
    var r int
    for i := 0; i < b.N; i++ {
        r = max(max(-1, i), max(0, i))
    }
    Result = r
}

This code runs as fast as the previous example, because the compiler can inline the calls repeatedly and reduce the code to the same r = i expression.
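To see how repeated inlining plays out, the sketch below expands the nested call by hand, one call at a time, and checks that the expanded form agrees with the original. The names `maxInt`, `nested`, and `nestedExpanded` are hypothetical; the expansion is written manually to mirror what the article describes, not taken from compiler output.

```go
package main

import "fmt"

// maxInt mirrors the max function from the benchmark.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// nested is the expression from BenchmarkMaxMaxMax.
func nested(i int) int {
	return maxInt(maxInt(-1, i), maxInt(0, i))
}

// nestedExpanded is the same expression with all three calls expanded
// by hand, innermost calls first.
func nestedExpanded(i int) int {
	// Expansion of maxInt(-1, i).
	var x int
	if -1 > i {
		x = -1
	} else {
		x = i
	}
	// Expansion of maxInt(0, i).
	var y int
	if 0 > i {
		y = 0
	} else {
		y = i
	}
	// Expansion of the outer maxInt(x, y).
	if x > y {
		return x
	}
	return y
}

func main() {
	for _, i := range []int{0, 1, 42} {
		// For non-negative i, all three values are equal.
		fmt.Println(nested(i), nestedExpanded(i), i)
	}
}
```

With i known to be non-negative, both inner branches collapse to i, so x and y are equal and the outer comparison collapses as well, leaving r = i.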
