Why programmers like to use a lot of if else instead of switch

2025-01-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Why do programmers use a large number of if else statements instead of switch? Many newcomers are unclear about this, so the following explains it in detail. Anyone interested can read along; I hope you get something out of it.

Preface

I happened to be reading the Dubbo source code recently and found a rather strange piece of code, which turns out to be exactly about this switch-versus-if-else question!

Let's look at the code. It belongs to ChannelEventRunnable: this Runnable is created by a Dubbo IO thread, which then throws the task into the business thread pool.

Notice that state == ChannelState.RECEIVED is pulled out into a standalone if, while every other state is judged inside the switch.
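The screenshot is not reproduced here, but the shape of that code is roughly the following (a simplified sketch for illustration, not the actual Dubbo source, which also carries the channel, message, and exception fields):

```java
// Simplified sketch of the ChannelEventRunnable dispatch pattern:
// the overwhelmingly common state is tested with a plain if first,
// and only the rare states fall through to the switch.
class ChannelDispatch {
    enum ChannelState { CONNECTED, DISCONNECTED, SENT, RECEIVED, CAUGHT }

    static String handle(ChannelState state) {
        if (state == ChannelState.RECEIVED) { // hot path, pulled out of the switch
            return "received";
        }
        switch (state) {                      // cold paths
            case CONNECTED:    return "connected";
            case DISCONNECTED: return "disconnected";
            case SENT:         return "sent";
            case CAUGHT:       return "caught";
            default:           return "unknown";
        }
    }
}
```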

I turned it over in my mind trying to work out what this was about, but, embarrassingly, I had no idea.

So began a wave of adventure!

It turned out to be CPU branch prediction.

Naturally the question went to the search engines. Generally speaking I search the major engines at the same time, and without arguing about which is better, Baidu sometimes does fine: for this query its results ranked the answer relatively high, while Google ranked it relatively low.

In general I like to search the official site first and only widen the net when that fails, so I started with site:xxx.com key.

You see, there it is. Perfect!

Let's first look at what the blog on the official website says, then analyze it in detail.

The blog on the official website of Dubbo

Modern CPUs support branch prediction and instruction pipelining, which greatly improve CPU efficiency. For a simple if jump, the CPU can predict the branch well. For a switch jump, however, the CPU can do little: a switch essentially indexes into an array of jump addresses by the value and then jumps to the address it finds.

In other words, if compiles to a simple jump instruction that the CPU can speculatively execute past using branch prediction, while switch must first look up the target address in an array-like structure based on the value and only then jump, so branch prediction cannot help.

And because, once a channel is established, its state is ChannelState.RECEIVED in more than 99.9% of cases, that state is singled out, letting the CPU's branch prediction mechanism improve the efficiency of the code.

The blog also gives benchmark code: one million states are generated at random, 99.99% of them ChannelState.RECEIVED, and the two approaches are then compared. (The name of the benchSwitch example on the official site is wrong; I only noticed while proofreading this article.)

The blog gives its own comparison results, but I still ran it locally to see for myself. JMH actually recommends against running inside an IDE, but I was lazy and ran it directly in IDEA.

It turns out the version with the independent if is indeed more efficient (note that throughput is being measured here). The blog also suggests reserving this technique for places with stringent performance requirements; in general it is not necessary.

So far we know the conclusion is correct, but we still need deeper analysis. First, what is the difference between if and switch? Then, what exactly do CPU branch prediction and instruction pipelining do, and why do they exist?

If vs switch execution efficiency

Let's start with a simple demo of if-versus-switch execution efficiency: add a version controlled entirely by if else, keep the switch and if + switch versions unchanged, and compare their throughput (RECEIVED still accounts for more than 99.9% of states).
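The original demo used JMH and is not reproduced here; the three shapes being compared look roughly like this stdlib-only sketch (names and return values are illustrative, with int 4 standing in for the hot ChannelState.RECEIVED):

```java
import java.util.Random;

// The three dispatch shapes under test: pure switch, hot-branch-if + switch,
// and pure if else. A proper throughput comparison needs JMH; this sketch
// only illustrates the code shapes and the skewed input distribution.
class BranchBench {
    static int pureSwitch(int s) {
        switch (s) {
            case 0: return 10;
            case 1: return 11;
            case 2: return 12;
            case 3: return 13;
            case 4: return 14; // the hot "RECEIVED" value
            default: return -1;
        }
    }

    static int ifPlusSwitch(int s) {
        if (s == 4) return 14; // hot branch pulled out of the switch
        switch (s) {
            case 0: return 10;
            case 1: return 11;
            case 2: return 12;
            case 3: return 13;
            default: return -1;
        }
    }

    static int pureIf(int s) {
        if (s == 4) return 14;
        else if (s == 0) return 10;
        else if (s == 1) return 11;
        else if (s == 2) return 12;
        else if (s == 3) return 13;
        else return -1;
    }

    // ~99.99% of the generated states are the hot value 4.
    static int[] hotStates(int n) {
        Random r = new Random(42);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = r.nextInt(10_000) == 0 ? r.nextInt(4) : 4;
        return a;
    }
}
```

All three compute the same result; the experiment is only about how fast they do it under a heavily skewed input.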

Execution result

Let's take a look at the result of the execution:

Wow. I ran it several times, and the pure if version beat if + switch by a wide margin. So should all that source code be changed to if else? With throughput that much higher, the original if-versus-switch comparison starts to look beside the point.

Then I changed state to be generated completely at random and ran again to see the result:

I ran it many times, and the throughput of if was still the highest; the pure-if version remained the best.

Decompile if and switch

My impression had been that switch should beat if: leaving CPU branch prediction aside, just from the bytecode point of view it looks better. So let's look at the bytecode each one generates.

Decompilation of switch

Take a look at the decompiled switch; I have excerpted the key parts.

In other words, switch generates a tableswitch: after the getstatic above pushes the value, the JVM can index directly into the table and jump to the corresponding row for execution, i.e. O(1) time complexity.

For example, if the value is 1, it jumps directly to line 64 for execution, and if it is 4, it jumps directly to line 100.

There are some small details about switch worth knowing. When the case values are discontinuous with large gaps, a lookupswitch is generated instead. According to what I have read online, it is searched by binary search (I have not verified this), giving O(log n) time complexity; it cannot be indexed directly. Judging by the generated lookupswitch, binary search seems plausible, because the entries are sorted by value.

Also, when the case values are not contiguous but the gaps are small, a tableswitch is still generated, with some values filled in. For example, with case values 1, 3, 5, 7, 9, the compiler automatically fills in 2, 4, 6, 8, all pointing at the default target.
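You can observe both forms yourself: compile something like the following and run `javap -c SwitchShapes` (the exact density threshold at which javac switches between the two instructions is an implementation detail, so treat the comments as typical behavior rather than a guarantee):

```java
// Two switches to inspect with `javap -c SwitchShapes` after compiling.
// dense() has case values 1,3,5,7,9: javac typically emits a tableswitch
// and fills the holes 2,4,6,8 with the default target. sparse() has
// values far apart, so javac emits a lookupswitch (a sorted key table).
class SwitchShapes {
    static int dense(int v) {
        switch (v) {
            case 1: return 1;
            case 3: return 3;
            case 5: return 5;
            case 7: return 7;
            case 9: return 9;
            default: return 0;
        }
    }

    static int sparse(int v) {
        switch (v) {
            case 1:       return 1;
            case 1_000:   return 2;
            case 100_000: return 3;
            default:      return 0;
        }
    }
}
```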

Decompilation of if

Let's take a look at the decompilation of if:

You can see that if loads the variable and the comparison operand for every test, while switch loads the variable once, looks up the table, and jumps straight to the right row. From that point of view switch should be more efficient than if. Of course, if the first if test passes, it does a direct goto and skips all the remaining comparisons.

So from the generated bytecode, switch ought to be more efficient than if; yet the test results say the opposite, whether the states are random or 99.99% identical.

CPU branch prediction certainly explains the hot-state case, but why if still beats switch on random input I am not sure. Perhaps the JIT does some optimization, or perhaps even on random input the gains from successful predictions outweigh the cost of failures?

Or did I have too few enum values for switch to show its strength? On random input switch should not be weaker than if, so I added 7 more enum values, 12 in total, and tested again. The results:

The gap had narrowed, so I thought switch had a chance and expanded the cases to all 26 letters (to be honest, I had to sing the alphabet song to type them out).

After expanding the number of branches, I ran another round of tests, and this time switch finally came out ahead of if.

An aside: there are other if-versus-switch comparisons online that conclude switch beats if, but their JMH code is not written correctly: they test against a constant, and the test method's result is never consumed. Who knows what the JIT turns such code into; dozens of lines may simply be optimized down to returning a fixed value.

Summarize the test results

After comparing so much, let's make a summary.

For hot branches

First, for hot branches, extracting them from the switch into an independent if, taking full advantage of CPU branch prediction, is indeed better than a pure switch; in our tests the throughput was roughly twice as high.

In the case of hot branches

Still with hot branches, throughput improves even more when everything is changed to pure if rather than if + switch: about 3.3 times that of pure switch and 1.6 times that of if + switch.

In the case of random branching

In the case of random branching, there is not much difference among the three, but pure if is still the best.

Yet from the bytecode point of view, the switch mechanism, whether O(1) or O(log n), ought to be more efficient; the test results just do not show it.

I am not sure why if beats switch when there are few cases. Perhaps with few values the cost of the table lookup exceeds its benefit. If you know, please leave a comment at the end of the article.

With many cases, switch beats if. I did not measure exactly how many cases the crossover needs; interested readers can test it themselves, but that is the trend.

CPU branch prediction

Next, let's look at how branch prediction is done and why such a thing exists. But before branch prediction we need to introduce instruction pipelining, the pipeline of modern microprocessors.

At its core a CPU fetches and executes instructions. A classic pipeline breaks this into five steps: instruction fetch (IF), instruction decode (ID), execute (EX), memory access (MEM), and write back (WB). Let's look at a picture from Wikipedia.

In reality there may be more stages, but the point is that each instruction passes through several steps, so the steps of different instructions can overlap and run in parallel to improve processing efficiency.

Instruction pipeline

So the instruction pipeline tries to keep every part of the processor busy: incoming instructions are divided into a series of sequential steps executed by different processor units, and the parts of different instructions are processed in parallel.

Just like a factory assembly line: the station that attaches an Ultraman figure's feet attaches one pair after another; it does not wait for one whole Ultraman to be assembled before starting on the next.

Of course it is not completely rigid; execution is not strictly sequential. When one instruction is stalled and a later instruction does not depend on its result, the later one can run first. This is called out-of-order execution.

Let's go back to our branch predictions.

Code, like life, is always facing choices, and we do not know which way to go until the choice is made. But in practice a given branch often goes the same way, so CPUs add a branch predictor to guess the direction and execute instructions ahead along it.

What if the prediction is wrong? Unlike in life, the CPU can throw away all the speculative results and start over. But there is a cost: the deeper the pipeline, the more work is wasted on a mistake. A mispredict costs roughly 10 to 20 clock cycles, so speculation has side effects.

To put it simply, the branch predictor guesses which instructions will be needed and pre-executes them, so that when they really are needed the results are already there, improving efficiency.

Branch prediction

Branch prediction comes in many flavors, including static prediction, dynamic prediction, random prediction and so on; Wikipedia lists sixteen kinds.

Static prediction

Let me briefly describe the three I mentioned. Static prediction is stubborn: like always answering A on an English multiple-choice test, it predicts one fixed direction no matter what. Simple and crude.

Dynamic prediction

Dynamic prediction decides the direction based on history. For example, if the last several outcomes were all true, it speculatively executes the instructions on the true path; if the most recent outcomes flip to false, it switches to speculating down the false path. This is essentially the principle of locality at work.
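One classic dynamic scheme (a textbook example, not necessarily what any particular CPU implements) is the two-bit saturating counter, which needs two consecutive surprises before it flips its prediction. A toy sketch:

```java
// Toy two-bit saturating counter, a classic dynamic-prediction scheme:
// the counter ranges 0..3; 0/1 predict "not taken", 2/3 predict "taken".
// A single mispredict nudges the counter but does not immediately flip
// a strongly held prediction, which suits branches that mostly go one way.
class TwoBitPredictor {
    private int counter = 2; // start at "weakly taken"

    boolean predict() { return counter >= 2; }

    void update(boolean taken) {
        if (taken) counter = Math.min(3, counter + 1);
        else       counter = Math.max(0, counter - 1);
    }
}
```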

Random prediction

Random prediction is what the name says: the other way to do English multiple choice. Guess at random, pick a direction, and run with it.

If you want to dig deeper: in 2018 Google Project Zero and other researchers disclosed a catastrophic security vulnerability called Spectre, which exploits CPU branch prediction to leak sensitive information. I will not expand on it here; a link is attached at the end of the article.

There is also an attack called BranchScope that likewise exploits speculative execution. Every new mechanism brings both benefits and drawbacks.

Now we know what instruction pipelining and branch prediction are, and why Dubbo made this optimization. But the article is not over: I also want to bring up a very famous Stack Overflow question. Just look at those vote counts.

Why is processing a sorted array faster than processing an unsorted array?

This question was mentioned at the beginning of this post, and obviously it is related to branch prediction. Since we have come this far, let's analyze it briefly; try answering it in your head first. We all know the answer by now; the question is whether the reasoning is clear.

It is the following code: after the array is sorted, the loop runs faster.
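The screenshot is not reproduced here; the loop in the question is essentially this (a paraphrase from memory of the Stack Overflow question, not a verbatim copy):

```java
import java.util.Random;

// Paraphrase of the loop from the famous Stack Overflow question:
// sum only the elements >= 128. With `data` sorted, the branch is
// false for one long run and then true for one long run, so the
// predictor is almost always right; unsorted, it mispredicts ~50%.
class SortedSum {
    static long sum(int[] data) {
        long sum = 0;
        for (int c = 0; c < data.length; c++) {
            if (data[c] >= 128) { // the branch the predictor must guess
                sum += data[c];
            }
        }
        return sum;
    }

    static int[] randomData(int n) {
        Random r = new Random(0);
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = r.nextInt(256);
        return data;
    }
}
```

Sorting changes only the speed, never the result: the same elements pass the >= 128 test either way.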

Then all kinds of experts chimed in. Let's see what the top-voted answer said.

He gets straight to the point:

You are a victim of branch prediction fail.

Then comes the famous picture; you can tell he is an old hand.

He says: go back to the 19th century, before long-distance communication and before radio was widespread. If you are the switch operator at a railway junction, how do you know which way to set the track when a train is coming?

Stopping and restarting a train is very costly. If it stops at every fork so you can ask the driver where he is going, then set the switch, then restart, an enormous amount of time is wasted. So what do you do? Guess!

If you guess right, the train never stops and keeps going. If you guess wrong, it stops, backs up, and you switch the track.

So it all comes down to guessing well! Guess right often enough and the bicycle turns into a motorcycle.

The answerer then points at the assembly generated for the key line: a jump instruction. That jump is the railway fork; a path has to be chosen.

I will not re-derive the rest. We all know that in the sorted array, once the values pass 128 they stay above 128, so the branch prediction is right almost every time and execution is very efficient.

The unsorted array is in random order, so the prediction is wrong much of the time, and each mispredict flushes the instruction pipeline and starts over. Of course it is slow.

Hence his verdict: you are a victim of branch prediction failure.

The fix the answerer gives is: drop the if entirely. If we can't beat the branch, we avoid it. He implements the same logic purely with bit operations; I won't analyze it step by step, but here is his suggested modification.
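As I recall it from that answer (treat this as a sketch rather than a verbatim copy), the trick replaces the branch with arithmetic on the sign bit:

```java
// Branchless version of the same loop, per the Stack Overflow answer:
// (data[c] - 128) >> 31 is all ones when data[c] < 128 and zero
// otherwise (arithmetic shift of the sign bit), so ~t & data[c]
// keeps the value only when it is >= 128. No branch, so nothing
// for the predictor to get wrong.
class BranchlessSum {
    static long sum(int[] data) {
        long sum = 0;
        for (int c = 0; c < data.length; c++) {
            int t = (data[c] - 128) >> 31; // -1 if data[c] < 128, else 0
            sum += ~t & data[c];
        }
        return sum;
    }
}
```

Because the result no longer depends on a guessed branch, sorted and unsorted inputs run at the same speed.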

To summarize: switch looks superior to if at the bytecode level, but in the test results it only shows an advantage when there are many branches; in general it still cannot beat if.

We also saw what the instruction pipeline is, by analogy with a real assembly line, and that branch prediction's speculative execution is another way to raise efficiency, provided the guesses are right; the side effects of mispredicts are not negligible, so the demands on branch predictors are very high.

Was the content above helpful to you? If you want to learn more about this topic or read more related articles, please follow the industry information channel. Thank you for your support.
