This article introduces why Java processes a sorted array faster than an unsorted one. The content is detailed but easy to understand, the example is simple to reproduce, and it has some reference value; I believe you will gain something after reading it, so let's take a look.
First, let's look at the problem. Below is a very simple piece of code: it randomly generates some numbers, sums the elements greater than or equal to 128, and records and prints how long the summation takes.
import java.util.Arrays;
import java.util.Random;

public class Main
{
    public static void main(String[] args)
    {
        // Generate data
        int arraySize = 32768;
        int data[] = new int[arraySize];

        Random rnd = new Random(0);
        for (int c = 0; c < arraySize; ++c)
            data[c] = rnd.nextInt() % 256;

        // !!! With this, the next loop runs faster
        Arrays.sort(data);

        // Test
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 100000; ++i)
        {
            // Primary loop
            for (int c = 0; c < arraySize; ++c)
            {
                if (data[c] >= 128)
                    sum += data[c];
            }
        }

        System.out.println((System.nanoTime() - start) / 1000000000.0);
        System.out.println("sum = " + sum);
    }
}
My results: I ran the test both with the array sorted and with it left unsorted, and the unsorted version took on average about 10 ms longer per run than the version that sorts first. This is not a coincidence but an inevitable result.
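To see both cases side by side, here is a minimal sketch, not from the original article (the class name CompareSortedUnsorted and its structure are my own), that times the same inner loop once on a sorted copy and once on an unsorted copy of the data. Timing with System.nanoTime() in a plain main method is only a rough measurement, so a warm-up pass is included to let the JIT compile the loop before the measured runs.

import java.util.Arrays;
import java.util.Random;

public class CompareSortedUnsorted
{
    // Same work as the inner loops of the program above.
    static long run(int[] data)
    {
        long sum = 0;
        for (int i = 0; i < 100000; ++i)
            for (int c = 0; c < data.length; ++c)
                if (data[c] >= 128)
                    sum += data[c];
        return sum;
    }

    public static void main(String[] args)
    {
        int arraySize = 32768;
        int[] unsorted = new int[arraySize];
        Random rnd = new Random(0);
        for (int c = 0; c < arraySize; ++c)
            unsorted[c] = rnd.nextInt() % 256;

        int[] sorted = unsorted.clone();
        Arrays.sort(sorted);

        // Warm-up so the JIT has compiled run() before we measure.
        run(sorted);
        run(unsorted);

        long t0 = System.nanoTime();
        long s1 = run(sorted);
        long t1 = System.nanoTime();
        long s2 = run(unsorted);
        long t2 = System.nanoTime();

        System.out.println("sorted:   " + (t1 - t0) / 1e9 + " s, sum = " + s1);
        System.out.println("unsorted: " + (t2 - t1) / 1e9 + " s, sum = " + s2);
    }
}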
The problem lies in the if statement. The cause was already touched on in an earlier article on how sequential, conditional, and loop statements are executed at the machine level, but that article gave no concrete example.
To understand this problem, we need some familiarity with the instruction pipeline. A processor executes one instruction after another, and executing a single instruction goes through six stages: fetch, decode, execute, memory access, write back, and update of the program counter (different architectures divide the stages differently).
The hardware used in these six stages is largely distinct. If one instruction had to finish completely before the next one started, much of that hardware would sit idle most of the time. To make the processor faster, the hardware must be kept busy, and that is where pipelining comes in.
Pipelining overlaps instructions so that several are processed in parallel. Take a three-stage pipeline as an example, where each instruction goes through stages A, B, and C: while the first instruction is in stage B, the hardware for stage A is idle, so stage A of the second instruction can already begin.
Obviously, this design greatly improves instruction throughput. But you may spot a problem: what if the processor does not know what the next instruction is? Then the work done ahead of time is wasted and the pipeline stalls. Yes, that is exactly what lies behind the opening question.
Three kinds of situations cause trouble for the pipeline (a sketch of the first one follows this list):
1. Data hazards: a later instruction needs the result produced by an earlier one.
2. Control hazards: with a jump, the processor only learns at a later stage where (or whether) to jump, so instructions that have already been fetched after the jump may have to be flushed.
3. Structural hazards: some instructions need many clock cycles (floating-point operations, for example) and occupy a hardware unit for a long time, so later instructions cannot enter that stage; in effect they are competing for the same hardware.
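The first case, a data hazard, can be felt even at the Java level. The sketch below is illustrative only and not from the article (the class and method names are made up): it sums an array once through a single accumulator, where every addition depends on the previous result, and once through two independent accumulators, which gives the hardware two shorter dependency chains it can overlap. Whether the second version is actually faster depends on the JIT and the CPU, so treat it as an illustration of the idea rather than a guaranteed optimization.

public class DataDependencyDemo
{
    // Every addition must wait for the previous one: one long dependency chain.
    static long dependentSum(int[] data)
    {
        long sum = 0;
        for (int c = 0; c < data.length; ++c)
            sum += data[c];
        return sum;
    }

    // Two independent accumulators form two shorter chains the hardware can overlap.
    static long splitSum(int[] data)
    {
        long sum0 = 0, sum1 = 0;
        for (int c = 0; c + 1 < data.length; c += 2)
        {
            sum0 += data[c];
            sum1 += data[c + 1];
        }
        // Pick up the last element when the length is odd.
        if ((data.length & 1) == 1)
            sum0 += data[data.length - 1];
        return sum0 + sum1;
    }

    public static void main(String[] args)
    {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 3);
        System.out.println(dependentSum(data)); // 3000000
        System.out.println(splitSum(data));     // 3000000
    }
}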
Back to the code: the if (data[c] >= 128) test compiles down to a conditional jump, and the processor does not know in advance which way the jump will go. Should it simply wait until it knows before fetching the next instruction? No. The processor chooses to predict which branch will be taken (it pretends to know), and if the guess turns out wrong, a little time is wasted undoing the work and starting again.
If the array is not sorted, the elements are in random order, so the result of each data[c] >= 128 test is also random. Past outcomes offer no guidance, so each prediction still has roughly a 50% chance of being wrong. Every wrong guess costs time to correct, so more time is wasted overall.
For a sorted array, the predictor also guesses wrong a few times at first, but it quickly finds a pattern: the branch goes the same way every time, so it is almost always right, at least until the traversal reaches the boundary around 128. Near that boundary it starts missing again, but after a few guesses it discovers the new pattern and from then on always takes the other branch correctly.
It can still guess wrong, but the miss rate on a sorted array is far lower than on an unsorted one. The end result is that the sorted array is processed faster, because the unsorted case triggers a large number of control hazards in the pipeline. One common way to sidestep the problem entirely is to remove the branch, as in the sketch below; after that, an everyday analogy to deepen the intuition.
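The following sketch is not from the article; it rewrites the condition with a well-known bit trick. The arithmetic right shift of (data[c] - 128) produces a mask of all zeros or all ones, so data[c] is either added as-is or masked to zero without any conditional jump. It works here because data[c] stays in a small range, so data[c] - 128 cannot overflow.

import java.util.Random;

public class BranchlessSum
{
    public static void main(String[] args)
    {
        int arraySize = 32768;
        int[] data = new int[arraySize];
        Random rnd = new Random(0);
        for (int c = 0; c < arraySize; ++c)
            data[c] = rnd.nextInt() % 256;   // deliberately NOT sorted

        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < 100000; ++i)
        {
            for (int c = 0; c < arraySize; ++c)
            {
                // (data[c] - 128) >> 31 is 0 when data[c] >= 128 and -1 otherwise,
                // so ~t & data[c] is data[c] or 0, with no conditional jump.
                int t = (data[c] - 128) >> 31;
                sum += ~t & data[c];
            }
        }
        System.out.println((System.nanoTime() - start) / 1000000000.0);
        System.out.println("sum = " + sum);
    }
}

With the branch gone, the running time should no longer depend on whether the array is sorted, since there is nothing left for the branch predictor to get wrong.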
The girl you have loved for years suddenly tells you she likes you too. You are so excited you cannot sleep for three days and nights, and you decide to drive to her city to be with her. But the road is full of forks, and all you have is a navigation app. As an experienced driver desperate to see her as soon as possible, you drive so fast that you do not care about tickets.
The navigation can no longer keep up with your speed, so to get there as soon as possible you simply pick a direction at random at every fork. Unfortunately your choice is not always right (assume you are allowed to turn back on this highway), and whenever you go the wrong way you have to drive back to the fork. This corresponds to the unsorted case.
Now suppose the forks follow a rule: at first every fork goes to one side, and from a certain point on every fork goes to the other side. You only need a little time to work out which side to take at the start and where the direction flips. Compared with guessing at random, you save a great deal of time and see her much sooner. This corresponds to the sorted case.
That concludes this article on why sorted arrays are processed faster than unsorted arrays in Java. Thank you for reading! I believe you now have some understanding of the reason; if you want to learn more, you are welcome to follow the industry information channel.