What is the principle of hash method of HashMap in Java? 07/03 Update SLTechnology News&Howtos

What is the principle of hash method of HashMap in Java?

2025-07-03 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Shulou(Shulou.com)06/02 Report--

This article mainly explains "what is the principle of hash method of HashMap in Java". Interested friends may wish to have a look. The method introduced in this paper is simple, fast and practical. Next, let the editor take you to learn "what is the principle of HashMap's hash method in Java"?

Take a look at the source code of the hash method (HashMap in JDK 8):

Static final int hash (Object key) {int h; return (key = = null)? 0: (h = key.hashCode ()) ^ (h > 16);}

What on earth is this code for?

As we all know, key.hashCode () is used to get the hash value of the key, which in theory is an int type, ranging from-2147483648 to 2147483648. For a mapping space that adds up to about 4 billion, hash collisions generally do not occur as long as the hash values are mapped uniformly and loosely.

But the problem is that there is no room for memory in an array of 4 billion lengths. Before HashMap expansion, the initial size of the array is only 16, so this hash value cannot be used directly. Use the previous modular operation with the length of the array and use the remainder to access the array subscript.

There are two aspects of modular operation.

The concepts of modular operation ("Modulo Operation") and remainder operation ("Remainder Operation") overlap but are not completely consistent. The main difference is that the operation is different when dividing negative integers. Module is mainly used in computer terminology, while remainder is more of a mathematical concept.

One is when you put in HashMap (in the putVal method):

Final V putVal (int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {HashMap.Node [] tab; HashMap.Node p; int n, I; if ((tab = table) = = null | | (n = tab.length) = = 0) n = (tab = resize ()) .length; if ((p = tab [I = (n-1) & hash]) = = null) tab [I] = newNode (hash, key, value, null);}

One is when get from HashMap (in the getNode method):

Final Node getNode (int hash, Object key) {Node [] tab; Node first, e; int n; K k; if ((tab = table)! = null & & (n = tab.length) > 0 & & (first = tab [(n-1) & hash])! = null) {}}

Among them, (n-1) & hash is a modular operation, which makes an "and" operation on the sum of hash values (array length-1).

Maybe you are wondering: shouldn't we use% in modular operation? Why use &?

This is because the & operation is more efficient than%, and when b is to the n power of 2, there is such a formula.

A% b = a & (bMui 1)

Replacing b with 2n 2 ^ n 2n is:

Let's verify that if a = 14 ^ 3 and b = 8, that is, 23 2 ^ 3 23 ~ 2 ~ 3.

The binary of 14% 8prime14 is the binary of 1110recorder 8 and the binary of 0111pint 8-1 = 7 is 0111pyrrine 011110110, that is, 0* 20 ^ 020mm 1 * 21 2 ^ 1 21mm 1 * 22 ^ 2 2220 * 23 2 ^ 3 236 04m 6 14% 8 is also exactly equal to 6.

This also explains why the length of the HashMap array is taken to the power of 2.

Because (array length-1) is exactly equivalent to a "low bit mask"-the low bits of this mask had better be all 1, so that the operation makes sense, otherwise the result must be 0, then the operation is meaningless.

If the corresponding bit in an and b is 1 at the same time, the result bit is 1, otherwise it is 0.

The whole power of 2 happens to be even, the even-1 is odd, and the last bit of odd is 1, which ensures that the last bit of hash & (length-1) may be 0 or 1 (depending on the value of h), that is, the result after & operation may be even or odd, so the uniformity of hash value can be guaranteed.

& the result of the operation is to return all the high bits of the hash to zero, leaving only the low values, which are used for array subscript access.

Suppose a hash value is 10100101 11000100 00100101 and we use it to do modular operations. Let's take a look at the results. The initial length of the HashMap is 16 (the internal is an array), 16-1 to 15, and the binary is 00000000 00000000 00001111 (the high bit is filled with 0):

10100101 11000100 00100101

& 00000000 00000000 00001111

00000000 00000000 00000101

Because the high order of 15 is all 0, the high bit after the operation must be 0, leaving only 4 low bits 0101, which is 5 in decimal system, that is, the key with a hash of 10100101 11000100 00100101 is placed in the fifth bit of the array.

After understanding the modular operation, let's look at the source code of the put method:

Public V put (K key, V value) {return putVal (hash (key), key, value, false, true);}

And the source code of the get method:

Public V get (Object key) {HashMap.Node e; return (e = getNode (hash (key), key)) = = null? Null: e.value;}

They all call the hash method before calling putVal and getNode:

Static final int hash (Object key) {int h; return (key = = null)? 0: (h = key.hashCode ()) ^ (h > 16);}

So why call the hash method before modular operation?

Look at the picture below.

A hash value of 11111111 11111111 11110000 1110 1010, move it 16 bits to the right (h > 16), exactly 0000000000000000000011111111111111111111111111), and the result is 1111111111111100001100010101

XOR (^) operation is based on binary bit operation and is represented by the symbol XOR or ^. The operation rule is: if the same value is 0, the different value is 1.

Due to the mixture of the high and low bits of the original hash, the randomness of the low bits is increased (mixed with some high-order features, and the high-order information is retained).

The result is calculated with the array length-1 (00000000 00000000 00000000 00001111), and the subscript is 00000000000000000000000101, which is 5.

Remember when we assumed that a hash was worth 10100101 11000100 00100101? Before calling the hash method, the result of the modular operation with 15 is also 5. We might as well take a look at the result of the modular operation after calling hash.

A hash value of 00000000 10100101 11000100 00100101 (complete 32 bits), move it 16 bits to the right (h > 16), exactly 0000000000000000000000010100101, and then do XOR operation (h ^ (h > 16)), the result is 0000000010100101 00111011 10000000

The result is calculated with the array length-1 (00000000 00000000 00000000 00001111), and the subscript is 00000000000000000000000000, that is, 0.

To sum up, the hash method is used to optimize the hash value, moving the hash value to the right by 16 bits, which is exactly half of its own length, and then doing an XOR operation with the original hash value, which mixes the high and low bits of the original hash value and increases the randomness.

To put it bluntly, the hash method is to increase randomness, make data elements more evenly distributed, and reduce collisions.

At this point, I believe you have a deeper understanding of "what is the principle of the hash method of HashMap in Java". You might as well do it in practice. Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.