Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to use JAVA source code to parse hashcode method

2025-04-01 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/02 Report--

In this issue, the editor will bring you about how to use JAVA source code to analyze hashcode methods. The article is rich in content and analyzes and describes for you from a professional point of view. I hope you can get something after reading this article.

In the process of development, we may often come into contact with the method of hashcode to generate hash codes, so how is the underlying implementation? What should I pay attention to when using it?

The underlying implementation of the hashcode () method

Hashcode () is the method of Object:

@ HotSpotIntrinsicCandidatepublic native int hashCode ()

It is a native method and decorated with the @ HotSpotIntrinsicCandidate annotation, proving that it has an efficient implementation in HotSpot based on CPU instructions.

Refer to the source code synchronizer.cpp for specific implementation:

Static inline intptr_t get_next_hash (Thread* self, oop obj) {intptr_t value = 0; if (hashCode = = 0) {value = os::random ();} else if (hashCode = = 1) {intptr_t addr_bits = cast_from_oop (obj) > > 3; value = addr_bits ^ (addr_bits > > 5) ^ GVars.stw_random;} else if (hashCode = 2) {value = 1 } else if (hashCode = = 3) {value = + + GVars.hc_sequence;} else if (hashCode = = 4) {value = cast_from_oop (obj);} else {unsigned t = self- > _ hashStateX; t ^ = (t _ hashStateX = self- > _ hashStateY; self- > _ hashStateY = self- > _ hashStateZ; self- > _ hashStateZ = self- > _ hashStateW; unsigned v = self- > _ hashStateW V = (v ^ (v > > 19)) ^ (t > 8); self- > _ hashStateW = v; value = v;} value & = markWord::hash_mask; if (value = = 0) value = 0xBAD; assert (value! = markWord::no_hash, "invariant"); return value;}

You can see that according to the value of the global variable hashcode, you can decide which strategy to use to generate the hash value, and check globals.hpp to see which kind of variable it is:

Experimental (intx, hashCode, 5, "(Unstable) select hashCode generation algorithm")

It is found to be a JVM variable of experimental, so if you want to modify it, you must add additional parameters, as shown below:

-XX:+UnlockExperimentalVMOptions-XX:hashCode=2

Also, the hashCode defaults to 5.

Is the hash value recalculated every time the hashcode () method is called?

For classes that do not override the hashcode () method, each time the instance calls the hashcode () method, only the hash value is calculated for the first time, and then the hash value is stored in the tag word (MarkWord) of the object header.

If you enter a variety of lock states, it will be cached elsewhere, usually stored in the thread that acquired the lock, and restoring no lock (that is, releasing the lock) will change back to the original hash value.

About the object header structure, and object storage structure, if you are interested, you can refer to: Java GC detailed explanation-1. Understand the structure of Java objects

-XX:hashCode=0 uses the Park-Miller pseudorandom number generator to generate the hash value if (hashCode= = 0) {value = os::random ();}

Call the random method of os to generate a random number. This method is implemented as follows: os.cpp:

/ / initial seed. Default is 1volatile unsigned int os::_rand_seed = 1. Static int random_helper (unsigned int rand_seed) {/ * standard, well-known linear congruential random generator with * next_rand = (16807*seed) mod (2 minutes 31-1) * see * (1) "Random Number Generators: Good Ones Are Hard to Find", * S.K. Park and K.W. Miller, Communications of the ACM 31:10 (Oct 1988), * (2) "Two Fast Implementations of the 'Minimal Standard' Random * Number Generator" David G. Carta, Comm. ACM 33, 1 (Jan 1990), pp. 87-88. * / const unsigned int a = 16807; const unsigned int m = 2147483647; const int q = m / a; assert (Q = 127773, "weird math"); const int r = m% a; assert (r = = 2836, "weird math"); / / compute az= 2 ^ 31p + q unsigned int lo = a * (rand_seed & 0xFFFF); unsigned int hi = a * (rand_seed > 16); lo + = (hi & 0x7FFF) m) {lo & = m; + lo } lo + = hi > > 15; / if (p / Q) overflowed, ignore the overflow and increment (p / Q) if (lo > m) {lo & = m; + + lo;} return lo;} int os::random () {/ / Make updating the random seed thread safe. While (true) {unsigned int seed = _ rand_seed; unsigned int rand = random_helper (seed); / / CAS update if (Atomic::cmpxchg (& _ rand_seed, seed, rand) = = seed) {return static_cast (rand);}

Where random_helper is the realization of the generating formula of random numbers, the formula is: here, aq16807, cx0, m = 2 ^ 31-1

Because these random numbers are all using the same generator, CAS updates the same seed, there may be performance problems if there are a large number of generated new objects and all call the hashcode () method. There is no problem with repeatedly calling the hashcode () method of the same object, because it was mentioned earlier that there is a cache.

-XX:hashCode=1 or 4 based on object pointer OOPs

The OOPs (Ordinary Object Pointers) object pointer is part of the object header. About the object header structure, and object storage structure, if you are interested, you can refer to: Java GC detailed explanation-1. Understand the structure of Java objects. It can be simply understood as a description of the address of the object in memory.

Else if (hashCode = = 1) {/ / This variation has the property of being stable (idempotent) / / between STW operations. This can be useful in some of the 1-0 / / synchronization schemes. Intptr_t addr_bits = cast_from_oop (obj) > > 3; value = addr_bits ^ (addr_bits > > 5) ^ GVars.stw_random;} else if (hashCode = = 4) {value = cast_from_oop (obj);}

Cast_from_oop is very simple, is to obtain the oop implementation base class oopDesc point address (oopDesc describes the basic composition of OOP, interested can refer to: Java GC detailed explanation-1. Understand the structure of Java objects):

Template inline T cast_from_oop (oop) {return (T) (CHECK_UNHANDLED_OOPS_ONLY ((oopDesc*)) o);}

When-XX:hashCode=4, directly use the address of oop as the hash value. -XX:hashCode=1 is transformed, and the Stop The World (STW) stw_random changes every time it occurs. Through this addr_bits ^ (addr_bits > > 5) ^ GVars.stw_random transformation, the hash collision is reduced and the hash value is more hashed.

To learn more about Stop the world, please refer to: JVM related-SafePoint and Stop The World complete solution (based on OpenJDK version 11)

-XX:hashCode=2 sensitivity test, constant as 1else if (hashCode= = 2) {value = 1; / / for sensitivity testing}

It is mainly used to test whether some collections are sensitive to hash values.

-XX:hashCode=3 self-increasing sequence else if (hashCode= = 3) {value = + + GVars.hc_sequence;} struct SharedGlobals {/ / omitted DEFINE_PAD_MINUS_SIZE (1, DEFAULT_CACHE_LINE_SIZE, sizeof (volatile int) * 2); / / Hot RW variable-- Sequester to avoid false-sharing volatile int hc_sequence; DEFINE_PAD_MINUS_SIZE (2, DEFAULT_CACHE_LINE_SIZE, sizeof (volatile int));}; static SharedGlobals GVars

Every time a new object is created, the hash value is called, which is a self-increment of + 1. As you can see, the hashing is very poor and it is easy to hash collisions.

-XX:hashCode=5 implements else {/ / Marsaglia's xor-shift scheme with thread-specific state / / This is probably the best overall implementation-- we'll / / likely make this the default in future releases by default. Unsigned t = self- > _ hashStateX; t ^ = (t _ hashStateX = self- > _ hashStateY; self- > _ hashStateY = self- > _ hashStateZ; self- > _ hashStateZ = self- > _ hashStateW; unsigned v = self- > _ hashStateW; v = (v ^ (v > > 19)) ^ (t > > 8); self- > _ hashStateW = v; value = v;}

The algorithm adopted is Marsaglia's xor-shift random number generation method. It is mainly a fast and good hashing algorithm proposed in this paper.

Special hash values cause problems in some scenarios

We often use the hash value of an object or a field, get the subscript by taking the module for the length of an array, and take out the object corresponding to the subscript of the array for further processing. This is common in load balancing, task scheduling, and thread allocation. Is there something wrong with the following code?

/ / gets the absolute value of the hash value of the string userId int index = Math.abs (userId.hashCode ()); / / returns the subscript object return userAvatarList.get (index% userAvatarList.size ()) .getUrl () after the hash value is modularized.

Usually, in most cases, there is no problem, but assume that userId is a string with a hash value of Integer.MIN_VALUE:

System.out.println ("polygenelubricants" .hashCode ()); System.out.println ("GydZG_" .hashCode ()); System.out.println ("DESIGNING WORKHOUSES" .hashCode ())

Output:

-2147483648-2147483648-2147483648

For these values, if you take absolute values with Math.abs (), we know that Math.abs (Integer.MIN_VALUE) is still equal to Integer.MIN_VALUE because of the underlying implementation:

Public static int abs (int a) {return (a < 0)?-a: a;}

-Integer.MIN_VALUE and Integer.MIN_VALUE are equal. The Integer.MIN_VALUE module is still negative, so that when the object corresponding to the subscript is taken, an exception will be reported.

Therefore, you need to change it to:

Int index = Math.abs (userId.hashCode ()% userAvatarList.size ()); return userAvatarList.get (index). GetUrl (); this is how to parse hashcode with JAVA source code. If you happen to have similar doubts, you might as well refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report