Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

A method step to improve biased lock performance by four times by overriding the hashCode () method

2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/03 Report--

This article mainly introduces "the method step of improving the biased lock performance by 4 times by rewriting the hashCode () method". In the daily operation, I believe that many people have doubts about the method and steps to improve the lock performance by 4 times by rewriting the hashCode () method. The editor consulted all kinds of materials and sorted out a simple and useful operation method. I hope it will be helpful to answer your doubts about how to improve the performance of biased locks by four times by rewriting the hashCode () method! Next, please follow the editor to study!

1 Mini riddle

In my work last week, I submitted minor changes to a class, implementing the toString () method to make the log easier to understand. To my surprise, this change resulted in a 5% reduction in unit test coverage for classes. I know that all the new code is covered by existing tests. So what's wrong? When comparing coverage reports, a sharp-eyed colleague noticed that hashCode () was tested before the change, but not after the change. This makes sense: the default toString () method calls the hashCode () method.

Public String toString () {return getClass () .getName () + "@" + Integer.toHexString (hashCode ());}

After overriding toString (), custom hashCode () is no longer called. I missed a test.

Everyone knows the toString () method, but...

2 how is the default hashCode () method implemented?

The value returned by the default hashCode () method is called the identity hash code (identity hash code). From now on, I'll use this term to distinguish it from overriding the hash code returned by the hashCode () method. Note: even if the class overrides hashCode (), you can still get the identification hash code of the object o through System.identityHashCode (o).

Using an integer representation of the memory address as an identity hash code is common sense and is implied by the J2SE document:

…… This is usually achieved by converting the internal address of an object to an integer, but this implementation technique is not required by the Java programming language.

However, there seems to be a problem because the method convention requires:

During the execution of a Java application, when the hashCode () method is called multiple times on the same object, the hashCode () method must return the same value, regardless of the timing of the call.

Consider that JVM relocates objects (for example, in the GC cycle caused by promotion or compression). After calculating the identity hash code of the object, we must be able to get this value back in some way, even if the object relocation occurs.

One possibility is to get the current memory location of the object the first time hashCode () is called, and then save it with the object, such as to the object header. In this way, even if the object is moved to a different location, it still has the original identification hash code. A hidden danger of this approach is that it cannot prevent two different objects from having the same identity hash code. But the Java specification allows this to happen.

The best way to confirm is to look at the source code. Unfortunately, the default java.lang.Object::hashCode () is a local method. Listing 2: Object::hashCode is the local method

Public native int hashCode (); 3 the real hashCode () please come out

It is important to note that the implementation of the identity hash code depends on JVM. Because I only talk about OpenJDK source code, I always refer to the specific implementation of OpenJDK when I mention JVM. The code link points to the Hotspot subdirectory of the code repository. I think most of this code also applies to Oracle JVM, although it may (in fact) be different in individual places (more on this later).

OpenJDK defines the hashCode () entry point, in the source code src/share/vm/prims/jvm.h and src/share/vm/prims/jvm.cpp:

508 JVM_ENTRY (jint, JVM_IHashCode (JNIEnv* env, jobject handle)) 509 JVMWrapper ("JVM_IHashCode"); 510 / / as implemented in the classic virtual machine; return 0 if object is NULL511 return handle = = NULL? 0: ObjectSynchronizer::FastHashCode (THREAD, JNIHandles::resolve_non_null (handle)); 512 JVM_END

Identity_hash_value_for also calls ObjectSynchronizer::FastHashCode (), which is called by some other place, such as System.identityHashCode ().

708 intptr_t ObjectSynchronizer::identity_hash_value_for (Handle obj) {709 return FastHashCode (Thread::current (), obj ()); 710}

One might simply think that what ObjectSynchronizer::FastHashCode () does is similar:

If (obj.hash () = = 0) {obj.set_hash (generate_new_hash ());} return obj.hash ()

But it is actually a function that contains hundreds of lines of code and looks much more complex. However, we can find some "if-not-exists-generate if not" code, such as:

685 mark = monitor- > header ();... 687 hash = mark- > hash (); 688 if (hash = = 0) {689 hash = get_next_hash (Self, obj);... 701}. 703 return hash

This seems to confirm our hypothesis. Now let's ignore the monitor for a while, as long as we know that it can provide the object header. The object header is saved in the variable mark. Mark is a pointer to an instance of markOop, and markOop represents a mark word at the low address in the header of the object. So the algorithm for hashCode () is to try to get the hash code recorded in the tag word. If not, generate one with get_next_hash (), save it, and return it.

4 identify the generation of hash codes

As we can see, the hash code is generated by get_next_hash (). This function provides six calculation methods and chooses which one to use based on the global configuration of hashCode.

Use random numbers.

Calculated based on the memory address of the object.

Hard-coded as 1 (for testing).

Generated from a sequence.

Converts to the int type using the memory address of the object.

Use thread state in combination with xorshift.

Which is the default method? OpenJDK 8 uses method 5, based on global.hpp:

1127 product (intx, hashCode, 5,\ 1128 "(Unstable) select hashCode generation algorithm")\

OpenJDK 9 uses the same default values. Looking at previous versions, both OpenJDK 7 and 6 used the first method: random numbers.

So, unless I find the wrong source code, the implementation of the default hashCode () method in OpenJDK has nothing to do with the object memory address, at least since OpenJDK 6.

5 object head and synchronization

Let's review a few areas that we haven't considered before. First of all, ObjectSynchronizer::FastHashCode () seems too complex, using more than 100 lines of code to perform what we consider to be trivial "get or generate" (get-or-generate) operations. Second, what is the tube process and why does it have an object head?

Looking at the structure of markup words is a good starting point for progress. In OpenJDK, it looks like this:

30 / / The markOop describes the header of an object.31 / 32 / / Note that the mark is not a real oop but just a word.33 / / It is placed in the oop hierarchy for historical reasons.34 / / 35 / / Bit-format of an object header (most significant first Big endian layout below): 36 / 37 / 32 bits:38 / /-39 / / hash:25-> | age:4 biased_lock:1 lock:2 (normal object) 40 / / JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object) 41 / / size:32- -- > | (CMS free block) 42 / / PromotedObject*:29-> | promo_bits:3-> | (CMS promoted object) 43 / / 44 / / 64 bits:45 / /-46 / / unused:25 hash:31-- > | unused:1 Age:4 biased_lock:1 lock:2 (normal object) 47 / / JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object) 48 / / PromotedObject*:61-> | promo_bits:3-> | (CMS promoted object) 49 / / size:64- -- > | (CMS free block) 50 / / 51 / / unused:25 hash:31-> | cms_free:1 age:4 biased_lock:1 lock:2 (COOPs & & normal object) 52 / / JavaThread*:54 epoch:2 cms_free:1 age:4 biased_lock:1 lock:2 (COOPs & & biased object) 53 / / narrowOop: 32 unused:24 cms_free:1 unused:4 promo_bits:3-> | (COOPs & & CMS promoted object) 54 / / unused:21 size:35-> | cms_free:1 unused:7-> | (COOPs & & CMS free block)

The format of markup words is slightly different on 32-bit and 64-bit machines. There are two variations of the latter, depending on whether compressed object pointers (Compressed Object Pointer) are enabled. Both Oracle JVM and OpenJDK 8 are enabled by default.

Therefore, the object header may be associated with a block of memory or an actual object, and there are multiple states. In the simplest case ("normal object"), the identification hash code is stored directly in the low address of the object header.

In other states, however, the object header contains a pointer to JavaThread or PromotedObject. What's more complicated is: if we put a unique hash code in a "normal object", will it be removed? Move to where? If the object is biased (biased), where can we get or set the identification hash code? What is a biased biased object?

Let's try to answer these questions.

6 bias lock

Biased objects appear to be the result of biased locks. This is a feature enabled by default since HotSpot 6 in an attempt to reduce the cost of locking objects. Locking operations are expensive, and their implementation usually relies on atomic CPU instructions (CAS) to safely handle lock and unlock requests from different threads. According to observation, in most applications, most objects are locked by only one thread, so the cost of atomic operations is often wasted. To avoid this, JVM with biased locks allows threads to set objects to be "biased" to themselves. If an object is biased, the thread can lock and unlock the object without atomic instructions. As long as there are no threads competing for the same object, we will get a performance improvement.

The biased lock (biased_lock bit) in the object header indicates whether the object is biased towards the thread that the JavaThread* points to. Lock positioning (lock bit) indicates whether the object is locked.

Precisely because the biased lock implementation of OpenJDK needs to write a pointer to the tag word, it needs to relocate the real tag word (which contains the identification hash code).

This explains the extra complexity in FasttHashCode. The object header contains not only the identification hash code, but also the lock state (such as a pointer to the lock holder thread). So we need to consider all the circumstances and find the location where the identification hash code is stored.

Let's read FasttHashCode. The first thing we found out was:

601 intptr_t ObjectSynchronizer::FastHashCode (Thread * Self, oop obj) {602 if (UseBiasedLocking) {610 if (obj- > mark ()-> has_bias_pattern ()) {. 617 BiasedLocking::revoke_and_rebias (hobj, false, JavaThread::current ());... 619 assert (! obj- > mark ()-> has_bias_pattern (), "biases should be revoked by now"); 620} 621}

Wait, it just undoes the existing bias and disables the bias lock on the object (false means don't try to reset the bias). Looking at the next few lines, this is indeed an invariant:

637 / / object should remain ineligible for biased locking638 assert (! mark- > has_bias_pattern (), "invariant")

If I read it correctly, this means that simply requesting the object's identity hash code will disable biased locks, which will force locked objects to use expensive atomic instructions, even if there is only one thread.

7 Why does the bias lock conflict with the identification hash code?

To answer this question, we must know where the tag word (including the identification hash code) may exist, depending on the state of the object's lock. The following diagram from HotSpot Wiki shows the conversion process: my (unreliable) reasoning is as follows.

For the four states at the top of the figure, OpenJDK will be able to use a "light" lock. In the simplest case (no lock), this means that the identification hash code and other data are placed directly in the tag word of the object:

46 / / unused:25 hash:31-- > | unused:1 age:4 biased_lock:1 lock:2 (normal object)

In more complex cases, it needs this space to hold the pointer to the "lock object". Therefore, the tag word will be "replaced" and placed somewhere else.

Since only one thread tries to lock the object, the pointer actually points to a memory location in the thread stack. This has two advantages: fast access (no contention or memory access coordination) and the ability to let the thread determine that it has a lock (because the memory location points to its own stack).

But this does not work in all cases. If there is object contention (such as synchronization statements that many threads execute), we will need a more complex structure that can hold not only a copy of the object header, but also a set of waiters. If the thread executes object.wait (), there is a similar need for a waiting list.

This richer data structure is ObjectMonitor, which is called a "heavyweight" pipe in the figure. The object header no longer points to the "replaced tag word", but to an actual object (pipe procedure). At this point, the access identity hash code requires an "inflate the monitor": tracking the pointer to get the object, reading or modifying the field that contains the replaced token. This operation is more expensive and needs to be coordinated.

FasttHashCode does have work to do.

L640 to L680 process the lookup object header and check the cached identity hash code. I believe there is a fast path to detect situations where there is no need to expand the tube.

Starting with L682, you need to grit your teeth:

/ / Inflate the monitor to set hash code683 monitor = ObjectSynchronizer::inflate (Self, obj); 684 / / Load displaced header and check it has hash code685 mark = monitor- > header ();.. 687 hash = mark- > hash ()

At this point, if the identification hash code exists (hash! = 0), JVM can return directly. Otherwise, you need to get the hash code from get_next_hash () and store it securely in the object header saved by ObjectMonitor.

This seems to provide a reasonable explanation for why calling hashCode () on an object that does not override the default implementation causes the object to not meet the conditions for biased locks:

In order to keep the identification hash code of the object unchanged after relocation, the identification hash code needs to be stored in the object header.

The thread that requests to identify the hash code does not necessarily care whether the object is locked, but they actually share the data structure used by the locking mechanism. This mechanism is a complex monster that not only changes itself, but also moves (replaces) the object's head.

Biased locks can be locked and unlocked without using atomic operations. Biased locks are efficient if only one thread locks the object. We can record the lock status in the tag word. I'm not 100% sure, but I think that since other threads may read identity hash codes, even if only one thread needs to be locked, there will be contention for markup words and atomic operations will be needed to ensure accuracy. This negates the whole meaning of biased locks.

8 Review

The default hashCode () implementation (identity hash code) is independent of the memory address of the object, at least in OpenJDK. In OpenJDK 6 and 7, it is a randomly generated number. In OpenJDK 8 and 9, it is a number based on thread state. Here is a test that comes to the same conclusion.

Prove that the "implementation-dependent" warning is not empty: Azul Zing does generate identity hash codes from the memory address of the object.

In HotSpot, the identification hash code is generated only once and then cached in the tag word of the object header.

Zing uses different schemes to ensure the consistency of hash codes after object relocation. They save the value that identifies the hash code only when the object is redirected. At this time, the hash code is saved in the pre-header.

In HotSpot, calling the default value of hashCode () or System.identityHashCode () loses the lock bias of the object.

This means that if you synchronized objects that are not contended, it is best to override the default hashCode () implementation, otherwise you will miss the JVM optimization.

In HotSpot, you can disable bias locks for a single object.

This is very useful. I've seen applications use too many biased locks in competing producer-consumer queues, which brings more trouble than benefits, so we disabled this feature altogether. In fact, we can do this by calling System.identityHashCode () on a specific object or class.

I found that HotSpot does not have a flag to choose the default hashCode generator, so testing other generators may require compiling the source code.

To be honest, I didn't look at it carefully. Michael Rasmussen kindly points out that-XX:hashCode=2 can be used to change the default value. Thank you!

9 benchmark test

I wrote a simple JMH tool to verify these conclusions.

Benchmarking does something like this:

Object.hashCode (); while (true) {synchronized (object) {counter++;}}

The first configuration (withIdHash) synchronizes on objects that use identity hash codes, and we expect that a call to hashCode () will result in biased locks being disabled. The second configuration (withoutIdHash) implements a custom hash code, so biased locks are not disabled. Each configuration is run with one thread and then with two threads (with the suffix "Contended").

By the way, we must enable-XX:BiasedLockingStartupDelay=0, or JVM will wait 4 seconds before triggering the optimization, which will affect the test results.

First execution:

Benchmark Mode Cnt Score Error Units BiasedLockingBenchmark.withIdHash thrpt 100 35168021 ±230252 ops/ms BiasedLockingBenchmark.withoutIdHash thrpt 100 173742468 ±4364491 ops/ms BiasedLockingBenchmark.withIdHashContended thrpt 100 22478109 ±1650649 ops/ms BiasedLockingBenchmark.withoutIdHashContended thrpt 100 20061973 ±786021 ops/ms

We can see that using custom hash codes makes locking and unlocking loops four times faster than using identity hash codes (biased locks are disabled). When two threads compete for locks, the offset lock is disabled, so there is no significant difference between the two hashing methods.

The second run disables biased locks (- XX:-UseBiasedLocking) in all configurations.

Benchmark Mode Cnt Score Error Units BiasedLockingBenchmark.withIdHash thrpt 100 37374774 ±204795 ops/ms BiasedLockingBenchmark.withoutIdHash thrpt 100 36961826 ±214083 ops/ms BiasedLockingBenchmark.withIdHashContended thrpt 10018349906 ±1246372ops/ms BiasedLockingBenchmark.withoutIdHashContended thrpt 10018262290 ±1371588 ops/ms

The hashing method no longer has any impact, and withoutIdHash has lost its advantage.

(all the benchmark tests were run on a 2.7 GHz Intel Core i5 computer. )

10 references

These guesses, as well as my understanding of the JVM source code, come from the patchwork of different materials about layouts, biased locks, and so on. The main information is:

Https://blogs.oracle.com/dave/entry/biased_locking_in_hotspot

Http://fuseyism.com/openjdk/cvmi/java2vm.xhtml

Http://www.dcs.gla.ac.uk/~jsinger/pdfs/sicsa_openjdk/OpenJDKArchitecture.pdf

Https://www.infoq.com/articles/Introduction-to-HotSpot

Http://blog.takipi.com/5-things-you-didnt-know-about-synchronization-in-java-and-scala/#comment-1006598967

Http://www.azulsystems.com/blog/cliff/2010-01-09-biased-locking

Https://dzone.com/articles/why-should-you-care-about-equals-and-hashcode

Https://wiki.openjdk.java.net/display/HotSpot/Synchronization

Https://mechanical-sympathy.blogspot.com.es/2011/11/biased-locking-osr-and-benchmarking-fun.html

11 Appendix: benchmark code package com.github.srvaroa.jmh;import org.openjdk.jmh.annotations.*;import org.openjdk.jmh.infra.Blackhole;import java.util.concurrent.TimeUnit;@State (Scope.Benchmark) @ OutputTimeUnit (TimeUnit.MILLISECONDS) @ Warmup (iterations = 4) @ Fork (value = 5, jvmArgsAppend = {"- XX:-UseBiasedLocking", "- XX:BiasedLockingStartupDelay=0"}) public class BiasedLockingBenchmark {int unsafeCounter = 0; Object withIdHash; Object withoutIdHash @ Setup public void setup () {withIdHash = new Object (); withoutIdHash = new Object () {@ Override public int hashCode () {return 1;}}; withIdHash.hashCode (); withoutIdHash.hashCode () } @ Benchmark public void withIdHash (Blackhole bh) {synchronized (withIdHash) {bh.consume (unsafeCounter++);} @ Benchmark public void withoutIdHash (Blackhole bh) {synchronized (withoutIdHash) {bh.consume (unsafeCounter++) } @ Benchmark @ Threads (2) public void withoutIdHashContended (Blackhole bh) {synchronized (withoutIdHash) {bh.consume (unsafeCounter++);} @ Benchmark @ Threads (2) public void withIdHashContended (Blackhole bh) {synchronized (withIdHash) {bh.consume (unsafeCounter++) At this point, the study on the "method step to improve lock performance by four times by rewriting the hashCode () method" is over. I hope to be able to solve your doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report