This article explains the main knowledge points of the JVM memory model and efficient concurrency. The content is simple and clear, and easy to learn and understand.
The Java Memory Model (JMM) is an abstract concept, or a protocol, used to solve the problem of memory access in concurrent programming while remaining compatible with different hardware and operating systems. The principle of JMM is similar to that of hardware cache coherence: each CPU has its own cache, and each CPU reads and writes shared memory by interacting with its own cache.
In the Java memory model, all variables are stored in main memory. Each Java thread has its own working memory, which holds copies of the variables the thread uses. A thread reads and writes the variables in its working memory; it cannot directly manipulate main memory or access the working memory of other threads. When the value of a variable must be passed between threads, it must go through main memory.
When two threads A and B need to communicate, two steps are involved:
Thread A reads the shared variable from main memory into its working memory, operates on it, and then writes the result back to main memory.
Thread B then reads the latest value of the shared variable from main memory.
The volatile keyword forces every write to a volatile variable to be flushed to main memory, making it visible to every thread.
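A minimal sketch of this visibility guarantee (the VisibilityDemo class and its field names are illustrative, not from the original text): without volatile, the reader thread may keep using the stale copy in its working memory and never terminate.

public class VisibilityDemo {
    // volatile forces writes to be flushed to main memory and reads to come
    // from main memory, so the reader sees the update promptly
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (running) {
                // busy loop; exits only when the write to `running` is visible
            }
            System.out.println("reader stopped");
        });
        reader.start();

        Thread.sleep(100);
        running = false; // flushed to main memory because the field is volatile
        reader.join();
    }
}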
Note that JMM and the division of Java runtime memory regions are concepts at different levels. More precisely, JMM describes a set of rules that control how a program's variables are accessed in the shared and private data areas. In JMM, main memory belongs to the shared data area, roughly corresponding to the heap and the method area, while each thread's working memory belongs to its private data area, roughly corresponding to the program counter, the virtual machine stack, and the native method stack.
Interaction between main memory and working memory
The above describes how main memory and working memory interact and how threads communicate under JMM. As for how variables are transferred between the two memories, JMM defines eight operations that implement the interaction protocol between main memory and working memory:
lock, unlock, read, load, use, assign, store, write
To copy a variable from main memory into working memory, the read and load operations must be performed in order; to synchronize a variable from working memory back to main memory, the store and write operations must be performed in order. The Java memory model only requires that each pair be performed in order, not that they be performed consecutively; other instructions may be inserted between read and load, or between store and write. For example, when accessing variables a and b in main memory, a possible order is read a, read b, load b, load a.
The Java memory model also requires that the following rules be met when performing the above eight basic operations:
read and load, and store and write, must appear in pairs; neither operation of a pair is allowed to appear alone
A thread is not allowed to discard its most recent assign operation; that is, a variable that has been changed in working memory must be synchronized back to main memory
A thread is not allowed to synchronize data from working memory back to main memory for no reason (without an assign operation having occurred)
A new variable can only be created in main memory, and working memory may not use a variable that has not been initialized (by load or assign); that is, before use or store can be performed on a variable, assign or load must have been performed first
A variable may be locked by only one thread at a time, and lock and unlock must appear in pairs
Performing a lock operation on a variable clears its value in working memory; before the execution engine can use the variable, a load or assign operation must re-initialize its value
A thread is not allowed to perform an unlock operation on a variable that it has not previously locked, nor to unlock a variable locked by another thread
Before unlocking a variable, the thread must first synchronize it back to main memory (perform the store and write operations)
In addition, the virtual machine makes some special rules for the volatile keyword and for long and double.
Two functions of the volatile keyword
Ensuring variable visibility: when a variable modified by volatile is changed by one thread, other threads can immediately see the change. When a thread writes to a volatile variable, the virtual machine forces the value to be flushed to main memory; when a thread uses a volatile variable, the virtual machine forces it to read the value from main memory.
Suppressing instruction reordering: instruction reordering is a means by which compilers and processors optimize program execution. It only guarantees that the result of the program is correct, not that the operations run in the order written in the code. This is harmless in a single thread, but it can cause problems in multiple threads. A classic example is adding volatile to the field of a singleton to prevent instruction reordering. The following program illustrates how volatile prevents it:
public class Singleton {

    private volatile static Singleton singleton;

    private Singleton() {}

    public static Singleton getInstance() {
        if (singleton == null) {                      // 1
            synchronized (Singleton.class) {
                if (singleton == null) {
                    singleton = new Singleton();      // 2
                }
            }
        }
        return singleton;
    }
}
In fact, when execution reaches 2, an error may occur if the variable singleton is not modified with the volatile keyword. This is because initializing an object with the new keyword is not an atomic operation; it breaks down into three steps (see the sketch after this list):
Allocate memory for the object
Call the Singleton constructor to initialize member variables
Point the singleton reference at the allocated memory space (singleton becomes non-null after this step)
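As a hedged, comment-only sketch (the helper names allocate and ctorInstance are purely illustrative pseudocode, not real JVM calls), the assignment decomposes roughly as:

// Hedged pseudocode for: singleton = new Singleton();
// memory = allocate();    // 1. allocate memory for the object
// ctorInstance(memory);   // 2. run the constructor, initialize fields
// singleton = memory;     // 3. point the reference at the memory
// Without volatile, steps 2 and 3 may be reordered, so another thread can
// observe a non-null singleton that is not yet initialized.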
If the virtual machine performs instruction reordering, the order of steps 2 and 3 is not guaranteed. Suppose thread A enters the synchronized block and executes step 3 before step 2: singleton is no longer null, but the object has not been initialized. If thread B now reaches 1, it finds singleton non-null and returns it for use; since the Singleton has not actually been initialized, an error naturally occurs.
Note, however, that volatile double-checked locking is still problematic in versions prior to JDK 1.5. The JMM before Java 5 was flawed: even declaring the variable volatile could not completely prevent reordering, mainly because reordering of code around volatile variables was still permitted. JSR-133, adopted in JDK 1.5, fixed this by strengthening the semantics of volatile: read and write memory barriers are inserted around volatile accesses to guarantee visibility and to keep steps 2 and 3 from being reordered by the CPU, so double-checked locking with volatile can be used with confidence from then on.
Special requirements for long and double
In addition to the volatile keyword, the virtual machine also makes a special rule for long and double: reads and writes of long and double variables that are not modified by volatile may be split into two 32-bit operations. In other words, reading and writing a long or double is non-atomic and may happen in two steps. You can guarantee atomicity of reads and writes by declaring such variables volatile.
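A hedged sketch of what the specification permits (the TearingDemo class and its field names are illustrative; on most modern 64-bit JVMs long accesses are atomic in practice, so the torn read will typically only be observable on a 32-bit JVM):

public class TearingDemo {
    static long value; // try: static volatile long value, to restore atomicity

    public static void main(String[] args) {
        Thread writer = new Thread(() -> {
            for (long i = 0; i < 100_000_000L; i++) {
                value = 0L;   // all bits clear
                value = -1L;  // all bits set
            }
        });
        Thread reader = new Thread(() -> {
            while (writer.isAlive()) {
                long v = value;
                if (v != 0L && v != -1L) { // neither legal value: the read was torn
                    System.out.println("torn read: 0x" + Long.toHexString(v));
                }
            }
        });
        writer.start();
        reader.start();
    }
}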
The happens-before principle & as-if-serial
The Java memory model is defined in terms of operations, and JMM defines a partial ordering over all operations in a program: the happens-before relation. It is the main basis for judging whether there is a data race and whether code is thread-safe. To guarantee that the thread performing operation B sees the result of operation A, the happens-before relationship between A and B must hold; otherwise the JVM may reorder them arbitrarily.
The happens-before principle mainly includes the following rules. When two operations satisfy any of them, we can conclude that they are ordered with respect to each other (a short sketch follows the list):
Program order rule (Program Order Rule)
Monitor lock rule (Monitor Lock Rule)
Volatile variable rule (Volatile Variable Rule)
Thread start rule (Thread Start Rule)
Thread termination rule (Thread Termination Rule)
Thread interruption rule (Thread Interruption Rule)
Finalizer rule (Finalizer Rule)
Transitivity (Transitivity)
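A minimal sketch combining the program order rule, the volatile variable rule, and transitivity (the HappensBeforeDemo class and its field names are illustrative):

public class HappensBeforeDemo {
    static int data;
    static volatile boolean ready;

    static void writer() {
        data = 42;        // 1. ordinary write, before the volatile write in program order
        ready = true;     // 2. volatile write: publishes the data
    }

    static void reader() {
        if (ready) {                    // 3. volatile read happens-after the volatile write
            System.out.println(data);   // guaranteed to print 42, by transitivity
        }
    }
}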
Time order and happens-before are unrelated; neither can be inferred from the other. When reasoning about concurrency safety, do not be misled by the order in which operations happen in time; everything should be judged by the happens-before principle.
If two operations access the same variable and one of them is a write, the two operations have a data dependency. There are three cases: 1) write after read; 2) write after write; 3) read after write. In all three, reordering would change the final execution result, so compilers and processors respect data dependencies when reordering and never change the execution order of two operations that share one.
Then there is as-if-serial semantics: no matter how much reordering occurs (by compilers and processors, to increase parallelism), the execution result of a single-threaded program must not change. Compilers, runtimes, and processors must all obey as-if-serial semantics. As-if-serial guarantees that the result of a single-threaded program is unchanged, while happens-before guarantees that the result of a correctly synchronized multithreaded program is unchanged.
Both the happens-before principle and as-if-serial semantics are rules the virtual machine follows so that parallel optimizations do not change execution results: the former applies to multithreaded situations, the latter to single-threaded ones.
2. Implementation of Java threads

2.1 Java threads
On Windows and Linux, Java threads are implemented with a one-to-one thread model. The one-to-one model is a threading model in which language-level programs call the operating system kernel indirectly: when we use a Java thread, the Java virtual machine hands the task over to a kernel thread of the current operating system. A Kernel-Level Thread (KLT) is a thread supported directly by the operating system kernel: the kernel switches threads and, through its scheduler, maps thread tasks onto the processors. Each kernel thread can be seen as a separate instance of the kernel, which is how the operating system handles multiple tasks at the same time. Since the multithreaded programs we write live at the language level, they generally do not call kernel threads directly; instead they use lightweight processes (Light Weight Process), which are threads in the usual sense. Because each lightweight process is mapped to one kernel thread, we can reach the kernel thread through the lightweight process, and the operating system kernel then maps the task onto a processor. This one-to-one relationship between lightweight processes and kernel threads is called the one-to-one thread model.
Each thread is ultimately mapped onto a CPU for processing, and if the CPU has multiple cores, one CPU can execute multiple thread tasks in parallel.
2.2 Thread safety
There are three ways to ensure thread safety in Java: 1) mutually exclusive synchronization; 2) non-blocking synchronization; 3) no synchronization.
Mutually exclusive synchronization
The most basic way to synchronize in Java is the synchronized keyword, which after compilation produces the monitorenter and monitorexit bytecode instructions before and after the synchronized block. Both instructions take a reference-type parameter that indicates the object to lock and unlock. If the object is specified explicitly in the Java program, that object is used; otherwise the object instance or the Class object is taken as the lock object, depending on whether synchronized modifies an instance method or a static method.
synchronized is inherently reentrant: the virtual machine requires that, when executing the monitorenter instruction, the thread first tries to acquire the object's lock. If the object is not locked, or the current thread already holds its lock, the lock counter is incremented by 1; executing the corresponding monitorexit instruction decrements the counter by 1, and the lock is released when the counter reaches 0. If acquiring the object lock fails, the current thread blocks and waits until the lock is released by the other thread.
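A minimal sketch of this reentrancy (the ReentrantDemo class and method names are illustrative): the same thread acquires the monitor on this twice without deadlocking, because the lock counter simply goes to 2.

public class ReentrantDemo {
    public synchronized void outer() {
        System.out.println("outer acquired the lock");
        inner(); // re-enters the monitor already held by this thread
    }

    public synchronized void inner() {
        System.out.println("inner re-acquired the same lock");
    }

    public static void main(String[] args) {
        new ReentrantDemo().outer();
    }
}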
Besides synchronized, we can also use ReentrantLock from JUC (java.util.concurrent) to achieve synchronization. It is similar to synchronized, but differs mainly in three ways:
Interruptible waiting: when the thread holding the lock does not release it for a long time, a waiting thread can choose to give up waiting
Fair locking: when multiple threads wait for the same lock, a fair lock grants the lock in the order in which the threads requested it, whereas an unfair lock makes no such guarantee when the lock is released. synchronized is not a fair lock, and ReentrantLock is unfair by default but can be made fair through its constructor.
Binding a lock to multiple conditions: a ReentrantLock can bind multiple Condition objects simply by calling newCondition() several times, whereas with synchronized you would have to add an extra lock for each additional condition (see the sketch after this list).
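A hedged sketch of all three differences in one place, modeled on the standard bounded-buffer idiom (the BoundedBuffer class is illustrative): a fair lock created via the constructor, interruptible acquisition, and two Condition objects bound to one lock.

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedBuffer {
    private final ReentrantLock lock = new ReentrantLock(true); // fair lock
    private final Condition notFull  = lock.newCondition();     // condition 1
    private final Condition notEmpty = lock.newCondition();     // condition 2
    private final Object[] items = new Object[16];
    private int count, putIdx, takeIdx;

    public void put(Object x) throws InterruptedException {
        lock.lockInterruptibly(); // a waiting thread can be interrupted
        try {
            while (count == items.length) {
                notFull.await(); // wait on the "not full" condition only
            }
            items[putIdx] = x;
            putIdx = (putIdx + 1) % items.length;
            count++;
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    public Object take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == 0) {
                notEmpty.await(); // wait on the "not empty" condition only
            }
            Object x = items[takeIdx];
            takeIdx = (takeIdx + 1) % items.length;
            count--;
            notFull.signal();
            return x;
        } finally {
            lock.unlock();
        }
    }
}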
Before JDK 1.5, synchronized performed worse than ReentrantLock in multithreaded environments, but from JDK 1.6 on the virtual machine has optimized the performance of synchronized, and performance is no longer the main reason to prefer ReentrantLock over synchronized.
Non-blocking synchronization
Non-blocking synchronization means synchronization that does not require suspending threads, in contrast to mutually exclusive synchronization. Mutual exclusion is essentially a pessimistic concurrency strategy, while non-blocking synchronization is an optimistic one. Many concurrent classes in JUC are built on CAS (Compare-And-Swap), which resembles optimistic locking. Three values are involved in the comparison: the "new value", the "old value", and the "value in memory". The implementation uses a loop: each time, the "old value" is compared with the "value in memory". If the two are equal, the "value in memory" has not been modified by another thread and the "new value" is written to memory; otherwise it has been modified, so the thread re-reads the value in memory as the new "old value" and compares again, until the "old value" and the "value in memory" match and the "new value" is written.
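A minimal sketch of such a CAS retry loop, written with AtomicInteger (the CasCounter class is illustrative; AtomicInteger's own getAndIncrement performs the same loop internally):

import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    public int increment() {
        int oldValue, newValue;
        do {
            oldValue = value.get();      // read the "value in memory" as the "old value"
            newValue = oldValue + 1;     // compute the "new value"
        } while (!value.compareAndSet(oldValue, newValue)); // retry if another thread changed it
        return newValue;
    }
}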
Note that although the CAS operation just described involves three steps, they must be completed atomically; otherwise, after judging that the "value in memory" equals the "old value", another thread might modify it before the "new value" is written back. In the JDK, a family of native methods such as compareAndSwapInt in sun.misc.Unsafe performs this operation atomically. Also note that CAS has some problems of its own:
The best-known of these is the ABA problem: if the value changes from A to B and back to A between the read and the compare-and-swap, the comparison still succeeds even though the value was modified in between. AtomicStampedReference (rather than a plain AtomicReference) mitigates this by attaching a version stamp to the reference.
No synchronization
A no-synchronization scheme simply needs no synchronization at all. For example, some collections are immutable, so synchronizing them is unnecessary. Some methods are pure functions, common in functional programming: their output is fully determined by their input, and the variables involved are local, so no synchronization is needed. Another example is thread-local variables, such as ThreadLocal.
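A minimal sketch of the thread-local approach (the DateFormatHolder class is an illustrative use case, not from the original text): each thread gets its own copy of a non-thread-safe SimpleDateFormat, so the formatter never needs to be synchronized.

import java.text.SimpleDateFormat;
import java.util.Date;

public class DateFormatHolder {
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    public static String format(Date date) {
        return FORMAT.get().format(date); // each thread uses its own private copy
    }
}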
2.3 Lock optimization

Spin locks and adaptive spinning
Spin locks address the cost of thread switching during mutually exclusive synchronization, since switching threads has its own overhead. If the physical machine has more than one processor, allowing two or more threads to execute in parallel, we can ask a thread requesting the lock to "wait a moment" without giving up its processor time, to see whether the thread holding the lock releases it soon. To make the thread wait, we simply have it execute a busy loop (spin); this technique is called spin locking.
Spin locks were introduced in JDK 1.4.2, off by default and enabled with the -XX:+UseSpinning parameter; since JDK 1.6 they are on by default. Although spin waiting avoids the overhead of thread switching, it occupies processor time. If locks are held only briefly, spin waiting works very well; conversely, if locks are held for a long time, the spinning threads merely consume processor resources without doing any useful work, wasting performance.
The number of spins can be set with the -XX:PreBlockSpin parameter, whose default is 10. JDK 1.6 introduced adaptive spin locks. Adaptive means the spin time is no longer fixed but is determined by the previous spin time on the same lock and the state of the lock's owner. If, on the same lock object, a spin wait has just succeeded in acquiring the lock and the thread holding the lock is running, the virtual machine assumes the spin is likely to succeed again and lets the spin wait last relatively longer, for example 100 loop iterations. Conversely, if spinning rarely succeeds for a given lock, future acquisitions may skip the spin entirely to avoid wasting processor resources.
Here is an example of an implementation of a spin lock:
import java.util.concurrent.atomic.AtomicReference;

public class SpinLock {

    private AtomicReference<Thread> sign = new AtomicReference<>();

    public void lock() {
        Thread current = Thread.currentThread();
        while (!sign.compareAndSet(null, current)) {
            // spin: busy-wait until the lock is free
        }
    }

    public void unlock() {
        Thread current = Thread.currentThread();
        sign.compareAndSet(current, null);
    }
}
As the example shows, the spin lock is acquired and released through CAS operations by comparing whether the current value matches the expected value. In the lock method, if the value in sign is null the lock is free; otherwise the lock is held by another thread and the caller waits in a loop. In the unlock method, setting the value in sign back to null signals waiting threads that the lock has been released.
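A hedged usage sketch for the SpinLock above (the SpinLockDemo class is illustrative): two threads increment a shared counter under the spin lock, keeping the critical section short, since spinning is only cheap for short lock holds.

public class SpinLockDemo {
    private static final SpinLock LOCK = new SpinLock();
    private static int counter;

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                LOCK.lock();
                try {
                    counter++; // critical section is kept as short as possible
                } finally {
                    LOCK.unlock();
                }
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter); // expected: 20000, no lost updates
    }
}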
Lock coarsening
Lock coarsening is easy to understand: a series of consecutive lock and unlock operations on the same object is merged into one, extending several consecutive locks into a single larger-scoped lock.
public class StringBufferTest {

    StringBuffer sb = new StringBuffer();

    public void append() {
        sb.append("a");
        sb.append("b");
        sb.append("c");
    }
}
Here, every call to sb.append() locks and unlocks the StringBuffer. If the virtual machine detects a series of lock and unlock operations on the same object, it merges them into one larger-scoped pair: locking before the first append() and unlocking after the last append() ends.
Lightweight lock
Lightweight locks address the performance cost of heavyweight locks during mutual exclusion. A heavyweight lock is the lock implemented by the synchronized keyword: synchronized is implemented through a monitor lock inside the object, and the monitor in turn relies on the operating system's underlying Mutex Lock. Switching between threads requires the operating system to switch from user mode to kernel mode, which is very expensive, and the transition between the two states takes relatively long.
First, part of the object header, called the Mark Word, stores the object's runtime data such as its hash code and GC age; 2 bits of it store the lock flag.
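As an aside, the object header and its lock bits can be observed with the OpenJDK JOL tool; this is a hedged sketch that assumes the external org.openjdk.jol:jol-core dependency, which is not mentioned in the original text.

import org.openjdk.jol.info.ClassLayout;

public class MarkWordDemo {
    public static void main(String[] args) {
        Object obj = new Object();
        // prints the header of an unlocked object, including the mark word
        System.out.println(ClassLayout.parseInstance(obj).toPrintable());
        synchronized (obj) {
            // inside the block the lock bits in the mark word change
            System.out.println(ClassLayout.parseInstance(obj).toPrintable());
        }
    }
}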
When code enters a synchronized block and the object is unlocked (lock flag "01"), the virtual machine first creates a space called the lock record (Lock Record) in the current thread's stack frame, storing a copy of the object's current Mark Word. After the copy succeeds, the virtual machine uses a CAS operation to try to update the object's Mark Word to a pointer to the Lock Record, points the owner pointer in the Lock Record to the object's Mark Word, and changes the lock flag in the object's Mark Word to "00", indicating the object is locked. If the CAS update fails, the virtual machine first checks whether the object's Mark Word already points to the current thread's stack frame; if so, the current thread already holds the object's lock and can enter the synchronized block directly. Otherwise multiple threads are competing for the lock: the lightweight lock inflates to a heavyweight lock, the lock flag becomes "10", a pointer to the heavyweight lock (the mutex) is stored in the Mark Word, and the threads waiting for the lock enter the blocked state. Before that, the current thread tries to acquire the lock by spinning, that is, looping over the acquisition attempt to keep the thread from blocking.
From the above we can see that when a thread holds a lightweight lock on an object, the object's Mark Word points to the Lock Record in the thread's stack frame, and the Lock Record points back to the object's Mark Word. The Lock Record in the stack frame tells which object's lock the current thread holds, while the object's Mark Word tells which thread holds the object's lock. When a thread tries to acquire an object's lock, it first checks the lock flag to see whether the object is locked, and then uses a CAS operation to determine whether the thread holding the lock is itself.
Lightweight locks are not meant to replace heavyweight locks: besides the mutual exclusion itself they add extra CAS operations, so under contention lightweight locks can be slower than traditional heavyweight locks.
Biased lock
When an object is first instantiated and no thread has accessed it yet, it is biasable: it assumes that only one thread will access it, so when the first thread does, the object becomes biased toward that thread. The object then holds a biased lock in favor of the first thread. That thread uses a CAS operation when changing the object header to the biased state, writing its own thread ID into the object header; on subsequent accesses it only needs to compare the ID and no longer needs CAS operations.
Once a second thread accesses the object, it sees the object's biased state, because a biased lock is never released proactively; this indicates that there is now contention on the object. The JVM checks whether the thread that originally held the bias is still alive. If it has terminated, the object can be made unlocked and then re-biased to the new thread. If the original thread is still alive, its operation stack is examined to check how the object is being used: if it still needs the lock, the biased lock is upgraded to a lightweight lock (this is exactly when the bias-to-lightweight upgrade happens); if it no longer uses the object, the object can be reverted to the unlocked state and then re-biased.
Lightweight locks assume that contention exists but is mild: typically two threads interleave their use of the same lock, or a thread waits a little (spins) and the other releases the lock. But when spinning exceeds a certain number of attempts, or when one thread holds the lock, another is spinning, and a third arrives, the lightweight lock inflates to a heavyweight lock, which blocks all threads except the owner and keeps the CPU from spinning idly.
If locks in your application are mostly accessed by many different threads, bias mode is superfluous and can be disabled with -XX:-UseBiasedLocking.
Lightweight and biased locks are based on the empirical fact that, in most cases, the thread acquiring an object's lock is the same one as before, in which case they are more efficient than a heavyweight lock. When locks are frequently accessed by many different threads, they are not necessarily more efficient than heavyweight locks. They are therefore not replacements for heavyweight locks but optimizations for particular scenarios, and we can enable or disable them with virtual machine parameters according to the scenario at hand.
Thank you for reading. The above covers the JVM memory model and the main knowledge points of efficient concurrency; a deeper understanding comes from verifying them in practice.