How to understand the underlying principle semantics of Java volatile memory barrier

2025-01-18 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

This article explains how to understand the underlying principles and semantics of the Java volatile memory barrier. The explanation aims to be concise and easy to follow, and I hope you gain something from the detailed introduction below.

I. Introduction to the volatile keyword and its underlying characteristics

1. volatile memory semantics

When a variable is declared volatile, it gains two properties. The first is visibility of the variable to all threads, where "visibility" means that when one thread changes the variable's value, the new value is immediately known to other threads. Ordinary variables cannot do this; their values are passed between threads through main memory. For example, thread A modifies the value of an ordinary variable and then writes it back to main memory, and another thread B reads from main memory after thread A has finished writing back; only then does the new value become visible to thread B.

The second semantic of volatile variables is to prohibit instruction reordering optimization. Ordinary variables only guarantee that correct results are observed at every point that depends on an assignment during the method's execution; they do not guarantee that assignments happen in the same order as in the program code. A single thread cannot perceive this difference during method execution, which is the "within-thread as-if-serial semantics" described by the Java memory model.

2. Basic principle of volatile

Variables modified by the volatile keyword are guaranteed visibility and ordering, but not atomicity. So what is the underlying mechanism? We can inspect the assembly instructions the JVM generates for the Java code. Printing the assembly requires the JVM parameters -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -Xcomp. If your JDK version is 8 or lower, you also need the hsdis plugin: copy the two files (hsdis-amd64.dll, hsdis-i386.dll) from the plugin directory into %JAVA_HOME%\jre\bin\server. Then run your Java program, and you will see a pile of assembly instructions in the console.
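Put together, the invocation might look like the following sketch (the class name is from the Singleton sample in this article; the flags require the hsdis disassembler to be installed as described above):

```shell
# Print the JIT-generated assembly for the Singleton sample.
# -Xcomp forces compilation so the putstatic for myinstance shows up immediately.
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -Xcomp Singleton
```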

```java
public class Singleton {
    private volatile static Singleton myinstance;

    public static Singleton getInstance() {
        if (myinstance == null) {
            synchronized (Singleton.class) {
                if (myinstance == null) {
                    myinstance = new Singleton(); // object creation, which can be divided into three steps
                }
            }
        }
        return myinstance;
    }

    public static void main(String[] args) {
        Singleton.getInstance();
    }
}
```

Shown above is a standard double-checked locking (Double Check Lock, DCL) singleton, which lets us observe the difference in the generated assembly with and without the volatile keyword. Without the volatile keyword, search the console output for myinstance and you will see the following lines:

```
0x00000000038064dd: mov    %r10d,0x68(%rsi)
0x00000000038064e1: shr    $0x9,%rsi
0x00000000038064e5: movabs $0xf1d8000,%rax
0x00000000038064ef: movb   $0x0,(%rsi,%rax,1)  ;*putstatic myinstance
                                               ; - com.it.edu.jmm.Singleton::getInstance@24 (line 22)
```

After adding the volatile keyword, it looks like this:

```
0x0000000003cd6edd: mov    %r10d,0x68(%rsi)
0x0000000003cd6ee1: shr    $0x9,%rsi
0x0000000003cd6ee5: movabs $0xf698000,%rax
0x0000000003cd6eef: movb   $0x0,(%rsi,%rax,1)
0x0000000003cd6ef3: lock addl $0x0,(%rsp)     ;*putstatic myinstance
                                               ; - com.it.edu.jmm.Singleton::getInstance@24 (line 22)
```

Comparing the two, the key difference lies with the volatile-modified variable: after the assignment (the preceding movb $0x0,(%rsi,%rax,1) is the assignment operation), an extra lock addl $0x0,(%rsp) instruction is executed. This acts as a memory barrier (Memory Barrier or Memory Fence), meaning that during reordering, subsequent instructions cannot be moved to positions before the barrier. When only one processor accesses the memory, no memory barrier is needed; but when two or more processors access the same memory and one of them observes the other, a memory barrier is required to guarantee consistency.

The key here is the lock prefix. It writes the processor's cache back to memory, and this write-back also invalidates the corresponding cache line in other processors or cores (the Invalidate, or I, state of the MESI protocol). This is equivalent to performing the "store" and "write" operations on the cached variable as described earlier for the Java memory model. Through this operation, a modification of a volatile variable becomes immediately visible to other processors. At a lower level, the lock instruction is implemented as a cache lock (MESI) if cache-line locking is supported, and as a bus lock otherwise.

II. volatile: visibility

A volatile-modified variable guarantees visibility, which the following program demonstrates:

```java
public class VolatileVisibilitySample {
    private volatile boolean initFlag = false;
    static Object object = new Object();

    public void refresh() {
        this.initFlag = true;
        System.out.println("Thread: " + Thread.currentThread().getName() + " modified the shared variable initFlag");
    }

    public void load() {
        int i = 0;
        while (!initFlag) {
            // synchronized (object) {
            //     i++;
            // }
        }
        System.out.println("Thread: " + Thread.currentThread().getName() + " sniffed a change in the state of initFlag " + i);
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileVisibilitySample sample = new VolatileVisibilitySample();
        Thread threadA = new Thread(() -> sample.refresh(), "threadA");
        Thread threadB = new Thread(() -> sample.load(), "threadB");
        threadB.start();
        Thread.sleep(2000);
        threadA.start();
    }
}
```

Before the shared variable is modified with volatile, the message "sniffed a change in the state of initFlag" in the method called by thread B is never printed: thread A changes initFlag to true, but thread B never sees the new value, and the program spins in the loop. In JMM terms: although thread A changed initFlag to true and will eventually synchronize it back to main memory, thread B keeps reading initFlag from its own working memory, so the loop never exits.

After adding the volatile modifier, the message "senses a change in the state of initFlag" is printed. The volatile keyword introduces the lock instruction, and through the cache coherence protocol thread B keeps sniffing whether initFlag has been changed. Thread A synchronizes initFlag back to main memory immediately after modifying it, thread B's cache line is changed to the I (invalid) state, and thread B must re-read the value from main memory. As shown in the following figure:

If we modify the load() method of the code above, removing the volatile keyword and adding a synchronized block as below, we achieve the same effect as volatile. This is because the synchronized block introduces lock competition: the CPU allocates time slices, and the resulting thread context switches cause the variable to be re-read from main memory.

```java
public void load() {
    int i = 0;
    while (!initFlag) {
        synchronized (object) {
            i++;
        }
    }
    System.out.println("Thread: " + Thread.currentThread().getName() + " senses a change in the state of initFlag " + i);
}
```

III. volatile: cannot guarantee atomicity

Because volatile variables only guarantee visibility, in computation scenarios that do not meet the following two rules we still need to guarantee atomicity with locks (synchronized, the locks in java.util.concurrent, or the atomic classes):

The result of the operation does not depend on the current value of the variable, or only a single thread modifies the variable's value.

The variable does not participate in invariant constraints with other state variables.

Here is an example: 10 threads, each incrementing a counter 1000 times. counter++ is not an atomic operation: inspecting the bytecode with the javap command shows separate steps such as loading the variable, pushing it onto the operand stack, performing the addition, and writing the result back. Running the program several times, the result is sometimes less than 10000. Here is the analysis:
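The javap inspection mentioned above might look like the following sketch (the exact constant-pool indices vary; the bytecode names are the standard JVM instructions a static int increment compiles to):

```shell
# Disassemble the compiled class to see that counter++ is several separate bytecodes
javap -c VolatileAtomicSample
#   getstatic  #2   // read counter from the class
#   iconst_1        // push the constant 1
#   iadd            // add on the operand stack
#   putstatic  #2   // write the result back
```

Another thread can interleave between the getstatic and the putstatic, which is exactly why the increment can be lost.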

1. When counter is not modified by volatile: the 10 threads increment the variable concurrently, and each write back to main memory can overwrite the results of other threads, so the result may be less than 10000.

2. When counter is modified by volatile: after each modification the variable is synchronized back to main memory immediately, and each thread keeps sniffing for modifications by others, re-reading the variable from main memory when it changes. But precisely because the MESI cache coherence protocol takes effect with volatile, when one thread's increment is synchronized back to main memory the cache line is locked and the other threads are notified, their local cache lines changing to the I (invalid) state. A thread whose cache line was invalidated discards the result of its local increment instead of writing it back to main memory; that increment is done for nothing. So the result may still be less than 10000.

To make the operation atomic, lock with synchronized or ReentrantLock, or use AtomicInteger for atomic operations.
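A minimal sketch of the AtomicInteger alternative mentioned above (the class name and the runDemo helper are illustrative, not from the original sample; joining the worker threads replaces the sleep):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterSample {
    static final AtomicInteger counter = new AtomicInteger(0);

    static int runDemo() {
        Thread[] threads = new Thread[10];
        for (int i = 0; i < 10; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    counter.incrementAndGet(); // atomic read-modify-write, unlike counter++
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for every worker instead of sleeping
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // always 10000
    }
}
```

Because incrementAndGet performs the read, add, and write as one atomic hardware operation, no increment can be lost and the result is always 10000.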

```java
public class VolatileAtomicSample {
    private static volatile int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 10; i++) {
            Thread thread = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    counter++;
                }
            });
            thread.start();
        }
        Thread.sleep(1000);
        System.out.println(counter);
    }
}
```

IV. volatile: prohibiting instruction reordering

1. Instruction reordering

Reordering is a means by which compilers and processors rearrange instruction sequences to optimize program performance. The Java language specification requires the JVM to maintain sequential semantics within a thread: as long as the program's final result equals the result of strictly sequential execution, the actual execution order of instructions may differ from the code order. This process is called instruction reordering. What is it for? The JVM can reorder machine instructions according to processor characteristics (multi-level CPU caches, multiple cores, and so on) so that the instructions better match the CPU's execution characteristics and machine performance is maximized.

The figure below shows the instruction sequence from source code to final execution.

Instruction reordering happens mainly in two stages:

1. Compilation stage: the compiler reorders instructions when compiling class files to machine code.
2. CPU execution stage: the CPU may reorder instructions as it executes them.

2. as-if-serial semantics

as-if-serial semantics means: no matter how instructions are reordered (by the compiler and processor to improve parallelism), the result of a single-threaded program must not change. Compilers, runtimes, and processors must all obey as-if-serial semantics. To comply, the compiler and processor do not reorder operations that have data dependencies, because such reordering would change the result; operations without data dependencies, however, may be reordered by the compiler and processor.

The following program demonstrates the effect of instruction reordering: the loop exits only when x == 0 and y == 0.

```java
public class VolatileReOrderSample {
    private static int x = 0, y = 0;
    private static int a = 0, b = 0;
    static Object object = new Object();

    public static void main(String[] args) throws InterruptedException {
        int i = 0;
        for (;;) {
            i++;
            x = 0; y = 0; a = 0; b = 0;
            Thread t1 = new Thread(new Runnable() {
                @Override
                public void run() {
                    a = 1;
                    x = b;
                }
            });
            Thread t2 = new Thread(new Runnable() {
                @Override
                public void run() {
                    b = 1;
                    y = a;
                }
            });
            t1.start();
            t2.start();
            t1.join();
            t2.join();
            String result = "Iteration " + i + ": (" + x + "," + y + ")";
            if (x == 0 && y == 0) {
                System.err.println(result);
                break;
            } else {
                System.out.println(result);
            }
        }
    }
}
```

Analysis suggests three possible outputs: (0,1), (1,0), and (1,1).

Possible output (0,1): thread 1 finishes first, then thread 2 runs, giving x = 0, y = 1.
Possible output (1,0): thread 2 finishes first, then thread 1 runs, giving x = 1, y = 0.
Possible output (1,1): threads 1 and 2 interleave: a = 1, b = 1, then x = 1, y = 1.

Running the program shows that all three cases indeed appear, but the program eventually breaks out of the loop; that is, the case x = 0 and y = 0 occurred. This means instruction reordering happened: the order of a = 1; x = b; in thread 1, or of b = 1; y = a; in thread 2, was adjusted.

After adding the volatile keyword to variables a and b (private volatile static int a = 0, b = 0;), the program loops forever and the x = y = 0 case no longer appears.

volatile can prohibit instruction reordering because the added lock instruction inserts a memory barrier.

V. volatile and memory barriers (Memory Barrier)

1. Memory barrier (Memory Barrier)

A memory barrier (Memory Barrier), also called a memory fence, is a CPU instruction with two functions: it guarantees the execution order of specific operations, and it guarantees the memory visibility of certain variables (volatile's visibility is implemented with this feature). Since both the compiler and the processor can perform instruction reordering optimizations, inserting a Memory Barrier between instructions tells the compiler and CPU that no instruction may be reordered across it; in other words, the memory barrier prohibits reordering optimizations of the instructions before and after it. A Memory Barrier also forces the various CPU caches to be flushed, so a thread on any CPU can read the latest version of the data. In short, volatile variables implement their memory semantics, visibility and the prohibition of reordering, through memory barriers (the lock instruction).

The earlier example, the DCL singleton implemented with synchronized + volatile, relies on volatile's prohibition of instruction reordering. The statement myinstance = new Singleton(); essentially consists of three steps: 1. allocate memory for the object; 2. initialize the object's data; 3. point the reference at the object's memory. If instruction reordering occurs while the first thread creates the object, for example step 3 moves before step 2, then thread 2's outermost check myinstance != null evaluates to true and it returns the reference, but the object has not actually finished initializing. That is a bug, and the volatile keyword must be added to prohibit the reordering.

2. Implementation of volatile's memory semantics

As mentioned earlier, reordering is divided into compiler reordering and processor reordering. To implement volatile's memory semantics, the JMM restricts both types separately. The figure below is the volatile reordering rule table the JMM specifies for compilers.

For example, the last cell in the third row means: when the first operation is a read or write of a normal variable and the second operation is a volatile write, the compiler cannot reorder these two operations.

From the picture above, we can see:

When the second operation is a volatile write, it cannot be reordered no matter what the first operation is. This rule ensures that operations before a volatile write are not reordered by the compiler to after it.

When the first operation is a volatile read, it cannot be reordered no matter what the second operation is. This rule ensures that operations after a volatile read are not reordered by the compiler to before it.

When the first operation is volatile write and the second operation is volatile read, it cannot be reordered.

In order to implement the memory semantics of volatile, when generating bytecode the compiler inserts memory barriers in the instruction sequence to prohibit particular types of processor reordering. It is almost impossible for the compiler to find an optimal arrangement that minimizes the total number of inserted barriers, so the JMM adopts a conservative strategy. The following is the JMM memory barrier insertion strategy under that conservative strategy:

Insert a StoreStore barrier in front of each volatile write operation.

Insert a StoreLoad barrier after each volatile write operation.

Insert a LoadLoad barrier after each volatile read operation.

Insert a LoadStore barrier after each volatile read operation.

The above memory barrier insertion strategy is very conservative, but it guarantees correct volatile memory semantics on any processor platform and in any program.

The following is a schematic diagram of the instruction sequence generated by volatile after inserting the memory barrier under a conservative strategy, as shown in the figure.

The StoreStore barrier in the figure ensures that all normal writes before it are visible to any processor before the volatile write, because the barrier guarantees those normal writes are flushed to main memory before the volatile write.

The StoreLoad barrier after the volatile write prevents the volatile write from being reordered with a possible subsequent volatile read or write.

The following is a schematic diagram of the instruction sequence generated by volatile reading after inserting the memory barrier under a conservative strategy.

The LoadLoad barrier in the figure prevents the processor from reordering the volatile read above it with normal reads below it. The LoadStore barrier prevents the processor from reordering the volatile read above it with normal writes below it.

The barrier insertion strategies above for volatile writes and volatile reads are very conservative. In actual execution, the compiler can omit unnecessary barriers according to the specific situation, as long as the write-read memory semantics of volatile are preserved.
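The conservative strategy can be made concrete with the explicit fence methods of java.lang.invoke.VarHandle (Java 9+). This is only an illustration: for a real volatile field the JVM inserts the barriers automatically, and the mapping of each fence to a JMM barrier (marked "~" below) is approximate:

```java
import java.lang.invoke.VarHandle;

public class FenceSketch {
    static int normal;       // plain field
    static volatile int vol; // volatile field (shown with explicit fences for illustration)

    static void volatileWritePattern(int value) {
        normal = 1;               // normal write before the volatile write
        VarHandle.releaseFence(); // ~ StoreStore: prior writes complete before the volatile write
        vol = value;              // the volatile write itself
        VarHandle.fullFence();    // ~ StoreLoad: not reordered with later volatile reads/writes
    }

    static int volatileReadPattern() {
        int v = vol;              // the volatile read itself
        VarHandle.acquireFence(); // ~ LoadLoad + LoadStore: later reads/writes stay below
        return v + normal;        // normal read after the volatile read
    }

    public static void main(String[] args) {
        volatileWritePattern(42);
        System.out.println(volatileReadPattern()); // 43
    }
}
```

The release/acquire pairing is why a reader that sees vol == 42 is also guaranteed to see normal == 1: the writer's normal store cannot sink below the volatile write, and the reader's normal load cannot float above the volatile read.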

VI. The JMM's special rules for volatile

Finally, the special rules the Java memory model defines for volatile variables. Assuming T represents a thread and V and W represent two volatile variables, the read, load, use, assign, store, and write operations must satisfy the following rules:

Thread T can perform a use action on variable V only if the previous action performed by thread T on variable V was load, and thread T can perform a load action on variable V only if the next action performed by thread T on variable V is use. The use action of thread T on variable V can be considered associated with the load and read actions of thread T on variable V, and the three must occur together, consecutively.

This rule requires that each time V is used in working memory, the latest value must be refreshed from the main memory to ensure that the changes made to variable V by other threads can be seen.

Thread T can perform a store action on variable V only if the previous action performed by thread T on variable V was assign, and thread T can perform an assign action on variable V only if the next action performed by thread T on variable V is store. The assign action of thread T on variable V can be considered associated with the store and write actions of thread T on variable V, and the three must occur together, consecutively.

This rule requires that every time V is modified in working memory, it must be synchronized back to main memory to ensure that other threads can see the changes they have made to variable V.

Assume action A is a use or assign performed by thread T on variable V, action F is the load or store associated with A, and action P is the read or write of variable V corresponding to F. Similarly, assume action B is a use or assign performed by thread T on variable W, action G is the load or store associated with B, and action Q is the read or write of variable W corresponding to G. If A precedes B, then P precedes Q.

This rule requires that operations on volatile-modified variables are not optimized by instruction reordering, guaranteeing that the code executes in the same order as the program.

The above is how to understand the underlying principles and semantics of the Java volatile memory barrier. I hope you have learned something from it.
