Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to apply CAS algorithm in JDK

2025-01-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/02 Report--

Today, I would like to share with you the relevant knowledge of how to apply the CAS algorithm in JDK. The content is detailed and the logic is clear. I believe most people still know too much about this knowledge, so share this article for your reference. I hope you can get something after reading this article.

1. What is CAS?

CAS:Compare and Swap, that is, compare and exchange.

Jdk5 adds the concurrent package java.util.concurrent.*,. The following classes use the CAS algorithm to implement an optimistic lock that is different from the synchronouse synchronous lock. Before JDK 5, the Java language relied on the synchronized keyword to ensure synchronization, which is an exclusive lock and a pessimistic lock.

2. Understanding of CAS algorithm

As far as CAS is concerned, CAS is an unlocked algorithm. CAS has three operands, the memory value V, the old expected value A, and the new value B to be modified. If and only if the expected value An is the same as the memory value V, change the memory value V to B, otherwise do nothing.

The pseudo code for CAS comparison and exchange can be expressed as:

Do {backup old data; construct new data based on old data;} while (! CAS (memory address, backup old data, new data)) 3. CAS overhead

As mentioned earlier, CAS (compare and swap) is an CPU instruction-level operation with only one atomic step, so it's very fast. And CAS avoids the problem of asking the operating system to determine the lock, without bothering the operating system, it can be done directly inside the CPU. But doesn't CAS have any expenses? No, no! There is a case of cache miss. This problem is complicated. First of all, you need to understand the hardware architecture of CPU:

The figure above shows an 8-core CPU computer system. Each CPU has a cache (cache, register inside the CPU). There is also an interconnection module in the chip, so that the two cores in the core can communicate with each other. The system interconnection module in the center of the diagram allows the four chips to communicate with each other and connects the core to the main memory. Data is transmitted through the system in "cache lines", which correspond to a block of bytes in memory of a power of 2, usually between 32 and 256 bytes. When CPU reads a variable from memory into its register, the cache line containing the variable must first be read into the CPU cache. Similarly, when CPU stores a value in a register into memory, it must not only read the cache line containing that value into the CPU cache, but must also ensure that no other CPU has a copy of the cache line.

For example, if CPU0 is performing a CAS operation on a variable and the cache line of the variable is in the CPU7 cache, the following simplified sequence of events occurs:

CPU0 checked the local cache and did not find the cache line.

The request was forwarded to the interconnection module of CPU0 and CPU1, checked the local cache of CPU1, and no cache line was found.

The request is forwarded to the system interconnection module, and the other three chips are checked, and it is found that the cache line is held by the core where CPU6 and CPU7 are located.

The request is forwarded to the interconnection module of CPU6 and CPU7, check the cache of the two CPU, and find the cache line in the cache of CPU7.

CPU7 sends the cache line to the interconnect module to which it belongs and flushes the cache line in its own cache.

The interconnection module of CPU6 and CPU7 sends the cache line to the system interconnection module.

The system interconnection module sends the cache line to the interconnection module of CPU0 and CPU1.

The interconnection module of CPU0 and CPU1 sends the cache line to the cache of CPU0.

CPU0 can now CAS variables in the cache

The above is the overhead of flushing different CPU caches. In the best case, the CAS operation takes about 40 nanoseconds, more than 60 clock cycles. The "best case" here means that the CPU that performs the CAS operation on a variable happens to be the last CPU to operate on the variable, so the corresponding cache line is already in the CPU cache, similarly, the lock operation in the best case (a "round trip pair" includes acquiring the lock and then releasing the lock) takes more than 60 nanoseconds, more than 100 clock cycles. The "best case" here means that the data structure used to indicate the lock is already in the cache to which the CPU that acquired and released the lock belongs. Lock operations are more time-consuming than CAS operations because of an in-depth understanding of parallel programming

Two atomic operations are required in the data structure for the lock operation. Cache misses consume about 140 nanoseconds, more than 200 clock cycles. The CAS operation that needs to query the old value of the variable when storing the new value consumes about 300 nanoseconds, more than 500 clock cycles. Come to think of it, CPU can execute 500 normal instructions in the time of a CAS operation. This shows the limitations of fine-grained locks.

The following is a performance comparison between cache miss cas and lock:

4. The application of CAS algorithm in JDK.

Atomic class variables, such as AtomicXXX in java.util.concurrent.atomic, use these underlying JVM support to provide an efficient CAS operation for reference types of numeric types, and most classes in java.util.concurrent use these atomic variable classes directly or indirectly when they are implemented.

The source code for AtomicInteger.incrementAndGet () in Java 1.7 is:

Thus it can be seen that the implementation of AtomicInteger.incrementAndGet uses optimistic locking technology, invokes the CAS algorithm in the sun.misc.Unsafe-like library, and uses CPU instructions to achieve lock-free self-increment. Therefore, the self-increasing efficiency of AtomicInteger.incrementAndGet is twice as high as that of synchronized.

These are all the contents of the article "how to apply CAS algorithm in JDK". Thank you for reading! I believe you will gain a lot after reading this article. The editor will update different knowledge for you every day. If you want to learn more knowledge, please pay attention to the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report