This article explains in detail what the implementation principle of ConcurrentHashMap is. It is shared here as a reference; I hope you will have a solid understanding of the relevant knowledge after reading it.
1. Comparison between HashTable and ConcurrentHashMap
HashTable itself is thread-safe: anyone who has written a Java program knows that it achieves thread safety by marking its methods with the synchronized keyword. The drawback of locking the whole table for synchronization is that it makes the program very inefficient. This is why ConcurrentHashMap was introduced in Java 1.5.
In short, HashTable locks the entire hash table, while ConcurrentHashMap locks each segment individually, so while one thread operates on segment1, another thread can simultaneously operate on the data in segment2, which makes it much more efficient.
2. The internal structure of ConcurrentHashMap
ConcurrentHashMap involves three main structures: the hash table as a whole, the segments (Segment), and the nodes (HashEntry). Each segment is essentially equivalent to one HashTable.
(1) HashEntry class
Each HashEntry represents one node in the hash table. Looking at its definition, every field except value is declared final. In Java, a field modified with the final keyword can be assigned only once and cannot be changed afterwards. Because the next reference is final, deleting or inserting a node means that any node whose next reference would have to change must be rebuilt, so the affected part of the chain is reconstructed starting from the head.
Because of this property, new data is always inserted at the head of a hash chain. For example, inserting A, B, and C in turn into an empty bucket yields the chain C -> B -> A.
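To make the description concrete, here is a sketch of the HashEntry node as it looked in the pre-JDK-1.8 implementation (reconstructed from memory of the JDK 1.6-era source as a nested class of the map, so minor details may differ between versions):

```java
static final class HashEntry<K, V> {
    final K key;
    final int hash;
    volatile V value;            // the only non-final field
    final HashEntry<K, V> next;  // final: a node's link can never be relinked after construction

    HashEntry(K key, int hash, HashEntry<K, V> next, V value) {
        this.key = key;
        this.hash = hash;
        this.next = next;
        this.value = value;
    }
}

// Head insertion: the new node's next points at the old head of the bucket, e.g.
// tab[index] = new HashEntry<K, V>(key, hash, tab[index], value);
```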
(2) Segment class
The Segment class extends ReentrantLock, which lets each Segment object act as a lock. Each Segment object guards the buckets it contains (held in its member field table).
The table field is an array of HashEntry objects; each element of this array is one bucket of the hash table.
The count field is a counter that records the number of HashEntry objects contained in the table array managed by that Segment, i.e. the total number of entries in the segment. The reason each Segment object keeps its own counter, rather than ConcurrentHashMap keeping a single global counter, is to avoid a "hot spot" field that every write would contend on and that would hurt the concurrency of ConcurrentHashMap.
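A simplified sketch of the Segment class in the pre-JDK-1.8 style (fields plus a minimal constructor; the real nested class also implements Serializable and contains the put/get/remove logic shown in the following sections):

```java
import java.util.concurrent.locks.ReentrantLock;

static final class Segment<K, V> extends ReentrantLock {
    // Bucket array guarded by this segment; each element is the head of a chain of HashEntry nodes.
    transient volatile HashEntry<K, V>[] table;

    // Number of HashEntry objects in this segment. A per-segment counter avoids a
    // single global counter becoming a "hot spot" that every write would contend on.
    transient volatile int count;

    // Number of structural modifications, used to detect concurrent changes in size()-style methods.
    transient int modCount;

    // Resize threshold: table.length * loadFactor.
    transient int threshold;

    // Load factor for this segment's table.
    final float loadFactor;

    Segment(int initialCapacity, float lf) {
        loadFactor = lf;
        setTable(new HashEntry[initialCapacity]);
    }

    void setTable(HashEntry<K, V>[] newTable) {
        threshold = (int) (newTable.length * loadFactor);
        table = newTable;
    }
    // put/get/remove/containsKey are sketched in the sections below
}
```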
(3) ConcurrentHashMap class
By default, a ConcurrentHashMap creates 16 segments for concurrent access. Each segment holds its own hash table (a bucket array), and each hash chain in that table is made up of HashEntry nodes.
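The routing from a key to a segment can be sketched as follows (simplified from the pre-1.8 design; the field names segmentShift and segmentMask follow the original source, but treat this fragment of the map class as an illustration rather than the exact code):

```java
// The top bits of the (re-hashed) key hash select the segment; the low bits
// then select a bucket inside that segment's own table.
final Segment<K, V>[] segments;   // 16 segments with the default concurrencyLevel
final int segmentShift;           // 32 - 4 = 28 when there are 16 segments
final int segmentMask;            // 16 - 1 = 15

Segment<K, V> segmentFor(int hash) {
    return segments[(hash >>> segmentShift) & segmentMask];
}
```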
3. Using segment locks (lock striping) to allow concurrent writes from multiple threads
(1) Implementation of the put method
The source code is easy to follow with its comments. The point worth noting is that the lock taken here belongs to one specific segment, not to the whole ConcurrentHashMap. The put source also shows that new data is inserted at the head of the linked list; a simplified sketch is given below.
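A sketch of the segment-level put, closely modeled on the JDK 1.6-era source (rehash() grows the segment's table and is omitted here; exact details may differ between JDK versions):

```java
V put(K key, int hash, V value, boolean onlyIfAbsent) {
    lock();                        // lock only this segment, not the whole map
    try {
        int c = count;
        if (c++ > threshold)       // grow this segment's table if needed
            rehash();
        HashEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        HashEntry<K, V> first = tab[index];
        HashEntry<K, V> e = first;
        while (e != null && (e.hash != hash || !key.equals(e.key)))
            e = e.next;

        V oldValue;
        if (e != null) {           // key already present: just replace the value
            oldValue = e.value;
            if (!onlyIfAbsent)
                e.value = value;
        } else {                   // new key: insert at the head of the chain
            oldValue = null;
            ++modCount;
            tab[index] = new HashEntry<K, V>(key, hash, first, value);
            count = c;             // volatile write, published as the last step
        }
        return oldValue;
    } finally {
        unlock();
    }
}
```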
(2) Implementation of the get method
The read path of ConcurrentHashMap does not need to take a lock. Every operation that changes the structure writes the volatile count field as its last step, and this mechanism ensures that a get operation sees an almost up-to-date view of the structure.
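A sketch of the segment-level get in the same pre-1.8 style. Here getFirst(hash) performs a volatile read of the bucket head, and readValueUnderLock(e) re-reads the value while holding the segment lock to cover the rare case where a concurrently written value is not yet visible:

```java
V get(Object key, int hash) {
    if (count != 0) {              // volatile read: pairs with the volatile write of count in put/remove
        HashEntry<K, V> e = getFirst(hash);
        while (e != null) {
            if (e.hash == hash && key.equals(e.key)) {
                V v = e.value;
                if (v != null)
                    return v;
                return readValueUnderLock(e);  // rare race: recheck under the lock
            }
            e = e.next;
        }
    }
    return null;
}
```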
(3) Implementation of the remove method
The whole operation is performed while holding the segment's lock. The first part of the code mainly locates the node e to be deleted. If that node does not exist, remove returns null directly; otherwise the nodes in front of e are copied, and the last copied node is pointed at the node after e. The nodes after e do not need to be copied and can be reused as they are.
What is the for loop in the middle for? Looking at the code, every entry in front of the located node is cloned and relinked. Is that really necessary? Do we have to clone the preceding elements every time we delete one? The answer lies in the immutability of the entries: as noted above, every field of HashEntry except value is final, so the next field cannot be changed after it is first set; instead, all nodes in front of the deleted one are cloned. As for why the entries are made immutable in the first place, immutable data can be read without synchronization, which saves time on the read path. A sketch of this remove logic is given after the note below.
The original linked list before deletion:
The new linked list after deletion:
Note: the new front portion of the list consists of cloned nodes, and their order is reversed, so A -> B becomes B -> A.
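A sketch of the segment-level remove in the pre-1.8 style, showing how the nodes in front of the deleted entry are cloned:

```java
V remove(Object key, int hash, Object value) {
    lock();                        // lock only this segment
    try {
        int c = count - 1;
        HashEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        HashEntry<K, V> first = tab[index];
        HashEntry<K, V> e = first;
        while (e != null && (e.hash != hash || !key.equals(e.key)))
            e = e.next;

        V oldValue = null;
        if (e != null) {
            V v = e.value;
            if (value == null || value.equals(v)) {
                oldValue = v;
                ++modCount;
                // Nodes after e are reused as-is; because next is final, the
                // nodes in front of e must be cloned and relinked onto e.next,
                // which reverses their order (A -> B becomes B -> A).
                HashEntry<K, V> newFirst = e.next;
                for (HashEntry<K, V> p = first; p != e; p = p.next)
                    newFirst = new HashEntry<K, V>(p.key, p.hash, newFirst, p.value);
                tab[index] = newFirst;
                count = c;         // volatile write, published as the last step
            }
        }
        return oldValue;
    } finally {
        unlock();
    }
}
```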
(4) Implementation of the containsKey method
The containsKey method is comparatively simple, because it does not need to read the value at all; a sketch follows.
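A sketch of the segment-level containsKey in the same pre-1.8 style; it only compares hashes and keys and never touches the value, so no lock is needed:

```java
boolean containsKey(Object key, int hash) {
    if (count != 0) {              // volatile read of the per-segment counter
        HashEntry<K, V> e = getFirst(hash);
        while (e != null) {
            if (e.hash == hash && key.equals(e.key))
                return true;       // the value is never read
            e = e.next;
        }
    }
    return false;
}
```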
4. Summary
When locks are used to coordinate concurrent access among multiple threads, reducing contention for the locks effectively improves concurrency. There are two ways to reduce lock contention:
Reduce the frequency of requesting the same lock.
Reduce the time it takes to hold the lock.
The high concurrency of ConcurrentHashMap mainly comes from three aspects:
Segment locks (lock striping) allow multiple threads to write to different segments at the same time.
The immutability of HashEntry objects means read operations do not need to lock while traversing a linked list.
Write/read access to the same volatile variable (the per-segment count) coordinates the memory visibility of read and write operations between threads.
In particular, the use of segment locks reduces the frequency with which threads request the same lock.
In JDK 1.8, ConcurrentHashMap introduced two main improvements.
Improvement one: the segments field is removed; the data is stored directly in a transient volatile Node[] table, and the table's elements, that is, the head node of each bucket, are used as locks. Locking at the granularity of a single bucket further reduces the probability of concurrent conflicts.
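The following is a minimal, self-contained illustration of that idea, not the real JDK code: the head node of each bucket doubles as the lock for that bucket, and an empty bucket is filled with a lock-free CAS. The class and its names are invented for this example, and resizing, element counting and tree bins are left out:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class BucketLockedMapSketch<K, V> {
    static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V value;
        volatile Node<K, V> next;
        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> table =
            new AtomicReferenceArray<>(16);   // fixed size here; the JDK resizes

    V put(K key, V value) {
        int hash = key.hashCode();
        int i = (table.length() - 1) & hash;
        for (;;) {
            Node<K, V> head = table.get(i);
            if (head == null) {
                // Empty bucket: install the new node without taking any lock.
                if (table.compareAndSet(i, null, new Node<>(hash, key, value, null)))
                    return null;
                continue;                     // lost the race, retry
            }
            synchronized (head) {             // lock only this bucket's head node
                if (table.get(i) != head)
                    continue;                 // head changed before we got the lock, retry
                for (Node<K, V> e = head; ; e = e.next) {
                    if (e.hash == hash && key.equals(e.key)) {
                        V old = e.value;      // key present: replace the value
                        e.value = value;
                        return old;
                    }
                    if (e.next == null) {     // new key: append to the end of the chain
                        e.next = new Node<>(hash, key, value, null);
                        return null;
                    }
                }
            }
        }
    }
}
```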
Improvement two: the original structure of table array + singly linked list is changed to table array + singly linked list + red-black tree. The core requirement of a hash table is that the hashed keys be distributed evenly across the array; when the distribution is uniform, the chain in each bucket has length 0 or 1 most of the time. Reality is not always that ideal: although the default load factor of ConcurrentHashMap is 0.75, some chains can still become very long when the amount of data is large or the hash distribution is unlucky. With a plain singly linked list, looking up a node in such a chain costs O(n). In JDK 1.8, a chain longer than 8 nodes (the default threshold) is therefore converted to a red-black tree, which reduces the query cost to O(log n) and improves performance.
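For reference, the thresholds that govern this conversion in the JDK 1.8 source are:

```java
// Constants from java.util.concurrent.ConcurrentHashMap (JDK 1.8):
static final int TREEIFY_THRESHOLD = 8;     // a bin's list is converted to a red-black tree at this length
static final int UNTREEIFY_THRESHOLD = 6;   // a tree bin is converted back to a list when it shrinks below this
static final int MIN_TREEIFY_CAPACITY = 64; // bins are only treeified once the table has at least this many buckets;
                                            // smaller tables are resized instead
```

Note the extra condition the paragraph above does not mention: a long chain is only turned into a tree once the table itself has reached MIN_TREEIFY_CAPACITY buckets; below that size the table is simply resized.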
That is the implementation principle of ConcurrentHashMap. I hope the content above is of some help to you; if you found the article useful, feel free to share it so more people can see it.