Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to implement ConcurrentHashMap internally

2025-01-18 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/03 Report--

How to carry out the internal implementation of ConcurrentHashMap, in view of this problem, this article introduces the corresponding analysis and solutions in detail, hoping to help more partners who want to solve this problem to find a more simple and feasible method.

ConcurrentHashMap can be said to be one of the most widely used concurrent data structures, as such a core basic component, not only to meet our functional requirements, but also to meet the performance requirements. And implementing a high-performance thread-safe HashMap is by no means easy.

As an internal implementation of JDK8, ConcurrentHashMap is a successful model with many things to learn and pay tribute to.

When I searched for this class in the project, I found that a lot of project code and source code were used. Why is it so popular? It's moral in the end. Bah.

Let's take a look at the internal implementation of ConcurrentHashMap to experience its subtlety.

Internal data structure of ConcurrentHashMap

In JDK8, the internal implementation of ConcurrentHashMap has undergone earth-shaking changes. Here is an introduction to the internal implementation of ConcurrentHashMap based on JDK8.

In terms of static data structures, ConcurrentHashMap consists of the following:

Int sizeCtl

This is a multi-functional field that can be used to record the number of threads participating in the Map extension, as well as the expansion threshold of the new table.

CounterCell [] counterCells

Used to record the number of elements, this is an array, using an array to record, in order to avoid conflicts that may arise when multi-threaded contention occurs. Using an array, when so many threads modify the number at the same time, it is very likely to actually manipulate different units in the array, thus reducing competition.

Node [] table

Where the Map content is actually stored, a map is actually an array of Node, and each Node contains information about key and value.

Node [] nextTable

When the table needs to be expanded, the new data is populated into the nextTable, that is, nextTable is the expanded Map.

These are the core elements of ConcurrentHashMap, the most noteworthy of which is that Node,Node is not as simple as you think. The following figure shows the class family structure of Node:

As you can see, a Node in Map is not a simple Node object. In fact, it could be a Node object, a Treebin or a ForwardingNode.

So when is Node, when is TreeBin, and when is a ForwardingNode?

In fact, in most scenarios, Node is still used. From the Node data structure, it is not difficult to see that Node is actually a linked list, that is, a normal Map may look like this:

In the image above, the green part represents the Node array, in which the element is Node, the head of the linked list. When two elements collide in the data, they are placed in a slot in the form of a linked list.

When the array slot corresponds to a linked list, finding key in a linked list can only use simple traversal, which is acceptable when there is not much data, and this simple traversal is a bit slow when the conflicting data is compared.

Therefore, in the specific implementation, when the length of the linked list is greater than or equal to 8, the linked list will be treed, that is, into a red-black tree. As shown in the following figure, one of the slots becomes a tree, which is TreeBin (using TreeNode to construct a whole tree in TreeBin).

When the capacity of the array is almost full, that is, when the capacity exceeds 75%, the array still needs to be expanded. In the process of capacity expansion, if the old array has been copied, the elements in the old array will be replaced with ForwardingNode objects, indicating that the data in the current slot has been processed and does not need to be processed. In this way, when multiple threads participate in the expansion at the same time, there will be no conflict.

Implementation of put () method

Now let's take a look at put (), the most important method of HashMap:

Public V put (K key, V value)

It is responsible for storing the given key and value pairs into HashMap, and its work consists of the following steps:

Hongmeng official Strategic Cooperation to build HarmonyOS Technology Community

If the array is not initialized, try initializing the array

If the capacity is currently being expanded, participate in the help expansion (call the helpTransfer () method)

Put the given key,value into the corresponding slot

Total number of statistical elements

Trigger capacity expansion operation

According to the above four main steps, let's explain them in detail in turn:

If the array is not initialized, try initializing the array

The initialization data generates an Node array:

Node [] nt = (Node []) new Node [n]

By default, n is 16. At the same time, set sizeCtl to n-(n > 2); this means that sizeCtl is 75% of n, which means that the size of Map, that is, the load factor of ConcurrentHashMap is 0.75. (to avoid conflicts, the capacity of Map is 75% of the array. If this threshold is exceeded, the capacity will be expanded.)

If you are currently expanding, participate in helping to expand else if ((fh = f.hash) = = MOVED) tab = helpTransfer (tab, f)

If the hash of a node is MOVE, it means that it is a ForwardingNode, that is, it is currently in the process of capacity expansion. In order to complete the expansion as soon as possible, the current thread will participate in the expansion instead of waiting for the expansion operation to be completed. Such a tight and meticulous operation is precisely the reason for the high performance of ConcurrentHashMap.

The f.hash==MOVE in the code is semantically equivalent to f instanceof ForwardingNode, but the judgment of integer equality is much more efficient than instanceof, so this is also a limit optimization of performance.

Put the given key,value into the corresponding slot

In most cases, you should get to the point where you put key and value into an array. Something like this will be used in this operation:

Node f; synchronized (f) {if (slot is a linked list) insert linked list else if (slot is red-black tree) insert tree if (linked list length is greater than 8 [TREEIFY_THRESHOLD])

As you can see, this uses the synchronized keyword to lock the Node object. Since in most cases, different threads will most likely operate on different Node, there should not be much competition here.

And as the array becomes larger and larger, the probability of competition becomes smaller and smaller, so ConcurrentHashMap has excellent parallelism.

Total number of statistical elements

To have a high-performance size () method, ConcurrentHashMap uses a separate method to count the total number of elements, which is counted in the CounterCell array:

CounterCell [] counterCells; @ sun.misc.Contended static final class CounterCell {volatile long value; CounterCell (long x) {value = x;}}

CounterCell uses pseudo-sharing optimization and has high read and write performance. The sum of the value of all the members in the counterCells is the size of the entire Map. Arrays are also used here to prevent conflicts.

If you simply use a variable, then multiple threads accumulate a counter, there will inevitably be competition, now scattered into an array, the competition is much smaller and more concurrency-friendly.

The main logic of accumulation is as follows:

If (as = = null | | (m = as.length-1) < 0 | | / / different threads are mapped to different array elements to prevent conflicts (a = as [ThreadLocalRandom.getProbe () & m]) = = null | | / use CAS to directly increase the corresponding data! (uncontended = U.compareAndSwapLong (a, CELLVALUE, v = a.value, v + x)) / / if there is a competition, you will try again here If the competition is serious, the CounterCell [] array will be expanded to reduce the competition and trigger the expansion operation.

Finally, ConcurrentHashMap will check whether the current Map size exceeds the threshold, and if so, it will expand the capacity.

The expansion process of ConcurrentHashMap is very ingenious, it does not completely disrupt the existing element position, but moves half of the elements to the new space after the array is expanded by 2 times.

All elements are divided into low nodes and high nodes according to whether the high bit is 1 or not:

/ / n is the array length, and the array length is the power of 2, so it must be 1000 10000 100000. Here low nodes are strung together, and high nodes are strung together if ((ph & n) = = 0) ln = new Node (ph, pv, ln); else hn = new Node (ph, contention, pv, hn)

Next, reposition the elements:

/ / the low node remains in the current position setTabAt (nextTab, I, ln); / / the high node is placed in the new location after the expansion, which is away from the old position n setTabAt (nextTab, I + n, hn); / / after the expansion is completed, fill setTabAt (tab, I, fwd) with ForwardingNode

The following figure shows a possible expansion from 8 to 16:00. Note that the new location is always n slots after the old location (n is the size of the original array).

The advantage of this is that the position of each element does not need to be recalculated, and since the position of the high class will naturally appear at the + n position, because it is always bitwise and for nMuth1 (which must be a binary number like 1111 11111 111111) when looking for it.

Implementation of get () method

Compared with the put () method, the get () method is simpler. The steps are as follows:

Hongmeng official Strategic Cooperation to build HarmonyOS Technology Community

The corresponding slot (n-1) & h is obtained according to the hash value.

If the first element key of the current slot is the same as the requested one, return directly

Otherwise, call the find () method of Node to find

ForwardingNode.find () is used for ForwardingNode

TreeBin.find () is used for red-black trees

For chain phenotypic slots, look for the corresponding key sequentially

ConcurrentHashMap can be said to be a model of concurrent design, in JDK8, ConcurrentHashMap can be said to be once again reborn, the new architecture and implementation has brought a flying experience (ConcurrentHashMap in JDK7 is still implemented by segment), read carefully, there are still a lot of gains.

This is the end of the answer to the question on how to implement ConcurrentHashMap internally. I hope the above content can be of some help to you. If you still have a lot of doubts to solve, you can follow the industry information channel for more related knowledge.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report