
How to understand ConcurrentHashMap


This article mainly explains how to understand ConcurrentHashMap. The approach taken is simple, direct, and practical; interested readers may wish to follow along.

1 Introduction

Although HashMap is efficient and easy to use, it is unfortunately not thread-safe (for example, ++-style read-modify-write operations on its values are not atomic). Reading the source shows that HashMap makes no attempt to control concurrency. Fortunately, Doug Lea provides the utility class ConcurrentHashMap, a concurrent version of HashMap. Compared with Hashtable, which simply adds the synchronized keyword to its methods (all methods share one lock, so only one thread can call at a time), ConcurrentHashMap achieves genuinely concurrent access.
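To make the difference concrete, here is a minimal sketch of my own (the class and variable names are hypothetical, not from the article). Both maps are updated with merge from two threads; the ConcurrentHashMap total always comes out exact, while the HashMap run may lose updates or even corrupt the map, since HashMap gives no thread-safety guarantees:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CounterDemo {
    public static void main(String[] args) throws InterruptedException {
        // plain HashMap: the read-modify-write inside merge is a lost-update race here
        Map<String, Integer> unsafe = new HashMap<>();
        // ConcurrentHashMap: merge performs the read-modify-write atomically per bucket
        Map<String, Integer> safe = new ConcurrentHashMap<>();

        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                unsafe.merge("hits", 1, Integer::sum); // racy: results are unpredictable
                safe.merge("hits", 1, Integer::sum);   // atomic in ConcurrentHashMap
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();

        System.out.println("HashMap count (often < 20000):            " + unsafe.get("hits"));
        System.out.println("ConcurrentHashMap count (always 20000):   " + safe.get("hits"));
    }
}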

Like HashMap, ConcurrentHashMap changed between Java 7 and Java 8. In Java 7, concurrency is achieved with segmented locks: each Segment inherits ReentrantLock (I previously wrote a ReentrantLock source analysis, "AQS source code in-depth analysis of the exclusive mode -- ReentrantLock lock feature details"), and each Segment holds a structure similar to a HashMap (an array of HashEntry). Operations in one Segment do not interfere with operations in another, which means the concurrency level of ConcurrentHashMap is bounded by the number of Segments (16 by default). That granularity still feels too coarse, so it was reworked in Java 8.
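The Segment idea can be pictured with a small lock-striping sketch of my own -- this is not the real Segment code (the real Segment extends ReentrantLock and holds an array of HashEntry), just the striping principle, assuming 16 stripes as in the default:

import java.util.HashMap;
import java.util.Map;

// toy illustration of Java 7-style lock striping: concurrency is bounded by the stripe count
public class StripedMap<K, V> {
    private static final int STRIPES = 16;
    private final Object[] locks = new Object[STRIPES];
    @SuppressWarnings("unchecked")
    private final Map<K, V>[] parts = new Map[STRIPES];

    public StripedMap() {
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new Object();
            parts[i] = new HashMap<>();
        }
    }

    private int stripe(Object key) {
        // pick a stripe from the hash, keeping the index non-negative
        return (key.hashCode() & 0x7fffffff) % STRIPES;
    }

    public V put(K key, V value) {
        int s = stripe(key);
        synchronized (locks[s]) { // threads on different stripes never block each other
            return parts[s].put(key, value);
        }
    }

    public V get(Object key) {
        int s = stripe(key);
        synchronized (locks[s]) {
            return parts[s].get(key);
        }
    }
}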

The Java 8 ConcurrentHashMap abandons the old data structure entirely and uses the same array + linked list + red-black tree layout as HashMap, which makes it lighter overall. Concurrency control is implemented through **CAS + synchronized + direct memory access via the Unsafe class (volatile semantics)**, and the locking granularity shrinks to the bucket -- that is, the concurrency level equals the array length. Note that it is wrong to say the granularity is the individual Node: one thread cannot modify one node of a linked list while another thread modifies a different node of the same list at the same time. The source code locks the first node of the list, but that is only the surface; the real effect is to lock the whole bucket (the whole list). The overall structure mirrors HashMap, with changes only in the code that would otherwise race (the two major changes are multithreaded counting and multithreaded resizing). The structural similarity makes it convenient to compare the two and study how the master handles concurrency. In that sense, the Java 8 ConcurrentHashMap is HashMap improved for concurrency. That does not make it a replacement: where no concurrency is needed (local variables, say), HashMap outperforms ConcurrentHashMap (CAS and synchronized are not free, and there is extra concurrency-related code on every path).

The ReentrantLock source code was also written by Doug Lea. ConcurrentHashMap moved from ReentrantLock-based Segments in Java 7 to CAS + synchronized + Unsafe direct memory access (volatile semantics) in Java 8, which reflects how much synchronized performance had been optimized by then (biased locking + lightweight locking + heavyweight locking). synchronized can keep being optimized because it is implemented inside the JVM itself, while ReentrantLock, built on AQS, still blocks and wakes threads at the library level (relying on the underlying operating system's thread primitives), leaving far less room for optimization.

It is also important to note that HashMap allows null keys and values, while ConcurrentHashMap allows neither and throws NullPointerException. Why? First, why the value cannot be null. As I noted when analyzing HashMap's get method, a null result from get is ambiguous: it may come from map.put("key", null), or it may mean the key-value pair simply does not exist in the map. In HashMap you can tell the two cases apart with containsKey, because HashMap assumes correct single-threaded execution, so check-then-act is safe. But if ConcurrentHashMap allowed null values and you tried the same kind of check, you might write code like this:

1 if (map.containsKey("key")) {
2     return map.get("key");
3 } else {
4     // key absent: do some other processing, such as throwing an exception
5 }

Suppose the ConcurrentHashMap currently holds the pair "key" -> null. The first thread executes containsKey and gets true; just as it is about to call get, a second thread deletes the pair. The null now returned by the first thread's get is ambiguous: the thread believes the null means the stored value was null, when in fact it means the key-value pair has been deleted.

Now for why the key cannot be null. To be honest, I cannot construct a scenario that forces this the way the value case above does. Doug Lea has answered the question himself.

He only explains why the value cannot be null (the argument I made above), though near the end he remarks that checking keys and values for null is difficult. Having read his answer, I have reason to believe that banning null keys is simply a coding convention Doug Lea set in advance (since the value can no longer be null, the key shall not be either), heading off unnecessary trouble (ambiguity or otherwise). He also holds that HashMap should never have allowed null keys and values, and suggests that multithreaded users of HashMap wrap null in a sentinel "NULL" object to tell the two cases apart (in Java 8 the Optional class would serve as well; a sketch of the sentinel idea appears after the next paragraph). Interestingly, this involves Josh Bloch, another heavyweight of the Java world and a lead author of HashMap (as is Doug Lea) -- one can think of HashMap as mainly Josh Bloch's work and ConcurrentHashMap as Doug Lea's. Their views initially differed, but Josh Bloch later seemed to change his tune.

He agrees that a null key invites errors, but is not sure the value should be barred from being null, and thinks that baking a NULL-wrapping mechanism into the Java source itself would need careful consideration. These questions and answers date from 2006, yet as of today's Java 8u261 I still find no such change in the source. Either way, it is enough to know the story.
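As an illustration of the workaround mentioned above, here is a minimal sketch of my own -- not JDK code; the class, field, and method names (NullFriendlyCache, NULL, put/get) are all hypothetical:

import java.util.concurrent.ConcurrentHashMap;

// wrap null in a sentinel object so that a single atomic get is unambiguous
public class NullFriendlyCache {
    private static final Object NULL = new Object(); // sentinel standing in for null
    private final ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();

    public void put(String key, Object value) {
        map.put(key, value == null ? NULL : value);  // store the sentinel instead of null
    }

    // unlike containsKey-then-get, this is one atomic read:
    // null means "absent", NULL means "present with a null value"
    public Object get(String key) {
        Object v = map.get(key);
        if (v == null) throw new IllegalStateException("no mapping for " + key);
        return v == NULL ? null : v;
    }
}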

2 Constructor

1 /**
2  * ConcurrentHashMap:
3  * the no-arg constructor
4  * an empty implementation: every parameter follows the defaults
5  */
6 public ConcurrentHashMap() {
7 }
8 
9 /**
10  * the single-parameter constructor
11  */
12 public ConcurrentHashMap(int initialCapacity) {
13     // non-negative check on initialCapacity
14     if (initialCapacity < 0)
15         throw new IllegalArgumentException();
16     /*
17     unlike HashMap: if initialCapacity here is greater than or equal to 2^29 (in HashMap
18     the bound is exceeding 2^30), the capacity is reset to 2^30
19 
20     tableSizeFor finds the smallest power of two >= the given value (I explained how it
21     works in the HashMap source analysis). Interestingly, note line 26: HashMap takes the
22     smallest power of two of the configured capacity directly, while here the value is
23     first taken * 1.5 + 1 -- why is explained below
24     */
25     int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1)) ?
26                MAXIMUM_CAPACITY :
27                tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1));
28     /*
29     sizeCtl records the current state of the array (similar to threshold in HashMap):
30     1. if it is -1, the array is being initialized
31     2. if it is another negative number, the array is being resized; the low 16 bits of that
32        negative number are (1 + n), n being the number of threads performing the resize
33        (the + 1 staggers the value away from -1)
34     3. after a parameterized constructor is called, it holds the capacity to initialize with;
35        4. it is 0 after the no-arg constructor; 5. once the array is non-empty, it holds the
36        threshold for the next resize
37     */
38     this.sizeCtl = cap;
39 }

At line 26 above (the tableSizeFor call), the configured value is first multiplied by 1.5 and incremented by 1 (the + 1 compensates for truncation: the argument to tableSizeFor is an int, so any fractional part of the * 1.5 result is cut off), and only then is the smallest power of two taken. This differs from HashMap, which simply calls tableSizeFor(initialCapacity). Why? Because here the value passed in is interpreted not as the number of buckets but as the resize threshold. For example, 16 * 0.75 = 12: in HashMap you pass 16, and multiplying by the load factor yields 12, the point at which resizing actually happens; in ConcurrentHashMap you would pass 12, the threshold itself. The constructor therefore has to divide by the load factor to recover the real bucket count, and line 26 is doing exactly that back-calculation. You may object: even so, the adaptation should be capacity / 0.75 -- where does 1.5 come from? My guess is execution speed: dividing by 0.75 is the same as multiplying by 1.3333..., which is close to 1.5, but / 0.75 is floating-point division while * 1.5 can be optimized into a shift. The trouble is that the two can round to different results. Take an incoming capacity of 22: 22 / 0.75 = 29.33, plus 1 and through tableSizeFor gives 32; but 22 * 1.5 = 33, plus 1 and through tableSizeFor gives 64. The * 1.5 path clearly computes the wrong capacity -- double what it should be (with the default load factor 0.75, the power of two above 22 / 0.75 must be 32, not 64). And this really is a bug: see JDK-8202422 on OpenJDK's bug tracker.
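The arithmetic can be checked with a few lines (tableSizeFor is reproduced from the JDK here, with the MAXIMUM_CAPACITY clamp omitted; the class name is mine):

public class CapacityBugDemo {
    // smallest power of two >= c, as in the JDK (MAXIMUM_CAPACITY clamp omitted)
    static int tableSizeFor(int c) {
        int n = c - 1;
        n |= n >>> 1; n |= n >>> 2; n |= n >>> 4; n |= n >>> 8; n |= n >>> 16;
        return (n < 0) ? 1 : n + 1;
    }

    public static void main(String[] args) {
        int initialCapacity = 22;
        // Java 8 path: * 1.5 + 1, then round up to a power of two
        int java8 = tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1);
        // Java 11.0.1+ path: / loadFactor, then round up to a power of two
        int java11 = tableSizeFor((int) (1.0 + (long) initialCapacity / 0.75f));
        System.out.println(java8);  // 64 -- one power of two too large
        System.out.println(java11); // 32 -- the intended size
    }
}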

Several facts can be gleaned from the report: the bug has existed since Java 8 and was fixed in Java 11.0.1. Let's see what the code became. The following is the single-parameter ConcurrentHashMap constructor as of Java 14.0.2:

1 public ConcurrentHashMap(int initialCapacity) {
2     this(initialCapacity, LOAD_FACTOR, 1);
3 }
4 
5 public ConcurrentHashMap(int initialCapacity,
6                          float loadFactor, int concurrencyLevel) {
7     if (!(loadFactor > 0.0f) || initialCapacity < 0 || concurrencyLevel <= 0)
8         throw new IllegalArgumentException();
9     if (initialCapacity < concurrencyLevel)   // Use at least as many bins
10        initialCapacity = concurrencyLevel;   // as estimated threads
11    long size = (long) (1.0 + (long) initialCapacity / loadFactor);
12    int cap = (size >= (long) MAXIMUM_CAPACITY) ?
13        MAXIMUM_CAPACITY : tableSizeFor((int) size);
14    this.sizeCtl = cap;
15 }

You can see that at line 11 the calculation has become initialCapacity / loadFactor, which removes the bug (that Java 8 never got the fix is no tragedy: resizing to double or half the intended capacity is not a serious logic error). Why do it this way at all, and why differ from HashMap? I think the point is to play down custom load factors. Look closely at the ConcurrentHashMap source and this.loadFactor is almost never used (its single use is legacy code); Doug Lea seems to no longer want us to tune the load factor ourselves. A custom loadFactor can still be passed to the constructor (backward compatibility), but it only influences the sizing inside that constructor; later initialization and resizing use the default 0.75 (we will see this below). A custom load factor other than 0.75 is therefore close to useless, and the source comments say as much.

One more point needs to be stated precisely: when sizeCtl is negative and denotes a resize (excluding -1), the strict reading is that the low 16 bits of that negative number are (1 + n), n being the number of threads currently performing the resize (the + 1 staggers the value away from -1). Only the low 16 bits encode the thread count + 1; the high 16 bits are a stamp generated from the array length (more on this in section 6). The comment in the JDK source describes sizeCtl less accurately than this.

3 put method

1 /**
2  * ConcurrentHashMap:
3  */
4 public V put(K key, V value) {
5     return putVal(key, value, false);
6 }
7 
8 final V putVal(K key, V value, boolean onlyIfAbsent) {
9     // note: ConcurrentHashMap allows neither null keys nor null values; HashMap allows both
10     if (key == null || value == null) throw new NullPointerException();
11     // compute the key's hash; note it must be non-negative, see the comments on the spread method
12     int hash = spread(key.hashCode());
13     // binCount is the number of nodes on this bucket before the current node is added
14     int binCount = 0;
15     // note that this is an infinite loop
16     for (Node<K,V>[] tab = table; ; ) {
17         Node<K,V> f;
18         int n, i, fh;
19         if (tab == null || (n = tab.length) == 0)
20             // the array is not initialized yet: do it here (lazily); the next iteration inserts the node
21             tab = initTable();
22         /*
23         tabAt is a volatile read, through the Unsafe class, of the slot at the given address;
24         (n - 1) & hash locates the bucket this entry belongs to, exactly as in HashMap
25         */
26         else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
27             /*
28             casTabAt sets the slot through a CAS in the Unsafe class: if the bucket holds no
29             data yet, just create a new Node and install it -- a fast path. If the CAS fails,
30             the next loop iteration re-examines the bucket
31             */
32             if (casTabAt(tab, i, null,
33                          new Node<K,V>(hash, key, value, null)))
34                 break;
35         } else if ((fh = f.hash) == MOVED)
36             /*
37             the first node of this bucket is a ForwardingNode (its hash is MOVED), i.e. the bucket is
38             being migrated: go help with the resize; once it finishes, tab is refreshed, and the next
39             iteration will still insert the node
40             */
41             tab = helpTransfer(tab, f);
42         else {
43             // reaching here means the bucket already holds nodes
44             V oldVal = null;
45             /*
46             note that synchronized locks the first node of the bucket, which also proves that in the
47             Java 8 ConcurrentHashMap the lock granularity is the bucket (locking the first node locks the bucket)
48             */
49             synchronized (f) {
50                 /*
51                 double check. A possible scenario (my personal guess): two threads both reach line 49. The first
52                 enters the synchronized block, inserts its node, and triggers a resize, so table is now a newTable.
53                 The second then enters, and the check below fails (tabAt reads main memory directly through the
54                 Unsafe class, and table has become newTable, so the first node at slot i of the new table is
55                 compared with f, a local variable holding the old table's first node, and they differ), so it exits
56                 the synchronized block and loops again. Whatever the true reason, double-checking inside a
57                 synchronized block is good practice in itself
58                 */
59                 if (tabAt(tab, i) == f) {
60                     /*
61                     if the node is an ordinary Node (as mentioned at the spread method: a node whose
62                     hash is >= 0 is an ordinary Node)
63                     */
64                     if (fh >= 0) {
65                         // set binCount = 1
66                         binCount = 1;
67                         /*
68                         as the loop below shows, the fast-path check that HashMap has at this point was
69                         dropped in ConcurrentHashMap. Note that each iteration walks one node of the list
70                         and increments binCount (by the for-loop's mechanics, the first node adds nothing)
71                         */
72                         for (Node<K,V> e = f; ; ++binCount) {
73                             K ek;
74                             // if some node on the bucket has the same hash as the one being inserted, and the same key,
75                             if (e.hash == hash &&
76                                 ((ek = e.key) == key ||
77                                  (ek != null && key.equals(ek)))) {
78                                 oldVal = e.val;
79                                 if (!onlyIfAbsent)
80                                     // when onlyIfAbsent is false, the new value overwrites the old one
81                                     e.val = value;
82                                 break;
83                             }
84                             Node<K,V> pred = e;
85                             /*
86                             e moves to the next node; if that is null, we have walked to the last node
87                             without finding an equal one, so the new node is appended at the tail (pred
88                             points to the node before the current one, since e is now its next)
89                             */
90                             if ((e = e.next) == null) {
91                                 pred.next = new Node<K,V>(hash, key,
92                                                           value, null);
93                                 break;
94                             }
95                         }
96                     } else if (f instanceof TreeBin) {
97                         // the node is a red-black tree bin
98                         Node<K,V> p;
99                         // set binCount = 2; the meaning of the 2 here is explained later
100                        binCount = 2;
101                        // run the red-black tree insertion (tree details are not expanded in this article)
102                        if ((p = ((TreeBin<K,V>) f).putTreeVal(hash, key,
103                                                               value)) != null) {
104                            oldVal = p.val;
105                            if (!onlyIfAbsent)
106                                p.val = value;
107                        }
108                    }
109                }
110            }
111            // binCount != 0 means a node was appended to the list or inserted into the tree
112            if (binCount != 0) {
113                // if the list length has reached the treeify threshold, convert
114                if (binCount >= TREEIFY_THRESHOLD)
115                    /*
116                    as I said in the HashMap source analysis: whether the list really becomes a red-black tree
117                    depends on the array capacity having reached MIN_TREEIFY_CAPACITY; below that it merely resizes
118                    */
119                    treeifyBin(tab, i);
120                if (oldVal != null)
121                    // return the old value
122                    return oldVal;
123                // a new node went on the end of the list: exit the infinite loop and fall through to addCount
124                break;
125            }
126        }
127    }
128    // counter + 1 (this method also holds the logic for multiple threads resizing and migrating together)
129    addCount(1L, binCount);
130    return null;
131 }
132 
133 /**
134  * line 12 code:
135  */
136 static final int spread(int h) {
137     return (h ^ (h >>> 16)) & HASH_BITS;
138 }

Take a look at the spread method above. I explained in the HashMap source analysis why the high 16 bits are XOR'd with the low 16 bits to form the final hash. In the HashMap source the expression is (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); here the key has already been verified non-null, so that check is unnecessary. The only real difference from HashMap's computation is the trailing "& HASH_BITS".

HASH_BITS is a binary number of 31 ones, i.e. Integer.MAX_VALUE. What does AND-ing with it do? For a non-negative h (including 0) nothing changes; but if h is negative its highest bit -- the sign bit -- is 1, and the AND with HASH_BITS turns it into a positive number. In other words, "& HASH_BITS" guarantees the computed hash is non-negative. Why doesn't HashMap do the same? Because the bucket index is later computed with (table.length - 1) & hash, and as analyzed at the HashMap constructor, by the time that expression runs the array has been initialized (the index computation happens after the resize/initialization method), so table.length - 1 is at least 0 and therefore non-negative; AND-ing it with the hash clears the sign bit anyway (and if the hash is already >= 0, nothing is lost). That is, HashMap defers the non-negativity handling to (table.length - 1) & hash. But ConcurrentHashMap computes bucket positions with the very same "(table.length - 1) & hash" expression, so there must be another reason to force the hash non-negative early with "& HASH_BITS". The reason is that in ConcurrentHashMap, negative hash values carry special meanings:

-1 (MOVED): indicates that the current node is being migrated

-2 (TREEBIN): indicates that the current node is a red-black tree node

-3 (RESERVED): indicates that the current node is a placeholder node used by the computeIfAbsent and compute methods
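A few lines make the effect of the mask visible (HASH_BITS is copied from the JDK source; the class name is mine):

public class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff; // 31 ones: clears the sign bit

    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    public static void main(String[] args) {
        int h = "some key".hashCode();
        System.out.println(spread(h) >= 0);          // always true
        System.out.println(spread(-123456789) >= 0); // true: the sign bit is masked off
        // so the reserved negative values MOVED (-1), TREEBIN (-2) and RESERVED (-3)
        // can never collide with the hash of a real key
    }
}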

As the later source code shows, whether a node is an ordinary Node is decided by checking whether its hash is >= 0 (a hash >= 0 means an ordinary Node; a negative hash means one of the special nodes above).

4 initTable method

1 /**
2  * ConcurrentHashMap:
3  * called from line 21 of putVal: lazy initialization of the array
4  */
5 private final Node<K,V>[] initTable() {
6     Node<K,V>[] tab;
7     int sc;
8     // keep looping until the array is no longer empty
9     while ((tab = table) == null || tab.length == 0) {
10         /*
11         sizeCtl < 0 means some other thread is already initializing the array (-1) or
12         resizing it, so this thread just yields its time slice and waits
13         */
14         if ((sc = sizeCtl) < 0)
15             Thread.yield();
16         /*
17         otherwise CAS sizeCtl to -1, announcing that this thread is doing the initialization;
18         if the CAS fails, another thread won the race, and the while loop simply tries again
19         */
20         else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
21             try {
22                 /*
23                 double check that the array is still empty: another thread may have finished
24                 the initialization between this thread's first check and its CAS
25                 */
26                 if ((tab = table) == null || tab.length == 0) {
27                     /*
28                     as laid out at the constructor: after a parameterized constructor runs, a
29                     positive sizeCtl holds the requested capacity, which is used directly;
30                     after the no-arg constructor it is 0, in which case the default capacity
31                     DEFAULT_CAPACITY (16) applies instead. Note that sc captured the value of
32                     sizeCtl before the CAS above overwrote it with -1, which is why sc rather
33                     than sizeCtl itself is consulted here (the -1 meanwhile tells every other
34                     thread that initialization is in progress)
35                     */
36                     // n is the capacity the array will be created with
37                     int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
38                     // use the n from the previous line to create the new Node array
39                     @SuppressWarnings("unchecked")
40                     Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n];
41                     table = tab = nt;
42                     /*
43                     with initialization done, compute the actual threshold n * 0.75 (the load factor),
44                     assigned to sizeCtl in the finally clause below. Note it is written n - (n >>> 2),
45                     which equals n * 0.75: the default 0.75 is used here, never a custom value
46                     */
47                     sc = n - (n >>> 2);
48                 }
49             } finally {
50                 /*
51                 this sits in a finally clause because the @SuppressWarnings at line 39 above only
52                 silences a warning -- an exception (or OOM) can still be thrown, in which case sizeCtl
53                 would stay -1 forever and every other thread would spin in a dead loop; the finally
54                 clause guarantees sizeCtl is restored (to sc) so that other threads can finish the work
55                 */
56                 sizeCtl = sc;
57             }
58             break;
59         }
60     }
61     return tab;
62 }

5 Multithreaded counting

The counting method addCount is called at line 129 of putVal in section 3 above. (addCount splits into two halves: the first half does the counting, the second half handles resizing; the resize half is left for the next section.) In HashMap, incrementing size is trivial -- just add 1 to it. In ConcurrentHashMap it is hard, because multiple threads are involved. In Java 7, counting waited until every Segment lock had been acquired -- in other words, all writing threads were stopped and then all nodes were counted. The result is exact, but the efficiency is very poor.

Let's talk about the counting process in Java 8:

In scenarios with no or low concurrency, baseCount on its own records the current node count.

If the CAS of baseCount + 1 fails, multiple threads are scrambling to count, and the CounterCell array takes over. Each thread locates its slot in the array by random probing -- ThreadLocalRandom.getProbe() & m (ThreadLocalRandom performs better than Random in concurrent scenarios) -- and counts on that CounterCell. The final result is baseCount plus the sum of all non-null cells in the counterCells array.

In other words, the first thread counts on baseCount while the rest count in the CounterCell array. For example: baseCount is 16; the first thread finishes adding its node and CASes baseCount + 1 to 17; the other three threads fail their CAS and each write a 1 into, say, the first, third, and seventh slots of the CounterCell array (the positions here are arbitrary). The map then holds 17 + 1 + 1 + 1 = 20 nodes. (This design is essentially the same as Doug Lea's LongAdder class.)

Why use random slots and a final summation rather than plain CAS contention? Under high concurrency, having every thread CAS the same location and spin on failure wastes work; spreading the updates across several array slots and summing at the end is clearly smarter (at the price of more complex logic): the post-failure spinning disappears, and all threads make counting progress at the same time. A sketch of the same idea follows.
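Since the design matches LongAdder, the cheapest way to watch the striping behave is to use LongAdder itself; a small demo of my own:

import java.util.concurrent.atomic.LongAdder;

// LongAdder shares this design: uncontended threads CAS a common base, contended
// threads fall back to per-thread cells, and sum() adds base plus all cells --
// exactly like baseCount + counterCells here
public class StripedCountDemo {
    public static void main(String[] args) throws InterruptedException {
        LongAdder size = new LongAdder();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++)
                size.increment(); // CAS on base, or on a randomly probed cell under contention
        };
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) (ts[i] = new Thread(task)).start();
        for (Thread t : ts) t.join();
        System.out.println(size.sum()); // 400000: the sum of base and all cells
    }
}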

1 /**
2  * ConcurrentHashMap:
3  */
4 private final void addCount(long x, int check) {
5     CounterCell[] as;
6     long b, s;
7     /*
8     as noted above: in a scenario with essentially no concurrency, baseCount does the counting,
9     as long as the CAS of +1 succeeds. But a non-empty CounterCell array means several threads
10    are counting at once, and a failed CAS drops into the branch below as well
11    */
12    if ((as = counterCells) != null ||
13        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
14        CounterCell a;
15        long v;
16        int m;
17        // uncontended is the "no contention yet" flag
18        boolean uncontended = true;
19        /*
20        if the CounterCell array is empty, or the randomly probed slot is empty, or the quick
21        attempt to CAS +1 into the probed CounterCell slot fails, fullAddCount completes the +1
22        */
23        if (as == null || (m = as.length - 1) < 0 ||
24            (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
25            !(uncontended =
26              U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
27            // the final count is completed inside that method
28            fullAddCount(x, uncontended);
29            /*
30            notice that after the count completes we return right here, without reaching the
31            help-resize logic below. Why? By this point two CAS failures have happened, i.e. we are
32            under high concurrency; helping a resize now would only amplify lock contention and context switching
33            */
34            return;
35        }
36        /*
37        reaching here means the CAS on the CounterCell at line 26 succeeded; when check <= 1 no
38        resize check is needed, so return (check is the binCount handed in by putVal)
39        */
40        if (check <= 1)
41            return;
42        // take a fresh total for the resize check below
43        s = sumCount();
44    }
45    // ... the second half of the method (the check >= 0 resize logic) is analyzed in section 6 ...
46 }

Under contention the real work happens in fullAddCount:

1 /**
2  * ConcurrentHashMap:
3  * completes the +1 under contention
4  */
5 private final void fullAddCount(long x, boolean wasUncontended) {
6     int h;
7     // if the thread has no probe value yet, force ThreadLocalRandom to initialize one
8     if ((h = ThreadLocalRandom.getProbe()) == 0) {
9         ThreadLocalRandom.localInit();
10        h = ThreadLocalRandom.getProbe();
11        wasUncontended = true;
12    }
13    // collide is the conflict flag: true means the last probed slot was already occupied
14    boolean collide = false;
15    // another infinite loop
16    for (;;) {
17        CounterCell[] as;
18        CounterCell a;
19        int n;
20        long v;
21        if ((as = counterCells) != null && (n = as.length) > 0) {
22            /*
23            the CounterCell array is already initialized: if the randomly probed slot is null,
24            a new CounterCell is created for it
25            */
26            if ((a = as[(n - 1) & h]) == null) {
27                /*
28                cellsBusy is a lock resource: 0 unlocked, 1 locked; 0 here means no thread is
29                currently installing a CounterCell into the array or resizing the array
30                */
31                if (cellsBusy == 0) {
32                    // create a new CounterCell, passing the 1 (x) in
33                    CounterCell r = new CounterCell(x);
34                    // CAS the lock; if this fails, the conflict flag collide gets set further below
35                    if (cellsBusy == 0 &&
36                        U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
37                        boolean created = false;
38                        try {
39                            CounterCell[] rs;
40                            int m, j;
41                            // double check (seen many times by now)
42                            if ((rs = counterCells) != null &&
43                                (m = rs.length) > 0 &&
44                                rs[j = (m - 1) & h] == null) {
45                                // install the newly created CounterCell into the array
46                                rs[j] = r;
47                                created = true;
48                            }
49                        } finally {
50                            // release the lock (the point of the finally clause was explained at initTable)
51                            cellsBusy = 0;
52                        }
53                        // creation succeeded: leave the infinite loop (the method is done)
54                        if (created)
55                            break;
56                        // the slot was set by another thread meanwhile (note the double check): loop for a new slot
57                        continue;
58                    }
59                }
60                // reset the conflict flag to false, avoiding the resize branch below; try again first
61                collide = false;
62            }
63            else if (!wasUncontended)
64                /*
65                the slot is not null and the CAS at line 26 of addCount is already known to have
66                failed: reset wasUncontended to true and go around the loop (rehashing the probe)
67                */
68                wasUncontended = true;
69            else if (U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))
70                // try the insert once more: success exits, failure moves on to the next checks
71                break;
72            else if (counterCells != as || n >= NCPU)
73                /*
74                counterCells != as means the CounterCell array is being resized right now; n >= NCPU
75                means its capacity already reached the number of CPUs available to the JVM. Either way
76                set collide to false to avoid the resize branch below and keep retrying instead
77                (this also shows the CounterCell array cannot grow forever: beyond the CPU count the
78                extra cells would be redundant and only add overhead)
79                */
80                collide = false;
81            else if (!collide)
82                /*
83                none of the above applied: flip the conflict flag collide from false to true, so that
84                if everything fails once more, the next iteration falls into the resize branch below
85                */
86                collide = true;
87            else if (cellsBusy == 0 &&
88                     U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
89                // lock taken: this is the logic that doubles the CounterCell array
90                try {
91                    // if another thread already replaced the array, this thread need not resize it
92                    if (counterCells == as) {
93                        // create a new array of twice the capacity and copy the cells over
94                        CounterCell[] rs = new CounterCell[n << 1];
95                        for (int i = 0; i < n; ++i)
96                            rs[i] = as[i];
97                        counterCells = rs;
98                    }
99                } finally {
100                   cellsBusy = 0;
101               }
102               collide = false;
103               // retry with the expanded array
104               continue;
105           }
106           // advance the probe: the thread tries a different slot on the next iteration
107           h = ThreadLocalRandom.advanceProbe(h);
108       }
109       else if (cellsBusy == 0 && counterCells == as &&
110                U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
111           // the CounterCell array does not exist yet: lock and initialize it with capacity 2
112           boolean init = false;
113           try {
114               if (counterCells == as) {
115                   CounterCell[] rs = new CounterCell[2];
116                   rs[0] = new CounterCell(x);
117                   counterCells = rs;
118                   init = true;
119               }
120           } finally {
121               cellsBusy = 0;
122           }
123           if (init)
124               break;
125       }
126       else if (U.compareAndSwapLong(this, BASECOUNT, v = baseCount, v + x))
127           // everything else is contended: fall back to CAS-ing baseCount itself
128           break;
129   }
130 }

6 Multithreaded expansion

Now for the second half of addCount, the resize check (the code elided at the end of the section 5 listing):

1 /**
2  * ConcurrentHashMap:
3  * the second half of addCount: checking for and helping with a resize
4  */
5 // check >= 0 means a resize check is required (check is the binCount passed by putVal)
6 if (check >= 0) {
7     Node<K,V>[] tab, nt;
8     int n, sc;
9     /*
10    keep looping while the count s has reached the threshold sizeCtl, the array is
11    non-empty, and its length has not yet hit MAXIMUM_CAPACITY (s is the total
12    computed by sumCount above)
13    */
14    while (s >= (long) (sc = sizeCtl) && (tab = table) != null &&
15           (n = tab.length) < MAXIMUM_CAPACITY) {
16        // a stamp is generated here from the array length; see the note on resizeStamp
17        int rs = resizeStamp(n);
18        /*
19        sc < 0 means some other thread is resizing right now (it cannot be -1: initialization
20        is over, and the array was just checked to be non-empty), so this thread helps out
21        */
22        if (sc < 0) {
23            /*
24            the conditions for bailing out, which mean the resize is already over:
25            1. (sc >>> RESIZE_STAMP_SHIFT) != rs: the current thread is not in the same resize
26               (in theory sc shifted right 16 bits should equal rs; if they differ, the array
27               length has changed -- this thread may still be looking at the previous resize while
28               other threads already triggered the next one in the gap between the sc and tab reads)
29            2. sc == rs + 1: some thread is doing the final check (the first thread starts at +2,
30               and each resizing thread ends with a -1, so one extra decrement means the last
31               thread is checking; once the check completes, sc is reset to a positive number),
32               so this thread need not help -- just wait for that check to finish
33               (the correct condition here should be sc == (rs << RESIZE_STAMP_SHIFT) + 1; as
34               written, rs has not been shifted, so a bare stamp is compared with a packed
35               sizeCtl -- see the bug discussion below)
36            3. sc == rs + MAX_RESIZERS: the helper count has hit the maximum; 4. (nt = nextTable)
37               == null or transferIndex <= 0: the resize has finished or every task is claimed
38            */
39            if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
40                sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
41                transferIndex <= 0)
42                break;
43            // CAS sizeCtl + 1: one more thread is helping; transfer with the existing nextTable
44            if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
45                transfer(tab, nt);
46        }
47        /*
48        sc >= 0: no resize is under way, so this thread starts one -- sizeCtl is CAS'd to the
49        stamp shifted left 16 bits plus 2 (high 16 bits: the stamp; low 16 bits: threads + 1)
50        */
51        else if (U.compareAndSwapInt(this, SIZECTL, sc,
52                                     (rs << RESIZE_STAMP_SHIFT) + 2))
53            transfer(tab, null);
54        // refresh the count and loop: a further resize may already be warranted
55        s = sumCount();
56    }
57 }

helpTransfer, which putVal calls at line 41 when it runs into a ForwardingNode, uses almost the same exit conditions:

1 /**
2  * ConcurrentHashMap:
3  * called from putVal when a bucket being migrated is encountered
4  */
5 final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
6     Node<K,V>[] nextTab;
7     int sc;
8     // only help while the resize is still in progress (the ForwardingNode still has a nextTable)
9     if (tab != null && (f instanceof ForwardingNode) &&
10        (nextTab = ((ForwardingNode<K,V>) f).nextTable) != null) {
11        int rs = resizeStamp(tab.length);
12        while (nextTab == nextTable && table == tab &&
13               (sc = sizeCtl) < 0) {
14            /*
15            the same bail-out conditions as in addCount:
16            1. (sc >>> RESIZE_STAMP_SHIFT) != rs: this thread is not in the same resize (in theory
17               sc shifted right 16 bits should equal rs; a mismatch means the array length has
18               changed -- this thread may still be looking at the previous resize while other
19               threads already triggered the next one before sc was read)
20            2. sc == rs + 1: one thread is doing the final check (the first thread starts at +2 and
21               each resizing thread ends with -1, so one extra decrement means the last thread is
22               checking; once done, sc is reset to a positive number), so just wait for it
23               (here too the correct condition should be sc == (rs << RESIZE_STAMP_SHIFT) + 1)
24            3. sc == rs + MAX_RESIZERS: the helper count has hit the maximum
25            4. transferIndex <= 0: every transfer task has already been claimed
26            */
27            if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
28                sc == rs + MAX_RESIZERS || transferIndex <= 0)
29                break;
30            // CAS sizeCtl + 1 and join the transfer
31            if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
32                transfer(tab, nextTab);
33                break;
34            }
35        }
36        return nextTab;
37    }
38    return table;
39 }

The migration itself happens in transfer:

1 /**
2  * ConcurrentHashMap:
3  * the migration method: tab is the old array, nextTab the new one (null for the first thread in)
4  */
5 private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
6     int n = tab.length, stride;
7     /*
8     stride is how many buckets one resizing thread claims at a time: one eighth of the table
9     divided by the CPU count on a multi-core machine, the whole table on a single core;
10    anything below MIN_TRANSFER_STRIDE (16) is raised to 16, so that threads do not claim
11    ranges too small to be worth the coordination
12    */
13    if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
14        stride = MIN_TRANSFER_STRIDE;
15    // nextTab being empty means the current thread is the first resizing thread to arrive
16    if (nextTab == null) {
17        try {
18            // create a Node array of twice the old capacity; the old data migrates into it
19            @SuppressWarnings("unchecked")
20            Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n << 1];
21            nextTab = nt;
22        } catch (Throwable ex) {
23            // the doubling overflowed or OOM'd: give up on resizing for good
24            sizeCtl = Integer.MAX_VALUE;
25            return;
26        }
27        nextTable = nextTab;
28        // transferIndex starts at the old length: ranges are claimed from the tail backwards
29        transferIndex = n;
30    }
31    int nextn = nextTab.length;
32    // the ForwardingNode installed into every migrated bucket; it points at nextTab
33    ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
34    boolean advance = true;
35    // finishing guarantees one final sweep over the whole table before nextTab is committed
36    boolean finishing = false;
37    for (int i = 0, bound = 0; ; ) {
38        Node<K,V> f;
39        int fh;
40        // this while loop claims and advances the thread's bucket range
41        while (advance) {
42            int nextIndex, nextBound;
43            if (--i >= bound || finishing)
44                /*
45                i drops by 1 each time: after finishing one bucket the thread migrates the next.
46                --i >= bound means the thread was assigned a bound region and has not finished all
47                its buckets yet. finishing is in the condition so that after all migration work and
48                the final check the while loop can still be exited, jumping down to the finishing
49                branch below (in fact exiting via the next condition would also work, since
50                transferIndex is 0 by then; this may save one volatile read -- one memory barrier)
51                */
52                advance = false;
53            else if ((nextIndex = transferIndex) <= 0) {
54                // no range left to claim: this thread's work is done
55                i = -1;
56                advance = false;
57            }
58            else if (U.compareAndSwapInt
59                     (this, TRANSFERINDEX, nextIndex,
60                      nextBound = (nextIndex > stride ?
61                                   nextIndex - stride : 0))) {
62                /*
63                assign the thread a new bound region; if the CAS fails, another thread grabbed
64                that range first, so just loop and grab the next one
65                */
66                bound = nextBound;
67                i = nextIndex - 1;
68                advance = false;
69            }
70        }
71        /*
72        advance is false by the time execution gets here; the if below decides whether this thread
73        has finished migrating. The i >= n check is, I guess, overflow protection (a thread could keep
74        failing the CAS above while i drops by 1 each loop, and after -2147483648 another -1 wraps to
75        2147483647, transferIndex staying positive all the while); and i + n >= nextn looks like it
76        guards against a mistaken second resize (nextn no longer twice n), but that is already checked
77        outside this method, and the tab and nextTab passed in are both local variables, so I guess
78        these are just safety checks (another of my puzzles -- I asked on StackOverflow but have had
79        no useful reply so far:
80        https://stackoverflow.com/questions/63597067/in-concurrenthashmaps-transfer-method-i-dont-understand-the-meaning-of-these)
81        */
82        if (i < 0 || i >= n || i + n >= nextn) {
83            int sc;
84            // every bucket is done and checked: commit the new table
85            if (finishing) {
86                nextTable = null;
87                table = nextTab;
88                // the new threshold: 2n - n/2 = 1.5n, i.e. 0.75 of the doubled capacity
89                sizeCtl = (n << 1) - (n >>> 1);
90                return;
91            }
92            // each thread that finishes decrements sizeCtl by 1 (the first thread started at +2)
93            if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
94                // not the last thread: just leave
95                if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
96                    return;
97                // the last thread sets finishing and re-checks the whole table before committing
98                finishing = advance = true;
99                i = n;
100           }
101       }
102       else if ((f = tabAt(tab, i)) == null)
103           // an empty bucket is "migrated" by simply CAS-ing the ForwardingNode into it
104           advance = casTabAt(tab, i, null, fwd);
105       else if ((fh = f.hash) == MOVED)
106           // this bucket was already processed by some other thread
107           advance = true;
108       else {
109           // lock the first node, i.e. the bucket, just as in putVal
110           synchronized (f) {
111               if (tabAt(tab, i) == f) {
112                   Node<K,V> ln, hn;
113                   // ... the rest of the method splits the bucket into a "low" list ln (staying at
114                   // index i of nextTab) and a "high" list hn (moving to index i + n), handling both
115                   // linked lists and red-black trees, then installs fwd into the old bucket ...
116               }
117           }
118       }
119    }
120 }

As noted in the comments above, sc == rs + 1 cannot be right as written, because rs has not been shifted left. In Java 12 this area was reworked (JDK-8214427); the resize check in addCount now reads:

1 if (check >= 0) {
2     Node<K,V>[] tab, nt;
3     int n, sc;
4     while (s >= (long) (sc = sizeCtl) && (tab = table) != null &&
5            (n = tab.length) < MAXIMUM_CAPACITY) {
6         // note that the stamp is now shifted left by RESIZE_STAMP_SHIFT up front, so rs
7         // already has the shape of a "resizing" sizeCtl, with all-zero low 16 bits
8         int rs = resizeStamp(n) << RESIZE_STAMP_SHIFT;
9         if (sc < 0) {
10            // the (sc >>> RESIZE_STAMP_SHIFT) != rs condition is gone; with rs pre-shifted,
11            // sc == rs + 1 and sc == rs + MAX_RESIZERS now compare like with like
12            if (sc == rs + MAX_RESIZERS || sc == rs + 1 ||
13                (nt = nextTable) == null || transferIndex <= 0)
14                break;
15            if (U.compareAndSetInt(this, SIZECTL, sc, sc + 1))
16                transfer(tab, nt);
17        }
18        else if (U.compareAndSetInt(this, SIZECTL, sc, rs + 2))
19            transfer(tab, null);
20        s = sumCount();
21    }
22 }

(sc >>> RESIZE_STAMP_SHIFT) != rs ensured that the current thread and the other threads were in the same resize, by comparing the stamp bits. Had the condition been kept, it would now have to be written (sc >>> RESIZE_STAMP_SHIFT) != (rs >>> RESIZE_STAMP_SHIFT), since rs is pre-shifted. To be honest, I do not understand why Doug Lea removed this condition.

I then noticed another report on the OpenJDK bug tracker, JDK-8242464.

What the submitter means is that the real issue behind JDK-8214427 was the change in the new array's capacity, not one number being positive and the other negative, and that the Java 12 change did not fix it: sc == rs + 1 should instead be (sc >>> RESIZE_STAMP_SHIFT) == (rs >>> RESIZE_STAMP_SHIFT) + 1 (the two stamps differing by 1 -- one fewer leading zero -- means the array capacity has doubled), which is close to my (sc >>> RESIZE_STAMP_SHIFT) != (rs >>> RESIZE_STAMP_SHIFT). Doug Lea has not commented so far, and the bug is still in the OPEN state (it is hard to reproduce, no complete report was provided, and further evaluation is needed). In my opinion, the Java 12 code should be changed to (sc >>> RESIZE_STAMP_SHIFT) != (rs >>> RESIZE_STAMP_SHIFT) || sc == rs + 1, writing out both cases.

That wraps up the analysis of the multithreaded resize logic; all I can do now is keep following the bug's status and comments (I also asked this on StackOverflow and have received no reply so far: https://stackoverflow.com/questions/63595464/why-has-this-condition-sc-resize-stamp-shift-rs-been-removed-in-jdk12).
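To make the stamp arithmetic above concrete, here is a small worked example of my own (resizeStamp and the two shift constants are copied from the JDK 8 source; the class name is mine):

public class ResizeStampDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    static int resizeStamp(int n) {
        // leading-zero count of the table length, with bit 15 set so that
        // the packed sizeCtl below comes out negative
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    public static void main(String[] args) {
        int n = 16;                              // current table length
        int rs = resizeStamp(n);                 // 27 | 0x8000 = 32795 for n = 16
        int sc = (rs << RESIZE_STAMP_SHIFT) + 2; // what the first resizing thread stores in sizeCtl
        System.out.println(Integer.toBinaryString(sc));
        System.out.println(sc < 0);                            // true: the high bit marks "resizing"
        System.out.println((sc >>> RESIZE_STAMP_SHIFT) == rs); // high 16 bits recover the stamp
        System.out.println(sc & 0xffff);                       // 2 = number of resizing threads + 1
    }
}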

7 get method

1 /**
2  * ConcurrentHashMap:
3  */
4 public V get(Object key) {
5     Node<K,V>[] tab;
6     Node<K,V> e, p;
7     int n, eh;
8     K ek;
9     /*
10    compute the hash of the given key. Note that key.hashCode() is called directly, which
11    means a null key passed in throws a NullPointerException
12    */
13    int h = spread(key.hashCode());
14    // if the array is uninitialized, or the computed bucket is null (key not found), return null directly
15    if ((tab = table) != null && (n = tab.length) > 0 &&
16        (e = tabAt(tab, (n - 1) & h)) != null) {
17        if ((eh = e.hash) == h) {
18            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
19                /*
20                if the first node on the bucket has the same hash as the one being looked up,
21                and the same key, return it directly (the fast path)
22                */
23                return e.val;
24        } else if (eh < 0)
25            /*
26            eh < 0 means e is a special node: one being migrated, a tree node, or a RESERVED
27            node. Lookup then goes through find, and each special node type overrides find --
28            that is, every kind of special node has its own way of searching
29            */
30            return (p = e.find(h, key)) != null ? p.val : null;
31        /*
32        reaching here means eh >= 0, i.e. the bucket is an ordinary Node list, so walk every node
33        of the list (the first node needs no check: it was already checked at lines 17 and 18)
34        */
35        while ((e = e.next) != null) {
36            if (e.hash == h &&
37                ((ek = e.key) == key || (ek != null && key.equals(ek))))
38                return e.val;
39        }
40    }
41    return null;
42 }
43 
44 /**
45  * line 30 code:
46  * the ordinary Node's find: plainly a traversal comparing hash and key
47  */
48 Node<K,V> find(int h, Object k) {
49     Node<K,V> e = this;
50     if (k != null) {
51         do {
52             K ek;
53             if (e.hash == h &&
54                 ((ek = e.key) == k || (ek != null && k.equals(ek))))
55                 return e;
56         } while ((e = e.next) != null);
57     }
58     return null;
59 }
60 
61 /**
62  * line 30 code:
63  * the ForwardingNode's find
64  */
65 Node<K,V> find(int h, Object k) {
66     outer:
67     /*
68     note: a node under migration is looked up on nextTable, not by walking the current array,
69     because the bucket being searched is being migrated right now; besides, the setTabAt method
70     guarantees the memory visibility of nextTable during the migration. Failing to find it on
71     nextTable is harmless: calling get again after the resize finishes will find it
72     */
73     for (Node<K,V>[] tab = nextTable; ; ) {
74         Node<K,V> e;
75         int n;
76         if (k == null || tab == null || (n = tab.length) == 0 ||
77             (e = tabAt(tab, (n - 1) & h)) == null)
78             /*
79             if the key is null, the new array is null, or the computed bucket of the new array is
80             null (meaning the key cannot be found), just return null (the fast path)
81 
82             note: tabAt here reads a slot of nextTable, so a null return does not necessarily mean
83             the key is absent -- that bucket may simply not have been migrated yet. No matter: the
84             next call to get, once the migration is over, will find it
85 
86             it is worth mentioning that at the moment we jumped into this method the node was a
87             ForwardingNode, i.e. a migration was in progress, yet nextTable may be null here,
88             meaning the migration has just finished -- so quickly return null. Of course, if the
89             migration completes only once the code below is already running, this fast path will
90             not catch it; no harm done, the checks below run again from the top
91             */
92             return null;
93         for (; ; ) {
94             int eh;
95             K ek;
96             if ((eh = e.hash) == h &&
97                 ((ek = e.key) == k || (ek != null && k.equals(ek))))
98                 // the current node has the same hash and key as the one being looked for: return it
99                 return e;
100            if (eh < 0) {
101                if (e instanceof ForwardingNode) {
102                    /*
103                    check again whether this is a ForwardingNode; if so, migration is still going on
104                    (maybe still this one, maybe already the next one), so start over from the top of
105                    this method (not writing this branch would also work -- the ForwardingNode find at
106                    line 130 below would handle it -- but that would be recursion, and why recursion is
107                    avoided here is explained later)
108 
109                    what "the next migration" means: suppose we are walking a list and a node suddenly
110                    changes from an ordinary Node into a ForwardingNode. How can that happen? My guess:
111                    say a list holds four nodes 0, 1, 2, 3; node 0's key and hash are not what I want,
112                    so I step to node 1. At that moment a resize-and-migrate happens, and afterwards
113                    node 1 may sit at the first position of a bucket in the doubled array. Soon another
114                    resize begins -- a second migration (e is a local variable, so the walk simply
115                    continues) -- and node 1 gets wrapped into a ForwardingNode (although e is local,
116                    the wrapping is done through setTabAt in the Unsafe class -- volatile semantics,
117                    memory visibility -- so the change to a ForwardingNode is observed in time)
118 
119                    so refresh tab for the next iteration: tab now points at the newest nextTable and
120                    the search continues there (in the second-migration scenario above, this refresh is
121                    the step that makes progress possible)
122                    */
123                    tab = ((ForwardingNode<K,V>) e).nextTable;
124                    continue outer;
125                } else
126                    /*
127                    no longer a ForwardingNode (this migration is over and the node has become
128                    something else) -- perhaps a red-black tree node or a RESERVED node -- so call
129                    its own find to search
130                    */
131                    return e.find(h, k);
132            }
133            /*
134            reaching here means eh >= 0, an ordinary Node: step to the next node of the list;
135            if it is null the key is simply not there, otherwise loop and compare again
136            */
137            if ((e = e.next) == null)
138                return null;
139        }
140    }
141 }

8 clear method

1 /**
2  * ConcurrentHashMap:
3  */
4 public void clear() {
5     // delta records, as a negative number, how many nodes are removed
6     long delta = 0L;
7     int i = 0;
8     Node<K,V>[] tab = table;
9     while (tab != null && i < tab.length) {
10        int fh;
11        // locate the bucket through the tabAt method, as before
12        Node<K,V> f = tabAt(tab, i);
13        if (f == null)
14            // this bucket holds no data: i + 1, i.e. go on and clear the next bucket
15            ++i;
16        else if ((fh = f.hash) == MOVED) {
17            // as in putVal: if this bucket is being migrated, go help with the resize, then refresh tab
18            tab = helpTransfer(tab, f);
19            // after the migration the buckets' contents have all changed, so reset i to 0 and restart the sweep
20            i = 0;
21        } else {
22            // synchronized locks the first node of the list, i.e. the bucket, keeping other threads out
23            synchronized (f) {
24                // double check
25                if (tabAt(tab, i) == f) {
26                    /*
27                    for an ordinary Node, p is simply f;
28                    for a red-black tree node, cast and take its first node;
29                    otherwise p is null
30                    */
31                    Node<K,V> p = (fh >= 0 ? f :
32                                   (f instanceof TreeBin) ?
33                                   ((TreeBin<K,V>) f).first : null);
34                    // walk every list or tree node on the bucket, recording the count in delta
35                    while (p != null) {
36                        --delta;
37                        p = p.next;
38                    }
39                    // null the bucket through the setTabAt method (note the i++), i.e. clear the data
40                    setTabAt(tab, i++, null);
41                }
42            }
43        }
44    }
45    /*
46    with the data cleared, the count must be updated: addCount is called, but with a negative delta --
47    16 nodes give delta -16, 32 nodes give -32 -- so the final node count works out to 0. The -1 passed
48    as check likewise means "do not help resize", since any resize was already helped along above
49    */
50    if (delta != 0L)
51        addCount(delta, -1);
52 }

That concludes this walk through ConcurrentHashMap. I believe you now have a deeper understanding of it; the best way to go deeper still is to open the source and step through it yourself.
