What is the reason why more than 8 Map buckets become red-black trees? 07/02 Update SLTechnology News&Howtos

What is the reason why more than 8 Map buckets become red-black trees?

2025-07-02 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Shulou(Shulou.com)05/31 Report--

The main content of this article is to explain "what is the reason why more than 8 Map buckets have become red-black trees". Interested friends may wish to take a look. The method introduced in this paper is simple, fast and practical. Let's let the editor take you to learn "what is the reason why more than 8 Map buckets become red-black trees?"

Both HashMap and ConcurrentHashMap of JDK 1.8 have such a feature: the initial Map is empty because there are no elements in it, and the hash value will be calculated when the element is placed. After calculation, the first value will first occupy a bucket (also known as slot point). Later, if it is found that it needs to fall into the same bucket, it will be extended in the form of a linked list, commonly known as "zipper method", as shown in the figure:

When the length of the linked list is greater than or equal to the threshold (the default is 8), the linked list will be converted to a red-black tree if it also meets the requirement that the capacity is greater than or equal to MIN_TREEIFY_CAPACITY (default is 64). Similarly, if the size is adjusted later due to deletion or other reasons, when the node of the red-black tree is less than or equal to 6, it will return to the linked list form. As shown below:

In the picture, we can see that some slots are empty, some are zippers, and some are red-black trees.

So why is the conversion threshold set to 8 by default?

Well, first of all, we need to know why we need to convert, because conversion is the first step.

Each time you traverse a linked list, the average time complexity of searching is O (n), where n is the length of the linked list. The red-black tree has different search performance from the linked list. Because the red-black tree has the characteristic of self-balance and can prevent the occurrence of imbalance, the time complexity of the search can always be controlled at O (log (n)). At first, the linked list is not very long, so there may not be much difference between O (n) and O (log (n)), but if the linked list gets longer and longer, then the difference will be reflected. Therefore, in order to improve the search performance, it is necessary to convert the linked list into a red-black tree.

Then why not use the red-black tree in the first place, but go through a process of transformation?

This issue has been explained in JDK's source code comments:

Because TreeNodes are about twice the size of regular nodes,use them only when bins contain enough nodes to warrant use (see TREEIFY_THRESHOLD). And when they become too small (due removal or resizing) they are converted back to plain bins.

This means that a single TreeNode takes up about twice as much space as a normal Node, so it is converted to TreeNodes only if it contains enough Nodes, which is determined by the value of TREEIFY_THRESHOLD. When the number of nodes in the bucket is removed or the resize is reduced, it will return to the form of a normal linked list in order to save space.

By looking at the source code, we can find that the default is that when the length of the linked list reaches 8, it turns into a red-black tree, and when the length falls to 6, it is converted back, which reflects the idea of time and space balance. When you first use the linked list, the space occupation is relatively small, and because the linked list is short, the query time is not too big a problem. However, when the linked list is getting longer and longer, it is necessary to use the form of red-black tree to ensure the efficiency of the query. For when to convert from a linked list to a red-black tree, you need to determine a threshold, which defaults to 8, and the number "Select 8" is also explained in the source code, as follows:

In usages with well-distributed user hashCodes, tree bins are rarely used. Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution (http://en.wikipedia.org/wiki/Poisson_distribution) with a parameter of about 0.5 on average for the default resizing threshold of 0.75, although with a large variance because of resizing granularity. Ignoring variance, the expected occurrences of list size k are (exp (- 0.5) * pow (0.5, k) / factorial (k)). The first values are: 0: 0.60653066 1: 0.30326533 2: 0.07581633 3: 0.01263606 4: 0.00157952 5: 0.00015795 6: 0.00001316 7: 0.00000094 8: 0.00000006 more: less than 1 in ten million

What the above paragraph means is that if the hashCode is well distributed, that is, if the results of the hash calculation are well dispersed, then the red-black tree form is rarely used because the values are evenly distributed and the linked list is rarely very long. In an ideal case, the length of the linked list conforms to the Poisson distribution, and the hit probability of each length decreases sequentially. When the length is 8, the probability is only 0.00000006. This is a probability of less than 1/10000000, usually we don't store so much data in Map, so usually, the transition from linked list to red-black tree doesn't happen. This is why the threshold is set to 8 by default.

The appearance of red and black trees

Although in general, the conversion from a linked list to a red-black tree does not occur. However, HashMap determines which bucket an element falls into, which is related to the hashCode of this object. JDK does not prevent us users from implementing our own hash algorithm. If we deliberately make the hash algorithm uneven, for example:

/ * * @ discription demonstrates the conversion of hashmap structure from linked list to red-black tree * / public class HashMapDemo {public static void main (String [] args) {HashMap map = new HashMap (1); for (int I = 0; I < 1000; iTunes +) {HashMapDemo tmpHashMap = new HashMapDemo (); map.put (tmpHashMap, null);} System.out.println ("end of run") } @ Override public int hashCode () {return 1;}}

A new HashMap is created in the code, and the value is constantly put into the object of key, whose hashCode is rewritten and always returns 1. When this code is running, if you ask the program to pause on the line System.out.println ("end of run") through debug, we can look at the nodes in map and find that it has become TreeNode instead of the usual Node, which means that the interior has been turned into a red-black tree.

In fact, when the length of the linked list is more than 8, the design of the red-black tree is more to prevent users from implementing a bad hash algorithm, which leads to too long the linked list, resulting in low query efficiency. at this time, the conversion to red-black tree is more of a bottom-keeping strategy to ensure the efficiency of query in extreme cases.

At this point, I believe you have a deeper understanding of "what is the reason why more than 8 Map buckets turned into red-black trees". You might as well do it in practice. Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.