What is HashMap?

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Many newcomers are not quite sure what HashMap is. This article explains it in detail for anyone who needs it, and hopefully you will take something away from it.

HashMap is a very important collection: it is used frequently in everyday development and is a staple of interviews. This article does not cover basic API usage. Instead it digs into the internals of HashMap and explains the key points behind it. Readers are expected to have some familiarity with hash tables and HashMap.

HashMap is essentially a hash table, so it has to address the three classic hash table problems: the hash function, hash collisions, and the expansion (resize) scheme. As a general-purpose data structure it also has to consider concurrent access from multiple threads, that is, thread safety. These four points are the focus of both learning HashMap and designing it.

HashMap is part of the Map collection hierarchy. It also implements the Serializable interface, so it can be serialized, and the Cloneable interface, so it can be copied. Its inheritance structure is as follows:

[Figure: HashMap inheritance structure]

HashMap does not fit every scenario, so the JDK provides other classes for special requirements, such as the thread-safe ConcurrentHashMap, LinkedHashMap, which records insertion order, and TreeMap, which sorts its keys.
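The ordering difference between these variants can be seen in a small sketch. The class and method names (MapVariantsDemo, keysAfterInserts) are made up for illustration; only LinkedHashMap and TreeMap themselves come from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch; MapVariantsDemo and keysAfterInserts are hypothetical names.
final class MapVariantsDemo {
    // Inserts three keys in a fixed order and returns the map's iteration order.
    static List<String> keysAfterInserts(Map<String, Integer> map) {
        map.put("banana", 1);
        map.put("apple", 2);
        map.put("cherry", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap iterates in insertion order.
        System.out.println(keysAfterInserts(new LinkedHashMap<>())); // [banana, apple, cherry]
        // TreeMap iterates in sorted key order.
        System.out.println(keysAfterInserts(new TreeMap<>()));       // [apple, banana, cherry]
    }
}
```

A plain HashMap makes no ordering promise at all, which is why it is left out of the comparison.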

This article mainly covers four key points: the hash function, hash collisions, the expansion scheme, and thread safety. It then supplements them with key source code analysis and related questions.

Unless otherwise specified, everything in this article refers to JDK 1.8.

Hash function

The goal of the hash function is to compute the index of a key in the array. A hash function is judged by two criteria: whether it distributes keys uniformly and whether it is cheap to compute.

Steps for the HashMap hash function:

1. Perturb the hashCode of the key object

2. Take the modulus to obtain the array index

The perturbation makes the hashCode more random, so that keys do not cluster together after the modulus in the second step, which improves hash uniformity. The perturbation is implemented in the hash() method:

    static final int hash(Object key) {
        int h;
        // get the hashCode of the key and XOR its high 16 bits into its low 16 bits
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

That is, the low 16 bits are XORed with the high 16 bits, while the high 16 bits stay unchanged. The array is usually fairly short, so only the low bits take part in the modulus operation. XORing the high bits into the low bits lets the high bits influence the index as well, which makes the hash more uniform. The operation looks like this (demonstrated with 8 bits for convenience; 32 bits work the same way):

[Figure: XOR of the high bits into the low bits, 8-bit demonstration]
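To make the effect concrete, here is a minimal sketch with two hand-picked hashCodes that differ only in their high 16 bits. The names HashSpreadDemo, spread, and indexFor are illustrative, not JDK API, and the table length of 16 is an assumption chosen to match the default.

```java
// Illustrative sketch; HashSpreadDemo, spread and indexFor are hypothetical names.
final class HashSpreadDemo {
    // Same perturbation as described above: XOR the high 16 bits into the low 16 bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Index in a table whose length n is an integer power of 2.
    static int indexFor(int hash, int n) {
        return hash & (n - 1);
    }

    public static void main(String[] args) {
        int a = 0x00010005; // a and b differ only in the high 16 bits
        int b = 0x00020005;
        int n = 16;
        // Without perturbation both keys land in the same slot.
        System.out.println(indexFor(a, n) + " " + indexFor(b, n));                 // 5 5
        // With perturbation the high bits change the index.
        System.out.println(indexFor(spread(a), n) + " " + indexFor(spread(b), n)); // 4 7
    }
}
```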

After the hashCode is perturbed, the result has to be reduced modulo the array length. JDK 1.8 HashMap does not simply use % for this; it uses a faster approach. HashMap keeps the array length at an integer power of 2. The advantage is that, for such a length, taking the remainder of the hash by the length gives the same result as a bitwise AND of the hash with length - 1 (for non-negative hash values). As shown below:

[Figure: remainder by the array length and bitwise AND with length - 1 give the same result]

Bitwise AND is cheaper than the remainder operation, which improves performance. This property is also used in the expansion operation, which is discussed later. For the source code of the modulus step, see the putVal() method, which is called by put():

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
        ...
        // bitwise AND with (array length - 1) yields the index
        if ((p = tab[i = (n - 1) & hash]) == null)
        ...
    }
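The equivalence between the remainder and the bitwise AND can be checked with a short sketch. PowerOfTwoIndexDemo and sameAsModulo are illustrative names. Math.floorMod is used as the reference because Java's % operator returns a negative result for a negative hash, whereas the bitwise AND is never negative.

```java
// Illustrative sketch; PowerOfTwoIndexDemo and sameAsModulo are hypothetical names.
final class PowerOfTwoIndexDemo {
    // True when the mathematical (non-negative) remainder equals hash & (n - 1).
    static boolean sameAsModulo(int hash, int n) {
        return Math.floorMod(hash, n) == (hash & (n - 1));
    }

    public static void main(String[] args) {
        int[] hashes = {0, 1, 13, 31, 1024, 123456789, -7, Integer.MIN_VALUE};
        boolean allEqual = true;
        for (int h : hashes) {
            allEqual &= sameAsModulo(h, 16); // 16 is an integer power of 2
        }
        System.out.println(allEqual);             // true
        // For a length that is not a power of 2 the shortcut breaks:
        // 13 mod 10 is 3, but 13 & 9 is 9.
        System.out.println(sameAsModulo(13, 10)); // false
    }
}
```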

For the complete hash calculation process, please refer to the following figure:

[Figure: the complete hash calculation process]

As mentioned above, the array length of a HashMap is always an integer power of 2. So how does HashMap keep it that way? The array length can change in two ways:

1. The length specified at initialization

2. The length increase during expansion

Let's look at the first case. If no length is specified in the HashMap constructor, the initial length defaults to 16. 16 is a reasonable empirical value: it is an integer power of 2, a smaller value would trigger expansion too often, and a larger one would waste space. If you specify a value that is not an integer power of 2, it is automatically rounded up to the smallest integer power of 2 that is greater than or equal to it: 6 becomes 8, and 11 becomes 16. In the source code, when a length is specified at initialization, the constructor calls the tableSizeFor() method:

    public HashMap(int initialCapacity, float loadFactor) {
        ...
        this.loadFactor = loadFactor;
        // the tableSizeFor method is called here
        this.threshold = tableSizeFor(initialCapacity);
    }

    static final int tableSizeFor(int cap) {
        // note the -1 here
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

The tableSizeFor() method looks complicated, but its effect is simply to set every bit below the highest 1 bit to 1 and then add 1, which yields the smallest integer power of 2 that is greater than or equal to initialCapacity. It is illustrated below (simulated with 8 bits; 32 bits work the same way):

[Figure: tableSizeFor bit operations, 8-bit simulation]

So why subtract 1 from cap first? If the specified number is already an integer power of 2 and the -1 were omitted, the result would be twice the specified number, as follows:

00100 --(set all bits after the highest 1 to 1)--> 00111 --(add 1)--> 01000
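The rounding behaviour, and the role of the -1, can be reproduced with a standalone copy of the bit trick. This is a sketch for experimentation: TableSizeDemo, smear, and withoutMinusOne are illustrative names, and withoutMinusOne is a deliberately altered variant that does not exist in the JDK.

```java
// Illustrative sketch; TableSizeDemo, smear and withoutMinusOne are hypothetical names.
final class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Sets every bit below the highest 1 bit to 1.
    static int smear(int n) {
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n;
    }

    // Same logic as the tableSizeFor method shown above.
    static int tableSizeFor(int cap) {
        int n = smear(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Altered variant that skips the -1, to show why the -1 is needed.
    static int withoutMinusOne(int cap) {
        int n = smear(cap);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(6));     // 8
        System.out.println(tableSizeFor(11));    // 16
        System.out.println(tableSizeFor(16));    // 16
        System.out.println(withoutMinusOne(16)); // 32, twice the requested size
    }
}
```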

The second way the array length changes is expansion. Each expansion doubles the size of the HashMap, so the array length remains an integer power of 2. The relevant source code is as follows:

    final Node<K,V>[] resize() {
        ...
        if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                oldCap >= DEFAULT_INITIAL_CAPACITY)
            // set to twice the original
            newThr = oldThr << 1;
        ...
    }

[...]

When the linked list length >= 8 and the array length >= 64, the linked list is converted to a red-black tree.

When the linked list length >= 8, but the array length < 64, the array is expanded instead of converting the linked list to a tree.

[...]

        ...
        if (++size > threshold)
            resize();
        // finally returns null (afterNodeInsertion is the callback of LinkedHashMap)
        afterNodeInsertion(evict);
        return null;
    }

Each step is explained in detail in the code, so let's sum it up here:

1. Overall there are two cases: the same key is found, or it is not. If it is found, decide whether to update the value, and return the old value. If it is not found, insert a new Node, update the node count, and decide whether to expand.

2. The search covers three cases: the array slot, a linked list, and a red-black tree. If the slot at array index i is not empty and its key is not equal to the given key, determine whether the slot holds a tree node or a linked list node, and search accordingly.

3. When a linked list reaches a certain length it is converted to a red-black tree, if and only if the linked list length >= 8 and the array length >= 64.

Finally, here is a diagram to reinforce the whole process:

[Figure: the put process]
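The flow summarized above can be sketched as a much smaller toy map. This is not the JDK implementation: TinyHashMap is a hypothetical class that uses linked lists only (no red-black trees), and its rule of doubling once the size exceeds 3/4 of the array length is an assumption modeled on HashMap's default load factor.

```java
// Simplified sketch of the put flow: chaining only, no red-black trees.
// TinyHashMap and its 3/4 resize rule are illustrative assumptions.
final class TinyHashMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private Node<K, V>[] table = newTable(16);
    private int size;

    @SuppressWarnings("unchecked")
    private static <K, V> Node<K, V>[] newTable(int length) {
        return (Node<K, V>[]) new Node[length];
    }

    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Combined comparison: hash first, then reference equality, then equals.
    private static boolean sameKey(Node<?, ?> p, int hash, Object key) {
        return p.hash == hash && (p.key == key || (key != null && key.equals(p.key)));
    }

    V put(K key, V value) {
        int h = hash(key);
        int i = (table.length - 1) & h;
        Node<K, V> last = null;
        for (Node<K, V> p = table[i]; p != null; last = p, p = p.next) {
            if (sameKey(p, h, key)) { // same key found: update, return old value
                V old = p.value;
                p.value = value;
                return old;
            }
        }
        // Key not found: append a new node at the tail of the bucket.
        Node<K, V> node = new Node<>(h, key, value);
        if (last == null) table[i] = node; else last.next = node;
        if (++size > table.length * 3 / 4) resize();
        return null;
    }

    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> p = table[(table.length - 1) & h]; p != null; p = p.next) {
            if (sameKey(p, h, key)) return p.value;
        }
        return null;
    }

    int size() {
        return size;
    }

    // Doubles the array and redistributes every node (bucket order is not preserved).
    private void resize() {
        Node<K, V>[] old = table;
        table = newTable(old.length << 1);
        for (Node<K, V> head : old) {
            while (head != null) {
                Node<K, V> next = head.next;
                int i = (table.length - 1) & head.hash;
                head.next = table[i];
                table[i] = head;
                head = next;
            }
        }
    }

    public static void main(String[] args) {
        TinyHashMap<String, Integer> map = new TinyHashMap<>();
        System.out.println(map.put("a", 1)); // null, new key inserted
        System.out.println(map.put("a", 2)); // 1, old value returned
        System.out.println(map.get("a"));    // 2
    }
}
```

Leaving out treeification keeps the sketch short; the real putVal additionally checks the bucket length after inserting and may convert the list to a tree.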

Other questions

Why do some hash tables prefer a prime array length (Hashtable, for example, starts at 11 and grows to 2n + 1), while HashMap uses an integer power of 2?

Answer: A prime length can effectively reduce hash collisions. HashMap uses an integer power of 2 to make the remainder and expansion operations more efficient, and JDK 1.8 XORs the high and low bits to make the hash more uniform.

Why can a prime length reduce hash collisions? If the hashCodes of the keys are evenly distributed over all integers, a prime length and a composite length perform equally well. For example, if the hashCodes are uniformly distributed between 1 and 20, the distribution is uniform whether the length is the composite 4 or the prime 5. But if the hashCodes are spaced 2 apart, such as 1, 3, 5, ..., then indices 0 and 2 of an array of length 4 never receive any data, while an array of length 5 does not have this problem. An array of composite length causes hashCodes whose spacing is one of its factors to cluster, which degrades the hashing.
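The clustering described above can be counted directly. In this sketch BucketSpreadDemo and bucketCounts are illustrative names, and the input is the ten odd hashCodes 1, 3, 5, ..., 19 reduced with a plain remainder.

```java
import java.util.Arrays;

// Illustrative sketch; BucketSpreadDemo and bucketCounts are hypothetical names.
final class BucketSpreadDemo {
    // Counts how many of the hashCodes start, start + step, ... fall into each bucket.
    static int[] bucketCounts(int start, int step, int count, int length) {
        int[] counts = new int[length];
        for (int k = 0, h = start; k < count; k++, h += step) {
            counts[h % length]++; // all inputs here are non-negative
        }
        return counts;
    }

    public static void main(String[] args) {
        // Composite length 4: indices 0 and 2 stay empty.
        System.out.println(Arrays.toString(bucketCounts(1, 2, 10, 4))); // [0, 5, 0, 5]
        // Prime length 5: every bucket is used equally.
        System.out.println(Arrays.toString(bucketCounts(1, 2, 10, 5))); // [2, 2, 2, 2, 2]
    }
}
```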

Why do keys inserted into a HashMap need to implement the hashCode and equals methods? What are the requirements for these two methods?

Answer: The insertion index is determined by hashCode, and the entry is located by comparing with equals. Two equal keys must have equal hashCodes, but objects with the same hashCode are not necessarily equal.

Here the three comparisons need to be distinguished. hashCode is like a person's name: the same person always has the same name, but two people with the same name are not necessarily the same person. equals compares whether the content is the same; it is usually overridden by the class, and by default it compares reference addresses. == compares reference addresses for reference types and values for primitive types.

HashMap uses hashCode to obtain the index of a key. If two equal objects had different hashCodes, the same key could be stored twice in the HashMap, so keys for which equals returns true must have the same hashCode. HashMap combines all three comparisons to decide whether two elements are the same: p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))).
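A small sketch shows what goes wrong when the contract is broken. GoodKey and BadKey are hypothetical classes written for this illustration. BadKey is deliberately contrived: its hashCode comes from a per-instance counter so that two equal keys are guaranteed to hash differently.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; KeyContractDemo, GoodKey and BadKey are hypothetical names.
final class KeyContractDemo {
    // Follows the contract: equal keys have equal hashCodes.
    static final class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && id.equals(((GoodKey) o).id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Breaks the contract on purpose: equals compares content,
    // but every instance gets a different hashCode.
    static final class BadKey {
        private static int next = 0;
        final String id;
        private final int h = next++;
        BadKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && id.equals(((BadKey) o).id);
        }
        @Override public int hashCode() { return h; }
    }

    static boolean goodKeyFound() {
        Map<GoodKey, String> map = new HashMap<>();
        map.put(new GoodKey("x"), "v");
        return "v".equals(map.get(new GoodKey("x")));
    }

    static boolean badKeyFound() {
        Map<BadKey, String> map = new HashMap<>();
        map.put(new BadKey("x"), "v");
        return "v".equals(map.get(new BadKey("x")));
    }

    // Number of entries after putting two equal BadKeys.
    static int badKeyEntries() {
        Map<BadKey, String> map = new HashMap<>();
        map.put(new BadKey("x"), "v1");
        map.put(new BadKey("x"), "v2");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(goodKeyFound());  // true
        System.out.println(badKeyFound());   // false, the hash comparison already fails
        System.out.println(badKeyEntries()); // 2, the "same" key is stored twice
    }
}
```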
