Reading and parsing of HashMap source code 07/03 Update SLTechnology News&Howtos

Reading and parsing of HashMap source code

2025-07-03 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Network Security >

Shulou(Shulou.com)06/01 Report--

I. introduction

HashMap is one of the most common and longest-used data structures. It is powerful and versatile. It is also a common knowledge point in the interview. Frequently asked questions may be what is the HashMap storage structure? How HashMap puts in a key-value pair, how to get the value corresponding to the key-value, and how to delete a key-value pair. Today we will look at the underlying implementation principle of HashMap. Let's get down to business and analyze the implementation principle of the hashmap source code.

Second, HashMap construction method and storage structure public HashMap () {this.loadFactor = DEFAULT_LOAD_FACTOR; threshold = (int) (DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR); table = new entry [default _ INITIAL_CAPACITY]; init ();}

There are several construction methods for HashMap, so we won't cover them one by one here, just our most common HashMap no-parameter construction method. In the above constructor, there are several variables that we need to explain here:

LoadFactor: load factor, default is 0.75

Threshold:threshold is a threshold, and the initial value is 16-0. 75 by default. When the number of key-value pairs stored in hashmap is greater than this value, it indicates that the hashmap capacity needs to be expanded, and the general capacity will double.

Table:table is actually an array of type Entry. In hashmap, we use arrays and linked lists to resolve hash conflicts, where the table array is used to store the head node of the conflicting linked list.

In addition, in HahsMap, we store Entry nodes through arrays and linked tables (the Entry data structure is used to store key-value pairs). The so-called array here is the table mentioned above. It is an Entry array. The initialization values of the nodes in the table object are all null. When our newly inserted node is hashed to this position for the first time, the node will be inserted into the corresponding position in the table. If there are subsequent nodes with the same hash position, the hash conflict will be resolved as a linked list. The schematic diagram is as follows:

III. Analysis of put () method

The put method is our most commonly used method, and we use this method to put key-value pairs into the HashMap collection, so what exactly is the structure of HashMap, and what does the put () method do? Let's take a look at the implementation of the put () method.

Public V put (K key, V value) {if (key = = null) return putForNullKey (value); int hash = hash (key.hashCode ()); int I = indexFor (hash, table.length); for (Entry e = table [I]; e! = null; e = e.next) {Object k If (e.hash = = hash & & (k = e.key) = = key | | key.equals (k)) {V oldValue = e.value; e.value = value; e.recordAccess (this); return oldValue;}} modCount++; addEntry (hash, key, value, I); return null } private V putForNullKey (V value) {for (Entry e = table [0]; e! = null; e = e.next) {if (e.key = = null) {V oldValue = e.value; e.value = value; e.recordAccess (this); return oldValue;}} modCount++; addEntry (0, null, value, 0); return null } if (key = = null) return putForNullKey (value)

If the key value currently passed in is null, execute the putForNullKey () method; when the key value is null, the hash value is 0, and save it to the linked list starting with table [0]. Traversing the linked list, if there is a node whose key value is null, it is replaced directly with the new value. If a node with a key value of null is not found, the addEntry () method is called to insert a new node with a key of null. The addEntry method will be introduced later.

Int hash = hash (key.hashCode ()); int I = indexFor (hash, table.length)

Why is the hash algorithm called again on the hashCode value of key? To put it simply, it is to make the scattered location of the key transmitted more evenly, the specific reason is not introduced in this article, there are a lot of information on the Internet for reference.

Then call the indexFor method to calculate where the current key values are scattered in the table, which is actually key%table.length

For (Entry e = table [I]; e! = null; e = e.next) {Object k; if (e.hash = = hash & & (k = e.key) = = key | | key.equals (k) {V oldValue = e.value; e.value = value; e.recordAccess (this); return oldValue;}}

Iterate through the linked list headed by table [I] to see if a node with the same key value already exists in the linked list. The judgment condition is if (e.hash = = hash & & ((k = e.key) = = key | | key.equals (k). This judgment condition is very important, let's analyze it carefully. First, e.hash = = hash: we have calculated the hash value of the current node to be processed and saved it in the variable hash, where we need to compare whether the hash value (e.hash) and hash of the current linked list traversal node key are equal. If we look at the addEntry () method, we will find that the storage location of the Entry node is actually determined by the hash value of the key. If the hash of key is the same, then their storage location is also the same. (K = e.key) = = key | | key.equals (k). Let's briefly talk about the meaning of "=" and "equals". "=" is a reference consistency judgment, while equals is a content consistency judgment. This means that if two key objects point to the same object, or if they are the same object, true is returned. To sum up, if the hash value is the same, the key value is the same or a reference to the same object indicates that there is an Entry node with key as the key value in the hashmap.

If the judgment if (e.hash = = hash & & ((k = e.key) = = key | | key.equals (k) is returned as true, the old value is replaced with the new value.

If the same key value is not found, the addEntry () method is called to add a new Entry node that specifies key and value.

4. Parsing void addEntry (int hash, K key, V value, int bucketIndex) {Entry e = table [bucketIndex]; table [bucketIndex] = new Entry (hash, key, value, e); if (size++ > = threshold) resize (2 * table.length);}

Let's move on to the addEntry () method, assuming that the current node is the first node inserted into the table [bucketIndex] position.

Entry e = table [bucketIndex]; table [bucketIndex] = new Entry (hash, key, value, e)

There is a line of code in the constructor of the Entry class:

Next = e

That is, the newly created entry node will point to the Entry node e passed by the Entry constructor, and the value saved by e is the value of the header node, that is, null. After the node is created, it is assigned to the table [bucketIndex], which is equivalent to the header node of the linked list that saves the newly inserted node. As shown in the figure below, we insert Entry at the table [I] location

Entry e = table [bucketIndex]; table [bucketIndex] = new Entry (hash, key, value, e)

The node stored in the table [buckertIndex] at this time is

Create a new Entry node, key= "key2", value= "value2", and the entry node next value points to

In addition, there are two lines of code in the addEntry method

If (size++ > = threshold) resize (2 * table.length)

The value of size is the number of nodes stored in the current hashMap, and threshold is a threshold. If the number of nodes stored in hashMap is greater than or equal to threshold, it means that we need to expand the current hashMap. The capacity of each expansion is twice that of the previous capacity. Let's take a look at the resize () method.

Void resize (int newCapacity) {Entry [] oldTable = table; int oldCapacity = oldTable.length; if (oldCapacity = = MAXIMUM_CAPACITY) {threshold = Integer.MAX_VALUE; return;} Entry [] newTable = new Entry [newCapacity]; transfer (newTable); table = newTable; threshold = (int) (newCapacity * loadFactor);} void transfer (Entry [] newTable) {Entry [] src = table; int newCapacity = newTable.length For (int j = 0; j

< src.length; j++) { Entry e = src[j]; if (e != null) { src[j] = null; do { Entry next = e.next; int i = indexFor(e.hash, newCapacity); e.next = newTable[i]; newTable[i] = e; e = next; } while (e != null); } }} 关键代码是这一段 Entry[] newTable = new Entry[newCapacity];transfer(newTable);table = newTable; 如果resize()之前Entry数组的大小为A,那么newTable数组的大小为2A transfer(newTable)方法用于将原先entry[]数组中的节点转移到newTable数组中，下面我们来看下transfer()方法具体干了什么。将原来的table数组赋值给src数组获取newTable数组的长度，这里为table数组长度的2倍循环遍历src数组，执行下面的操作 a. 取src[j]节点的值赋值给e b. 如果e节点不为null，将src[j]的值置为null 我们来举两个简单的例子说明一下tranfer到底干了什么：当src[j]不为空时，比方说src[j]中保存的Entry节点key="key2",value="value2"，src[j]指向的下一个节点key="key1",value="value1"，如下图所示：最开始的时候newTable[]中并没有存放任何Entry节点，只是单纯的进行了初始化。结合上面代码，我们可以看到此时e = entry2节点，next节点值为entry1 利用indexFor重新计算出e节点的散列位置。e节点的next指向被初始化后的newTable[i]节点，同时newTabel[i]的值也被赋值为e节点最后执行e = next;此时e等于entry1 形成节点的示意图如下：接着执行 next = e.next，此时e的next节点为null，next =null；利用indexFor计算出新的散列位置，比如说新的散列位置为j，此时以newTable[j]为头节点的链表中已经存在了两个节点。如下图所示：我们将待处理的节点entry节点插入后会变成什么样呢？简单的来说resize方法就是去逐个遍历table[i]后面的Entry节点链表，利用indexFor方法重新结算节点的散落位置，并将其插入到以newTable[]为头结点的链表中去。五、get()方法解析说完了put我们再来看一下get方法 public V get(Object key) { if (key == null) return getForNullKey(); int hash = hash(key.hashCode()); for (Entry e = table[indexFor(hash, table.length)]; e != null; e = e.next) { Object k; if (e.hash == hash && ((k = e.key) == key || key.equals(k))) return e.value; } return null;}private V getForNullKey() { for (Entry e = table[0]; e != null; e = e.next) { if (e.key == null) return e.value; } return null;} 理解了put方法时如何往hashmap中放入键值对的，那么get()方法也就很好理解了。我们来具体看看get()方法的实现。如果key值为null，执行getForNullKey()方法。当key值为null时，新的键值对会放到table[0]处，所以我们先去遍历table[0]位置的节点链表，查看是否有key值为null的节点。如果有的话，直接返回value。如果找不到key为null的节点，返回null。如果key值不为null，利用indexFor方法找到当前key所处的table[i]位置，遍历table[i]位置的节点链表。根据e.hash == hash && ((k = e.key) == key || key.equals(k))来判断是否有相同key值的节点。如果当前位置链表中存在key值相同的Entry节点，返回Entry节点保存的value。如果找不到key值匹配的Entry节点，返回null。六、remove()方法解析public V remove(Object key) { Entry e = removeEntryForKey(key); return (e == null ? null : e.value);}final Entry removeEntryForKey(Object key) { int hash = (key == null) ? 0 : hash(key.hashCode()); int i = indexFor(hash, table.length); Entry prev = table[i]; Entry e = prev; while (e != null) { Entry next = e.next; Object k; if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) { modCount++; size--; if (prev == e) table[i] = next; else prev.next = next; e.recordRemoval(this); return e; } prev = e; e = next; } return e;} 别看remove方法这么长，其实它的逻辑很简单通过hash()和IndexFor()方法找到当前Entry节点的散列位置i，prev节点为当前节点的上一个节点（初始值为table[i]节点），e节点表示当前节点。比较待删除节点的key值和当前节点的key值是否相符。如果找不到相符的节点，返回null；如果有相符的节点，且为头结点，e节点的下一个节点将被赋值给table[i]；如果有相匹配的节点，并且不为头结点，则prev节点不再指向e，而是指向e.next，也即是prev.next = e.next;相当于一个断链操作；七、HashMap遍历如果让你写一个hashmap的遍历代码，估计大部分人写出下面这段代码。可是HashMap的遍历过程到底是怎么样的，为什么我们每次取值的时候都使用iter.next()来取值的呢？下面我们就来看看HashMap的遍历实现。 Itreator iter = map.entrySet().itreator(); while(iter.hashNext()){ Map.entry entry = (Map.entry) iter.next();} HashMap类中有一个私有类EntrySet，它继承自AbstractSet类。EntrySet类中有一个iterator()方法，也就是我们上面在遍历hashMap所调用的iterator()方法，它会返回一个Iterator对象。我们来看看iterator方法： public Iterator iterator() { return newEntryIterator();} iterator()方法中调用了newEntryIterator()方法，接着进入newEntryIterator()方法看看。 Iterator newEntryIterator() { return new EntryIterator();} newEntryIterator方法又创建了一个EntryIterator对象并返回。这个EntryIterator很关键，我们来具体看看这个类。 private final class EntryIterator extends HashIterator { public Map.Entry next() { return nextEntry(); }} EntryIterator类继承自HashItertor类，而且HashIterator类只有一个方法next()。既然EntryIterator继承自HashIterator类，那么EntryIterator到底继承了父类的哪些对象，默认实现了父类的哪些方法呢？我们再看看HashIterator类。 private abstract class HashIterator implements Iterator { Entry next; // next entry to return int expectedModCount; // For fast-fail int index; // current slot Entry current; // current entry HashIterator() { expectedModCount = modCount; if (size >

0) {/ / advance to first entry Entry [] t = table; while (index < t.length & & (next = t [index++]) = = null);}

There are four attributes in the HashIterator class, and their usefulness code comments have been introduced simply and clearly. It is worth noting that HashIterator () provides a no-argument constructor, but he does not initialize all the properties, so what we need to be clear here is that the value of index will be assigned to 0. At the same time, there is a long section behind, what does it do?

First, Entry [] t = table; assigns the Entry [] array table of the current storage header node to t

Then execute a while loop

While (index < t.length & & (next = t [index++]) = = null)

The while loop ends when the index is greater than the length of the table, or when the node saved at the current t [index] location is not empty. That is, the purpose of the loop is to find out the location of the first Entry object stored in the table [] array and record that location with the index variable.

Let's sum up again! When Itreator iter = map.entrySet () .itreator (); ends, we get an Iterator object that holds the modcount value of the current hashMap, index is used to identify the first location in the table [] array that is not null, and the initial value of next is equal to the value of table [index].

While (iter.hashNext ())

The current object is actually a HashIterator object, and the hasNext () method of the HashIterator object is very simple.

Public final boolean hasNext () {return next! = null;} Map.entry entry = (Map.entry) iter.next ()

Combing through the logic again, EntryIterator has a method next

Public Map.Entry next () {return nextEntry ();} final Entry nextEntry () {if (modCount! = expectedModCount) throw new ConcurrentModificationException (); Entry e = next; if (e = = null) throw new NoSuchElementException (); if ((next = e.next) = = null) {Entry [] t = table; while (index < t.length & (next = t [index++]) = null) } current = e; return e;}

If the modCount value is not equal to expectedModCount, it means that HashMap may have been modified by other threads during the current traversal, and we need to throw a ConcurrentModificationException exception, which is what we often call fast-fail. At the same time, create a new Entry node e, with a value of next (the first time next points to the first header node in the table [] array that is not null).

If the next node of the current node is null, it is equivalent to traversing to the last node of the linked list pointed to by the current table [I]. At this point, we should look for the location where the next header node in the table array is not null.

Execute while (index < t.length & & (next = t [index++]) = = null) to find the next header node that is not null and save it to the next node.

Returns the current node e

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.