What is hashCode?

2025-07-11 Update From: SLTechnology News&Howtos

This article introduces hashCode: what it is, what it is for, and how it relates to equals. Many people have doubts about these questions in day-to-day work, so the editor has consulted various materials and put together a simple, practical explanation. Let's work through it.

What is hashCode?

What we usually call hashCode is an integer value obtained by hashing an object. The hash is produced by the native method hashCode() in the Object class (HashMap applies some further operations to it).

    public native int hashCode();

You can see that it is a native method. So the most direct and effective way to understand what this method is for is to read the comment on it in the source code.

Here is my rough translation of that comment.

Returns a hash value of the current object. This method is used to support some hash tables, such as HashMap.

Generally speaking, it has the following conventions:

If the information an object uses in equals comparisons has not been modified, hashCode must return the same value for that object no matter how many times it is called during one execution of a program. The value does not have to stay the same from one execution of the program to another.

If two objects are equal according to the equals method, calling hashCode on each of them must return the same result. (ps: this point is the key to several questions discussed later, and the examples further on bear it out.)

If two objects are unequal according to the equals method, their hashCode methods are not required to return different values. Still, it is better to design hashCode so that unequal objects produce different values, because this improves the performance of hash tables.

As far as is reasonably practical, the hashCode method of the Object class does return different values for different objects. The comment notes that this is typically achieved by converting the internal address of the object to an integer, although that technique is not required.

Ps: the internal address here means the memory address of the object. Note that even when the hashCode value is derived from the memory address, hashCode cannot be said to represent the memory address of the object: the garbage collector may move an object in memory while its hash code stays the same, and a JVM is free to generate the value some other way. What the hash code is actually used for is to determine where the object is stored in a hash table.

Every sentence of the source comment above matters, and together they describe the hashCode method thoroughly. The examples later in this article will show why I say so.
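The three conventions can be checked with a small sketch. The Point class and its fields are hypothetical, made up only for this illustration; the "Aa" and "BB" strings are a well-known pair of unequal strings that happen to share a hash code.

```java
import java.util.Objects;

class HashCodeContractDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);

        // Convention 1: repeated calls on an unmodified object agree.
        System.out.println(a.hashCode() == a.hashCode()); // true

        // Convention 2: objects that are equal must have equal hash codes.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true

        // Convention 3: unequal objects are allowed to share a hash code.
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true, both are 2112
    }
}

// Hypothetical value class used only for this illustration.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // Built from the same fields that equals compares.
        return Objects.hash(x, y);
    }
}
```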

What is a hash table?

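The hash table is mentioned above. What is a hash table? Following the explanation in Baidu encyclopedia: a hash table is a data structure that is accessed directly by key. It maps each key to a position in a table through a hash function, so that a record can be found quickly.

A picture shows their relationship. [Figure: keys on the left are mapped by a hash function to positions in the hash table on the right.]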

The column on the left holds some keys. Each key is passed through the hash function and gets a fixed value, which corresponds to an entry in the column on the right. The column on the right can be considered the hash table.

Moreover, different keys may have the same hash value; for example, aa and bb both point to 1001. However, the same key can never point to different values.

This is easy to understand, because the hash table is used to find the hash address of a key. Once the key is fixed, the hash address calculated by the hash function is fixed too. If cc in the figure has been placed at position 1002, it cannot also occupy position 1003.

Think about it. What if another element, ee, comes and its hash address falls at 1002?

What's the use of hashCode?

In fact, the picture above already illustrates the point. By calculating the hashCode value of a key, we can determine its position in the hash table. When querying, we can then go straight to that position instead of searching, which makes the query more efficient.
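As a rough sketch of how a hash code can be turned into a position, the helper below (bucketIndex, a hypothetical name) reduces the hash code modulo the table size. This is a simplification for illustration: java.util.HashMap keeps a power-of-two table and uses bit operations on the hash instead, and the positions here will not match the 1001 to 1003 addresses of the figure.

```java
class BucketIndexDemo {
    // Hypothetical helper: maps any key to a slot in a table of the given size.
    // floorMod keeps the result non-negative even when hashCode() is negative.
    static int bucketIndex(Object key, int tableSize) {
        return Math.floorMod(key.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        int tableSize = 10;
        for (String key : new String[] {"aa", "bb", "cc", "dd"}) {
            // The same key always yields the same slot.
            System.out.println(key + " -> slot " + bucketIndex(key, tableSize));
        }
    }
}
```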

Now let's assume the following scenario. We need to store 10000 different elements (for example aa, bb, cc, dd, and so on) in one area of memory. How do we insert elements that are new and overwrite elements that are already there?

The easiest approach to think of is to go through the existing elements every time we save a new element, to see whether an equal one is already stored. This works, but if there are already 9000 elements, we have to traverse up to 9000 elements for a single insertion. Obviously, this is very inefficient.

Let's think about it another way, again using the figure above as an example. When a new element ff arrives, first calculate its hashCode value, which is 1003. There is no element at that position yet, so put ff directly in this location.

Then ee arrives, and its hash is 1002. This time an element already exists at position 1002. So we use the equals method to check whether they are equal. There is only one element there, dd, which is obviously not equal to ee. So we put ee after dd (the elements at one position can be stored in a linked list).

When a new element arrives, we first calculate its hash value and use it to determine the storage location, which reduces the number of comparisons. ff needs no comparison at all, and ee needs only one comparison, with dd.

As more and more elements are stored, a new element only has to be compared with the elements already stored at the position its hash points to. It never has to be compared with elements at other positions. This greatly reduces the number of element comparisons.
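The insertion procedure just described can be sketched as a tiny chained hash set. TinyHashSet is a hypothetical teaching structure, not the real HashMap or HashSet code; it uses ten buckets and a simple modulo, and it counts how many equals calls the insertions needed.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedSetDemo {
    public static void main(String[] args) {
        TinyHashSet set = new TinyHashSet(10);
        for (String s : new String[] {"aa", "bb", "cc", "dd", "ee", "ff", "aa"}) {
            System.out.println(s + " added=" + set.add(s));
        }
        System.out.println("size=" + set.size() + ", equals calls=" + set.comparisons());
    }
}

// Hypothetical teaching structure: an array of buckets, each bucket a linked list.
class TinyHashSet {
    private final List<LinkedList<String>> buckets = new ArrayList<>();
    private int size;
    private int comparisons;

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Returns true if the element was new, false if an equal one was already stored.
    boolean add(String element) {
        // Step 1: the hash code selects exactly one bucket.
        int index = Math.floorMod(element.hashCode(), buckets.size());
        LinkedList<String> bucket = buckets.get(index);
        // Step 2: compare with equals only against the elements in that bucket.
        for (String existing : bucket) {
            comparisons++;
            if (existing.equals(element)) {
                return false;
            }
        }
        // Step 3: no equal element found, so append to the chain.
        bucket.add(element);
        size++;
        return true;
    }

    int size() {
        return size;
    }

    int comparisons() {
        return comparisons;
    }
}
```

Seven insertions are attempted here, yet only the elements that share a bucket are ever compared; a linear scan would compare each new element with every stored one.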

The hash table drawn in the picture is small for convenience. Now suppose the hash table is very large, with positions from 1001 to 9999, for example. Then a new element will very probably land in a position that holds no element yet, so no comparison is needed and insertion is very efficient. The drawback is that a larger hash table takes up more memory. So this is a tradeoff between time and space.
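The tradeoff can be made concrete with a small experiment. This is only a sketch under simple assumptions: the keys are the integers 0 to 999 (whose hash code is the number itself), the position is the hash modulo the table size, and collisions is a hypothetical helper name.

```java
class TableSizeTradeoffDemo {
    // Places the keys 0..keyCount-1 into a table with tableSize slots and counts
    // how many keys land in a slot that is already occupied.
    static int collisions(int keyCount, int tableSize) {
        boolean[] occupied = new boolean[tableSize]; // a bigger table costs more memory
        int collisions = 0;
        for (int key = 0; key < keyCount; key++) {
            int slot = Math.floorMod(Integer.hashCode(key), tableSize);
            if (occupied[slot]) {
                collisions++;
            } else {
                occupied[slot] = true;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        int keyCount = 1000;
        for (int tableSize : new int[] {10, 100, 1000, 10000}) {
            System.out.println(tableSize + " slots -> "
                    + collisions(keyCount, tableSize) + " collisions");
        }
    }
}
```

With only 10 slots almost every key collides, while with 1000 or more slots none of these keys do, at the price of a larger array.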

Attentive readers may have noticed that this approach looks very familiar. Yes, it is the idea behind the underlying implementation of the famous HashMap. If you don't know HashMap well, take a look at this article to sort it out: the underlying implementation principle of HashMap and source code analysis.

So, what's the use of hashCode? Obviously, it improves the efficiency of querying and inserting elements.

What's the difference between equals and ==?

This is a classic interview question that never seems to change. It reminds me of the interview question collections I used to recite, which was a painful experience. You probably still remember the standard answer: equals compares content and == compares addresses.

At that time, I really just memorized the answer without understanding the reason behind it. If someone had asked why equals needs to be overridden, I would have been lost.

First, we should know that equals is defined in Object, the parent class of all classes.

    public boolean equals(Object obj) {
        return (this == obj);
    }

As you can see, its default implementation is ==, which compares memory addresses. So, if a class does not override equals, calling equals has the same effect as ==.
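A quick sketch of that default behaviour, next to a class that does override equals. Plain is a hypothetical class that inherits equals from Object; String is used for contrast because it compares characters.

```java
class EqualsVsIdentityDemo {
    // Hypothetical class that does NOT override equals.
    static class Plain {
        final String name;

        Plain(String name) {
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Plain p1 = new Plain("zhangsan");
        Plain p2 = new Plain("zhangsan");
        System.out.println(p1 == p2);      // false: two distinct objects
        System.out.println(p1.equals(p2)); // false: Object.equals falls back to ==
        System.out.println(p1.equals(p1)); // true: same reference

        String s1 = new String("zhangsan");
        String s2 = new String("zhangsan");
        System.out.println(s1 == s2);      // false: two distinct objects
        System.out.println(s1.equals(s2)); // true: String overrides equals
    }
}
```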

We know that when two ordinary objects are created, in general, their corresponding memory addresses are different. For example, I define a User class.

    // User.java
    public class User {
        private String name;
        private int age;

        public User() {}

        public User(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    // TestHashCode.java
    public class TestHashCode {
        public static void main(String[] args) {
            User user1 = new User("zhangsan", 20);
            User user2 = new User("lisi", 18);
            System.out.println(user1 == user2);
            System.out.println(user1.equals(user2));
        }
    }
    // result:
    // false
    // false

Obviously, zhangsan and lisi are two people, two different objects. Therefore, they correspond to different memory addresses, and the contents are not equal.

Note that I haven't overridden equals for User yet, so equals here is the method inherited from Object, and it is bound to report that the two objects are not equal. To illustrate the point better, I will modify only the second line of main, as follows:

    // User user2 = new User("lisi", 18);
    User user2 = new User("zhangsan", 20);

Now user1 and user2 have the same content: both are zhangsan, 20 years old. As we understand it, although these are two objects, they should refer to the same person, Zhang San. However, the program still prints false for both == and equals.

This runs counter to our expectation: it is the same person, so why does equals not report them as equal? To get the result we want, we need to override the equals method in the User class. Add the following code to User (generated automatically by IDEA):

    import java.util.Objects;

    public class User {
        // ... known code omitted

        @Override
        public boolean equals(Object o) {
            // Same memory address means the same object, so the content must be equal.
            if (this == o) return true;
            // A null argument or an object of a different class can never be equal.
            if (o == null || getClass() != o.getClass()) return false;
            User user = (User) o;
            // Both name and age must match for two users to be considered equal.
            return age == user.age && Objects.equals(name, user.name);
        }
    }
    // print result:
    // false
    // true

When we run the program again, equals now returns true, which is what we want.

Therefore, when you use custom objects and need equals to return true for two objects with the same content, you must override the equals method.

Why override equals and hashCode?

The case above has already explained why we override equals: when two objects have the same content, we want them to be treated as equal. The default implementation in the Object class only compares memory addresses, which is unreasonable for that purpose.

So why override hashCode as well? This involves collections such as Map and Set (a HashSet is actually backed by a HashMap).

Let's look at the source code of HashMap in JDK 1.8, starting with the put method.

In the code, hash values are compared first, and equals is called only when the hash values are equal. An existing element is overwritten only when both the hash and equals match. The same is true of the get method (compare hashes first, then compare with equals).

Only if both the hash and equals match is it considered the same element; get then returns the value of that element, and otherwise returns null.
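The "hash first, equals second" order can be observed from the outside with a key that counts its own equals calls. This is a sketch against the OpenJDK HashMap; the Key class, its caller-supplied hash, and the equalsCalls counter are all hypothetical and exist only to stage the three cases.

```java
import java.util.HashMap;
import java.util.Map;

class HashBeforeEqualsDemo {
    static int equalsCalls = 0;

    // Hypothetical key whose hash code is chosen by the caller.
    static final class Key {
        final String id;
        final int hash;

        Key(String id, int hash) {
            this.id = id;
            this.hash = hash;
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++; // record every comparison the map asks for
            return o instanceof Key && ((Key) o).id.equals(id);
        }
    }

    public static void main(String[] args) {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("a", 1), "value-a");

        equalsCalls = 0;
        System.out.println(map.get(new Key("b", 2))); // null
        System.out.println("different hash: equals calls = " + equalsCalls);

        equalsCalls = 0;
        System.out.println(map.get(new Key("c", 1))); // null
        System.out.println("same hash, different id: equals calls = " + equalsCalls);

        equalsCalls = 0;
        System.out.println(map.get(new Key("a", 1))); // value-a
        System.out.println("same hash, same id: equals calls = " + equalsCalls);
    }
}
```

When the hash differs, the lookup fails without equals ever being consulted; equals only gets a say once the hashes match.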

This also ties back to the section "What's the use of hashCode?". The purpose of overriding equals and hashCode is to let structures such as hash tables query and insert quickly and correctly. If you do not override them, equal elements cannot be recognized as equal, and they may even end up in different positions of the table.

If you override equals, do you have to override hashCode?

The answer is yes. First, this is stated in the second point of the JDK source comment above. Second, let's override equals without overriding hashCode and see what happens.

    import java.util.HashMap;

    public class TestHashCode {
        public static void main(String[] args) {
            User user1 = new User("zhangsan", 20);
            User user2 = new User("zhangsan", 20);
            HashMap<User, Integer> map = new HashMap<>();
            map.put(user1, 90);
            System.out.println(map.get(user2));
        }
    }
    // print result: null

We regard the user1 and user2 objects in the code as the same person, Zhang San. We define a map whose key is a User object and whose value is that user's score.

After storing user1 as the key with a score of 90 as the value, we certainly hope that looking up the map with user2 as the key gives 90. Disappointingly, the result is null.

This is because our User class overrides equals but not hashCode. The hash value calculated when user1 is put into the map comes from Object's hashCode and is (almost certainly) different from the hash value calculated when user2 is used for the lookup. So the lookup never gets as far as calling equals, and the two are treated as different elements, although we want user1 and user2 to count as the same element.
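The mismatch is easy to see by printing the hash codes. In this sketch the Person class is a hypothetical stand-in that overrides equals only; System.identityHashCode returns the value Object.hashCode would give, so it shows that the inherited hashCode knows nothing about the fields.

```java
import java.util.HashSet;
import java.util.Set;

class MissingHashCodeDemo {
    // Hypothetical class: equals is overridden, hashCode deliberately is not.
    static final class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Person && ((Person) o).name.equals(name);
        }
        // No hashCode override: the identity hash inherited from Object is used.
    }

    public static void main(String[] args) {
        Person a = new Person("zhangsan");
        Person b = new Person("zhangsan");

        System.out.println(a.equals(b));                                // true
        System.out.println(a.hashCode() == System.identityHashCode(a)); // true
        // The two values below are almost always different.
        System.out.println(a.hashCode() + " vs " + b.hashCode());

        Set<Person> set = new HashSet<>();
        set.add(a);
        // Almost always false, although a.equals(b) is true.
        System.out.println(set.contains(b));
    }
}
```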

As the diagram illustrates, user1 and user2 map to different buckets in the HashMap, so the target element is not found.

Therefore, when we use a custom class as the key of a HashMap, we have to override both hashCode and equals. Otherwise, we will get results we don't want.

That's also why we usually like to use String as the key: the String class already overrides the equals and hashCode methods for us, as follows.

    // String.java
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String) anObject;
            int n = value.length;
            // compare each character in the string in turn
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;
            // take out every character in the string and include it in the calculation
            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            // store the final value in the hash variable
            hash = h;
        }
        return h;
    }
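To see the hashCode loop at work, here is a small worked example that recomputes the same polynomial by hand. polynomialHash is a hypothetical helper written for this illustration; it mirrors the h = 31 * h + c step shown in the String source.

```java
class StringHashDemo {
    // Recomputes the polynomial h = 31 * h + c over the characters of s.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'a' = 97, 'b' = 98, 'c' = 99, so the hash is 97*31*31 + 98*31 + 99 = 96354.
        System.out.println(polynomialHash("abc")); // 96354
        System.out.println("abc".hashCode());      // 96354
    }
}
```

Because the result depends only on the characters, two strings with the same content always produce the same hash, which is exactly what a HashMap key needs.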

When overriding hashCode, you can likewise use the code generated by IDEA, or write it by hand.

    public class User {
        // ... known code omitted

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }
    // at this point, map.get(user2) returns the correct value, 90.
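For the "by hand" option, one common pattern is to combine the fields with the same multiplier of 31 that String uses. The sketch below assumes a hypothetical Student class standing in for User; the exact formula is a choice made for illustration, and any formula works as long as it uses only fields that equals compares.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class ManualHashCodeDemo {
    public static void main(String[] args) {
        Map<Student, Integer> scores = new HashMap<>();
        scores.put(new Student("zhangsan", 20), 90);
        // A second, equal object finds the entry because both hash to the same value.
        System.out.println(scores.get(new Student("zhangsan", 20))); // 90
    }
}

// Hypothetical stand-in for the User class, with a hand-written hashCode.
final class Student {
    private final String name;
    private final int age;

    Student(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Student other = (Student) o;
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Combine the same fields that equals compares; a null name counts as 0.
        int result = (name == null) ? 0 : name.hashCode();
        result = 31 * result + age;
        return result;
    }
}
```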

After overriding hashCode, there is one more thing to watch when a custom object is used as a key: do not change the content of the object while it is in the map. Doing so changes its hashCode value, and lookups will no longer give the correct result, as follows.

    import java.util.HashMap;

    public class TestHashCode {
        public static void main(String[] args) {
            User user = new User("zhangsan", 20);
            HashMap<User, Integer> map = new HashMap<>();
            map.put(user, 90);
            System.out.println(map.get(user));
            user.setAge(18); // change the age of the object to 18
            System.out.println(map.get(user));
        }
    }
    // print result:
    // 90
    // null

After the modification, the value you get is null. This matches the first point of the hashCode source comment: the hashCode value stays the same only on the premise that the object's information has not been modified. Once it is modified, the hashCode value may change.
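If a key really has to change, one workable pattern is to take the entry out of the map first, change the key, and put it back. The sketch below contrasts the two orders; Member is a hypothetical mutable key class similar in spirit to User, and the method names are made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MutableKeyDemo {
    // Hypothetical mutable key with equals and hashCode based on both fields.
    static final class Member {
        String name;
        int age;

        Member(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Member)) return false;
            Member other = (Member) o;
            return age == other.age && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }

    // Mutates the key while it is still inside the map: the entry becomes unreachable.
    static Integer mutateInPlace() {
        Map<Member, Integer> map = new HashMap<>();
        Member key = new Member("zhangsan", 20);
        map.put(key, 90);
        key.age = 18;
        return map.get(key);
    }

    // Removes the entry first, changes the key, then inserts it under the new hash code.
    static Integer removeThenReinsert() {
        Map<Member, Integer> map = new HashMap<>();
        Member key = new Member("zhangsan", 20);
        map.put(key, 90);
        Integer score = map.remove(key);
        key.age = 18;
        map.put(key, score);
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(mutateInPlace());      // null
        System.out.println(removeThenReinsert()); // 90
    }
}
```

Making the key class immutable (final fields, no setters) avoids the problem altogether.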

Does this remind you of any other question? For example, why is the String class designed to be immutable? Its use as a HashMap key is one of the reasons. You certainly don't want a key that works fine when you put an element in but cannot find it when you take it out.

Inside the String class, a field (hash) caches the hashCode value of the string. The cached value is guaranteed to stay valid only because the string is immutable.

When hashCode is equal, does equals have to be equal?

Obviously not. In the source code of HashMap we can see that when the hashCode values are equal (a hash collision), equals still has to be called to determine whether it is the same object. So equal hashCode values do not imply that equals returns true.

Conversely, if equals returns true, must hashCode be equal? Yes, it must. Objects that are equal according to equals are considered the same element in HashMap, so their hashCode values must be equal too.

Conclusion:

If hashCode values are equal, equals does not necessarily return true.

If hashCode values are different, equals must return false.

If equals returns true, hashCode values must be equal.

If equals returns false, hashCode values are not necessarily different.

This last point is the third point of the hashCode source comment. When two objects are not equal according to equals, their hashCode values are not required to differ. However, to improve the efficiency of the hash table, it is better to design hashCode so that they do differ.

The reason is this: if unequal objects have different hashCode values, a lookup that finds the hashCode values unequal can give up on that entry immediately (and return null if nothing else matches) without calling equals at all. This reduces the number of comparisons and improves efficiency.
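The cost of a poorly spread hashCode can be measured in a small sketch. A constant hashCode is legal under the conventions, since equal objects still get equal values, but it sends every key to the same bucket. The Key class, its constantHash switch, and the equalsCalls counter are hypothetical; the counts depend on the OpenJDK HashMap, so the example prints them rather than promising exact numbers.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionCostDemo {
    static int equalsCalls;

    // Hypothetical key: with constantHash set, every instance reports the same hash code.
    static final class Key {
        final int id;
        final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public int hashCode() {
            return constantHash ? 42 : Integer.hashCode(id);
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++; // record every comparison the map asks for
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    // Stores five keys, looks up a sixth key that is absent,
    // and returns how many equals calls that single lookup needed.
    static int equalsCallsForMiss(boolean constantHash) {
        Map<Key, String> map = new HashMap<>();
        for (int id = 0; id < 5; id++) {
            map.put(new Key(id, constantHash), "v" + id);
        }
        equalsCalls = 0;
        map.get(new Key(99, constantHash));
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("distinct hash codes: " + equalsCallsForMiss(false) + " equals calls");
        System.out.println("constant hash code:  " + equalsCallsForMiss(true) + " equals calls");
    }
}
```

Both variants give correct answers; the constant version simply has to call equals on every stored key before it can report a miss.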
