How does HashSet ensure that elements do not repeat

2025-07-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how HashSet ensures that its elements do not repeat. Many developers are not familiar with the details, so it is shared here for reference.

HashSet implements the Set interface and is backed by a hash table (actually a HashMap). HashSet does not guarantee the iteration order of the collection, and it permits the null element. In other words, HashSet cannot guarantee that the iteration order of elements is the same as their insertion order.

HashSet has the feature of de-duplication, that is, it can automatically filter out repeated elements in the collection, ensuring that the elements stored in HashSet are unique.

1. Basic usage of HashSet

The basic operations of HashSet are add, remove, contains (check whether an element exists) and size (number of elements in the set). These operations run in constant time, provided that the hash function spreads the elements properly among the buckets.
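The return values of these operations are worth seeing once. The following is a minimal sketch; the class name BasicOpsDemo and the method run are made up for illustration and are not part of the original article.

```java
import java.util.HashSet;
import java.util.Set;

public class BasicOpsDemo {
    // Runs the four basic operations and returns the final size of the set.
    static int run() {
        Set<String> set = new HashSet<>();
        boolean first = set.add("Java");        // true: the element was not present
        boolean second = set.add("Java");       // false: duplicate, set is unchanged
        set.add(null);                          // a single null element is permitted
        boolean hasJava = set.contains("Java"); // true
        boolean removed = set.remove("Java");   // true: the element was present
        System.out.println(first + " " + second + " " + hasJava + " " + removed);
        return set.size();                      // only the null element remains
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that add and remove report through their boolean result whether the set actually changed.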

Basic uses of HashSet are as follows:

// Create the HashSet collection
HashSet<String> strSet = new HashSet<>();
// Add data to the HashSet
strSet.add("Java");
strSet.add("MySQL");
strSet.add("Redis");
// Print all elements in the HashSet
strSet.forEach(s -> System.out.println(s));

2. HashSet is unordered

HashSet cannot guarantee that elements are iterated in the same order in which they were inserted. In other words, HashSet is an unordered collection. A code example is as follows:

HashSet<String> mapSet = new HashSet<>();
mapSet.add("Shenzhen");
mapSet.add("Beijing");
mapSet.add("Xi'an");
// Print all elements in the HashSet
mapSet.forEach(m -> System.out.println(m));

The code and its output show that the insertion order is Shenzhen -> Beijing -> Xi'an, while the printed order is Xi'an -> Shenzhen -> Beijing. HashSet is therefore unordered: it does not guarantee that the iteration order matches the insertion order.

PS: if you want to ensure that the insertion order is consistent with the iteration order, you can use LinkedHashSet to replace HashSet.
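As a small sketch of that alternative, the same three cities can be stored in a LinkedHashSet. The class name LinkedHashSetDemo and the helper iterationOrder are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LinkedHashSetDemo {
    // Returns the iteration order of a LinkedHashSet after the inserts below.
    static List<String> iterationOrder() {
        Set<String> cities = new LinkedHashSet<>();
        cities.add("Shenzhen");
        cities.add("Beijing");
        cities.add("Xi'an");
        cities.add("Beijing"); // duplicate: ignored, its position does not change
        return new ArrayList<>(cities);
    }

    public static void main(String[] args) {
        System.out.println(iterationOrder()); // [Shenzhen, Beijing, Xi'an]
    }
}
```

LinkedHashSet still de-duplicates in the same way as HashSet; it only adds a predictable iteration order.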

3. Misuse of HashSet

Some people say that HashSet can only guarantee that the basic data types are not duplicated, but not the custom objects. Is that right?

We use the following example to illustrate this problem.

3.1 HashSet and basic data types

Use HashSet to store basic data types. The implementation code is as follows:

HashSet<Long> longSet = new HashSet<>();
longSet.add(666L);
longSet.add(777L);
longSet.add(999L);
longSet.add(666L);
// Print all elements in the HashSet
longSet.forEach(l -> System.out.println(l));

The output shows that the value 666, which was added twice, is stored only once, so HashSet does ensure that values of basic data types are not duplicated.

3.2 HashSet and Custom object types

Next, store the custom object in HashSet with the following implementation code:

import java.util.HashSet;
import lombok.Getter;
import lombok.Setter;
import lombok.ToString;

public class HashSetExample {
    public static void main(String[] args) {
        HashSet<Person> personSet = new HashSet<>();
        personSet.add(new Person("Cao Cao", "123"));
        personSet.add(new Person("Sun Quan", "123"));
        personSet.add(new Person("Cao Cao", "123"));
        // Print all elements in the HashSet
        personSet.forEach(p -> System.out.println(p));
    }
}

@Getter
@Setter
@ToString
class Person {
    private String name;
    private String password;

    public Person(String name, String password) {
        this.name = name;
        this.password = password;
    }
}

The output shows that the custom objects were not de-duplicated: "Cao Cao" appears twice. Does this mean that HashSet cannot de-duplicate custom object types?

In fact, it can. De-duplication in HashSet depends on the hashCode and equals methods of the elements: two elements are treated as the same only if their hashCode values are equal and equals returns true; otherwise they are treated as different. The earlier Long elements were de-duplicated because the Long class overrides both hashCode and equals. The relevant source code is as follows:

@Override
public int hashCode() {
    return Long.hashCode(value);
}

public boolean equals(Object obj) {
    if (obj instanceof Long) {
        return value == ((Long) obj).longValue();
    }
    return false;
}
// Other source code omitted
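To see the effect of these two methods from the caller's side, here is a small sketch. The class LongEqualityDemo and the helper sameElement are hypothetical names; the helper only mirrors the rule described above (equal hash codes and equals returning true), it is not how HashSet is implemented.

```java
public class LongEqualityDemo {
    // True when two boxed values would be treated as the same HashSet element:
    // their hash codes match and equals() returns true.
    static boolean sameElement(Long a, Long b) {
        return a.hashCode() == b.hashCode() && a.equals(b);
    }

    public static void main(String[] args) {
        Long a = Long.valueOf(666L);
        Long b = Long.valueOf(666L);
        System.out.println(sameElement(a, b));                  // true
        System.out.println(sameElement(a, Long.valueOf(777L))); // false
    }
}
```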

For more information on hashCode and equals, see https://www.yisu.com/article/204554.htm

Then, if you want HashSet to support custom object deduplication, you only need to override the hashCode and equals methods in the custom object. The specific implementation code is as follows:

import java.util.Objects;
import lombok.Getter;
import lombok.Setter;
import lombok.ToString;

@Setter
@Getter
@ToString
class Person {
    private String name;
    private String password;

    public Person(String name, String password) {
        this.name = name;
        this.password = password;
    }

    @Override
    public boolean equals(Object o) {
        // Same reference: return true
        if (this == o) return true;
        // Null or a different class: return false
        if (o == null || getClass() != o.getClass()) return false;
        // Cast to the custom Person type
        Person person = (Person) o;
        // Return true if both name and password are equal
        return Objects.equals(name, person.name) &&
                Objects.equals(password, person.password);
    }

    @Override
    public int hashCode() {
        // Compute the hash code from name and password
        return Objects.hash(name, password);
    }
}

Rerun the earlier program with this Person class. The output shows that the duplicate "Cao Cao" has been removed.
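Both methods have to be overridden together. The sketch below, which does not use Lombok, contrasts two hypothetical classes invented for this illustration: EqualsOnly overrides equals but deliberately returns a different hash code for every instance, while Both bases equals and hashCode on the same field.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class EqualsOnlyDemo {
    // Hypothetical class: equals() compares names, but hashCode() differs per
    // instance on purpose, which breaks the equals/hashCode contract.
    static final class EqualsOnly {
        private static int counter = 0;
        private final String name;
        private final int id = ++counter;

        EqualsOnly(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsOnly && Objects.equals(name, ((EqualsOnly) o).name);
        }

        @Override
        public int hashCode() { return id; }
    }

    // Hypothetical class: equals() and hashCode() are both based on name.
    static final class Both {
        private final String name;

        Both(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Both && Objects.equals(name, ((Both) o).name);
        }

        @Override
        public int hashCode() { return Objects.hash(name); }
    }

    static int sizeWithEqualsOnly() {
        Set<EqualsOnly> set = new HashSet<>();
        set.add(new EqualsOnly("Cao Cao"));
        set.add(new EqualsOnly("Cao Cao")); // different hash code: equals() is never consulted
        return set.size();
    }

    static int sizeWithBoth() {
        Set<Both> set = new HashSet<>();
        set.add(new Both("Cao Cao"));
        set.add(new Both("Cao Cao")); // same hash code and equals() is true: rejected
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithEqualsOnly()); // 2
        System.out.println(sizeWithBoth());       // 1
    }
}
```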

4. How does HashSet ensure that elements are not duplicated?

As long as we understand how HashSet performs the process of adding elements, we can see why HashSet ensures that elements are not duplicated.

The process of adding an element to HashSet is as follows: when an object is added, HashSet first calculates the hashCode value of the object to determine where the object should be stored, and compares that value with the hashCode values of the objects already stored there. If there is no matching hashCode, HashSet assumes that the object is not a duplicate and inserts it at the appropriate location. If an object with the same hashCode value is found, the equals() method is called to check whether the two objects are really the same. If they are, HashSet does not add the duplicate object, which ensures that the elements are not duplicated.
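The process described above can be sketched as a toy set. TinyHashSet is a hypothetical class written only for illustration: it uses a fixed number of buckets and plain lists, whereas the real HashSet delegates to HashMap, resizes its table and converts long buckets into red-black trees.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyHashSet<E> {
    // Fixed bucket count; a power of two so that (n - 1) & hash picks a bucket.
    private static final int BUCKETS = 16;
    private final List<List<E>> table = new ArrayList<>();
    private int size = 0;

    public TinyHashSet() {
        for (int i = 0; i < BUCKETS; i++) {
            table.add(new ArrayList<>());
        }
    }

    // Returns true if the element was added, false if an equal one already exists.
    public boolean add(E e) {
        int hash = (e == null) ? 0 : e.hashCode();
        List<E> bucket = table.get((BUCKETS - 1) & hash);
        for (E existing : bucket) {
            int existingHash = (existing == null) ? 0 : existing.hashCode();
            // Compare hash codes first, and call equals() only when they match.
            if (existingHash == hash
                    && (existing == e || (e != null && e.equals(existing)))) {
                return false;
            }
        }
        bucket.add(e);
        size++;
        return true;
    }

    public int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>();
        System.out.println(set.add("Java"));  // true
        System.out.println(set.add("Redis")); // true
        System.out.println(set.add("Java"));  // false
        System.out.println(set.size());       // 2
    }
}
```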

To understand this process more clearly, we can read the implementation source code of HashSet. The source code of the add method in HashSet is as follows (based on JDK 8):

// When put() in HashMap returns null, the element was added successfully
public boolean add(E e) {
    return map.put(e, PRESENT) == null;
}

From the above source code, you can see that the add method in HashSet actually calls put in HashMap, so let's move on to the put implementation in HashMap:

// Return value: null if there was no element for the key, otherwise the previous value
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

As can be seen from the above source code, the put () method in HashMap calls the putVal () method again. The source code of putVal () is as follows:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // If the hash table is empty, call resize() to create it,
    // and record the table length in the variable n
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    /*
     * If there is no bucket for the given hash in the table, there is no collision.
     * (n - 1) & hash calculates the slot where the key will be placed.
     * (n - 1) & hash is essentially hash % n, but the bit operation is faster.
     */
    if ((p = tab[i = (n - 1) & hash]) == null)
        // Insert the key-value pair directly into the table
        tab[i] = newNode(hash, key, value, null);
    else { // An element already exists in the bucket
        Node<K,V> e; K k;
        // The first element in the bucket (the node in the array)
        // has an equal hash value and an equal key
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            // Record the first element in e
            e = p;
        // The bucket is a red-black tree: insert according to the tree structure
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // The bucket is a linked list: insert at the tail of the list
        else {
            for (int binCount = 0; ; ++binCount) {
                // Reached the end of the linked list
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // If the list length reaches the threshold,
                    // convert the nodes of this slot into a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // A node in the list has the same key as the element being put:
                // do not insert again, jump out of the loop
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // A key-value pair whose key and hash equal those of the
        // inserted element was found
        if (e != null) { // existing mapping for key
            // Record the value of e
            V oldValue = e.value;
            /*
             * Replace the old value when onlyIfAbsent is false
             * or the old value is null; otherwise no replacement is needed
             */
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            // Callback after access
            afterNodeAccess(e);
            // Return the old value
            return oldValue;
        }
    }
    // Update the structural modification count
    ++modCount;
    // When the number of key-value pairs exceeds the threshold, resize
    if (++size > threshold)
        resize();
    // Callback after insertion
    afterNodeInsertion(evict);
    return null;
}

As the source code above shows, when a key-value pair is put into a HashMap, the bucket for the entry is first determined from the hashCode() value of the key. If a key with the same hash value is already in that bucket, equals() is used to compare the two keys. If equals() returns true, the key is a duplicate: put() returns the previous value, which is not null, so the add() method in HashSet returns false to indicate that the element was not added. The existing element is kept and the new one does not replace it, which ensures that elements do not repeat. If the element is not a duplicate, put() eventually returns null, and add() returns true to indicate that the element was added successfully.
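This behavior can be observed from the outside with two strings that are equal but are distinct objects. The class AddReturnDemo and the method keepsFirstInstance are hypothetical names used only in this sketch.

```java
import java.util.HashSet;
import java.util.Set;

public class AddReturnDemo {
    // Returns true when the second add() is rejected and the set still holds
    // the first instance rather than the equal one added later.
    static boolean keepsFirstInstance() {
        String first = new String("Java");
        String second = new String("Java"); // equal to first, but a distinct object
        Set<String> set = new HashSet<>();
        boolean addedFirst = set.add(first);   // put() returned null, so add() is true
        boolean addedSecond = set.add(second); // put() returned the old value, so add() is false
        String kept = set.iterator().next();
        return addedFirst && !addedSecond && kept == first;
    }

    public static void main(String[] args) {
        System.out.println(keepsFirstInstance()); // true
    }
}
```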

That is all for "How does HashSet ensure that elements do not repeat". Thank you for reading, and we hope the content is helpful.
