A brief introduction to the common collections, their characteristics and implementation principles in the JAVA collection framework 07/15 Update SLTechnology News&Howtos

A brief introduction to the common collections, their characteristics and implementation principles in the JAVA collection framework

2025-07-15 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/02 Report--

This article introduces the relevant knowledge of "introduction to the common collections and their characteristics and implementation principles in the JAVA collection framework". In the operation of actual cases, many people will encounter such a dilemma, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!

Many collection classes provided by Java are derived from two major interfaces: the Collection interface and the Map interface.

Collection interface

The Collection interface defines a collection of objects. The main methods of the interface include:

Size ()-number of objects in the collection

Add (E) / addAll (Collection)-add a single / batch object to the collection

Remove (Object) / removeAll (Collection)-removes a single / batch object from the collection

Contains (Object) / containsAll (Collection)-determines whether one or some objects exist in the collection

ToArray ()-returns an array containing all objects in the collection

Etc.

Map interface

The Map interface assigns a key to each of these objects on the basis of Collection, and saves each key-value pair using Entry to quickly locate the object (value) through key. The main methods of the Map interface include:

Size ()-number of objects in the collection

Put (KMagna V) / putAll (Map)-add a single / batch object to the Map

Get (K)-returns the object corresponding to Key

Remove (K)-deletes the object corresponding to Key

KeySet ()-returns the Set containing all the key in the Map

Values ()-returns the Collection containing all the value in the Map

EntrySet ()-returns the EntrySet containing all key-value pairs in the Map

ContainsKey (K) / containsValue (V)-determines whether the specified key/value exists in the Map

Etc.

After learning about the two major interfaces Collection and Map, let's take a look at the common collection classes derived from these two interfaces:

List class collection

Picture .png

The List interface inherits from Collection and is used to define a collection stored in a list. The List interface assigns an index (index) to each object in the collection, marks the location of the object in the List, and can navigate to the object at the specified location through index.

The main methods that List adds to Collection include:

Get (int)-returns the object at the specified index location

Add (E) / add (int, E)-insert an object at the end of the List / at the specified index location

Set (int, E)-replaces the object placed at the index location specified by List

IndexOf (Object)-returns the index location of the specified object in List

SubList (int,int)-returns the child List object from the specified start index to the end index

Etc.

Common implementation classes of the List interface:

ArrayList

ArrayList implements the function of a set based on an array. It maintains an array of objects of variable length, in which all objects in the collection are stored, and implements the dynamic scaling of the length of the array.

ArrayList uses array copies to insert and delete specified locations:

Insert:

Picture .png

Delete:

Picture .png

LinkedList

LinkedList implements the function of the collection based on the linked list, which implements the static class Node. Each object in the collection is saved by a Node, and each Node has its own references to the previous and later Node.

Example of the process of appending elements to LinkedList:

Picture .png

ArrayList vs LinkedList

ArrayList has higher random access, and the array-based ArrayList can directly navigate to the target object, while LinkedList needs to traverse backward / forward several times from the beginning Node or tail Node to locate the target object.

LinkedList is more efficient than ArrayList in performing insert / delete operations on header / tail nodes.

Since each expansion of ArrayList is 1.5 times that of current capacity, LinkedList takes up less memory space.

The traversal efficiency of the two is similar, but it should be noted that iterator mode is used when traversing LinkedList, not get (int) mode, otherwise the efficiency will be very low.

Vector

Vector is similar to ArrayList in that it is a collection based on array implementations. The main difference between Vector and ArrayList is that

Vector is thread-safe, while ArrayList is not

Because the methods in Vector are basically synchronized, their performance is lower than ArrayList.

Vector can define the expansion factor of array length, but ArrayList cannot.

CopyOnWriteArrayList

Like Vector, CopyOnWriteArrayList can also be thought of as a thread-safe version of ArrayList, except that CopyOnWriteArrayList makes a copy of the write operation, writes on the new copy, and then modifies the reference. This mechanism allows CopyOnWriteArrayList to unlock read operations, which makes CopyOnWriteArrayList much more efficient than Vector. The concept of CopyOnWriteArrayList is similar to the separation of read and write, which is suitable for multi-threaded scenarios with more reads and less writes. However, it should be noted that CopyOnWriteArrayList can only guarantee the final consistency of the data, but not the real-time consistency of the data. If a write operation is in progress and has not been completed, the read operation cannot guarantee that the result of the write operation can be read.

Vector vs CopyOnWriteArrayList

Both are thread-safe, array-based List

Vector is [absolutely] thread safe. CopyOnWriteArrayList can only guarantee that the reader thread will read the write result of [completed], but it cannot realize the ability of "wait for the write operation to read the latest value] like Vector.

The read performance of CopyOnWriteArrayList is much higher than that of Vector, and the more concurrent threads, the more obvious the advantage.

CopyOnWriteArrayList takes up more memory space

Map class collection

Picture .png

Map encapsulates key and value into an object called Entry, and the elements stored in Map are actually Entry. Map instantiates the keySet and values objects only when the keySet () and values () methods are called.

According to its own characteristics, each Map has a different Entry implementation, which appears in the form of the inner class of the corresponding Map.

The basic characteristics of the Map interface have been described earlier, so let's take a look at the common implementation classes of the Map interface directly.

HashMap

HashMap stores Entry objects in an array and provides quick access to Entry through a hash table:

Picture .png

The position of the key in the array is determined by the hash value of the key in each Entry. With this feature, the Entry can be found quickly through key, and the corresponding value of the key can be obtained. Under the premise of no hash conflict, the time complexity of search is O (1).

If the index calculated by two different key is the same, there will be a situation where two different key correspond to the same position in the array, which is called hash conflict. HashMap handles hash conflicts by zipping, which means that each location in the array actually holds an Entry linked list, and each Entry in the linked list has a reference to the latter Entry in the linked list. When a hash conflict occurs, append the conflicting Entry to the header of the linked list. When HashMap finds multiple Entry on the array index corresponding to a key, it traverses the Entry linked list at that location until the target Entry is found.

Picture .png

Entry class of HashMap:

Static class Entry implements Map.Entry {final K key; V value; Entry next; int hash;}

HashMap is the most frequently used Map implementation class because of its fast addressing.

Hashtable

Hashtable can be said to be the predecessor of HashMap (Hashtable has existed since JDK1.0, and HashMap and even the entire Map interface is a new feature introduced by JDK1.2). In fact, it is almost exactly the same as HashMap, which stores Entry in an array, calculates the index of Entry in the array with the hash value of key, and solves the hash conflict by zipper method. The biggest difference between the two is that Hashtable is thread-safe and almost all of the methods it provides are synchronous.

ConcurrentHashMap

ConcurrentHashMap is the thread-safe version of HashMap (introduced since JDK1.5) and provides more efficient concurrency performance than Hashtable.

Picture .png

Hashtable locks the entire Entry array during read and write operations, which results in poor performance with more data. On the other hand, ConcurrentHashMap uses the idea of separating locks to solve the concurrency performance, which splits the Entry array into 16 Segment, and uses the hash algorithm to determine which Segment the Entry should be stored in. This makes it possible to lock only one Segment during write operations, which greatly improves the performance of concurrent writes.

When performing read operations, ConcurrentHashMap does not need to be locked in most cases, and the value in its Entry is volatile, which ensures thread visibility when value is modified, and thread-safe read operations can be achieved without locking.

HashEntry class of ConcurrentHashMap:

Static final class HashEntry {final int hash; final K key; volatile V value; volatile HashEntry next;}

But you can't have both, and ConcurrentHashMap's high performance comes at a price (otherwise Hashtable has no value), that is, it cannot guarantee the absolute consistency of read operations. ConcurrentHashMap ensures that the read operation can get the latest value of the value of the existing Entry, and that the read operation can get the contents of the completed write operation, but if the write operation is creating a new Entry, it is possible that the read operation will not get the Entry when the write operation is not completed.

HashMap vs Hashtable vs ConcurrentHashMap

The mechanisms and principles of the three are basically the same at the data storage level.

HashMap is not thread-safe. In a multithreaded environment, apart from not ensuring data consistency, it is also possible to cause Entry linked lists to form rings in the rehash phase, resulting in an endless loop.

Hashtable is thread-safe and ensures absolute data consistency, but performance is a problem. The more concurrent threads, the worse the performance.

ConcurrentHashMap is also thread-safe, the use of split locks and volatile and other methods greatly improve read and write performance, but also ensure data consistency in most cases. However, it can not guarantee absolute data consistency. Before one thread adds Entry to the Map, other threads may not be able to read the new Entry.

LinkedHashMap

LinkedHashMap is very similar to HashMap, except that the Entry of the former adds references to the former inserted Entry and the latter inserted Entry based on the HashMap.Entry, so that it can be traversed in the insertion order of the Entry.

Picture .png

TreeMap

TreeMap is a Map structure based on a red-black tree, and its Entry class has references to the left / right leaf node and parent node, as well as recording its own color:

Static final class Entry implements Map.Entry {K key; V value; Entry left = null; Entry right = null; Entry parent; boolean color = BLACK;}

Red-black tree is actually a kind of balanced binary tree with complex but efficient algorithm, which has the basic properties of binary tree, that is, the value of any node is larger than its left leaf node and smaller than its right leaf node. Using this feature, TreeMap can sort and quickly search Entry.

For a specific introduction to the red-black tree, you can refer to this article, which is very detailed: http://blog.csdn.net/chenssy/article/details/26668941

TreeMap's Entry is ordered, so it provides a series of convenient functions, such as getting KeySet in ascending or descending order, getting key before / after specifying key (Entry), and so on. It is suitable for scenarios where orderly operation of key is required.

ConcurrentSkipListMap

ConcurrentSkipListMap can also provide ordered Entry arrangement, but its implementation principle is different from TreeMap, it is based on SkipList:

Picture .png

As shown in the figure above, ConcurrentSkipListMap is implemented by a multi-level linked list, with all the elements on the underlying chain, and the number of elements in each chain decreases as it rises step by step. When looking up, start from the top-level chain and look up according to the priority of the first right then the next, so as to achieve fast addressing.

Static class Index {reference under final Node node; final Index down;// volatile Index right;// right reference}

Unlike TreeMap, ConcurrentSkipListMap only needs to modify the right reference of the affected node when inserting, deleting and so on, and the right reference is volatile, so ConcurrentSkipListMap is thread safe. However, ConcurrentSkipListMap, like ConcurrentHashMap, cannot guarantee the absolute consistency of the data, and in some cases it may not be able to read the data being inserted.

TreeMap vs ConcurrentSkipListMap

Both can provide ordered Entry collections

The performance of both is similar, and the search time complexity is O (logN).

ConcurrentSkipListMap will take up more memory space

ConcurrentSkipListMap is thread-safe, TreeMap is not

Set class collection

The Set interface inherits Collection and is used to store collections without repeating elements. Almost all Set implementations are based on the same type of Map. Simply put, Set is a castrated version of Map. There is a Map instance of the same type within each Set (except CopyOnWriteArraySet, which is built-in is a CopyOnWriteArrayList instance), Set stores elements as key in its own Map instance, and value is an empty Object. The common implementations of Set also include HashSet, TreeSet, ConcurrentSkipListSet, and so on. The principle is exactly the same as the corresponding Map implementation, so I won't repeat it here.

Picture .png

Queue/Deque class collection

Picture .png

The Queue and Deque interfaces inherit the Collection interface and implement a collection of FIFO (first in, first out). The difference between the two is that Queue can only join the team at the end of the team and leave the team at the head of the team, while the Deque interface can perform outgoing / join operations at both the head and the end of the team.

Common methods for Queue APIs:

Add (E) / offer (E): join the queue, that is, append elements to the end of the queue. The difference between the two is that if the queue is bounded, the add method throws IllegalStateException when the queue is full, while the offer method only returns false.

Remove () / poll (): dequeue, that is, remove 1 element from the head of the queue. The difference is that if the queue is empty, the remove method throws NoSuchElementException, while poll only returns null.

Element () / peek (): look at the header element. The difference between the two is that if the queue is empty, the element method throws NoSuchElementException, while peek only returns null.

Common methods for Deque APIs:

AddFirst / addLast (E) / offerFirst (E) / offerLast (E)

RemoveFirst () / removeLast () / pollFirst () / pollLast ()

GetFirst () / getLast () / peekFirst () / peekLast ()

RemoveFirstOccurrence (Object) / removeLastOccurrence (Object)

Common implementation classes of the Queue interface:

ConcurrentLinkedQueue

ConcurrentLinkedQueue is a queue implemented based on a linked list, and each Node in the queue has a reference to the next Node:

Private static class Node {volatile E item; volatile Node next;}

Because the members of the Node class are all volatile, ConcurrentLinkedQueue is naturally thread-safe. The atomicity and consistency of queuing and dequeuing operations can be guaranteed, but only weak data consistency can be guaranteed when traversing and size () operations.

LinkedBlockingQueue

Unlike ConcurrentLinkedQueue, LinkedBlocklingQueue is an unbounded blocking queue. The so-called blocking queue means that when joining the queue, if the queue is full, the thread will be blocked until the queue has room to join the queue and then return; at the same time, when leaving the queue, if the queue is empty, the thread will also be blocked until there are elements in the queue. LinkedBlocklingQueue is also implemented based on linked lists, and its dequeuing and queuing operations are locked using ReentrantLock. Therefore, it is thread-safe, but similarly, it can only ensure the atomicity and consistency of queuing and dequeue operations, and can only ensure the weak consistency of data during traversal.

ArrayBlockingQueue

ArrayBlockingQueue is a bounded blocking queue based on array implementation. The implementation of synchronous blocking mechanism is basically the same as that of LinkedBlocklingQueue, except that the production and consumption of the former use the same lock, while the production and consumption of the latter use two separate locks.

ConcurrentLinkedQueue vsLinkedBlocklingQueue vs ArrayBlockingQueue

ConcurrentLinkedQueue is a non-blocking queue, and the other two are blocking queues

All three are thread safe.

LinkedBlocklingQueue is unbounded and suitable for implementing queues of unlimited length, while ArrayBlockingQueue is suitable for queues of fixed length.

SynchronousQueue

SynchronousQueue is one of the bizarre queues of JDK implementations. It can't hold any elements. Size is always 0st peek () and always returns null. The thread into which the element is inserted blocks until another thread takes the element away, and vice versa, until another thread inserts the element.

This implementation mechanism is very suitable for transitive scenarios. In other words, if the producer thread needs to confirm in time that the task it has produced has been taken away by the consumer thread before it can execute the subsequent logic, it is suitable to use SynchronousQueue.

PriorityQueue & PriorityBlockingQueue

These two types of Queue are not FIFO queues, but are sorted according to the priority of the elements to ensure that the smallest element is the first to dequeue, or you can pass in the Comparator instance when the queue is constructed, so that PriorityQueue will sort the elements according to the requirements of the Comparator instance.

PriorityQueue is a non-blocking queue and is not thread-safe. PriorityBlockingQueue is a blocking queue and thread-safe at the same time.

The implementation classes of Deque include LinkedList (described earlier), ConcurrentLinkedDeque, LinkedBlockingDeque, and their implementation mechanisms are very similar to those of ConcurrentLinkedQueue and LinkedBlockingQueue mentioned above, so I will not repeat them here.

Finally, make a brief summary of the common collection implementation classes described in this article:

This is the end of the introduction to the common collections and their characteristics and implementation principles in the JAVA collection framework. Thank you for your reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.