Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Performance comparison of contains method between ArrayList and HashSet in java

2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/02 Report--

This article mainly introduces "java ArrayList and HashSet contains method performance comparison", in daily operation, I believe many people in java ArrayList and HashSet contains method performance comparison problems there are doubts, Xiaobian consulted all kinds of information, sorted out simple and easy to use operation methods, hope to answer "java ArrayList and HashSet contains method performance comparison" doubts helpful! Next, please follow the small series to learn together!

1 Introduction

In everyday development, ArrayList and HashSet are collection classes that are commonly used in Java.

ArrayList is the most commonly used implementation class for the List interface;

HashSet is an implementation of Set that holds unique elements.

This article mainly discusses the methods contained () shared by the two, mainly on performance comparison, and uses JMH(ava Microbenchmark Harness) for test comparison.

2 First look at the JMH test results

We tested using a Micro Benchmark Framework developed by the Java compilers in OpenJDK/Oracle. The following is a brief description of the process.

2.1 Maven Import Related Dependency

Import JMH dependencies, you can go to the official website to view the latest version:

org.openjdk.jmh jmh-core ${openjdk.jmh.version} org.openjdk.jmh jmh-generator-annprocess ${openjdk.jmh.version} 1.192.2 Creating Test-Related Classes 2.2.1 Collecting Classes for Stored Objects

Because we want to test the methods of the collection class, we create a class that represents the objects stored by the collection. As follows:

@Data@AllArgsConstructor(staticName = "of")public class Student { private Long id; private String name;}2.2.2 JMH Test Class

Next, let's write a class to test performance comparison. The code is as follows:

@BenchmarkMode(Mode.AverageTime)@OutputTimeUnit(TimeUnit.NANOSECONDS)public class ContainsPerformanceTest { @State(Scope.Thread) public static class MyState { private Set studentSet = new HashSet(); private List studentList = new ArrayList(); private Student targetStudent = Student.of(99L, "Larry"); @Setup(Level.Trial) public void prepare() { long MAX_COUNT = 10000; for (long i = 0; i

< MAX_COUNT; i++) { studentSet.add(Student.of(i, "MQ")); studentList.add(Student.of(i, "MQ")); } studentList.add(targetStudent); studentSet.add(targetStudent); } } @Benchmark public boolean arrayList(MyState state) { return state.studentList.contains(state.targetStudent); } @Benchmark public boolean hashSet(MyState state) { return state.studentSet.contains(state.targetStudent); } public static void main(String[] args) throws Exception { Options options = new OptionsBuilder() .include(ContainsPerformanceTest.class.getSimpleName()) .threads(6) .forks(1) .warmupIterations(3) .measurementIterations(6) .shouldFailOnError(true) .shouldDoGC(true) .build(); new Runner(options).run(); }} 测试类注解说明: @BenchmarkMode:表示进行Benchmark时使用的模式;AverageTime表示测试调用的平均时间。 @OutputTimeUnit:测试的度量时间单位;NANOSECONDS表示使用纳秒为单位。 @State:接受一个Scope参数表示状态的共享范围;Scope.Thread表示每个线程独享。 @Setup:执行Benchmark前执行,类似于JUnit的@BeforeAll。 @Benchmark:进行Benchmark的对象,类似于JUnit的@Test。 测试类启动参数Options说明: include:benchmark所在的类名; threads:每个进程中的测试线程数; fork:进程数,如果为3,则JMH会fork出3个进程来测试; warmupIterations:预热的迭代次数, measurementIterations:实际测量的迭代次数。 2.3 测试结果 设置好参数后,就可以跑测试了。测试结果如下: # Benchmark: ContainsPerformanceTest.arrayList# Run progress: 0.00% complete, ETA 00:00:18# Fork: 1 of 1# Warmup Iteration 1: 42530.408 ±(99.9%) 2723.999 ns/op# Warmup Iteration 2: 17841.988 ±(99.9%) 1882.026 ns/op# Warmup Iteration 3: 18561.513 ±(99.9%) 2021.506 ns/opIteration 1: 18499.568 ±(99.9%) 2126.172 ns/opIteration 2: 18975.407 ±(99.9%) 2004.509 ns/opIteration 3: 19386.851 ±(99.9%) 2248.536 ns/opIteration 4: 19279.722 ±(99.9%) 2102.846 ns/opIteration 5: 19796.495 ±(99.9%) 1974.987 ns/opIteration 6: 21363.962 ±(99.9%) 2175.961 ns/opResult "ContainsPerformanceTest.arrayList": 19550.334 ±(99.9%) 2771.595 ns/op [Average] (min, avg, max) = (18499.568, 19550.334, 21363.962), stdev = 988.377 CI (99.9%): [16778.739, 22321.929] (assumes normal distribution)# Benchmark: ContainsPerformanceTest.hashSet# Run progress: 50.00% complete, ETA 00:00:16# Fork: 1 of 1# Warmup Iteration 1: 10.662 ±(99.9%) 0.209 ns/op# Warmup Iteration 2: 11.177 ±(99.9%) 1.077 ns/op# Warmup Iteration 3: 9.467 ±(99.9%) 1.462 ns/opIteration 1: 9.540 ±(99.9%) 0.535 ns/opIteration 2: 9.388 ±(99.9%) 0.365 ns/opIteration 3: 10.604 ±(99.9%) 1.008 ns/opIteration 4: 9.361 ±(99.9%) 0.154 ns/opIteration 5: 9.366 ±(99.9%) 0.458 ns/opIteration 6: 9.274 ±(99.9%) 0.237 ns/opResult "ContainsPerformanceTest.hashSet": 9.589 ±(99.9%) 1.415 ns/op [Average] (min, avg, max) = (9.274, 9.589, 10.604), stdev = 0.505 CI (99.9%): [8.174, 11.004] (assumes normal distribution)# Run complete. Total time: 00:00:32Benchmark Mode Cnt Score Error UnitsContainsPerformanceTest.arrayList avgt 6 19550.334 ± 2771.595 ns/opContainsPerformanceTest.hashSet avgt 6 9.589 ± 1.415 ns/op 经过测试,发现两者耗时差异极大,ArrayList大概是20K纳秒,而HashSet则10纳秒左右。两者完全不在一个数量级上。 3 源码分析 通过测试得知两者差异极大,就小窥一下源码分析分析。 3.1 ArrayList的contains() ArrayList的底层使用数组作为数据存储,当给定一个Object去判断是否存在,需要去遍历数组,与每个元素对比。 public boolean contains(Object o) { return indexOf(o) >

= 0;}public int indexOf(Object o) { if (o == null) { for (int i = 0; i

< size; i++) if (elementData[i]==null) return i; } else { for (int i = 0; i < size; i++) if (o.equals(elementData[i])) return i; } return -1;} 从源码可以发现,contains()方法是通过调用indexOf()来判断的,而后者就是需要遍历数组,直到找到那个与入参相等的元素才会停止。因为,ArrayList的contains()方法的时间复杂度为O(n),也就是说,时间取决于长度,而且是正比的关系。 3.2 HashSet的contains() HashSet底层是通过HashMap来实现的,而HashMap的底层结构为数组+链表,JDK 8后改为数组+链表+红黑树。 HashMap的相关代码如下: public boolean containsKey(Object key) { return getNode(hash(key), key) != null;}final Node getNode(int hash, Object key) { Node[] tab; Node first, e; int n; K k; if ((tab = table) != null && (n = tab.length) >

0 && (first = tab[(n - 1) & hash]) != null) { if (first.hash == hash && // always check first node ((k = first.key) == key || (key != null && key.equals(k)))) return first; if ((e = first.next) != null) { if (first instanceof TreeNode) return ((TreeNode)first).getTreeNode(hash, key); do { if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) return e; } while ((e = e.next) != null); } } return null;}

First find by getting the Hash value, if the Hash value is equal and the object is also equal, then find. Generally speaking, in the case where the hashCode() method is implemented without problems, the occurrence of Hash collisions is relatively small. Therefore, it can be considered that in most cases, the time complexity of contains() is O(1), and the number of elements does not affect its speed. If a Hash collision occurs, the time complexity is O(n) when the length of the linked list is less than 8; when the linked list is greater than 8, it is converted to a red-black tree with O(logn) time complexity.

In general, we think that HashSet/HashMap lookup time complexity is O(1).

4 summarizes

Through JMH testing we found that ArrayList and HashSet contains() methods differ significantly in performance. After source code analysis, we know that the time complexity of ArrayList is O(n), and the time degree of HashSet is O(1).

At this point, the study on "Comparison of the performance of the methods contained in ArrayList and HashSet in java" is over, hoping to solve everyone's doubts. Theory and practice can better match to help everyone learn, go and try it! If you want to continue learning more relevant knowledge, please continue to pay attention to the website, Xiaobian will continue to strive to bring more practical articles for everyone!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report