1 原理说明

1.1 ArrayList

ArrayList中contains()方法的实现过程:

  1. /**
  2. * Returns <tt>true</tt> if this list contains the specified element.
  3. * More formally, returns <tt>true</tt> if and only if this list contains
  4. * at least one element <tt>e</tt> such that
  5. * <tt>(o==null&nbsp;?&nbsp;e==null&nbsp;:&nbsp;o.equals(e))</tt>.
  6. *
  7. * @param o element whose presence in this list is to be tested
  8. * @return <tt>true</tt> if this list contains the specified element
  9. */
  10. public boolean contains(Object o) {
  11. return indexOf(o) >= 0;
  12. }

contains()方法调用了indexOf()方法,indexOf()具体实现如下。从源码可以看出,该方法通过遍历数据和比较元素的方式来判断是否存在给定元素。当ArrayList中存放的元素非常多时,这种实现方式来判断效率将非常低,后面通过实例来验证。

  1. /**
  2. * Returns the index of the first occurrence of the specified element
  3. * in this list, or -1 if this list does not contain the element.
  4. * More formally, returns the lowest index <tt>i</tt> such that
  5. * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
  6. * or -1 if there is no such index.
  7. */
  8. public int indexOf(Object o) {
  9. if (o == null) {
  10. for (int i = 0; i < size; i++)
  11. if (elementData[i]==null)
  12. return i;
  13. } else {
  14. for (int i = 0; i < size; i++)
  15. if (o.equals(elementData[i]))
  16. return i;
  17. }
  18. return -1;
  19. }

1.2 HashSet

既然ArrayList的contains()方法存在性能问题,那么就应该寻找改进的办法。这里推荐使用HashSet来代替ArrayList。

下面介绍HashSet的contains()方法的实现过程:

注:HashSet将元素存放在HashMap中(HashMap的key)

  1. /**
  2. * Returns <tt>true</tt> if this set contains the specified element.
  3. * More formally, returns <tt>true</tt> if and only if this set
  4. * contains an element <tt>e</tt> such that
  5. * <tt>(o==null&nbsp;?&nbsp;e==null&nbsp;:&nbsp;o.equals(e))</tt>.
  6. *
  7. * @param o element whose presence in this set is to be tested
  8. * @return <tt>true</tt> if this set contains the specified element
  9. */
  10. public boolean contains(Object o) {
  11. return map.containsKey(o);
  12. }

contains()方法调用HashMap的containsKey()方法

  1. /**
  2. * Returns <tt>true</tt> if this map contains a mapping for the
  3. * specified key.
  4. *
  5. * @param key The key whose presence in this map is to be tested
  6. * @return <tt>true</tt> if this map contains a mapping for the specified
  7. * key.
  8. */
  9. public boolean containsKey(Object key) {
  10. return getNode(hash(key), key) != null;
  11. }

containsKey()方法调用getNode() 方法。在该方法中,首先根据key计算hash值,然后从HashMap中取出该hash值对应的链表(链表的元素个数将很少),再通过变量该链表判断是否存在给定值。这种实现方式效率将比ArrayList的实现方法效率高非常多。

  1. /**
  2. * Implements Map.get and related methods.
  3. *
  4. * @param hash hash for key
  5. * @param key the key
  6. * @return the node, or null if none
  7. */
  8. final Node<K,V> getNode(int hash, Object key) {
  9. Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
  10. if ((tab = table) != null && (n = tab.length) > 0 &&
  11. (first = tab[(n - 1) & hash]) != null) {
  12. if (first.hash == hash && // always check first node
  13. ((k = first.key) == key || (key != null && key.equals(k))))
  14. return first;
  15. if ((e = first.next) != null) {
  16. if (first instanceof TreeNode)
  17. return ((TreeNode<K,V>)first).getTreeNode(hash, key);
  18. do {
  19. if (e.hash == hash &&
  20. ((k = e.key) == key || (key != null && key.equals(k))))
  21. return e;
  22. } while ((e = e.next) != null);
  23. }
  24. }
  25. return null;
  26. }

2 实例验证

2.1 测试ArrayList

代码:

  1. public static void main(String[] args) {
  2. ArrayList<String> arrayList = new ArrayList<>();
  3. // 存入100000个数据
  4. for (int i = 0; i < 100000; i++) {
  5. arrayList.add("test" + i);
  6. }
  7. // 验证300000个数据(其中200000不存在)
  8. long beginTime = System.currentTimeMillis(); for (int i = 0; i < 300000; i++) {
  9. arrayList.contains("test" + i);
  10. }
  11. long endTime = System.currentTimeMillis();
  12. System.out.println("cost time: " + (endTime - beginTime) + "ms");
  13. }

打印结果:

cost time: 182210ms

2.2 测试HashSet

代码:

  1. public static void main(String[] args) {
  2. Set<String> hashSet = new HashSet<>();
  3. // 存入100000个数据
  4. for (int i = 0; i < 100000; i++) {
  5. hashSet.add("test" + i);
  6. }
  7. // 验证300000个数据(其中200000不存在)
  8. long beginTime = System.currentTimeMillis();
  9. for (int i = 0; i < 300000; i++) {
  10. hashSet.contains("test" + i);
  11. }
  12. long endTime = System.currentTimeMillis();
  13. System.out.println("cost time: " + (endTime - beginTime) + "ms");
  14. }

打印结果:

cost time: 49ms

3 总结

通过第二节的实例可以看出,使用ArrayList的contains()耗时是使用HashSet的contains()方法的30多倍。具体原因可以参考第一节中的原理分析。