Performance problems of ArrayList's contains() method and how to optimize them - "Java Study Notes"

1 How it works
- 1.1 ArrayList
- 1.2 HashSet
2 Verification by example
- 2.1 Testing ArrayList
- 2.2 Testing HashSet
3 Summary

1 How it works

1.1 ArrayList

The implementation of contains() in ArrayList:

    /**
     * Returns <tt>true</tt> if this list contains the specified element.
     * More formally, returns <tt>true</tt> if and only if this list contains
     * at least one element <tt>e</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;e==null&nbsp;:&nbsp;o.equals(e))</tt>.
     *
     * @param o element whose presence in this list is to be tested
     * @return <tt>true</tt> if this list contains the specified element
     */
    public boolean contains(Object o) {
        return indexOf(o) >= 0;
    }

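contains() delegates to indexOf(), whose implementation is shown below. As the source shows, indexOf() walks the backing array from the start and compares each element with the argument, so a single lookup takes time proportional to the size of the list (O(n)). When the ArrayList holds a very large number of elements, testing membership this way becomes very slow; the example in section 2 verifies this.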

    /**
     * Returns the index of the first occurrence of the specified element
     * in this list, or -1 if this list does not contain the element.
     * More formally, returns the lowest index <tt>i</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;get(i)==null&nbsp;:&nbsp;o.equals(get(i)))</tt>,
     * or -1 if there is no such index.
     */
    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }
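To make the linear scan visible, here is a minimal sketch that counts how many times equals() is invoked during one contains() call. The classes CountingKey and LinearScanDemo are hypothetical helpers written for this illustration; they are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element type for illustration: it counts equals() invocations.
class CountingKey {
    static long equalsCalls = 0;
    final int id;

    CountingKey(int id) { this.id = id; }

    @Override
    public boolean equals(Object other) {
        equalsCalls++;
        return other instanceof CountingKey && ((CountingKey) other).id == id;
    }

    @Override
    public int hashCode() { return Integer.hashCode(id); }
}

class LinearScanDemo {
    // Builds a list holding ids 0..size-1, looks up `target` once,
    // and returns how many equals() calls that single contains() made.
    static long equalsCallsForLookup(int size, int target) {
        List<CountingKey> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(new CountingKey(i));
        }
        CountingKey.equalsCalls = 0; // start counting just before the lookup
        list.contains(new CountingKey(target));
        return CountingKey.equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println(equalsCallsForLookup(1000, 0));    // match at index 0: 1
        System.out.println(equalsCallsForLookup(1000, 999));  // match at the last index: 1000
        System.out.println(equalsCallsForLookup(1000, 5000)); // absent: 1000
    }
}
```

A lookup for an absent element is the worst case: every element is compared before -1 is returned.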

1.2 HashSet

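Since ArrayList's contains() has this performance problem, it makes sense to look for an alternative. When the main operation is a membership test, HashSet is the recommended replacement for ArrayList.

The implementation of HashSet's contains() is walked through below:

Note: HashSet stores its elements in an internal HashMap, as the keys of that map.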

    /**
     * Returns <tt>true</tt> if this set contains the specified element.
     * More formally, returns <tt>true</tt> if and only if this set
     * contains an element <tt>e</tt> such that
     * <tt>(o==null&nbsp;?&nbsp;e==null&nbsp;:&nbsp;o.equals(e))</tt>.
     *
     * @param o element whose presence in this set is to be tested
     * @return <tt>true</tt> if this set contains the specified element
     */
    public boolean contains(Object o) {
        return map.containsKey(o);
    }

contains() calls HashMap's containsKey() method:

    /**
     * Returns <tt>true</tt> if this map contains a mapping for the
     * specified key.
     *
     * @param   key   The key whose presence in this map is to be tested
     * @return <tt>true</tt> if this map contains a mapping for the specified
     * key.
     */
    public boolean containsKey(Object key) {
        return getNode(hash(key), key) != null;
    }

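containsKey() computes the hash of the key and passes it to getNode(). getNode() uses that hash to locate the corresponding bucket in the HashMap's table, then traverses only the entries in that bucket to check whether the given key is present. A bucket is normally a very short linked list; when a bucket has grown large it is stored as a red-black tree, which is the TreeNode branch in the code. Because only one bucket is examined, a lookup takes roughly constant time on average as long as hashCode() distributes the elements well, which is far faster than ArrayList's linear scan on large collections.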

    /**
     * Implements Map.get and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
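The difference between scanning every element and inspecting a single bucket can be observed by counting equals() calls for the same lookup against both collections. This is a minimal sketch; CountedElement and BucketLookupDemo are hypothetical names introduced here, and the exact count for HashSet depends on how the hash codes happen to fall into buckets.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical element type for illustration: it counts equals() invocations.
class CountedElement {
    static long equalsCalls = 0;
    final int id;

    CountedElement(int id) { this.id = id; }

    @Override
    public boolean equals(Object other) {
        equalsCalls++;
        return other instanceof CountedElement && ((CountedElement) other).id == id;
    }

    // Equal elements must return equal hash codes, otherwise HashSet lookups miss.
    @Override
    public int hashCode() { return Integer.hashCode(id); }
}

class BucketLookupDemo {
    // Fills `target` with ids 0..size-1, then returns the number of
    // equals() calls made by one contains() lookup for `probeId`.
    static long equalsCallsForLookup(Collection<CountedElement> target, int size, int probeId) {
        for (int i = 0; i < size; i++) {
            target.add(new CountedElement(i));
        }
        CountedElement.equalsCalls = 0; // ignore calls made while filling
        target.contains(new CountedElement(probeId));
        return CountedElement.equalsCalls;
    }

    public static void main(String[] args) {
        // The list compares against every element up to the match.
        System.out.println("ArrayList: " + equalsCallsForLookup(new ArrayList<>(), 1000, 999));
        // The set compares only against entries in the one bucket selected by the hash.
        System.out.println("HashSet:   " + equalsCallsForLookup(new HashSet<>(), 1000, 999));
    }
}
```

The sketch also shows the precondition for using HashSet: the element type has to override hashCode() consistently with equals(), as String already does.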

2 Verification by example

2.1 Testing ArrayList

Code:

    import java.util.ArrayList;

    public class ArrayListContainsTest {
        public static void main(String[] args) {
            ArrayList<String> arrayList = new ArrayList<>();
            // Store 100,000 elements
            for (int i = 0; i < 100000; i++) {
                arrayList.add("test" + i);
            }
            // Look up 300,000 values (200,000 of which are not in the list)
            long beginTime = System.currentTimeMillis();
            for (int i = 0; i < 300000; i++) {
                arrayList.contains("test" + i);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("cost time: " + (endTime - beginTime) + "ms");
        }
    }

Output:

cost time: 182210ms

2.2 Testing HashSet

Code:

    import java.util.HashSet;
    import java.util.Set;

    public class HashSetContainsTest {
        public static void main(String[] args) {
            Set<String> hashSet = new HashSet<>();
            // Store 100,000 elements
            for (int i = 0; i < 100000; i++) {
                hashSet.add("test" + i);
            }
            // Look up 300,000 values (200,000 of which are not in the set)
            long beginTime = System.currentTimeMillis();
            for (int i = 0; i < 300000; i++) {
                hashSet.contains("test" + i);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("cost time: " + (endTime - beginTime) + "ms");
        }
    }

Output:

cost time: 49ms
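Both tests above discard the result of contains() and time a single run, so the absolute numbers vary by machine and JVM. As a rough sketch only, the two measurements can be put into one program that counts hits (so the lookups have an observable result) and uses System.nanoTime(). The class ContainsTimingSketch, its helper methods, and the reduced sizes are assumptions made for this illustration, not part of the original tests.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

class ContainsTimingSketch {
    // Builds the elements "test0" .. "test{count-1}".
    static List<String> buildElements(int count) {
        List<String> elements = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            elements.add("test" + i);
        }
        return elements;
    }

    // Probes "test0" .. "test{probes-1}" and returns how many were found.
    static int countHits(Collection<String> collection, int probes) {
        int hits = 0;
        for (int i = 0; i < probes; i++) {
            if (collection.contains("test" + i)) {
                hits++;
            }
        }
        return hits;
    }

    // Times one pass of countHits and prints the hit count with the elapsed time.
    static void report(String label, Collection<String> collection, int probes) {
        long begin = System.nanoTime();
        int hits = countHits(collection, probes);
        long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
        System.out.println(label + ": hits=" + hits + ", cost time: " + elapsedMs + "ms");
    }

    public static void main(String[] args) {
        // Smaller sizes than the tests above so the list version finishes quickly.
        List<String> list = buildElements(10000);
        Collection<String> set = new HashSet<>(list);
        report("ArrayList", list, 30000);
        report("HashSet", set, 30000);
    }
}
```

For measurements that need to be trusted beyond an order-of-magnitude comparison, a benchmarking harness with warm-up iterations is the safer choice.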

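3 Summary

In the examples in section 2, the ArrayList version took 182210 ms and the HashSet version took 49 ms, so ArrayList's contains() was roughly 3,700 times slower for this workload. The reason is explained in section 1: ArrayList compares the argument with its elements one by one, whereas HashSet only inspects the bucket selected by the element's hash.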
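When the data already lives in an ArrayList, one way to apply this is to copy it into a HashSet once and run all membership tests against the set. The following is a minimal sketch under that assumption; MembershipIndexDemo and countPresent are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MembershipIndexDemo {
    // Counts how many candidates occur in `source`.
    // The HashSet is built once (a single pass over the list), after which
    // each lookup inspects one bucket instead of scanning the whole list.
    static int countPresent(List<String> source, List<String> candidates) {
        Set<String> index = new HashSet<>(source);
        int hits = 0;
        for (String candidate : candidates) {
            if (index.contains(candidate)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            source.add("test" + i);
        }
        List<String> candidates = new ArrayList<>();
        for (int i = 0; i < 300000; i++) {
            candidates.add("test" + i);
        }
        System.out.println(countPresent(source, candidates)); // prints 100000
    }
}
```

The copy costs extra memory, and a HashSet keeps neither duplicates nor insertion order, so the original list is still needed when order or repeated elements matter. For a handful of lookups on a small list, the plain contains() call is usually fine.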