1 原理说明
1.1 ArrayList
ArrayList中contains()方法的实现过程:
/**
* Returns <tt>true</tt> if this list contains the specified element.
* More formally, returns <tt>true</tt> if and only if this list contains
* at least one element <tt>e</tt> such that
* <tt>(o==null ? e==null : o.equals(e))</tt>.
*
* @param o element whose presence in this list is to be tested
* @return <tt>true</tt> if this list contains the specified element
*/
public boolean contains(Object o) {
return indexOf(o) >= 0;
}
contains()方法调用了indexOf()方法,indexOf()具体实现如下。从源码可以看出,该方法通过遍历数据和比较元素的方式来判断是否存在给定元素。当ArrayList中存放的元素非常多时,这种实现方式来判断效率将非常低,后面通过实例来验证。
/**
* Returns the index of the first occurrence of the specified element
* in this list, or -1 if this list does not contain the element.
* More formally, returns the lowest index <tt>i</tt> such that
* <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>,
* or -1 if there is no such index.
*/
public int indexOf(Object o) {
if (o == null) {
for (int i = 0; i < size; i++)
if (elementData[i]==null)
return i;
} else {
for (int i = 0; i < size; i++)
if (o.equals(elementData[i]))
return i;
}
return -1;
}
1.2 HashSet
既然ArrayList的contains()方法存在性能问题,那么就应该寻找改进的办法。这里推荐使用HashSet来代替ArrayList。
下面介绍HashSet的contains()方法的实现过程:
注:HashSet将元素存放在HashMap中(HashMap的key)
/**
* Returns <tt>true</tt> if this set contains the specified element.
* More formally, returns <tt>true</tt> if and only if this set
* contains an element <tt>e</tt> such that
* <tt>(o==null ? e==null : o.equals(e))</tt>.
*
* @param o element whose presence in this set is to be tested
* @return <tt>true</tt> if this set contains the specified element
*/
public boolean contains(Object o) {
return map.containsKey(o);
}
contains()方法调用HashMap的containsKey()方法
/**
* Returns <tt>true</tt> if this map contains a mapping for the
* specified key.
*
* @param key The key whose presence in this map is to be tested
* @return <tt>true</tt> if this map contains a mapping for the specified
* key.
*/
public boolean containsKey(Object key) {
return getNode(hash(key), key) != null;
}
containsKey()方法调用getNode() 方法。在该方法中,首先根据key计算hash值,然后从HashMap中取出该hash值对应的链表(链表的元素个数将很少),再通过变量该链表判断是否存在给定值。这种实现方式效率将比ArrayList的实现方法效率高非常多。
/**
* Implements Map.get and related methods.
*
* @param hash hash for key
* @param key the key
* @return the node, or null if none
*/
final Node<K,V> getNode(int hash, Object key) {
Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
if ((tab = table) != null && (n = tab.length) > 0 &&
(first = tab[(n - 1) & hash]) != null) {
if (first.hash == hash && // always check first node
((k = first.key) == key || (key != null && key.equals(k))))
return first;
if ((e = first.next) != null) {
if (first instanceof TreeNode)
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
do {
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
} while ((e = e.next) != null);
}
}
return null;
}
2 实例验证
2.1 测试ArrayList
代码:
public static void main(String[] args) {
ArrayList<String> arrayList = new ArrayList<>();
// 存入100000个数据
for (int i = 0; i < 100000; i++) {
arrayList.add("test" + i);
}
// 验证300000个数据(其中200000不存在)
long beginTime = System.currentTimeMillis(); for (int i = 0; i < 300000; i++) {
arrayList.contains("test" + i);
}
long endTime = System.currentTimeMillis();
System.out.println("cost time: " + (endTime - beginTime) + "ms");
}
打印结果:
cost time: 182210ms
2.2 测试HashSet
代码:
public static void main(String[] args) {
Set<String> hashSet = new HashSet<>();
// 存入100000个数据
for (int i = 0; i < 100000; i++) {
hashSet.add("test" + i);
}
// 验证300000个数据(其中200000不存在)
long beginTime = System.currentTimeMillis();
for (int i = 0; i < 300000; i++) {
hashSet.contains("test" + i);
}
long endTime = System.currentTimeMillis();
System.out.println("cost time: " + (endTime - beginTime) + "ms");
}
打印结果:
cost time: 49ms
3 总结
通过第二节的实例可以看出,使用ArrayList的contains()耗时是使用HashSet的contains()方法的30多倍。具体原因可以参考第一节中的原理分析。