1 How it works
1.1 ArrayList
The contains() method of ArrayList is implemented as follows:
    /**
     * Returns <tt>true</tt> if this list contains the specified element.
     * More formally, returns <tt>true</tt> if and only if this list contains
     * at least one element <tt>e</tt> such that
     * <tt>(o==null ? e==null : o.equals(e))</tt>.
     *
     * @param o element whose presence in this list is to be tested
     * @return <tt>true</tt> if this list contains the specified element
     */
    public boolean contains(Object o) {
        return indexOf(o) >= 0;
    }
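contains() delegates to indexOf(), whose implementation is shown below. As the source shows, indexOf() walks the backing array and compares each element with the argument until it finds a match, so a single lookup needs up to size comparisons (O(n) in the worst case, and always for an element that is absent). When the ArrayList holds a very large number of elements, this membership test becomes very slow; section 2 verifies this with an example.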
    /**
     * Returns the index of the first occurrence of the specified element
     * in this list, or -1 if this list does not contain the element.
     * More formally, returns the lowest index <tt>i</tt> such that
     * <tt>(o==null ? get(i)==null : o.equals(get(i)))</tt>,
     * or -1 if there is no such index.
     */
    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }
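The linear scan can be made visible by counting equals() calls. The following is a minimal sketch, not part of the JDK: LinearScanDemo, CountingKey and comparisonsFor are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LinearScanDemo {
    /** Hypothetical key type that counts how often equals() is invoked. */
    static final class CountingKey {
        static long equalsCalls = 0;
        final int id;

        CountingKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof CountingKey && ((CountingKey) other).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    /** Returns the number of equals() calls made by one contains() lookup. */
    static long comparisonsFor(List<CountingKey> list, CountingKey probe) {
        CountingKey.equalsCalls = 0;
        list.contains(probe);
        return CountingKey.equalsCalls;
    }

    public static void main(String[] args) {
        List<CountingKey> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(new CountingKey(i));
        }
        // First element: the scan stops after one comparison.
        System.out.println(comparisonsFor(list, new CountingKey(0)));
        // Last element: the scan has to pass every element before it.
        System.out.println(comparisonsFor(list, new CountingKey(999)));
        // Absent element: the scan compares against all 1000 elements.
        System.out.println(comparisonsFor(list, new CountingKey(-1)));
    }
}
```

The cost of a lookup therefore grows with the position of the element, and a miss always pays for the whole list.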
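1.2 HashSet
Since ArrayList.contains() performs poorly on large lists, it is worth looking for a better option. When the main operation is a membership test, and element order and duplicates do not matter, HashSet is the recommended replacement for ArrayList.
The contains() method of HashSet is implemented as follows.
Note: HashSet stores its elements in a HashMap, as the keys of that map.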
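The idea of keeping set elements as map keys can be sketched in a few lines. This is a simplified illustration rather than the JDK source: SimpleHashSet is a hypothetical class, and it only covers add(), contains() and size().

```java
import java.util.HashMap;

/** Simplified sketch of a set that stores its elements as HashMap keys. */
class SimpleHashSet<E> {
    // Shared dummy value; only the keys of the map carry information.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    /** Returns true if the element was not already in the set. */
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    /** Membership test, delegated to the map's key lookup. */
    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SimpleHashSet<String> set = new SimpleHashSet<>();
        System.out.println(set.add("test1"));      // true: newly added
        System.out.println(set.add("test1"));      // false: already present
        System.out.println(set.contains("test1")); // true
        System.out.println(set.contains("test2")); // false
    }
}
```

Because every element is a key, a membership test on the set is exactly a key lookup on the map.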
    /**
     * Returns <tt>true</tt> if this set contains the specified element.
     * More formally, returns <tt>true</tt> if and only if this set
     * contains an element <tt>e</tt> such that
     * <tt>(o==null ? e==null : o.equals(e))</tt>.
     *
     * @param o element whose presence in this set is to be tested
     * @return <tt>true</tt> if this set contains the specified element
     */
    public boolean contains(Object o) {
        return map.containsKey(o);
    }
contains() calls the containsKey() method of HashMap:
    /**
     * Returns <tt>true</tt> if this map contains a mapping for the
     * specified key.
     *
     * @param key The key whose presence in this map is to be tested
     * @return <tt>true</tt> if this map contains a mapping for the specified
     * key.
     */
    public boolean containsKey(Object key) {
        return getNode(hash(key), key) != null;
    }
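containsKey() calls getNode(). The hash value is first computed from the key by hash(key); getNode() then uses it to locate the matching bucket in the HashMap's table and traverses only the entries in that bucket to decide whether the key is present. The bucket is a linked list (or a tree, when its first node is a TreeNode) that normally holds very few elements. As long as the keys' hashCode() values are well distributed, this is far more efficient than the linear scan used by ArrayList.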
    /**
     * Implements Map.get and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
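The bucket selection in getNode(), tab[(n - 1) & hash], can be tried out in isolation. The sketch below assumes the Java 8 style of hash spreading (the key's hashCode() XORed with its upper 16 bits); BucketIndexDemo, spread and bucketIndex are hypothetical names used only here, and the table length is assumed to be a power of two, as it is in HashMap.

```java
class BucketIndexDemo {
    /** Mixes the high bits of hashCode() into the low bits, as HashMap.hash() does in Java 8. */
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    /** Bucket index for a table whose length is a power of two. */
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        int tableLength = 16;
        // Integer.hashCode() is the value itself, so these are easy to follow by hand.
        System.out.println(bucketIndex(5, tableLength));  // 5
        System.out.println(bucketIndex(21, tableLength)); // 5 as well: 21 & 15 == 5, a collision
        // Strings land wherever their hashCode() sends them.
        System.out.println(bucketIndex("test0", tableLength));
        System.out.println(bucketIndex("test1", tableLength));
    }
}
```

Keys 5 and 21 share a bucket in a 16-slot table, which is why getNode() still has to compare hash and key for each entry in the bucket; only that short chain is traversed, not the whole collection.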
2 Verification by example
2.1 Testing ArrayList
Code:
    import java.util.ArrayList;

    public class ArrayListContainsTest {
        public static void main(String[] args) {
            ArrayList<String> arrayList = new ArrayList<>();
            // Store 100,000 elements
            for (int i = 0; i < 100000; i++) {
                arrayList.add("test" + i);
            }
            // Look up 300,000 values (200,000 of which are not in the list)
            long beginTime = System.currentTimeMillis();
            for (int i = 0; i < 300000; i++) {
                arrayList.contains("test" + i);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("cost time: " + (endTime - beginTime) + "ms");
        }
    }
Output:
cost time: 182210ms
2.2 Testing HashSet
Code:
    import java.util.HashSet;
    import java.util.Set;

    public class HashSetContainsTest {
        public static void main(String[] args) {
            Set<String> hashSet = new HashSet<>();
            // Store 100,000 elements
            for (int i = 0; i < 100000; i++) {
                hashSet.add("test" + i);
            }
            // Look up 300,000 values (200,000 of which are not in the set)
            long beginTime = System.currentTimeMillis();
            for (int i = 0; i < 300000; i++) {
                hashSet.contains("test" + i);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("cost time: " + (endTime - beginTime) + "ms");
        }
    }
Output:
cost time: 49ms
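3 Summary
In the example in section 2, ArrayList.contains() took 182210 ms and HashSet.contains() took 49 ms for the same 300,000 lookups, so the ArrayList version took roughly 3,700 times as long. The exact figures depend on the machine and the JVM, but the gap follows directly from the analysis in section 1: ArrayList scans its elements one by one, while HashSet jumps to a single bucket by hash value.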
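In practice the data often arrives as a list. A common pattern, sketched here under that assumption, is to copy the list into a HashSet once and run all membership tests against the set. MembershipLookupDemo and countHits are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MembershipLookupDemo {
    /** Counts how many probes are contained in the given collection. */
    static int countHits(Collection<String> haystack, List<String> probes) {
        int hits = 0;
        for (String probe : probes) {
            if (haystack.contains(probe)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Same data shape as in section 2: 100,000 stored values, 300,000 probes.
        List<String> stored = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            stored.add("test" + i);
        }
        List<String> probes = new ArrayList<>();
        for (int i = 0; i < 300000; i++) {
            probes.add("test" + i);
        }

        // One-time O(n) copy; every later lookup goes through the hash table.
        Set<String> lookup = new HashSet<>(stored);

        System.out.println("hits: " + countHits(lookup, probes)); // hits: 100000
    }
}
```

The copy costs one pass over the list, which pays off as soon as more than a handful of lookups follow. Counting the hits also means the result of contains() is actually used, which is a safer habit when timing such loops.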
