不再纠结什么路线,什么架构图了,就从一个点开始发散吧,螺旋螺旋…

预热>>>

所以,首先,什么是Hash?

:::info 朽木说:就是一个int值吧,通过某种算法得到的,hash函数?我不是很确定哎。 :::

我还是google一下吧:

image.png

散列函数?摘要,貌似勾起了我大学的一丁点回忆,开始螺旋了吗?我选择知乎,Go!

:::info 针对第一条回答,稍微提取一些重点吧:
hash 函数,是将任意长度的数据映射到有限长度的域上,作为这段数据的特征(指纹);

抗碰撞能力 :对于任意两个不同的数据块,其hash值相同的可能性极小;对于一个给定的数据块,找到和它hash值相同的数据块极为困难。

抗篡改能力 :对于一个数据块,哪怕只改动其一个bit位,其hash值得改动也会非常大。

在用到hash进行管理的数据结构中,比如 HashMap , hash 值(key)存在的目的是 加速键值对的查找 ,key的作用是为了将元素适当地放在各个桶里,对于抗碰撞的要求没那么高。hash出来的key,只要保证value大致均匀的放在不同的桶里就可以…….

:::

关于hash密码那段的就不看…先大致了解下就好。

根据答主的提示,翻了下JDK8中String的 hashCode() 方法:

  1. /**
  2. * Returns a hash code for this string. The hash code for a
  3. * {@code String} object is computed as
  4. * <blockquote><pre>
  5. * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
  6. * </pre></blockquote>
  7. * using {@code int} arithmetic, where {@code s[i]} is the
  8. * <i>i</i>th character of the string, {@code n} is the length of
  9. * the string, and {@code ^} indicates exponentiation.
  10. * (The hash value of the empty string is zero.)
  11. *
  12. * @return a hash code value for this object.
  13. */
  14. public int hashCode() {
  15. int h = hash;
  16. if (h == 0 && value.length > 0) {
  17. char val[] = value;
  18. for (int i = 0; i < value.length; i++) {
  19. h = 31 * h + val[i];
  20. }
  21. hash = h;
  22. }
  23. return h;
  24. }

再来学习下 Object 类中对hashCode方法的注释:

  1. /**
  2. * Returns a hash code value for the object. This method is
  3. * supported for the benefit of hash tables such as those provided by
  4. * {@link java.util.HashMap}.
  5. * <p>
  6. * The general contract of {@code hashCode} is:
  7. * <ul>
  8. * <li>Whenever it is invoked on the same object more than once during
  9. * an execution of a Java application, the {@code hashCode} method
  10. * must consistently return the same integer, provided no information
  11. * used in {@code equals} comparisons on the object is modified.
  12. * This integer need not remain consistent from one execution of an
  13. * application to another execution of the same application.
  14. * <li>If two objects are equal according to the {@code equals(Object)}
  15. * method, then calling the {@code hashCode} method on each of
  16. * the two objects must produce the same integer result.
  17. * <li>It is <em>not</em> required that if two objects are unequal
  18. * according to the {@link java.lang.Object#equals(java.lang.Object)}
  19. * method, then calling the {@code hashCode} method on each of the
  20. * two objects must produce distinct integer results. However, the
  21. * programmer should be aware that producing distinct integer results
  22. * for unequal objects may improve the performance of hash tables.
  23. * </ul>
  24. * <p>
  25. * As much as is reasonably practical, the hashCode method defined by
  26. * class {@code Object} does return distinct integers for distinct
  27. * objects. (This is typically implemented by converting the internal
  28. * address of the object into an integer, but this implementation
  29. * technique is not required by the
  30. * Java&trade; programming language.)
  31. *
  32. * @return a hash code value for this object.
  33. * @see java.lang.Object#equals(java.lang.Object)
  34. * @see java.lang.System#identityHashCode
  35. */
  36. public native int hashCode();

简单翻一下三条协定:

  1. 同一个Java程序执行中,多次调用hashCode方法返回值必须一致;如果是程序的多次执行,则每一次可以不同。
  2. equals方法声明相等的对象必须具有相等的哈希代码。
  3. 如果两个对象equals返回false,hashCode 则不是必须相同。It is not required that…

equals 方法的说明 :::danger Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

请注意,通常需要在重写此方法时覆盖hashCode方法,以便维护hashCode方法的常规协定, 该方法声明相等的对象必须具有相等的哈希代码 。意思就是 equals 方法返回true,hashCode要相等,反正不成立,因为hash碰撞/冲突无法避免。 :::

HashMap的使用

HashMap是应用更加广泛的哈希表实现,非同步, 支持null键和值 。通常情况下,HashMap进行put/get操作可以达到常数时间的性能,所以它是绝大部分利用键值对存取场景的首选。

初始化对象
新增,删除
遍历方法等

Map整体架构

1

HashMap 深入分析

HashMap内部结构,可以看做是数据和链表组合成的复合结构,数组被分为一个个桶 bucket,通过哈希值决定了键值对在在这个数组的寻址;哈希值相同的键值对,则以链表形式存储。这里需要注意的是,如果链表大小超过阈值(TREEIFY_THRESHOLD, 8),图中的链表就会被改造为树形结构。
从HashMap 开始... - 图3

构造方法之一:

  1. /**
  2. * Constructs an empty <tt>HashMap</tt> with the specified initial
  3. * capacity and load factor.
  4. *
  5. * @param initialCapacity the initial capacity
  6. * @param loadFactor the load factor
  7. * @throws IllegalArgumentException if the initial capacity is negative
  8. * or the load factor is nonpositive
  9. */
  10. public HashMap(int initialCapacity, float loadFactor) {
  11. if (initialCapacity < 0)
  12. throw new IllegalArgumentException("Illegal initial capacity: " +
  13. initialCapacity);
  14. if (initialCapacity > MAXIMUM_CAPACITY)
  15. initialCapacity = MAXIMUM_CAPACITY;
  16. if (loadFactor <= 0 || Float.isNaN(loadFactor))
  17. throw new IllegalArgumentException("Illegal load factor: " +
  18. loadFactor);
  19. this.loadFactor = loadFactor;
  20. this.threshold = tableSizeFor(initialCapacity);
  21. }

从实现来看,这个表格(数组)并没有在最初就初始化完成,仅仅设置了一些初始值而已。
put 方法

  1. /**
  2. * Associates the specified value with the specified key in this map.
  3. * If the map previously contained a mapping for the key, the old
  4. * value is replaced.
  5. *
  6. * @param key key with which the specified value is to be associated
  7. * @param value value to be associated with the specified key
  8. * @return the previous value associated with <tt>key</tt>, or
  9. * <tt>null</tt> if there was no mapping for <tt>key</tt>.
  10. * (A <tt>null</tt> return can also indicate that the map
  11. * previously associated <tt>null</tt> with <tt>key</tt>.)
  12. */
  13. public V put(K key, V value) {
  14. return putVal(hash(key), key, value, false, true);
  15. }
  16. /**
  17. * Implements Map.put and related methods.
  18. *
  19. * @param hash hash for key
  20. * @param key the key
  21. * @param value the value to put
  22. * @param onlyIfAbsent if true, don't change existing value
  23. * @param evict if false, the table is in creation mode.
  24. * @return previous value, or null if none
  25. */
  26. final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
  27. boolean evict) {
  28. Node<K,V>[] tab; Node<K,V> p; int n, i;
  29. if ((tab = table) == null || (n = tab.length) == 0)
  30. n = (tab = resize()).length;
  31. if ((p = tab[i = (n - 1) & hash]) == null)
  32. tab[i] = newNode(hash, key, value, null);
  33. else {
  34. Node<K,V> e; K k;
  35. if (p.hash == hash &&
  36. ((k = p.key) == key || (key != null && key.equals(k))))
  37. e = p;
  38. else if (p instanceof TreeNode)
  39. e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
  40. else {
  41. for (int binCount = 0; ; ++binCount) {
  42. if ((e = p.next) == null) {
  43. p.next = newNode(hash, key, value, null);
  44. if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
  45. treeifyBin(tab, hash);
  46. break;
  47. }
  48. if (e.hash == hash &&
  49. ((k = e.key) == key || (key != null && key.equals(k))))
  50. break;
  51. p = e;
  52. }
  53. }
  54. if (e != null) { // existing mapping for key
  55. V oldValue = e.value;
  56. if (!onlyIfAbsent || oldValue == null)
  57. e.value = value;
  58. afterNodeAccess(e);
  59. return oldValue;
  60. }
  61. }
  62. ++modCount;
  63. if (++size > threshold)
  64. resize();
  65. afterNodeInsertion(evict);
  66. return null;
  67. }

resize() 方法初始化,扩容

hash 值的方法:将高位数据移位到低位进行异或运算。

有些数据计算出的哈希值差异主要在高位,而HashMap里的哈希寻址是忽略容量以上的高位的,那么这种处理就可以有效避免类似情况下的哈希碰撞。

很难理解…
_

  1. /**
  2. * Computes key.hashCode() and spreads (XORs) higher bits of hash
  3. * to lower. Because the table uses power-of-two masking, sets of
  4. * hashes that vary only in bits above the current mask will
  5. * always collide. (Among known examples are sets of Float keys
  6. * holding consecutive whole numbers in small tables.) So we
  7. * apply a transform that spreads the impact of higher bits
  8. * downward. There is a tradeoff between speed, utility, and
  9. * quality of bit-spreading. Because many common sets of hashes
  10. * are already reasonably distributed (so don't benefit from
  11. * spreading), and because we use trees to handle large sets of
  12. * collisions in bins, we just XOR some shifted bits in the
  13. * cheapest possible way to reduce systematic lossage, as well as
  14. * to incorporate impact of the highest bits that would otherwise
  15. * never be used in index calculations because of table bounds.
  16. */
  17. static final int hash(Object key) {
  18. int h;
  19. return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
  20. }

继续深入方向:

  1. 梳理几个方法的逻辑流程,如 resize() putVal() treeifyBin
  2. 为什么要树化,如何构造hash相等的对象?

困的不行了,肝不动了,睡了,明天继续!…

6-8 pm

10-12.30 pm

4.5h

老师我想问一下,hashmap明明继承了abstractmap,而abstractmap已经实现了map接口,为什么hashmap还要实现map接口呢?
https://stackoverflow.com/questions/2165204/why-does-linkedhashsete-extend-hashsete-and-implement-sete