• map : key Lesson 11 Hash Table - 图1 value

    • key: var-length, unordered
  • hash table: int Lesson 11 Hash Table - 图2 value through array

    • int: small range (hash code)
      From key to hash code: hash table

Hash Code generate

Lesson 11 Hash Table - 图3%20%3D%20compress(HashCode(key))#card=math&code=h%28key%29%20%3D%20compress%28HashCode%28key%29%29)

How to generate HashCode?

    1. Lesson 11 Hash Table - 图4: HashCode(“abc”) = HashCode(“acb”)
    1. Lesson 11 Hash Table - 图5#card=math&code=%5Csum_%7Bi%7D%20%28key%5Bi%5D%20%5Ccdot%2016%5Ei%29) Lesson 11 Hash Table - 图6 byte shift
    1. cyclic shift

From simple to complicated.

Collision Handling

collision: Lesson 11 Hash Table - 图7%20%3D%20h(key_2)#card=math&code=h%28key_1%29%20%3D%20h%28key_2%29) but Lesson 11 Hash Table - 图8

    1. don’t care (tolerate some errors)
    • throw away
    • don’t insert

example: use hash-based map to realize set:
set = Lesson 11 Hash Table - 图9
key Lesson 11 Hash Table - 图10 Lesson 11 Hash Table - 图11

Lesson 11 Hash Table - 图12%5D%20%3D%201#card=math&code=a%5Bh%28k_i%29%5D%20%3D%201): correct if no collision ,may not correct if collision
Bloom Filter

    1. fixed array per bucket (can still overflow)
    1. linked list per bucket(chaining, hope short chain)
    1. other container—-other container: secondary
    1. “reuse” empty space: open addressing

a[key] = value

    1. insert(key, value) to array location Lesson 11 Hash Table - 图13%5D#card=math&code=%5Bh_0%28key%29%5D)
    1. if fail, re-insert to Lesson 11 Hash Table - 图14#card=math&code=h_1%28key%29)
    1. if fail, re-insert to Lesson 11 Hash Table - 图15#card=math&code=h_2%28key%29)
  • m. if fail, re-insert to Lesson 11 Hash Table - 图16#card=math&code=h_%7Bm-1%7D%28key%29)
  • m+1. if still not found declare failure

How to decide h(key)

    1. linear probing
      Lesson 11 Hash Table - 图17%20%25K%20%3D%20(h0%20%2Bi%20)%25%20K#card=math&code=h_i%20%3D%20%28h%7Bi-1%7D%2B1%29%20%25K%20%3D%20%28h_0%20%2Bi%20%29%25%20K)
    1. quadratic probing
      Lesson 11 Hash Table - 图18%20%25%20K#card=math&code=h_i%20%3D%20%28h_0%2Bi%5E2%29%20%25%20K)
    1. double hashing
      Lesson 11 Hash Table - 图19#card=math&code=h_i%20%3D%20%28h_0%20%2B%20i%20%5Ccdot%20%5Cwidetilde%7Bh%7D%28key%29)

Hash Table Build

Consider a hash table which has Lesson 11 Hash Table - 图20 entries and Lesson 11 Hash Table - 图21 keys
We want Lesson 11 Hash Table - 图22 load factor to be small

  • Lesson 11 Hash Table - 图23 large Lesson 11 Hash Table - 图24 hash don’t work
  • hash non-uniform Lesson 11 Hash Table - 图25 Lesson 11 Hash Table - 图26 large

When n(load factor) is getting larger, then we increase K

naive method

    1. set Lesson 11 Hash Table - 图27
    1. h(key) Lesson 11 Hash Table - 图28 range{0,1,2,…,2K-1}
    1. full rebuild Lesson 11 Hash Table - 图29#card=math&code=w%2FO%28n%29) if each insert Lesson 11 Hash Table - 图30#card=math&code=%5Csimeq%20O%281%29)

shortcoming: step3 long waiting

partial rebuild method

Example:
origin hash table(mod 4, h(key) = (key[0] - ‘a’) % K):
Lesson 11 Hash Table - 图31
We want insert “egg”

  • Full build method:
    Lesson 11 Hash Table - 图32
    We move “eat” to block 4 and “good” to block 6
  • partial build method:
    We only move “eat” to block 4 because it’s overflow
    Lesson 11 Hash Table - 图33
    We use pointer to connect the other blocks
    complexity: Lesson 11 Hash Table - 图34%2BO(%5Cfrac%7Bn%7D%7Bk%7D)#card=math&code=O%28K%29%2BO%28%5Cfrac%7Bn%7D%7Bk%7D%29)
    This way, the hash table is effectively extendable.

Summarize

Lesson 11 Hash Table - 图35