Chapter 7 Hash Tables

Symbol-table Problem

A table S holds n records. Each record has a key and some satellite data.

Operations
1) Insert 2) Delete 3) Search

Direct Access Table

Suppose keys are drawn from U = {0, 1, …, m-1}.
Assume keys are distinct.
Set up an array T[0 … m-1] to represent the dynamic set S.
T[k] = x if x ∈ S and key[x] = k; otherwise T[k] = nil.
Here all operations take Θ(1) time.
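
A minimal sketch of the direct-access table described above, assuming integer keys in {0, …, m-1}. The class name `DirectAccessTable` and its method names are hypothetical, chosen only for illustration; `None` stands in for nil.

```python
class DirectAccessTable:
    """Direct-access table for distinct integer keys in {0, ..., m-1}."""

    def __init__(self, m):
        # T[0..m-1]; None plays the role of nil.
        self.slots = [None] * m

    def insert(self, key, data):
        # The key is used directly as the array index: Theta(1).
        self.slots[key] = (key, data)

    def delete(self, key):
        self.slots[key] = None

    def search(self, key):
        # Returns the stored record, or None if the key is absent.
        return self.slots[key]


table = DirectAccessTable(8)
table.insert(3, "three")
table.insert(5, "five")
table.delete(5)
found = table.search(3)
missing = table.search(5)
```

Each operation is a single array access, which is where the Θ(1) bound comes from. The cost is that the array needs one slot per possible key, whether or not that key is ever stored.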

Limitation: the range of keys can be very large. With 64-bit keys, for example, T would need 2^64 slots even when S holds only a few records, which is unaffordable to store.

Hashing

We use a hash function h that maps the keys “randomly” into the slots of table T.

  • Collision
    When a record to be inserted maps to an already occupied slot, a collision occurs.

  • Resolving collisions by chaining

    • Idea: link records in the same slot into a list.
    • Analysis
      • Worst-Case
        Every key hashes to the same slot.
        Access takes Θ(n) time if |S| = n.
      • Average-Case
        Under the assumption of simple uniform hashing, each key k ∈ S is equally likely to be hashed to any slot in T, independent of where other keys are hashed.
  • Definition of Load factor
    The load factor of a hash table with n keys and m slots is α = n/m = average number of keys per slot.
    Under simple uniform hashing, the expected search time is Θ(1 + α): Θ(1) to compute the hash and reach the slot, plus Θ(α) to scan the list in that slot.
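
A rough sketch of chaining, assuming integer keys and using `key % m` as a stand-in hash function (an assumption; the notes do not fix a particular h at this point). The class `ChainedHashTable` and its method names are hypothetical.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (integer keys)."""

    def __init__(self, m):
        self.m = m                            # number of slots
        self.n = 0                            # number of stored keys
        self.slots = [[] for _ in range(m)]   # one list (chain) per slot

    def _slot(self, key):
        # Placeholder hash function, assumed for this sketch only.
        return key % self.m

    def insert(self, key, data):
        chain = self.slots[self._slot(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                      # key already present: overwrite
                chain[i] = (key, data)
                return
        chain.append((key, data))
        self.n += 1

    def search(self, key):
        # Scans only the chain of the key's slot.
        for k, data in self.slots[self._slot(key)]:
            if k == key:
                return data
        return None

    def delete(self, key):
        chain = self.slots[self._slot(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False

    def load_factor(self):
        return self.n / self.m                # alpha = n / m


table = ChainedHashTable(4)
for key, data in [(1, "a"), (5, "b"), (9, "c"), (2, "d")]:
    table.insert(key, data)
alpha = table.load_factor()
longest_chain = max(len(chain) for chain in table.slots)
```

Keys 1, 5 and 9 all land in slot 1 here, so that chain has length 3 even though α = 1. This is the kind of regularity that pushes a search toward the Θ(n) worst case when the hash function does not spread keys uniformly.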

  • How to choose a hash function?

  1. It should distribute keys uniformly into slots.
  2. Regularity in the key distribution should not affect uniformity.

Several common hash functions

1) Division method
h(k) = k mod m
- Do not pick m with a small divisor d.
e.g. if d = 2 and all keys are even, then the odd slots are never used.
	If m = 2^r, the hash depends only on the low-order r bits of k, not on all of its bits.
- Pick m to be a prime not too close to a power of 2 or 10.
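
The two pitfalls above can be checked with a short experiment. This is only an illustrative sketch: the helper names `division_hash` and `used_slots` and the sample key set are made up for the demonstration.

```python
def division_hash(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


def used_slots(keys, m):
    """Sorted list of the slots that at least one key hashes to."""
    return sorted({division_hash(k, m) for k in keys})


even_keys = range(0, 200, 2)

# m = 8 has the small divisor 2: even keys never reach the odd slots.
slots_m8 = used_slots(even_keys, 8)

# m = 7 is prime: the same keys reach every slot.
slots_m7 = used_slots(even_keys, 7)

# m = 2^3: only the low-order 3 bits of k matter, so these two keys
# collide although their high-order bits differ.
same_low_bits = division_hash(0b10110101, 8) == division_hash(0b01000101, 8)
```

With m = 8 only slots 0, 2, 4 and 6 are used, while m = 7 spreads the same keys over all seven slots.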

2) Multiplication method
Assume m = 2^r and the computer has w-bit words.
h(k) = (A*k mod 2^w) rsh (w-r), where rsh denotes a bitwise right shift and A is an odd integer between 2^(w-1) and 2^w.
- Don't pick A too close to 2^(w-1) or 2^w.
- Multiplication modulo 2^w is fast compared with division, and rsh is fast too.
e.g. m = 8 = 2^3, w = 7, w-r = 4
                  1 0 1 1 0 0 1   A
              *   1 1 0 1 0 1 1   k
    = 1 0 0 1 0 1 0 0 1 1 0 0 1 1
The high-order bits of the product are dropped by the mod 2^w. Right-shifting the remaining w = 7 bits (0 1 1 0 0 1 1) by w-r = 4 leaves h(k) = 0 1 1.
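
The worked example can be reproduced in a few lines. This is a sketch with a hypothetical function name `multiplication_hash`; Python integers are unbounded, so the mod 2^w is written out explicitly, whereas on a real w-bit machine it happens for free when the product overflows the word.

```python
def multiplication_hash(k, A, w, r):
    """Multiplication method: h(k) = (A*k mod 2^w) rsh (w - r)."""
    low_word = (A * k) % (1 << w)   # keep only the low-order w bits
    return low_word >> (w - r)      # keep the top r bits of that word


A = 0b1011001   # 89, odd and between 2^6 and 2^7
k = 0b1101011   # 107
w, r = 7, 3     # m = 2^r = 8 slots

product = A * k                       # 9523 = 0b10010100110011
low_word = product % (1 << w)         # 0b0110011
h = multiplication_hash(k, A, w, r)   # 0b011
```

The result is slot 3 (binary 011), matching the hand calculation.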
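  • Resolving collisions by open addressing (no storage for links)
    • Idea:
    1. Probe the table systematically until an empty slot is found.
    2. The probe sequence for each key should be a permutation of the slots 0, 1, …, m-1.
      Limitation: deletion is difficult, because emptying a slot can cut off the probe sequence of keys that were inserted past it.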
    • Probing strategies
    1. Linear probing - h(k,i) = (h(k,0) + i) mod m
      Suffers from “primary clustering”: long runs of filled slots build up.
    2. Double hashing - h(k,i) = (h1(k) + i*h2(k)) mod m
      An excellent method in practice; one usually picks m = 2^r and h2(k) odd.
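
A small sketch of open addressing with the double-hashing probe sequence (h1(k) + i*h2(k)) mod m. The functions `h1` and `h2` below are hypothetical choices made only so that the example runs, with m = 2^3 and h2(k) always odd; linear probing is the special case where h2(k) = 1 for every key.

```python
def probe_sequence(key, m, h1, h2):
    """Yield the slots (h1(key) + i*h2(key)) mod m for i = 0..m-1."""
    start, step = h1(key) % m, h2(key)
    for i in range(m):
        yield (start + i * step) % m


def insert(table, key, h1, h2):
    """Place key in the first empty slot of its probe sequence."""
    for slot in probe_sequence(key, len(table), h1, h2):
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("table is full")


def search(table, key, h1, h2):
    """Follow the probe sequence until the key or an empty slot is found."""
    for slot in probe_sequence(key, len(table), h1, h2):
        if table[slot] is None:
            return None          # an empty slot ends the search
        if table[slot] == key:
            return slot
    return None


M = 8                                 # m = 2^r with r = 3


def h1(k):
    return k % M                      # assumed primary hash


def h2(k):
    return ((k // M) % M) | 1         # assumed secondary hash, forced odd


table = [None] * M
placed = [insert(table, key, h1, h2) for key in (3, 11, 19)]
order_for_19 = list(probe_sequence(19, M, h1, h2))
found = search(table, 19, h1, h2)
absent = search(table, 27, h1, h2)
```

All three keys start at slot 3, but key 19 has step 3 and so moves away from the run that keys 3 and 11 form. Because the step is odd and m is a power of 2, the step is relatively prime to m and each probe sequence visits every slot exactly once.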
    • Analysis of open addressing
      Assumption of uniform hashing: each key is equally likely to have any one of the m! permutations as its probe sequence, independent of other keys.
      One can prove that the expected number of probes satisfies E[#probes] ≤ 1/(1-α) if α < 1, i.e. n < m.
      So if α is a constant, accessing the table takes O(1) probes.
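
To get a feel for the bound, the sketch below evaluates the expected number of probes for an unsuccessful search exactly. It assumes uniform hashing, so that after i collisions the next probe hits an occupied slot with probability (n-i)/(m-i); this gives the nested expression 1 + n/m * (1 + (n-1)/(m-1) * (1 + ...)). The function name `expected_probes` is made up for illustration.

```python
from fractions import Fraction


def expected_probes(n, m):
    """Exact expected probes for an unsuccessful search (uniform hashing).

    Evaluates 1 + n/m * (1 + (n-1)/(m-1) * (1 + ...)) from the inside out.
    """
    if not 0 <= n < m:
        raise ValueError("need 0 <= n < m")
    total = Fraction(1)
    for i in range(n - 1, -1, -1):     # innermost factor first
        total = 1 + Fraction(n - i, m - i) * total
    return total


def bound(n, m):
    """The bound 1 / (1 - alpha) with alpha = n / m, i.e. m / (m - n)."""
    return Fraction(m, m - n)


half_full = expected_probes(2, 4)      # alpha = 0.5
nearly_full = expected_probes(9, 10)   # alpha = 0.9
```

Each factor (n-i)/(m-i) is at most n/m = α, so the nested expression is at most 1 + α + α^2 + … = 1/(1-α). At α = 0.5 the bound is 2 probes and at α = 0.9 it is 10, which is why the load factor is kept bounded away from 1.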

Reposted from blog.csdn.net/qq_40527086/article/details/103228418