Sequential and List Method
Unsorted Lists
Sequential Search
Examine the records one at a time until the key is found or the list ends.
Self-Organizing Lists : Heuristics
Move-to-Front
When a record is found, move it to the front of the list, so every successful search leaves its record in first position.
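A minimal sketch of move-to-front on top of sequential search. The name `mtfSearch` and the choice of `std::list<int>` as the record container are assumptions made for illustration only.

```cpp
#include <list>

// Sequential search with the move-to-front heuristic.
// Returns the number of comparisons made, or -1 if key is absent.
int mtfSearch(std::list<int>& records, int key) {
    int comparisons = 0;
    for (auto it = records.begin(); it != records.end(); ++it) {
        ++comparisons;
        if (*it == key) {
            // splice relinks the node in O(1); no element is copied
            records.splice(records.begin(), records, it);
            return comparisons;
        }
    }
    return -1;
}
```

A linked list suits this heuristic because moving a node to the head costs constant time, whereas an array would have to shift every record in front of it.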
Count
Order by actual historical frequency of access. When a record's frequency count goes up, it moves forward in the list to become the last record with that value for its frequency count.
In other words, if the incremented count now equals the count of other records, the record is placed directly after them.
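The rule can be sketched as follows, assuming records are stored in a `std::vector` together with their counts. The struct `Counted` and the function `countSearch` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Counted {
    int key;
    int count;  // number of successful accesses so far
};

// Sequential search with the count heuristic.
// Returns the index where the record ends up, or -1 if key is absent.
int countSearch(std::vector<Counted>& records, int key) {
    for (std::size_t i = 0; i < records.size(); ++i) {
        if (records[i].key != key) continue;
        ++records[i].count;
        // Move forward only past records with a strictly smaller count,
        // so the record becomes the last one holding its new count value.
        std::size_t j = i;
        while (j > 0 && records[j - 1].count < records[j].count) {
            std::swap(records[j - 1], records[j]);
            --j;
        }
        return static_cast<int>(j);
    }
    return -1;
}
```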
Transpose
When a record is found, swap it with the record immediately ahead of it.
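A short sketch of transpose, assuming an array-based list of integer keys; `transposeSearch` is an illustrative name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sequential search with the transpose heuristic.
// Returns the record's new index, or -1 if key is absent.
int transposeSearch(std::vector<int>& records, int key) {
    for (std::size_t i = 0; i < records.size(); ++i) {
        if (records[i] == key) {
            if (i == 0) return 0;                   // already at the front
            std::swap(records[i - 1], records[i]);  // move one step forward
            return static_cast<int>(i - 1);
        }
    }
    return -1;
}
```

Because a record advances only one position per access, transpose works equally well on arrays and linked lists.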
Application: Text Compression
Move-to-Front heuristic
Each new word is placed at the front of the list; when a previously seen word is found, it is moved to the front as well.
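One way to turn this into a compressor, sketched under the assumption that a first occurrence is sent as the word itself and a repeat is sent as its current 1-based position in the list. The function name `mtfCompress` is hypothetical, and the sketch assumes the text contains no words that are themselves numbers.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <sstream>
#include <string>

// Replaces each repeated word by its current position in a move-to-front list.
std::string mtfCompress(const std::string& text) {
    std::list<std::string> seen;  // most recently used word first
    std::istringstream in(text);
    std::ostringstream out;
    std::string word;
    bool first = true;
    while (in >> word) {
        if (!first) out << ' ';
        first = false;
        auto it = std::find(seen.begin(), seen.end(), word);
        if (it == seen.end()) {
            out << word;            // first occurrence: emit the word itself
            seen.push_front(word);
        } else {
            out << std::distance(seen.begin(), it) + 1;  // 1-based position
            seen.splice(seen.begin(), seen, it);         // move to front
        }
    }
    return out.str();
}
```

For example, `mtfCompress("the car on the left hit the car i left")` yields `the car on 3 left hit 3 5 i 5`. A receiver that maintains the same list can reverse the process.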
Sorted Lists
Binary Search
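#include <vector>

// Returns the index of value in the sorted vector v, or -1 if it is absent.
int search_binary(const std::vector<int>& v, int value)
{
    int low = 0;
    int high = static_cast<int>(v.size()) - 1;  // -1 for an empty vector
    while (low <= high)
    {
        int mid = low + (high - low) / 2;  // avoids overflow of low + high
        if (v[mid] == value) return mid;
        else if (value < v[mid]) high = mid - 1;
        else low = mid + 1;
    }
    return -1;
}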
Two points to note: the loop condition is low <= high, and a successful search returns mid.
Jump Search
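Check every $r$-th element.
- If $k$ is greater than the $ir$-th element, compare $k$ with the $(i+1)r$-th element next
- Otherwise, search linearly between the $(i-1)r$-th and the $ir$-th elements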
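The two steps can be sketched as below, assuming a sorted `std::vector<int>` and a jump size `r` chosen by the caller. The name `jumpSearch` and the 0-based block boundaries are choices made for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Jump search in a sorted vector with jump size r.
// Returns the index of k, or -1 if k is absent.
int jumpSearch(const std::vector<int>& v, int k, std::size_t r) {
    if (v.empty() || r == 0) return -1;
    std::size_t start = 0;      // first index of the current block
    std::size_t probe = r - 1;  // last index of the current block
    // Jump ahead while the checked element is still smaller than k
    while (probe < v.size() && v[probe] < k) {
        start = probe + 1;
        probe += r;
    }
    // Linear search inside the block that may contain k
    std::size_t end = std::min(probe + 1, v.size());
    for (std::size_t i = start; i < end; ++i)
        if (v[i] == k) return static_cast<int>(i);
    return -1;
}
```

A jump size near the square root of the list length is a common choice, since it balances the number of jumps against the length of the final linear scan.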
Dictionary Search
Interpolation Search
Always probe near the place where the key is expected to be, just like looking up a word in a dictionary.
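A sketch of interpolation search, assuming sorted integer keys that are spread roughly evenly. The position estimate is plain linear interpolation between the two ends of the current range; `interpolationSearch` is an illustrative name.

```cpp
#include <vector>

// Interpolation search in a sorted vector.
// Returns the index of k, or -1 if k is absent.
int interpolationSearch(const std::vector<int>& v, int k) {
    int low = 0;
    int high = static_cast<int>(v.size()) - 1;
    while (low <= high && k >= v[low] && k <= v[high]) {
        long long span = static_cast<long long>(v[high]) - v[low];
        if (span == 0)  // all keys in the range are equal; avoid dividing by 0
            return v[low] == k ? low : -1;
        // Estimate where k should sit if keys were evenly distributed
        long long offset = (static_cast<long long>(k) - v[low]) * (high - low) / span;
        int pos = low + static_cast<int>(offset);
        if (v[pos] == k) return pos;
        if (v[pos] < k) low = pos + 1;
        else high = pos - 1;
    }
    return -1;
}
```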
Direct Access by Key Value
Hashing
Hash Function & Hash Table
- Hash Function: A hash function ($h(k)$) maps key values ($k$) to positions
  - $h()$ must return a value within the hash table range
  - The remainder operator (%) can be used
- Hash Table: A hash table (HT) is an array that holds the records
  - HT has $M$ slots, indexed from $0$ to $M-1$
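As a small illustration of the remainder approach, the sketch below maps integer keys into the range $0$ to $M-1$. The name `hashMod` is hypothetical, and the handling of negative keys is an addition that the notes do not discuss.

```cpp
// Maps any integer key to a slot index in [0, M-1]; M is assumed positive.
int hashMod(int k, int M) {
    int r = k % M;             // in C++ the sign of r follows the sign of k
    return r < 0 ? r + M : r;  // shift negative remainders into range
}
```

With $M = 10$, the keys 1007 and 27 both map to slot 7, which is exactly the situation the collision strategies have to resolve.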
// Insert e into hash table HT (collisions are not handled yet)
bool hashInsert(const Elem& e) {
    int pos = h(getkey(e)); // Home position of e
    HT[pos] = e;            // Insert e
    return true;
}
// Find the position of e
int hashSearch(const Elem& e) {
    return h(getkey(e));
}
Collision
Solution 1
Linked List
- The simple way is to define each slot to be the head of a linked list
- The elements within each linked list can be kept in one of several orders:
  - Order of insertion
  - Key value order
    - Good for an unsuccessful search
    - Stop when a key is bigger than the one being searched for
  - Frequency-of-access order
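A minimal sketch of this scheme with each chain kept in key value order, so that an unsuccessful search can stop early. The class name `ChainedTable`, the use of `std::forward_list`, and the hash function `key % M` for non-negative keys are all assumptions made for the example.

```cpp
#include <cstddef>
#include <forward_list>
#include <vector>

class ChainedTable {
public:
    explicit ChainedTable(std::size_t M) : slots_(M) {}

    // Inserts key into its chain in ascending order; false on a duplicate.
    bool insert(int key) {
        std::forward_list<int>& chain = slots_[home(key)];
        auto prev = chain.before_begin();
        for (auto it = chain.begin(); it != chain.end() && *it <= key; ++it) {
            if (*it == key) return false;
            prev = it;
        }
        chain.insert_after(prev, key);
        return true;
    }

    bool contains(int key) const {
        for (int stored : slots_[home(key)]) {
            if (stored == key) return true;
            if (stored > key) return false;  // passed the spot where key would be
        }
        return false;
    }

private:
    // Assumed hash function: non-negative keys, remainder by the table size
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }
    std::vector<std::forward_list<int>> slots_;
};
```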
Bucket Hashing
- $M$ slots are divided into $B$ buckets
- Insertion Algorithm: each record gets a slot in its bucket; a collision pushes it to the next slot, and a full bucket sends it to overflow
  - The record is assigned to the first slot in its bucket
  - If that slot is already occupied, go to the next one
  - If the bucket is full, put the record into overflow
- Searching Algorithm: look in the bucket first, and only then in overflow
  - Hash the key to determine the bucket and search it
  - If the desired key value is not found:
    - If the bucket is not full, the value does not exist
    - Otherwise, search the overflow bucket
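The insertion and search rules above can be sketched as follows. The class `BucketTable` is hypothetical: it models each bucket as a small vector with a fixed capacity, hashes non-negative keys with `key % B`, and does not support deletion (removing a record would invalidate the "bucket not full" shortcut).

```cpp
#include <cstddef>
#include <vector>

class BucketTable {
public:
    // B buckets with slotsPerBucket slots each, so M = B * slotsPerBucket.
    BucketTable(std::size_t B, std::size_t slotsPerBucket)
        : buckets_(B), capacity_(slotsPerBucket) {}

    void insert(int key) {
        std::vector<int>& bucket = buckets_[bucketOf(key)];
        if (bucket.size() < capacity_) bucket.push_back(key);  // next free slot
        else overflow_.push_back(key);                          // bucket is full
    }

    bool contains(int key) const {
        const std::vector<int>& bucket = buckets_[bucketOf(key)];
        for (int stored : bucket)
            if (stored == key) return true;
        if (bucket.size() < capacity_) return false;  // not full: nothing overflowed
        for (int stored : overflow_)
            if (stored == key) return true;
        return false;
    }

    std::size_t overflowSize() const { return overflow_.size(); }

private:
    std::size_t bucketOf(int key) const {
        return static_cast<std::size_t>(key) % buckets_.size();
    }
    std::vector<std::vector<int>> buckets_;
    std::size_t capacity_;
    std::vector<int> overflow_;
};
```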
Solution 2: Probing
Probe Function & Probe Sequence
- Probe Function: $p(k, i)$ ($i$: the number of trials)
- Probe Sequence: $(\beta_0, \beta_1, \ldots)$, $\beta_i = (h(k) + p(k, i)) \bmod M$
  - $\beta_0 = h(k)$
  - $\beta_1 = (h(k) + p(k, 1)) \bmod M$
  - $\beta_2 = (h(k) + p(k, 2)) \bmod M$
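To make the definition concrete, the sketch below lists the first few slots of a probe sequence. The function `probeSequence` is hypothetical; the probe function is passed in as `p(i)` with the key already bound, and the home slot is taken as the value of $h(k)$.

```cpp
#include <functional>
#include <vector>

// First `count` slots of the probe sequence that starts at `home`.
// p(i) is the probe function for one fixed key; it is never called with i = 0.
std::vector<int> probeSequence(int home, int M, int count,
                               const std::function<int(int)>& p) {
    std::vector<int> seq;
    for (int i = 0; i < count; ++i) {
        int offset = (i == 0) ? 0 : p(i);  // beta_0 is the home slot itself
        seq.push_back((home + offset) % M);
    }
    return seq;
}
```

With `p(i) = i`, home slot 7 and $M = 10$, the first five slots are 7, 8, 9, 0, 1.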
Linear Probing
Move one slot further at a time until a free slot is found. This approach tends to make records pile up in clusters.
- $p()$ is a linear function
- Linear probing simply goes to the next slot in the table
- Past the bottom, wrap around to the top
Linear Probing by Steps
To reduce the clustering described above, skip a fixed number of slots on each probe.
- Try the $c$-th next slot rather than the next slot: $p(k, i) = i \times c$
- $c$ should be relatively prime to $M$, that is, $c$ and $M$ must share no common factor other than 1; otherwise part of the table is never probed
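The effect of the relatively prime requirement can be checked by counting how many distinct slots a constant step $c$ reaches. The helper `slotsReached` is hypothetical and assumes small positive values of $M$ and $c$.

```cpp
#include <vector>

// Number of distinct slots reached by p(i) = i * c, starting from slot 0,
// over M probes.
int slotsReached(int M, int c) {
    std::vector<bool> seen(M, false);
    int distinct = 0;
    for (int i = 0; i < M; ++i) {
        int pos = (i * c) % M;
        if (!seen[pos]) {
            seen[pos] = true;
            ++distinct;
        }
    }
    return distinct;
}
```

For $M = 10$, a step of 3 reaches all 10 slots, while a step of 2 reaches only 5 and a step of 5 only 2.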
Random Probing
- $p(k, i) = \mathrm{random}()$ would make searching impossible, because the probe sequence could not be reproduced; a random sequence is therefore fixed in advance
- The values in the probe sequence $(r_1, r_2, \ldots, r_{M-1})$ are selected from $1$ to $M-1$ randomly without replacement, and $p(k, i) = r_i$:
  - For $M = 10$: 8, 6, 1, 7, 2, 4, 5, 9, 3
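One possible way to fix the sequence in advance is to shuffle the values $1$ to $M-1$ once with a seeded generator and reuse the result for every insert and search. The helpers `makePermutation` and `probeSlot` are hypothetical, and the seed is an arbitrary choice.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Builds a fixed permutation of 1..M-1; the same seed gives the same order.
std::vector<int> makePermutation(int M, unsigned seed) {
    std::vector<int> perm(M - 1);
    std::iota(perm.begin(), perm.end(), 1);  // 1, 2, ..., M-1
    std::mt19937 rng(seed);
    std::shuffle(perm.begin(), perm.end(), rng);
    return perm;
}

// i-th slot of the probe sequence: home for i = 0, otherwise
// (home + r_i) mod M, where r_i = perm[i - 1] and 1 <= i <= M-1.
int probeSlot(int home, int i, int M, const std::vector<int>& perm) {
    return i == 0 ? home : (home + perm[i - 1]) % M;
}
```

Using the sequence listed for $M = 10$, a key with home slot 4 probes slot 4 first, then $(4 + 8) \bmod 10 = 2$, then $(4 + 6) \bmod 10 = 0$.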
Double Probing
Two hash functions are applied ($h()$ and $h_2()$)
- $h()$: determines the initial position (home): $h(k) = \text{home}$
- $h_2()$: determines the interval of each step: $p(k, i) = i \times h_2(k)$
- Be sure that all probe sequence constants ($h_2(k)$) are relatively prime to $M$
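A small sketch of the idea, assuming non-negative integer keys and a prime table size $M$. The particular second hash function, $1 + (k \bmod (M-1))$, is one common choice and is not taken from these notes; with $M$ prime, every step it produces is relatively prime to $M$.

```cpp
// Home position; keys are assumed non-negative.
int h(int k, int M) { return k % M; }

// Step size in [1, M-1]; relatively prime to M whenever M is prime.
int h2(int k, int M) { return 1 + k % (M - 1); }

// i-th slot of the probe sequence for key k: (h(k) + i * h2(k)) mod M
int doubleHashSlot(int k, int i, int M) {
    return (h(k, M) + i * h2(k, M)) % M;
}
```

With $M = 11$, the keys 25 and 14 share home slot 3, but their second probes differ: 25 steps by 6 to slot 9, while 14 steps by 5 to slot 8.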
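// Insert e into the hash table; returns false if e is already present
template <class Key, class Elem, class KEComp, class EEComp>
bool hashdict<Key, Elem, KEComp, EEComp>::
hashInsert(const Elem& e) {
    int home;                       // Home position for e
    int pos = home = h(getkey(e));  // Start at the home position
    for (int i = 1; !(EEComp::eq(EMPTY, HT[pos])); i++) {
        // Check the occupied slot before moving on, so that a duplicate
        // sitting in the home slot is caught as well
        if (EEComp::eq(e, HT[pos])) return false;  // Duplicate
        pos = (home + p(getkey(e), i)) % M;        // Next slot in the sequence
    }
    HT[pos] = e;  // Insert e into the first empty slot
    return true;
}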
// Search for the record with Key K
template <class Key, class Elem, class KEComp, class EEComp>
bool hashdict<Key, Elem, KEComp, EEComp>::
hashSearch(const Key& K, Elem& e) const {
    int home;               // Home position for K
    int pos = home = h(K);  // Initial position
    // Probe until K is found or an empty slot ends the sequence
    for (int i = 1;
         !KEComp::eq(K, HT[pos]) && !EEComp::eq(EMPTY, HT[pos]);
         i++)
        pos = (home + p(K, i)) % M;  // Next slot in the sequence
    if (KEComp::eq(K, HT[pos])) {    // Found
        e = HT[pos];
        return true;
    }
    else return false;               // K not found
}
Deletion
Tombstones
Mark a deleted record with a special flag that holds the slot: searches skip over it, and insertions treat it as empty.
- The special flag is called a tombstone
- A tombstone does not stop a search; it only indicates that the slot is available
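A compact sketch of tombstones on top of linear probing. The class `ProbedTable`, its three slot states, and the hash function `key % M` for non-negative keys are assumptions made for illustration. Note that insertion keeps probing past a tombstone until it reaches an empty slot, so that a duplicate further along the sequence is still detected.

```cpp
#include <cstddef>
#include <vector>

class ProbedTable {
    enum class State { Empty, Full, Tombstone };
    struct Slot { int key = 0; State state = State::Empty; };

public:
    explicit ProbedTable(std::size_t M) : slots_(M) {}

    // Returns the slot index holding key, or -1 if key is absent.
    int find(int key) const {
        const std::size_t M = slots_.size();
        for (std::size_t i = 0; i < M; ++i) {
            std::size_t pos = (home(key) + i) % M;
            const Slot& s = slots_[pos];
            if (s.state == State::Empty) return -1;  // a truly empty slot ends the search
            if (s.state == State::Full && s.key == key) return static_cast<int>(pos);
            // tombstone or a different key: keep probing
        }
        return -1;
    }

    bool insert(int key) {
        const std::size_t M = slots_.size();
        std::size_t target = M;  // first reusable slot seen so far (M = none)
        for (std::size_t i = 0; i < M; ++i) {
            std::size_t pos = (home(key) + i) % M;
            const Slot& s = slots_[pos];
            if (s.state == State::Full) {
                if (s.key == key) return false;      // duplicate
            } else {
                if (target == M) target = pos;       // remember tombstone or empty slot
                if (s.state == State::Empty) break;  // key cannot lie further on
            }
        }
        if (target == M) return false;               // table is full
        slots_[target].key = key;
        slots_[target].state = State::Full;
        return true;
    }

    bool remove(int key) {
        int pos = find(key);
        if (pos < 0) return false;
        slots_[pos].state = State::Tombstone;        // keep the probe chain intact
        return true;
    }

private:
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }
    std::vector<Slot> slots_;
};
```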
Local Reorganization
Fill the gap by moving later records back one at a time.
- After deleting a record, follow the probe sequence
- Move the records that belong to the same sequence back to overwrite the deleted one
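For linear probing specifically, local reorganization can be sketched as a backward shift: after a slot is emptied, each following record in the cluster is moved into the hole unless its home position lies between the hole and its current slot. The function `removeAndCompact`, the `EMPTY` marker, and the hash function `key % M` for non-negative keys are assumptions; other probe functions need a different procedure.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;  // assumed marker for a free slot; keys are non-negative

// Deletes the record at pos from a linear-probing table with h(k) = k mod M
// and closes the gap so that later searches still succeed.
void removeAndCompact(std::vector<int>& table, std::size_t pos) {
    const std::size_t M = table.size();
    std::size_t hole = pos;
    table[hole] = EMPTY;
    for (std::size_t j = (hole + 1) % M; table[j] != EMPTY; j = (j + 1) % M) {
        std::size_t home = static_cast<std::size_t>(table[j]) % M;
        // The record at j may stay if its home lies cyclically in (hole, j]
        bool reachable = (hole <= j) ? (hole < home && home <= j)
                                     : (hole < home || home <= j);
        if (reachable) continue;
        table[hole] = table[j];  // otherwise move it back into the hole
        table[j] = EMPTY;
        hole = j;
    }
}
```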
Rehash
- Reinsert all records into a new hash table
- The new hash table can have the same structure or a larger size
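A minimal sketch of rehashing, assuming an integer table with linear probing, the markers `EMPTY` and `TOMBSTONE` for unused and deleted slots, and the hash function `key % M` for non-negative keys. These names are illustrative, and the new size is assumed to be at least the number of live records.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;      // assumed marker for a never-used slot
const int TOMBSTONE = -2;  // assumed marker for a deleted record

// Reinserts every live record of `old` into a fresh table with newM slots.
std::vector<int> rehash(const std::vector<int>& old, std::size_t newM) {
    std::vector<int> fresh(newM, EMPTY);
    for (int key : old) {
        if (key == EMPTY || key == TOMBSTONE) continue;  // nothing to carry over
        std::size_t pos = static_cast<std::size_t>(key) % newM;
        while (fresh[pos] != EMPTY) pos = (pos + 1) % newM;  // linear probing
        fresh[pos] = key;
    }
    return fresh;
}
```

Rehashing also discards all tombstones, so it restores search performance after many deletions.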