"Algorithms" notes 7: symbol tables, sequential search, binary search

  • Symbol table
    • API
    • Ordered symbol table
    • Cost Model
  • Sequential search in an unordered linked list
    • Implementation
    • Performance
  • Binary search in an ordered array
    • Implementation
    • Performance

Modern computers and networks give people access to vast amounts of information, and all kinds of computing devices generate a steady stream of new information. The ability to retrieve this information efficiently has become an important prerequisite for working with it. These notes cover several classic search algorithms.

Symbol table

A symbol table is an abstract table for storing information whose main purpose is to associate a value with a key. A key-value pair can be inserted into the symbol table, and the value associated with a given key can later be retrieved directly from among all the pairs in the table. A symbol table is also called a dictionary.

API

The most basic symbol-table operations are insert and search; a few further operations are included for the convenience of algorithms. To implement a symbol table, we must first define the data structure behind it and then specify the algorithms that create and manipulate this data structure to carry out insert and search.
The symbol table API is as follows:

public class ST<Key,Value>{
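    ST() // create a symbol table
    void put(Key key,Value val)  // put the key-value pair into the table (remove key from the table if the value is null)
    Value get(Key key)  // value paired with key (null if key is absent)
    void delete(Key key)  // remove key and its value from the table
    boolean contains(Key key)  // is there a value paired with key?
    boolean isEmpty() // is the table empty?
    int size()  // number of key-value pairs in the table
    Iterable<Key> keys()  // all the keys in the table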
}

The symbol table design follows these rules: it is implemented with generics; each key is associated with exactly one value, and when a pair is stored whose key is already in the table, the new value replaces the old one, so the table never contains duplicate keys; null keys and null values are not allowed, so get returns null only when the key has no corresponding value in the table.
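
These rules can be illustrated with a small sketch. TinyST is a hypothetical class that does not appear in the original notes; it delegates to java.util.HashMap purely to show the API conventions (overwrite on a duplicate key, null from get on a miss, put with a null value acting as a delete), not how a symbol table should be implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the symbol-table conventions; backed by java.util.HashMap.
class TinyST<Key, Value> {
    private final Map<Key, Value> map = new HashMap<>();

    // Store the pair; a null value removes the key, as the API comment specifies.
    public void put(Key key, Value val) {
        if (key == null) throw new IllegalArgumentException("key is null");
        if (val == null) { map.remove(key); return; }
        map.put(key, val);   // an existing key keeps a single entry; the old value is replaced
    }

    // Return the value for key, or null if the key is absent.
    public Value get(Key key) {
        if (key == null) throw new IllegalArgumentException("key is null");
        return map.get(key);
    }

    public void delete(Key key)      { put(key, null); }
    public boolean contains(Key key) { return get(key) != null; }
    public boolean isEmpty()         { return size() == 0; }
    public int size()                { return map.size(); }
    public Iterable<Key> keys()      { return map.keySet(); }

    public static void main(String[] args) {
        TinyST<String, Integer> st = new TinyST<>();
        st.put("a", 1);
        st.put("a", 2);                       // overwrites, no duplicate key
        System.out.println(st.get("a"));      // 2
        System.out.println(st.size());        // 1
        System.out.println(st.get("b"));      // null
        st.put("a", null);                    // same effect as delete("a")
        System.out.println(st.contains("a")); // false
    }
}
```

Because null is never stored as a value, contains can be written in terms of get, which is the reason for the rule.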

Ordered symbol table

Symbol tables are divided into ordered and unordered ones according to whether their keys are kept in order. Based on the ordering of the keys, an ordered symbol table can implement several more useful operations.

Key floor(Key key)  // largest key less than or equal to key
Key ceiling(Key key)  // smallest key greater than or equal to key
int rank(Key key) // number of keys less than key
Key select(int k)  // key of rank k
Key max()  // largest key
Key min()  // smallest key

The max and min operations give an ordered symbol table functionality similar to a priority queue, except that a priority queue may contain duplicate keys while a symbol table may not.
The rank and select methods can be used to test whether a new key would be inserted at the right position. For every i from 0 to size() - 1, i = rank(select(i)), and for every key in the table, key = select(rank(key)). The floor and ceiling operations are analogous to rounding a real number down and up, respectively.
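
The relationship between these operations can be sketched on a plain sorted array. OrderedOps and its static methods are hypothetical helpers written for this illustration; they assume a sorted array of distinct String keys and are not part of the API above.

```java
// Hypothetical helper: ordered operations on a sorted array of distinct keys.
class OrderedOps {
    // Number of keys strictly smaller than key (binary search).
    static int rank(String[] a, String key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int cmp = key.compareTo(a[mid]);
            if (cmp < 0) hi = mid - 1;
            else if (cmp > 0) lo = mid + 1;
            else return mid;
        }
        return lo;
    }

    // Key of rank k, i.e. the key with exactly k smaller keys.
    static String select(String[] a, int k) { return a[k]; }

    // Largest key <= key, or null if there is none.
    static String floor(String[] a, String key) {
        int i = rank(a, key);
        if (i < a.length && a[i].equals(key)) return a[i];
        return i == 0 ? null : a[i - 1];
    }

    // Smallest key >= key, or null if there is none.
    static String ceiling(String[] a, String key) {
        int i = rank(a, key);
        return i == a.length ? null : a[i];
    }

    public static void main(String[] args) {
        String[] keys = {"B", "D", "F", "H"};
        System.out.println(rank(keys, "E"));                // 2
        System.out.println(floor(keys, "E"));               // D
        System.out.println(ceiling(keys, "E"));             // F
        System.out.println(select(keys, rank(keys, "F")));  // F
    }
}
```

Once rank is available, floor and ceiling each need only one call to it plus a constant amount of extra work.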

Cost Model

Inserting into or searching a symbol table requires comparing the search key with keys in the table. When studying symbol-table implementations, we count the number of compares to analyze the cost of an implementation. In the rare cases where an implementation performs very few compares, we count its accesses to the data structure instead.

Sequential search in an unordered linked list

A linked list can serve as a simple symbol-table implementation in which each node stores one key-value pair. The get() method traverses the list, comparing the search key with the key in each node in turn; on a match it returns that node's value, otherwise it returns null. The put() method also traverses the list; if it finds a node with a matching key, it overwrites that node's value with the value being inserted, otherwise it adds a new node at the head of the list.

Implementation

public class SequentialSearchST<Key, Value> {
    private Node first;
    private int n;

    private class Node {
        Key key;
        Value val;
        Node next;

        public Node(Key key, Value val, Node next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    public Value get(Key key) {
        if (key == null)
            throw new IllegalArgumentException("argument to get() is null");
        for (Node x = first; x != null; x = x.next) {
            if (key.equals(x.key)) {
                return x.val;
            }
        }
        return null;
    }

    public void put(Key key, Value value) {
        if (key == null)
            throw new IllegalArgumentException("first argument to put() is null");

        for (Node x = first; x != null; x = x.next) {
            if (key.equals(x.key)) {
                 x.val=value;
                 return;
            }
        }
        first = new Node(key, value, first);
        n++;
    }
    
    ...
}

Performance

For an unordered linked-list symbol table containing N key-value pairs:
A search miss takes N compares, because every key in the list must be examined.
An insertion of a key that is not yet in the table also takes N compares.
A search hit takes N compares in the worst case, but on average a hit needs fewer. To estimate the average number of compares for a hit, add up the number of compares needed to find each key in the table and divide by N. This assumes that every key in the table is equally likely to be searched for, which is known as a random hit. Real access patterns are unlikely to be perfectly random, but the model is usually a good approximation.
Under the random-hit model, the total number of compares = compares for the first key + compares for the second key + ... + compares for the N-th key = 1 + 2 + ... + N = N(N + 1) / 2, so the average number of compares is (N + 1) / 2. A hit examines about half of the elements.
When N distinct keys are inserted into an empty table, each insertion compares the new key with every key already inserted, for a total of 0 + 1 + ... + (N - 1) = N(N - 1) / 2 compares, about N^2 / 2. The cost grows quadratically.
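
These counts can be checked with a small experiment. SequentialCompareCount is a hypothetical class written for this illustration; it stores int keys only and counts one compare per node visited. The i-th insertion scans the i - 1 keys already present, so the exact insertion total is N(N - 1) / 2, which is about N^2 / 2.

```java
// Hypothetical experiment: count key compares in an unordered linked-list table.
class SequentialCompareCount {
    private static class Node {
        int key; Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    private Node first;
    long compares;   // equality tests performed so far

    // Returns true on a search hit, counting one compare per node visited.
    boolean contains(int key) {
        for (Node x = first; x != null; x = x.next) {
            compares++;
            if (x.key == key) return true;
        }
        return false;
    }

    // Insert as put() does: scan the whole list first, then add at the head on a miss.
    void put(int key) {
        if (!contains(key)) first = new Node(key, first);
    }

    // Total compares to insert n distinct keys into an empty table.
    static long insertCost(int n) {
        SequentialCompareCount st = new SequentialCompareCount();
        for (int k = 0; k < n; k++) st.put(k);
        return st.compares;
    }

    // Total compares to search once for each of the n keys (random-hit model).
    static long hitCost(int n) {
        SequentialCompareCount st = new SequentialCompareCount();
        for (int k = 0; k < n; k++) st.put(k);
        st.compares = 0;
        for (int k = 0; k < n; k++) st.contains(k);
        return st.compares;
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println(insertCost(n));            // 4950 = N(N-1)/2
        System.out.println(hitCost(n) / (double) n);  // 50.5 = (N+1)/2
    }
}
```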

Binary search in an ordered array

This symbol table is based on an ordered array. The data structure used here is a pair of parallel arrays, one storing the keys and one storing the values; alternatively, a single array of key-value pair objects could be used.

Implementation

public class BinarySearchST<Key extends Comparable<Key>, Value> {
    private Key[] keys;
    private Value[] vals;
    private int N;

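    @SuppressWarnings("unchecked")
    public BinarySearchST(int capacity) {
        // Java does not allow generic array creation, so create raw arrays and cast.
        keys = (Key[]) new Comparable[capacity];
        vals = (Value[]) new Object[capacity];
    }

    // Number of key-value pairs currently stored.
    public int size() {
        return N;
    }

    // Needed by get(), which calls isEmpty() before searching.
    public boolean isEmpty() {
        return N == 0;
    }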

    public Value get(Key key) {
        if (isEmpty())
            return null;
        int i = rank(key);
        if (i < N && keys[i].compareTo(key) == 0) {
            return vals[i];
        } else {
            return null;
        }
    }

    public void put(Key key, Value val) {
        int i = rank(key);
        if (i < N && keys[i].compareTo(key) == 0) {
            vals[i] = val;
            return;
        }
        for (int j = N; j > i; j--) {
            keys[j] = keys[j - 1];
            vals[j] = vals[j - 1];
        }
        keys[i] = key;
        vals[i] = val;
        N++;
    }

    public int rank(Key key) {
        // return rankRecursion(key, 0, N - 1);
        return rankIteration(key, 0, N - 1);
    }

    public int rankRecursion(Key key, int lo, int hi) {
        // if (hi <= lo)
        if (hi < lo)
            return lo;
        int mid = lo + (hi - lo) / 2;
        int cmp = key.compareTo(keys[mid]);
        if (cmp < 0) {
            return rankRecursion(key, lo, mid - 1);
        } else if (cmp > 0) {
            return rankRecursion(key, mid + 1, hi);
        } else {
            return mid;
        }
    }

    public int rankIteration(Key key, int lo, int hi) {
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int cmp = key.compareTo(keys[mid]);
            if (cmp < 0) {
                hi = mid - 1;
            } else if (cmp > 0) {
                lo = mid + 1;
            } else {
                return mid;
            }
        }
        return lo;
    }

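    public Iterable<Key> keys() {
        // An empty table has no smallest or largest key to delegate with.
        if (N == 0)
            return new Queue<Key>();
        return keys(keys[0], keys[N - 1]);
    }

    // Returns all keys in [lo, hi] in ascending order.
    // Note: a Queue with enqueue() is not java.util.Queue; it is presumably the
    // FIFO queue class from the book's library (edu.princeton.cs.algs4.Queue).
    public Iterable<Key> keys(Key lo, Key hi) {
        if (lo == null)
            throw new IllegalArgumentException("first argument to keys() is null");
        if (hi == null)
            throw new IllegalArgumentException("second argument to keys() is null");

        Queue<Key> queue = new Queue<Key>();
        if (lo.compareTo(hi) > 0)
            return queue;
        // rank() takes a single argument in this class.
        for (int i = rank(lo); i < rank(hi); i++)
            queue.enqueue(keys[i]);
        if (get(hi) != null)
            queue.enqueue(keys[rank(hi)]);
        return queue;
    }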
}

The core of the implementation is the rank method, which returns the number of keys in the table that are smaller than the given key. For get, if the given key is in the table, rank tells us exactly where to find it. For put, if the given key is in the table, rank tells us where to update the value; if it is not, rank tells us at which position the new key-value pair should be inserted. Before inserting, all larger keys are moved back one position to make room.

Because the array is ordered, rank can locate a key quickly by repeated halving. The search key is first compared with the middle key of the current subarray: if it is smaller than the middle key, the search continues in the left subarray; if it is larger, the search continues in the right subarray; otherwise the middle key is the one being sought.
rankRecursion and rankIteration are the recursive and iterative implementations of this algorithm. Because put uses rank to find the position at which to insert each key-value pair, the symbol table always stays ordered.
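
To make the halving visible, the following sketch records which indices the iterative search probes. RankTrace is a hypothetical class written for this illustration; it uses int keys and the same loop structure as rankIteration, with an extra list parameter that is not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracer: records which indices an iterative rank() probes.
class RankTrace {
    // Same loop as rankIteration, but appends every probed index to probes.
    static int rank(int[] keys, int key, List<Integer> probes) {
        int lo = 0, hi = keys.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            probes.add(mid);
            if (key < keys[mid]) hi = mid - 1;
            else if (key > keys[mid]) lo = mid + 1;
            else return mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] keys = {10, 20, 30, 40, 50, 60, 70};
        List<Integer> probes = new ArrayList<>();
        System.out.println(rank(keys, 60, probes) + " " + probes);  // 5 [3, 5]
        probes.clear();
        System.out.println(rank(keys, 35, probes) + " " + probes);  // 3 [3, 1, 2]
    }
}
```

For the miss on 35 the returned rank 3 is both the number of smaller keys and the index at which 35 would have to be inserted to keep the array ordered.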

Performance

When binary search runs on an ordered array of N keys, it first compares with the middle element and then continues in one of the two halves of the remaining N - 1 elements, which never holds more than N / 2 elements (rounded down). Writing C(N) for the number of compares, this gives the recurrence:
C(N) <= C(N / 2) + 1, where the 1 counts the compare with the middle element;
with C(0) = 0 and C(1) = 1.
Assume N is exactly a power of 2, i.e. N = 2^n, n = lg N.
Then C(2^n) <= C(2^(n - 1)) + 1.
Applying this repeatedly until the exponent reaches 0 gives C(2^n) <= C(2^0) + n = 1 + n.
Substituting N = 2^n, n = lg N into this inequality gives:
C(N) <= 1 + lg N.
That is, when N is exactly a power of 2, a search takes at most lg N + 1 compares.
Extending this to general N, the number of compares for a search grows logarithmically.
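
As a rough empirical check of the logarithmic bound, the sketch below counts three-way compares over every possible hit and miss. BinarySearchBound and its helpers are hypothetical names; the test keys 1, 3, 5, ... are chosen only so that each miss position can be exercised with an even number. It relies on the standard result that the worst case for general N >= 1 is floor(lg N) + 1 compares.

```java
// Hypothetical check of the bound: binary search uses at most floor(lg N) + 1 compares.
class BinarySearchBound {
    // Number of three-way compares binary search makes for key in the sorted array a.
    static int compares(int[] a, int key) {
        int lo = 0, hi = a.length - 1, count = 0;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            count++;
            if (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return count;
        }
        return count;
    }

    // Worst case over every hit and every miss for the keys 1, 3, 5, ..., 2n-1.
    static int worstCase(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = 2 * i + 1;
        int worst = 0;
        for (int key = 0; key <= 2 * n; key++)
            worst = Math.max(worst, compares(a, key));
        return worst;
    }

    // floor(lg n) for n >= 1.
    static int floorLg(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 7, 8, 1000, 1024})
            System.out.println(n + ": " + worstCase(n) + " <= " + (floorLg(n) + 1));
    }
}
```
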
Search is logarithmic and therefore very fast, but how does insertion perform?
In the worst case the insertion position is at the very beginning of the array, and every element must be moved one position. Both the key array and the value array are shifted, for about 2N array accesses. The rank call made before the insertion costs about lg N compares, which is negligible next to 2N, so one insertion needs ~2N array accesses.
When N elements are inserted into an empty symbol table and, in the worst case, every insertion lands at the beginning of the array, the total is 2 * N(N - 1) / 2 = N(N - 1), about ~N^2 array accesses. A single insertion is therefore linear, and building an ordered symbol table this way takes quadratic time. With linear insertion and quadratic construction, this implementation cannot be used to solve large-scale problems.
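
The quadratic total can be reproduced by counting element moves directly. OrderedInsertCost is a hypothetical class written for this illustration; it uses int arrays in place of the generic key and value arrays and inserts keys in decreasing order, so that every insertion hits the worst case at index 0.

```java
// Hypothetical experiment: count element moves caused by shifting during ordered insertion.
class OrderedInsertCost {
    // Inserts n, n-1, ..., 1 so that every key lands at index 0 (the worst case)
    // and returns how many element moves the keys and vals arrays needed together.
    static long worstCaseMoves(int n) {
        int[] keys = new int[n];
        int[] vals = new int[n];
        int size = 0;
        long moves = 0;
        for (int key = n; key >= 1; key--) {
            int i = 0;   // the new key is smaller than all present keys, so its rank is 0
            for (int j = size; j > i; j--) {
                keys[j] = keys[j - 1];
                vals[j] = vals[j - 1];
                moves += 2;   // one move in each of the two parallel arrays
            }
            keys[i] = key;
            vals[i] = key;
            size++;
        }
        return moves;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseMoves(1000));   // 999000 = N(N-1)
    }
}
```

Doubling N roughly quadruples the count, which is the behaviour expected from quadratic growth.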

Origin www.cnblogs.com/zhixin9001/p/11574407.html