Time Complexity Analysis in Algorithm Interviews

Example: Given an array of strings, first sort each string in the array alphabetically, and then sort the entire array lexicographically. What is the time complexity of the whole operation?

Answer: Suppose the longest string has length s and there are n strings in the array.
Sorting each string costs O(s log s); there are n strings, so O(n * s log s) in total.
Sorting the array of strings costs O(s * n log n): the sort performs O(n log n) comparisons, and each string comparison costs at most O(s).

==> O(n * s log s) + O(s * n log n) = O(sn(logn + logs))
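
For concreteness, a minimal sketch of the operation described above (the function name sortStringArray is illustrative, not from the problem statement):

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

void sortStringArray(vector<string>& arr) {
    // Step 1: sort each string alphabetically.
    // O(s log s) per string, n strings => O(n * s log s)
    for (string& str : arr)
        sort(str.begin(), str.end());
    // Step 2: sort the whole array lexicographically.
    // O(n log n) comparisons, each costing up to O(s) => O(s * n log n)
    sort(arr.begin(), arr.end());
}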

Algorithmic complexity is, in some cases, dependent on the use case.

Have a sense of data size -- back-of-the-envelope estimation

Test with the following program:

#include <iostream>
#include <cmath>
#include <ctime>
using namespace std;

int main() {
    for (int x = 1; x <= 9; x++) {
        int n = pow(10, x);

        clock_t startTime = clock();
        long long sum = 0;          // long long to avoid signed overflow
        for (int i = 0; i < n; i++)
            sum += i;
        clock_t endTime = clock();

        cout << "10^" << x << " : " << double(endTime - startTime) / CLOCKS_PER_SEC << " s" << endl;
    }
    return 0;
}

This is an O(n) algorithm running on a 4-core i7 machine; the results are as follows:

10^1 : 0 s
10^2 : 0 s
10^3 : 0 s
10^4 : 0 s
10^5 : 0 s
10^6 : 0 s
10^7 : 0.03125 s
10^8 : 0.25 s
10^9 : 2.4375 s

In other words, if the program must produce its result within 1 second, the data size should not exceed 10^8, not 10^9. As a rough rule of thumb:
O(n^2): can handle data on the order of 10^4
O(nlogn): can handle data on the order of 10^7
O(n): can handle data on the order of 10^8
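
The same experiment can be repeated for an O(n^2) double loop to see roughly where the 1-second budget lands; a sketch, assuming the same machine as above (the exact numbers are hardware-dependent):

#include <iostream>
#include <cmath>
#include <ctime>
using namespace std;

int main() {
    // Time a simple O(n^2) double loop; stop at n = 10^4
    for (int x = 1; x <= 4; x++) {
        int n = pow(10, x);
        clock_t startTime = clock();
        long long sum = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                sum += j;
        clock_t endTime = clock();
        cout << "10^" << x << " : "
             << double(endTime - startTime) / CLOCKS_PER_SEC << " s" << endl;
    }
    return 0;
}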

Common Complexity Analysis

\(O(1)\)

void swap(int &a, int& b){
    int tmp = a; a = b; b = tmp;
}

\(O(n)\) --- the constant coefficient is not necessarily 1

int sum(int n) {
    int sum = 0;
    for(int i = 1; i <= n; i++) {
        sum += i;    // n iterations => O(n)
    }
    return sum;
}

\(O(n^{2})\)

// Selection sort
for(int i = 0; i < n; i++){
    int minIndex = i;
    for(int j = i + 1; j < n; j++){
        if(arr[j] < arr[minIndex]) {
            minIndex = j;
        }
    }
    swap(arr[i], arr[minIndex]);   // place the i-th smallest element at position i
}

\(O(logn)\)

// lower_bound: first position in [lo, hi) whose value is >= key
int binSearch(const vector<int>& nums, int lo, int hi, int key) {
    while(lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if(nums[mid] < key) {
            lo = mid + 1;   // nums[mid] is too small, discard [lo, mid]
        }
        else{
            hi = mid;       // nums[mid] may be the answer, keep it
        }
    }
    return lo;
}

Consider the following example: how many times can n be divided by 10 before it becomes 0? The answer is about \(log_{10}n\) times.

// Warning!! this code is still incomplete:
// it does not handle num == 0 or negative numbers
string intToString(int num) {
    string s = "";
    while(num) {
        s += '0' + num % 10;     // one digit per iteration => log10(num) iterations
        num /= 10;
    }
    reverse(s.begin(), s.end()); // digits were appended in reverse order
    return s;
}

There is no difference in order of magnitude between base-2 and base-10 logarithms, since \(log_{2}n = \frac{log_{10}n}{log_{10}2}\); they differ only by a constant factor.

\(O(nlogn)\)

Although the following code is also a double loop, its complexity is \(O(nlogn)\), because the outer loop variable grows geometrically (it doubles each time), so the outer loop runs only \(logn\) times while the inner loop does \(O(n)\) work:

void hello(int n) {
    for(int sz = 1; sz < n; sz += sz) {
        for(int i = 1; i < n; i++){
            cout << "Hello" << endl;
        }
    }
}

Complexity Experiment

What if I clearly wrote an \(O(nlogn)\) algorithm, but the interviewer says it is \(O(n^{2})\)?

You can verify it yourself: see what data scale it can handle and compare against the back-of-the-envelope estimates above.
Experiment and observe the trend, for example by doubling the data size each time and observing how the running time changes, as in the sketch below.
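
A sketch of such a doubling experiment (testFunc is a hypothetical stand-in for the algorithm under test): if the time roughly doubles when n doubles, the algorithm behaves like O(n) or O(nlogn); if it roughly quadruples, it behaves like O(n^2).

#include <iostream>
#include <ctime>
using namespace std;

// Stand-in O(n^2) workload; replace with the algorithm under test
long long testFunc(int n) {
    long long sum = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            sum += j;
    return sum;
}

int main() {
    for (int n = 1000; n <= 16000; n *= 2) {
        clock_t start = clock();
        testFunc(n);
        double t = double(clock() - start) / CLOCKS_PER_SEC;
        // ~2x time per doubling => O(n)/O(nlogn); ~4x per doubling => O(n^2)
        cout << "n = " << n << " : " << t << " s" << endl;
    }
    return 0;
}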

Complexity Analysis of Recursive Algorithms

For a function that makes a single recursive call, the complexity is generally \(O(T*depth)\), where T is the work done per call; alternatively, write out the recurrence and expand it.

Take the following code:

double pow(double x, int n) {
    assert(n >= 0);
    if(n == 0) return 1;

    double t = pow(x, n / 2);   // single recursive call, n halves each time
    if(n % 2)
        return x * t * t;
    else
        return t * t;
}

The recursion depth is \(logn\) and \(T = O(1)\), so the complexity is \(O(logn)\).

int f(int n) {
    if(n == 0) return 1;         // base case
    return f(n - 1) + f(n - 1);  // two recursive calls per level
}

With multiple recursive calls the picture changes: here the recursion depth is \(n\) and each call spawns 2 further calls, so the complexity is exponential, \(O(2^n)\):
\[\begin{split}f(n) &= 2f(n-1)\\&= 4f(n-2)\\&= 8f(n-3)\\&\cdots\\&= 2^{n-1}f(1)\\&= O(2^{n})\end{split} \tag{1}\]

A simple (not rigorous) analysis of the complexity of quicksort:

Each partition operation of quicksort establishes one position such that all values to its left are smaller than the pivot and all values to its right are larger. Assuming the two halves are balanced, \(f(n) = 2f(n/2) + O(partition)\), and the partition operation is a single pass over the range, i.e. \(O(n)\), so
\[\begin{split}f(n) &= 2f(n/2) + O(n)\\&= 4f(n/4) + O(n) + 2\cdot O(n/2)\\&= 8f(n/8) + O(n) + 2\cdot O(n/2) + 4\cdot O(n/4)\\&\cdots\\&= 2^{\log_{2}n}\cdot f(1) + \underbrace{O(n) + O(n) +\cdots+ O(n)}_{\log_{2}n}\\&= n + O(n\log_{2}n)\\&= O(n\log_{2}n)\end{split} \tag{2}\]
Of course, a rigorous analysis also needs to introduce probability (the partition is only balanced in expectation, under a random pivot choice); this is just a simple, imprecise derivation.
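
For reference, a minimal sketch of one partition round (a Lomuto-style partition with the first element as pivot; the names partitionStep and quickSort are illustrative). The partition is a single pass over the range, hence O(n):

#include <vector>
#include <utility>
using namespace std;

// One partition pass: returns the pivot's final position;
// everything to its left is smaller, everything to its right is not.
int partitionStep(vector<int>& arr, int lo, int hi) {
    int pivot = arr[lo];
    int j = lo;                              // invariant: arr[lo+1..j] < pivot
    for (int i = lo + 1; i <= hi; i++)
        if (arr[i] < pivot)
            swap(arr[++j], arr[i]);
    swap(arr[lo], arr[j]);
    return j;
}

void quickSort(vector<int>& arr, int lo, int hi) {
    if (lo >= hi) return;
    int p = partitionStep(arr, lo, hi);
    quickSort(arr, lo, p - 1);               // balanced case: f(n) = 2f(n/2) + O(n)
    quickSort(arr, p + 1, hi);
}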

Amortized Complexity Analysis (Amortized Time)

Typical example: every time a dynamic array (vector) is expanded (resize()), new space must be allocated and all existing elements copied over one by one, so such an operation costs \(O(n)\). The question: what is the amortized complexity of vector push_back?

Assume the current array capacity is n. Going from empty to full, each push_back costs \(O(1)\). If one more element is pushed at that point, a resize is needed, so that last operation costs \(O(n)\). The total cost of these n+1 operations is
\[\underbrace{O(1) + O(1) +\cdots+ O(1)}_{n} + O(n) = O(2n)\]
so the amortized cost of each push_back is \(O(\frac{2n}{n+1}) = O(2) = O(1)\).

Now the question again: if pop_back triggers a shrinking resize as soon as the size drops to 1/2 of the current capacity, what is the time complexity?
Assume the current capacity is 2n. Going from full to half-full, each pop_back costs \(O(1)\), and the pop_back that reaches the halfway point triggers a resize costing \(O(n)\); from this perspective the amortized cost is still \(O(1)\).
But there is a pathological case: right after the array fills up, push_back doubles the capacity, which costs \(O(n)\); deleting a single element then brings the size back to the shrink threshold, so the next resize again costs \(O(n)\). Alternating push_back and pop_back around this boundary makes every single operation cost \(O(n)\). This situation is called complexity oscillation.

What is the correct way to do it? In pop_back, wait until the size drops to 1/4 of the capacity before resizing (shrinking to half).
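
A sketch of this strategy (an illustrative toy class, not production code): push_back doubles the capacity when the array is full, and pop_back shrinks to half only when the size drops to 1/4 of the capacity, which avoids the oscillation described above.

#include <cassert>

class DynamicArray {
    int* data;
    int size, capacity;

    void resize(int newCapacity) {           // O(n): allocate new space and copy
        int* newData = new int[newCapacity];
        for (int i = 0; i < size; i++)
            newData[i] = data[i];
        delete[] data;
        data = newData;
        capacity = newCapacity;
    }

public:
    DynamicArray() : data(new int[1]), size(0), capacity(1) {}
    ~DynamicArray() { delete[] data; }

    void push_back(int v) {
        if (size == capacity)                // full: double the capacity, O(n)
            resize(2 * capacity);
        data[size++] = v;
    }

    void pop_back() {
        assert(size > 0);
        size--;
        if (size == capacity / 4 && capacity / 2 >= 1)  // lazy shrink at 1/4
            resize(capacity / 2);
    }
};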

Complexity Analysis of Interview Questions

Assume the dynamic array / hash table currently holds n elements and the vector's initial capacity is 1. How many copy operations have taken place in total since the beginning?

Last resize: n/2 elements copied

Second-to-last resize: n/4 elements copied

Third-to-last resize: n/8 elements copied

...

Second resize: 2 elements copied

First resize: 1 element copied

\[\therefore S(n) = \frac{n}{2} + \frac{n}{4} + \frac{n}{8} +\cdots+4+2+1 \tag{3}\]

\[\Rightarrow 2S(n) = n + \frac{n}{2} + \frac{n}{4} + \frac{n}{8} +\cdots+4+2 \tag{4}\]

Subtracting (3) from (4), it is easy to get

\[S(n) = n - 1 \tag{*}\]

That is to say, the cost of inserting each new element amortizes to \(O(1)\).
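
A quick sketch to check equation (*) empirically: simulate n push_back operations with capacity doubling and count every element copy.

#include <iostream>
using namespace std;

int main() {
    int n = 1 << 20;                  // insert n = 2^20 elements
    long long copies = 0;
    int size = 0, capacity = 1;
    for (int i = 0; i < n; i++) {
        if (size == capacity) {       // resize: every existing element is copied
            copies += size;
            capacity *= 2;
        }
        size++;
    }
    cout << "n = " << n << ", copies = " << copies << endl;   // prints n - 1
    return 0;
}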
