Analysis of Algorithms

Analysis of Algorithms refers to the practice of determining the asymptotic runtime complexity of algorithms, i.e. how the algorithm behaves for large input sizes.  We use Big-Oh notation to represent the runtime as N -> infinity.

Analysis of Iterative Algorithms

Finding the runtime complexity of an iterative algorithm is straightforward.  First, you must classify every step of the algorithm into one of three classes:
  1. a simple statement which takes constant time to compute
  2. a loop which iterates from 0 to N
  3. a call to another function
Simple statements like "a++" or even "a = b*sqrt(x)^y" take constant time to compute, so T(N) = c = O(1), where "c" is some constant (we don't care exactly how long it takes to compute--just that it's constant).  For-loops which iterate N times (or approximately N times) take linear time, so T(N) = N = O(N).  For-loops with something inside take N times the runtime of the inner portion: T_outer(N) = N*T_inner(N).

To find the total runtime complexity of the algorithm, multiply when something is inside a for-loop.  Otherwise, add it.

For example, consider the following simple algorithm:

1: for i = 1 to N do
2:   for j = 1 to N do
3:     a++
4:     foo(a, i, j, N)
5: for k = 1 to N do
6:   b++

Let's say that foo's runtime is lgN.  Applying the rules above, we see that:
  1. the runtime for line 3 is some constant c1
  2. the runtime for line 4 is lgN
  3. the runtime for line 6 is some constant c2
  4. the runtime for lines 3 and 4 together is c1 + lgN
  5. the runtime for lines 2, 3, and 4 is N*(c1 + lgN)
  6. the runtime for lines 1 thru 4 is N*N*(c1 + lgN)
  7. the runtime for lines 5 and 6 is N*c2
  8. the runtime for lines 1 thru 6 is N*N*(c1 + lgN) + N*c2
Simplifying, we see that the overall runtime is T(N) = c1*N^2 + N^2*lgN + c2*N.  Next, we wish to find the asymptotic behavior of this function (the Big-Oh notation).  To do this, notice that c1*N^2 and c2*N are both dominated by N^2*lgN.  In other words, as N -> infinity, the terms c1*N^2 and c2*N become insignificant compared to N^2*lgN (which grows at a much faster rate).  Therefore, the asymptotic runtime complexity of this algorithm is O(N^2*lgN).
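As a sanity check, the example algorithm can be instrumented to count its basic operations.  The sketch below is not part of the original notes: foo is a stand-in that is simply assumed to do about lgN work, matching the runtime given for line 4 above.

```python
import math

def run(N):
    """Run the example algorithm, counting basic operations."""
    ops = 0
    a = b = 0
    for i in range(N):          # line 1: outer loop, N iterations
        for j in range(N):      # line 2: inner loop, N iterations
            a += 1              # line 3: constant-time statement
            ops += 1
            # line 4: foo(a, i, j, N) -- assumed to take about lgN steps
            ops += max(1, int(math.log2(N)))
    for k in range(N):          # line 5: second loop, N iterations
        b += 1                  # line 6: constant-time statement
        ops += 1
    return ops
```

For N a power of two this counts exactly N^2*(1 + lgN) + N operations, so doubling N multiplies the count by somewhat more than 4, as the N^2*lgN term predicts.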

Analysis of Recursive Algorithms

The strategy above cannot be applied to recursive functions, since there are usually no for-loops and the function calls itself, not another known function.  Instead, we make use of recurrence relations like this one:

T(N) = c1 + N + T(N-1)
T(0) = c2

The above runtime is a recurrence because T(N) is defined in terms of itself.  In order to find the asymptotic behavior (Big-Oh notation) of a recurrence like this, we need to find a closed form of the function.  This is done as follows:
  1. Start with the base case T(0)
  2. Write the recurrence relation for T(1) by substituting T(0)
  3. Repeat: write T(k) based on T(k-1)
Eventually, you'll see a pattern as you iterate from 1 to N.  If you can write an expression which matches the pattern, you will have found the closed form you need.
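As an illustration (a sketch, not part of the notes), the recurrence shown above, T(N) = c1 + N + T(N-1) with T(0) = c2, can be unrolled mechanically; the constants c1 and c2 below are placeholder values:

```python
def T(n, c1=1, c2=1):
    """Evaluate T(N) = c1 + N + T(N-1), T(0) = c2, by unrolling
    upward from the base case (the same steps 1-3 listed above)."""
    total = c2                  # step 1: start with the base case T(0)
    for k in range(1, n + 1):   # steps 2-3: build T(k) from T(k-1)
        total += c1 + k
    return total
```

Unrolling this one by hand reveals the pattern T(k) = k*c1 + k(k+1)/2 + c2, so this particular recurrence works out to O(N^2).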

For example, consider the simple factorial function:

factorial(n) :
  if n <= 1
    return 1
  else
    return n * factorial(n-1)

Let's assume the if-statement takes time a and the multiplication takes time b.  Then, the runtime of this function can be written as follows:

T(N) = a + b + T(N-1)
T(0) = a
T(1) = a

Notice that we actually have two base-cases here.  That's okay.  You can just pick one or use both.

Let's start "unrolling" the recurrence relation:

T(0) = a
T(1) = a
T(2) = a + b + T(1) = a + b + a = 2a + b
T(3) = a + b + T(2) = a + b + 2a + b = 3a + 2b
T(4) = a + b + T(3) = a + b + 3a + 2b = 4a + 3b

Now you should see a pattern.  For T(k), the runtime is T(k) = ka + (k-1)b = ka + kb - b

Now that you have the closed form, you can easily find the asymptotic complexity (Big-Oh notation).  Just notice that the terms k*a and k*b are linear in k, whereas -b is constant.  Thus, you have T(N) = N*a + (N-1)*b = O(N).
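The closed form can also be checked numerically.  The sketch below (not from the notes; a = 3 and b = 5 are arbitrary placeholder values) evaluates the recurrence directly and compares it against ka + (k-1)b:

```python
def T(n, a=3, b=5):
    """Runtime recurrence for factorial: T(N) = a + b + T(N-1), T(1) = a."""
    if n <= 1:
        return a                # base case: just the if-statement
    return a + b + T(n - 1)     # if-statement, multiplication, recursive call

# Closed form derived above: T(k) = k*a + (k-1)*b, which is linear in k.
for k in range(1, 10):
    assert T(k) == k * 3 + (k - 1) * 5
```

Since both a*k and b*(k-1) grow linearly, any choice of constants gives the same O(N) behavior.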