Analysis of Algorithms
• An algorithm is a finite set of precise instructions for performing a computation or for solving a
      problem.
                                     What is the goal of analysis of algorithms?
           To compare algorithms mainly in terms of running time but also in terms of other factors (e.g.,
              memory requirements, programmer's effort etc.)
                                   What do we mean by running time analysis?
           Determine how running time increases as the size of the problem increases.
                                                   Input Size
      • Input size (number of elements in the input)
              – size of an array
              – polynomial degree
              – # of elements in a matrix
              – # of bits in the binary representation of the input
              – vertices and edges in a graph
                                                Types of Analysis
      • Worst case
              – Provides an upper bound on running time
              – An absolute guarantee that the algorithm would not run longer, no matter what the inputs
                  are.
      • Best case
              – Provides a lower bound on running time
              – Input is the one for which the algorithm runs the fastest
      • Average case
              – Provides a prediction about the running time
              – Assumes that the input is random
                                       How do we compare algorithms?
•   We need to define a number of objective measures.
          (1) Compare execution times?
                  Not good: times are specific to a particular computer !!
          (2) Count the number of statements executed?
                  Not good: number of statements vary with the programming language as well as the
                  style of the individual programmer.
                                                 Ideal Solution
      • Express running time as a function of the input size n (i.e., f(n)).
      • Compare different functions corresponding to running times.
      • Such an analysis is independent of machine time, programming style, etc.
                                                    Example
      • Associate a "cost" with each statement.
      • Find the "total cost“ by finding the total number of times each statement is executed.
             Algorithm 1                   Algorithm 2
                            Cost                              Cost
           arr[0] = 0;       c1        for(i=0; i<N; i++)        c2
           arr[1] = 0;       c1           arr[i] = 0;            c1
           arr[2] = 0;       c1
             ...             ...
           arr[N-1] = 0; c1
                    -----------                         -------------
                  c1+c1+...+c1 = c1 x N              (N+1) x c2 + N x c1 = (c2 + c1) x N + c2
                                                          1
                                            Another Example
   Algorithm 3                         Cost
      sum = 0;                           c1
      for(i=0; i<N; i++)                 c2
        for(j=0; j<N; j++)               c2
        sum += arr[i][j];                c3
                              ------------
                       c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N2
                                            Asymptotic Analysis
Asymptote
noun (plural asymptotes)
   (mathematics): A straight line which a curve approaches arbitrarily closely, but never reaches, as
   they go to infinity. The limit of the curve, its tangent "at infinity".
Asymptotic
adjective
   (mathematics): Of, relating to, or being an asymptote.
Definition: Asymptotic is an adjective meaning 'of a probability distribution as some variable or
parameter of it (usually, the size of the sample from another distribution) goes to infinity.
In theoretical analysis of algorithms it is common to estimate their complexity in asymptotic sense, i.e., to
estimate the complexity function for reasonably large length of input.
Asymptotic Time Complexity: The limiting behavior of the execution time of an algorithm when the
size of the problem goes to infinity.
Asymptotic complexity is a way of expressing the main component of the cost of an algorithm, using
idealized units of computational work.
    • To compare two algorithms with running times f(n) and g(n), we need a rough measure that
    characterizes how fast each function grows.
    • Hint: use rate of growth
    • Compare functions in the limit, that is, asymptotically!
        (i.e., for large values of n)
                                          Order or Rate of Growth
     Elementary operation: an operation whose execution time can be bounded above by a constant
    depending on implementation used.
     Assume all elementary operations can be executed at unit cost. This is not true, but they are
    within some constant factors of each other. We are mainly concerned about the order of growth.
     For very large input size, it is the rate of grow, or order of growth that matters asymptotically.
     We can ignore the lower-order terms, since they are relatively insignificant for very large n. We
    can also ignore leading term’s constant coefficients, since they are not as important for the rate of
    growth in computational efficiency for very large n.
     Higher order functions of n are normally considered less efficient.
                                      Why Order of Growth Matters?
     Computer speeds double every two years, so why worry about algorithm speed?
     When speed doubles, what happens to the amount of work you can do?
     What about the demands of applications?
               ops/sec:               1M              2M             Gain
               Using n2 alg:          1000            1414           1.4
               Using n lg n alg:      62700           118600         1.9
                                                     2
•       The number of items that can be sorted in one second using an algorithm taking exactly n2
time as compared to one taking n lg n time, assuming 1 million and 2 million operations per second.
Notice that, for the n lg n algorithm, doubling the speed almost doubles the number of items that can
be sorted.
• Higher gain with faster hardware for more efficient algorithm.
• Results are more dramatic for higher speeds.
                            Comparing the Complexity of two algorithms
•           Let f(n) be the cost, in the worst case, of one algorithm, expressed as a function of the
input size n, and g(n) be the cost function for the other algorithm.
•           E.g., for sorting algorithms, f(10) and g(10) would be the maximum number of steps that
the algorithms would take on a list of 10 items.
•           If, for all values of n >= 0, f(n) is less than or equal to g(n), then the algorithm with
complexity function f is strictly faster.
•           But, generally speaking, our concern for computational cost is for the cases with large
inputs; so the comparison of f(n) and g(n) for small values of n is less significant than the "long term"
comparison of f(n) and g(n), for n larger than some threshold.
                               Asymptotic Notation: Big-O notation
                                    (Upper Bound – Worst Case)
• Big-O is the formal method of expressing the upper bound of an algorithm's running time. It's a
measure of the longest amount of time it could possibly take for the algorithm to complete.
• More formally, for non-negative functions, f(n) and g(n), if there exists an integer n0 and a
constant c > 0 such that for all integers n > n0, f(n) ≤ cg(n), then f(n) is Big O of g(n). This is denoted
as "f(n) = O(g(n))". If graphed, g(n) serves as an upper bound to the curve you are analyzing, f(n).
                                              Examples
• Say that f(n) = 2n + 8, and g(n) = n2.
• Can we find a constant c, so that 2n + 8 <= n2?
• The number 4 works here, giving us 16 <= 16.
• For any number c greater than 4, this will still work.
• Since we're trying to generalize this for large values of n, and small values (1, 2, 3) aren't that
important, we can say that f(n) is generally faster than g(n); that is, f(n) is bound by g(n), and will
always be less than it.
• It could then be said that f(n) runs in O(n2) time: "f-of-n runs in Big-O of n-squared time".
                               To find the upper bound - the Big-O time
• Assume we know that f(n) is equal to (exactly) 2n + 8, we can take a few shortcuts.
• For example, we can remove all constants from the runtime; eventually, at some value of c, they
become irrelevant.
• This makes f(n) = 2n.
• Also, for convenience of comparison, we remove constant multipliers; in this case, the 2.
• This makes f(n) = n.
• It could also be said that f(n) runs in O(n) time; that lets us put a tighter (closer) upper bound onto
the estimate.
                             Practical Examples
• O(n): printing a list of n items to the screen, looking at each item once.
• O(ln n): also "log n", taking a list of items, cutting it in half repeatedly until there's only one item
    left. O(n2): taking a list of n items, and comparing every item to every other item.
 For a given function g(n), we denote by O(g(n)) the set of functions
          O(g(n)) = {f(n): there exist positive constants c >0 and n0 >0 such that 0 ≤ f(n) ≤ cg(n)
            for all n ≥ n0 }
 We say g(n) is an asymptotic upper bound for f(n):
                                 f ( n)
                0 ≤ lim n →∞              < ∞ 3
                                 g ( n)
 O(g(n)) means that as n → ∞ , the execution time f(n) is at most c.g(n) for some constant c
 What does O(g(n)) running time mean?
     The worst-case running time (upper-bound) is a function of g(n) to a within a constant
        factor
•                 We say g(n) is an asymptotic upper bound for f(n)
•                 f(n) = Θ (g(n)) ⇒ f(n) = O(g(n)).
•                 Θ (g(n)) ⊂ O(g(n)).
 This is a mathematically formal way of ignoring constant factors, and looking only at the “shape”
  of the function
 f(n)=O(g(n)) should be considered as saying that “f(n) is at most g(n), up to constant factors”.
 We usually will have f(n) be the running time of an algorithm and g(n) a nicely written function
 E.g. The running time of insertion sort algorithm is O(n2)
 Example1:    Is 2n + 7 = O(n)?
 Let
       T(n) = 2n + 7
       T(n) = n (2 + 7/n)
       Note for n=7;
             2 + 7/n = 2 + 7/7 = 3
       T(n) ≤ 3 n ; ∀ n ≥ 7                     n0
        Then T(n) = O(n)
        lim n→ ∞ [T(n) / n)] = 2 ≥ 0 è T(n) = O(n)
 Example2:    Is 5n3 + 2n2 + n + 106 = O(n3)?
 Let
       T(n) = 5n3 + 2n2 + n + 106
                                                4
         T(n) = n3 (5 + 2/n + 1/n2 + 106/n3)
         Note for n=100;
              5 + 2/n + 1/n2         + 106/n3 =
              5 + 2/100 + 1/10000 + 1         = 6.05
         T(n) ≤ 6.05 n ;
                        3
                                 ∀ n ≥ 100                       n0
         Then T(n) = O(n3)
         Lim n∞→[T(n) / n3)] = 5 ≥ 0 è T(n) = O(n3)
 Express the execution time as a function of the input size n
 Since only the growth rate matters, we can ignore the multiplicative constants and the lower
  order terms, e.g.,
       n, n+1, n+80, 40n, n+log n                     is O(n)
       n1.1 + 10000000000n                            is O(n1.1)
       n2                                             is O(n2)
       3n2 + 6n + log n + 24.5                        is O(n2)
 O(1) < O(log n) < O((log n)3) < O(n) < O(n2) < O(n3) < O(nlog n) < O(2sqrt(n)) < O(2n) < O(n!) <
  O(nn)
 Constant < Logarithmic < Linear < Quadratic< Cubic < Polynomial < Factorial < Exponential
            A                                B
      5n2 + 100n              3n2 + 2               A ∈ Θ (B)
       A ∈ Θ (n2), n2 ∈ Θ (B) ⇒ A ∈ Θ (B)
     log3(n2)                            log2(n3)           A ∈ Θ (B)
     logba = logca / logcb; A = 2lgn / lg3, B = 3lgn, A/B =2/(3lg3)
    nlg4                                3lg n                        A ∈ w(B)
     alog b = blog a; B =3lg n=nlg 3; A/B =nlg(4/3) → ∞ as n→∞
    lg2n                             n1/2                   A ∈ o (B)
    lim ( lga n / nb ) = 0 (here a = 2 and b = 1/2) ⇒ A ∈ o (B)
    n→∞
                                          Back to Our Example
            Algorithm 1                            Algorithm 2
                   Cost                                                    Cost
    arr[0] = 0;     c1                       for(i=0; i<N; i++)             c2
    arr[1] = 0;     c1                       arr[i] = 0;                    c1
    arr[2] = 0;     c1
                                                     5
           ...
           arr[N-1] = 0; c1
                  -----------                                       -------------
           c1+c1+...+c1 = c1 x N                (N+1) x c2 + N x c1 = (c2 + c1) x N + c2
       •    Both algorithms are of the same order: O(N)
            Algorithm 3                    Cost
            sum = 0;                       c1
            for(i=0; i<N; i++)             c2
               for(j=0; j<N; j++)          c2
                  sum += arr[i][j];       c3
                                        ------------
    c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N2 = O(N2)
                                                     Examples
        •                   2n = O(n ): 2n ≤ cn ⇒ 2 ≤ cn ⇒ c = 1 and n0= 2
                               2      3     2      3
        •                   n2 = O(n2): n2 ≤ cn2 ⇒ c ≥ 1 ⇒ c = 1 and n0= 1
        •                   1000n2+1000n = O(n2): 1000n2+1000n ≤ 1000n2+ n2 =1001n2⇒ c=1001 and n0 =
        1000
        •                   n = O(n2): n ≤ cn2 ⇒ cn ≥ 1 ⇒ c = 1 and n0= 1
         Show that 30n+8 is O(n).
                  Show ∃ c,n0: 30n+8 ≤ cn, ∀n>n0 .
                         • Let c=31, n0=8. Assume n>n0=8. Then
                            cn = 31n = 30n + n > 30n+8, so 30n+8 < cn.
                                         Big-O example, graphically
     Note 30n+8 isn’t
     less than n
     anywhere (n>0).
     It isn’t even
     less than 31n
     everywhere.
     But it is less than
     31n everywhere to
     the right of n=8.
                                               No Uniqueness
       • There is no unique set of values for n0 and c in proving the asymptotic bounds
•   Prove that 100n + 5 = O(n2)
               – 100n + 5 ≤ 100n + n = 101n ≤ 101n2 for all n ≥ 5 n0 = 5 and c = 101 is a solution
                                                       6
               –100n + 5 ≤ 100n + 5n = 105n ≤ 105n2
                                        for all n ≥ 1
                 n0 = 1 and c = 105 is also a solution
       •                Must find SOME constants c and n0 that satisfy the asymptotic notation relation
        Any linear function an + b is in O(n2). How?
        Show that 3n3=O(n4) for appropriate c and n0.
                                   Asymptotic Notation: Big-Omega Ω Notation
                                              (Lower Bound – Best Case)
        •           For non-negative functions, f(n) and g(n), if there exists an integer n0 and a constant c > 0
        such that for all integers n > n0, f(n) ≥ cg(n), then f(n) is omega of g(n).
        •           This is denoted as "f(n) = Ω(g(n))".
        •           This is almost the same definition as Big Oh, except that "f(n) ≥ cg(n)", this makes g(n) a
        lower bound function, instead of an upper bound function.
    It describes the best that can happen for a given data size.
   For a given function g(n), we denote by Ω (g(n)) the set of functions
   Ω (g(n)) = {f(n): there exist positive constants c >0 and n0 >0 such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n0
    }
   We say g(n) is an asymptotic lower bound for f(n):
                                            f ( n)
                         0 < lim     n →∞            ≤ ∞
                                            g ( n)
   Ω (g(n)) means that as n → ∞ , the execution time f(n) is at least c.g(n) for some constant c
   What does Ω (g(n)) running time mean?
   The best-case running time (lower-bound) is a function of g(n) to a within a constant factor
       •              We say g(n) is an asymptotic lower bound for f(n)
       •              f(n) = Θ (g(n)) ⇒ f(n) = Ω (g(n)).
       •              Θ (g(n)) ⊂ Ω (g(n)).
        We say Insertion Sort’s run time T(n) is Ω (n)
        Proof:
             Suppose run time is T(n) = an + b
             Let g(n) = n
                     a.g(n) = a.n ≤ T(n) = an + b
                     T(n) = Ω (g(n)) = Ω (n)
        For example
             the worst-case running time of insertion sort is O(n2), and
             the best-case running time of insertion sort is Ω (n)
                                                           7
                Running time falls anywhere between a linear function of n and a quadratic function of n
        Examples:
             n, n+1, n+80, 40n                     is Ω (n)
             n1.1 + 10000000000n         is Ω (n1.1)
             3n2 + 6n + log n + 24.5               is Ω (n2)
       •              Example of functions in Ω(n2)
               n2                              n
               n2 + n                          n / 3000
                    n 2 + 2000n                       n1.99999
                    500n 2 − 1000n                    n 2 / lg lg lg n
       •    5n2 = Ω (n)
                    ∃ c, n0 such that: 0 ≤ cn ≤ 5n2 ⇒ cn ≤ 5n2 ⇒ cn ≤ 5n2 ⇒ c = 1 and n0 = 1
        • 100n + 5 ≠ Ω (n2)
                    ∃ c, n0 such that: 0 ≤ cn2 ≤ 100n + 5
                      100n + 5 ≤ 100n + 5n (∀ n ≥ 1) = 105n
                      cn2 ≤ 105n ⇒ n(cn – 105) ≤ 0
                    Since n is positive ⇒ cn – 105 ≤ 0 ⇒ n ≤ 105/c
                    ⇒ Contradiction: n cannot be smaller than a constant
•   n = Ω (2n), n = Ω (n2), n = Ω (logn)
                  3
                                  Asymptotic Notation: Big-Omega Θ Notation
                                                   (Tight Bound)
        • For non-negative functions, f(n) and g(n), f(n) is theta of g(n) if and only if f(n) = O(g(n)) and
        f(n) = Ω(g(n)). This is denoted as "f(n) = Θ(g(n))".
        • This is basically saying that the function, f(n) is bounded both from the top and bottom by the
        same function, g(n).
        • In some cases,
                o f(n) = O(g(n)) and f(n) = Ω (g(n))
                o This means, that the worst and best cases require the same amount of time t within a
                    constant factor
                o In this case we use a new notation called “theta Θ ”
        • For a given function g(n), we denote by Θ (g(n)) the set of functions
                o Θ (g(n)) = {f(n): there exist positive constants c1>0, c2 >0 and n0 >0 such that
                         c1 g(n) ≤ f(n) ≤ c2 g(n) ∀ n ≥ n0}
         We say g(n) is an asymptotic tight bound for f(n):
                                          f ( n)
                         0 < lim n →∞              < ∞
                                          g ( n)
   Theta notation
               θ (g(n)) means that as n → ∞ , the execution time f(n) is at most c2.g(n) and at least
                 c1.g(n) for some constants c1 and c2.
        f(n) = Θ (g(n)) if and only if
               f(n) = O(g(n)) & f(n) = Ω (g(n))
       •                 Technically, f(n) ∈ Θ (g(n)).
       •                 Older usage, f(n) = Θ (g(n)).
       •                 I’ll accept either…
       •                 f(n) and g(n) are nonnegative, for large n.
                                                          8
        • Θ (g(n)) is the set of functions with the same order of growth as g(n)
•   g(n) is an asymptotically tight bound for f(n)
                                                     Examples
         Example: How to show that n /2-3n=Θ(n2) ?
                                          2
        We must determine positive constants c1, c2, and n0 such that
                        c1n 2 ≤ n 2 / 2 − 3n ≤ c2 n2
                       => c1 ≤ 1/ 2 − 3 / n ≤ c2
       by choosing c1 =1/14, c2=1/2, and n0=7, we can verify that n2/2-3n=Θ(n2)
        Other choices for the constants may exist. The key is some choice exists.
                       n 3 ≠ Θ( n 2 )
        How to verify6that              ?
        Suppose for the purpose of contradiction that c2 and n0 exist such that 6n3 ≤ c2n2 for all n ≥ n0
             Dividing by n2 yields
                    n ≤ c2/6
             which cannot possibly hold for arbitrary large n, since c2 is constant
                Also, lim n→∞ [6n3 / n2 ] = limn→∞ [6n] = ∞ , which is not a non-zero constant
       • n2/2 –n/2 = Θ (n2)
       •                  ½ n2 - ½ n ≤ ½ n2 ∀n ≥ 0 ⇒ c2= ½
       •                  ½ n2 - ½ n ≥ ½ n2 - ½ n * ½ n ( ∀n ≥ 2 ) = ¼ n2     ⇒ c1= ¼
       •                          n ≠ Θ (n2): c1 n2 ≤ n ≤ c2 n2
       ⇒ only holds for: n ≤ 1/c1
       •             6n3 ≠ Θ (n2): c1 n2 ≤ 6n3 ≤ c2 n2
       ⇒ only holds for: n ≤ c2 /6
       •             n ≠ Θ (logn): c1 logn ≤ n ≤ c2 logn
       ⇒ c2 ≥ n/logn, ∀ n≥ n0 – impossible
                                                        9
                                          Relations Between Θ ,Ω and O
    Theorem : For any two functions g(n) and f(n),
            f(n) = Θ (g(n)) iff f(n) = O(g(n)) and f(n) = Ω (g(n)).
   i.e., Θ (g(n)) = O(g(n)) ∩ Ω (g(n))
         •           In practice, asymptotically tight bounds are obtained from asymptotic upper and lower
         bounds.
                                                 Running Times
        “Running time is O(f(n))” ⇒ Worst case is O(f(n))
        O(f(n)) bound on the worst-case running time ⇒ O(f(n)) bound on the running time of every input.
        Θ (f(n)) bound on the worst-case running time ⇒ Θ (f(n)) bound on the running time of every
       input.
        “Running time is Ω (f(n))” ⇒ Best case is Ω (f(n))
        Can still say “Worst-case running time is Ω (f(n))”
               Means worst-case running time is given by some unspecified function g(n) ∈ Ω (f(n)).
        Insertion sort takes Θ (n2) in the worst case, so sorting (as a problem) is O(n2). Why?
        Any sort algorithm must look at each item, so sorting is Ω (n).
        In fact, using (e.g.) merge sort, sorting is Θ (n lg n) in the worst case.
               Later, we will prove that we cannot hope that any comparison sort to do better in the worst
                  case.
                                 Asymptotic Notation: Little-oh (o) Notation
                                 (Upper bound but not asymptotically tight)
       • For non-negative functions, f(n) and g(n), f(n) is little o of g(n) if and only if f(n) = O(g(n)), but
       f(n) ≠ Θ(g(n)). This is denoted as "f(n) = o(g(n))".
       • This represents a loose bounding version of Big O. g(n) bounds from the top, but it does not
       bound the bottom.
        The bound provided by O-notation may or may not be asymptotically tight.
        The bound 2n2=O(n2) is asymptotically tight, but the bound 2n=O(n2) is not.
        The o-notation denotes an upper bound that is not asymptotically tight. Formally, define o(g(n))
       as the set
       o(g(n)) = {f(n): for any positive constants c>0, there exits a constant n0>0 such that 0≤f(n)<c g(n)
       for all n≥n0}.
       •           For example, 2n=o(n2), but 2n2≠o(n2).
   The definitions of O-notation and o-notation are similar.
   The main difference
               • In f(n)=O(g(n)), the bound 0≤f(n)≤c g(n) holds for some constant c>0
               • In f(n)=o(g(n)), the bound 0≤f(n)<c g(n) holds for all constant c>0
                                                         10
     Intuitively, in the o-notation, the function f(n) becomes insignificant relative to g(n) as n
    approaches infinity; that is
                        f ( n)
                   lim         =0
                   n →∞ g ( n)
                           Asymptotic Notation: Little-omega (ω) Notation
                                (Lower bound but not asymptotically tight)
    • For non-negative functions, f(n) and g(n), f(n) is little omega of g(n) if and only if f(n) = Ω(g(n)),
    but f(n) ≠ Θ(g(n)). This is denoted as "f(n) = ω(g(n))".
    • Much like Little Oh, this is the equivalent for Big Omega. g(n) is a loose lower boundary of the
    function f(n); it bounds from the bottom, but not from the top.
    • ω-notation is to Ω-notation as o-notation is to O-notation.
    • Theω-notation denotes an lower bound that is not asymptotically tight. Formally, define ω(g(n))
    as the set
    • ω(g(n)) = { f(n): for any positive constants c>0, there exits a constant n0>0 such that 0≤c
    g(n)<f(n) for all n≥n0}.
    • One way to define it is by
        f(n)∈ω(g(n)) if and only if g(n)∈o(f(n))
    • For example, n2/2=ω(n), but n2/2≠ω(n2).
    • The relation f(n)=ω(g(n)) implies that
                          f ( n)
                      lim        = ∞, if the limit exists.
                    n →∞ g ( n)
                                                     Exercise
Q(1): Find out big oh, big omega & theta for following Functions:-
              a) f(n) = 7n+4
              b) f(n)= 7n2+8n+2
              c) f(n)=8n2+1
              d) f(n)=4n3+2n2+12
              e) f(n)=4n3+8
              f)     f(n)=2n+6n2+3n
              g) f(n)=2n+6n
              h)      f(n)=nlog2n
              i)     f(n)=n2logn+nlogn+n+1
              j)     f(n)=n0.6+n
Q(2): For each of the following statements, decide if it is always true, never true or sometimes true for
      non-negative function f(),g(). If it is always true or never true, explain why? If it is sometime true,
      give one example for which it is true and one for which it is false.
              I) f(n)+g(n) = Theeta(max(f(n),g(n)))
              II) f(n) != O(g(n)) and g(n)=O(f(n))
Q3: Show that the solution of T(n)=T(n/2) +1 is O(log n).
    For each of the following pair of functions f(n) and g(n), state whether f is O(g) or f is o(g) or f is Θ(g) or f is
    Ω(g):-
             i.      f(n) =n3/2 , g(n)=2n/2
            ii.      f(n)=n3/2, g(n)=nlog2 n
           iii.      f(n)=log n3
           iv.       f(n) log3 n, g(n)=log 2n,
                                                           11