Chapter 3 Exercises
Chapter 3 Exercises
Reinforcement
                                                                         In [1]:
#-----------R3-1-----------------
import matplotlib.pyplot as plt
import numpy as np
import math
ys = []
for y in ys:
    plt.plot(x, y)
plt.yscale('log')
plt.xscale('log')
                                                                       In [2]:
#-------------R3-2-------------------
"""
A is better than B for n greater than n when 8nlogn = 2n^2, or 4logn = n
"""
plt.plot(x, y1)
plt.plot(x, y2)
"""
                                                                        Out[3]:
'\n40n^2 = 2n^3 when n = 20, which means n0 = 20\n\n'
                                                                         In [4]:
#----------R3-4--------------------
"""
A constant function is plotted the same on each scale
"""
                                                                        Out[4]:
'\nA constant function is plotted the same on each scale\n\n'
                                                                         In [5]:
#---------------R3-5-------------
"""
if y = n^c, then log(y) = log(n^c) = c log(n).
"""
                                                                      Out[5]:
'\nif y = n^c, then log(y) = log(n^c) = c log(n).\n\nThis is of the form y=
mx for a log scale\n\n\n'
                                                                       In [6]:
#-----------R3-6------------------
"""
The sum is twice the sum of all the numbers from 0 to n, which is
2*(n+1)(n)/2 = (n+1)(n)
"""
                                                                      Out[6]:
'\nThe sum is twice the sum of all the numbers from 0 to n, which is 2*(n+1
)(n)/2 = (n+1)(n)\n'
                                                                       In [7]:
#--------------R3-7----------------
"""
If the worst case runtime is O(f(n)), then there is a constant c such that
cf(n)>worst_case for n>n0
Since the worst case is always >= to all other cases, (worst case runtime
>= g(n)), then cf(n)>=worst_case>= all other cases A
Since a>=b and b>=c, then a>=c, which means that c (the algorithm A) is
O(f(n))
"""
                                                                      Out[7]:
'\nIf the worst case runtime is O(f(n)), then there is a constant c such th
at cf(n)>worst_case for n>n0\n\nSince the worst case is always >= to all ot
her cases, (worst case runtime >= g(n)), then cf(n)>=worst_case>= all other
cases A\n\nSince a>=b and b>=c, then a>=c, which means that c (the algorith
m A) is O(f(n))\n\n'
                                                                       In [8]:
#--------------R3-8----------------------
"""
In increasing order or asymptotic growth rate...
constant (2^10)
n (2^log(n) = n)
n (3n + 100logn) (<- this grows at a faster rate than the one before
because 3>1)
n (4n) <- 4>3 and the 100logn term will eventually become relatively
trivial at high n
nlogn (nlogn)
nlogn (4nlogn + 2n) <- Grows quicker because of the constant
n^2 (n^2 + 10)
n^3 (n^3)
2^n (2^n)
"""
                                                                      Out[8]:
'\nIn increasing order or asymptotic growth rate...\n\nconstant (2^10)\n\nn
(2^log(n) = n)\nn (3n + 100logn) (<- this grows at a faster rate than the o
ne before because 3>1)\nn (4n) <- 4>3 and the 100logn term will eventually
become relatively trivial at high n\nnlogn (nlogn)\nnlogn (4nlogn + 2n) <-
Grows quicker because of the constant\nn^2 (n^2 + 10)\nn^3 (n^3)\n2^n (2^n)
\n\n'
                                                                       In [9]:
#------------R3-9------------------
"""
if d(n) is O(f(n)), there is a constant such that d(n) <= cf(n) for every
n>n0.
Essentially, the new constant will be c_old * a, which still maintains the
original condition of O notation, which
will be true when n>a*n0
"""
                                                                      Out[9]:
"\nif d(n) is O(f(n)), there is a constant such that d(n) <= cf(n) for ever
y n>n0.\n\nif we multiply d(n) by a, we get (ad(n)) <= acf(n) = c'f(n)\n\nE
ssentially, the new constant will be c_old * a, which still maintains the o
riginal condition of O notation, which \nwill be true when n>a*n0\n"
                                                                      In [10]:
#-----------R-3-10-----------------------
"""
If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)<= cf(n) for n_f>n_f0 and
e(n)<= dg(n) for n_e>n_e0
which means that there is a new n' = n_f*n_e and and n0' = n_f0*n_e0, and a
c' = c*d such that
d(n)e(n) <= c'(f(n)g(n)) for n'>n0', which means that d(n)e(n) is
O(f(n)g(n))
"""
                                                                     Out[10]:
"\nIf d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)<= cf(n) for n_f>n_f0 a
nd e(n)<= dg(n) for n_e>n_e0\n\nas a result, d(n)e(n) <= (cf(n))*(dg(n)) an
d n_f*n_e > n_f0*n_e0\n\nwhich means that there is a new n' = n_f*n_e and a
nd n0' = n_f0*n_e0, and a c' = c*d such that \nd(n)e(n) <= c'(f(n)g(n)) for
n'>n0', which means that d(n)e(n) is O(f(n)g(n))\n\n\n"
                                                                      In [11]:
#-------------R3-11--------------
"""
Like before, if d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)<= cf(n) for
n_f>n_f0 and e(n)<= dg(n) for n_e>n_e0
That means that d(n) + e(n) <= (cf(n)) + (dg(n)) and n > n_f0+n_e0
which means that there is a new n' = n_f+n_e and and n0' = n_f0+n_e0, such
that:
d(n) + e(n) <= cf(n) + dg(n) for n>n0'; however, this still does not
satisfy O notation.
We can absorb c and d into their functions such that    d(n) + e(n) <= f(cn)
+ g(dn)
To absorb c, we note that n'' = cn', so n' = n''/c, which means that n''/c
>= n0''/c
d(n) + e(n) <= (f(n)+d(n)) for n>= n0'', which satisfies O(f(n) + d(n))
conditions
"""
                                                                     Out[11]:
"\nLike before, if d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)<= cf(n) f
or n_f>n_f0 and e(n)<= dg(n) for n_e>n_e0\n\nThat means that d(n) + e(n) <=
(cf(n)) + (dg(n)) and n > n_f0+n_e0\n\nwhich means that there is a new n' =
n_f+n_e and and n0' = n_f0+n_e0, such that:\n\nd(n) + e(n) <= cf(n) + dg(n)
for n>n0'; however, this still does not satisfy O notation. \n\nWe can abs
orb c and d into their functions such that d(n) + e(n) <= f(cn) + g(dn) \n
\nTo absorb c, we note that n'' = cn', so n' = n''/c, which means that n''/
c >= n0''/c\n\nsimilarly for d, n''/cd >= n0''/cd\n\nTherefore there are ne
w values of n0 such that \n\nd(n) + e(n) <= (f(n)+d(n)) for n>= n0'', which
satisfies O(f(n) + d(n)) conditions\n\n\n"
                                                                      In [12]:
#-------------R3-12--------------
"""
The key point here is that just because something is O(n) doesn't mean that
is has to be that function
For example, f(n) = 5 is O(n) is mathematically true, although it is poor
form to say so
There is no value for n>0 such that 0>=n, which means that d(n) -e(n) is
not O(f(n)-g(n))
Side Note: The wikipedia definition of Big-O is |f(n)| <= Mg(x) for all
x>x0,
which provides more opportunities to disprove this property
"""
                                                                     Out[12]:
"\nThe key point here is that just because something is O(n) doesn't mean t
hat is has to be that function\nFor example, f(n) = 5 is O(n) is mathematic
ally true, although it is poor form to say so\n\nTherefore, if we have d(n)
= n and e(n) = n with f(n) = n and g(n) = n, then we satisfy\nd(n) <= C(f(n
)) for n>=0, and e(n) <= C2(g(n)) for n>=0\n\nf(n)-g(n) = 0 and d(n) -e(n)
= n\n\nThere is no value for n>0 such that 0>=n, which means that d(n) -e(n
) is not O(f(n)-g(n))\n\nSide Note: The wikipedia definition of Big-O is |f
(n)| <= Mg(x) for all x>x0,\nwhich provides more opportunities to disprove
this property\n\n"
                                                                      In [13]:
#-------------R3-13-----------------
"""
If d(n) is O(f(n)) and f(n) is O(g(n)), then d(n)<= cf(n) for n_f>n_f0 and
f(n)<= dg(n) for n_g>n_g0
"""
                                                                     Out[13]:
"\nIf d(n) is O(f(n)) and f(n) is O(g(n)), then d(n)<= cf(n) for n_f>n_f0 a
nd f(n)<= dg(n) for n_g>n_g0\n\n\nIf that is true, then d(n)<=cf(n)<=c(d(g(
n))) = cd*g(n) = c'g(n) by substitution, which is true for\nn_f*n_g>n_f0*n_
g0, or n>n0'\n\nTogether, these satisfy the conditions of d(n) is O(g(n))\n
\n"
                                                                      In [14]:
#------------R3-14---------------------
"""
By definition, if y is the output of max{f(n), g(n)}, then by definition
y>=f(n) and y>=g(n), which means that 2y>=f(n) + g(n)
That means that f(n) + g(n) <= 2(max{f(n), g(n)}) for every n>n0=0,
which satisfies our requirements for O
"""
                                                                     Out[14]:
'\nBy definition, if y is the output of max{f(n), g(n)}, then by definition
\ny>=f(n) and y>=g(n), which means that 2y>=f(n) + g(n)\nThat means that f(
n) + g(n) <= 2(max{f(n), g(n)}) for every n>n0=0,\nwhich satisfies our requ
irements for O\n\n'
                                                                      In [15]:
#----------R3-15-------------------
"""
for f(n) to be O(g(n)), f(n)<=Cg(n) for all n>n0
if g(n) is not T(f(n)), then there is a value where g(n)<=cf(b) for n>n0
"""
                                                                      Out[15]:
'\nfor f(n) to be O(g(n)), f(n)<=Cg(n) for all n>n0\n\nfor g(n) to be T(f(n
)), g(n)>= Cf(b) for all n>n0\n\n\nif g(n) is not T(f(n)), then there is a
value where g(n)<=cf(b) for n>n0\n\nif that is true, then g(n)<=cf(n) at th
e critical n, then f(n)<=(1/c)f(n) for some n>n0\n(note this is true becaus
e c>0)\nwhich violates the rules for f(n) is O(g(n)).\n\nTherefore, the org
inal statement holds by contraposition\n\n\n\n\n'
                                                                       In [16]:
#--------------R3-16---------------------------
"""
It has been shown earlier that a polynomial is O(n^d), where d is its
highest order
We can ignore the constant term, which means that log(f(n)) <= dlog(n),
which means that
log(f(n)) is O(log(n)) for a polynomial
"""
                                                                     Out[16]:
"\nIt has been shown earlier that a polynomial is O(n^d), where d is its hi
ghest order\n\nTherefore, f(n) <= f'(n), where f'(n) = Cn^d\n\nif we take t
he logarithm of this function, we get log(cn^d) = log(c) + dlog(n)\n\nWe ca
n ignore the constant term, which means that log(f(n)) <= dlog(n), which me
ans that\nlog(f(n)) is O(log(n)) for a polynomial\n\n"
                                                                      In [17]:
#--------------R3-17----------------------------
"""
If you expand out the function, you get n^5 + ax^4 + .... + 1,
which is < (1+a+a2...+1)n^5 for all n>n0.
Therefore, it is O(n^5)
"""
                                                                     Out[17]:
'\nIf you expand out the function, you get n^5 + ax^4 + .... + 1,\nwhich is
< (1+a+a2...+1)n^5 for all n>n0.\n\nTherefore, it is O(n^5)\n\n'
                                                                      In [18]:
#------------R3-18-------------------------
"""
From the power rules, 2^(n+1) = (2)(2^n), which means there is a constant
C=2 such that
2^(n+1) <= Cf(n) for n>n0, where f(n) = 2^n
"""
                                                                     Out[18]:
'\nFrom the power rules, 2^(n+1) = (2)(2^n), which means there is a constan
t C=2 such that\n2^(n+1) <= Cf(n) for n>n0, where f(n) = 2^n\n\nTherefor 2^
(n+1) is O(2^n)\n\n'
                                                                      In [19]:
#-----------R3-19-----------------------
"""
We have shown previously that if d(n) is O(f(n)) and e(n) is O(g(n)), then
d(n)e(n) is O(f(n)g(n)).
If we set d(n) = n and e(n) = 1, with f(n) = n and g(n) = log(n), then
(n)(1) is O(nlog(n)), satisfying our requirements.
"""
                                                                     Out[19]:
'\nWe have shown previously that if d(n) is O(f(n)) and e(n) is O(g(n)), th
en \nd(n)e(n) is O(f(n)g(n)).\n\nIf we set d(n) = n and e(n) = 1, with f(n)
= n and g(n) = log(n), then \n(n)(1) is O(nlog(n)), satisfying our requirem
ents.\n\n\n\n'
                                                                         In [20]:
#------------R3-20---------------------
"""
We have shown that f(n) is O(g(n)) iff g(n) is T(f(n))
"""
                                                                     Out[20]:
'\nWe have shown that f(n) is O(g(n)) iff g(n) is T(f(n))\n\nTherefore, if
nlogn is O(n^2), we have proved the above.\n\nLike in the previous exercise
, log(n) is O(n) and n is O(n), so\nnlogn is O(n^2) and therefore (n^2) is
T(nlogn) for n>n0\n\n\n'
                                                                      In [21]:
#---------R3-21-------------------
"""
As before, 1 is O(log(n)) for n>n0 (where n0>2 since log(1)=0)
"""
                                                                     Out[21]:
'\nAs before, 1 is O(log(n)) for n>n0 (where n0>2 since log(1)=0)\n\nTheref
ore (n)*(1) is O(log(n)*n) and n is O(nlogn)\n\n'
                                                                      In [22]:
#------------R3-22-----------------------
"""
f(n) <= ceil(f(n)) for all n>0, but ceil(f(n)) <= f(n) + 1.
"""
                                                                     Out[22]:
'\nf(n) <= ceil(f(n)) for all n>0, but ceil(f(n)) <= f(n) + 1.\n\nWe have s
hown that O(f(n) + g(n)) = O(max(f(n), g(n)),\nif f(n) is f(n) and g(n) is
1, then O(f(n) + g(n)) is O(max(f(n), 1)), which will be \n\nf(n) for a spe
cific range n>n0\n\nTherefore, cuil(f(n)) is O(f(n))\n\n\n'
                                                                      In [23]:
#------------R3-23--------------------------
"""
Accessing the len requires O(1)
"""
                                                                     Out[23]:
'\nAccessing the len requires O(1)\n\nthe loop is called n times -> O(n)\n
each loop requires O(1) times\nOverall, the loop is O(n)\n\nThe return take
s O(1)\n\nTherefore, O(1+n+1) is O(n)\n\n\n'
                                                                      In [24]:
#--------------R3-24-------------------------
"""
Accessing the len requires O(1)
"""
                                                                     Out[24]:
'\nAccessing the len requires O(1)\n\nthe loop is called n/2 times -> still
O(n)\n    each loop requires O(1) times\nOverall, the loop is O(n)\n\nThe r
eturn takes O(1)\n\nTherefore, O(1+n+1) is O(n)\n\n'
                                                                      In [25]:
#--------------R3-25-------------------------
"""
The initial steps are O(1)
"""
                                                                      Out[25]:
'\nThe initial steps are O(1)\n\nThe outer loop is called (n) times\n       Th
e inner loop is called between 1-n times\n        Each inner loop action is
O(1)\nThe overall number of calls is 2, 3, 4, 5, ....n = (n)(n+1)/2 - 1, wh
ich is O(n^2)\n\nTherefor the overall is O(1+n^2), which is O(n^2)\n\n'
                                                                       In [26]:
#--------------R3-26-------------------------
"""
Initial is O(1)
return is O(1)
Total is O(1+n+1) = O(n)
"""
                                                                     Out[26]:
'\nInitial is O(1)\n\nThe loop is called n times:\n    each call uses O(1)\
nTherefore total is O(n)\n\nreturn is O(1)\n\nTotal is O(1+n+1) = O(n)\n\n\
n'
                                                                      In [27]:
#-------------R3-27------------------
"""
Initial is O(1)
Final is O(1)
"""
                                                                     Out[27]:
'\nInitial is O(1)\n\nOuter Loop has n calls\n    Inner two loops have O(n^
2 calls) (see R3-25)\n    \nFinal is O(1)\n\nTherefore O(1+ n*n^2 + 1) = O(
n^3)\n\n'
                                                                      In [28]:
#------------R3-28--------------
import pandas as pd
import math
#Note, some of these calculations are too large for the computer to handle
so I limited it to
second = 10e6
rel_lengths = [1, 60*60]
df = pd.DataFrame(columns = ['seconds','hours'])
for i in range(len(functions)):
    inp = list(map(functions[i], x))
    #print(inp)
      df.loc[names[i]] = inp
seconds hours
n 1e+07 3.6e+10
n2 3162.28 189737
2n 23.2535 35.0673
                                                                        In [29]:
#-------------R3-29------------------------
"""
They multiply, so it is O(nlogn)
"""
                                                                        Out[29]:
'\nThey multiply, so it is O(nlogn)\n\n'
                                                                        In [30]:
#-------------R3-30-----------------
"""
You are choosing log(n) elements, which each require O(n)
"""
                                                                     Out[30]:
'\nYou are choosing log(n) elements, which each require O(n)\n\nOverall, it
is also O(nlogn)\n\n\n'
                                                                      In [31]:
#------------R3-31----------------
"""
You are doing O(n) and O(logn), which is overall O(n)
"""
                                                                        Out[31]:
'\nYou are doing O(n) and O(logn), which is overall O(n)\n\nIf all entries
are odd, your best case is O(logn)\n\nWorst case in O(n) if they are all od
d\n\n'
                                                                      In [32]:
#--------------R3-32---------------
"""
Algorithm D is calls Algorithm E n times
"""
                                                                     Out[32]:
'\nAlgorithm D is calls Algorithm E n times\n\nTherefore, the worst case ru
ntime is O(D(n))*(O(i)), which seems to be O(n*1) = O(n) from the descripti
on\n\n'
                                                                      In [33]:
#--------------R3-33------------------
"""
O notation means that there is a constant C such that f(n) < Cg(n)
That means that Al's runtime on a single iteration is 15 times slower, but
since he performs fewer operations overall, he
starts to perform better at large values of n
"""
import math
print(100/math.log(100, 2))
15.05149978319906
                                                                      In [34]:
#-------------R3-34----------------------
"""
If you are visiting a new restaurant, the chance that you will enjoy the
meal is 1/j, where j is the total number of
restaurants that you have visited so far (including that one)
Therefore, the expected number of meals you will enjoy is the sum of (1/j)
for j=1-number_of_restaurant
From calculus, the integral of 1/x is ln(x), so for 10,000 meals, the
average will be ln(10,000) - ln(1) = 9.2 restaurants
"""
import math
import matplotlib.pyplot as plt
print(math.log(10000, math.e))
print(math.log(1, math.e)) #This is trivial
x = list(range(1, 10000))
y = list(map(lambda x: math.log(x, math.e), x))
plt.plot(x, y)
plt.xlabel('Total Restaurants')
plt.ylabel('Enjoyed Restaurants')
plt.show()
9.210340371976184
0.0
Creativity
                                                                       In [35]:
#----------C3-35--------------
"""
If you sort each set, it will take O(nlogn)*3 time, which is still
O(nlogn).
Once you have the sequences sorted, you can move along each sequence with
indices i, j and k.
At each iteration, either choose the biggest value from the current
positions, or repeat the process with the
remaining two indices until they are all equal or greater to that biggest
value and start again. At each step, check
if the values are the same and return False if they are. If you make it to
the end, return true
Since you are traversing each index once, this part will be done in O(n)
time, so the overall is
O(nlogn + n), which is O(nlogn) still
"""
                                                                     Out[35]:
'\nIf you sort each set, it will take O(nlogn)*3 time, which is still O(nlo
gn).\n\nOnce you have the sequences sorted, you can move along each sequenc
e with indices i, j and k. \nAt each iteration, either choose the biggest
value from the current positions, or repeat the process with the \nremainin
g two indices until they are all equal or greater to that biggest value and
start again. At each step, check\nif the values are the same and return Fa
lse if they are. If you make it to the end, return true\n\nSince you are t
raversing each index once, this part will be done in O(n) time, so the over
all is \nO(nlogn + n), which is O(nlogn) still\n\n\n'
                                                                      In [36]:
#------------C3-36--------------
"""
One possibility is to sort the array first O(nlogn) and then choose the
last 10 value
"""
def last_10(array):
    if len(array)>=10:   return sorted(array)[-10:]
    else: return array
print(last_10([1,2,3,4,5,6,7,4,3,2,65,7,5,4]))
print(last_10([x^2+3*x for x in range(30)]))
print(last_10([1,2,3,5,4]))
[3, 4, 4, 4, 5, 5, 6, 7, 7, 65]
[42, 68, 72, 74, 74, 80, 82, 82, 84, 84]
[1, 2, 3, 5, 4]
                                                                        In [37]:
#-------------C3-37------------------
"""
An increasing oscillating function like x^2(cos(x)+1) would always be
positive but neighter bounded by O(n) or Omega(n)
"""
import matplotlib.pyplot as plt
import math
x = list(range(100))
y = list(map(lambda x: x**2*(1+math.cos(x)), x))
plt.plot(x, y)
                                                                        Out[37]:
[<matplotlib.lines.Line2D at 0x265d329c160>]
                                                                      In [38]:
#-----------------C3-38--------------------
"""
Note: the question is not asking whether computing this sum is O(n), it is
saying that the result of the expression is O(n)
For i = 2, 1 + 4 < 9 + 9
for i = 3, 1 + 4 + 9 < 16*3
Therefore, since this is true for the first few iterations, it is true by
induction
"""
                                                                     Out[38]:
'\nNote: the question is not asking whether computing this sum is O(n), it
is saying that the result of the expression is O(n)\n\nsum(i**2) = sum ((i-
1)**2) + i**2\n\nIf sum((i-1)**2) is less than i * (i**2), then sum ((i-1)*
*2) + i**2 < i*i**2 + i**2 < O(i^3)\nIn other words, 1^2 + 2^2 + ... (n-1)^
2 < n^2 + n^2 + n^2 ..... n^2\n\nFor i = 2, 1 + 4 < 9 + 9\nfor i = 3, 1 + 4
+ 9 < 16*3\n\nTherefore, since this is true for the first few iterations, i
t is true by induction\n\n\n\n'
                                                                      In [39]:
#--------------C3-39-------------------
"""
With some mental gymnastics, we can see that it's possible to rewrite this
as sum[k=0-n]( sum[i=k to n]((1/2)*(i)))
For this inner sum, we know that the geometric sum is the geometric sum
from 1 to n minus the geometric sum from 1 to k:
"""
                                                                     Out[39]:
"\nWith some mental gymnastics, we can see that it's possible to rewrite th
is as sum[k=0-n]( sum[i=k to n]((1/2)*(i)))\n\nFor this inner sum, we know
that the geometric sum is the geometric sum from 1 to n minus the geometric
sum from 1 to k:\n\nThe formula is (1/2)*(1-(1/2)**i)/(1-1/2), which means
the overall sum is:\n\n[1-0.5**i - 1 + 0.5**k], which is consistenly less t
han 0.5**k since 0.5**i is greater than zero.\n\nPutting this through the o
uter sum gives us:\n\n(1-0.5**k)/(1-0.5) = 2 - 0.5**k. Since 0.5**k is als
o always above 0, this sum is <= 2\n\n\n"
                                                                      In [40]:
#----------C3-40----------------------
"""
from the log rules, logb(a) = log2(a)/log2(b), which means that there is a
constant 1/log2(b) such that
logb(n) = Clog(b), which satisfies the conditions of theta
"""
                                                                     Out[40]:
'\nfrom the log rules, logb(a) = log2(a)/log2(b), which means that there is
a constant 1/log2(b) such that\nlogb(n) = Clog(b), which satisfies the cond
itions of theta\n\n'
                                                                       In [41]:
#------------C3-41-------------------
"""
If you could use 2n comparisons, it would be easy to just compare each
number to both the min and max
We could try short circuiting the loop by only checking max if the min
comparison failed, but that would not be true in all cases
(ex. if the array was reverse sorted). You could hedge against that by
randomizing or alternating which call was made first
The tournament method is the actual solution, but uses comparisons for
array size, which might be considered cheating:
"""
#Note, we can count the number of calls using a class instead of a function
class find_minmax():
    def __init__(self):
        self._count = 0
a = find_minmax()
print(a(list(range(10))), a._count, 10*3/2)
b = find_minmax()
print(b(list(range(100000))), b._count, 100000*3/2)
#Values are close but not exactly 3n/2 for higher numbers...
(0, 9) 14 15.0
(0, 99999) 165534 150000.0
                                                                      In [42]:
#--------------C3-42-----------------------------------
"""
If bob has n friends and they can visit at most i times, it is legal for
all his friends to visit
(i-1) times. Therefore he can only start to get suspicious if the counter
is higher than n*(i-1)
When n is odd, we can still draw the same line y = x and then sum up the n
half blocks:
For odd numbers, just ignore the middle element and add it to the end so
that you now have
(n+1)*((n-1)/2) + (1*((n+1)/2))
which simplifies to (1/2)* ((n+1)n - (n+1) + (n+1)) = n(n+1)/2 as before
         ___
        |   |
   (n+1)|   |_
        |   | |
        |___|_| (n+1)/2
     (n-1)/2 1
"""
                                                                     Out[43]:
'\nTriangle Method:\n\nWhen n is odd, we can still draw the same line y = x
and then sum up the n half blocks:\n\nsum = n*n/2 + n*(1*1/2) = (n^2 + n)/2
= n(n+1)/2\n\n\n\n\nBox Method:\n\nFor odd numbers, just ignore the middle
element and add it to the end so that you now have \n\n(n+1)*((n-1)/2) + (1
*((n+1)/2))\nwhich simplifies to (1/2)* ((n+1)n - (n+1) + (n+1)) = n(n+1)/2
as before\n         ___\n        |   |\n   (n+1)|   |_\n        |   | |\n
|___|_| (n+1)/2\n     (n-1)/2 1\n\n'
                                                                      In [44]:
#------------C3-44--------------------
"""
An integer of up to 100 bits is 2^100 = 1.2e30
a) We know from before that you only have to check up to the square root of
n, which is 1.13e15
If each check takes 1 microsecond, it will take at least 1.13e9 seconds,
which is over a billion seconds!
That is approximately 32 years
"""
                                                                     Out[44]:
'\nAn integer of up to 100 bits is 2^100 = 1.2e30\n\na) We know from before
that you only have to check up to the square root of n, which is 1.13e15\nI
f each check takes 1 microsecond, it will take at least 1.13e9 seconds, whi
ch is over a billion seconds!\nThat is approximately 32 years \n\nb) If the
number of bytes to store r is logr/8+1, if we want to express it in terms o
f n the answer would be \nO(e^n)\n\n\n\n'
                                                                      In [45]:
#-----------C3-45----------------------
"""
Since you can only use O(1) space, options like dictionaries are off limits
since they would grow
One option is to just multiply all the numbers from one set and then divide
them by the other.
The remainder will be the missing number!!
To account for the zero term, add one to all the numbers coming in, then
subtract one before making your return
"""
def find_missing(S):
    total_list = list(range(len(S)+1))
    total = 1
    for x in total_list:
        total *= x+1
      for x in S:
          total /= x+1
return int(total-1)
find_missing([0,1,2,3,4,6,7,8,9])
                                                                        Out[45]:
5
                                                                        In [46]:
#------------C3-46-------------------
"""
Side note: Al has a lot of crazy claims!
He hasn't proven that if n is true, then n+1 is true. He has instead shown
that if n is true, n-1 is true
and started with n=1. His claim is true for all 0<n<1, but not for all n>1
"""
                                                                     Out[46]:
"\nSide note: Al has a lot of crazy claims!\n\nHe hasn't proven that if n i
s true, then n+1 is true. He has instead shown that if n is true, n-1 is t
rue \nand started with n=1. His claim is true for all 0<n<1, but not for a
ll n>1\n\n\n"
                                                                      In [47]:
#-----------C3-47--------------------
"""
Base case, 1 line will meet at 0 points
2 lines will meet at one point
When you add a new line, it will meet with every other existing lines once
given the constraints
Alternate answer:
For big O to be true, there must be a constant C such that d(n) <= Cf(n)
"""
                                                                     Out[48]:
"\nThis justification mixes up the 'n' as an actual number and 'n' as a rep
resentation of the nth number.\n\nAs used in the example (as a number of ca
lls), the justification is correct\n\nHowever, in terms of the actual value
itself, the justification is not correct.\n\n\nAlternate answer:\n\nFor big
O to be true, there must be a constant C such that d(n) <= Cf(n)\n\nFor thi
s justification, C was chosen as 1\n\nTherefore, the proof should have been
\nF(1) = 1 <= 1n\nF(2) = 2 <= 1n\n\nfor the induction step, \nF(n) = F(n-1)
+ F(n-2) = 1(n-1) + (n-2) = 2n-3, which is not <= 1n as before \n\n\n"
                                                                       In [49]:
#-------------C3-49--------------------
"""
The fibonacci sequence is : 1, 2, 3, 5, 8, 13, 21, 34, etc..
The first few steps of the sequence are actually bigger than (3/2)**n, so
we can't use them!
THe first instance when this is true is for n = 5
Since (1.555)x > x, we have proved that F(n) >= (3/2)^n for all values
greater than 5, which
satisfies the constraints of Omega
"""
def output_fibbo():
    smaller = True
    F0 = 0
    F1 = 1
    power = 1
    counter = 0
    while F1 <= power:
        counter += 1
        F1, F0 = F1+F0, F1
        power *= (3/2)
        print (counter, F1, power)
output_fibbo()
1   1   1.5
2   2   2.25
3   3   3.375
4   5   5.0625
5   8   7.59375
                                                                         In [50]:
#--------------C3-50---------------------
"""
a)
Loop from 0 to n as i
    Loop from 0 to i, calculating the value of xi as ai*=x
x = 1
for a in polynomials:
      total += a*(x_total)
      x_total *= x
return total
This seems like it would only take O(n) time, so I'm not sure...
"""
"""
                                                                     Out[51]:
'\nlog(a) + log(b) = log(a*b)\n\nTherefore sum(log(i)) is actually log(Prod
(i))\n\nSince 1X2X3...n < n^n, log(Prod(i)) < log(n^n) = nlogn\n\n\n'
                                                                      In [52]:
#--------------C3-52-------------------
"""
For this to be true, sum(log(i)) must be >= C*nlogn
In the second half of the series, each value is at least log(n/2) and the
total value is greater than the sum from n/2 to n
"""
                                                                     Out[52]:
'\nFor this to be true, sum(log(i)) must be >= C*nlogn\n\nIn the second hal
f of the series, each value is at least log(n/2) and the total value is gre
ater than the sum from n/2 to n\n\nTherefore, sum(log(i)) >= sum[n/2 to n](
log(n/2)) = (n/2)(log(n) - log(2)) which is O(nlogn)\n\nTherefore sum(log(i
)) is Theta(nlogn) as well\n\n\n'
                                                                       In [1]:
#--------------C3-53----------------------
"""
We can represent any number using log(n) digits in binary.
Enlist one cup for each binary digit. and add a drop of that wine to the
cup if there is a 1 in the binary representation.
After a month, the poison tasters killed will map out the binary
representation of the poisoned wine.
Ex. If wine 5 is poisoned then tasters 1 and 3 (starting from 1) will die
but everyone else will live
Note: this assumes that the poison tasters are physically capable of
drinking one drop from half of the wines in a day,
which may not be feasible at huge numbers.
Side Note: What makes him an evil king and why are we trying to protect
him?
"""
                                                                      Out[1]:
'\nWe can represent any number using log(n) digits in binary.\n\nTherefore,
assign each number a binary value (ex. bottle 5 is 0000...0000101).\n\nEnli
st one cup for each binary digit. and add a drop of that wine to the cup if
there is a 1 in the binary representation.\n\nEx. Wine 5 will be added to t
he first and third cup.\n\nAfter a month, the poison tasters killed will ma
p out the binary representation of the poisoned wine.\n\nEx. If wine 5 is p
oisoned then tasters 1 and 3 (starting from 1) will die but everyone else w
ill live\n\nIf wine 10 is poisoned (000...001010), wine tasters 2 and 4 wil
l die but everyone else will live.\n\nNote: this assumes that the poison ta
sters are physically capable of drinking one drop from half of the wines in
a day,\nwhich may not be feasible at huge numbers.\n\n\nSide Note: What mak
es him an evil king and why are we trying to protect him?\n\n'
                                                                      In [54]:
#---------------C3-54------------------------------
"""
The algorithm below works in O(n) time, but uses O(n) space as well
"""
import random
def find_most_frequent(n):
   S = []
   for _ in range(n):
       S.append(random.randint(0, 4*n-1))
print (S)
   counts = [0]*(4*n)
   max_int = 0
   for num in S:
       counts[num] += 1
       if counts[num] >= counts[max_int]:
           max_int = num
   print(max_int, counts[max_int])
   if counts[max_int] == 1: return False
   else: return max_int
find_most_frequent(1000)
[2339, 956, 1188, 1307, 684, 1406, 3358, 3963, 2671, 3983, 516, 2127, 1996,
3569, 1032, 1237, 2398, 3008, 3473, 475, 251, 1400, 1300, 2224, 2670, 2203,
2201, 986, 2570, 569, 1772, 588, 2936, 218, 658, 1738, 1687, 1425, 1654, 27
84, 3553, 148, 989, 1294, 2172, 478, 1570, 3225, 3731, 1996, 2832, 1448, 28
66, 3356, 1246, 2771, 2212, 3310, 922, 1885, 1583, 158, 3539, 1713, 3240, 1
419, 547, 1657, 1859, 77, 3931, 3390, 616, 3636, 871, 3455, 2855, 1906, 413
, 25, 3823, 1268, 1576, 2037, 3345, 1296, 3740, 3773, 3237, 1337, 2933, 305
8, 2330, 485, 3031, 2956, 2158, 521, 707, 3984, 1175, 1643, 3726, 2850, 307
, 2085, 1310, 1565, 3208, 2003, 997, 3338, 2484, 1115, 446, 2246, 2345, 423
, 2957, 230, 2329, 1114, 3463, 3084, 2366, 1516, 1871, 1625, 203, 1587, 454
, 808, 3361, 915, 537, 1537, 1155, 3320, 3490, 2318, 2006, 643, 312, 1741,
3264, 749, 1831, 2064, 311, 2794, 3313, 2886, 3001, 3607, 1475, 2885, 519,
742, 2150, 3165, 1193, 2594, 2069, 3955, 1005, 2654, 775, 1567, 283, 2881,
1245, 1310, 607, 1109, 1647, 1555, 1713, 851, 2371, 920, 2014, 1452, 1637,
406, 3369, 617, 885, 1981, 2970, 3208, 3874, 2476, 2079, 1724, 976, 2725, 1
943, 1232, 3071, 530, 3144, 3015, 955, 2317, 2022, 256, 835, 2434, 1947, 32
53, 2020, 3021, 679, 2853, 2776, 3390, 2332, 3866, 2186, 1492, 368, 1386, 3
585, 79, 2574, 240, 1663, 1845, 3299, 2932, 3695, 1754, 2292, 2131, 529, 12
76, 916, 2612, 3626, 3237, 73, 2675, 3583, 2839, 2393, 2454, 507, 3661, 395
1, 2887, 825, 2163, 1912, 2204, 2288, 1757, 3482, 3058, 1719, 2515, 3018, 3
441, 2822, 3206, 1812, 825, 809, 1218, 1172, 2458, 1198, 737, 3732, 3368, 3
677, 3824, 3335, 2169, 3932, 1588, 232, 2743, 1164, 625, 1353, 1338, 2108,
875, 1949, 1384, 3987, 2969, 3142, 3169, 75, 2395, 2544, 3553, 3786, 3680,
1361, 1102, 612, 2804, 3793, 2863, 1530, 3957, 3256, 479, 1100, 761, 52, 37
26, 733, 2832, 1786, 3582, 179, 1672, 1899, 1088, 3876, 2826, 3214, 2109, 1
434, 3692, 2963, 1685, 2954, 1877, 1686, 1081, 2229, 1915, 3530, 1103, 76,
1530, 245, 247, 1311, 3157, 2907, 544, 3951, 2951, 3982, 1289, 2990, 1171,
1440, 2698, 1535, 1778, 492, 3976, 889, 699, 3978, 1617, 1868, 2545, 444, 3
140, 3358, 651, 2147, 946, 242, 3247, 337, 3242, 3582, 3224, 3383, 20, 1845
, 3551, 3444, 37, 1639, 3611, 1733, 2299, 2102, 3548, 3053, 2251, 3865, 349
6, 1900, 270, 3204, 518, 2863, 2549, 3226, 2337, 1493, 2170, 3269, 3538, 22
11, 2021, 2168, 3184, 1056, 770, 3515, 1583, 1195, 1480, 3950, 2240, 2627,
2401, 1648, 3046, 1254, 259, 2776, 2751, 1099, 2224, 1619, 3763, 2907, 906,
1731, 1582, 225, 1902, 315, 1620, 1883, 1268, 2191, 3930, 2691, 1171, 3813,
2962, 3233, 1293, 3008, 2449, 2888, 3137, 1287, 3471, 992, 711, 2729, 2290,
1386, 3731, 715, 890, 2003, 1868, 540, 1789, 2525, 1931, 3606, 1670, 3650,
443, 1849, 1442, 3909, 2281, 2226, 1321, 3732, 745, 574, 803, 3649, 2339, 5
10, 324, 770, 1214, 3315, 2760, 399, 2475, 1750, 3464, 495, 823, 2603, 251,
2347, 1879, 1827, 748, 963, 141, 593, 2076, 2595, 2401, 1990, 2606, 991, 31
93, 1935, 1228, 2665, 3210, 3743, 1388, 1124, 1862, 3646, 1071, 1461, 830,
418, 1528, 1160, 1064, 3797, 2299, 502, 1530, 856, 3441, 3366, 1740, 3801,
3697, 1828, 2983, 2675, 2393, 237, 1109, 3686, 392, 8, 3490, 1534, 516, 364
5, 3662, 996, 1086, 2013, 3238, 2616, 1791, 2926, 1564, 584, 3778, 955, 186
0, 1144, 3039, 326, 2805, 1946, 979, 2637, 2367, 482, 3915, 351, 165, 2501,
1445, 3799, 164, 3658, 1395, 3412, 2191, 2844, 210, 2881, 1068, 2577, 3764,
3917, 2796, 2089, 1578, 2611, 2911, 2249, 1203, 861, 522, 3529, 73, 1081, 2
457, 3960, 3641, 1513, 1097, 202, 1437, 2448, 560, 2376, 1248, 1303, 3586,
500, 404, 1486, 3492, 3231, 98, 1795, 722, 3762, 3996, 3437, 3629, 1032, 17
28, 207, 3056, 3036, 2688, 2552, 1460, 1604, 3030, 2871, 3400, 3935, 517, 2
805, 3083, 3772, 22, 3778, 3349, 2999, 1534, 3707, 2193, 1742, 3149, 3913,
216, 1914, 3263, 399, 3203, 2637, 2238, 465, 432, 443, 119, 3151, 3863, 336
0, 3986, 128, 2155, 2831, 807, 353, 2158, 732, 2990, 2001, 1672, 919, 72, 8
64, 1231, 42, 2735, 1919, 2354, 1779, 2123, 3291, 2489, 1213, 1714, 3204, 4
47, 2206, 2870, 1583, 1577, 315, 829, 1739, 397, 348, 2506, 810, 19, 2069,
2209, 545, 3305, 3003, 2961, 535, 3385, 58, 990, 1557, 2700, 2734, 1044, 82
3, 912, 952, 2213, 2642, 1267, 2909, 451, 553, 2987, 3836, 71, 1505, 799, 1
980, 759, 193, 3902, 2970, 2082, 3428, 750, 3650, 1904, 3578, 3638, 3043, 1
525, 2072, 1000, 3661, 3830, 2239, 1125, 1903, 3362, 604, 1269, 1225, 1074,
2473, 3506, 3383, 402, 1590, 2189, 744, 2173, 1575, 1578, 1691, 2151, 1082,
2903, 471, 2512, 3004, 1232, 967, 2383, 3154, 438, 2707, 3538, 68, 1976, 21
0, 216, 1654, 909, 3232, 1906, 338, 3159, 2268, 998, 2552, 439, 3535, 3772,
3621, 3620, 975, 2164, 3818, 3335, 2365, 2051, 2693, 2933, 3377, 3376, 1505
, 682, 144, 1814, 2779, 1314, 640, 1460, 3463, 3350, 646, 663, 2624, 3998,
3592, 3563, 3320, 3772, 3650, 2645, 2073, 1361, 1031, 594, 2720, 3543, 2550
, 3793, 907, 1890, 1673, 2180, 2755, 3, 1330, 601, 180, 1010, 3822, 297, 36
98, 3590, 380, 1125, 866, 3154, 651, 210, 231, 117, 3840, 1743, 3626, 3732,
2255, 3073, 175, 1044, 855, 374, 432, 3656, 2955, 1522, 1889, 1458, 1038, 3
714, 682, 272, 795, 1287, 1046, 1540, 2267, 3078, 1238, 915, 2815, 2484, 37
5, 1899, 2313, 1437, 2787, 2744, 1497, 730, 2869, 202, 22, 3433, 2371, 1729
, 2026, 3979, 1135, 269, 2379, 3238, 331, 2394, 154, 3514, 1311, 3994, 1789
, 2105, 3518, 2014, 114, 1792, 3829, 2159, 2542, 692, 3775, 2014, 202, 3204
, 3046, 2570, 3734, 2278, 1175, 2413, 1535, 2321, 3557, 1268, 2688, 1475, 3
087, 859, 1456, 2789, 3812, 2665, 1583, 1415, 178, 2032, 37, 158, 78, 1795,
896, 806, 2615, 1105, 516, 2665, 2669, 2447, 85, 2405, 3221, 702, 2949, 140
, 1716, 1251, 3659, 1847, 987, 1013, 3563, 3547, 71, 2082, 3787, 3015, 2171
, 1935, 2458, 3292, 2386, 863, 1078, 2850, 3778, 1413, 2101, 1800, 662, 769
, 819, 3490]
1583 4
                                                                     Out[54]:
1583
Projects
                                                                      In [55]:
#--------------P3-57----------
"""
Note, I'm practicing the use of decorators here, you can replace the
decorators
by adding the before = time() and after = time() in the main functions to
get the same behaviour
Side Note: I only tested up   to 1e5 for timing reasons, feel free to
increase n if you want, but   note that it
takes 100x as long for each   n added
"""
import time
import matplotlib.pyplot as   plt
class Prefix():
    def __init__(self):
        self._timer_array = []
    def timer(func):
        def wrapper(*args, **kwargs):
            before = time.time()
            for _ in range(20):
                func(*args, **kwargs)
            after = time.time()
            args[0]._timer_array.append((after-before)*100000)   #Note,
since the first argument is self, this is how we access self
            print ('Total time is ', after-before)
        return wrapper
   @timer
   def prefix_average1(self, S):
       n = len(S)
       A = [0]*n
       for j in range(n):
           total = 0
           for i in range(j+1):
               total += S[i]
           A[j] = total/(j+1)
       return A
   @timer
   def prefix_average2(self, S):
       n = len(S)
       A = [0]*n
       for j in range(n):
           A[j] = sum(S[0:j+1])/(j+1)
       return A
   @timer
   def prefix_average3(self, S):
       n = len(S)
       A = [0]*n
       total = 0
       for j in range(n):
           total +=S[j]
           A[j] = total/(j+1)
       return A
   def _reset_timers(self):
       self._timer_array = []
    def test_timers(self, e_n = 5):
        results = {}
        for name, func in [('1',self.prefix_average1), ('2',
self.prefix_average2), ('3', self.prefix_average3)]:
            for i in range(1, e_n):
                test_array = list(range(10**i))
                func(test_array)
            results[name] = self._timer_array
            self._reset_timers()
p = Prefix()
x, results = p.test_timers(5)
for f in results.values():
    plt.plot(x, f)
plt.xscale('log')
plt.yscale('log')
Total time is 0.0
Total time is 0.004987239837646484
Total time is 0.5091602802276611
Total time is 47.898741722106934
Total time is 0.0
Total time is 0.001994609832763672
Total time is 0.07579708099365234
Total time is 7.110187768936157
Total time is 0.0
Total time is 0.0
Total time is 0.0019941329956054688
Total time is 0.02792525291442871
                                                                    In [56]:
#------------------P3-56--------------------
#Note, I am subclassing Prefix so you have to run that cell first
"""
Note, I'm not worrying about the fact that some of the lower ones are 0.0
due to rounding since adding more tests
would slow things down at higher values of n.
This is why O(n^3) or O(n^2) are not desirable... they really explode as n
gets large
"""
class SumTimer(Prefix):
    def __init__(self):
        super().__init__()
    def timer(func):
        def wrapper(*args, **kwargs):
            before = time.time()
            for _ in range(20):
                func(*args, **kwargs)
            after = time.time()
            args[0]._timer_array.append((after-before)*100000)   #Note,
since the first argument is self, this is how we access self
            print ('Total time is ', after-before)
        return wrapper
      @timer
      def example1(self,S):
          n = len(S)
          total = 0
          for j in range(n):
              total+= S[j]
          return total
      @timer
      def example2(self, S):
          n = len(S)
          total = 0
          for j in range(0, n, 2):
              total += S[j]
          return total
      @timer
      def example3(self, S):
          n = len(S)
          total = 0
          for j in range(n):
              for k in range(1+j):
                  total += S[k]
          return total
      @timer
      def example4(self, S):
          n = len(S)
          prefix = 0
          total = 0
       for j in range(n):
           prefix += S[j]
           total += prefix
       return total
   @timer
   def example5(self, A,B):
       n = len(A)
       count = 0
       for i in range(n):
           total = 0
           for j in range(n):
               for k in range(1+j):
                   total += A[k]
           if B[i] == total:
               count += 1
       return count
s = SumTimer()
x, results = s.test_timers(4) #Note, even at 4 it takes 454 seconds for
the last step of example5. Increase at your own risk!
for f in results.values():
    plt.plot(x, f)
plt.xscale('log')
plt.yscale('log')
Total time is 0.0
Total time is 0.0
Total time is 0.0009975433349609375
Total time is 0.0
Total time is 0.0
Total time is 0.0009970664978027344
Total time is 0.0
Total time is 0.00498652458190918
Total time is 0.45677876472473145
Total time is 0.0
Total time is 0.0009970664978027344
Total time is 0.000997781753540039
Total time is 0.0009970664978027344
Total time is 0.46526145935058594
Total time is   455.0140790939331
                                                                      In [57]:
#-----------P3-57----------------------
"""
Note, if the method really runs in nlogn time, the output 2^(time/n) should
be linear
import random
import time
import matplotlib.pyplot as plt
n_e_max = 15
times, y = test_sorted(n_e_max)
x = [2**x for x in range(1, n_e_max)]
plt.plot(x, y)
plt.show()
"""
Conclusion: it actually performs better for the test cases that we have
provided!
"""
                                                                     Out[57]:
'\nConclusion: it actually performs better for the test cases that we have
provided!\n\n'
                                                                      In [58]:
#---------P3-58---------------------
import time
class Unique():
    NUM_TESTS_PER_TIMECHECK = 10000
    def __init__(self):
        pass
      def tests_per_minute(func):
          def wrapper(*args, **kwargs):
              total_time = 60 #For one minute
              counter = 0
              while total_time >= 0:
                  before = time.time()
                  for _ in range(args[0].NUM_TESTS_PER_TIMECHECK):
                      out = func(*args, **kwargs)
                  after = time.time()
                  total_time -= after-before
                  counter += 1
                  #print (total_time)
   @tests_per_minute
   def unique1(self, S):
       for j in range(len(S)):
           for k in range(j+1, len(S)):
               if S[j] == S[k]:
                   return False
       return True
   @tests_per_minute
   def unique2(self, S):
       temp = sorted(S)
       for j in range(1, len(temp)):
           if S[j-1] == S[j]:
               return False
       return True
n = 5
u = Unique()
#u.unique3([1,2,3,4,5,6])
results = u.test_each_algo(n)
for key, value in results.items():
    print (f"The total number of tests for {key} is {value} for n = {n}")
The total number of tests for Unique1 is 29020000 for n = 5
The total number of tests for Unique2 is 74270000 for n = 5
The total number of tests for Unique3 is 11090000 for n = 5
Fin :)