Exercises
April 7, 2019
IMPORTANT
This document is provided to you in printed form, and is intended for you alone. It may not be copied, sold (in particular over the internet), or redistributed in any form, for any reason.
1 Introduction – complexity
1.1 Factorial
1.1.1 Statement
Show that

n! = ω(2^n).
1.1.2 Solution
We need to show that for any a > 0, there exists an integer n0 such that for all n ≥ n0,

n! ≥ a · 2^n, i.e. n!/2^n ≥ a,

where a can be as large as desired. To show this, it suffices to see that

n! = 1 × 2 × 3 × ... × n ≥ 2 × 3^(n−2),

so that n!/2^n ≥ (1/2)(3/2)^(n−2), which exceeds any fixed a for n large enough.
1.2 Statement
Let f, g : N+ → N+ be two functions. Prove or invalidate each of the following claims:
1.3 Solution
1. False. Counter-example: f(n) = n and g(n) = n^2. We indeed have f(n) = O(n^2), since this is equivalent to n ≤ an^2, i.e. 1/n ≤ a, so it suffices to choose a = 1. However, the converse would require n^2 ≤ an, i.e. n ≤ a, to hold, that is, a can't be a constant.
3. True provided g(n) > 1 for n large enough. Since we know that f(n) = O(g(n)), there must exist constants n0 > 0 and a > 0 such that

∀n ≥ n0, f(n) ≤ a g(n), hence log f(n) ≤ log a + log g(n).

On the other hand, log f(n) = O(log g(n)) means there would exist b > 0 such that

log f(n) ≤ b log g(n)

for n large enough. Can we find a value of b that would keep both inequalities true? Indeed: since g takes integer values, g(n) > 1 means g(n) ≥ 2, so log g(n) ≥ log 2, and b = 1 + log a / log 2 keeps both inequalities true.
4. True. Again, f(n) = O(g(n)) means there exists a > 0 such that f(n) ≤ a g(n) holds true for n large enough. On the other hand, log f(n) = O(log g(n)) means we can find b such that

log f(n) ≤ b log g(n), i.e. f(n) ≤ g(n)^b,

holds true for n large enough. To conclude, it suffices to see that

a g(n) ≤ g(n)^b ⇔ a ≤ g(n)^(b−1) ⇔ log a ≤ (b − 1) log g(n) ⇔ log a / log g(n) + 1 ≤ b.
6. True. If f(n) = O(g(n)), then there exist a > 0 and n0 > 0 such that

n ≥ n0 ⇒ f(n) ≤ a g(n) ⇔ g(n) ≥ (1/a) f(n),

which is the very definition of Ω with the rescaling constant chosen as 1/a.
8. True. Indeed,
for some constant a > 0 and n large enough. Then, very clearly
2.1.1 Statement
Find the Ω and O bounds for each of the following recurrences (floors ⌊·⌋ have been omitted to simplify notations):
• T (n) = 2T (n/3) + n2
• T (n) = T (n − 1) + n
• T (n) = T (n − 1) + 1/n
• T (n) = T (n − 1) + T (n − 2) + n/2
• T (n) = T (n/2) + T (n/3)
2.1.2 Solution
• T(n) = 2T(n/3) + n^2 = Θ(n^2), by reusing the calculation derived in the proof of the Master theorem and identifying a = 2, b = 3, d = 2.
• T(n) = T(n−1) + n = T(n−2) + n + (n−1) = ... = T(1) + Σ_{k=2}^{n} k = T(1) + n(n+1)/2 − 1 = Θ(n^2)
• T(n) = T(n−1) + 1/n = 1/n + 1/(n−1) + 1/(n−2) + ... + T(1) = T(1) + H(n) − 1, by definition of the harmonic number H(n) = Σ_{i=1}^{n} 1/i. But for positive integers x = n, log(x) = ∫_1^x dt/t admits the double bound

−1 + Σ_{i=1}^{n} 1/i ≤ log n ≤ Σ_{i=1}^{n} 1/i,

which shows H(n) = Θ(log n).
• T(n) = T(n/2) + T(n/3): what can be said is 2T(n/3) < T(n) < 2T(n/2). Reusing the calculation derived in the proof of the Master theorem gives Ω(n^{log_3 2}) ≈ Ω(n^{0.63}) for the left-hand side, and O(n) for the right-hand side.
• T(n) = 2T(n/2) + log n. Assume n = 2^p for the time being. It is easily seen that T(n) = 2^p T(n/2^p) + Σ_{i=0}^{p−1} 2^i log(n/2^i). The last term evaluates to

Σ_{i=0}^{p−1} 2^i log(n/2^i) = (2^p − 1) log n − Σ_{i=0}^{p−1} i 2^i log 2 = (2^p − 1) log n − (2 − 2^{1+p} + p 2^p) log 2.

With log n = p log 2, this simplifies to (2^{p+1} − p − 2) log 2 = Θ(2^p) = Θ(n), so that T(n) = Θ(n).
• T(n) = T(n^{1/2}) + 1. Unrolling,

T(n) = T(n^{1/2}) + 1 = T(n^{1/4}) + 1 + 1 = T(n^{1/8}) + 1 + 1 + 1 = ... = T(n^{1/2^h}) + h.

Two answers are possible then:

– Either we should consider that the argument n in T(n) is non-integer, and the exponent 1/2^h will never reach 0 for any h, even arbitrarily large: in this case, the recursion never stops, and the recurrence does not admit any Ω or O bounds.

– Or this argument has to be an integer, which assumes n must be a power of 2. In this case, we should stop recursing immediately after solving T(n = 2), for which there is a solution in h:

n^{1/2^h} < 2 ⇔ (log n)/2^h < log 2 ⇔ 2^h > (log n)/(log 2) ⇔ h log 2 > log((log n)/(log 2)),

that is, h > log2 log2 n, so that T(n) = Θ(log log n).
• Conclude that the running time of fibo3 must be no more than O(n^2 log n).
• Assuming that two n-bit integers can be multiplied in O(M(n)) time, can you prove that fibo3 runs in O(M(n)) time?
2.2.2 Solution
Our problem is therefore to compute Mn , for any value of n. We can do this thanks to the
following function (C-like style, where Matrix is a special type for handling matrices):
Matrix power(Matrix M, unsigned n)
{
    Matrix P;
    /* trivial cases */
    if (n == 0) return I;  /* the identity matrix */
    if (n == 1) return M;
    /* non-trivial cases */
    P = power(M, n/2);
    if (n % 2 == 0)        /* n is even */
        return P*P;
    else
        return P*P*M;      /* n is odd */
}
We would initially call power(M, n), with n the desired value. If we count multiplications, say T(n) for input argument n, we have that

T(n) = T(n/2) + O(1),

as the function calls power(M, n/2) only once, and the final result may eventually be multiplied by M, which is only one extra multiplication, so O(1). The above recurrence relation resolves to T(n) = O(log n), the desired result.
We could also have proposed the following procedural code:

Matrix power(Matrix M, unsigned n)
{
    Matrix R = I, T = M;   /* R: result, T: successive squares of M */
    while (n > 0)
    {
        if (n % 2 == 1)
            R = R * T;
        T = T * T;         /* next square of T */
        n = n / 2;
    }
    return R;
}

which considers the binary decomposition n = Σ_i 2^i b_i: bit b_i tells us whether the corresponding power of M should be included in the result or not, which is what n % 2 == 1 is testing. As n cannot be made of more than log2 n bits, the while loop runs in O(log n) steps too, and involves 2 multiplications each time. This again leads to the same result.
Regarding the number of bits needed to compute M^n, we know that F_{n+1} = F_{n−1} + F_n and that (F_n)_n is an increasing sequence. So we have that

F_{n+1} ≤ 2 F_n,

which, repeated n times, gives

F_n ≤ 2^n.

The highest number stored in M^n can therefore be no larger than 2^n, which requires n bits to be stored. We have 4 such numbers, a constant. So the number of bits needed is well bounded by O(n). Since the naive multiplication of two n-bit integers takes O(n^2) time (n additions of n-bit integers), and no more than log(n) such multiplications may be involved, the overall complexity must be at worst O(n^2 log n).
Could we conclude that fibo3 is O(M(n)) in time? Strictly speaking, it depends on M(n), which is expected to be less than quadratic, but certainly more than linear. Let us suppose M(n) ≤ αn^c, where c > 1 is some constant. Then:

T(n) ≤ T(n/2) + αn^c ≤ T(n/4) + α(n/2)^c + αn^c ≤ ... ≤ T(1) + αn^c Σ_{i≥0} 2^{−ic}.

The last sum may involve an arbitrarily large number of terms, but can never reach the value of 2. Therefore T(n) ≤ T(1) + 2αn^c, which shows T(n) = O(M(n)) for any c > 1.
2.3 Pr Sempron
2.3.1 Statement
Professor Sempron has n chipsets, not all of them are in good state. He can only test them
pairwise, and according to the following:
• if a chip is good, its opinion on the other chip will always be: positive if the other chip is good, negative otherwise
• if a chip is damaged, its opinion cannot be trusted: it may be positive or negative, regardless of the other chip's state
• a test is positive if and only if the opinions of both chips are positive
1. Show it is impossible to diagnose which chips are reliable if more than n/2 of them are
damaged.
2. Assuming that more than n/2 chips are good, show that only Θ(n) operations are
sufficient to find a good chip amongst the n
3. Show that Θ(n) operations are sufficient to diagnose the good chips, still assuming
more than n/2 of them are good.
2.3.2 Solution
1) For convenience and clarity, we think of the problem in terms of unoriented graphs: a
chipset is represented by a node, and there is an edge between a pair of nodes (n1 , n2 ) if and
only if n1 ’s opinion about n2 is that it is good and n2 ’s opinion about n1 is that it is good.
Three important facts follow :
1. There must be an edge between any pair (n1, n2) of good chipsets, for otherwise at least one of them would answer negatively, thus violating its definition.
2. Consequence of point 1: the subset of good chipsets must form a clique (a fully connected subgraph).
3. The subset of damaged chipsets must be disconnected from the clique of good chipsets: any such edge would link a good and a bad chipset together, which is not possible, as the good one has to answer negatively in such a case.
Apart from being disconnected from the set of good chipsets, the set of damaged chipsets has no reason to assume any particular shape, as Fig. 1 shows. In particular, if it is not a clique, then it is identified as the one representing the bad chipsets, even if it contains more than n/2 of them.
[Figure 1 here: the good chips form one clique and the bad chips another, with no edge between both; panel (b) shows one isolated good chip and one isolated bad chip.]

Figure 1: Pr Sempron's problem. (a) Results of testing all possible pairs of chipsets; (b) why the set of bad chipsets can't be chosen to be in the majority.
However, in the worst case this set may perfectly well be a clique too (all chipsets providing fake answers). In this latter case, we are facing two cliques with no means of distinguishing them... except if we add the requirement that one has to be in the majority. However, that one can't be the set of damaged chipsets, because of the counter-example shown in Fig. 1 (b): 1 good chip only, making a (degenerate) clique, n−2 damaged chipsets forming a clique, and 1 damaged chipset not connected to any other one. It is still impossible to distinguish which of the isolated nodes is good or damaged. The result follows.
2) Call G the set of good chipsets, and D that of damaged ones. We claim that Alg. 1 actually provides the answer to the problem in Θ(n) time.
Why does it work? The underlying idea is very simple: if you test any pair of chipsets (c1, c2) and find that the test is positive, then c1 and c2 must be of the same nature: either both good or both damaged. To see this, check that i) two good chipsets do have to answer positively, for otherwise the property that a good chipset always reports the truth about the other would be violated, and ii) the only other possible case for a positive test is that both are damaged. So in such a case, we have 2 chipsets of the same nature, though we do not know (yet) this nature. It is therefore useless to consider both of them any further: it suffices to keep one of them, and save the information that another chipset is of the same nature as that one.
On the contrary, let us assume that the test is negative. Then at least one of them is damaged. This means that one of the inequalities |G| − 1 > |D| − 1 or |G| > |D| − 2 must hold in this case; therefore the hypothesis |G| > |D| is preserved if we drop both chipsets (get them definitively away from the problem, without changing its solution).

Algorithm 1 Solving Pr. Sempron's problem.
1. Init: let S := {(c1, 1), (c2, 1), ..., (cn, 1)}, where c1, ..., cn represent the chipsets.
3. Set Z := ∅.
4. Choose any two couples (c1, n1), (c2, n2) from S such that n1 ≤ n2. Remove these couples from S.

Repeated applications of this test leave us in one of the following cases:
• We have "bags" of chipsets of sizes n1 and n2, and find that a pair of representatives tests positively: this amounts to saying that we have a single "bag" of size n1 + n2, all of the same nature.
• We find that the test is negative. In that case, n1 couples (one chipset from each bag) can be removed without changing the answer of the problem.
Thus, |G| > |D| always holds in our method, and at step 8, |S| becomes at worst ⌊(|S| − 1)/2⌋ + 1 in case n is odd and there is only one more good chipset than damaged ones. Thus,

T(n) ≤ T(n/2 + 1) + Θ(n)

holds, which resolves to T(n) = O(n) by the Master theorem. For the lower bound, it suffices to see that the best possible case when iterating from step 8 to step 2 is when all the pairs drawn at step 4 are of different natures, except maybe 1. Yet, even this case requires n tests, so Θ(n) time is justified.
NOTE: in a divide-and-conquer fashion, this amounts to going to the finest level first (all chipsets tested pairwise), then building the higher level from conclusions drawn at the current level. Though it leads to the same tree, the divide-and-conquer paradigm is somewhat different here, in the sense that the method is better described as "bottom-up" than "top-down".
3) We can select a good chipset thanks to question 2), and know that it is a good one; that chipset can then be used to diagnose all the other ones, in linear time in all cases, so Θ(n) time again.
3 Dynamic programming
3.1 Tailcoat’s problem
3.1.1 Statement
You are given a piece of fabric with integer dimensions X × Y . You have a set of n template
objects, each of which requires a piece of fabric with integer dimensions xi × yi to be copied.
If you produce a copy of object i, your profit is ci ; you can produce as many copies of
any object you want, or none. You have a machine that can cut any piece of fabric into
two pieces, either horizontally or vertically. Propose an algorithm which tells you how to
maximize your profit.
3.1.2 Solution
In this solution, we make the assumption that xi ≤ yi for all objects, and that the same holds for any piece of fabric at any time (X ≤ Y).
Denote by Si the set of objects whose x dimension exactly equals i, and by P(x, y) the best profit we can make from any piece of fabric with integer dimensions x × y.
For the moment, assume that the dimensions of the piece of fabric are strictly greater than those of the largest object, in such a way that we must make at least one cut.
Suppose we know an optimal solution to the problem, and consider the first cut we are going
to make. Two cases may occur:
• The optimal solution says we must cut alongside the X axis, k units away from a border: we remain with 2 pieces of fabric A and B, whose dimensions are X × k and X × (Y − k).
• The optimal solution says we must cut alongside the Y axis, k units away from a border: we remain with 2 pieces of fabric A and B, whose dimensions are k × Y and (X − k) × Y.
The profit we make in each case is always P (A) + P (B), and both P (A) and P (B) must
correspond to optimal solutions too, for otherwise it would mean there would exist another
solution yielding a strictly greater sum, contradicting our optimality hypothesis.
As we know neither in which direction nor how far we should cut, we consider all possibilities, and write

Px(x, y) = max_{1 ≤ k ≤ ⌊x/2⌋} { P(k, y) + P(x − k, y) }   (1)
Py(x, y) = max_{1 ≤ k ≤ ⌊y/2⌋} { P(x, k) + P(x, y − k) }   (2)

to represent the optimal profit we make by cutting an x × y piece of fabric alongside the X and Y axes, respectively. Only the best of these profits is of interest.
Let us come back now to the case where our hypothesis on the dimensions of the piece of fabric does not hold. There are two ways to violate it:
• The dimensions of the piece of fabric are exactly those of an object i whose profit is maximal. We don't need any cut, and in this case our profit is just ci.
• The piece of fabric is too small to cover any object. We don't need any cut either, and in this case our profit is 0.
P(x, y) = max{Px(x, y), Py(x, y), P0(x, y)}   (4)

A few remarks about the implementation below:

1. We don't really need P0: it suffices to initialize a single matrix P from P0's values using (3), then update it for all possible couples (x, y) in case Px or Py does better; the following code does so, in lines 10 and 59–61.
2. We can't compute Py(x, y) before all Py(x, j), j < y, are computed: in line 64, for x fixed, we compute all successive y starting from x itself (recall y ≥ x by convention).
3. We can't compute Px(x, y) before all Px(i, y), i < x, are computed: line 63 increments x only after all P(x, y ≥ x) are computed.
5. Procedures Px and Py implement (1) and (2) and return the best cut index and profit in Best_K and Best_P.
1  procedure Dp_Tailor is
2
8     Fx : constant Positive := 5;  -- X dim. of the initial piece of fabric
9     Fy : constant Positive := 7;  -- Y dim. of the initial piece of fabric
10    P : array (1 .. Fx, 1 .. Fy) of Natural := (others => (others => 0));
11    Sol : array (1 .. Fx, 1 .. Fy) of Cut;  -- the solution
12    Kx, Ky, Cx, Cy : Natural;
13
33 begin
34    Best_P := 0;
35    Best_K := 0;  -- no cut (possible, or optimal)
36    for K in 1 .. X/2 loop
37       if Get_P (K, Y) + Get_P (X-K, Y) > Best_P then
38          Best_P := Get_P (K, Y) + Get_P (X-K, Y);
39          Best_K := K;
40       end if;
41    end loop;
42 end Px;
43
56 begin
57    -- place a few random objects with profit
58    --
59    Set_P (1, 1, 1);
60    Set_P (1, 2, 3);
61    Set_P (2, 2, 10);
62
63    for X in 1 .. Fx loop
64       for Y in X .. Fy loop
65          Px (X, Y, Kx, Cx);
66          Py (X, Y, Ky, Cy);
67          if Cx > Get_P (X, Y) then
68             Set_P (X, Y, Cx);
69             Sol (X, Y).Xcut := Kx;
70          end if;
71          if Cy > Get_P (X, Y) then
72             Set_P (X, Y, Cy);
73             Sol (X, Y).Ycut := Ky;
74          end if;
75       end loop;  -- Y
76    end loop;  -- X
77
78    -- print the solution
79    for Y in 1 .. Fy loop
80       for X in 1 .. Fx loop
81          Put (Natural'Image (P (X, Y)) & " : (" &
82               Natural'Image (Sol (X, Y).Xcut) & "," &
83               Natural'Image (Sol (X, Y).Ycut) & ")");
84       end loop;
85       New_Line;
86    end loop;
87
88 end Dp_Tailor;
Remark 1. Solving a single subproblem x × y amounts to computing a single element P(x, y), and takes Θ(x + y) time, as we have to examine all possible cuts of this piece of fabric. As we have exactly XY elements to compute to reach the solution, the overall complexity is Θ(X^2 Y + Y^2 X). It is polynomial in X and Y, but exponential in the number of bits it takes to encode X and Y, so the real complexity, expressed as a function of the input size as usual, is indeed exponential. One talks about "pseudo-polynomial" solutions in such cases.
Remark 2. If all objects have xi = 1, then this problem is exactly the unbounded integer knapsack problem. We only have to examine cuts perpendicular to the Y axis in this case, which leads to a complexity of Θ(Y^2). Again, and as expected, the complexity is exponential in the number of bits it takes to encode Y. DP solutions might therefore be exponential in time, which does not contradict the NP-completeness of a problem.
Suppose you have a machine that can process only a single task at a time. You have n such tasks a1, ..., an. The duration of task i is ti seconds, and its (absolute) execution deadline is di. If you terminate task ai before di, you earn pi Euros; otherwise, you earn nothing.
Propose an algorithm to find the scheduling that maximizes your benefit.
You may assume that all dates and durations are integers, and that any task may be scheduled more than once.
3.2.2 Solution
For convenience, we assume that the tasks are sorted by increasing deadline dates, as shown
in Fig. 2. This will cost O(n log n) time. We also assume that if task i is schedulded at time
t, it terminates slightly before t + di , in such a way that a new task can begin exactly at that
time. We denote by P (t) the best profit we can make at time t.
The underlying principle that we shall apply in both cases is that since tasks are independent
one to each other, the optimal solution to the problem at any time t does not depend on how
the optimal profit has be obtained at passed values t = 1, ..., t − 1 , but only on the values
of these profit. Consider task i: at any time t ≤ di − ti , we have to make a decision:
13
• Either we decide to schedule it: this implies that no other task can be scheduled from t to t + ti, and the profit at t + ti will be that already obtained at t, plus pi, the profit for achieving task i.
• Or we decide not to schedule it: this implies that either the profit remains unchanged, or that we schedule another task (if we can).
In other words, it always holds that the optimal profit P(t) at any time t obeys

P(t) = max ( P(t − 1), max_{i : ti ≤ t ≤ di} { P(t − ti) + pi } ).
Remark: a simpler algorithm can also be derived in case any task i is schedulable only once, with fixed starting date t = di − ti. In that case, it is only needed to update the profit at every "event" on the time axis in Fig. 2. Thus, the outer loop "for t := d1..dn" should be replaced by one over 1..2n; although there might be pairs of identical ending or starting dates, the number of events to be examined remains 2n. The final complexity is therefore O(n log n + 2n) = O(n log n).
3.3.2 Solution
As suggested by the hint in the statement, let us consider the subsequence Sj with maximum sum ending exactly at index j ∈ [1 : N], assuming that N = |S| ≥ 1. Because the ending index of Sj is j (it must include T(j)), the only unknown is its starting index, which we shall denote by bj.
Suppose we know bj, and consider S_{j+1}, the best subsequence ending exactly at index j + 1 and starting at some (unknown) index b_{j+1}. We claim the following result:
b_{j+1} = b_j if Σ_{i=b_j}^{j} T(i) ≥ 0, and b_{j+1} = j + 1 otherwise.   (6)
4  procedure Dp_Subseq is
5
10    Sum  : Integer := T(1);  -- sum of T(b .. i)
11    Best : Integer := T(1);  -- sum of T(bs .. be)
12
13 begin
14    for i in T'First + 1 .. T'Last loop
15
28
4 Greedy algorithms
4.1 A long journey
4.1.1 Statement
You are going on a long journey between Antwerpen and Napoli. Once the tank of your
car is filled, you know you can do at most n km. You have a roadmap that tells you where
the fuel stations are located, and you would like to make as few stops as possible. Give an
efficient algorithm to solve this problem.
Suggested steps:
1. Show that if your current position is p, and x and y are two consecutive towns you must
visit, then the solution of refilling your tank at town y can’t be worse that refilling at
town x.
4.1.2 Solution
Denote by xi, i = 1, ..., k, the absolute distance of fuel station i from Antwerpen, and put x0 = Antwerpen and x_{k+1} = Napoli. First observe that if ∃i ∈ [1 : k + 1] : xi − x_{i−1} > n, then don't move and remain home! We shall therefore assume xi − x_{i−1} ≤ n, ∀i ∈ [1 : k + 1], so that there always exists a next fuel station no more than n km away from the last one.
The procedure is indeed very simple:
procedure Ga_Car is
   N : constant Integer := 12; -- autonomy; must be at least the largest gap (11 km here)
   K : constant Integer := 7;
   X : array(0..K+1) of Integer := (0, 2, 5, 8, 11, 18, 29, 34, 45);
   I : Integer := 0;
   D : Integer := 0;
begin
   while I < K+1 loop
      D := X(I); -- position of the last stop
      while (I < K+1) and then (X(I+1) - D <= N) loop
         I := I+1; -- look ahead for the farthest reachable station
      end loop;
      if I < K+1 then
         Put_Line("Make a break at " & Integer'Image(X(I)));
      end if;
   end loop;
end Ga_Car;
Put briefly, when you leave station i, you always make a break at the farthest fuel station j ahead of you, but not further away than n km. Why does it work?
• If you stop at some k < j, then you leave k with your tank full, so you can reach station j equally well. The only difference is that you made an extra stop. So solution k is not better than solution j.
• If you wish to stop further than j, then the condition x(j) − x(i) ≤ n is broken.
Suppose you have a set of n lectures that need be scheduled in classrooms. Each lecture has
fixed (non-modifiable) starting and ending times. You would like to use as few classrooms
as possible to schedule all lectures.
2. Try to improve this solution to an O(n log n) time algorithm, and possibly O(n) under
the condition that all lectures start and end on exact hours, and that the algorithm is
to be run daily.
4.2.2 Solution
Let li = (bi, ei) represent lecture i, starting at date bi and ending at date ei. A naive solution could consist in maintaining a list C of classrooms, scanning the lectures one by one and placing each into the first classroom none of whose already scheduled lectures it overlaps (opening a new classroom when none fits).
A worst case occurs when there are as many classrooms as lectures, and the beginning of lecture i is earlier than the ending times of all already scheduled lectures. In that case, the inner loop is O(n), so overall O(n^2).
We may improve this time to O(n log n) as follows. First, consider a quadruplet (i, d, b, c) where:
• i is a lecture number
• d is a date, either bi or ei
• b is true if d is a beginning date, false otherwise
• c is a classroom number
To build this new solution, we need an array T of 2n such quadruplets, and a list F initially containing all classroom numbers. The algorithm is given as Alg. 3.
The algorithm works roughly like this:
• T stores all events, which consists of all dates, either of beginning or of end of all
lectures
• We walk T chronologically:
17
Algorithm 3 Greedy scheduling of lectures
for i = 1..n do
  T[2i − 1] := (i, bi, true, 0)
  T[2i] := (i, ei, false, 0)
end for
Sort T by increasing d's first, then "increasing" b's (false < true)
for j = 1..2n do
  if T[j].b is true then
    Pop the first element f in front of F
    Schedule lecture T[j].i in classroom f: record c := f in both quadruplets of lecture T[j].i
  else
    Push integer T[j].c in front of F
  end if
end for
– If we find that the next event is the end of a course (T[i].b is false), then we declare the classroom in which it was scheduled as free, and put it back in front of F.
– Otherwise, it is the beginning of a new course, so we schedule it in the first free classroom we find, and make that classroom unavailable till its end.
Note that in case several lectures begin and end at the same time, lectures that end are favoured and processed first: this, indeed, ensures optimality of the solution, as
• a classroom cannot be reinserted in F earlier than the end of the current lecture scheduled in it
• if two classrooms are available for a new course, one already used, the other not, then the fact that we use F as a LIFO ensures the former will be favoured over the latter.
The main loop is in O(n) time, and the complexity is dominated by that of sorting T, that is, O(n log n). If enumerating all possible dates is not too costly (dates are typically integer hours of a day), it may even be possible to sort T in linear time from the date histogram.
4.3 Permutation
4.3.1 Statement
Suppose you have two sequences of n positive numbers A = {ai}_{i=1}^{n} and B = {bi}_{i=1}^{n}. You are free to reorganize them as you want, after which you get a profit of Π_{i=1}^{n} ai^{bi}. Give a strategy to maximize your profit.
Suggested steps:
1. Show (by contradiction) that if x = max_i ai and y = max_i bi, then x^y must be a factor of the profit.
4.3.2 Solution
First of all, observe that maximizing the profit or its log are identical problems. Therefore we can equally well try to maximize

P = Σ_{i=1}^{n} bi log ai.

Assume, without loss of generality, that the sequences are indexed so that an = max_i ai and bn = max_i bi. Let us show by contradiction that bn log an must be a term of P. Suppose log an were multiplied by something lower than bn, say bj, with both j < n and bj < bn. The partial profit (for indices n and j) for this solution, and for the pairing we claim, is respectively

Pj = bj log an + bn log aj
Pn = bn log an + bj log aj.

As a consequence,

Pn − Pj = (bn − bj) log an − (bn − bj) log aj = (bn − bj) log(an/aj) > 0.

Repeating the argument on the remaining indices, the optimal strategy is to sort both sequences in the same order and pair them rank by rank, in O(n log n) time.
1. Consider 3 customers C1 ,C2 ,C3 , with service duration 3,5,7, and priorities 6, 11, 9.
Among the possible schedulings φ1 (1) = 1, φ1 (2) = 3, φ1 (3) = 2 and φ2 (1) = 3, φ2 (2) =
1, φ2 (3) = 2, which one is preferable?
2. Consider two schedulings φ1 and φ2 , identical everywhere except that φ1 makes cus-
tomer j served immediately after customer i, while φ2 does just the opposite. What
does ∆ = P (φ1 ) − P (φ2 ) equal to?
Figure 3:
4.4.2 Solution
Question 1
P(φ1) = 0 × 6 + 3 × 9 + 10 × 11 = 137
P(φ2) = 0 × 9 + 7 × 6 + 10 × 11 = 152
so φ1 is preferable (lower total penalty).
• should we schedule customer i or j at rank k, the time at which we'll do so only depends on the past, and neither on i nor on j. It is some time T.
• similarly, what will be scheduled starting from rank k + 2 does not actually depend on the order in which we scheduled i and j: the starting time is always T′ = T + di + dj, as + is a commutative operator.
If we schedule customer i first, then he will wait T minutes, and will cost T pi to the service.
Then will come customer j, who will wait T + di , and will cost (T + di )pj . The total cost is
therefore
P (φ1 ) = T pi + (T + di )pj
On the other hand, if we schedule customer j first, and i next, then we get a cost of
P(φ2) = T pj + (T + dj) pi
[Figure 4 here: a generic scheduling on the time axis, customer i at rank k (penalty pi, duration di), then customer j at rank k + 1 (penalty pj, duration dj), running from time T to T′.]

Figure 4:
Question 3
We have ∆ = P(φ1) − P(φ2) = di pj − dj pi, so that

∆ > 0 ⇔ di pj > dj pi ⇔ di/pi > dj/pj.   (7)

Eq. 7 gives a condition to determine whether two customers ranked consecutively must be swapped or not, and it works everywhere: this is the very definition requested for the sorting operator in any sorting algorithm.
Hence, to solve the problem, it suffices to sort the customers by increasing f(i) = di/pi. The algorithm is greedy in that it selects first the customer with the lowest possible cost ratio. Any sorting algorithm can do the job, for instance mergesort. Its complexity is O(n log n).
Consider the problem of giving back change on n cents using as few coins as possible.
1. Give a greedy algorithm that gives back change using coins of 50, 20, 10, 5 and 1 cents. Show that it is optimal. Hints: you should do this in a case-by-case fashion: by considering all possible values of n in [1:100], compare, in each case, how many coins the greedy solution requires to how many a concurrent one can require at best.
2. Suppose now you have k + 1 coins whose values are powers of some constant γ > 0,
that is, 1, γ, γ 2 , ..., γ k . Prove that your greedy algorithm is still optimal. Suggested
steps:
(a) Show that if h is the highest integer for which the numbers of coins with value γ^h differ between the greedy and a concurrent solution, then it is impossible that the concurrent solution returns more coins of that value than the greedy.
(b) Taking this result in account, show that any concurrent solution will return at
least one more coin than the greedy solution. Hints: i) try to see what happens
for γ = 2; ii) generalize for any γ > 0, and remember that for any r 6= 1,
Σ_{i=0}^{n} r^i = (r^{n+1} − 1) / (r − 1)
3. Give a set of values for the coins for which the greedy solution is not optimal.
4. Give an optimal O(nk) algorithm which gives back change whatever the values of the
coins – but assuming there is always a coin of 1 cent. Suggested steps:
(a) Suppose the optimal solution involves the subproblem of giving back change on p < n cents. Which other subproblem does this imply?
(b) Using this, and by considering p can span all possible values of the coins, give the
recurrence relation.
(c) Derive the code which will fill the table. Simply check its complexity is O(nk).
4.5.2 Solution
Question 1 Put v1 = 1, v2 = 5, v3 = 10, v4 = 20, v5 = 50. A code for the greedy algorithm could merely look as follows:

procedure Gr_Coins is
begin
   Put("Give n > 0: ");
   Get(Item => N);

Why is it optimal? Consider the two expansions n = Σ_{i=1}^{5} gi vi and n = Σ_{i=1}^{5} ci vi, which correspond to the greedy and to a concurrent (non-greedy) solution, respectively. Let h be the highest integer such that ch ≠ gh. Observe the following two points:
Table 1: case-by-case comparison for each possible h.

    h | v_h | n ∈        | concurrent uses at most .. coins | greedy uses .. coins
    1 |  1  | [1 : 5[    | N/A | N/A
    2 |  5  | [5 : 10[   | n   | n − 4
    3 | 10  | [10 : 20[  | k   | k − 1 since some subset must sum up to 10
    4 | 20  | [20 : 50[  | k   | k − 1 since some subset must sum up to 20
    5 | 50  | [50 : 100[ | k   | k − 2 since some subset must sum up to 50
1. By definition of the division, we necessarily have c_h < g_h, for otherwise we would have
c_h ≥ g_h + 1 and the concurrent expansion would exceed n.
2. Inequality g_i v_i < v_{i+1} holds for 1 ≤ i ≤ 4, for otherwise the greedy algorithm would
be able to put at least one additional coin of value v_{i+1}, a contradiction.
So g_i < v_{i+1}/v_i from point 2, and c_h < g_h from point 1, which implies c_h < g_h < v_{h+1}/v_h.
If we apply this double inequality with the possible values for h and the additional constraint
that n ≤ 99, we get that g_h ≤ 1 for h = 2, 3, and g_4 ≤ 2. So the greedy solution can always
be distinguished by at least one coin of value v_h, and the concurrent solution must have
fewer coins of that value.
Consider now an optimal solution that does include a coin of value v_h. Then the other coins
represent the optimal solution to the problem of giving change on n − v_h cents: if that was
not true, there would exist a solution for giving change on n − v_h cents using fewer coins,
which, completed with the coin of value v_h, would contradict the optimality of the whole
solution. This proves the optimal substructure of the problem.
To prove optimality, it suffices to show that the concurrent solution, which includes fewer
coins of value v_h, can always be improved towards the greedy solution for each value of h.
This is summarized in Table 1 in a case-by-case fashion, as different values of h yield different
possible intervals for n.
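The case analysis of Table 1 can also be confirmed mechanically. The sketch below (in Python, whereas the code elsewhere in this document is Ada) compares the greedy count against an exhaustive minimum for every n in [1 : 100[:

```python
from functools import lru_cache

COINS = (50, 20, 10, 5, 1)

def greedy(n):
    """Number of coins returned by the greedy algorithm for n cents."""
    count = 0
    for v in COINS:
        count += n // v   # take as many coins of value v as possible
        n %= v
    return count

@lru_cache(maxsize=None)
def optimal(n):
    """Minimum number of coins for n cents, by exhaustive recursion."""
    if n == 0:
        return 0
    return 1 + min(optimal(n - v) for v in COINS if v <= n)

# Greedy matches the optimum over the whole interval covered by Table 1.
assert all(greedy(n) == optimal(n) for n in range(1, 100))
```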
Question 2 We keep the same notations as in question 1. Inequalities established there still
apply and will prove useful:

    c_h < g_h                                        (8)
    ∀i ∈ [1 : k],   c_i < v_{h+1}/v_h = γ            (9)

Recall that (9) results from the assumption of optimality of the concurrent solution. We
shall show that this assumption leads to a contradiction.
Denote by Sg and Sc the expansions of n according to the greedy and the concurrent solutions:
    S_g = g_h γ^h + Σ_{i=0}^{h−1} g_i γ^i            (10)

    S_c = c_h γ^h + Σ_{i=0}^{h−1} c_i γ^i            (11)
From (9) and (11) we get

    S_c ≤ c_h γ^h + (γ − 1) Σ_{i=0}^{h−1} γ^i
        ≤ c_h γ^h + (γ − 1) (γ^h − 1)/(γ − 1)
        ≤ (c_h + 1) γ^h − 1                          (12)

Subtracting S_g as given by (10) then yields

    S_c − S_g ≤ (c_h + 1) γ^h − 1 − g_h γ^h
              ≤ (c_h − g_h + 1) γ^h − 1
              < 0

The last line holds because of (8), and γ > 1 (otherwise, distinguishing powers of γ is irrelevant).
This contradicts S_c = S_g = n, therefore any concurrent solution cannot be better than
the greedy solution.
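The contradiction above can also be checked numerically. A quick brute-force comparison for, say, γ = 3 and k = 3 (values 1, 3, 9, 27 — an arbitrary choice for this sketch, in Python rather than the document's Ada) confirms that no concurrent solution beats the greedy one:

```python
from functools import lru_cache

GAMMA, K = 3, 3
COINS = tuple(GAMMA ** i for i in range(K + 1))   # 1, γ, γ², ..., γ^k

def greedy(n):
    """Greedy expansion: spend the largest power of γ first."""
    count = 0
    for v in reversed(COINS):
        count += n // v
        n %= v
    return count

@lru_cache(maxsize=None)
def optimal(n):
    """Exhaustive minimum number of coins for n cents."""
    if n == 0:
        return 0
    return 1 + min(optimal(n - v) for v in COINS if v <= n)

assert all(greedy(n) == optimal(n) for n in range(1, 200))
```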
Question 3 Take, for instance, the coin values {1, 7, 8}. Then the greedy solution is not
optimal for n = 35:

    n = 35 = 4 × 8 + 3 × 1   → 7 coins (greedy)
           = 5 × 7           → 5 coins

Question 4 Since the greedy choice does not always apply, the only exploitable property of
the problem is that of optimal substructure: if an optimal solution to the problem of giving
change on n cents involves a coin of value v, then the other coins represent the solution to
the problem for giving change on n − v cents, and must be optimal too.
The only problem now is that we don't know which coin should be included in the solution;
so we try them all, discarding all those whose value is greater than n. In addition, the trivial
case is when we have to give change on n = 0 cents, with no coins at all as solution.
Denote by C(n) the "cost" for giving change on n cents, which equals the number of coins
we use. This gives the relation:

    C(n) = min_{i ∈ [1:k], v_i ≤ n} ( 1 + C(n − v_i) )

because the cost for giving change on exactly v_i cents is always 1 (1 coin of that value and
nothing else). In addition, we have the trivial case

    C(0) = 0
K : constant Positive := 4;
V : constant array(1..K) of Positive := (1, 3, 8, 9);
begin
   C(0) := 0;
   for J in 1..N loop                  -- give change on J cents
      I := 1;
      Best_Sum := Natural'Last;
      while V(I) <= J loop             -- V is sorted in increasing order
         if 1 + C(J - V(I)) < Best_Sum then
            Best_Sum := 1 + C(J - V(I));
            Best_I   := I;
         end if;
         I := I + 1;
         exit when I > K;
      end loop;
      C(J)     := Best_Sum;
      Coins(J) := Best_I;              -- index of the coin chosen for J cents
   end loop;
end Tabulate;
In this code we have taken advantage of the loop on J to also tabulate an extra array
named Coins, which records, for each value of n, the coin chosen by an optimal solution,
so that the actual change can be reconstructed afterwards.
The complexity is obviously O(nk), as the outer loop runs over the n different values of J,
and the inner loop examines at most k different values of v_i each time.
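The same tabulation is easy to express in Python (a transcription of the Ada sketch above, with a change-reconstruction helper added for illustration):

```python
def tabulate(n, values=(1, 3, 8, 9)):
    """Fill C[0..n] with minimal coin counts; best[j] records the coin
    an optimal solution uses for amount j (unused for j = 0)."""
    INF = float("inf")
    C = [0] + [INF] * n
    best = [0] * (n + 1)
    for j in range(1, n + 1):          # give change on j cents
        for v in values:               # try every coin not exceeding j
            if v <= j and 1 + C[j - v] < C[j]:
                C[j] = 1 + C[j - v]
                best[j] = v
    return C, best

def change(n):
    """Walk the best[] table back to an optimal multiset of coins."""
    C, best = tabulate(n)
    coins = []
    while n > 0:
        coins.append(best[n])
        n -= best[n]
    return coins
```

For instance, with the values (1, 3, 8, 9) above, change(16) returns two coins of 8, while a greedy choice (9, then 3, 3, 1) would use four: the counter-example of question 3 in action.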
4.6 Party
4.6.1 Statement
Your boss asks you to organize a party in which new colleagues can meet. So that the party
be successful, you believe it reasonable not to invite someone if he/she knows more than
n − p persons out of the n, or fewer than p. However, you would like to invite as many
people as possible.
• Model the problem as a graph: each vertex is a person, each edge represents the
fact that two persons know each other
• A subgraph is obtained by removing a given vertex in a graph. Observe that
any vertex can’t be connected to more vertices in any subgraph than it is in the
original graph.
• Show that knowing n − 1 persons in the graph of acquaintances is equivalent to not
knowing 1 in the graph's complement.
4.6.2 Solution
Question 1
We represent the problem with an undirected graph G = (V, E): v ∈ V represents person v,
while e = (u, v) ∈ E ⊆ V × V encodes the fact that persons u and v know each other. Recall
that the degree d_V(x) of a vertex x is the number of edges it is involved in, or equivalently,
the number of other vertices it is connected to.
For p = 1, we are looking for the largest possible subgraph G′ = (V′, E′) of G, satisfying the
following two conditions:

C1  ∀x ∈ V′, ∃y ∈ V′, y ≠ x : (x, y) ∈ E′
C2  ∀x ∈ V′, ∃y ∈ V′, y ≠ x : (x, y) ∉ E′
Without discussing the uniqueness of the solution (we just want one), we claim that Alg. 4
computes a correct solution to the problem. Its principle is very simple: initially, we set
V′ = V, then we examine in turn each node x of V′ and remove it if either d(x) = 0 or
d(x) = |V′| − 1 holds, until stability. Why does this work? Everything relies, indeed, on this
obvious result:

Lemma: for any subgraph H of a graph G, and any vertex x of H, d_H(x) ≤ d_G(x).

In other words, a vertex cannot be more connected in a subgraph than it is in the original
graph, which falls under common sense. A consequence is that if a node is disconnected in
a graph, then so is it in any subgraph as well.
Since the negation of C1 for a node x, namely ∀y ∈ V′, y ≠ x : (x, y) ∉ E′, is equivalent to
d_{V′}(x) = 0, and since d_{V′}(x) = 0 ⇒ d_{V′′}(x) = 0 for any V′′ ⊂ V′ by virtue of the lemma,
we conclude that all nodes of G with degree 0 cannot belong to the solution because they
don't satisfy C1, and this holds regardless of C2.
V′ is therefore at best equal to V′′ = {x ∈ V : d(x) > 0}, and V′′ is obviously the largest
possible subset of V satisfying condition C1.
Let us now turn to C2. Consider V′′ as computed before, and x ∈ V′′. If x satisfies C2, then
we are done. If it does not, it means it is connected to all other nodes of V′′. Consider then
G′′ = (V′′, V′′ × V′′ \ E′′), the complementary graph of V′′ in its set of edges. In this graph,
x is connected to nothing. So d_{G′′}(x) = 0, and by virtue of the lemma, d_{G′′′}(x) = 0 holds
too for any subgraph G′′′ of G′′. This means there is no subgraph of G′′ which can contain
x and be a solution at the same time. Therefore, x cannot belong to the solution and has to
be removed.
To conclude, observe that a vertex x which violates C1 can be removed at any time, without
contradicting d_{V′′}(x) = 0, and independently of V′′. This justifies that "submitting" V′′ to C1
for removal, then to C2 for removal, is equivalent to checking both conditions, and possibly
removing nodes, one by one, in a single loop.
Algorithm 4 The largest successful party for p = 1
V′ := V, n := |V′|
for x ∈ V′ do
    if d(x) = 0 or d(x) = n − 1 then
        for y ∈ V′ : (x, y) ∈ E′ do
            d(y) := d(y) − 1
            E′ := E′ \ {(x, y)}
        end for
        n := n − 1, V′ := V′ \ {x}
    end if
end for
Invite V′ to the party
If G is given in the form of a sorted planar map², then we can iterate on the neighbours of
any node in logarithmic time, so line E′ := E′ \ {(x, y)} takes O(log n) time at worst, which
occurs when G is a clique. This line would be executed O(n) times by the loop on y's, which
in turn will be executed O(n) times by the loop on x's. Altogether, the algorithm is
therefore O(|V|² log |V|). In the best case (the graph is a line), there is no removal at all, so
Ω(|V|).
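For concreteness, here is a small Python sketch of the same peeling process. It is written as a repeat-until-stable loop with naive degree recomputation (no sorted planar map), so its complexity is worse than the bound above; it only illustrates the removal rule:

```python
def party(vertices, edges):
    """Keep removing guests who know nobody (d = 0) or everybody
    (d = |V'| - 1) among the remaining guests, until stability (p = 1)."""
    V = set(vertices)
    E = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for x in list(V):
            d = sum(1 for e in E if x in e)    # degree of x in the current graph
            if d == 0 or d == len(V) - 1:
                V.discard(x)
                E = {e for e in E if x not in e}
                changed = True
    return V
```

On a path of four people everyone can come; on a triangle, everyone ends up excluded (each guest in turn knows everybody remaining).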
Question 2
The above proof can be easily extended when p ≥ 2:
• For condition C1, if d_G(x) < p, then d_H(x) < p too for any subgraph H of G; so x has
to be removed
• For condition C2, if d_Ḡ(x) < p, then d_H̄(x) < p too for any subgraph H of G (the
lemma applied to the complement graphs); so x has to be removed too.
5 NP-completeness
5.1 ZOE
5.1.1 Statement
Consider ZOE (zero-one equations): given an m × n matrix A filled with 0's and 1's, is there
an n-vector x filled with 0's and 1's only such that Ax = 1?
5.1.2 Solution
Question 1) ZOE ∈ NP: the certificate is just the solution itself, and it takes O(mn) binary
multiplications plus O(m) comparisons to check the validity of any solution.
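Checking a certificate is indeed a single pass over the matrix; a minimal Python sketch:

```python
def check_zoe(A, x):
    """Verify a ZOE certificate: Ax = 1, with A and x made of 0's and 1's.
    Costs O(mn) multiplications plus one comparison per row."""
    return all(sum(a * xi for a, xi in zip(row, x)) == 1 for row in A)
```

For instance, x = [1, 1, 0] is a valid certificate for the rows [1, 0, 1] and [0, 1, 0], while [1, 1] fails against [[1, 1], [1, 0]] because the first row sums to 2.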
² To each vertex is attached the set of pointers to the edges the vertex is involved in, which is
furthermore assumed to be sorted.
ZOE is NP-hard: we shall show this by reducing ZOE to n-SAT. Let us first rewrite Ax = 1
this way:

    ( a_11  a_12  ...  a_1n ) ( x_1 )   ( 1 )
    (  ⋮     ⋮          ⋮   ) (  ⋮  ) = ( ⋮ )        (14)
    ( a_m1  a_m2  ...  a_mn ) ( x_n )   ( 1 )
Consider, for the moment, the i-th row of this linear system of equations alone:

    Σ_{j=1}^{n} a_ij x_j = 1        (15)
From now on, we assume that i is fixed and focus only on line i. Let S be the set of indices
j for which a_ij = 1, and put, for convenience, y_u = x_{S_u} for all u = 1, . . . , |S|. The sum in
Equation (15) can be rewritten as

    Σ_{u=1}^{|S|} y_u = 1        (16)
Because the y_u's can only be 0 or 1 (just as the x_i's), we can think of this last equality
in terms of logic: with a slight abuse of notation, it merely says that exactly one y_u variable
must be true, while all others need be false. If |S| = k, this is the general k-XOR condition:

    F(y_1, ..., y_k) = (y_1 ∧ ¬y_2 ∧ ... ∧ ¬y_k) ∨ (¬y_1 ∧ y_2 ∧ ... ∧ ¬y_k) ∨ ... ∨ (¬y_1 ∧ ¬y_2 ∧ ... ∧ y_k)

which is in disjunctive normal form (DNF). To recast the problem as a k-SAT problem, we
need to convert F(y_1, ..., y_k) into a CNF; this unfortunately results in a conjunction with 2^k
clauses before simplification (as shown in class), or O(2^n) in the worst case, and our reduction
to k-SAT would thus be invalid (recall, our one-to-one transformation needs to be polynomial in
time).
There is a more straightforward way to rewrite F using a polynomial number of clauses only:
it suffices, indeed, to see that F(y_1, ..., y_k) is true if and only if i) one of the y_u's is true, and
ii) no pair (y_a, y_b) with a ≠ b is simultaneously true. If no pair is simultaneously true, then
neither is any triplet, nor any quadruplet, and so on. It is therefore easily seen that

    F(y_1, ..., y_k) = (y_1 ∨ ... ∨ y_k) ∧ ⋀_{u < v} ¬(y_u ∧ y_v)

At last, we apply De Morgan's law on all clauses ¬(y_u ∧ y_v) to get ¬(y_u ∧ y_v) = ¬y_u ∨ ¬y_v,
then
    F(y_1, ..., y_k) = (y_1 ∨ ... ∨ y_k)
                     ∧ (¬y_1 ∨ ¬y_2) ∧ (¬y_1 ∨ ¬y_3) ∧ ... ∧ (¬y_1 ∨ ¬y_k)
                     ∧ (¬y_2 ∨ ¬y_3) ∧ ... ∧ (¬y_2 ∨ ¬y_k)
                     ∧ ...
                     ∧ (¬y_{k−1} ∨ ¬y_k)
which is the conjunction of a k-CNF with O(k²) = O(n²) additional 2-CNFs, so overall, a
k-CNF whenever k ≥ 2. Since we have m independent lines to satisfy, the resulting CNF
will be at worst an n-CNF with O(mn²) clauses, a number polynomial in both m and n.
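The "exactly one true" encoding above is easy to materialize and to verify exhaustively. In the sketch below (an ad-hoc literal encoding for this illustration: u stands for y_u, and −(u + 1) for ¬y_u), the CNF has exactly 1 + k(k − 1)/2 clauses:

```python
from itertools import combinations, product

def exactly_one_cnf(k):
    """CNF for F(y_0, ..., y_{k-1}): one k-clause plus O(k^2) 2-clauses."""
    clauses = [tuple(range(k))]                       # y_0 ∨ ... ∨ y_{k-1}
    clauses += [(-(u + 1), -(v + 1))                  # ¬y_u ∨ ¬y_v
                for u, v in combinations(range(k), 2)]
    return clauses

def satisfies(y, clauses):
    """Evaluate the CNF under assignment y (a tuple of booleans)."""
    lit = lambda l: y[l] if l >= 0 else not y[-l - 1]
    return all(any(lit(l) for l in c) for c in clauses)

# The CNF is true exactly when a single y_u is true.
k = 4
cnf = exactly_one_cnf(k)
assert len(cnf) == 1 + k * (k - 1) // 2
for y in product([False, True], repeat=k):
    assert satisfies(y, cnf) == (y.count(True) == 1)
```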
We have thus found a procedure to rewrite any instance of ZOE as an instance of n-SAT
in polynomial time, and our transformation works forwards, but also backward. Therefore,
ZOE ≥p n-SAT, and since n-SAT is NP-hard, so must be ZOE.
Question 2) Integer linear programming (ILP) is a problem of the form
    maximize   b^t x
    subject to A x ≤ c
where A is a matrix with integer coefficients, x, b, and c are vectors with integer coefficients,
and t denotes transposition. It is an optimization problem. Its corresponding decision version
amounts to determining whether the objective function can be larger than some constant k.
In other words: does the system of inequalities

    { b^t x ≥ k
    { A x  ≤ c        (17)

admit a solution in x or not? The latter problem is clearly in NP: the certificate is just
the solution x itself, and if A is an m × n matrix, it will take O(mn) multiplications and
additions to check the correctness of the solution. To show that ILP is NP-hard, consider
ILP problems of the form:

    (1 ... 1)^t x ≥ 0
            A x ≤ 1
           −A x ≤ −1        (18)
            0 ≤ x
            x ≤ 1

The middle two inequalities force Ax = 1, and the last two constrain x to have components
in {0, 1}. That is, (18) is a general ZOE problem, whose solution leaves the first line of (18)
always satisfied. In other words, any ILP problem of the form (18) can be rewritten as a ZOE
problem, and conversely. Therefore, there is at least a subset of ILP problems for which
ZOE ≤_P ILP holds true. But ZOE is NP-hard, so ILP can be claimed to be NP-hard too.
Bonnie and Clyde have just robbed a bank. They have a loot consisting of n dollars, and
they would like to split it into 2 parts of equal size. State whether splitting the loot is an
NP-complete problem or not in each of the following cases:
• same, but don't mind if the difference between the 2 parts is less than 100 dollars
5.2.2 Solution
We first notice that for the first three questions, the problem is not NP-hard whenever n ≠ 2p,
simply because the answer to the decision problem in such cases is negative. We shall
therefore assume n = 2p in the sequel, and focus on that case. Clearly, it takes a polynomial
number of multiplications and additions to check whether a proposal will sum to p or not,
or is in [p − 100 : p + 100] (last question). Thus in all cases, the solutions can be checked
in polynomial time, therefore the problems are in NP, and it remains to examine their
NP-hardness.
1) They have a coins of x dollars, and b coins of y dollars. We assume n = ax + by = 2p.
Suppose Bonnie receives u coins of value x and v coins of value y. Then the question amounts
to determining whether

    xu + yv = p,    or equivalently,    x(a − u) + y(b − v) = p

which can be decided by simply trying all (a + 1)(b + 1) possible pairs (u, v), a polynomial
number of them; this case is therefore not NP-hard.

2) They have coins whose values are powers of 2. Write

    n = Σ_{j=0}^{i} α_j 2^j,    p = Σ_{j=0}^{i} β_j 2^j
where the α_j's and β_j's are the coefficients of the binary expansions of n and p. Since we
want n = 2p, and the binary decompositions of p and n are unique, our problem is therefore
to determine whether these two decompositions can be considered as equivalent by possibly
relocating excess coins of value 2^i to coins of larger values 2^j, j > i, if this situation ever happens.
We claim that the following (greedy) algorithm precisely achieves this:
• Init: pack every coin in a separate bag (see why further in the text).
• For j = 0, ..., i do
    – If α_j < 2β_j or α_j − 2β_j is odd, then claim the loot is not separable, and stop.
    – If β_j > 0, then give 1 bag of value 2^j to Bonnie, and another one to Clyde.
    – Pack the α_j − 2β_j remaining bags of value 2^j, by pairs, into new bags. Put these
      new bags along with those of value 2^{j+1} and update α_{j+1} ← α_{j+1} + α_j/2 − β_j
Why does this work? The binary decomposition of p is unique. At iteration 0, if p is odd, then its
lowest bit has to be 1, so β_0 = 1. This tells us that Bonnie must receive a coin of value 1,
just as Clyde, for otherwise the split would be unfair. Two cases may then arise:
• It is not feasible because α_0 < 2 (we have no coin, or just one coin, of value 1). Then the
problem has no solution: all other coins have a value ≥ 2, therefore any combination
of them will be a multiple of 2, but never 1.
• On the contrary, if α_0 ≥ 2, then we can give a coin of value 1 to Bonnie, and another
to Clyde. But we might be left with a new problem: what to do with the remaining coins,
if any? As we said earlier, all the remaining bags now have a value of at least 2. This
implies two things:
1. We don't need to care about coins of value 1 any longer: the "unit" is now 2, not
1. So we group pairs of coins of value 1 into bags of value 2, and put these new
bags along with those of value 2: this is a new problem, but strictly identical
in nature to the former one.
2. If we have an odd number of remaining coins, we find ourselves with 1 unpaired coin
of value 1 and we are stuck: it cannot be part of the solution, so the problem,
again, has no solution.
If β_0 = 0, then there is no coin to distribute, but the problem remains the same (what to
do with the remaining coins of value 1, if any?), and our tests involving α_j − 2β_j still work.
We then start a new iteration, and face exactly the same problem with bags of value 4/2
instead of 2/1, and so on, until the end. In short, the correctness of this algorithm relies on the
fact that the question of the "precision" of the split, at any iteration j, must be answered
immediately, and does not depend on subsequent arrangements of the loot.
The algorithm is clearly polynomial in time: i iterations, each involving at worst subtractions.
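The bag-pairing procedure translates almost line for line into Python (a sketch; alpha[j] counts the coins of value 2^j, and the function first checks that the loot indeed sums to 2p):

```python
def separable(alpha, p):
    """Greedy bag-pairing: can coins with counts alpha[j] of value 2**j,
    summing to 2*p, be split into two halves worth p each?"""
    total = sum(a * 2 ** j for j, a in enumerate(alpha))
    if total != 2 * p:
        return False
    L = max(len(alpha), p.bit_length()) + 1     # room for the carries
    bags = list(alpha) + [0] * (L - len(alpha))
    for j in range(L):
        beta = (p >> j) & 1                     # j-th bit of p
        r = bags[j] - 2 * beta                  # bags left after serving both halves
        if r < 0 or r % 2 == 1:
            return False                        # unpaired bag: no split exists
        if j + 1 < L:
            bags[j + 1] += r // 2               # pair them into value 2**(j+1)
    return True
```

For instance, the coins {1, 1, 2, 2} split into two halves of 3, while {2, 2, 2} cannot be split into two halves of 3.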
3) They have bearer bonds of arbitrary value. Let v_1, ..., v_n be the values of these bonds.
The loot is separable if and only if we can find x_1, ..., x_n ∈ {0, 1} such that

    Σ_{i=1}^{n} x_i v_i = p        (20)

Assume that all v_i's can be coded using at most k bits. Call v_1, ..., v_n ∈ Z^k the column
vectors whose components are the binary decompositions of v_1, ..., v_n, and p that of p.
Then Eq. (20) is immediately rewritten as

    ( v_1  v_2  ...  v_n ) ( x_1 )   ( p_1 )
    (                    ) (  ⋮  ) = (  ⋮  )        (21)
    (                    ) ( x_n )   ( p_k )
This is almost a k × n ZOE problem, except that there might be indices i for which p_i = 0,
a violation of the statement of ZOE. To correct that, let us consider one such i. The
corresponding row implies that all variables x_j for which A_ij ≠ 0 be set to 0: for otherwise,
the equality would just not hold. Call S_i the set of indices of these variables. Then, the set
of all variables which need to be set to 0 is nothing but

    V := ⋃_{i : p_i = 0} S_i

a set whose size is arbitrary, and only depends on A and p. Thus, solving (21) amounts
to solving a true, smaller ZOE problem, obtained by taking (21), dropping all rows for which
p_i = 0, and dropping the variables of V. Because |V| is independent of k (rows of A for which
p_i ≠ 0 can still be chosen arbitrarily), there exist instances of the Bonnie and Clyde problem (BC)
such that ZOE ≤_p BC. But ZOE is NP-hard, so BC has to be NP-hard too.
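The construction of the matrix in (21), and a brute-force check of separability for small instances, can be sketched as follows (Python; the bit-matrix convention, least significant bit in row 0, is an arbitrary choice here):

```python
from itertools import combinations

def binary_matrix(values, k):
    """k x n 0/1 matrix whose column i is the k-bit binary
    decomposition of values[i] (row j holds the j-th bits)."""
    return [[(v >> j) & 1 for v in values] for j in range(k)]

def separable_bonds(values):
    """Brute force: is there x in {0,1}^n with sum x_i v_i = sum(v)/2 ?"""
    total = sum(values)
    if total % 2 == 1:
        return False
    return any(sum(c) == total // 2
               for r in range(len(values) + 1)
               for c in combinations(values, r))
```

For instance, binary_matrix([5, 3, 6], 3) stacks the columns 101, 011 and 110, and bonds of values 1, 1 and 4 are not separable.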
4) Same, but don't mind if the difference between the 2 parts is less than 100 dollars. This
does not change the complexity of the problem, which is still NP-hard. The question
addressed now is: does

    Ax = z_0

admit a solution x in {0, 1}^n for any z_0 being the binary representation of p − 100, p − 99, ...,
p + 100? If the problem were not NP-hard, that is, in P, that would mean all of its instances
could be solved in polynomial time. However, for arbitrary p large enough, all of them are
known to be NP-hard, as instances of problems of question 3. A contradiction. So the
problem remains NP-hard.
Consider the HITTING SET problem: given n sets of elements S_1, . . . , S_n, and a constant
k > 0, find a set S such that |S| ≤ k and ∀i = 1, . . . , n : S ∩ S_i ≠ ∅. Is HITTING SET
NP-hard or not?
5.3.2 Solution
NOTE: the related optimization version is: given n sets of values S_1, . . . , S_n, find the smallest
possible set S such that ∀i = 1, . . . , n : S ∩ S_i ≠ ∅.
Let us show that this is nothing but a disguised ILP problem. Consider the union of all
subsets. Say it is made of p values v_1, . . . , v_p. Now define x_1, ..., x_p as

    x_i = { 1 if v_i should be in S
          { 0 if it should not

Then the constraints |S| ≤ k and S ∩ S_i ≠ ∅ for all i translate to

    (1 ... 1)^t x ≤ k
            A x ≥ 1

where A_ij = 1 if and only if v_j ∈ S_i. This system of inequalities is identical to that of (17),
except that the signs must be changed in both lines³. Any instance of HITTING SET is therefore
equivalent to an ILP problem. But ILP is NP-hard, and so must be HITTING SET.
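A brute-force solver for the decision version makes the exponential nature of the search explicit (a sketch only; it enumerates subsets of the union of the S_i's):

```python
from itertools import combinations

def hitting_set(sets, k):
    """Is there S with |S| <= k and S ∩ S_i ≠ ∅ for all i?
    Exhaustive search over subsets of the union of the S_i's."""
    universe = sorted(set().union(*sets))
    for r in range(k + 1):
        for S in combinations(universe, r):
            if all(set(S) & Si for Si in sets):
                return True
    return False
```

For instance, {2, 4} hits all of {1, 2}, {2, 3} and {4}, but no single element does.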
³ But we made no assumptions regarding the sign of k, b, or c for ILP.