Introduction to Algorithms
Chapter 16: Greedy Algorithms
Greedy Algorithms
• Similar to dynamic programming, but a simpler approach
• Also used for optimization problems
• Idea: when we have a choice to make, make the one that looks best right now
• Make a locally optimal choice in the hope of reaching a globally optimal solution
• Greedy algorithms make the choice that looks best at the moment in order to reach an optimal solution, but they don't always yield one
Fractional Knapsack Problem
• Knapsack capacity: W
• There are n items: the i-th item has value vi and weight wi
• Goal: find fractions xi with 0 ≤ xi ≤ 1, i = 1, 2, ..., n, such that
  • Σi wi·xi ≤ W and
  • Σi xi·vi is maximized
Fractional Knapsack - Example
• E.g.: W = 50
  • Item 1: weight 10, value $60 ($6/pound)
  • Item 2: weight 20, value $100 ($5/pound)
  • Item 3: weight 30, value $120 ($4/pound)
• Optimal solution: take all of item 1 ($60), all of item 2 ($100), and 20 of item 3's 30 pounds ($80), for a total value of $240
Fractional Knapsack Problem
• Greedy strategy 1: pick the item with the maximum value
• E.g.: W = 1, w1 = 100, v1 = 2, w2 = 1, v2 = 1
• Taking from the item with the maximum value:
  total value taken = v1/w1 = 2/100
• This is smaller than what the thief can take by choosing the other item:
  total value (choosing item 2) = v2/w2 = 1
Fractional Knapsack Problem
• Greedy strategy 2: pick the item with the maximum value per pound vi/wi
• If the supply of that item is exhausted and the thief can carry more, take as much as possible from the item with the next greatest value per pound
• It is good to order items by their value per pound:
v1/w1 ≥ v2/w2 ≥ ... ≥ vn/wn
Fractional Knapsack Problem
Alg.: Fractional-Knapsack(W, v[n], w[n])
1. while w > 0 and there are items remaining
2.     pick item i with maximum vi/wi
3.     xi ← min(1, w/wi)
4.     remove item i from the list
5.     w ← w − xi·wi
• w is the amount of space remaining in the knapsack (initially w = W)
• Running time: Θ(n) if the items are already ordered by value per pound; Θ(n lg n) otherwise
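A minimal Python sketch of this greedy procedure (the function name and the (value, weight) item representation are my own, not from the slides):

    def fractional_knapsack(capacity, items):
        """items: list of (value, weight) pairs; returns the max total value."""
        # Greedy by value per pound: sort once (Θ(n lg n)), then fill.
        items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
        total, remaining = 0.0, capacity
        for value, weight in items:
            if remaining <= 0:
                break
            fraction = min(1.0, remaining / weight)   # x_i = min(1, w/w_i)
            total += fraction * value
            remaining -= fraction * weight
        return total

    # The example above: W = 50, items (60, 10), (100, 20), (120, 30)
    print(fractional_knapsack(50, [(60, 10), (100, 20), (120, 30)]))  # 240.0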
Huffman Code Problem
• Huffman's algorithm achieves data compression by finding the best variable-length binary encoding scheme for the symbols that occur in the file to be compressed.
Huffman Code Problem
• The more frequently a symbol occurs, the shorter the Huffman codeword representing it should be.
• The Huffman code is a prefix-free code: no codeword is a prefix of another codeword.
Overview
• Huffman codes: compressing data (savings of 20% to 90%)
• Huffman's greedy algorithm uses a table of the frequencies of occurrence of each character to build up an optimal way of representing each character as a binary string
• Notation: C denotes the alphabet of characters
Example
• Assume we are given a data file that contains only 6 symbols, namely a, b, c, d, e, f, with a frequency table (shown as a figure in the original slides).
• Find a variable-length prefix-free encoding scheme that compresses this data file as much as possible.
Huffman Code Problem
• In the slide's figure, the left tree represents a fixed-length encoding scheme and the right tree represents a Huffman encoding scheme.
Constructing a Huffman Code
• C is a set of n characters; the min-priority queue Q is implemented as a binary min-heap
• Building the heap: O(n); each heap operation inside the loop: O(lg n)
• Total computation time = O(n lg n)
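The running-time discussion below refers to specific line numbers of the pseudocode, which appears only as a figure in the original slides; what follows is a reconstruction of the standard CLRS HUFFMAN procedure that the figure shows, so treat it as a sketch:

Alg.: Huffman(C)
1. n ← |C|
2. Q ← C                             // build a min-heap keyed on frequency: O(n)
3. for i ← 1 to n − 1
4.     allocate a new node z
5.     left[z] ← x ← Extract-Min(Q)  // O(lg n)
6.     right[z] ← y ← Extract-Min(Q) // O(lg n)
7.     f[z] ← f[x] + f[y]
8.     Insert(Q, z)                  // O(lg n)
9. return Extract-Min(Q)             // the root of the Huffman tree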
Cost of a Tree T
• For each character c in the alphabet C
  • let f(c) be the frequency of c in the file
  • let dT(c) be the depth of c in the tree
  • dT(c) is also the length of c's codeword. Why? Each edge on the path from the root down to c contributes one bit (0 or 1) to the codeword.
• Let B(T) be the number of bits required to encode the file (called the cost of T):
  B(T) = Σc∈C f(c)·dT(c)
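As a tiny numeric sketch of this formula (the frequencies here are assumed for illustration, since the slide's actual table is a figure):

    # Assumed frequencies and depths for three characters (illustrative only).
    f = {"a": 45, "b": 13, "e": 9}   # hypothetical character counts
    d = {"a": 1,  "b": 3,  "e": 4}   # codeword lengths, e.g. a=0, b=101, e=1101
    B = sum(f[c] * d[c] for c in f)  # B(T) = sum of f(c) * dT(c)
    print(B)                         # 45*1 + 13*3 + 9*4 = 120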
Huffman Code Problem
In the pseudocode above:
• we assume that C is a set of n characters and that each character c ∈ C is an object with a defined frequency f[c].
• The algorithm builds the tree T corresponding to the optimal code.
• A min-priority queue Q is used to identify the two least-frequent objects to merge together.
• The result of merging two objects is a new object whose frequency is the sum of the frequencies of the two objects that were merged.
Running time of Huffman's algorithm
• The analysis of Huffman's running time assumes that Q is implemented as a binary min-heap.
• For a set C of n characters, the initialization of Q in line 2 can be performed in O(n) time using the BUILD-MIN-HEAP procedure.
• The for loop in lines 3-8 is executed exactly n − 1 times, and since each heap operation requires O(lg n) time, the loop contributes O(n lg n) to the running time. Thus, the total running time of HUFFMAN on a set of n characters is O(n lg n).
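For a runnable counterpart, here is a short Python sketch using heapq as the min-priority queue Q (the function name and the tree representation are mine, not from the slides):

    import heapq
    from collections import Counter

    def huffman_codes(text):
        """Build Huffman codewords for the characters of text (a sketch)."""
        freq = Counter(text)
        # Heap entries: (frequency, tie_breaker, tree). A tree is either a
        # character (leaf) or a (left, right) pair (internal node).
        heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
        heapq.heapify(heap)                    # O(n), like BUILD-MIN-HEAP
        count = len(heap)
        while len(heap) > 1:                   # n - 1 merges
            f1, _, left = heapq.heappop(heap)  # the two least-frequent trees;
            f2, _, right = heapq.heappop(heap) # each pop costs O(lg n)
            heapq.heappush(heap, (f1 + f2, count, (left, right)))
            count += 1
        codes = {}
        def walk(tree, prefix):                # left edge = '0', right edge = '1'
            if isinstance(tree, str):
                codes[tree] = prefix or "0"    # handle a one-symbol alphabet
            else:
                walk(tree[0], prefix + "0")
                walk(tree[1], prefix + "1")
        walk(heap[0][2], "")
        return codes

    print(huffman_codes("aaaabbc"))  # {'c': '00', 'b': '01', 'a': '1'}

Note how the most frequent character receives the shortest codeword, exactly as the slides describe.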
Prefix Code
• Prefix(-free) code: no codeword is also a prefix of some other codeword (so decoding is unambiguous)
• An optimal data compression achievable by a character code can always be achieved with a prefix code
• Prefix codes simplify encoding (compression) and decoding
  • Encoding: abc → 0 . 101 . 100 = 0101100
  • Decoding: 001011101 = 0 . 0 . 101 . 1101 → aabe
• Use a binary tree to represent prefix codes for easy decoding
• An optimal code is always represented by a full binary tree, in which every non-leaf node has two children
• Such a tree has |C| leaves and |C| − 1 internal nodes
• Cost: B(T) = Σc∈C f(c)·dT(c), where f(c) is the frequency of c and dT(c) is the depth of c (the length of its codeword)
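A minimal decoding sketch in Python (the codewords are taken from the encoding/decoding example above; the function name is mine):

    def decode(bits, codes):
        # Greedy left-to-right scan; correct because the code is prefix-free:
        # the first codeword that matches the buffer is the only possible one.
        inverse = {v: k for k, v in codes.items()}
        out, buf = [], ""
        for b in bits:
            buf += b
            if buf in inverse:
                out.append(inverse[buf])
                buf = ""
        return "".join(out)

    codes = {"a": "0", "b": "101", "c": "100", "e": "1101"}  # from the example above
    print(decode("001011101", codes))  # -> aabe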
Huffman Code
• Reduces the size of data by 20%-90% in general
• If no characters occur more frequently than others, then there is no advantage over ASCII
• Encoding: given the characters and their frequencies, run the algorithm to generate a code, then write out the characters using that code
• Decoding: given the Huffman tree, figure out what each character is (possible because of the prefix property)
Applications of Huffman Code
• Both the .mp3 and .jpg file formats use Huffman coding at one stage of their compression
Dynamic Programming vs. Greedy Algorithms
• Dynamic programming
  • We make a choice at each step
  • The choice depends on solutions to subproblems
  • Bottom-up solution, from smaller to larger subproblems
• Greedy algorithm
  • Make the greedy choice and THEN solve the subproblem arising after the choice is made
  • The choice we make may depend on previous choices, but not on solutions to subproblems
  • Top-down solution; problems decrease in size