Algorithms by KbW38bg

```
Algorithms

Greedy Algorithms

CS 8833   Algorithms
Greedy Algorithms
• Used to solve optimization problems
• Always makes the choice that looks best at the moment
• When a greedy algorithm leads to an optimal solution, it is because a locally optimal choice leads to a globally optimal solution
• Usually simple and fast

Common Situation for Greedy Problems
• A set (or a list) of candidates
• The set of candidates that have already been used
• A function that checks whether a particular set of candidates provides a solution to the problem
• A function to check feasibility
• A selection function that indicates the most promising candidate not yet used
• An objective function that gives the value of the solution

function greedy(C: set): set
    S ← ∅
    while not solution(S) and C ≠ ∅ do
        x ← an element of C maximizing select(x)
        C ← C - {x}
        if feasible(S ∪ {x}) then
            S ← S ∪ {x}
    if solution(S) then return S
    else return "no solutions"

Change Making Problem
• Problem: make change for a customer using the smallest number of coins
    – candidates: finite set of coins (1, 5, 10, 25) with at least one coin of each type
    – solution: a set of coins whose total value equals the amount needed
    – feasible set: a set whose total value does not exceed the amount to be paid
    – selection: highest-valued coin available
    – objective function: number of coins used (minimize)

Activity Selection Problem
• Problem: schedule a resource among several competing activities, given the start and finish times of each activity
• Goal: select a maximum-size set of mutually compatible activities

Formal Definition of Problem
• Let S = {1, 2, ..., n} be the set of activities. Only one activity can be done at a time.
• Each activity i has a start time s_i and a finish time f_i, with s_i ≤ f_i.
• Activity i takes place in the half-open interval [s_i, f_i).
• Activities i and j are compatible if [s_i, f_i) and [s_j, f_j) do not overlap (i.e., s_i ≥ f_j or s_j ≥ f_i).
• The activity selection problem is to select a maximum-size set of mutually compatible activities.

The Greedy Algorithm
• Assumes that start and finish times are stored in arrays s and f
• Activities are sorted in order of increasing finish time:
    f_1 ≤ f_2 ≤ ... ≤ f_n
  If this is not the case, they can be sorted in O(n lg n) time

GREEDY-ACTIVITY-SELECTOR(s, f)
    n ← length[s]
    A ← {a_1}
    i ← 1
    for m ← 2 to n do
        if s_m ≥ f_i then
            A ← A ∪ {a_m}
            i ← m
    return A

i   s_i   f_i
1    1     5
2    4     7
3    6    10
4    5    11
5    3    12
6   10    12
7   13    14

[Figure: timeline of the seven activities on a time axis from 0 to 14]

Analysis of Algorithm
• The activity selected for consideration is always the one with the earliest finish time
• Why does this work? Intuitively, it always leaves the maximum time possible to schedule more activities
• The greedy choice maximizes the amount of unscheduled time remaining
• What is the space and time complexity?

Proving the Greedy Algorithm Finds the Optimal Solution
Theorem 16.1: Algorithm GREEDY-ACTIVITY-SELECTOR produces solutions of maximum size for the activity selection problem.
General form of proofs:
– prove that the first greedy choice is correct
– show by induction that all subsequent greedy choices are correct

Proof:
Let S = {1, 2, 3, ..., n} be the set of activities sorted by finish time. Activity 1 has the earliest finish time.
Step 1: Show that there is an optimal solution that contains activity 1.
Suppose A is a subset of S that is an optimal solution. Order the activities in A by increasing finish time, and suppose the first activity in A is k.
If k = 1, then schedule A begins with a greedy choice.
If k ≠ 1, then we need to show that there is another optimal solution B that begins with the greedy choice of 1.
Let B = (A - {k}) ∪ {1}.
We need to show that B is still optimal and that the activities do not conflict when we replace k with 1.
1. B has the same number of activities as A, so it is optimal.
2. Since f_1 ≤ f_k, activity 1 finishes no later than activity k does, so it ends by the time the second activity in B begins and there are no conflicts.

Proof continued:
Step 2: Show that a greedy choice of activity 1 results in a smaller problem: finding an optimal solution for the activity-selection problem over those activities in S that are compatible with activity 1.
We want to show that if A is an optimal solution to the original problem S, then A' = A - {1} is an optimal solution to the activity-selection problem S' = {i ∈ S : s_i ≥ f_1}.
Suppose that we could find a solution B' to S' with more activities than A'. Then we could add activity 1 to B' and have a solution to S with more activities than A. But since we assumed that A was optimal, this is not possible, and so A' must be optimal for S'.
Thus, after each greedy choice is made, we are left with an optimization problem of the same form as the original problem. By induction on the number of choices made, making a greedy choice at every step produces an optimal solution.

Elements of the Greedy Strategy
• Sometimes a greedy strategy results in an optimal solution and sometimes it does not.
• There is no general way to tell whether the greedy strategy will result in an optimal solution
• Two ingredients are usually necessary
    – greedy-choice property
    – optimal substructure

Greedy-Choice Property
• A globally optimal solution can be arrived at by making a locally optimal (greedy) choice.
• Unlike dynamic programming, we solve the problem in a top-down manner
• Must prove that the greedy choices result in a globally optimal solution

Optimal Substructure
• Like dynamic programming, the optimal solution must contain within it optimal solutions to sub-problems.
• Given a choice between using a greedy algorithm and a dynamic programming algorithm for the same problem, in general which would you choose?

Greedy versus Dynamic Programming
• Both greedy and dynamic programming exploit the optimal substructure property
• Optimal substructure: a problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to the sub-problems.
• The knapsack problem illustrates the differences

Two Knapsack Problems
• 0-1 knapsack problem
    – A thief robbing a store finds n items
    – Item i is worth v_i dollars and weighs w_i pounds (both v_i and w_i are integers)
    – The thief can carry at most W pounds in the knapsack
    – Goal: determine the set of items to take that will result in the most valuable load
• Fractional knapsack problem
    – same setup
    – allow the thief to take fractions of items

Optimal Substructure Property of Two KS Problems
• 0-1 Knapsack
    – Consider the optimal load weighing at most W
    – If item j is removed from the load, the resulting load is the most valuable load weighing at most W - w_j that can be taken from the n - 1 original items excluding item j
• Fractional Knapsack
    – Consider the optimal load weighing at most W
    – If we remove weight w of item j, the remaining load is the most valuable load weighing at most W - w that the thief can take from the n - 1 other original items plus w_j - w pounds of item j

Fractional KS can be Solved Using Greedy
• What is the greedy selection criterion?

• What is the running time?

Greedy Algorithm
[Figure: fractional knapsack example with three items worth \$60, \$100, and \$120 and a knapsack of capacity W = 50 lb, showing the optimal solution]

Greedy Does Not Work for 0-1 KS
[Figure: the same three items (\$60, \$100, \$120) and three candidate 0-1 loads worth \$220, \$160, and \$180]

Other Possible Greedy Strategies
• Pick the heaviest item first?

• Pick the lightest item first?

• Need dynamic programming. For each item, consider an optimal solution that does include the item and one that does not.

0-1 Knapsack Solution
• The dynamic programming solution to this problem is similar to the LCS problem. At each step, consider including or not including each item in a solution
• Let x_i be 0 if item i is not included and 1 if it is included
• Our goal is to maximize the value of the pack while keeping the weight <= W

0-1 Knapsack Recurrence

Maximize    \sum_{i=1}^{n} v_i x_i

Subject to  \sum_{i=1}^{n} w_i x_i \le W

Consider the subsequence from x_1 to x_j
The best solution to this sub-problem is the max of
    the solution that contains item j (x_j = 1)
    the solution that does not contain item j (x_j = 0)

Let KNAP(l, j, W) represent the problem

Maximize    \sum_{l \le i \le j} v_i x_i

Subject to  \sum_{l \le i \le j} w_i x_i \le W

The 0-1 Knapsack problem is then KNAP(1, n, W)

Huffman Codes
• Effective technique for data compression
• Savings of 20-90% are typical
• Uses a table of the frequencies of occurrence of characters to build up an optimal way of representing each character as a binary string

Examples of Different Binary Encodings
• Fixed-length code
    – Each character in the file is represented by a different codeword, and all codewords have the same length
    – Length of the encoded file depends only on the number of characters in the file
    – Example
        » 6-character alphabet (3-bit code): a 25,000-character file takes 75,000 bits

Binary encodings continued
• If shorter binary strings are used for more frequent characters, a shorter encoding could be used

                            A     B     C     D     E     F
Frequency (in thousands)    5     2     3     4    10     1
Fixed-length codeword      000   001   010   011   100   101
Variable-length codeword   111  1001   101   110     0  1000

• Variable-length encoding uses 58,000 bits

Encoding and Decoding
• Encoding
    – substitute the codeword for each character
• Decoding
    – Fixed length: take a fixed number of bits at a time and look up the character corresponding to that codeword
    – Variable length: must be able to determine when one codeword ends and another begins

Prefix Codes
• Each codeword has a unique prefix, for example: 0   101   100
• Prefix constraint
    – No proper prefix of the encoding of one character can be equal to the complete encoding of another character
• Decoding is never ambiguous
    – identify the first character
    – remove it from the file and repeat

Problem
• Given a text (a sequence of characters), find an encoding for the characters that satisfies the prefix constraint and that minimizes the number of bits needed to encode the text.

Binary Tree Representation of Prefix Code
• Each leaf represents a character
• An edge to a left child represents the bit 0 and an edge to a right child represents the bit 1.
• The path from the root to a leaf gives the encoding for that leaf's character

[Figure: binary tree for the variable-length code, with leaves E, C, D, A, F, and B]

Characteristics of Binary Tree
• Not a binary search tree
• The optimal code for a file is always represented by a full binary tree in which every non-leaf node has two children.
• If C is the alphabet, then a tree for the optimal prefix code has
    – |C| leaves
    – |C|-1 internal nodes

Cost of a Tree T
• For each character c in the alphabet C
    – let f(c) be the frequency of c in the file
    – let d_T(c) be the depth of c in the tree
        » It is also the length of the codeword. Why?
• Let B(T) be the number of bits required to encode the file (called the cost of T)

    B(T) = \sum_{c \in C} f(c) d_T(c)

Constructing a Huffman Code
• A greedy algorithm for constructing an optimal prefix code was invented by Huffman
• Codes constructed using the algorithm are called Huffman codes
• Bottom-up algorithm
    – performs a sequence of |C|-1 merging operations

HUFFMAN(C)
    n ← |C|
    Q ← C                     ; characters are in a priority queue
    for i ← 1 to n-1
        do z ← ALLOCATE-NODE()
           x ← left[z] ← EXTRACT-MIN(Q)
           y ← right[z] ← EXTRACT-MIN(Q)
           f[z] ← f[x] + f[y]
           INSERT(Q, z)
    return EXTRACT-MIN(Q)

Initial contents of Q (character:frequency):
F:1     B:2    C:3   D:4   A:5   E:10

Running Time of Huffman's Algorithm
• Assume Q is implemented as a binary heap
• Assume n characters in the alphabet

```
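As a hedged illustration of the GREEDY-ACTIVITY-SELECTOR slide, here is a minimal Python sketch. It assumes, as the slides do, that the activities are already sorted by finish time. The function name `greedy_activity_selector` and the 0-based list handling are choices made for this example and are not part of the original deck. The data are the seven activities from the example table.

```python
def greedy_activity_selector(s, f):
    """Return 1-based indices of a maximum-size set of compatible activities.

    Assumes s and f are parallel lists sorted by nondecreasing finish time.
    """
    n = len(s)
    if n == 0:
        return []
    selected = [1]   # activity 1 finishes first, so it is always chosen
    i = 0            # 0-based index of the most recently selected activity
    for m in range(1, n):
        # Activity m+1 is compatible if it starts no earlier than the
        # finish time of the last selected activity.
        if s[m] >= f[i]:
            selected.append(m + 1)
            i = m
    return selected


# Start and finish times from the seven-activity example in the slides.
s = [1, 4, 6, 5, 3, 10, 13]
f = [5, 7, 10, 11, 12, 12, 14]
print(greedy_activity_selector(s, f))
```

On this input the sketch selects activities 1, 3, 6, and 7. Once the activities are sorted, the single pass takes O(n) time.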
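For the fractional knapsack slides, one common greedy criterion is to take items in order of decreasing value per pound. The sketch below assumes that criterion. The item weights of 10, 20, and 30 lb are an assumption for illustration, since the weights did not survive in the figure; the values (\$60, \$100, \$120) and W = 50 lb come from the slides. The function name `fractional_knapsack` is hypothetical.

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight) pairs; returns the best total value.

    Greedy criterion (assumed here): highest value per pound first.
    """
    by_density = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    remaining = capacity
    for value, weight in by_density:
        if remaining <= 0:
            break
        take = min(weight, remaining)     # pounds of this item to take
        total += value * take / weight    # proportional share of its value
        remaining -= take
    return total


# Values from the slides; the weights 10, 20, 30 are assumed for this example.
items = [(60, 10), (100, 20), (120, 30)]
print(fractional_knapsack(items, 50))
```

Under these assumed weights the thief takes all of items 1 and 2 and 20 lb of item 3. The sort dominates the cost, so this sketch runs in O(n lg n) time.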
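The include/exclude idea from the 0-1 knapsack slides can be sketched with a small dynamic programming table. This is one standard formulation (a one-dimensional table indexed by remaining capacity), not necessarily the one the lecture used. The function name `knapsack_01` and the item weights of 10, 20, and 30 lb are assumptions for illustration; integer weights are assumed, as in the slides.

```python
def knapsack_01(values, weights, capacity):
    """Return the maximum total value of a 0-1 load weighing at most capacity.

    Assumes nonnegative integer weights and an integer capacity.
    """
    # best[w] = best value achievable with capacity w using the items seen so far
    best = [0] * (capacity + 1)
    for v, wt in zip(values, weights):
        # Go through capacities downward so each item is used at most once.
        for w in range(capacity, wt - 1, -1):
            # Either exclude the item (best[w]) or include it (best[w - wt] + v).
            best[w] = max(best[w], best[w - wt] + v)
    return best[capacity]


# Values from the slides; the weights 10, 20, 30 are assumed for this example.
print(knapsack_01([60, 100, 120], [10, 20, 30], 50))
```

With these assumed weights the result is 220, matching the best of the three loads in the figure, whereas picking by value per pound would stop at 160.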
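The HUFFMAN(C) pseudocode can be sketched in Python using the standard-library `heapq` module as the priority queue. This is a minimal sketch under stated assumptions: the function name `huffman_code_lengths`, the tuple-based tree representation, and the tie-breaking counter are choices made for this example. It returns codeword lengths (the leaf depths d_T(c)) rather than the codewords themselves, which is enough to compute the cost B(T).

```python
import heapq
from itertools import count


def huffman_code_lengths(freq):
    """Return {character: codeword length} for a Huffman code built from freq.

    Assumes freq is a non-empty dict mapping characters to frequencies.
    """
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    # Heap entries are (frequency, tie, tree); a tree is either a character
    # (leaf) or a (left, right) pair (internal node).
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    for _ in range(len(freq) - 1):            # |C| - 1 merging operations
        fx, _, x = heapq.heappop(heap)        # EXTRACT-MIN
        fy, _, y = heapq.heappop(heap)        # EXTRACT-MIN
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))  # INSERT merged node
    root = heap[0][2]

    lengths = {}

    def walk(node, depth):
        if isinstance(node, tuple):           # internal node: visit both children
            walk(node[0], depth + 1)
            walk(node[1], depth + 1)
        else:                                 # leaf: depth is the codeword length
            lengths[node] = depth

    walk(root, 0)
    return lengths


# Frequencies (in thousands) from the table in the slides.
freq = {"A": 5, "B": 2, "C": 3, "D": 4, "E": 10, "F": 1}
lengths = huffman_code_lengths(freq)
cost = sum(freq[c] * lengths[c] for c in freq)   # B(T), in thousands of bits
print(lengths, cost)
```

For the slide's frequencies the cost comes out to 58 (thousand bits), agreeing with the 58,000 bits quoted for the variable-length code. With a binary heap, each of the n - 1 merges costs O(lg n), so the construction takes O(n lg n) time.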