Posted on: 7/20/2012
Greedy Algorithms
CS 8833 Algorithms

Greedy Algorithms
Used to solve optimization problems
Always makes the choice that looks best at the moment
When a greedy algorithm leads to an optimal solution, it is because a locally optimal choice leads to a globally optimal solution
Usually simple and fast

Common Situation for Greedy Problems
a set (or a list) of candidates
the set of candidates that have already been used
a function that checks whether a particular set of candidates provides a solution to the problem
a function to check feasibility
a selection function that indicates the most promising candidate not yet used
an objective function that gives the value of a solution

function greedy(C: set): set
    S ← ∅
    while not solution(S) and C ≠ ∅ do
        x ← an element of C maximizing select(x)
        C ← C - {x}
        if feasible(S ∪ {x}) then S ← S ∪ {x}
    if solution(S) then return S
    else return "no solution"

Change-Making Problem
Problem: make change for a customer using the smallest number of coins
– candidates: a finite set of coins (1, 5, 10, 25), with at least one coin of each type
– solution: a set of coins whose total value equals the amount to be paid
– feasible set: a set of coins whose total value does not exceed the amount to be paid
– selection: the highest-valued coin available
– objective function: the number of coins used (minimize)

Activity Selection Problem
Problem: schedule a resource among several competing activities, given the start and finish times of each activity
Goal: select a maximum-size set of mutually compatible activities

Formal Definition of the Problem
Let S = {1, 2, ..., n} be the set of activities. Only one activity can be done at a time. Each activity i has a start time si and a finish time fi, with si ≤ fi. An activity takes place in the half-open interval [si, fi). Activities i and j are compatible if [si, fi) and [sj, fj) do not overlap (i.e.
si ≥ fj or sj ≥ fi). The activity-selection problem is to select a maximum-size set of mutually compatible activities.

The Greedy Algorithm
Assumes that start and finish times are stored in arrays s and f
Activities are sorted in order of increasing finish time: f1 ≤ f2 ≤ ... ≤ fn
If that is not the case, they can be sorted in O(n lg n) time

GREEDY-ACTIVITY-SELECTOR(s, f)
1  n ← length[s]
2  A ← {a1}
3  i ← 1
4  for m ← 2 to n
5      do if sm ≥ fi
6            then A ← A ∪ {am}
7                 i ← m
8  return A

Example (timeline 0 through 14):
 i   si   fi
 1    1    5
 2    4    7
 3    6   10
 4    5   11
 5    3   12
 6   10   12
 7   13   14

Analysis of the Algorithm
The activity selected for consideration is always the one with the earliest finish time
Why does this work? Intuitively, it always leaves the maximum time possible in which to schedule more activities
The greedy choice maximizes the amount of unscheduled time remaining
What are the space and time complexities?

Proving the Greedy Algorithm Finds the Optimal Solution
Theorem 16.1: Algorithm GREEDY-ACTIVITY-SELECTOR produces solutions of maximum size for the activity-selection problem.
General form of the proof:
– prove that the first greedy choice is correct
– show by induction that all subsequent greedy choices are correct

Proof: Let S = {1, 2, 3, ..., n} be the set of activities sorted by finish time, so activity 1 has the earliest finish time.
Step 1: Show that there is an optimal solution that contains activity 1.
Suppose A ⊆ S is an optimal solution, with its activities ordered by increasing finish time, and let k be the first activity in A. If k = 1, then schedule A begins with the greedy choice. If k ≠ 1, we must show that there is another optimal solution B that begins with the greedy choice, activity 1. Let B = (A - {k}) ∪ {1}. We need to show that B is still optimal and that its activities do not conflict when k is replaced with 1.
1. B has the same number of activities as A, so it is optimal.
2.
Since f1 ≤ fk, activity 1 finishes no later than activity k, so activity 1 finishes before the second activity in B begins, and there are no conflicts.

Proof continued:
Step 2: Show that the greedy choice of activity 1 leaves a smaller problem of the same form: finding an optimal solution to the activity-selection problem over those activities in S that are compatible with activity 1.
We want to show that if A is an optimal solution to the original problem S, then A' = A - {1} is an optimal solution to the activity-selection problem S' = {i ∈ S : si ≥ f1}.
Proof by contradiction: suppose we could find a solution B' to S' with more activities than A'. Then adding activity 1 to B' would give a solution to S with more activities than A. But A was assumed to be optimal, so this is impossible, and we have a contradiction.
Thus, after each greedy choice is made, we are left with an optimization problem of the same form as the original. By induction on the number of choices made, making a greedy choice at every step produces an optimal solution.

Elements of the Greedy Strategy
Sometimes a greedy strategy results in an optimal solution and sometimes it does not
There is no general way to tell whether a greedy strategy will result in an optimal solution
Two ingredients are usually necessary:
– the greedy-choice property
– optimal substructure

Greedy-Choice Property
A globally optimal solution can be arrived at by making a locally optimal (greedy) choice
Unlike dynamic programming, we solve the problem in a top-down manner
We must prove that the greedy choices result in a globally optimal solution

Optimal Substructure
As in dynamic programming, the optimal solution must contain within it optimal solutions to subproblems
Given a choice between a greedy algorithm and a dynamic programming algorithm for the same problem, which would you choose in general?
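The GREEDY-ACTIVITY-SELECTOR pseudocode above translates directly into Python. This is a minimal sketch (indices are 0-based internally, with activity numbers reported 1-based as on the slides), run on the example timeline from the slides; the input is assumed already sorted by finish time:

```python
def greedy_activity_selector(s, f):
    """Select a maximum-size set of mutually compatible activities.

    s, f: parallel lists of start and finish times, sorted by finish time.
    Returns the 1-based indices of the selected activities.
    """
    n = len(s)
    A = [1]                    # the activity with the earliest finish is always chosen
    i = 0                      # 0-based index of the most recently selected activity
    for m in range(1, n):
        if s[m] >= f[i]:       # activity m starts after the last selected one finishes
            A.append(m + 1)
            i = m
    return A

# Example from the slides (activities 1..7)
s = [1, 4, 6, 5, 3, 10, 13]
f = [5, 7, 10, 11, 12, 12, 14]
print(greedy_activity_selector(s, f))   # → [1, 3, 6, 7]
```

The loop scans each activity once, so the selection itself runs in O(n) time and O(1) extra space beyond the output, matching the complexity question posed above.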
Greedy versus Dynamic Programming
Both greedy algorithms and dynamic programming exploit the optimal substructure property
Optimal substructure: a problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems
The knapsack problem illustrates the differences

Two Knapsack Problems
0-1 knapsack problem:
– A thief robbing a store finds n items
– Item i is worth vi dollars and weighs wi pounds (both vi and wi are integers)
– The thief can carry at most W pounds in the knapsack
– Goal: determine the set of items to take that results in the most valuable load
Fractional knapsack problem:
– same setup
– the thief may take fractions of items

Optimal Substructure Property of the Two Knapsack Problems
0-1 knapsack:
– Consider the most valuable load weighing at most W
– If item j is removed from the load, the remaining load is the most valuable load weighing at most W - wj that can be taken from the n - 1 original items excluding item j
Fractional knapsack:
– Consider the most valuable load weighing at most W
– If weight w of item j is removed, the remaining load is the most valuable load weighing at most W - w that the thief can take from the other n - 1 original items plus the wj - w pounds of item j that remain

The Fractional Knapsack Can Be Solved Greedily
What is the greedy selection criterion? (Take items in order of decreasing value per pound.)
What is the running time?

[Figure: three items worth $60, $100, and $120 and a knapsack of capacity W = 50 lb; the greedy fractional choice achieves the optimal load]

Greedy Does Not Work for the 0-1 Knapsack
[Figure: with the same items, the possible full loads are worth $220 (the optimum), $160, and $180; the greedy choice fails to find the $220 optimum]

Other Possible Greedy Strategies
Pick the heaviest item first?
Pick the lightest item first?
Neither works in general; dynamic programming is needed: for each item, consider an optimal solution that does and one that does not include the item.

0-1 Knapsack Solution
The dynamic programming solution to this problem is similar to that of the LCS problem.
At each step, consider including or not including each item in the solution
Let xi be 0 if item i is not included and 1 if it is included
Our goal is to maximize the value of the pack while keeping the weight ≤ W

0-1 Knapsack Recurrence
Maximize Σ (i = 1 to n) vi xi
Subject to Σ (i = 1 to n) wi xi ≤ W
Consider the subsequence from x1 to xj. The best solution to this subproblem is the maximum of
– the best solution that contains xj
– the best solution that does not contain xj

Let KNAP(l, j, W) represent the problem
Maximize Σ (l ≤ i ≤ j) vi xi
Subject to Σ (l ≤ i ≤ j) wi xi ≤ W
The 0-1 knapsack problem is then KNAP(1, n, W).

Huffman Codes
An effective technique for data compression; savings of 20-90% are typical
Uses a table of the frequencies of occurrence of the characters to build up an optimal way of representing each character as a binary string

Examples of Different Binary Encodings
Fixed-length code:
– Each character in the file is represented by a distinct fixed-length code
– The length of the encoded file depends only on the number of characters in the file
– Example:
» a 6-character alphabet (3-bit code): a 25,000-character file takes 75,000 bits

Binary Encodings, Continued
If shorter binary strings are used for more frequent characters, a shorter encoding results:

                          A    B    C    D    E    F
Frequency (in thousands)  5    2    3    4    10   1
Fixed-length codeword     000  001  010  011  100  101
Variable-length codeword  111  1001 101  110  0    1000

The variable-length encoding uses 58,000 bits.

Encoding and Decoding
Encoding: substitute the code for each character
Decoding:
– Fixed length: take the next fixed-size group of bits and look up the character corresponding to that code
– Variable length: must be able to determine where one code ends and the next begins

Prefix Codes
No codeword is a prefix of any other codeword.
Example prefix code: 0, 101, 100
Prefix constraint:
– No prefix of the encoding of one character can be equal to the complete encoding of another character
Decoding is never ambiguous:
– identify the first character
– remove it from the file and repeat

Problem
Given a text (a sequence of characters), find an encoding of the characters that satisfies the prefix constraint and minimizes the number of bits needed to encode the text.

Binary Tree Representation of a Prefix Code
Each leaf represents a character
An edge to a left child represents a 0 and an edge to a right child represents a 1
The path from the root to a leaf gives the encoding of that leaf's character

[Figure: code tree with leaves E, C, D, A, F, B]

Characteristics of the Binary Tree
It is not a binary search tree
The optimal code for a file is always represented by a full binary tree, in which every non-leaf node has two children
If C is the alphabet, then the tree for an optimal prefix code has
– |C| leaves
– |C| - 1 internal nodes

Cost of a Tree T
For each character c in the alphabet C:
– let f(c) be the frequency of c in the file
– let dT(c) be the depth of c's leaf in the tree
» dT(c) is also the length of c's codeword. Why?
Let B(T) be the number of bits required to encode the file (called the cost of T):
B(T) = Σ (c ∈ C) f(c) dT(c)

Constructing a Huffman Code
The greedy algorithm for constructing an optimal prefix code was invented by Huffman
Codes constructed using the algorithm are called Huffman codes
It is a bottom-up algorithm:
– start with a set of |C| leaves
– perform a sequence of |C| - 1 merging operations

HUFFMAN(C)
1  n ← |C|
2  Q ← C            ▷ characters are kept in a priority queue keyed on frequency
3  for i ← 1 to n - 1
4      do z ← ALLOCATE-NODE()
5         x ← left[z] ← EXTRACT-MIN(Q)
6         y ← right[z] ← EXTRACT-MIN(Q)
7         f[z] ← f[x] + f[y]
8         INSERT(Q, z)
9  return EXTRACT-MIN(Q)

Example frequencies: F:1  B:2  C:3  D:4  A:5  E:10

Running Time of Huffman's Algorithm
Assume Q is implemented as a binary heap
Assume n characters in the alphabet
Building the heap takes O(n) time; each of the n - 1 loop iterations performs heap operations costing O(lg n), so the total running time is O(n lg n)
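HUFFMAN(C) can be sketched in Python using the standard-library heapq module as the priority queue Q, run on the example frequencies above. The tuple layout and the tie-breaking counter are implementation choices, not part of the pseudocode; the reported cost B(T) matches the 58,000-bit figure from the encoding table (frequencies counted in thousands):

```python
import heapq

def huffman(freq):
    """Build a Huffman code for {char: frequency}; return {char: codeword}."""
    # Priority queue of (frequency, tiebreak, tree); a tree is a char or a pair.
    heap = [(f, i, c) for i, (c, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:                      # |C| - 1 merging operations
        fx, _, x = heapq.heappop(heap)        # x <- EXTRACT-MIN(Q)
        fy, _, y = heapq.heappop(heap)        # y <- EXTRACT-MIN(Q)
        heapq.heappush(heap, (fx + fy, count, (x, y)))   # f[z] = f[x] + f[y]
        count += 1
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):           # internal node: 0 left, 1 right
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                                 # leaf: record the codeword
            code[tree] = prefix
    walk(heap[0][2], "")
    return code

freq = {"A": 5, "B": 2, "C": 3, "D": 4, "E": 10, "F": 1}   # in thousands
code = huffman(freq)
cost = sum(freq[c] * len(code[c]) for c in freq)   # B(T), thousands of bits
print(cost)    # → 58
```

Each heappop and heappush costs O(lg n), and the loop runs n - 1 times, matching the O(n lg n) bound stated above.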