Approximation_Algorithms by huanghengdong


									 Approximation Algorithms

           By: Ryan Kupfer,
             Luis Colón,
               Joe Parisi

CMSC 435 Algorithm Analysis and Design
         What is an Approximation Algorithm?
   Approximation Algorithms:
       Run in polynomial time
       Find solutions that are guaranteed to be close to optimal
    Different Problems where
Approximation Algorithms are used
   11.1 Greedy Algorithms & Bounds on the Optimum: A Load
    Balancing Problem
   11.2 The Center Selection Problem
   11.3 Set Cover: A General Greedy Heuristic
   11.4 The Pricing Method: Vertex Cover
   11.5 Maximization via the Pricing Method: The Disjoint
    Paths Problem
   11.6 Linear Programming and Rounding: An Application to
    Vertex Cover
   Traveling Salesman Problem
Greedy Algorithm: Load Balancing
     Problem - The Problem
•   Problem: Given a set of jobs and a set of servers (machines),
    split up the work load across the machines as evenly as possible
•   Define the load on a machine Mi as the total processing time
    of the jobs assigned to it; minimize a quantity known
    as the makespan (the maximum load on any machine)
•   Take a greedy approach to this problem: the algorithm makes
    one pass through the jobs in any order and puts each job on
    the machine with the smallest load so far
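The greedy pass described above can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the function name greedy_balance and the sample job list are assumptions made for the example.

```python
import heapq

def greedy_balance(job_times, num_machines):
    """Assign each job, in the given order, to the least-loaded machine."""
    # Min-heap of (current load, machine index).
    heap = [(0, i) for i in range(num_machines)]
    assignment = [[] for _ in range(num_machines)]
    for t in job_times:
        load, i = heapq.heappop(heap)        # machine with the smallest load
        assignment[i].append(t)
        heapq.heappush(heap, (load + t, i))
    makespan = max(load for load, _ in heap)  # maximum load on any machine
    return makespan, assignment

# Hypothetical instance: six jobs on three machines.
makespan, assignment = greedy_balance([2, 3, 4, 6, 2, 2], 3)
print(makespan)  # 8
```

On this instance the greedy makespan is 8, while a makespan of 7 is possible (for example {6}, {4, 2}, {3, 2, 2}), so the greedy answer is close to optimal but not optimal.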
     Load Balancing: Analyzing the Algorithm
•   We want to show that the makespan of our
    algorithm is not much bigger than the makespan
    of the optimal solution. However, we cannot
    compare the two directly, because we cannot
    compute the optimal value in reasonable time
    (the problem is NP-hard)
•   Therefore, we need a lower bound on the
    optimum: a quantity with the guarantee that
    no matter how good the optimum is, it cannot
    be less than this bound. Two such bounds are
    the average load per machine and the size of
    the largest job
•   Comparing the greedy makespan against these
    bounds shows it is at most 2 times the optimum
How can we improve this algorithm?

 •   Goal: guarantee a makespan within a factor strictly
     smaller than 2 of the optimum
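The slide does not name the improvement. One standard option, assumed here, is to sort the jobs in decreasing order of processing time before running the same greedy pass, which is known to bring the guarantee down to a factor of at most 3/2. The sketch below also computes the simple lower bound on the optimum (average load versus largest job). The names sorted_balance and optimum_lower_bound are made up for this example.

```python
import heapq

def sorted_balance(job_times, num_machines):
    """Greedy load balancing after sorting jobs from longest to shortest."""
    loads = [0] * num_machines           # a list of zeros is already a valid heap
    for t in sorted(job_times, reverse=True):
        smallest = heapq.heappop(loads)  # least-loaded machine
        heapq.heappush(loads, smallest + t)
    return max(loads)

def optimum_lower_bound(job_times, num_machines):
    """No schedule can beat the average load or the largest single job."""
    return max(sum(job_times) / num_machines, max(job_times))

jobs = [2, 3, 4, 6, 2, 2]  # hypothetical instance
print(sorted_balance(jobs, 3), optimum_lower_bound(jobs, 3))
```

Comparing the returned makespan with the lower bound gives a quick check on how far from optimal a particular schedule can be, without knowing the optimum itself.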
      11.2 The Center Selection Problem
•   Similar to load balancing in that work is spread
    across servers, but here the question is where to
    place the servers; to keep the formulation clean
    and simple, load balancing itself is left out
•   The Center Selection Problem provides an
    example where the most natural greedy algorithm
    can give a poor solution, while a
    slightly different greedy version can guarantee
    us a near-optimal solution
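    Designing and Analyzing the Algorithm
•   The natural greedy algorithm, which puts each
    center at the best possible location given the
    centers chosen so far, does not work. Take two
    sites s and z at distance 2 and two centers to
    place: it would put the first center halfway
    between s and z, leaving a covering radius of 1,
    while the actual optimum solution places the
    centers at s and z themselves, so the radius
    is zero
•   Knowing the optimal radius r helps: repeatedly
    pick any uncovered site as a center and discard
    all sites within distance 2r of it.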
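When the optimal radius is not known, the usual greedy variant picks each new center as the site farthest from the centers chosen so far, which guarantees a covering radius of at most twice the optimum. The sketch below assumes sites are points in the plane with Euclidean distance; the function name greedy_centers is hypothetical.

```python
import math

def greedy_centers(sites, k):
    """Farthest-first selection: start anywhere, then keep adding the site
    that is farthest from all centers chosen so far."""
    centers = [sites[0]]
    # Distance from each site to its nearest chosen center.
    nearest = [math.dist(s, centers[0]) for s in sites]
    while len(centers) < min(k, len(sites)):
        idx = max(range(len(sites)), key=lambda i: nearest[i])
        centers.append(sites[idx])
        nearest = [min(d, math.dist(s, sites[idx]))
                   for d, s in zip(nearest, sites)]
    return centers, max(nearest)  # chosen centers and covering radius

# The two-site example: sites at distance 2, two centers.
centers, radius = greedy_centers([(0, 0), (2, 0)], 2)
print(centers, radius)  # [(0, 0), (2, 0)] 0.0
```

On the two-site example this variant puts one center on each site and reaches radius zero, unlike the halfway placement.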
11.3 Set Cover: A General Greedy Heuristic
•   Set Cover is a general problem: many other
    problems arise as special cases of it, so an
    approximation algorithm for Set Cover applies
    to all of them.
•   Given a set U and a list of subsets of U, a set
    cover is a collection of those subsets whose
    union is equal to all of U. Each subset has a
    weight, and we want a set cover of minimum
    total weight.
       Designing the Algorithm

•   The Greedy Algorithm for this problem builds
    the set cover one set at a time, picking as the
    next set the one that makes the most progress
    toward the goal
•   A good set has small weight and covers lots of
    elements. However, neither property alone is
    enough to make a good approximation
    algorithm; we need to combine the two and
    look at the cost per newly covered element
    (the weight of the set divided by the number
    of still-uncovered elements it covers), which
    is a good guide.
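A minimal sketch of this cost-per-element rule, assuming the subsets are given as Python sets with one weight each. The function name greedy_set_cover and the sample instance are illustrative only.

```python
def greedy_set_cover(universe, subsets, weights):
    """Repeatedly pick the subset with the smallest weight per newly
    covered element until everything in the universe is covered."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = min(
            (i for i in range(len(subsets)) if subsets[i] & uncovered),
            key=lambda i: weights[i] / len(subsets[i] & uncovered),
            default=None,
        )
        if best is None:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

# Hypothetical instance with unit weights.
subsets = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
print(greedy_set_cover(range(1, 7), subsets, [1, 1, 1, 1]))  # [0, 3]
```

The first pick covers four elements at cost 1/4 each; after that, only the last subset still covers two uncovered elements, so it wins the second round.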
       Analyzing the Set Cover Algorithm
•   Our algorithm always finds a solution, but we
    have to ask how much larger the weight of this
    set cover is than the weight of the optimal one
•   The set cover C selected by Greedy Set
    Cover has weight at most H(d*) times the
    optimal weight, where d* is the size of the
    largest set and H(n) = 1 + 1/2 + ... + 1/n is
    the harmonic function.
•   Since H(n) grows like ln n, this gives the
    desired bound: the greedy algorithm is within
    a logarithmic factor of the optimal weight.
    11.4 The Pricing Method: Vertex Cover
•    Each vertex i has a weight w(i). We want to find
     a vertex cover S for which the total weight
     w(S) is minimum. When all weights are equal
     to 1, deciding if there is a vertex cover of
     weight at most K is the standard decision
     version of Vertex Cover.
•    A vertex cover for an undirected graph G =
     (V,E) is a subset S of its vertices such that
     each edge has at least one endpoint in S. In
     other words, for each edge ab in E, one of a or
     b must be an element of S.
•    Vertex Cover <=p Set Cover (Vertex Cover reduces
     to Set Cover in polynomial time) if all weights
     are equal to 1.
      Designing the Pricing Method
•   For the case of the Vertex Cover Problem, we
    think of the weights on the nodes as costs,
    and we think of each edge as having to
    pay for its share of the cost of the vertex cover
    we find
•   The goal of this approximation algorithm is to
    find a vertex cover and set prices on the edges
    at the same time, and to use these prices to
    select the nodes for the vertex cover: a node is
    selected once the edges touching it have paid
    its full weight
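One way to sketch the pricing idea: go through the edges, and whenever neither endpoint is fully paid for, raise the price of that edge until one endpoint is. The fully paid nodes form the cover, and its weight is at most twice the sum of the prices. The function name pricing_vertex_cover and the sample graph are assumptions for illustration; integer weights are assumed so that the equality test for "fully paid" is exact.

```python
def pricing_vertex_cover(weights, edges):
    """weights: dict node -> non-negative integer weight.
    edges: list of (u, v) pairs. Returns (cover, price per edge)."""
    paid = {v: 0 for v in weights}  # total price paid toward each node
    price = {}
    for u, v in edges:
        slack_u = weights[u] - paid[u]
        slack_v = weights[v] - paid[v]
        # If an endpoint is already fully paid, this edge pays nothing;
        # otherwise raise its price until one endpoint becomes fully paid.
        raise_by = 0 if slack_u == 0 or slack_v == 0 else min(slack_u, slack_v)
        price[(u, v)] = raise_by
        paid[u] += raise_by
        paid[v] += raise_by
    cover = {v for v in weights if paid[v] == weights[v]}
    return cover, price

# Hypothetical weighted graph.
w = {"a": 4, "b": 3, "c": 5, "d": 3}
e = [("a", "b"), ("a", "c"), ("c", "d")]
cover, price = pricing_vertex_cover(w, e)
print(sorted(cover), sum(price.values()))  # ['a', 'b', 'd'] 7
```

Here the cover has weight 10 and the prices sum to 7, so the cover is within the factor of 2 promised by the prices; the cheapest cover on this graph is {a, d} with weight 7.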
    11.5 Maximization via the Pricing
       Method: The Disjoint Paths Problem
•    This problem usually arises in network
     routing. The special case that we are dealing
     with is where each path to be routed has its
     own designated starting node, S, and ending
     node, T
•    Treat (S, T) as a routing request which asks
     for a path from S to T; the goal is to route
     as many requests as possible
Solving the Disjoint Paths Problem
     with a Pricing Algorithm
•       For this algorithm:
    •     Have the paths pay for the edges they use
    •     Edges can be shared among paths; however, the
          more an edge is used, the more costly it becomes
    •     Distinguish between short and long paths
    11.6 Linear Programming and
     Rounding: An Application to
            Vertex Cover
•   Linear programming is a technique that can be
    very powerful when applied to different sets of
    problems
•   We can apply it to the Vertex Cover Problem
•   Linear programming can be seen as a more
    complex version of solving systems of linear
    equations, just with inequalities instead of
    equations
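As a sketch of how Vertex Cover fits this framework, the problem can be written as a linear program with one variable per vertex. The notation is chosen here for illustration: x_i is the variable for vertex i, w_i its weight, and x^* an optimal solution of the linear program.

```latex
\begin{aligned}
\text{minimize} \quad & \sum_{i \in V} w_i x_i \\
\text{subject to} \quad & x_i + x_j \ge 1 && \text{for each edge } (i, j) \in E, \\
& 0 \le x_i \le 1 && \text{for each } i \in V. \\[4pt]
\text{Rounding:} \quad & S = \{\, i \in V : x_i^* \ge \tfrac{1}{2} \,\}, \qquad
w(S) \le 2 \sum_{i \in V} w_i x_i^* \le 2\, w(S_{\mathrm{opt}}).
\end{aligned}
```

Every edge constraint forces at least one endpoint to have value 1/2 or more, so the rounded set S is a vertex cover, and rounding at most doubles each value, which gives the factor of 2.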
Traveling Salesman Problem
      •   The traveling salesman problem is an
          optimization problem, and approximation
          algorithms apply to it because the
          problem is NP-hard
      •   The Problem: Given a number of cities and the
          costs of traveling from any city to any other city,
          what is the least-cost round-trip route that visits
          each city at least once and then returns to the
          starting city?
    Solving the Salesman Problem
•   Since the salesman problem is NP-hard, no efficient
    exact algorithm is known for it, so we handle it the way
    we handle other NP-hard problems: using an
    Approximation Algorithm.
•   In practice this can give us a solution within about
    2% - 3% of the optimal solution, which can be faster and
    more cost-effective than an exact solution algorithm.
        Problem 1 Overview
•       A ship arrives with n containers of
        weights (w1, w2, ..., wn)
•       There is a set of trucks, each of which
        can hold K units of weight
•       Minimize the number of trucks needed to carry
        all the containers
•       The Greedy Algorithm:
    •     Start with an empty truck and pile the containers
          in the order that they came in and move on to the
          next truck if the next container does not fit
    •     Repeat until there are no more containers
              Problem 1 Continued
•       a) Give an example of a set of weights and a
        value of K for which this algorithm does not
        use the minimum number of trucks.
•       If K is 10 and the set of containers is: {6, 5, 4, 3}
    •     The first truck would be loaded with container 1,
          weight 6. Since the next container is of weight
          5, it would overflow, so truck 1 is sent off with only
          container 1. The next truck would contain
          containers 2 and 3 (weights 5 and 4), and the last
          truck would only have container 4, weight 3.
•       The optimal solution to this would be to load
        containers 1 and 3 (weights 6 and 4) into truck
        1 and containers 2 and 4 (weights 5 and 3)
        into truck 2.
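The greedy loading rule can be sketched in a few lines to reproduce the example. The function name greedy_trucks is hypothetical, and the sketch assumes no single container is heavier than K.

```python
def greedy_trucks(weights, capacity):
    """Load containers in arrival order; send the current truck off as soon
    as the next container does not fit."""
    trucks = []
    current, load = [], 0
    for w in weights:
        if w > capacity:
            raise ValueError("container heavier than truck capacity")
        if load + w > capacity:  # next container does not fit
            trucks.append(current)
            current, load = [], 0
        current.append(w)
        load += w
    if current:
        trucks.append(current)
    return trucks

print(greedy_trucks([6, 5, 4, 3], 10))  # [[6], [5, 4], [3]]
```

The output shows the three trucks used by the greedy rule, against the two trucks of the optimal loading {6, 4} and {5, 3}.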
                Problem 1 Continued
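•       b) Show that the number of trucks used by this
        algorithm is within a factor of 2 of the minimum
        possible number of trucks for any set of
        weights and any value of K.
    •       Any two consecutive trucks together carry more
            than K units: otherwise the first container on the
            later truck would have fit on the earlier one. So if
            the algorithm uses m trucks, the total weight is more
            than floor(m / 2) * K, and any solution needs more
            than floor(m / 2) trucks, that is, at least m / 2
    •       Example: suppose that each truck could hold a
            maximum weight of 10 (K = 10)
    •       Given the set of containers:
        •     S = {6, 5, 6, 5, 6, 5, 6, 5} (a bad case for the greedy algorithm)
    •       The greedy algorithm would require 8 trucks
    •       A better algorithm which could look ahead would
            require only 6 trucks (pairing up the 5s), and 8 is
            within a factor of 2 of 6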
          Problem 2 Overview

•   The idea of this problem is to build a
    representative set for a large collection of
    protein molecules whose properties are not
    completely understood. This would allow the
    researchers to study the smaller
    representative set and by inference learn
    about the whole set
                Problem 2 Continued
•       a) Given a large set of proteins P and a
        similarity-distance function dist(p, q), where
        two proteins count as similar when
        dist(p, q) <= delta, give an algorithm to find
        a small representative set R: every protein in P
        must be similar to some protein in R.
    •       Initialize a representative set R with the empty set
    •       Start with a set P of proteins
    •       While P still has proteins
        •     Select a protein p from P at random
        •     Among the proteins q in P with dist(p, q) <= delta,
              choose one which maximizes dist(p, q)
        •     Add q to R
        •     Remove q and all proteins within delta of q
              (this includes p) from P
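The loop above can be sketched as follows. This reads the last step as removing everything within delta of the chosen representative q, which is what makes R a valid representative set. Proteins are stood in for by plain numbers with absolute difference as the distance. That stand-in, the function name representative_set, and the seed parameter are all assumptions for illustration.

```python
import random

def representative_set(proteins, dist, delta, seed=0):
    """Build a set R such that every protein is within delta of some member."""
    rng = random.Random(seed)
    remaining = list(proteins)
    reps = []
    while remaining:
        p = rng.choice(remaining)
        # Among proteins within delta of p (p itself qualifies, since
        # dist(p, p) = 0), take the one farthest from p.
        q = max((x for x in remaining if dist(p, x) <= delta),
                key=lambda x: dist(p, x))
        reps.append(q)
        # Everything within delta of q is now represented by q.
        remaining = [x for x in remaining if dist(q, x) > delta]
    return reps

points = [0, 1, 2, 10, 11, 30]  # hypothetical "proteins"
reps = representative_set(points, lambda a, b: abs(a - b), delta=2)
print(reps)
```

The result depends on the random choices, so different seeds can give representative sets of different sizes; the sketch only guarantees that the set is valid, not that it is the smallest one.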
       Problem 2 Continued
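b) The algorithm listed for the Center Selection
problem does not solve our protein problem. It fixes
the number of centers and only guarantees a radius
within a factor of 2 of the optimum. Here delta is
fixed and the size of R is what we minimize, so
covering every protein within 2 * delta would not
give a valid representative set.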
          Problem 3 Overview

•   Given a set A = {a1, ..., an} of numbers and a bound B,
    we would like to find a feasible subset S of A (the
    sum of subset S does not exceed the given bound B)
    whose sum is as large as possible
                    Problem 3 Solution
•       a) Given the following algorithm:
        •        S = {}
        •        T = 0
        •        For i = 1, 2, …, n
             •      If T + ai <= B
                  •      S ← S ∪ {ai}
                  •      T ← T + ai
             •      End If
        •        End For
    •       Give an instance in which the total sum of the set
            S returned by this algorithm is less than half the
            total sum of some other feasible subset of A
    •       A = {1, 3, 10}, B = 11, the algorithm above would
            only return 1 and 3 (total of 4) where an optimal
            one would return 1 and 10 (total of 11)
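The algorithm from part a) translates directly into Python, which makes the counterexample easy to check. The function name greedy_subset is chosen here for illustration.

```python
def greedy_subset(a, b):
    """Scan the elements in the given order and keep each one that still fits."""
    s, total = [], 0
    for x in a:
        if total + x <= b:
            s.append(x)
            total += x
    return s, total

print(greedy_subset([1, 3, 10], 11))  # ([1, 3], 4)
```

The returned total of 4 is less than half of 11, the sum of the feasible subset {1, 10}.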
  Problem 3 Solution Continued

• The way that our algorithm works is:
  • First it sorts the contents of the set in ascending
    order using quicksort
  • It then alternates between the largest and smallest
    elements adding them to a total value to compare to
    the given B
  • If the new would-be total value is less than or equal
    to B then add the chosen element to the feasible set
    and add that chosen element to the total value.
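A possible reading of these steps is sketched below. The slides do not say which end the alternation starts from, so starting with the largest element is an assumption, and Python's built-in sort stands in for quicksort. The function name alternating_subset is hypothetical. The sketch illustrates the steps only and makes no claim about approximation quality.

```python
def alternating_subset(a, b):
    """Sort ascending, then alternate between the largest and smallest
    remaining elements, keeping each one that still fits under b."""
    items = sorted(a)
    lo, hi = 0, len(items) - 1
    chosen, total = [], 0
    take_largest = True  # assumption: start from the largest element
    while lo <= hi:
        if take_largest:
            x = items[hi]
            hi -= 1
        else:
            x = items[lo]
            lo += 1
        take_largest = not take_largest
        if total + x <= b:  # would-be total stays within the bound
            chosen.append(x)
            total += x
    return chosen, total

print(alternating_subset([1, 3, 10], 11))  # ([10, 1], 11)
```

On the instance from part a), this reading picks 10 and then 1, reaching the full bound of 11.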
   Problem 3 Solution Continued
• Now we have a feasible set that contains
  elements of the full set A, and the sum of
  the values of the feasible set (Total). The
  value of Total is the best this algorithm
  achieves on the given input.
• To check that this algorithm behaves as claimed, I
  ran it on five million randomly
  generated examples. For each test run, the set A had
  between 3 and 50 elements and each element was in
  the range 1-50. The value for B was chosen at random
  between the smallest element and the sum of all elements.
   Problem 3 Solution Continued
• During the 5,000,000 test runs, there were no
  cases in which the algorithm returned a feasible
  subset of A whose sum was less than half of the
  best possible feasible sum
• These tests are evidence rather than a proof, but
  they support the claim that our algorithm always
  returns a result of at least one half of the sum of
  any other feasible subset, whatever algorithm
  produced it
          Problem 5 Overview

•   A company has a business where clients bring
    in jobs each day for processing; each job
    has a processing time t, and there are ten machines.
•   The company is running the Greedy-Balance
    algorithm, which may not be the best
    approximation algorithm that can be used.
•   We must prove that the Greedy-Balance
    algorithm will always give a makespan of at
    most 20 percent above the average load.
               Problem 5 Solution

•       Show that the company’s greedy algorithm will
        always find a solution whose makespan is at
        most 20% above the average load
    •     The total load is at least 3000, so with 10
          machines the average load per machine is at
          least 3000 / 10 = 300
    •     Greedy-Balance puts each job on the least-loaded
          machine, whose load is at most the average, so
          the makespan is at most the average load plus
          the largest job
    •     Assuming, as the exercise states, that no job
          takes longer than 50, the makespan is at most
          average + 50. Since 50 / 300 is about 17%,
          this is less than 120% of the average load
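The claim can also be checked numerically on one random instance. The sketch below assumes the setting of the exercise as understood here: ten machines, job times between 1 and 50, and a total load of at least 3000. The job-size range is an assumption, since the slides only mention the total of 3000. The function name greedy_makespan and the fixed seed are illustrative.

```python
import random

def greedy_makespan(job_times, num_machines):
    """Greedy-Balance: each job goes to the machine with the smallest load."""
    loads = [0] * num_machines
    for t in job_times:
        i = loads.index(min(loads))
        loads[i] += t
    return max(loads)

rng = random.Random(1)
jobs = []
while sum(jobs) < 3000:              # keep adding jobs until the load is high
    jobs.append(rng.randint(1, 50))  # assumed job-size range

average = sum(jobs) / 10
makespan = greedy_makespan(jobs, 10)
print(makespan / average)
```

The printed ratio stays below 1.2 for any instance of this kind, because the last job placed on the busiest machine adds at most 50 to a load that was at most the average.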
                 Problem 10 Overview
•       We are given an n x n grid graph G.
•       Associated with each node v is a weight w(v) which is a
        non-negative integer. All the weights of all nodes are
•       The goal is to choose an independent set S of nodes of
        the grid so that the sum of the weights of the nodes in S
        is as large as possible.
•       The greedy algorithm:
    •       Start with S = { }
    •       While G still has nodes
        •      Choose the highest weighted node, v
        •      Add v to S
        •      Delete v and its neighbors from G
    •       Return S
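A minimal sketch of this greedy procedure on a grid given as a list of rows. The function name greedy_grid_independent_set and the sample 3 x 3 grid are assumptions for illustration; the sample uses distinct weights so that the choice of the highest weighted node is never ambiguous.

```python
def greedy_grid_independent_set(weights):
    """weights: n x n list of lists of non-negative integers.
    Returns the chosen nodes as (row, column) pairs, in order of selection."""
    n = len(weights)
    alive = {(r, c) for r in range(n) for c in range(n)}
    chosen = []
    while alive:
        # Highest weighted node still in the graph.
        v = max(alive, key=lambda rc: weights[rc[0]][rc[1]])
        chosen.append(v)
        r, c = v
        # Delete v and its (up to four) grid neighbors.
        alive -= {(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)}
    return chosen

grid = [[1, 9, 2],
        [8, 3, 7],
        [4, 6, 5]]
picked = greedy_grid_independent_set(grid)
print(picked, sum(grid[r][c] for r, c in picked))
# [(0, 1), (1, 0), (1, 2), (2, 1)] 30
```

No two picked nodes share a side, so the result is an independent set; on this grid its weight is 30, compared with 15 for the other natural choice (the corners plus the center).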
               Problem 10 Solution

•       a) Let S be the independent set returned, and let T
        be any other independent set in G. Show that
        for each node v in T, either v is in S, or there
        is a node v’ in S so that the weight of v is less
        than or equal to the weight of v’
    •     If a node v of T is not in S, it must have been
          deleted from G as a neighbor of some node v’
          that the greedy algorithm chose. At that moment
          v was still in G and v’ was the highest weighted
          node, so v’ is a neighbor of v in S with a higher
          or equal weight
              Problem 10 Solution

•       b) Show that the greedy algorithm given
        returns an S with at least 1/4th the maximum total
        weight of any independent set
    •     By part a), every node v in an independent set T
          is either in S or is a neighbor of a node v’ in S
          whose weight is at least that of v; charge v to
          that node of S
    •     Each node of S has at most four neighbors in the
          grid, so it is charged by at most four nodes of
          T, each no heavier than itself
    •     Summing over T, the total weight of T is at most 4
          times the weight of the set S found by the greedy
          algorithm
