# Approximation_Algorithms by huanghengdong


```
Approximation Algorithms

By: Ryan Kupfer,
Luis Colón,
Joe Parisi

CMSC 435 Algorithm Analysis and Design
What is an Approximation Algorithm?
   Approximation Algorithms:
   Run in polynomial time
   Find solutions that are guaranteed to be close to optimal
Different Problems Where Approximation Algorithms Are Used
   11.1 Greedy Algorithms & Bounds on the Optimum: A Load Balancing Problem
   11.2 The Center Selection Problem
   11.3 Set Cover: A General Greedy Heuristic
   11.4 The Pricing Method: Vertex Cover
   11.5 Maximization via the Pricing Method: The Disjoint Paths Problem
   Traveling Salesman Problem
Load Balancing Problem - The Problem
•   Problem: Balance the load across a set of servers (machines) so that the work is split evenly among them
•   Let the load on a machine Mi be the total processing time of the jobs assigned to it; the goal is to minimize a quantity known as the makespan (the maximum load on any machine)
•   Take a greedy approach to this problem: the algorithm makes one pass through the jobs in any order and puts each job on the machine with the smallest load so far
Algorithms
•   We want to show that the makespan of our algorithm is not much bigger than the makespan of the optimal solution; however, we cannot compare the two directly, because we cannot compute the optimal value in a reasonable amount of time
•   Therefore, we need a lower bound on the optimum: a quantity with the guarantee that, no matter how good the optimum is, it cannot be less than this bound. Comparing our makespan against this bound gives the approximation guarantee
How can we improve this algorithm?

•   Goal: guarantee a makespan within a factor strictly smaller than 2 of the optimum
11.2 The Center Selection Problem
•   Similar to load balancing, but the question here is where to place the servers (centers); to keep the formulation clean and simple, load balancing itself is left out of the problem
•   The Center Selection Problem provides an example in which a natural greedy algorithm does not give a good solution, while a slightly different greedy algorithm can guarantee a near-optimal solution
Designing and Analyzing the Algorithm
•   The natural greedy algorithm can fail. Take two sites s and z at distance 2 from each other and two centers to place: the greedy algorithm puts the first center halfway between the sites, so the covering radius stays at 1 wherever the second center goes, while the optimal solution places a center at each site and the radius around each is zero
•   Knowing the optimal radius helps.
11.3 Set Cover: A General Greedy Heuristic
•   Set Cover is a very general problem: a number of important algorithmic problems arise as special cases of it, and approximation algorithms can be applied to it.
•   Given a set U and a collection of subsets of U, a set cover is a sub-collection whose union is equal to all of U.
Designing the Algorithm

•   The greedy algorithm for this problem builds the cover one set at a time, picking as the next set the one that makes the most progress toward the goal.
•   A good set has small weight and covers lots of elements. However, neither property alone is enough to make a good approximation algorithm; we need to combine the two and use the cost per newly covered element (the weight of the set divided by the number of uncovered elements it covers) as the guide.
Analyzing the Set Cover Algorithm
•   Our algorithm finds a solution, but we have to ask how much larger the weight of this set cover is than the weight of the optimal set cover.
•   The set cover C selected by Greedy Set Cover has weight at most H(d*) times the optimal weight, where d* is the size of the largest set and H(n) = 1 + 1/2 + ... + 1/n is the harmonic function.
•   Since H(d*) grows only logarithmically in d*, this gives the desired bound: the greedy algorithm is an approximation algorithm whose cover is within a logarithmic factor of the optimal weight.
11.4 The Pricing Method: Vertex Cover
•    We want to find a vertex cover S for which the total weight w(S) is minimum. When all weights are equal to 1, deciding if there is a vertex cover of weight at most K is the standard decision version of Vertex Cover.
•    A vertex cover for an undirected graph G = (V,E) is a subset S of its vertices such that each edge has at least one endpoint in S. In other words, for each edge ab in E, at least one of a or b must be an element of S.
•    Vertex Cover <=P Set Cover (Vertex Cover reduces to Set Cover in polynomial time) if all weights are equal to 1.
Designing the Pricing Method Algorithm
•   For the Vertex Cover Problem, we think of the weights on the nodes as costs, and we think of each edge as having to pay for its share of the cost of the vertex cover we find
•   The goal of this approximation algorithm is to find a vertex cover and set prices at the same time, using these prices to select the nodes for the vertex cover
11.5 Maximization via the Pricing Method: The Disjoint Paths Problem
•    This problem usually arises in network routing; the special case that we are dealing with is where each path to be routed has its own designated starting node, S, and ending node, T
•    Treat each pair (S, T) as a routing request which asks for a path from S to T
Solving the Disjoint Paths Problem with a Pricing Algorithm
•       For this algorithm:
•     Have the paths pay for the edges they use
•     Edges can be shared among paths; however, the more an edge is used, the more costly it becomes
•     Distinguish between short and long paths
11.6 Linear Programming and Rounding: An Application to Vertex Cover
•   Linear programming is a powerful technique that applies to many different kinds of problems
•   We can apply it to the Vertex Cover Problem
•   Linear programming can be seen as a more complex version of solving systems of linear equations, with inequalities instead of equations
Traveling Salesman Problem
•   The traveling salesman problem is an optimization problem, and approximation algorithms apply to it because the problem is NP-Hard
•   The Problem: Given a number of cities and the costs of traveling from any city to any other city, what is the least-cost round-trip route that visits each city exactly once and then returns to the starting city?
Solving the Salesman Problem
•   Since the salesman problem is NP-Hard, we approach it the same way we approach other NP-Hard problems… using an Approximation Algorithm.
•   In practice such methods can give a solution within about 2% - 3% of the optimal solution, and they can be much faster and more cost-effective than an exact algorithm.
Problem 1 Overview
•       A ship arrives with n containers of weight (w1, w2, ..., wn)
•       There is a set of trucks, each of which can hold K units of weight
•       Minimize the number of trucks needed to carry all the containers
•       The Greedy Algorithm:
•     Load containers onto the current truck in the order that they came in, and move on to the next truck if the next container does not fit
•     Repeat until there are no more containers
Problem 1 Continued
•       a) Give an example of a set of weights, and a value of K, where this algorithm does not use the minimum possible number of trucks.
•       If K is 10 and the set of containers is {6, 5, 4, 3}:
•     The first truck would be loaded with container 1 (weight 6); since the next container has weight 5, the truck would overflow, so truck 1 is sent off with container 1 only. The next truck would contain containers 2 and 3 (weights 5 and 4), and the last truck would only have container 4 (weight 3), for 3 trucks in total.
•       The optimal solution would be to load containers 1 and 3 (weights 6 and 4) into truck 1 and containers 2 and 4 (weights 5 and 3) into truck 2, using only 2 trucks.
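Problem 1 Continued
•       b) Show that the number of trucks used by this algorithm is within a factor of 2 of the minimum possible number of trucks for any set of weights and any value of K.
•       Any two consecutive trucks together carry more than K units (otherwise the first container on the second truck would have fit on the first), so if the total weight is W the algorithm uses fewer than 2W/K + 1 trucks, while any solution needs at least W/K trucks; hence it uses at most twice the minimum
•       Example: suppose that each truck can hold a maximum weight of 10 (K = 10)
•       Given the set of containers:
•     S = {6, 5, 6, 5, 6, 5, 6, 5} (a bad case for this K)
•       The greedy algorithm would require 8 trucks
•       A better algorithm which could look ahead would require only 6 trucks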
Problem 2 Overview

•   The idea of this problem is to build a representative set for a large collection of protein molecules whose properties are not completely understood. This would allow the researchers to study the smaller representative set and by inference learn about the whole collection.
Problem 2 Continued
•       a) Given a large set of proteins P and a similarity-distance function dist(p, q) (two proteins are similar when dist(p, q) <= delta), give an algorithm to determine as small a representative set R as possible.
•       Initialize a representative set R with the empty set
•       While P still has proteins
•     Select a protein p from P at random and add it to R
•     Choose a protein q from P which maximizes dist(p, q) subject to dist(p, q) <= delta
•     Remove p, q, and all proteins r where dist(p, r) <= delta from P
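Problem 2 Continued
b) The algorithm given for the Center Selection problem does not solve our protein problem: it only guarantees a covering radius within a factor of 2 of the optimum, and a set that represents every protein within 2 * delta rather than delta is not acceptable here.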
Problem 3 Overview

•   Given a set A of positive integers and a bound B, we would like to find a feasible subset S of A (one whose sum does not exceed B) with the maximum possible sum
Problem 3 Solution
•       a) Given the following algorithm:
•        S = {}
•        T = 0
•        For i = 1, 2, …, n
•      If T + ai <= B
•      S ← S ∪ {ai}
•      T ← T + ai
•      End If
•        End For
•       Give an instance in which the total sum of the set S returned by this algorithm is less than half the total sum of some other feasible subset of A
•       A = {1, 3, 10}, B = 11: the algorithm above would only return 1 and 3 (total of 4), whereas an optimal one would return 1 and 10 (total of 11)
Problem 3 Solution Continued

• The way that our algorithm works is:
• First it sorts the contents of the set in ascending order using quicksort
• It then alternates between the largest and smallest remaining elements, considering each in turn for addition to a running total that is compared to the given B
• If the new would-be total is less than or equal to B, add the chosen element to the feasible set and add its value to the total; otherwise skip it.
Problem 3 Solution Continued
• Now we have a feasible set that contains elements of the full set A, together with the sum of its values (Total). Total is the value achieved by the feasible set for that run of the algorithm.
• To check that this algorithm behaves as I claim, I ran five million examples of it on randomly generated data. For each test run, the set A had between 3 and 50 elements and each element was in the range 1-50. The value of B was chosen at random between the smallest element of A and the sum of all elements of A.
Problem 3 Solution Continued
• During the 5,000,000 test runs, there were no cases in which the algorithm returned a feasible subset of A whose sum was less than half of the largest possible feasible sum
• These runs are evidence rather than a proof, but they support the claim that our algorithm always returns a feasible set whose sum is at least one half that of any other feasible subset of A
Problem 5 Overview

•   A company has a business where clients bring in jobs each day for processing; each job has a processing time t and is run on one of ten machines.
•   The company is running the Greedy-Balance algorithm, which may not be the best approximation algorithm that could be used.
•   We must prove that the Greedy-Balance algorithm will always give a makespan of at most 20 percent above the average load.
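Problem 5 Solution

•       Show that the company’s greedy algorithm will always find a solution whose makespan is at most 20% above the average load
•     Assume the total load is at its lowest value, 3000; since there are 10 machines, the average load per machine is 300, so the makespan must be shown to be no higher than 360
•     Greedy-Balance puts each job on the machine with the smallest load, which is at most the average load, so the makespan is at most the average load plus the size of one job
•     With job sizes of at most 50 (average job size 25), this gives a makespan of at most 300 + 50 = 350, and 350 / 300 is about 117%, which is below 120%; a larger total load only makes this ratio smaller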
Problem 10 Overview
•       We are given an n x n grid graph G.
•       Associated with each node v is a weight w(v), which is a non-negative integer. The weights of all nodes are distinct.
•       The goal is to choose an independent set S of nodes of the grid so that the sum of the weights of the nodes in S is as large as possible.
•       The greedy algorithm:
•      Start with S equal to the empty set
•       While G still has nodes
•      Choose the highest weighted node, v, and add it to S
•      Delete v and its neighbors from G
•       Return S
Problem 10 Solution

•       a) Let S be the independent set returned, and let T be any other independent set in G. Show that for each node v in T, either v is in S, or there is a neighbor v’ of v in S so that the weight of v is less than or equal to the weight of v’
•     Every node of G is either chosen by the algorithm or deleted as a neighbor of a chosen node. If a node v in T is not in S, it was deleted when one of its neighbors v’ was chosen; since v’ was the highest weighted node remaining at that moment and v was still in G, the weight of v is less than or equal to the weight of v’
Problem 10 Solution

•       b) Show that the greedy algorithm given returns an S with at least 1/4th the maximum total weight of any independent set
•     By part a), every node v of T is either in S or has a neighbor v’ in S with weight at least that of v; charge the weight of v to that node of S
•     Each node in the grid has at most four neighbors, and a node of S that is itself in T has no neighbors in T, so each node of S is charged at most four times its own weight
•     Summing over the nodes, the total weight of T is at most 4 times the weight of the set S returned by the greedy algorithm

```
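The greedy pass from the load balancing slides (each job, in the given order, goes to the machine with the smallest current load) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `greedy_balance`, the heap-based bookkeeping, and the sample job list are assumptions made for the example.

```python
import heapq

def greedy_balance(jobs, m):
    """Assign each job, in the given order, to the currently least-loaded of m machines.

    Returns (makespan, assignment), where assignment[i] lists the jobs on machine i.
    """
    # Min-heap of (current load, machine index); ties go to the lower index.
    heap = [(0, i) for i in range(m)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for t in jobs:
        load, i = heapq.heappop(heap)      # machine with the smallest load
        assignment[i].append(t)
        heapq.heappush(heap, (load + t, i))
    makespan = max(load for load, _ in heap)
    return makespan, assignment

# Hypothetical instance: six jobs on three machines.
jobs = [2, 3, 4, 6, 2, 2]
makespan, assignment = greedy_balance(jobs, 3)

# Two lower bounds on the optimal makespan: the average load and the largest job.
lower_bound = max(sum(jobs) / 3, max(jobs))
print(makespan, assignment, lower_bound)
```

Comparing `makespan` with `lower_bound` is the kind of argument the slides describe: the optimum cannot be computed quickly, but it can never be smaller than either bound.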
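The truck-loading greedy rule from Problem 1 can be checked against the two instances worked out in the slides with a short Python sketch. The function name `greedy_trucks` and the error raised for an oversized container are illustrative assumptions, not part of the original slides.

```python
def greedy_trucks(weights, K):
    """Pack containers in arrival order; start a new truck when the next one does not fit.

    Returns a list of trucks, each a list of container weights.
    """
    trucks = []
    current, load = [], 0
    for w in weights:
        if w > K:
            # Assumption: a container heavier than K cannot be shipped at all.
            raise ValueError("container heavier than truck capacity")
        if load + w > K:
            # The next container would overflow: send the current truck off.
            trucks.append(current)
            current, load = [], 0
        current.append(w)
        load += w
    if current:
        trucks.append(current)
    return trucks

part_a = greedy_trucks([6, 5, 4, 3], 10)
part_b = greedy_trucks([6, 5, 6, 5, 6, 5, 6, 5], 10)
print(len(part_a), part_a)
print(len(part_b), part_b)
```

On the first instance the sketch uses three trucks where two suffice, and on the second it uses eight where six suffice, matching the counts given in the slides.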