# Approximability of Combinatorial Optimization Problems with Submodular Cost Functions by hcj

```									Approximability of Combinatorial Optimization
Problems with Submodular Cost Functions

Pushkar Tripathi
Georgia Institute of Technology

Based on joint work with Gagan Goel, Chinmay Karande, and Wang Lei
Motivation
Network Design Problem
[Figure: network to be built, with three agents labeled f, g and h]

Objective: Find minimum spanning tree that can be
built collaboratively by these agents
Additive cost function:
    cost(a) = 1
    cost(b) = 1
    cost(a,b) = 2
Functions which capture economies of scale:
    cost(a) = 1
    cost(b) = 1
    cost(a,b) = 1.5

How to mathematically model these functions?
- We use Submodular Functions as a starting point.

Can one design efficient approximation algorithms under
Submodular Cost Functions?
Assumptions over cost functions
- Normalized: f(∅) = 0
- Monotone: f(S) ≤ f(T) for all S ⊆ T
- Decreasing marginal cost (submodularity): f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T)
Submodular Functions
General Framework
- Ground set X and a collection C ⊆ 2^X
- C: e.g. the set of all tours, or the set of all spanning trees
- k agents, each specifies f_i: 2^X → R+
- f_i is submodular and monotone, and is accessed through a value oracle (query S, receive f_i(S))
- Find S_1, …, S_k such that:
    - ∪_i S_i ∈ C
    - Σ_i f_i(S_i) is minimized
Our Results
                     Single Agent                 Multiple Agents
Problem              Upper Bound    Lower Bound   Upper Bound    Lower Bound
Vertex Cover         2              2-ϵ           2 log n        Ω(log n)
Shortest Path        O(n^(2/3))     Ω(n^(2/3))    O(n^(2/3))     Ω(n^(2/3))
Spanning Tree        n              Ω(n)          n              Ω(n)
Perfect Matching     n              Ω(n)          n              Ω(n)

Lower bounds: information theoretic
Upper bounds: rounding of configurational LPs, approximating submodular functions, and greedy
Selected Related Work
- [Grötschel, Lovász, Schrijver 81] Minimizing a non-monotone submodular function is poly-time.
- [Feige, Mirrokni, Vondrak 07] Maximizing a non-monotone function is hard. 2/5-approximation algorithm.
- [Calinescu, Chekuri, Pal, Vondrak 08] Maximizing a monotone function subject to a matroid constraint: 1-1/e approximation.

- [Svitkina, Fleischer 09] Upper and lower bounds for Submodular Load Balancing, Sparsest Cut, Balanced Cut
- [Iwata, Nagano 09] Bounds for Submodular Vertex Cover, Set Cover
- [Chekuri, Ene 10] Bounds for Submodular Multiway Partition
In this talk
- Submodular Shortest Path with a single agent
    - O(n^(2/3)) approximation algorithm
    - Matching hardness of approximation
Submodular Shortest Path
Given: graph G = (V, E) with |V| = n and |E| = m, and two nodes s and t
f: 2^E → R+, submodular and monotone

Goal: find an s-t path P such that f(P) is minimized
Attempt 1: Approximate by Additive function
- Let w_e = f({e})
- Idea: max_{e ∈ OPT} w_e ≤ f(OPT) ≤ Σ_{e ∈ OPT} w_e

1. Guess: e* = argmax {w_e | e ∈ OPT}
2. Pruning: remove edges costlier than e*
3. Search: find the s-t path with the fewest edges in the residual graph G'

ALG ≤ diameter(G') · w_e* ≤ diameter(G') · OPT
Attempt 2: Ellipsoid Approximation
John’s theorem: for every centrally symmetric convex body P in R^n, there exists an ellipsoid contained in P that can be scaled by a factor of O(√n) to contain P.

[GHIM 09]: If the convex body is a polymatroid, then there is a poly-time algorithm to compute the ellipsoid.

Polymatroid of f (f submodular, monotone):
    ∀S: Σ_{e ∈ S} x(e) ≤ f(S)
    ∀e: x(e) ≥ 0
Approximating Submodular Functions
f: monotone submodular function on a ground set X, |X| = n
In polynomial time, compute one weight d_e per element e of X (d_1, d_2, …) and set
    g(S) = √( Σ_{e ∈ S} d_e )

Then
    g(S) ≤ f(S) ≤ √n · g(S)
Attempt 2: Ellipsoid Approximation
STEP 1: [GHIM ‘09]
    f: 2^E → R+, submodular and monotone   →   weights {d_e},  g(S) := √( Σ_{e ∈ S} d_e )

STEP 2:
    Min g(S)
    s.t. S ∈ PATH(s,t)

* Minimizing g(S) is equivalent to minimizing just the additive part Σ_{e ∈ S} d_e

Analysis (P: optimum path under g, O: optimum path under f):
    f(P) ≤ √|E| · g(P) ≤ √|E| · g(O) ≤ √|E| · f(O)
Recap
- Approximating by linear functions: works for graphs with small diameter
- Approximating by ellipsoid functions: works for sparse graphs

[Figure: dense graph with large diameter; two parts labeled n/2]
Algorithm for Shortest Path
STEP 1: Pruning
- Let w_e = f({e})
- Guess edge e* = argmax {w_e | e ∈ OPT path}
- Remove edges costlier than w_e*
STEP 2: Contraction
- If ∃ v such that degree(v) > n^(1/3), contract the neighborhood of v
- Repeat (each contracted vertex stands for a dense connected component)
STEP 3: Ellipsoid Approximation
- Calculate the ellipsoidal approximation (d, g) for the residual graph
STEP 4: Search
- Find the shortest s-t path according to g
STEP 5: Reconstruction
- Replace the path through each contracted vertex with one having the fewest edges
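Analysis
[Figure: the s-t path returned by the algorithm, split into P1 (edges of the residual graph R) and P2 (edges inside contracted components)]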
Bounding the cost of P1
R has at most n^(4/3) edges, so:
    f(P1) ≤ √|E(R)| · g(P1)
          ≤ √|E(R)| · g(OPT)
          ≤ √|E(R)| · f(OPT)
          ≤ n^(2/3) · f(OPT)
Bounding the cost of P2
G_1, …, G_k: contracted components, with Diam(G_i) ≤ |G_i| / n^(1/3)
    f(P2) ≤ (diam(G_1) + … + diam(G_k)) · w_e*
          ≤ (|G_1| / n^(1/3) + … + |G_k| / n^(1/3)) · w_e*
          ≤ (n / n^(1/3)) · w_e*
          ≤ n^(2/3) · f(OPT)
In this talk
- Submodular Shortest Path with a single agent
    - O(n^(2/3)) approximation algorithm
    - Matching hardness of approximation
Information Theoretic Lower Bound
[Figure: the algorithm sends queries S1, S2, S3 to the oracle f and receives f(S1), f(S2), f(S3)]

- Polynomial number of queries to the oracle
- The algorithm is allowed an unbounded amount of time to process the results of the queries
- Not contingent on P vs NP
General Technique
- Cost functions f, g satisfying
    - OPT(f) >> OPT(g)
    - f(S) = g(S) for ‘most’ sets S

- A: any randomized algorithm

- f(Q) = g(Q) with high probability for every query Q made by A (probability over the random bits of A)
Yao’s Lemma

To show: f(Q) = g(Q) with high probability for every query Q made by a randomized algorithm A.

It suffices to give f and a distribution D from which we choose g, such that for an arbitrary query Q, f(Q) = g(Q) with high probability.
Non-combinatorial Setting
X: ground set
f(S) = min{ |S|, α }

D: R ⊆ X chosen uniformly at random, |R| = α
g_R(S) = min{ |S ∩ R^c| + min(|S ∩ R|, β), α }
Optimal Query
Claim: the optimal query has size α
Case 1: |Q| < α
    The probability of distinguishing f and g_R can only increase if we increase |Q|
Case 2: |Q| > α
    The probability of distinguishing f and g_R can only increase if we decrease |Q|

The optimal query size to distinguish f and g_R is α
Distinguishing f and g_R

By Chernoff bounds, f and g_R are hard to distinguish when β = (1+δ) E[|Q ∩ R|]
Hardness of learning submodular functions
- Set α = n^(1/2) log n
- Optimal query size = α = n^(1/2) log n
- |R| = α = n^(1/2) log n
- E[|Q ∩ R|] = α²/n = log² n  (super-logarithmic)
- β = (1+δ) E[|Q ∩ R|] = (1+δ) log² n

f and g_R are indistinguishable
f(R) = min{ |R|, α } = |R| = α = n^(1/2) log n
g_R(R) = min{ |R ∩ R^c| + min(|R ∩ R|, β), α } = β = (1+δ) log² n
Corollary: it is hard to learn a submodular function to a factor better than n^(1/2) / log n with polynomially many value queries.
Difficulty in Combinatorial Setting
- A randomly chosen set may not be a feasible solution in the combinatorial setting.

E.g., a randomly chosen set of edges rarely yields an s-t path.

Solution:
1. Do not choose R randomly from the entire domain X.
2. Use a subset of R as a proxy for the solution.
Base Graph G
[Figure: layered graph from s to t with n^(2/3) levels and n^(1/3) vertices per level]
Functions f and g
[Figure: base graph from s to t, with its edges split into two sets labeled Y and B]

Only the edges in B matter: f(S) = f(S ∩ B) and g(S) = g(S ∩ B)
f(S) = min( |S ∩ B|, α )
Functions f and g
Solution:
1. Do not choose R randomly from the entire domain X.
2. Use a subset of R as a proxy for the solution.

[Figure: base graph from s to t with edge sets Y and B; R is marked inside B]
R: uniform random subset of B of size α, |R| = n^(2/3) log² n
g_R(S) = min{ |S ∩ R^c ∩ B| + min(|S ∩ R ∩ B|, β), α }
Setting the constants
- Set α = n^(2/3) log² n
- Optimal query size = α = n^(2/3) log² n
- β = log² n

f and g_R are indistinguishable

f(OPT) = min{ |R|, α } = |R| = α = O(n^(2/3) log² n)
g_R(OPT) = min{ |R ∩ R^c| + min(|R ∩ R|, β), α } = β = log² n

Theorem: the Submodular Shortest Path problem is hard to approximate to within a factor of o(n^(2/3)).
Summary
                     Single Agent                 Multiple Agents
Problem              Upper Bound    Lower Bound   Upper Bound    Lower Bound
Vertex Cover         2              2-ϵ           2 log n        Ω(log n)
Shortest Path        O(n^(2/3))     Ω(n^(2/3))    O(n^(2/3))     Ω(n^(2/3))
Spanning Tree        n              Ω(n)          n              Ω(n)
Perfect Matching     n              Ω(n)          n              Ω(n)

n: number of vertices in graph G

What’s the right model to study economies of scale?
- Discount Models
[Figure: agents with cost functions f: E → R, g: E → R, h: E → R; a plot of payment against cost, labeled "Submodular functions"; the expression f(a) + f(b) + f(c) …]

Task: minimize the sum of payments
Approximability under Discounted Costs [GTW 09]

Problem                     Lower Bound      Upper Bound
Edge Cover                  Ω(log n)         O(log n)
Spanning Tree               Ω(log n)         O(log n)
Shortest Path               Ω(polylog n)     n
Minimum Perfect Matching    Ω(polylog n)     n
Shortest Path: Ω(log^c n) hardness
[Figure: a Set Cover instance (sets S, universe U) and the corresponding graph from s to t, with agents; the cost of every edge is 1]

Claim: set cover of size |S| ↔ shortest path of length |S|
Hardness Gap Amplification
[Figure: original instance (left) and harder instance (right), each from s to t]

- Replace each edge by a copy of the original graph.
- Edges of the same color get the same copy.
- Edges of different colors get copies with new colors (agents).
- Claim: the new instance has a solution of cost α² iff the original instance has a solution of cost α.
- For any fixed constant c, iterate this construction c times to further amplify the lower bound to Ω(log^c n).
Q.E.D.
Why is it so hard to distinguish f and g?
- Observation: f_R(S) is at most g(S) for any set S.
- Case 1: ‘Small’ size queries, |Q| ≤ n
    This probability can only increase if we increase |Q|
- Case 2: ‘Large’ size queries, |Q| ≥ n
    This probability can only increase if we decrease |Q|
Combinatorial Optimization
- C: ground set
- f: valuation function over subsets of C
- X: collection of subsets of C having a special property
- Task: find the set in X that has minimum cost under a given valuation function.
General Technique cont.
[Figure: queries S1, S2, S3 to the oracle return f(S1) = g(S1), f(S2) = g(S2), f(S3) = g(S3)]

- A cannot distinguish between f and g
- Output is at least OPT(g)
- Approximation factor α ≥ OPT(g) / OPT(f)
Plan
- Fix a cost function f
- Fix a distribution D of functions such that for every g in D
    - OPT(f) >> OPT(g)
    - For an arbitrary query Q, f(Q) = g(Q) with high probability
Optimal size queries
 Queries of size n- |Q | = n

```
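The "Non-combinatorial Setting" and "Hardness of learning submodular functions" slides can be made concrete with a small experiment. The sketch below is illustrative only: the function names, the natural logarithm, the choice δ = 1, and the outer cap of g_R at α (which keeps g_R ≤ f) are assumptions of this example, not details confirmed by the slides. It draws a hidden set R, asks random queries of size α, and counts how many of them return different values under f and g_R.

```python
import math
import random


def f(S, alpha):
    """f(S) = min(|S|, alpha)."""
    return min(len(S), alpha)


def g_R(S, R, alpha, beta):
    """g_R(S) = min(|S minus R| + min(|S and R|, beta), alpha).

    The outer cap at alpha is an assumption of this sketch.
    """
    inside = len(S & R)          # elements of the query that hit the hidden set R
    outside = len(S) - inside    # elements of the query outside R
    return min(outside + min(inside, beta), alpha)


def count_distinguishing_queries(n, delta=1.0, trials=200, seed=0):
    """Draw R, then count random size-alpha queries Q with f(Q) != g_R(Q)."""
    rng = random.Random(seed)
    alpha = int(math.sqrt(n) * math.log(n))       # alpha = n^(1/2) log n
    beta = int((1 + delta) * alpha * alpha / n)   # beta = (1 + delta) E[|Q and R|]
    R = set(rng.sample(range(n), alpha))          # hidden set, |R| = alpha
    distinguished = 0
    for _ in range(trials):
        Q = set(rng.sample(range(n), alpha))      # a query of the optimal size alpha
        if f(Q, alpha) != g_R(Q, R, alpha, beta):
            distinguished += 1
    return alpha, beta, R, distinguished


if __name__ == "__main__":
    alpha, beta, R, distinguished = count_distinguishing_queries(10_000)
    print("f(R) =", f(R, alpha), " g_R(R) =", g_R(R, R, alpha, beta))
    print("queries that told f and g_R apart:", distinguished)
```

A query only separates the two functions when it hits R in more than β elements, which is why the gap between f(R) = α and g_R(R) = β stays hidden from random queries while still being large.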
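The guess, prune and search steps from the "Attempt 1" slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `fewest_edge_path` and `prune_and_search`, the toy graph, and the example cost function f(S) = √(Σ c_e) (monotone and submodular because it is a concave function of an additive one) are all hypothetical. "Guessing" e* is done by trying every edge in turn.

```python
import math
from collections import deque


def fewest_edge_path(edges, s, t):
    """BFS: return an s-t path with the fewest edges as a list of edges, or None."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            break
        for v in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if t not in parent:
        return None
    path, node = [], t
    while parent[node] is not None:
        # Store each edge in canonical (sorted) form so it matches the input edges.
        path.append(tuple(sorted((parent[node], node))))
        node = parent[node]
    path.reverse()
    return path


def prune_and_search(edges, s, t, f):
    """Try every edge as the guess e*, prune costlier edges, keep the cheapest path found."""
    w = {e: f(frozenset([e])) for e in edges}      # w_e = f({e})
    best_path, best_cost = None, math.inf
    for guess in edges:
        kept = [e for e in edges if w[e] <= w[guess]]   # pruning step
        path = fewest_edge_path(kept, s, t)             # search step
        if path is None:
            continue
        cost = f(frozenset(path))
        if cost < best_cost:
            best_path, best_cost = path, cost
    return best_path, best_cost


# Toy instance (hypothetical): one expensive direct edge and a cheap three-edge detour.
COSTS = {(0, 3): 16.0, (0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}


def example_cost(S):
    """Example submodular cost: square root of the total additive cost."""
    return math.sqrt(sum(COSTS[e] for e in S))


if __name__ == "__main__":
    print(prune_and_search(list(COSTS), 0, 3, example_cost))
```

On this toy instance the unpruned search returns the single direct edge, while guessing a cheap edge as e* removes it and exposes the three-edge path, which is cheaper under the example cost function.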