# Approximation Algorithms for Stochastic Optimization

## Risk-averse Stochastic Optimization: Probabilistically-constrained Models + Algorithms for Black-box Distributions

Chaitanya Swamy, University of Waterloo
## Two-Stage Recourse Model

Given: a probability distribution over inputs.

Stage I: Make some advance decisions, to plan ahead or hedge against uncertainty.
Then the actual input scenario is observed.
Stage II: Take recourse; we can augment the earlier solution, paying a recourse cost.

Choose the stage I decisions to minimize
(stage I cost) + (expected stage II recourse cost).
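Schematically (the symbols c, x, f_A, D are notation introduced here, not in the slides): writing x for the stage-I decisions with cost c·x, and f_A(x) for the optimal recourse cost in scenario A given x, the objective is

```latex
\min_{x}\; c\cdot x \;+\; \mathbb{E}_{A\sim\mathcal{D}}\bigl[f_A(x)\bigr],
\qquad
f_A(x) \;=\; \min\bigl\{\text{recourse cost of } y_A \;:\; (x,y_A)\ \text{is feasible for scenario } A\bigr\}.
```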
## 2-Stage Stochastic Facility Location

A distribution D over client sets gives the set of clients to serve.

Stage I: Open some facilities in advance; pay cost fi for facility i.
  Stage I cost = ∑(i opened) fi.
The actual scenario A = {clients to serve} materializes.
Stage II: Can open more facilities to serve the clients in A; pay cost fiA to open facility i in scenario A. Assign the clients in A to open facilities.
  Stage II cost = ∑(i opened in scenario A) fiA + (cost of serving the clients in A).

Want to decide which facilities to open in stage I.
Goal: Minimize Total Cost =
(stage I cost) + E(A∼D)[stage II cost for A].

## How is the probability distribution specified?

• A short (polynomial-size) list of possible scenarios
• Independent probabilities that each client exists
• A black box that can be sampled from: the black-box setting
## Risk-averse stochastic optimization

• The E[.] measure does not adequately model the "risk" associated with stage-I decisions.
• Same E[.] value ≠ same "risk involved": given two solutions with the same E[.] cost, prefer the solution with the more "assured" or "reliable" second-stage costs. E.g., portfolio investment.
• Want to capture this notion of risk-averseness, where one seeks to avoid disaster scenarios.
## Modeling risk-aversion, attempt 1: the budget model

Choose stage I decisions to minimize
(stage I cost) + (expected stage II recourse cost)
subject to
(stage II cost of scenario A) ≤ B for every scenario A.

Gupta-Ravi-Sinha considered stochastic Steiner tree in this budget model in the polynomial-scenario setting.

The budget model provides the greatest degree of risk-aversion, BUT
– limited modeling power: no approximation guarantees are possible in the black-box setting with bounded sample size;
– overly conservative: it protects every scenario regardless of its probability.
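In the same schematic notation as before (c·x for the stage-I cost, f_A(y_A) for the stage-II cost of scenario A; symbols introduced here for illustration), the budget model reads:

```latex
\begin{aligned}
\min_{x,\{y_A\}}\quad & c\cdot x + \mathbb{E}_{A}\bigl[f_A(y_A)\bigr]\\
\text{s.t.}\quad & (x,y_A)\ \text{feasible for scenario } A &&\forall A,\\
& f_A(y_A)\le B &&\forall A.
\end{aligned}
```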
## A closely-related model: the robust model

Choose stage I decisions to minimize
(stage I cost) + (maximum stage II recourse cost).

• Dhamdhere et al. considered this model, again in the polynomial-scenario setting.
• "Guessing" B = max. (stage II cost) "reduces" the robust problem to the budget problem.
• Modeling issues: it is not clear how to even specify exponentially many scenarios.
  – Feige et al.: scenarios specified by a cardinality constraint; seems rather stylized for stochastic optimization.
  – We will consider the distribution-based robust model: scenario-collection = support of the distribution.
• Same drawbacks as the budget model: no guarantees are possible in the black-box setting.
## Modeling risk-aversion, attempt 2

Recall the budget model:
Choose stage I decisions to minimize
(stage I cost) + (expected stage II recourse cost)
subject to
(stage II cost of scenario A) ≤ B for every scenario A.

• For the budget model, one can prove approximation results if one is allowed to violate the budget constraints with a small probability.
• We can turn this solution concept around and incorporate it into the model, arriving at the following new model.
## Modeling risk-aversion, attempt 2: the risk-averse budget model

Choose stage I decisions to minimize
(stage I cost) + (expected stage II recourse cost)
subject to
Pr_A[(stage II cost of scenario A) > B] ≤ ρ.

ρ is part of the input: it lets one trade off risk-averseness against conservatism.

• This is called a probabilistically-constrained or chance-constrained program.
• The chance constraint is called a Value-at-Risk (VaR) constraint in the finance literature, where it is popular for risk-optimization.
• A related robust model: minimize (stage I cost) + (1−ρ)-quantile of the (stage II recourse cost).
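In the schematic notation used earlier (c·x for the stage-I cost, f_A(y_A) for the stage-II cost of scenario A; ρ, written r in some of the slides, is the input risk bound), the risk-averse budget model replaces the per-scenario budget constraints with a single chance constraint:

```latex
\begin{aligned}
\min_{x,\{y_A\}}\quad & c\cdot x + \mathbb{E}_{A}\bigl[f_A(y_A)\bigr]\\
\text{s.t.}\quad & (x,y_A)\ \text{feasible for scenario } A &&\forall A,\\
& \Pr_{A}\bigl[f_A(y_A) > B\bigr]\le \rho.
\end{aligned}
```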
## Approximation Algorithms

It is hard to solve the problem exactly; even special cases are #P-hard. So we settle for approximate solutions: give a polynomial-time algorithm that always finds near-optimal solutions.

A is an α-approximation algorithm if
• A runs in polynomial time, and
• A(I) ≤ α·OPT(I) on all instances I.
α is called the approximation ratio of A.
## Our Results

• We obtain approximation algorithms for various risk-averse budgeted (and robust) problems in the black-box setting: facility location, set cover, vertex cover, multicut on trees, min cut.
• We give a fully polynomial approximation scheme for solving the LP-relaxations of a large class of risk-averse problems ⇒ can use existing algorithms for the deterministic or 2-stage version of a problem to get an approximation algorithm for its risk-averse version.
• These are the first approximation results for chance-constrained programs with black-box distributions (Kleinberg-Rabani-Tardos consider chance-constrained versions of bin-packing and knapsack, but for specialized product distributions).
## Related Work

• Gupta et al. gave a constant-factor approximation for stochastic Steiner tree in the poly-scenario budget model.
• Dhamdhere et al. and Feige et al. gave approximation algorithms for various problems in the robust model with poly-scenarios and cardinality-collections.
• So-Zhang-Ye consider another risk measure, conditional VaR, and give an approximation scheme for solving the LP-relaxations of problems in the black-box setting.
  – Our techniques can be used to solve a generalization of their model, with probabilistic budget constraints.
• There is much work in the standard 2-stage model: Dye et al., Ravi-Sinha, Immorlica et al., Gupta et al., Shmoys-Swamy, Swamy-Shmoys, …
## Risk-averse Set Cover (RASC)

Universe U = {e1, …, en}, subsets S1, S2, …, Sm ⊆ U; set S has weight wS.

Deterministic problem (DSC): pick a minimum-weight collection of sets that covers every element.

Risk-averse budgeted version: the target set of elements to be covered is given by a probability distribution.
– choose some sets initially, paying wS for set S;
– the subset A ⊆ U to be covered is revealed;
– can pick additional sets, paying wS^A for set S in scenario A.

Minimize (w-cost of sets picked in stage I) + E_A[w^A-cost of new sets picked for scenario A]
subject to Pr_A[w^A-cost for scenario A > B] ≤ ρ.
## Fractional risk-averse set cover

Fractional risk-averse problem: we may buy sets fractionally, in stage I and in each scenario A, so as to cover each element of A to an extent of 1.

It is not clear how to solve even the fractional problem in the polynomial-scenario setting.
Why? The set of feasible solutions
{(x, {yA}_A) : (x, yA) covers A for each scenario A, Pr_A[∑S wS^A yA,S > B] ≤ ρ}
is NOT a convex set.
How, then, do we get an LP-relaxation?
## An LP for fractional RASC

For simplicity, suppose wS^A = WS for every scenario A.

xS: indicates if set S is picked in stage I
rA: indicates if the budget constraint is NOT met for A
{yA,S}: decisions in scenario A when the budget constraint is met for A
{zA,S}: decisions in scenario A when the budget constraint is not met for A

Minimize ∑S wS xS + ∑(A⊆U) pA ∑S WS (yA,S + zA,S)
subject to
  ∑A pA rA ≤ ρ    (the coupling constraint)
  ∑S WS yA,S ≤ B    for each A
  ∑(S: e∈S) xS + ∑(S: e∈S) yA,S + rA ≥ 1    for each A, e∈A
  ∑(S: e∈S) xS + ∑(S: e∈S) (yA,S + zA,S) ≥ 1    for each A, e∈A
  xS, yA,S, zA,S ≥ 0    for each S, A.

• Exponential number of variables and exponential number of constraints.
• The scenarios are no longer separable: a first-stage solution x alone is not enough to specify an LP solution; one also needs to specify the rA's. So what does "solving the LP" even mean?
  – Contrast this with the standard 2-stage model, or with the fractional risk-averse problem.
Theorem 1: For any ε, κ > 0, in time poly(input size, 1/ε, 1/κ, 1/ρ), we can compute a first-stage solution x that extends to an LP solution (x, {(yA, zA, rA)}_A) of cost ≤ (1+ε)·OPT with ∑A pA rA ≤ ρ(1+κ).
The dependence on 1/κρ is unavoidable in the black-box setting.

Theorem 2 (rounding theorem): Given a solution x that extends to an LP solution (x, {(yA, zA, rA)}_A) of cost C with ∑A pA rA = P, we can round x to
• a solution x' for fractional RASC s.t. w·x' + E_A[opt. fractional cost of A] ≤ 2C and Pr_A[opt. fractional cost of A > 2B] ≤ 2P
  [can now use any LP-based "local" approximation algorithm for 2-stage SC to round x'];
• a solution (X, {YA}_A) for (integer) RASC s.t. w·X + E_A[W·YA] ≤ 4αC and Pr_A[W·YA > 4αB] ≤ 2P, using any LP-based α-approximation algorithm for DSC.
## Rounding the LP

Given a solution x that extends to an LP solution (x, {(yA, zA, rA)}_A) of cost C with ∑A pA rA = P.

LP constraints:
  ∑S WS yA,S ≤ B    for each A
  ∑(S: e∈S) xS + ∑(S: e∈S) yA,S + rA ≥ 1    for each A, e∈A
  ∑(S: e∈S) xS + ∑(S: e∈S) (yA,S + zA,S) ≥ 1    for each A, e∈A

For every A, either rA ≥ 0.5, OR ∑(S: e∈S) xS + ∑(S: e∈S) yA,S ≥ 0.5 for each e∈A.

"Threshold rounding": if rA ≥ 0.5, set r'A = 1, else r'A = 0; set x' = 2x.
Let fA(x') = optimal fractional cost of scenario A given the stage-I solution x'.
Then fA(x') ≤ W·(yA + zA) ⇒ w·x' + E_A[fA(x')] ≤ 2C.
In scenario A, if rA < 0.5, then (x', 2yA) covers A ⇒ fA(x') ≤ 2B.
So Pr_A[fA(x') > 2B] ≤ Pr_A[rA ≥ 0.5] ≤ 2·∑A pA rA = 2P (by Markov's inequality).
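The threshold-rounding step is mechanical enough to sketch in code. This is an illustrative toy, with made-up fractional LP values: it doubles the first-stage vector, rounds each r_A at the 1/2 threshold, and checks the Markov-inequality bound that the rounded violation probability is at most twice the fractional one.

```python
def threshold_round(x, r, p):
    """Threshold-rounding sketch for the risk-averse set cover LP
    (notation from the slides): scenarios A with r_A >= 1/2 are declared
    budget-violating (r'_A = 1); doubling x keeps the remaining
    scenarios fractionally covered."""
    x2 = {S: 2.0 * v for S, v in x.items()}            # x' = 2x
    r2 = {A: 1.0 if rA >= 0.5 else 0.0 for A, rA in r.items()}
    mass_before = sum(p[A] * r[A] for A in r)          # = sum_A p_A r_A = P
    mass_after = sum(p[A] * r2[A] for A in r2)         # = sum_A p_A r'_A
    assert mass_after <= 2.0 * mass_before + 1e-12     # Markov: at most 2P
    return x2, r2

# toy fractional LP values (hypothetical)
x = {'S1': 0.25, 'S2': 0.5}
r = {'A1': 0.1, 'A2': 0.6, 'A3': 0.7}
p = {'A1': 0.5, 'A2': 0.25, 'A3': 0.25}
x2, r2 = threshold_round(x, r, p)
```

Here the fractional violation mass is 0.375, while the rounded mass is 0.5 ≤ 2·0.375, as the Markov argument guarantees.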
## Rounding (contd.)

Rounding x' to an integer solution to RASC: we can use an α-approximation algorithm for the 2-stage stochastic problem that is
(i) LP-based, (ii) "local", i.e., gives per-scenario cost guarantees, and
(iii) can be implemented given only a first-stage solution,
to obtain an integer solution (X, {YA}_A) of cost ≤ α·2C with Pr_A[cost of A > α·2B] ≤ 2P.

• Set cover, vertex cover, multicut on trees: Shmoys-Swamy gave such a 2β-approximation algorithm using an LP-based β-approximation algorithm for the deterministic problem ⇒ ratios of 4 log n, 8, and 8 respectively.
• Min s-t cut: can use the O(log n)-approximation algorithm of Dhamdhere et al. for stochastic min s-t cut, which is local.
• Facility location: not set cover, but a very similar rounding works; get an 11-approximation using a variant of the Shmoys-Swamy algorithm for 2-stage FL.
## Solving the fractional-RASC LP: Sample Average Approximation

Sample Average Approximation (SAA) method:
– draw N samples from the distribution;
– estimate pA by qA = frequency of occurrence of scenario A = nA/N;
– construct the sample-average LP, in which pA is replaced by qA.

How large should N be?

Desired result: with polynomially-bounded N,
x is an optimal solution to the sample-average problem ⇒
x is a near-optimal solution to the true problem, with a small blow-up of ρ.
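The sampling step of the SAA method is easy to make concrete. The sketch below uses a hypothetical black box (two elements appearing independently with probability 1/2, so each of the four scenarios has true probability 0.25) and builds the empirical probabilities q_A = n_A/N that replace p_A in the sample-average LP.

```python
import random
from collections import Counter

def sample_average_probs(black_box, N, rng):
    """Draw N scenarios from the black box and return the empirical
    distribution q_A = n_A / N (scenarios keyed as frozensets)."""
    counts = Counter(frozenset(black_box(rng)) for _ in range(N))
    return {A: n / N for A, n in counts.items()}

# hypothetical black box: elements e1, e2 appear independently w.p. 1/2
def black_box(rng):
    return {e for e in ('e1', 'e2') if rng.random() < 0.5}

q = sample_average_probs(black_box, N=20000, rng=random.Random(1))
# each q[A] approximates the true scenario probability 0.25; substituting
# q for p in the LP gives the sample-average LP
```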
## Solving the fractional-RASC LP

Minimize ∑S wS xS + ∑(A⊆U) pA ∑S WS (yA,S + zA,S)
subject to
  ∑A pA rA ≤ ρ    (*)  [Lagrange multiplier Δ ≥ 0]
  ∑S WS yA,S ≤ B    for each A
  ∑(S: e∈S) xS + ∑(S: e∈S) yA,S + rA ≥ 1    for each A, e∈A
  ∑(S: e∈S) xS + ∑(S: e∈S) (yA,S + zA,S) ≥ 1    for each A, e∈A
  xS, yA,S, zA,S ≥ 0    for each S, A.

Step 1) Lagrangify the coupling constraint (*) to get a separable problem:

Max(Δ≥0) [ −Δρ + min ( ∑S wS xS + ∑(A⊆U) pA (Δ·rA + ∑S WS (yA,S + zA,S)) ) ]    (the inner minimum is OPT(Δ))
subject to
  ∑S WS yA,S ≤ B    for each A
  ∑(S: e∈S) xS + ∑(S: e∈S) yA,S + rA ≥ 1    for each A, e∈A
  ∑(S: e∈S) xS + ∑(S: e∈S) (yA,S + zA,S) ≥ 1    for each A, e∈A
  xS, yA,S, zA,S ≥ 0    for each S, A.

After Lagrangification, the inner minimization problem becomes a separable 2-stage problem: writing gA(Δ; x) for the optimal cost of scenario A given stage-I solution x (including the Δ·rA term), it is min(x∈P) h(Δ; x), where h(Δ; x) = w·x + ∑(A⊆U) pA gA(Δ; x).
## Solving the fractional-RASC LP (contd.)

Max(Δ≥0) [ −Δρ + min(x∈P) ( h(Δ; x) = w·x + ∑(A⊆U) pA gA(Δ; x) ) ]

Step 2) Argue that, for each fixed Δ, we can efficiently compute a "near-optimal" solution to the inner minimization problem.

Step 3) Use this to search for the "right" value of the Lagrange multiplier Δ. The search is complicated because (i) we only have approximate solutions for each Δ, and (ii) we cannot actually compute ∑A pA rA, but have to estimate it.

Problem with step 2): we cannot compute a "good" optimal solution; this 2-stage problem does not fall into the solvable classes of Shmoys-Swamy or Charikar-Chekuri-Pal, so their arguments do not directly apply.

Crucial insight: for the search in step 3) to work, it suffices to prove the weak guarantee that we can compute x with h(Δ; x) ≲ (1+ε)·OPT(Δ) + η·Δ.
This guarantee is weak enough that sample average approximation can be shown to work, via the approximate-subgradient proof technique of Swamy-Shmoys.
## Step 2) A near-optimal solution for a fixed Δ

Use sample average approximation: replace
  min(x∈P) ( h(Δ; x) = w·x + ∑(A⊆U) pA gA(Δ; x) )    (P_Δ)
with
  min(x∈P) ( h'(Δ; x) = w·x + ∑(A⊆U) qA gA(Δ; x) )    (SA-P_Δ)
where qA = frequency of occurrence of scenario A in N samples.

To show: with polynomially-bounded N,
(*) if x solves (SA-P_Δ), then h(Δ; x) ≲ (1+ε)·OPT(Δ) + η·Δ.

h(Δ; ·) and h'(Δ; ·) can take very different values; BUT we can prove (*) by showing that their "slopes" (subgradients) are close to each other.
## Closeness in subgradients

For a convex function g: ℝ^m → ℝ:
d ∈ ℝ^m is a subgradient of g(·) at u if, for all v, g(v) − g(u) ≥ d·(v − u).
d is an (ε, η)-subgradient of g at u if, for all v, g(v) − g(u) ≥ d·(v − u) − ε·g(v) − ε·g(u) − η.

Closeness-in-subgradients: at "most" points u in P, there exists a vector d'_u such that
(#) d'_u is a subgradient of g'(·) at u, AND an (ε, η)-subgradient of g(·) at u.

Lemma (Swamy-Shmoys): for any convex functions g(·), g'(·), if (#) holds, then
x solves min(x∈P) g'(x) ⇒ x is a near-optimal solution to min(x∈P) g(x).
[(#) holds with high probability for h(Δ; ·) and h'(Δ; ·), for suitable ε, η.]

Intuition:
• The minimizer of a convex function is determined by its subgradients.
• The ellipsoid-based algorithm of Shmoys-Swamy (SS04) for convex minimization only uses (ε-)subgradients: it uses an (ε-)subgradient d_u to cut the ellipsoid at a feasible point u ∈ P.
• (#) ⇒ we can run the SS04 algorithm on both min(x∈P) g(x) and min(x∈P) g'(x), using the same vector d'_u to cut the ellipsoid at each u ∈ P ⇒ the algorithm returns an x that is near-optimal for both problems.
## Closeness for h(Δ; ·) and h'(Δ; ·)

True problem:            min(x∈P) ( h(Δ; x) = w·x + ∑(A⊆U) pA gA(Δ; x) )    (P_Δ)
Sample-average problem:  min(x∈P) ( h'(Δ; x) = w·x + ∑(A⊆U) qA gA(Δ; x) )    (SA-P_Δ)

To show: at "most" points u in P, there exists a vector d'_u such that
d'_u is a subgradient of h'(Δ; ·) at u, AND an (ε, ηΔ)-subgradient of h(Δ; ·) at u.

Fix u ∈ P, and let λ = max_S WS/wS.
• A subgradient of h(Δ; ·) at u is d_u = (d_{u,S}) with d_{u,S} = wS − ∑A pA zA,S = wS − E[zA,S], where zA,S is a quantity derived from an optimal dual solution to gA(Δ; u).
• A subgradient of h'(Δ; ·) at u is d'_u = (d'_{u,S}) with d'_{u,S} = wS − ∑A qA zA,S = wS − E'[zA,S].
• The structure of the dual implies that zA,S ≤ WS + Δ for all S
  ⇒ using poly(λ, 1/ε, 1/η) samples, we can ensure |d'_{u,S} − d_{u,S}| ≤ ε·wS + ηΔ/2m for all S whp,
  which suffices to show that d'_u is an (ε, ηΔ)-subgradient of h(Δ; ·) at u whp.
A union bound then shows that this holds for "most" points in P.
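The sampling step above, estimating each true subgradient coordinate w_S − E[z_{A,S}] by its empirical average, can be sketched as follows. The distribution of z is a made-up stand-in (a bounded uniform variable); what matters for the concentration argument, as in the slides, is only that z is bounded by W_S + Δ (Δ being the Lagrange multiplier, written D in some slides).

```python
import random

def estimate_coordinate(w_S, draw_z, N, rng):
    """Empirical subgradient coordinate d'_{u,S} = w_S - (1/N) * sum z_{A,S},
    approximating the true coordinate d_{u,S} = w_S - E[z_{A,S}]."""
    return w_S - sum(draw_z(rng) for _ in range(N)) / N

w_S, W_S, Delta = 1.0, 3.0, 2.0        # hypothetical weights and multiplier
# stand-in for z_{A,S}: any distribution bounded by W_S + Delta will do
draw_z = lambda rng: rng.uniform(0.0, W_S + Delta)   # mean (W_S + Delta)/2

d_true = w_S - (W_S + Delta) / 2.0     # true coordinate, using the known mean
d_est = estimate_coordinate(w_S, draw_z, N=50000, rng=random.Random(0))
# the bound z <= W_S + Delta is what lets polynomially many samples drive
# |d_est - d_true| below the eps*w_S + eta*Delta/2m error tolerance
```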
## Summary and Extensions

• Although the LP-relaxation of the (fractional) problem is non-separable and has exponential size, we can still compute near-optimal first-stage LP decisions: we present an FPTAS.
  – First-stage LP decisions suffice to round and obtain a near-optimal solution to the fractional problem, which can be further rounded using various known approximation algorithms.
  – Many applications (set cover, vertex cover, facility location, min s-t cut, multicut on trees): we obtain the first approximation algorithms for chance constraints in the black-box model.
• We get the same results for (i) non-uniform budgets; (ii) risk-averse robust problems; (iii) simultaneous budget constraints, e.g., Pr[facility cost > B_F or service cost > B_S or total cost > B] ≤ ρ; and (iv) the B = 0 problem, an interesting one-stage problem: choose initial decisions so as to satisfy "most" scenarios.
## Open Questions

• Approximation results for other problems in the risk-averse models.
• Models and algorithms for multi-stage risk-averse stochastic optimization (in the black-box setting).
• Risk-averse stochastic scheduling.
• Other combinations of multiple probabilistic budget constraints.

Thank You.
