Approximation Algorithms for Stochastic Optimization
Chaitanya Swamy, Caltech and U. Waterloo
Joint work with David Shmoys, Cornell University

Stochastic Optimization
• Way of modeling uncertainty.
• Exact data is unavailable or expensive – data is uncertain, specified by a probability distribution. Want to make the best decisions given this uncertainty in the data.
• Applications in logistics, transportation models, financial instruments, network design, production planning, …
• Dates back to the 1950s and the work of Dantzig.

Stochastic Recourse Models
Given: Probability distribution over inputs.
Stage I: Make some advance decisions – plan ahead or hedge against uncertainty.
Uncertainty evolves through various stages. Learn new information in each stage.
Can take recourse actions in each stage – can augment earlier solution paying a recourse cost.
Choose initial (stage I) decisions to minimize (stage I cost) + (expected recourse cost).

[Figure: scenario trees with branch probabilities. 2-stage problem: 2 decision points, stage I followed by stage II scenarios. k-stage problem: k decision points, stage I, stage II, …, scenarios in stage k.]

Choose stage I decisions to minimize expected total cost = (stage I cost) + E_{all scenarios}[cost of stages 2 … k].

2-Stage Stochastic Facility Location
Distribution over clients gives the set of clients to serve.
Stage I: Open some facilities in advance; pay cost f_i for facility i.
Stage I cost = ∑_{i opened} f_i.
[Figure: facilities, stage I facilities, and the client set D.]
Actual scenario A = {clients to serve} materializes.
Stage II: Can open more facilities to serve clients in A; pay cost f_i^A to open facility i. Assign clients in A to facilities.
Stage II cost = ∑_{i opened in scenario A} f_i^A + (cost of serving clients in A).
Want to decide which facilities to open in stage I.
Goal: Minimize Total Cost = (stage I cost) + E_{A⊆D}[stage II cost for A].

How is the probability distribution specified?
• A short (polynomial) list of possible scenarios
• Independent probabilities that each client exists
• A black box that can be sampled.

Approximation Algorithm
Hard to solve the problem exactly. Even special cases are #P-hard.
Settle for approximate solutions. Give a polytime algorithm that always finds near-optimal solutions.
A is an α-approximation algorithm if
• A runs in polynomial time.
• A(I) ≤ α·OPT(I) on all instances I.
α is called the approximation ratio of A.

Overview of Previous Work
• Polynomial scenario model: Dye, Stougie & Tomasgard; Ravi & Sinha; Immorlica, Karger, Minkoff & Mirrokni.
• Immorlica et al.: also consider the independent-activation model with proportional costs: (stage II cost) = λ(stage I cost), e.g., f_i^A = λ·f_i for each facility i, in each scenario A.
• Gupta, Pál, Ravi & Sinha (GPRS04): black-box model but also with proportional costs.
• Shmoys, S (SS04): black-box model with arbitrary costs.
  – Approximation scheme for 2-stage LPs + rounding procedure “reduces” stochastic problems to their deterministic versions.
  – For some problems, improves upon previous results.

Boosted Sampling (GPRS04)
Proportional costs: (stage II cost) = λ(stage I cost)
Note: λ is the same as σ in the previous talk.
– Sample λ times from the distribution.
– Use a “suitable” algorithm to solve the deterministic instance consisting of the sampled scenarios (e.g., all sampled clients) – this determines the stage I decisions.
Analysis relies on the existence of cost-shares that can be used to share the stage I cost among sampled scenarios.

Shmoys, S ’04 vs. Boosted sampling
Both work in the black-box model: arbitrary distributions.
– Shmoys, S ’04: Can handle arbitrary costs in the two stages.
  Boosted sampling: Need proportional costs: (stage II cost) = λ(stage I cost); λ can depend on scenario.
– Shmoys, S ’04: LP rounding: give an algorithm to solve the stochastic LP.
  Boosted sampling: Primal-dual approach: cost-shares obtained by exploiting structure via primal-dual schema.
– Shmoys, S ’04: Need many more samples to solve stochastic LP.
  Boosted sampling: Need only λ samples.

Stochastic Set Cover (SSC)
Universe U = {e_1, …, e_n}, subsets S_1, S_2, …, S_m ⊆ U, set S has weight w_S.
Deterministic problem: Pick a minimum weight collection of sets that covers each element.
Stochastic version: Set of elements to be covered is given by a probability distribution.
– choose some sets initially, paying w_S for set S
– subset A ⊆ U to be covered is revealed
– can pick additional sets, paying w_S^A for set S.
Minimize (w-cost of sets picked in stage I) + E_{A⊆U}[w^A-cost of new sets picked for scenario A].

A Linear Program for SSC
For simplicity, consider w_S^A = W_S for every scenario A.
w_S : stage I weight of set S
p_A : probability of scenario A ⊆ U.
x_S : indicates if set S is picked in stage I.
y_{A,S} : indicates if set S is picked in scenario A.
Minimize ∑_S w_S x_S + ∑_{A⊆U} p_A ∑_S W_S y_{A,S}
subject to
  ∑_{S: e∈S} x_S + ∑_{S: e∈S} y_{A,S} ≥ 1   for each A ⊆ U, e ∈ A
  x_S, y_{A,S} ≥ 0   for each S, A.
Exponential number of variables and exponential number of constraints.

A Rounding Theorem
Assume LP can be solved in polynomial time.
Suppose for the deterministic problem, we have an α-approximation algorithm wrt. the LP relaxation, i.e., A such that A(I) ≤ α·(optimal LP solution for I) for every instance I.
e.g., “the greedy algorithm” for set cover is a log n-approximation algorithm wrt. LP relaxation.
Theorem: Can use such an α-approx. algorithm to get a 2α-approximation algorithm for stochastic set cover.

Rounding the LP
Assume LP can be solved in polynomial time.
Suppose we have an α-approximation algorithm wrt. the LP relaxation for the deterministic problem.
Let (x,y) : optimal solution with cost OPT.
∑_{S: e∈S} x_S + ∑_{S: e∈S} y_{A,S} ≥ 1 for each A ⊆ U, e ∈ A,
so for every element e, either ∑_{S: e∈S} x_S ≥ ½ OR in each scenario A with e ∈ A, ∑_{S: e∈S} y_{A,S} ≥ ½.
Let E = {e : ∑_{S: e∈S} x_S ≥ ½}.
So (2x) is a fractional set cover for the set E, and we can round to get an integer set cover 𝒮 for E of cost ∑_{S∈𝒮} w_S ≤ α(∑_S 2 w_S x_S).
𝒮 is the first stage decision.

Rounding (contd.)
[Figure: sets and elements, marking the sets in 𝒮, the elements in E, and a scenario A.]
Consider any scenario A. Elements in A ∩ E are covered.
For every e ∈ A\E, it must be that ∑_{S: e∈S} y_{A,S} ≥ ½.
So (2y_A) is a fractional set cover for A\E, and we can round to get a set cover of W-cost ≤ α(∑_S 2 W_S y_{A,S}).
Using this to augment 𝒮 in scenario A, expected cost ≤ ∑_{S∈𝒮} w_S + 2α·∑_{A⊆U} p_A (∑_S W_S y_{A,S}) ≤ 2α·OPT.

Rounding (contd.)
An α-approx. algorithm for the deterministic problem gives a 2α-approximation guarantee for the stochastic problem.
In the polynomial-scenario model, gives simple polytime approximation algorithms for covering problems.
• 2 log n-approximation for SSC.
• 4-approximation for stochastic vertex cover.
• 4-approximation for stochastic multicut on trees.
Ravi & Sinha gave a log n-approximation algorithm for SSC and a 2-approximation algorithm for stochastic vertex cover in the polynomial-scenario model.

A Compact Convex Program
p_A : probability of scenario A ⊆ U.
x_S : indicates if set S is picked in stage I.
Minimize h(x) = ∑_S w_S x_S + ∑_{A⊆U} p_A f_A(x)   s.t. x_S ≥ 0 for each S   (SSC-P)
where f_A(x) = min. ∑_S W_S y_{A,S}
  s.t. ∑_{S: e∈S} y_{A,S} ≥ 1 − ∑_{S: e∈S} x_S   for each e ∈ A
       y_{A,S} ≥ 0   for each S.
Equivalent to earlier LP.
Each f_A(x) is convex, so h(x) is a convex function.

The General Strategy
1. Get a (1+ε)-optimal fractional first-stage solution (x) by solving the convex program.
2. Convert fractional solution (x) to integer solution
   – decouple the two stages near-optimally
   – use α-approx. algorithm for the deterministic problem to solve subproblems.
Obtain a c·α-approximation algorithm for the stochastic integer problem.
Many applications: set cover, vertex cover, facility location, multicut on trees, …

Solving the Convex Program
Minimize h(x) subject to x ∈ P. h(.) : convex
Ellipsoid method
• Need a procedure that at any point y,
  – if y ∉ P, returns a violated inequality which shows that y ∉ P
  – if y ∈ P, computes the subgradient of h(.) at y.
  d is a subgradient of h(.) at u, if ∀v, h(v) − h(u) ≥ d·(v − u).
• Given such a procedure, ellipsoid runs in polytime and returns points x_1, x_2, …, x_k ∈ P such that min_{i=1…k} h(x_i) is close to OPT.
[Figure: polytope P, a point y with subgradient d, and the region h(x) ≤ h(y).]
Computing subgradients is hard. Evaluating h(.) is hard.

Solving the Convex Program (contd.)
Minimize h(x) subject to x ∈ P. h(.) : convex
Ellipsoid method
• Need a procedure that at any point y,
  – if y ∉ P, returns a violated inequality which shows that y ∉ P
  – if y ∈ P, computes an approximate subgradient of h(.) at y.
  d' is an ε-subgradient at u, if ∀v ∈ P, h(v) − h(u) ≥ d'·(v − u) − ε·h(u).
  Can compute ε-subgradients by sampling.
• Given such a procedure, can compute a point x ∈ P such that h(x) ≤ OPT/(1−ε) + ρ without ever evaluating h(.)!

Putting it all together
Get solution x with h(x) close to OPT.
Sample initially to detect if OPT is large – this allows one to get a (1+ε)·OPT guarantee.
Theorem: (SSC-P) can be solved to within a factor of (1+ε) in polynomial time, with high probability.
Gives a (2 log n + ε)-approximation algorithm for the stochastic set cover problem.
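
The decoupling step of the rounding procedure for stochastic set cover can be sketched in a few lines. This is only an illustration under stated assumptions: a fractional first-stage solution x is taken as given (solving the convex program is not shown), the greedy set cover algorithm stands in for the α-approximation algorithm, second-stage weights W_S are scenario-independent as in the simplified LP, and the function names `greedy_cover`, `round_stage_one` and `augment_for_scenario` are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Element = Hashable
SetFamily = Dict[str, Set[Element]]


def greedy_cover(targets: Iterable[Element], sets: SetFamily,
                 weight: Dict[str, float]) -> List[str]:
    """Greedy weighted set cover of `targets` (stand-in for the alpha-approx. algorithm)."""
    uncovered = set(targets)
    chosen: List[str] = []
    while uncovered:
        # Only sets that cover at least one still-uncovered target are useful.
        useful = [s for s in sets if sets[s] & uncovered]
        if not useful:
            raise ValueError("some target element lies in no set")
        # Pick the set with the smallest weight per newly covered element.
        best = min(useful, key=lambda s: weight[s] / len(sets[s] & uncovered))
        chosen.append(best)
        uncovered -= sets[best]
    return chosen


def round_stage_one(x: Dict[str, float], sets: SetFamily,
                    w: Dict[str, float], tol: float = 1e-9
                    ) -> Tuple[Set[Element], List[str]]:
    """Return E = {e : sum of x_S over sets containing e >= 1/2} and a stage I cover of E."""
    universe: Set[Element] = set().union(*sets.values())
    E = {e for e in universe
         if sum(x[s] for s in sets if e in sets[s]) >= 0.5 - tol}
    return E, greedy_cover(E, sets, w)


def augment_for_scenario(scenario: Iterable[Element], E: Set[Element],
                         sets: SetFamily, W: Dict[str, float]) -> List[str]:
    """Stage II: cover A \\ E at second-stage weights W; A ∩ E is covered by stage I."""
    return greedy_cover(set(scenario) - E, sets, W)


if __name__ == "__main__":
    # Toy instance with made-up numbers, for illustration only.
    sets = {"S1": {1, 2}, "S2": {2, 3}, "S3": {3}}
    w = {"S1": 1.0, "S2": 1.0, "S3": 1.0}   # stage I weights
    W = {"S1": 2.0, "S2": 2.0, "S3": 3.0}   # stage II weights
    x = {"S1": 0.5, "S2": 0.0, "S3": 0.0}   # assumed fractional stage I solution
    E, stage_one = round_stage_one(x, sets, w)
    print(E, stage_one, augment_for_scenario({2, 3}, E, sets, W))
```

Thresholding at ½ is what decouples the stages: every element is at least half covered either by x or by y_A in each scenario containing it, so each subproblem is an ordinary set cover instance that the deterministic algorithm can handle.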

A Solvable Class of Stochastic LPs
Minimize h(x) = w·x + ∑_{A⊆U} p_A f_A(x)   s.t. x ∈ P ⊆ ℝ^m_{≥0}
where f_A(x) = min. w^A·y_A + c^A·r_A
  s.t. D^A r_A + T^A y_A ≥ j^A − T^A x
       y_A ∈ ℝ^m, r_A ∈ ℝ^n, y_A, r_A ≥ 0.
Theorem: Can get a (1+ε)-optimal solution for this class of stochastic programs in polynomial time.
Includes covering problems (e.g., set cover, network design, multicut), facility location problems, multicommodity flow.

Moral of the Story
• Even though the stochastic LP relaxation has exponentially many variables and constraints, we can still obtain near-optimal fractional first-stage decisions.
• Fractional first-stage decisions are sufficient to decouple the two stages near-optimally.
• Many applications: set cover, vertex cover, facility location, multicommodity flow, multicut on trees, …
• But we have to solve the convex program with many samples (not just λ)!

Sample Average Approximation
Sample Average Approximation (SAA) method:
– Sample initially N times from the scenario distribution.
– Solve the 2-stage problem, estimating p_A with the frequency of occurrence of scenario A.
How large should N be to ensure that an optimal solution to the sampled problem is a (1+ε)-optimal solution to the original problem?
Kleywegt, Shapiro & Homem-de-Mello (KSH01):
– bound N by the variance of a certain quantity – need not be polynomially bounded even for our class of programs.
S, Shmoys ’05:
– show using ε-subgradients that for our class, N can be poly-bounded.
Charikar, Chekuri & Pál ’05:
– give another proof that for a class of 2-stage problems, N can be poly-bounded.

Multi-stage Problems
Given: Distribution over inputs.
Stage I: Make some advance decisions – hedge against uncertainty.
Uncertainty evolves in various stages. Learn new information in each stage.
Can take recourse actions in each stage – can augment earlier solution paying a recourse cost.
[Figure: scenario tree for the k-stage problem (k decision points) with branch probabilities: stage I, stage II, …, scenarios in stage k.]
Choose stage I decisions to minimize expected total cost = (stage I cost) + E_{all scenarios}[cost of stages 2 … k].

Multi-stage Problems
Fix k = number of stages.
LP-rounding: S, Shmoys ’05
– Ellipsoid-based algorithm extends
– SAA method also works
  (black-box model, arbitrary costs)
Computing ε-subgradients is significantly harder; need several new ideas.
Rounding procedure of SS04 can be easily adapted: lose an O(k)-factor over the deterministic guarantee
– O(k)-approx. for k-stage vertex cover, facility location, multicut on trees; k·log n-approx. for k-stage set cover
Gupta, Pál, Ravi & Sinha ’05: boosted sampling extends, but with outcome-dependent proportional costs
– 2k-approx. for k-stage Steiner tree (also Hayrapetyan, S & Tardos)
– factors exponential in k for k-stage vertex cover, facility location

Open Questions
• Combinatorial algorithms in the black-box model and with general costs. What about strongly polynomial algorithms?
• Incorporating “risk” into stochastic models.
• Obtaining approximation factors independent of k for k-stage problems. Integrality gap for covering problems does not increase. Munagala has obtained a 2-approx. for k-stage VC.
• Is there a larger class of doubly exponential LPs that one can solve with (more general) techniques?

Thank You.