
Solving Large-Scale Computational Problems Using Insights from Statistical Physics
Bart Selman, Dept. of Computer Science, Cornell University. Joint work with Carla Gomes.

Computational Challenges
Many core computational tasks have been shown to be computationally intractable, i.e., solution times scale exponentially with problem size. Such results exist in, e.g., reasoning (logical and probabilistic), planning and scheduling, machine learning, and hardware and software design.

Exponential Complexity Growth
Planning (single-agent): find the right sequence of actions. HARD: with 10 actions, there are 10! ≈ 3.6 x 10^6 possible plans.
Contingency planning (multi-agent): actions may or may not produce the desired effect! REALLY HARD: 10 x 9^2 x 8^4 x 7^8 x ... x 2^256 ≈ 10^224 possible contingency plans.

Computational Complexity Hierarchy
From hard to easy:
EXP-complete: games like Go, "chess", ...
PSPACE-complete: QBF, adversarial planning, ...
NP-complete: SAT, scheduling, graph coloring, ...
P-complete: circuit-value, ...
In P: sorting, shortest path, compilers, databases, ...
Note: this is the widely believed hierarchy; only P ≠ EXP is known for sure.

There is an abundance of negative complexity results in computer science (literally 10,000+). The results often apply to very restricted formalisms, and also to finding approximate solutions. However, the results are based on worst-case analysis, and there continues to be a debate about their practical relevance, given contradictory experiences with practical algorithms. Question: when and where do computationally hard instances show up?

New Developments
A --- A better understanding of the nature of computationally hard problems.
B --- New solution methods.

Overview
PART A. Computationally hard instances: worst-case vs. average-case; critically constrained problems; phase transitions (here the connection with statistical physics starts).
PART B.
New solution methods: Survey Propagation (derived from the cavity equations); solution clustering; more structured problems; summary.

PART A. Computationally Hard Instances
I'll use the propositional satisfiability problem (SAT) to illustrate ideas and concepts throughout this talk. SAT is the prototypical hard combinatorial search and reasoning problem; the more general concept is constraint satisfaction problems.

Boolean Satisfiability Problem (SAT)
SAT: a set of Boolean variables with domains {0,1} (or {true, false}), with logical constraints between the variables.
k-SAT: all constraints are logical "ORs" with exactly k variables each.
Example 3-SAT formula/instance, with clauses labeled α, β, γ:
F = (¬x ∨ y ∨ z) ∧ (x ∨ ¬y ∨ z) ∧ (x ∨ y ∨ ¬z)
Read as: ((NOT x) OR y OR z) AND (x OR (NOT y) OR z) AND (x OR y OR (NOT z)).
Example satisfying assignment: x = False, y = False, z = False. But the assignment x = False, y = False, z = True does not satisfy the formula (the third clause is violated).
Computational task: given a k-SAT instance (formula), find an assignment that satisfies all constraints, or show that no such assignment exists. With N Boolean variables, we have a search space of size 2^N.
Complexity class: NP-complete (Cook 1971). There is a $1 million prize for providing a polynomial-time algorithm or showing that none exists (P vs. NP, a Clay Millennium Prize problem).

SAT provides a general problem-encoding language. E.g., exam scheduling:
x_1 for "CS2800 exam @ Mon 7pm"
x_2 for "CS2800 exam @ Mon 8pm"
x_3 for "CS2800 exam @ Tue 7pm"
with clauses (x_1 ∨ x_2 ∨ x_3) to require some slot, and (¬x_1 ∨ ¬x_2), (¬x_1 ∨ ¬x_3), (¬x_2 ∨ ¬x_3) to forbid more than one.

5,000+ NP-complete problems have been identified so far (including scheduling and planning problems, hardware and software verification, protein folding, graph coloring, etc.). All NP-complete problems are fully equivalent from a computational perspective; the key concept is polynomial-time reductions. Aside: quantum computers most likely won't help.
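To make the encoding concrete, here is a minimal brute-force satisfiability check for the example formula F above. A sketch: the integer encoding of literals (a positive integer for a variable, a negative one for its negation) is the common DIMACS convention and an assumption here, not notation from the talk.

```python
from itertools import product

# F = (¬x ∨ y ∨ z) ∧ (x ∨ ¬y ∨ z) ∧ (x ∨ y ∨ ¬z), with x, y, z as 1, 2, 3.
F = [(-1, 2, 3), (1, -2, 3), (1, 2, -3)]
N = 3

def satisfies(clauses, assignment):
    """True iff every clause has at least one satisfied literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, n):
    """Search all 2^n assignments; return a model, or None if UNSAT."""
    for values in product([False, True], repeat=n):
        assignment = dict(enumerate(values, start=1))
        if satisfies(clauses, assignment):
            return assignment
    return None

model = brute_force_sat(F, N)
assert satisfies(F, {1: False, 2: False, 3: False})       # the example model
assert not satisfies(F, {1: False, 2: False, 3: True})    # violates clause γ
```

Brute force is fine at N = 3, but the 2^N search space is exactly why the complexity results above matter.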
Exponential Complexity Growth: The Challenge of Complex Domains
[Figure: case complexity (search-space size) vs. number of variables and rules (constraints); rough estimates for propositional reasoning. Car repair diagnosis (~100 vars, ~10^30), deep space mission control (~10K vars, ~10^3010), chess 20 steps deep and military logistics (~20K-450K vars, ~10^6020), VLSI verification (~100K-1M vars, ~10^150,500), multi-agent systems (~1M-5M vars, ~10^301,020). For scale: ~10^47 seconds until the heat death of the sun, the number of atoms on the earth, a petaflop-year of calculation. Credit: Kumar, DARPA; Computer World magazine.]

How well can SAT be solved in practice?

Generating Hard Random Formulas
Generate M clauses uniformly at random over N variables. The critical parameter is the ratio of the number of clauses to the number of variables, M/N. The hardest 3-SAT problems occur at ratio ≈ 4.3.
[Figure: hardness of 3-SAT; DP calls vs. clause-to-variable ratio for 20, 40, and 50 variables; the cost peaks sharply near 4.3.]
Intuition: at low ratios there are few clauses (constraints), so satisfying assignments are easily found; at high ratios there are many clauses, so inconsistencies are easily detected; in between, instances are critically constrained.
[Figure: the 4.3 point; DP calls together with the probability of satisfiability, which falls from 1 to 0 near ratio 4.3, crossing 50% at the hardness peak. 200-variable 3-SAT; Mitchell, Selman, and Levesque 1992.]

Exact Location of the Threshold
A surprisingly challenging problem. Current rigorously proved results: the 3-SAT threshold lies between 3.42 and 4.506 (Motwani et al. 1994; Broder et al. 1992; Frieze and Suen 1996; Dubois 1990, 1997; Kirousis et al. 1995; Friedgut 1997; Beame, Karp, Pitassi, and Saks 1998; Bollobas, Borgs, Chayes, Han Kim, and Wilson 1999, 2001; Achlioptas, Beame, and Molloy 2003; Frieze 2001; Kirousis et al. 2006; Achlioptas et al. '05, '07; and ongoing). Using techniques from statistical physics: 4.26 (disordered systems; replica/cavity method; energy = number of violated constraints; Monasson and Zecchina '97; Biroli, Monasson, and Weigt '00; Zecchina et al.
'05). Empirical: 4.25 (Mitchell, Selman, and Levesque '92; Crawford '93).

Finite-Size Scaling for 3-SAT
[Figure: fraction of formulae unsatisfiable vs. M/N for N = 12 to 100, showing a SAT phase, a "slow down" phase, and an UNSAT phase, with the transition sharpening for high N. The data collapse onto a single curve when rescaled using the critical ratio α_c = 4.17 and scaling exponent ν = 1.5. (Kirkpatrick and Selman, Science, May 1994.)]

Finite-Size Scaling for 4-SAT
[Figure: the same plot for 4-SAT, N = 12 to 65; the data collapse onto a universal scaling form when rescaled using α_c = 9.7 and ν = 1.25.]

Recap
Computationally hard problem instances: the hardest ones are critically constrained; under- and over-constrained ones can be surprisingly easy. Critically constrained instances sit at phase-transition boundaries, and the properties of the transition can be analyzed with tools from statistical physics.

Critically Constrained --- Practical Relevance
Airline fleet scheduling (example from Nemhauser '96): for Delta Airlines aircraft scheduling, the heuristic solution used 395 planes; the optimal solution (five months of computation) used 394.5 planes. Why so close? Economic factors had driven the problem into criticality --- to the edge of infeasibility! Many real-world computational problems live at the phase-transition boundaries.

PART B. Algorithmic Techniques
[Figure: random 3-SAT as of 2010; how far toward the phase transition various linear-time algorithms reach: random walk, DP, DP', GSAT, Walksat, SP. Mitchell, Selman, and Levesque '92.]

Linear-Time Results --- Random 3-SAT
Random walk: up to ratio 1.36 (Alekhnovich and Ben-Sasson '03); empirically up to 2.5.
Davis-Putnam (DP): up to 3.42 (Kaporis et al. '02); empirically up to 3.6; exponential from ratio 4.0 and up (Achlioptas and Beame '02); approx. 400 vars at the phase transition.
GSAT: up to ratio 3.92 (Selman et al. '92; Zecchina et al. '02); approx.
1,000 vars at the phase transition.
Walksat: up to ratio 4.1 (empirical; Selman et al. '93); approx. 100,000 vars at the phase transition.
Survey propagation (SP): up to 4.2 (empirical; Mezard, Parisi, Zecchina '02); approx. 1,000,000 vars near the phase transition.
Unsat phase: little algorithmic progress; exponential resolution lower bound (Chvatal and Szemeredi 1988).

Survey Propagation (SP)
Mezard et al. 2002. A new reasoning / combinatorial search paradigm: it applies a probabilistic reasoning technique to solving combinatorial search problems.
Basic idea: let N be the total number of satisfying assignments, N_x+ the number of satisfying assignments with x set to True, and N_x- the number with x set to False. Define P_x+ = N_x+ / N and P_x- = N_x- / N. I.e., P_x+ is the probability of seeing x assigned True when randomly sampling satisfying assignments.
Now consider the following "decimation" strategy: if P_x+ >= P_x-, set x to True; else set x to False. I.e., set each variable to its most likely value, simplify the instance, and repeat until a satisfying assignment is reached.
Sure, but only a physicist would think of such a strategy! [Almost took this out for today's talk...] After all, computing the probabilities is believed to be much harder (#P-complete) than finding a satisfying assignment (NP-complete). But perhaps one can efficiently compute good approximations of P_x+ and P_x-. The strategy is to iteratively solve a set of recursive equations, in linear time.
The so-called SP equations are rather involved. They are a form of probabilistic reasoning called belief propagation (reaching back to Bethe '35). Intuitively, the idea is to consider the effect of adding a clause (constraint) to a set of clauses.
Example: start with the empty set of clauses over two variables p and q. Then P_p+ = P_p- = 1/2 and P_q+ = P_q- = 1/2.
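On formulas this small, the marginals P_x+ that drive decimation can be computed exactly by enumerating all satisfying assignments. A brute-force sketch (the enumeration is exponential in the number of variables, which is precisely why SP approximates these quantities instead); the integer-literal clause encoding is an illustrative convention, not notation from the talk:

```python
from itertools import product

def models(clauses, n):
    """Yield all satisfying assignments as tuples of booleans (index i = var i+1)."""
    for vals in product([False, True], repeat=n):
        if all(any(vals[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            yield vals

def marginals(clauses, n):
    """P_x+ for each variable: fraction of satisfying assignments with x True."""
    sols = list(models(clauses, n))
    return [sum(s[i] for s in sols) / len(sols) for i in range(n)]

# Empty formula over p, q: every variable has P_+ = 1/2.
print(marginals([], 2))            # [0.5, 0.5]

# Adding (p OR NOT q) pushes P_p+ up to 2/3 and P_q+ down to 1/3.
print(marginals([(1, -2)], 2))     # approx. [2/3, 1/3]
```

Running this on the two-variable example reproduces the probabilities discussed here.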
All four assignments are equally likely solutions. Now add the clause (p OR (NOT q)). What happens to P_p+ and P_q+? The first should go up a bit and the other down a bit: (p OR (NOT q)) is satisfied by (T,F), (T,T), and (F,F), so P_p+ = 2/3, P_p- = 1/3, P_q+ = 1/3, and P_q- = 2/3. Now consider adding ((NOT p) OR q OR r): P_p+ should go down a bit, and P_q+ and P_r+ up a bit. Etc.
Brute-force enumeration quickly becomes infeasible, but the SP equations model the changes in these probabilities directly, capturing the addition of clauses/constraints. Clauses and variables interact, so we have to look for a fixed point of a set of coupled recursive equations.
[Figure: the CNF, its "factor" graph (a graphical model, as in Bayesian networks), and the SP equations.]

SP is surprisingly effective on hard random k-SAT and graph coloring. Instances with 10M variables and 42M clauses can be solved in linear time (around one hour of CPU time; SP sets batches of variables, never backtracks, and finds a satisfying assignment!). Walksat, a biased random walk strategy, is the next best method but would require 100+ hours of CPU time.
A formal understanding of SP has emerged only relatively recently (Zecchina et al. 2004; Wainwright et al. 2006; Kroc et al. 2007, 2008). SP computes marginal probabilities of so-called "covers". Each cover represents a cluster of satisfying assignments; two satisfying assignments are in the same cluster if you can "flip variables" to go from one assignment to the other while visiting only satisfying assignments. Covers are even harder to find than satisfying assignments, but SP is remarkably accurate and fast at computing the marginal probabilities of variable settings in covers, much faster even than finding a single cover! (Kroc, Sabharwal, and Selman 2007.)
[Figure: hard random 3-SAT, 5,000 vars and 21,000 clauses; SP marginal probabilities plotted against true cover marginal probabilities lie close to the diagonal. SP computes its marginals in seconds; direct computation of the cover marginals takes 100+ hours.]

Solution Clusters
Three combinatorial notions of clusters, each with a statistical-physics counterpart:
1. High-density regions --- BP --- the original SP derivation from statistical mechanics (Mezard et al. '02; Mezard et al. '09).
2. Enclosing hypercubes --- BP for "covers" --- the first rigorous derivation of SP for SAT (Braunstein et al. '04; Maneva et al. '05; Kroc et al. '07).
3. Filling hypercubes --- BP for Z(-1) --- a more direct (variational) approach to clusters (Kroc et al. '08; Kroc et al. '09).

Representing Solution Clusters
Clusters are subsets of solutions, possibly exponential in size, and impractical to work with in explicit form. To represent clusters compactly, we trade off expressive power for a shorter representation: we lose some details about each cluster but can work with it. We approximate clusters by hypercubes, "from outside" and "from inside". Hypercubes are written as strings over {0, 1, *}; e.g., y = (*, *, 1) is a 2-dimensional hypercube in 3-dimensional space, containing the points 001, 011, 101, and 111.
From outside: the (unique) minimal hypercube enclosing the whole cluster. From inside: a (non-unique) maximal hypercube fitting inside the cluster.

Factor Graph for Clusters
To reason about clusters, we seek a factor graph representation, since we can do approximate inference on factor graphs. We need to count clusters with an expression similar to the partition function Z for solutions (whose factor equals 1 iff x is a solution). We derived an approximating expression, Z(-1), for the number of clusters: an inclusion/exclusion-style expression over hypercubes y, built from a factor that checks whether all points in y are good (solutions) and a count #(y) of the number of elements of y. It counts clusters exactly under certain conditions. (Kroc, Sabharwal, Selman '08.)

Formal Results for Z(-1)
On what kind of solution spaces does Z(-1) count clusters exactly?
Theorem: Z(-1) is exact for any 2-SAT problem.
Theorem: Z(-1) is exact for a 3-COL problem on a graph G if every connected component of G has at least one triangle.
Theorem: Z(-1) is exact if the solution space decomposes into "recursively-monotone subspaces".
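These exactness results can be checked by brute force on tiny instances. The sketch below counts clusters directly, not via Z(-1): it enumerates all solutions and counts connected components under single-variable flips, the cluster notion used above (clauses again use the illustrative integer-literal convention):

```python
from itertools import product

def solutions(clauses, n):
    """All satisfying assignments as 0/1 tuples (brute force, 2^n)."""
    return [
        v for v in product([0, 1], repeat=n)
        if all(any((v[abs(l) - 1] == 1) == (l > 0) for l in c) for c in clauses)
    ]

def count_clusters(clauses, n):
    """Clusters = connected components of the solution graph, where two
    solutions are adjacent iff they differ in exactly one variable."""
    sols = set(solutions(clauses, n))
    seen, clusters = set(), 0
    for s in sols:
        if s in seen:
            continue
        clusters += 1
        stack = [s]                      # flood-fill one component
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            for i in range(n):           # try every single-variable flip
                nb = cur[:i] + (1 - cur[i],) + cur[i + 1:]
                if nb in sols and nb not in seen:
                    stack.append(nb)
    return clusters

# 2-SAT instance (x1 or x2) and (not x1 or not x2): solutions 01 and 10
# differ in two variables, so they form two separate clusters.
print(count_clusters([(1, 2), (-1, -2)], 2))   # 2
```

For this 2-SAT instance, the first theorem says Z(-1) would return the same count of 2.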
Empirical Results: Z(-1) for SAT
[Figure: random 3-SAT at ratio 4.0; n = 90 with one point per instance, and a single n = 200 instance with one point per variable; a remarkable fit for Z(-1).]
Z(-1) is accurate even for many structured formulas encoding real-world problems (Kroc, Sabharwal, Selman '09).

SP = BP for Z(-1)
We need to evaluate Z(-1) efficiently to count clusters. The expression has a form very similar to the standard partition function of the original problem, which we can approximate with BP. So Z(-1) is approximated with BP-style equations (via a variational-method derivation), giving the iterative BP(-1) equations, which contain the standard BP equations as their core. For SAT, BP(-1) is equivalent to SP; for COL, BP(-1) is different from SP (and possibly better).

SP --- Final Observation
The use of probabilistic techniques for solving SAT problems provides an intriguing alternative to the two existing main search paradigms: (1) complete backtrack search, and (2) local search.
[Figure: random 3-SAT; linear-time algorithms (random walk, DP, DP', GSAT, Walksat, SP) shown against successively tighter upper bounds on the threshold obtained by combinatorial arguments: 5.19, 5.081, 4.762, 4.643, 4.601, 4.596, 4.506.]

Physics Contributing to Computation
'80s --- Simulated annealing: a general combinatorial search technique inspired by physics (Kirkpatrick et al., Science '83).
'90s --- Phase transitions in computational systems: discovery of "physical phenomena" (e.g., 1st- and 2nd-order transitions) in computational systems (Cheeseman et al. '91; Selman et al. '92); explicit connection to physics: Kirkpatrick and Selman, Science '94 (finite-size scaling); Monasson et al., Nature '99 (order of the phase transition).
'02 --- Survey propagation: an analytical tool from statistical physics leads to a powerful algorithmic method (Mezard et al., Science '02). More expected!

Capturing Problem Structure
Results and algorithms for hard random k-SAT problems have had a significant impact on the development of practical SAT solvers.
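The local-search paradigm in particular is compact enough to sketch. Below is a minimal Walksat-style solver run on a uniformly random 3-SAT formula, in the spirit of the biased random walk described earlier; the noise parameter p = 0.5, the flip cutoff, and the instance size are illustrative assumptions, not the tuned settings of production solvers:

```python
import random

def random_3sat(n, ratio, rng):
    """M = ratio * n clauses, each over 3 distinct variables with random signs."""
    clauses = []
    for _ in range(int(ratio * n)):
        trio = rng.sample(range(1, n + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in trio))
    return clauses

def walksat(clauses, n, rng, p=0.5, max_flips=100_000):
    """Biased random walk: pick an unsatisfied clause; with probability p flip
    a random variable in it, otherwise flip the variable in it that leaves the
    fewest clauses unsatisfied. Returns a model, or None on cutoff."""
    assign = {v: rng.random() < 0.5 for v in range(1, n + 1)}
    sat = lambda c: any(assign[abs(l)] == (l > 0) for l in c)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not sat(c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)
        if rng.random() < p:
            var = abs(rng.choice(clause))
        else:
            def cost(v):                 # unsat count after flipping v
                assign[v] = not assign[v]
                c = sum(1 for cl in clauses if not sat(cl))
                assign[v] = not assign[v]
                return c
            var = min((abs(l) for l in clause), key=cost)
        assign[var] = not assign[var]
    return None

rng = random.Random(0)
F = random_3sat(50, 3.0, rng)        # ratio 3.0: well below the 4.3 threshold
model = walksat(F, 50, rng)
assert model is not None and all(
    any(model[abs(l)] == (l > 0) for l in c) for c in F)
```

At ratio 3.0 this typically succeeds quickly; pushing the ratio toward 4.2 and beyond is where run times blow up and methods like SP take over.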
However, the next challenge is dealing with SAT problems with more inherent structure.

I) Mixtures: The 2+p-SAT Problem
Motivation: most real-world computational problems involve some mix of tractable and intractable sub-problems. We study a mixture of binary (2-SAT) and ternary (3-SAT) clauses, with p the fraction of ternary clauses: p = 0.0 gives pure 2-SAT, and p = 1.0 gives pure 3-SAT. Note: 2-SAT can be solved in linear time, while 3-SAT is NP-complete. What happens in between?

Phase Transition for 2+p-SAT
We have good approximations for the location of the thresholds (Monasson, Zecchina, Kirkpatrick, Selman, Troyansky, Nature 1999).

Computational Cost: 2+p-SAT
Tractable substructure can dominate! Mixing 2-SAT (tractable) and 3-SAT (intractable) clauses: with more than 40% 3-SAT clauses, cost scales exponentially in the number of variables; with at most 40% 3-SAT clauses, it scales linearly (Monasson et al. '99; Achlioptas '02).

Results for 2+p-SAT
p <= 0.4 --- the model behaves as 2-SAT: the search algorithm "sees" only the binary constraints, and there is a smooth, continuous phase transition (2nd order).
p > 0.4 --- the model behaves as 3-SAT (exponential scaling), with an abrupt, discontinuous transition (1st order).
Note: the problem is NP-complete for any p > 0.

Observation
In a worst-case intractable problem --- such as 2+p-SAT --- having a sufficient amount of tractable problem substructure (possibly hidden) can lead to provably poly-time --- in fact linear --- average-case behavior. Conjecture: our world may be "friendly enough" to make many typical computational tasks poly-time, challenging the value of the conventional worst-case complexity view in CS.

II) Backdoors to the Real World
Observation: complete backtrack-search SAT solvers (e.g., DPLL) display a remarkably wide range of run times, even when repeatedly solving the same problem instance with the variable-branching choice randomized. Run-time distributions are often "heavy-tailed", with orders of magnitude difference in run time between runs (Gomes et al.
'00; '04).

Heavy Tails on Structured Problems
[Figure: unsolved fraction vs. number of backtracks (log scale); 50% of runs are solved with 1 backtrack, while 10% of runs need more than 100,000 backtracks.]

Eliminating Heavy Tails: Randomized Restarts
Solution: randomize the backtrack strategy. Add noise to the heuristic branching (variable-choice) function, and cut off and restart the search after a fixed number of backtracks. This eliminates heavy-tailed behavior. In practice, rapid restarts with a low cutoff can dramatically improve performance; this is exploited in many current SAT solvers, combined with clause learning and non-chronological backtracking (Chaff, etc.).

Sample Results
Problem              | Deterministic | Randomized restarts
Logistics planning   | 108 min       | 95 sec
Scheduling 14        | 411 sec       | 250 sec
Scheduling 16        | --- (*)       | 1.4 hours
Scheduling 18        | --- (*)       | ~18 hrs
Circuit synthesis 1  | --- (*)       | 165 sec
Circuit synthesis 2  | --- (*)       | 17 min
(*) not found after 2 days

Formal Model Yielding Heavy-Tailed Behavior
Let T be the number of leaf nodes visited up to and including the successful node, b the branching factor (here b = 2), and p the probability of a wrong branching choice, with 2^k time needed to recover from k wrong choices. Then
P[T = b^i] = (1 - p) p^i, for i >= 0,
a heavy-tailed distribution. Intuitively, exponential penalties are hidden in backtrack search, in the form of large inconsistent subtrees in the search space. But for restarts to be effective, you also need short runs. Where do short runs come from?

Explaining Short Runs: Backdoors to Tractability
Informally, a backdoor to a given problem is a subset of the variables such that, once they are assigned values, the polynomial propagation mechanism of the SAT solver solves the remaining formula. The formal definition includes the notion of a "subsolver": a polynomial simplification procedure with certain general characteristics found in current DPLL SAT solvers. Backdoors correspond to "clever reasoning shortcuts" in the search space (Gomes et al. '04, '08). Backdoors can be surprisingly small. Most recently, in other combinatorial domains: e.g.,
Graphplan-based planning shows near-constant-size backdoors (2 or 3 variables), and log(n)-size backdoors appear in certain domains. Backdoors capture critical problem resources (bottlenecks).

Backdoors --- "Seeing Is Believing"
Consider the constraint graph of a reasoning problem: one node per variable, with an edge between two variables if they share a constraint. The planning formula logistics_b.cnf has 843 vars, 7,301 clauses, and an approximate minimum backdoor of size 16 (a backdoor set = a reasoning shortcut). Visualization by Anand Kapur: after setting 5 backdoor vars, the constraint graph shrinks markedly; after setting just 12 (out of 800+) backdoor vars, the problem is almost solved.
Another example: MAP-6-7.cnf, an infeasible planning instance with 392 vars and 2,578 clauses, has a strong backdoor of size 3; after setting 2 (out of 392) backdoor vars, problem complexity is reduced in just a few steps!
Last example: the inductive inference problem ii16a1.cnf, with 1,650 vars and 19,368 clauses, has a backdoor of size 40; the graph simplifies visibly after setting 6 backdoor vars, and collapses after setting 38 (out of 1,600+).
So: real-world structure is hidden in the constraint network and can be exploited by automated reasoning engines. But we also need to take into account the cost of finding the backdoor! We considered generalized iterative deepening, randomized generalized iterative deepening, and variable- and value-selection heuristics.
[Figure: search cost as a function of backdoor size (n = num. vars, k a constant), compared with current solvers; dynamic view: a running SAT solver with no backdoor detection vs. a SAT solver that detects the backdoor set.]

Summary
We considered the complexity of the Boolean satisfiability (SAT) problem, the prototypical NP-complete problem. The hardest instances occur at phase-transition boundaries, where instances go from satisfiable to unsatisfiable. Tools from statistical physics (disordered systems) provide new insights into these computational phase transitions (e.g., 1st- vs. 2nd-order transitions). The work led to the Survey Propagation method, the fastest current algorithm (1M+ vars).
SP gives a computational interpretation of the cavity method. We also gained insights into highly structured problems: backdoor variable sets.
"Self-referential" thought of the day: the design of this laptop and of the presentation software was done using SAT solvers employing techniques discussed in this talk!
The end
