Simple Temporal Problems with Uncertainty (Vidal, Fargier '99)

References:
• Thierry Vidal, Hélène Fargier: Handling contingency in temporal constraint networks: from consistency to controllabilities. J. Exp. Theor. Artif. Intell. 11(1): 23-45 (1999)
• Paul H. Morris, Nicola Muscettola: Managing Temporal Uncertainty Through Waypoint Controllability. IJCAI 1999: 1253-1258
• Paul H. Morris, Nicola Muscettola: Temporal Dynamic Controllability Revisited. AAAI 2005: 1193-1198
• Paul Morris: A Structural Characterization of Temporal Dynamic Controllability. CP 2006: 375-389

Simple Temporal Problems with Uncertainty
Informally, an STPU is an STP in which some of the variables are not under the control of the agent, i.e. the agent cannot decide which value to assign to them. An STPU has:
• a set of executable timepoints (controllable assignment);
• a set of contingent timepoints (uncontrollable assignment);
• a set of requirement constraints Tij: binary, with temporal interval I=[a,b] meaning a ≤ Xj − Xi ≤ b;
• a set of contingent constraints Thk: binary, between an executable Xh and a contingent timepoint Xk, with temporal interval I=[c,d] meaning 0 ≤ c ≤ Xk − Xh ≤ d.

Example: satellite maneuvering
[Figure: a network with executable timepoints Start-aiming and End-aiming, contingent timepoints Start-clouds and End-clouds, requirement constraints between them, and a contingent constraint on the cloud duration.]

STPU definitions
Given an STPU P:
• A control sequence d is an assignment to the executable timepoints.
• A situation w is a set of durations on the contingent constraints (one element of each contingent interval).
• A schedule is a complete assignment to the variables of P. A schedule is viable if it is consistent with all the constraints. Sol(P) is the set of all viable schedules of P.
• The projection Pw corresponding to situation w is the STP obtained by replacing each contingent constraint with its duration in w. Proj(P) is the set of all projections of P.
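As a concrete illustration of these definitions, an STPU and its projections can be sketched in Python (a minimal sketch; the class and function names are my own, not from any library):

```python
# Sketch of an STPU and its projections, following the definitions above.
from dataclasses import dataclass, field

@dataclass
class Stpu:
    executables: list          # executable timepoints (agent-controlled)
    contingents: list          # contingent timepoints (nature-controlled)
    requirements: dict = field(default_factory=dict)      # (Xi, Xj) -> (a, b): a <= Xj - Xi <= b
    contingent_links: dict = field(default_factory=dict)  # (Xh, Xk) -> (c, d): 0 <= c <= Xk - Xh <= d

def project(stpu, situation):
    """Projection P_w: replace each contingent constraint with its duration in w.
    `situation` maps each contingent link (Xh, Xk) to one duration in [c, d]."""
    stp = dict(stpu.requirements)   # the projection is an ordinary STP
    for link, (c, d) in stpu.contingent_links.items():
        w = situation[link]
        assert c <= w <= d, "a situation picks one element of each contingent interval"
        stp[link] = (w, w)
    return stp

# Example: one contingent link (A, B) in [1, 5], one requirement (A, C) in [0, 10]
p = Stpu(["A", "C"], ["B"],
         requirements={("A", "C"): (0, 10)},
         contingent_links={("A", "B"): (1, 5)})
proj = project(p, {("A", "B"): 3})
```

A situation fixes exactly one duration per contingent link, and the resulting projection is an ordinary STP over all the timepoints.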
A viable strategy S: Proj(P) → Sol(P) maps every projection Pw to a schedule that includes w.

Example: STPU projections
[Figure: an STPU with a contingent constraint and the projections obtained by fixing the contingent duration to each value of its interval.]

Notation
• [S(Pw)]x : the time assigned to executable variable x by schedule S(Pw).
• [S(Pw)]<x : the history of x in S(Pw), i.e. the set of durations of the contingent events which have occurred before [S(Pw)]x.

Controllability
• Strong controllability: there is a plan that will work whatever happens.
• Dynamic controllability: I can build a plan while things happen, in the future, that will be successful.
• Weak controllability: for every possible scenario there is a plan.

Strong Controllability (SC) of STPUs
An STPU is strongly controllable if there is a single assignment to all the executable timepoints that is consistent with all the possible scenarios (= complete assignments to the contingent timepoints) (Vidal, Fargier, 1999).

Example
[Figure: the satellite network, with the requirement that aiming should start no more than 10 s before End-clouds.]

Strong Controllability: formal definition
An STPU P is Strongly Controllable (SC) iff there is an execution strategy S s.t.
1. ∀Pw ∈ Proj(P), S(Pw) is a solution of Pw, in other words S is viable, and
2. [S(P1)]x = [S(P2)]x, ∀P1, P2 ∈ Proj(P): every executable is assigned the same time in all projections.
This is a strong requirement, useful when the situation is not observable at all, or when the solution must be known beforehand.

Testing Strong Controllability
Main idea:
• characterize the constraints induced on the executable variables by the strong consistency requirement (new constraints on executable variables);
• remove all contingent variables and all constraints involving them: we get an STP;
• the STP is consistent iff the STPU is strongly controllable.

Induced constraints (1)
A, B executable, Zi contingent. From the contingent constraint Zi − A ∈ [x,y] and the requirement constraint Zi − B ∈ [u,v], induce the new constraint B − A ∈ [y−v, x−u] (it must hold for every possible occurrence of Zi).

Induced constraints (2)
A, B executable, Ci, Cj contingent. From the contingent constraints Ci − A ∈ [li,ui] and Cj − B ∈ [lj,uj] and the requirement constraint Cj − Ci ∈ [a,b], induce the new constraint B − A ∈ [ui−lj+a, li−uj+b].

SC Algorithm
Input: STPU Q=(Xe,Xc,Cr,Cc). Output: minimal STP, or "not strongly controllable".
1. P ← (Xe, ∅)
2. P ← all constraints of Q defined only on executable variables
3. for every distinct pair of contingent constraints Ci and Cj in Cc
4.   P ← P ∪ all constraints induced from Ci with induction 1
5.   P ← P ∪ all constraints induced from Cj with induction 1
6.   P ← P ∪ all constraints induced from Ci and Cj by induction 2
7. P' ← Solve(P)
8. if P is consistent then return P', else return "not strongly controllable"

Example
[Figure: an STPU with contingent constraints C1, C2, C3 and executables A, B, X, Y; the induced STP on the executables; and the resulting minimal STP.]

SC Algorithm is sound and complete
An STPU P is strongly controllable iff SC-Algorithm(P) does not terminate with "not strongly controllable".
Proof. SC-Algorithm(P) terminates with "not SC" iff the derived STP is not consistent. This happens iff there is an inconsistency caused by:
• constraints of Q defined only on executables: in this case Q is intrinsically inconsistent, and thus not strongly controllable; or
• constraints of Q on executables together with some induced constraints: in this case it is never possible to satisfy the constraints of Q on the executables and, at the same time, control all possible durations of some contingent events: not SC; or
• induced constraints only: in this case there are two or more contingent events which are controlled by disjoint sets of control sequences involving common variables: not SC.

Complexity of the SC Algorithm
Building the set of induced constraints can be done in time linear in the cardinality of the set of constraints, O(|Cr|+|Cc|), thus O(|Xe ∪ Xc|²). Solving the STP is O(|Xe|³). Usually |Xc| << |Xe| and the second term dominates.

Weak Controllability: formal definition
An STPU P is Weakly Controllable (WC) iff there is an execution strategy S s.t. ∀Pw ∈ Proj(P), S(Pw) is a solution of Pw; in other words, iff there is a viable strategy. An STPU is weakly controllable if every situation has at least one consistent schedule; however, there may be different schedules for different situations. This is a weak requirement, relevant in applications where the situation may become available just before execution starts.

Pseudo-controllability
Consider the STPU as an STP (forgetting the distinction between contingent and executable events). The STPU is pseudo-controllable iff, in the minimal network of the associated STP, no interval of a contingent constraint is tightened.

Pseudo- and Weak Controllability
Weak controllability implies pseudo-controllability.
Proof. We show that if the STPU is not pseudo-controllable then it is not weakly controllable. If it is not pseudo-controllable, there must be at least one interval of a contingent constraint which has been tightened. This means that some possible duration of the contingent event has been ruled out as not belonging to any solution of the STP. Hence any projection associated with a scenario including that duration for the contingent constraint does not have a schedule: not weakly controllable.
The converse does not hold: weak controllability requires that ALL possible combinations of all durations of all contingent constraints have a schedule; pseudo-controllability just guarantees that each possible duration appears in at least one projection that has a schedule. We can thus use pseudo-controllability as a preprocessing step for weak controllability.

Weak Controllability on bounds
An STPU is weakly controllable on bounds if ∀w ∈ {l1,u1}×{l2,u2}×…×{lk,uk}, where k=|Cc| and li, ui are the lower and upper bounds of the contingent constraint Ci, there exists a strategy S such that S(Pw) is a solution of Pw.

Theorem. Consider the two situations w = {w1,w2,…,lk,…,wg} and w' = {w1,w2,…,uk,…,wg}, and assume that Pw and Pw' are consistent. Then for any other situation of the form wk = {w1,w2,…,vk,…,wg}, with vk ∈ [lk,uk], Pwk is consistent.
Proof. For the sake of contradiction, assume Pw and Pw' are consistent and Pwk is not. The partial situation w−vk = {w1,w2,…,wk−1,wk+1,…,wg} induces an interval [x,y] on the contingent constraint Ck. Pw consistent ⇒ x ≤ lk ≤ y; Pw' consistent ⇒ x ≤ uk ≤ y. Thus x ≤ lk ≤ vk ≤ uk ≤ y, which means that Pwk is consistent: contradiction.

Corollary. An STPU is weakly controllable (wc) iff it is weakly controllable on bounds (wcb).
Proof. wc ⇒ wcb: obvious. wcb ⇒ wc: by the previous theorem, applied to each contingent constraint in turn. Thus, to test weak controllability, it is sufficient to test weak controllability on bounds.

WC-Algorithm
Input: STPU Q=(Xe,Xc,Cr,Cc). Output: "wc" / "not wc".
1. if pseudo-c(Q) = false return "not wc"
2. if sc(Q) = true return "wc"
3. else if Cc = ∅ then return "not wc"
4. else choose and remove Ci from Cc
5.   replace Ci with [li,li]; if WC(Q) = "not wc" return "not wc"
6.   replace Ci with [ui,ui]; return WC(Q)

Example
[Figure: an STPU on A, B, X, Y that is pseudo-controllable, but neither strongly controllable nor weakly controllable.]

Weak Controllability is NP-hard
Reduction from the 3-coloring problem to weak controllability. In more detail: coloring problem → STPU distance graph; solution of the coloring problem → negative cycle in some projection.

Dynamic Controllability: formal definition
An STPU is dynamically controllable (DC) iff there is a strategy S such that:
1. ∀Pw ∈ Proj(P), S(Pw) is a solution of Pw, in other words S is viable, and
2. if [S(P1)]<x = [S(P2)]<x then [S(P1)]x = [S(P2)]x, ∀P1, P2 ∈ Proj(P).
In words: it is possible to assign values to the executable variables online, based only on the values assigned to past executables, and without regrets for the future.

Dynamic Controllability: example
[Figure: two networks over s_man1, e_man1, s_com1, e_com1, s_man2 with interval constraints; one is dynamically controllable, the other is not.]

Reductions
We induce DC constraints on executables from contingent constraints and DC requirements, as for SC, but now the induced constraints can be ternary. Consider a triangle with a contingent constraint AB with interval [x,y] and a requirement constraint B − C ∈ [u,v] on an executable C.

Precede reduction (only when u ≥ 0)
u ≥ 0 means B − C ≥ 0, i.e. B ≥ C: the executable C must always precede the contingent event B. Thus any time at which C is executed must be consistent with all possible occurrences of B. New constraint on AC: [y−v, x−u], the same constraint as that induced by SC.

Example: Precede reduction
u ≥ 0, since 1 ≥ 0. C must be executed exactly 1 unit of time before B occurs, without knowing when B will occur: impossible!
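The case analysis behind these triangular reductions (precede, and the follow and unordered cases developed next) can be sketched as a small Python function (the function name and return encoding are my own, purely illustrative):

```python
def reduce_triangle(x, y, u, v):
    """Triangular DC reduction for a contingent link A ==> B with interval [x, y]
    and a requirement constraint B - C in [u, v] on an executable C.
    Returns the constraint induced on AC, following the case analysis above."""
    if u >= 0:          # precede: C always before B -> ordinary interval on AC
        return ("precede", (y - v, x - u))
    if v < 0:           # follow: C always after B -> no a priori restriction on AC
        return ("follow", None)
    # unordered: impose a wait <B, y - v> on AC
    return ("unordered", ("wait", "B", y - v))

# Precede example above: contingent AB = [1, 2], requirement B - C in [1, 1]
case, bound = reduce_triangle(1, 2, 1, 1)
# bound == (1, 0): the lower bound exceeds the upper bound, so the induced
# AC interval is empty and the network is not DC
```

With u = −1, v = 1 and contingent interval [1, 3], the same function yields the unordered case with wait (B, 2), matching the unordered example below.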
In fact, the new constraint on AC is [y−v, x−u] = [1,0], which is never satisfiable: the network is not DC.

Follow reduction (only when v < 0)
v < 0 means B − C < 0, i.e. B < C: the executable C must always follow the contingent event B. No further restriction is needed. Assuming the network is pseudo-controllable, for every occurrence of B there is a consistent value of C, executing after B. When B occurs, its time is propagated to AC; there is no a priori restriction on AC.

Example: Follow reduction
v < 0, since −1 < 0. C must be executed exactly 1 unit of time after B occurs. Assuming TA = 0: if B occurs at 1, then we execute C at 2; if B occurs at 2, then we execute C at 3. Notice that the AC constraint was already [2,3] due to pseudo-controllability.

Unordered reduction (only when u < 0 and v ≥ 0)
C may come before or after B. Impose a wait (B,y'), with y' = y−v, on AC. The wait (B,y') on AC means: either B occurs, and then C can be immediately executed, or C can be safely executed after TA+y', regardless of whether B has executed or not. The wait is a ternary constraint on A, C and B.

Example: Unordered reduction
u < 0 and v ≥ 0, since u = −1 and v = 1. Impose a wait (B,y') with y' = 3−1 = 2 on AC. Assuming TA = 0: either B occurs at 1, and then C can be immediately executed at 1, or C can be safely executed after TA+2 = 2, regardless of whether B has executed or not. If we do not respect the wait and execute C at 1, and B then occurs at 3, the constraint CB is violated.

Regression
A regression is a propagation of a wait from one constraint to another. A wait is regressed from requirement constraint to requirement constraint; however, the regression can be caused by another requirement constraint or by a contingent constraint.

Simple regression
Given that
• C has to wait either for B to occur or for y time units to pass after A (wait (B,y) on AC), and
• D has to occur at least −v time units after C (the requirement constraint CD has upper bound v, i.e. D ≥ C − v),
it follows that D must wait either for B to occur or for y−v time units to pass after A: new wait (B, y−v) on AD.

Example: Simple regression
No contingent constraints are involved. Given that C has to wait either for B to occur or for 2 time units to pass after A (wait (B,2) on AC), and that D has to occur at least −1 time units after C (CD upper bound 1), it follows that D must wait either for B to occur or for 1 time unit to pass after A: new wait (B,1) on AD.

Contingent regression (only if y ≥ 0 and B ≠ C)
Given that
• C has to wait either for B to occur or for y time units to pass after A (wait (B,y) on AC), and
• CD is a contingent constraint with interval [u,v],
it follows that D must wait either for B to occur or for y−u time units to pass after A: new wait (B, y−u) on AD. Simple regression can also be applied, but it gives a shorter wait which is subsumed by that of the contingent regression (since u ≤ v).

Example: Contingent regression
y ≥ 0, since 6 ≥ 0. Given that C has to wait either for B to occur or for 6 time units to pass after A, and that the contingent constraint CD has interval [1,3], it follows that D must wait either for B to occur or for 5 time units to pass after A: new wait (B,5) on AD.

Reductions
A reduction is a tightening of a constraint interval due to the presence of a wait. Consider again the triangle with contingent constraint AB = [x,y] and a wait (B, y−v) on AC.

Unconditional reduction (only when y−v ≤ x)
C must either wait for B, or wait until y−v after A. If y−v ≤ x, then the wait always expires before B can execute (B occurs at x after A at the earliest), so C can be executed as soon as the wait expires: this induces a new lower bound of y−v on AC.

Example: Unconditional reduction
Contingent AB = [3,6] and wait (B,2) on AC: since 2 < 3, the wait always expires before B executes: new lower bound 2 on AC.

General reduction (only when y−v > x and the current lower bound p of AC satisfies p < x)
C must either wait for B or until y−v after A. B can occur at x (after A) at the earliest, thus C can occur at x at the earliest: this induces a new lower bound of x on AC, and the wait (B, y−v) is kept.

Example: General reduction
Contingent AB = [3,6] and wait (B,4) on AC: since 4 > 3, the wait cannot be discharged unconditionally; B occurs at 3 at the earliest, so AC gets the new lower bound 3 and the wait (B,4) is kept.

Regressions: example
[Figure: a network with waits (P,2), (Q,2), (R,2), where the upper bounds of AP, DQ and BR exceed 2. A chain of contingent and simple regressions, interleaved with unconditional reductions, eventually derives a wait on AA with a positive lower bound: an inconsistency, so the network is not DC.]

Classical DC algorithm
Input: STPU P. Output: "dc" or "not dc".
1. if pseudo-controllable(P) = false return "not dc"
2. select any triangle ABC with ub(AC) ≥ 0
3. apply precede and unordered reductions
4. do all possible wait regressions
5. apply all unconditional and general reductions
6. if no changes return "dc"
7. else go to 1

Complexity of Classical-DC
The complexity is pseudo-polynomial: the algorithm is polynomial only under the assumption that the maximum link size is bounded. The new, truly polynomial algorithm replaces:
• reductions and regressions on the constraint graph → reductions on a labeled distance graph;
• pseudo-controllability → consistency of the AllMax projection;
• termination depending on domain size → a cut-off bound, as in Bellman-Ford.

Labeled distance graph (1)
• For a requirement constraint AC with interval [x,y]: just as in the ordinary distance graph, edge AC labeled y and edge CA labeled −x.
• For a contingent constraint AB with interval [x,y], in addition to the ordinary edges:
  • a lower-case labeled edge AB labeled b:x;
  • an upper-case labeled edge BA labeled B:−y.
B and b stand for the name of the contingent variable involved. The lower-case and upper-case edges have the value the edge would have in the case of, respectively, the earliest and the latest occurrence of B.

Labeled distance graph (2)
A wait constraint (B,t) on AC becomes an edge CA with label B:−t, like an upper-case edge. In fact, if B is executed at the latest, then t is a lower bound for C on the AC constraint.

Reductions in the labeled distance graph
From the reductions and regressions defined on the constraint graph we obtain a set of equivalent reductions on the labeled distance graph. In particular, the following set of reductions is equivalent to the set containing the precede (and follow) reduction, the unordered reduction, the simple regression and the contingent regression.

Upper-case reduction (only if y ≥ 0)
An upper-case edge with label B:x is composed with an adjacent ordinary edge of length y into a new upper-case edge with label B:x+y. It is the translation into the labeled graph of: the simple regression, if B:x comes from a wait constraint; the unordered reduction with [−∞,y] as bound, if B:x comes from a contingent constraint.

Lower-case reduction (only if x ≤ 0)
An ordinary edge of length x is composed with an adjacent lower-case edge c:y into a new ordinary edge of length x+y.

Cross-case reduction (only if x ≤ 0 and C ≠ B)
An upper-case edge B:x is composed with an adjacent lower-case edge c:y into a new upper-case edge B:x+y. If B:x comes from a wait constraint, this is the translation into the labeled graph of the contingent regression. B:x cannot come from a contingent constraint, since contingent constraints do not share finishing points.

No-case reduction (no restrictions)
The usual composition of ordinary edges: x and y compose into x+y.

Label-removal reduction
If an upper-case edge with label B:z satisfies z ≥ −x, where b:x is the lower-case edge of B's contingent constraint, the label can be removed, leaving an ordinary edge of length z. This reduction is equivalent to the general reduction.

Properties
• Upper-case labels move: reductions propagate them from one edge to another.
• Lower-case labels are fixed: no new ones are produced by reductions.

Old and new reductions
• Unordered reduction → upper-case reduction
• Simple regression → upper-case reduction
• Contingent regression → cross-case reduction
• Unconditional reduction → label removal
• Precede reduction → upper-case + lower-case + label-removal reductions
• Follow reduction → no-case reduction
• General reduction → unnecessary: let us see why

AllMax projection
The AllMax projection is the projection where all contingent constraints are replaced with their maximum value. The distance graph of the AllMax projection is obtained from the labeled distance graph by deleting all lower-case edges and removing the labels from all upper-case edges.

AllMax instead of pseudo-controllability
The pseudo-controllability test can be replaced by the consistency test of the AllMax projection alone. In fact, pseudo-controllability requires to (1) detect negative cycles and (2) compute the minimal requirement constraints.
1. Negative cycles.
• Assume an upper bound of a contingent constraint is squeezed: BA = B:−y and AB = z with z < y. Applying the upper-case reduction gives AA = B:(−y+z) < 0. This inconsistency will be detected by the next run of AllMax.
• Assume a lower bound of a contingent constraint is squeezed: AB = b:x and BA = z with x < −z. Applying the lower-case reduction gives AA = x+z < 0. This inconsistency will be detected by the next run of AllMax.
2. Minimal requirement constraints: no-case reduction.

Cutoff (1)
• Repetition: the same triple of nodes, in the same configuration, is considered at least twice. A repetition takes place only if one of the edges of the triple has been tightened.
• Parent of the edge resulting from a repetition: the tightened edge that caused the repetition (if both edges have been tightened, pick one arbitrarily).
During execution, the labeled graph has at most n²+nk+k edges, where n is the total number of nodes and k the number of contingent nodes (k is also the number of contingent constraints):
• n²: number of ordinary edges;
• nk: number of upper-case edges (every upper-case edge must point to the source of its contingent constraint);
• k: number of lower-case edges, one for each contingent constraint, fixed during execution.

Cutoff (2)
• The labeled distance graph has at most n²+nk+k edges, so a parent chain longer than n²+nk+k contains a repeated edge.
• Linearity property: a tightening passes from parent to child by the same quantity. If a reduction is applicable, it remains applicable when its parent is tightened; hence a repetition in a parent chain leads to continued repetition of reductions, which will exhaust a domain: the network is not DC.
• Edges added at iteration i must have at least i parents in their chain.
Therefore, similarly to Bellman-Ford, we can stop the execution of DC when the number of iterations exceeds n²+nk+k.

Basic Cut-Off DC Algorithm
Input: labeled distance graph G of STPU P. Output: "dc" or "not dc".
1. for i = 1 to n²+nk+k
2.   if not(AllMax(G)) return false
3.   do any no-case tightenings
4.   for each upper-case edge e do
5.     for each node B do
6.       UpperCase-reductions(e,B)
7.       CrossCase-reductions(e,B)
8.   for each lower-case edge e do
9.     for each node B do
10.      LowerCase-reductions(e,B)
11.  if no changes return true
12.  for each upper-case edge e do
13.    LabelRemoval-reductions(e)
14. return false
Complexity: AllMax is run O(n²) times, and each iteration costs O(n³), so the algorithm is O(n³ · n²) = O(n⁵).

Widget approach
A widget is a small device handling a subpart of the problem (used below in the NP-hardness reduction).

Refined algorithm
• Reduced distance of a path in the labeled graph: the sum of the numerical values on its edges, disregarding upper- and lower-case labels. New paths created by reductions preserve the reduced distance.
• Reductions turn reduced distances into distances in the AllMax projection, since they may provide alternative paths free of lower-case edges.

Restricted reductions
If we restrict the upper-case reduction and the no-case reduction to be applied only when D is a contingent timepoint, given the current definitions of the lower-case and cross-case reductions, then the reduced scope of the reductions only adds edges that emanate from the start or the end of a contingent constraint.
• Restricted upper-case reduction (only if y ≥ 0 and D is a contingent timepoint): as before, new upper-case edge B:x+y; the translation of the simple regression if B:x comes from a wait constraint, or of the unordered reduction with [−y,+∞] as bound if B:x comes from a contingent constraint.
• Restricted no-case reduction (only if D is contingent): the usual composition of edges, x and y into x+y.

Refined Cut-Off DC Algorithm
Input: labeled distance graph G of STPU P. Output: "dc" or "not dc".
1. for i = 1 to n²+nk+k
2.   if not(AllMax(G)) return false
3.   do any restricted no-case tightenings
4.   for each upper-case edge e do
5.     for each contingent node B do
6.       UpperCase-reductions(e,B)
7.       CrossCase-reductions(e,B)
8.   for each lower-case edge e do
9.     for each node B do
10.      LowerCase-reductions(e,B)
11.  if no changes return true
12.  for each upper-case edge e do
13.    LabelRemoval-reductions(e)
14. return false

AllMax distances and restricted reductions
Consider any application α of a cross-case or a lower-case reduction: these are the only reductions that can tighten AllMax distances, since they are the only ones that consume a lower-case edge c:y and produce a new (possibly upper-labeled) edge surviving in the AllMax projection.

Nodes and color widget
A node A of the coloring problem is mapped to two temporal variables, AX and AY, and, for each possible color, to a path from AX to AY.
[Figure: a coloring problem with three nodes A, B, C; each node contributes a red, a blue and a green path between its two variables.]

Job-shop Scheduling
Claude Le Pape, Philippe Baptiste: Resource Constraints for Preemptive Job-shop Scheduling. Constraints 3(4): 263-287 (1998)

Setting
• Resources, with capacities.
• Activities, with durations and resource requirements.
• Temporal constraints between activities.
Problem: decide when to execute each activity, satisfying both temporal and resource constraints.

Types of scheduling problems
• Disjunctive: each resource can execute at most one activity at a time.
• Cumulative: a resource can run several activities in parallel (provided its capacity is not exceeded).
• Non-preemptive: activities cannot be interrupted.
• Preemptive: activities can be interrupted at any time.
• Mixed: some activities can be interrupted, some not.

CSP representation of non-preemptive problems
An activity A has three (temporal) variables: start(A), the start time of A; end(A), the end time of A; and duration(A) = end(A) − start(A). Notation:
• ESTA (earliest start time): smallest value in D(start(A));
• EETA (earliest end time): smallest value in D(end(A));
• LSTA (latest start time): greatest value in D(start(A));
• LETA (latest end time): greatest value in D(end(A)).

CSP representation of preemptive problems
An activity A has either a set variable set(A): the set of times at which A executes, or booleans W(A,t), for every time t, equal to 1 iff A executes at time t:
• W(A,t) = 1 iff t ∈ set(A);
• start(A) = min over t ∈ set(A) of t;
• end(A) = max over t ∈ set(A) of (t+1).
start(A) and end(A) are used, in the preemptive case, to define temporal constraints among activities. In the non-preemptive case set(A) = [start(A), end(A)). duration(A) = |set(A)|, which equals end(A) − start(A) only in the non-preemptive case.

The preemptive Job-Shop Scheduling Problem
• A set of jobs and a set of machines.
• A job is a set of activities to be processed in a given order (temporal precedence constraints).
• Each activity has an integer processing time and a machine on which it has to be processed; ACTS(M) is the set of activities to be processed on machine M.
• A machine can process at most one activity at a time (disjunctive).
• Activities can be interrupted at any time and an unlimited number of times (preemptive).
Goal: find a schedule (a set of execution times for each activity) that minimizes the makespan (the time at which all activities are finished). The decision problem is NP-complete.

Makespan minimization algorithm
1. compute an obvious upper bound UB and an initial lower bound LB of the makespan
2. while LB ≠ UB
3.   select a value v ∈ [LB,UB)
4.   set makespan = v and run the branching procedure
5.   if a solution is found
6.     UB ← makespan of the solution found
7.   else
8.     LB ← v+1

Branching in the non-preemptive case
Based on ordering ACTS(M), the set of activities that require the same machine M. At each node:
• select M;
• select O ⊆ ACTS(M);
• for each activity A ∈ O, branch assuming A executes first or last among those in O;
• propagate the decision taken on A.

"Ordering" in the preemptive case
We "order" interruptible activities that require the same machine. Given a schedule S, the due date dS(A) of A in S is:
• the makespan of S, if A is the last activity of its job in S;
• otherwise, the start time of the successor of A.
Priority: for any schedule S, an activity Ak has priority over an activity Al in S (Ak <S Al) iff either dS(Ak) < dS(Al), or dS(Ak) = dS(Al) and k ≤ l.

Jackson derivation (1)
Theorem. For any schedule S there exists a schedule J(S) (the Jackson derivation) such that:
1. J(S) meets the due dates: for any A, the end time of A in J(S) is at most dS(A);
2. J(S) is active: for every machine M and every time t, if some activity A ∈ ACTS(M) is available at time t, then M is not idle at time t (available: the predecessor of A is finished and A is not finished);
3. J(S) follows the <S priority order: for every machine M, every time t, and all Ak ≠ Al in ACTS(M), if Ak executes at time t, then either Al is not available at time t or Ak <S Al.

Jackson derivation (2)
Since makespan(J(S)) ≤ makespan(S), at least one optimal schedule is a Jackson derivation of some other schedule. We can use the characteristics of Jackson derivations to prune the search.

Branching scheme
1. repeat
2.   let t be the earliest time such that there is an activity A available and not yet scheduled
3.   compute the set K of activities available at t on the same machine as A
4.   compute the subset NDK ⊆ K of undominated activities
5.   select Ak in NDK (e.g. with minimal latest end time)
6.   schedule Ak to execute at t
7.   propagate
8.   on failure, backtrack on the other activities in NDK
9. until all activities are scheduled, or failure

Computing NDK
• In J(S) an activity cannot be interrupted unless a new activity becomes available on the same resource: if Ak ∈ ACTS(M) is chosen to execute at t, it is set to execute either up to its earliest end time or up to the earliest start time of another activity Al ∈ ACTS(M) not available at time t.
• In J(S), an activity Ak cannot execute when another activity Al is available unless Ak <S Al: if Ak ∈ ACTS(M) is chosen to execute at t, any other activity Al ∈ K can be constrained not to execute between t and the end of Ak.
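The due dates dS(A) and the priority order <S used by the Jackson derivation can be sketched in Python (a minimal sketch; the schedule encoding, mapping each activity to the start time of its job successor, is my own):

```python
# Sketch of the due dates d_S(A) and the priority order <_S defined above.

def due_date(act, succ_start, makespan):
    """d_S(A): the makespan of S if A is the last activity of its job,
    otherwise the start time of A's successor in S."""
    succ = succ_start.get(act)      # successor start time, or None if A is last
    return makespan if succ is None else succ

def priority_key(act, index, succ_start, makespan):
    """A_k <_S A_l iff d_S(A_k) < d_S(A_l), or the due dates are equal and k <= l.
    Sorting by this key lists activities in <_S order."""
    return (due_date(act, succ_start, makespan), index)

# Three activities; A2 is last in its job (due date = makespan 13),
# while the successors of A1 and A3 both start at 5.
succ_start = {"A1": 5, "A2": None, "A3": 5}
order = sorted(["A1", "A2", "A3"],
               key=lambda a: priority_key(a, int(a[1]), succ_start, 13))
# the tie on due date 5 is broken by index: A1 before A3
```

The index tie-break makes <S a total order, which is what lets the branching scheme enumerate undominated candidates deterministically.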
At time end(Ak)≥t’>t, Al is dominated by Ak hence not in NDK We cannot have Ak<sAl if Ak is the last activity of its job and either Al is not the last activity of its job or l<k If Ak∈ACTS(M) is the last activity of its job, Ak is not candidate for execution at time t if another activity Al∈ACTS(M), which is not the last of its job, or such that l<k is available at time t (Ak is dominated by Al) Propagation techniques In step 7 of the branching scheme the execution of an activity on a machine at a given time is propagated Several ways to propagate: Timetable constraints Disjunctive Constraints Edge-finding Arc-B-consistency Arc-consistency: constraint c over variables v1,…,vn, with domains D1,…Dn, is arc consistent iff for any variable vi and domain value vali in Di there exist value val1,val2,…vali-1,…,vali+1,valn in D1,D2,…Di- 1,Di+1,…,Dn such that c(val1, …,valn) holds Arc-B(ounds)-consistency: constraint c over variables v1,…,vn, with domains D1,…Dn, is arc-B-consistent iff for any variable vi and for vali=max(Di) or vali=min(Di) there exist value val1,val2,…vali-1,…,vali+1,valn in D1,D2,…Di- 1,Di+1,…,Dn such that c(val1, …,valn) holds Timetable constraints: non-preemptive Time table: explicit data structure to keep track of resource utilization over time resource availability over time Propagation for resource constraints: from resources to activities: update activity bounds according to resource availability from activities to resources: update the minimal and maximal capacities that can be used at any point in time Resource constraint propagation: non preemptive case maintaining arc-B-consistency on the following constraint ∑A[W(A,t) x capacity(A)]≤ capacity(t) capacity(A): capacity required by activity A capacity(t): capacity available at time t Example A, B two activities requiring the same resource of capacity 1 D(start(A))={0,1} Act. EST EET LST LET Dur D(start(B))={0,1,2} A 0 2 1 3 2 D(end(A))={2,3} B 0 2 2 4 2 D(end(B))={2,3,4} Propagation 1 Act. 
EST EET LST LET Dur D(start(A))={0,1} A 0 2 1 3 2 D(start(B))={2} B 2 4 2 4 2 D(end(A))={2,3} D(end(B))={4} Propagation2 D(start(A))={0} Act. EST EET LST LET Dur D(start(B))={2} A 0 2 0 2 2 D(end(A))={2} B 2 4 2 4 2 D(end(B))={4} Resource constraint propagation: preemptive case(1) From resources to activities ESTA the first time t at which W(A,t) can be true LETA 1+ (the last time t at which W(A,t) can be true) EETA must satisfy there possibly ∃ duration(A) timepoints in set(A)∩[ESTA,EETA) LSTA must satisfy there possibly duration(A) timepoints in set(A)∩[LSTA,LETA) Resource constraint propagation: preemptive case(2) From activities to resources: W(A,t) cannot be set to 1 as soon as LSTA≤ t ≤ EETA W(A,t) can be set to 1 iff no more that duration(A) time points can belong to set(a) Unlikely to occur before set(A) is instantiated not much pruning Same Example, but preemptive canboth Nothing deduced if be are interruptible A, B two activities, B interruptible Act. EST EET LST LET Dur D(start(A))={0,1} A 0 2 1 3 2 D(start(B))={0,1,2} B 0 2 2 4 2 D(end(A))={2,3} D(end(B))={2,3,4} Propagation 1 Act. EST EET LST LET Dur D(start(A))={0,1} D(start(B))={0,1,2} A 0 2 1 3 2 D(end(A))={2,3} B 0 3 2 4 2 D(end(B))={3,4} Disjunctive constraints: non-preemptive Non-cumulative disjunctive scheduling: two activities A and B requiring the same resource R cannot overlap A before B or B before A If n activities require R: nx(n-1)/2 disjunctive constraints Disjunctive constraints: propagation, non preemptive maintaining arc-B-consistency on: [end(A)≤ start(B)] or [end(B)≤ start(A)] Example: Disjunctive csts, non-preemptive A, B two activities same resource no overlap Act. EST EET LST LET Dur D(start(A))={0,1,2} A 0 2 2 4 2 D(start(B))={1,2,3} D(end(A))={2,3,4} B 1 3 3 5 2 D(end(B))={3,4,5} Propagation 1 D(start(A))={0,1,2} Act. 
EST EET LST LET Dur D(start(B))={2,3} A 0 2 1 3 2 D(end(A))={2,3,4} B 2 4 3 5 2 D(end(B))={4,5} Disjunctive constraints can propagate more than timetable constraints which in this case cannot deduce anything Disjunctive constraints: non preemptive Representing the fact A and B cannot overlap, two ways set(A)∩set(B)=∅ ∀t, [W(A,t)=0] or [W(B,t)=0] If either of the two is adopted propagation of preemptive disjunctive constraints = propagation of preemptive timetable constraints Rewriting However if we rewrite the non-preemptive disjunctive constraint [end(A)≤ start(B)] or [end(B)≤ start(A)] as [start(A)+duration(A)≤end(B)-duration(B)] or [start(B)+duration(B)≤end(A)-duration(A)] we obtain a new preemptive disjunctive constraint [start(A)+duration(A)+duration(B)≤end(A)] or [start(A)+duration(A)+duration(B)≤end(B)] or [start(B)+duration(A)+duration(B)≤end(A)] or [start(B)+duration(A)+duration(B)≤end(B)] complements set(A)∩set(B)=∅ Mixed case if A is non interruptible: remove first disjunct if B is non interruptible: remove fourth disjunct Example: Disjunctive csts, preemptive A, B two interruptible activities same resource no overlap D(start(A))={0,1,2} Act. EST EET LST LET Dur D(start(B))={2,3} A 0 4 2 6 4 D(end(A))={4,5,6} B 2 3 3 4 1 D(end(B))={3,4} Propagation 1 Act. 
EST EET LST LET Dur
  A   0    5    1    6    2      D(start(A))={0,1}    D(end(A))={5,6}
  B   2    3    3    4    2      D(start(B))={2,3}    D(end(B))={3,4}

Edge-finding
Branching technique: ordering activities that require the same resource. At each node a set of activities O is selected; for each activity A in O, a new branch is created where A executes first (or last) among the activities in O.
Bounding technique: deducing that some activities in O
- must execute first (or last) in O
- can execute first (or last) in O
- cannot execute first (or last) in O

Notation
- p_A: minimal duration of A
- EST_O: smallest earliest start time of the activities in O
- LET_O: greatest latest end time of the activities in O
- p_O: sum of the minimal durations of the activities in O
- A<<B: A executes before B
- A<<O: A executes before all activities in O
- A>>O: A ends after all activities in O

Edge-finding bounding technique, non-preemptive
Rules:
- ∀O, ∀A∉O: [LET_{O∪{A}} - EST_O < p_O + p_A] implies A<<O
- ∀O, ∀A∉O: [LET_O - EST_{O∪{A}} < p_O + p_A] implies A>>O
- A<<O implies end(A) ≤ min_{O'⊆O} (LET_{O'} - p_{O'})
- A>>O implies start(A) ≥ max_{O'⊆O} (EST_{O'} + p_{O'})
If n activities require the resource there are O(n × 2^n) pairs (A,O) to consider.

Time bound adjustment
The "primal" algorithm updates earliest start times; the "dual" algorithm updates latest end times.

Primal algorithm
Compute Jackson's preemptive schedule (JPS). JPS is the schedule obtained by applying the rule: whenever the resource is free and at least one activity is available, schedule the activity A with smallest LET_A; if B becomes available while A is in process, interrupt A and start B if LET_B < LET_A, otherwise continue A. Let p*_B be the residual duration of B in JPS at time t.
For each activity A:
- compute S = set of activities not finished at t = EST_A in JPS
- for each activity C in S, in decreasing order of LET: if C satisfies EST_A + p_A + Σ_{B∈S-{A}, LET_B≤LET_C} p*_B > LET_C, then impose
    A>>{B ∈ S-{A} | LET_B ≤ LET_C}
    start(A) ≥ max_{B∈S-{A}, LET_B≤LET_C} C_B^JPS
  where C_B^JPS is the completion time of B in JPS.

Example: preemptive JPS for 3 activities
  Act.  EST  LET  Dur
  A     0    17   6
  B     1    11   4
  C     1    11   3
(Figure: the JPS runs A in [0,1], B in [1,5], C in [5,8], then A again in [8,13].)

Edge finding: preemptive case
Weakening of the
non-preemptive rules:
1. ∀O, ∀A∉O: [LET_O - EST_{O∪{A}} < p_O + p_A] implies A>>O
2. A>>O implies start(A) ≥ max_{O'⊆O} (EST_{O'} + p_{O'})
(A>>O: A ends after all activities in O.)
When A cannot be interrupted, both rules apply even if the other activities may be interrupted, giving the same adjustment of EST_A (induced by rule 2) as in the non-preemptive case. If A is interruptible, rule 1 still holds, but rule 2 is replaced by the weaker
2'. A>>O implies end(A) ≥ max_{O'⊆O} (EST_{O'∪{A}} + p_{O'∪{A}})

Primal preemptive edge-finding algorithm
1. Compute the preemptive JPS
2. For each activity A
3.   compute S = set of activities not finished at t = EST_A in JPS
4.   For each activity C in S in decreasing order of LET
5.     if C satisfies EST_A + p_A + Σ_{B∈S-{A}, LET_B≤LET_C} p*_B > LET_C
6.     then impose the following constraints
7.       A>>{B ∈ S-{A} | LET_B ≤ LET_C}
8.       if A is non-interruptible
9.         start(A) ≥ max_{B∈S-{A}, LET_B≤LET_C} C_B^JPS
10.      else
11.        end(A) ≥ EST_A + p_A + Σ_{B∈S-{A}, LET_B≤LET_C} p*_B

Properties of the preemptive algorithm
- If A is non-interruptible: it computes the earliest time at which A could start if all the other activities were interruptible.
- If A is interruptible: it computes the earliest time at which A could end if all the other activities were interruptible.

Mutual Exclusion widgets (Mutex)
Modelling A ≠ B. L is chosen large enough that if a path through the mutex widget has length L then it is not part of a negative cycle. Two paths through the mutex widget might lie in a negative cycle:
  p1: A1 → P → Q → A2
  p2: B1 → Q → P → B2
- If w(P→Q)=4L and w(Q→P)=-4L, then w(p1)=2L and w(p2)=0: only p2 can be in a negative cycle.
- If w(P→Q)=2L and w(Q→P)=-2L, then w(p1)=0 and w(p2)=2L: only p1 can be in a negative cycle.
(Figure: the mutex widget connecting A1, A2, B1, B2 through P and Q, with edge weights ±L, ±2L, ±4L.)

Multiple constraints
If we have A≠B and A≠C, chain one mutex widget per constraint along A.

DC of triangular networks (1) (Morris, Muscettola, Vidal '01)
Triangle with contingent link AC with bounds [x,y], x>0 and y>0, and requirement links on AB and BC (bounds [u,v] on the B-C side).
- Follow case: C executes first, then B.
- Precede case: B executes first, then C; the constraint on AB tightens to [y-v, x-u].

DC of triangular networks (2)
Unordered case: either B first then C, or C first then B. B must either wait for the observation of C, or wait y-v after A (a wait <C, y-v>). (Morris,
Muscettola,Vidal’01) Wait regression Regression 1: AB constraint has a wait <C,t> Any DB constraint with upper bound u deduce wait <C,t-u> on AD Regression 2: AB constraint has a wait <C,t> A contingent constraint DB with lower bound z deduce wait <C,t-z> on AD DC Algorithm 1. Input STPU P 2. Until quiescence: 3. If enforcing path consistency on P tightens any contingent interval then exit (not DC) 4. Select any triangle ABC, C uncontrollable, A before C 5. Perform triangular reduction 6. Regress waits 7. Output minimal STPU P Complexity : deterministic polynomial Weak Controllability (Vidal,Fargier ’99) An STPU is weakly controllable if for every situation w, the corresponding projection STP Pw is consistent Consider STPU Q and the set Z={l1,u1} x … x {lh,uh}, where lJ,uJ are the lower and upper bound of a contingent constraint in Q: Q is WC iff for every w’ in Z, STP Pw’ is consistent The WC algorithm tests this property. Exponential. Testing WC is Co-NP-complete (Vidal,Fargier’99 and Morris, Muscettola ’99) Simple Temporal Problems with Preferences Solving and learning a tractable class of soft temporal problems: theoretical and experimental results, L. Khatib, P. Morris, R. Morris, F. Rossi, A. Sperduti, K. Brent Venable, AI Communications, special issue on Constraint Programming for Planning and Scheduling, to appear in 2006. Tractable Pareto Optimization of Temporal Preferences L. Khatib, P. Morris, R. Morris and K.B. Venable, Proc. Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), Morgan Kaufmann 2003. 
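Stepping back to Weak Controllability: the test above only needs to check the "corner" situations in Z, one projection per combination of contingent bounds. A minimal sketch of that brute-force check follows; the index-based STPU encoding and the helper names (`stp_consistent`, `weakly_controllable`) are illustrative assumptions, not from the cited papers.

```python
from itertools import product

def stp_consistent(n, constraints):
    # constraints: tuples (i, j, a, b) meaning a <= Xj - Xi <= b.
    # Distance-graph encoding: edge i->j with weight b, edge j->i with -a;
    # the STP is consistent iff Floyd-Warshall finds no negative cycle.
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, a, b in constraints:
        d[i][j] = min(d[i][j], b)
        d[j][i] = min(d[j][i], -a)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return all(d[i][i] >= 0 for i in range(n))

def weakly_controllable(n, requirements, contingents):
    # WC test: every projection obtained by fixing each contingent
    # duration at one of its bounds (an element of Z) must be consistent.
    for corner in product(*[(l, u) for (_, _, l, u) in contingents]):
        fixed = [(h, k, w, w) for (h, k, _, _), w in zip(contingents, corner)]
        if not stp_consistent(n, requirements + fixed):
            return False
    return True
```

As the slides note, this is exponential in the number of contingent links, consistent with the co-NP-completeness of testing WC.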
Overview
- Simple Temporal Problems with Preferences
- A tractable subclass
- Two solvers for fuzzy-optimal solutions
- A solver for Pareto-optimal solutions
- Learning local temporal preferences from preferences on solutions
- Utilitarian optimals
- Disjunctive Temporal Problems with Preferences

Simple Temporal Constraints: An Example
Two activities of a Mars rover:
- Taking pictures: 1 < duration < 10, 0 < start < 7
- Analysis: 5 < duration < 15, 5 < start < 10
- Additional constraint: -4 < start of analysis - end of pictures < 4
(Figure: the constraint network on Beginning_world, Start_p, End_p, Start_a, End_a, with one of the solutions marked.)

Introducing preferences
Sometimes hard constraints aren't expressive enough. We may think that:
- It's better for the picture to be taken as late as possible and as fast as possible.
- It's better if the analysis starts around 7 and lasts as long as possible.
- It's ok if the two activities overlap, but it's better if they don't.

STPP Formalism
Simple Temporal Problem with Preferences:
- Simple Temporal Problem
  - Set of variables X1,…,Xn
  - Constraints T={I}, I=[a,b], a≤b
  - Unary constraint T over variable X: a≤X≤b
  - Binary constraint T over X and Y: a≤X-Y≤b
- C-semiring S=<A, +, x, 0, 1>
  - A: set of preference values
  - +: compares preference values, inducing the ordering on A (a≤b iff a+b=b, for a,b in A)
  - x: composes preference values
- Simple Temporal Constraint with Preferences
  - Binary constraint
  - Interval I=[a,b], a≤b
  - Preference function f: I → A

What does solving an STPP mean?
A solution is a complete assignment to all the variables consistent with all the constraints. Every solution has a global preference value induced from the local preferences. Solving an STPP means finding an optimal consistent solution, where optimal means its global preference is best.

Intractability
The class of STPPs is NP-hard.
Proof: any TCSP can be reduced to an STPP on the semiring SCSP=<{1,0},or,and,0,1> in the following way. For every hard constraint I={[a1,b1],…,[ak,bk]} write the soft constraint <I',f> with I'=[a1,bk] and, for x∈I', f(x)=1 iff ∃j such that x∈[aj,bj]. Then S is a solution of the TCSP iff it is an optimal solution of the STPP.

Tractability conditions
Simple Temporal Problems with Preferences are tractable if:
1) the underlying semiring has an idempotent multiplicative operator (x), for example the fuzzy semiring <{x | x in [0,1]}, max, min, 0, 1>;
2) the preference functions are semi-convex;
3) the set of preferences is totally ordered.

Semi-convex functions
f is semi-convex iff, for every y, the set {x | f(x) ≥ y} is an interval.
(Figure: examples of semi-convex and non-semi-convex functions.)

Solutions of the Rover Example
Fuzzy semiring <[0,1], max, min, 0, 1>. Global preference of a solution: the minimum of the preferences of its projections. Goal: maximize the global preference.
Two solutions:
- Start_p=5, End_p=11, Start_a=7, End_a=12: global preference = 0.6
- Start_p=7, End_p=8, Start_a=9, End_a=24: global preference = 0.9  BEST

Path consistency with preferences
As with hard constraints, two operations on temporal constraints with preferences: intersection and composition.

Intersection of Soft Temporal Constraints
If T1=<I1,f1> and T2=<I2,f2> are defined on Xi and Xj, then T1⊕T2=<I1∩I2, f1⊕f2> is defined on Xi and Xj, where in the fuzzy case
  (f1⊕f2)(a) = min(f1(a), f2(a))
Example: at a=6, min(0.33,0.45)=0.33; at a=9, min(0.56,0.25)=0.25.

Composition of Soft Temporal Constraints
If Tik=<Iik,fik> and Tkj=<Ikj,fkj>, then Tik⊗Tkj=<Iik+Ikj, fik⊗fkj> is defined on Xi and Xj, where Iik+Ikj={a=r1+r2 | r1∈Iik, r2∈Ikj} and, in the fuzzy case,
  (f1⊗f2)(a) = max over r1+r2=a of min(f1(r1), f2(r2))
Example, a=8:
- r1=0, r2=8: min(0.2,0.4)=0.2
- r1=1, r2=7: min(0.3,0.48)=0.3
- r1=2, r2=6: min(0.4,0.52)=0.4
- r1=3, r2=5: min(0.6,0.55)=0.55
max{0.2,0.3,0.4,0.55}=0.55=(f1⊗f2)(8)

Path Consistency in STPPs
A soft temporal constraint Tij is path consistent iff Tij ⊆ ⊕∀k (Tik ⊗ Tkj). An STPP is path consistent if all its constraints are path consistent.

Algorithm STPP_PC-2 (as PC-2, except that the input and output are STPPs and ⊗,⊕ are extended to preferences)
1. Input: STPP P
2. Queue Q ← {(i,j,k) | i<j, k≠i,j}
3. while Q≠Ø do
4.   select and delete a path (i,j,k) from Q
5.   if Tij ≠ Tij ⊕ (Tik ⊗ Tkj) then
6.     Tij ← Tij ⊕ (Tik ⊗ Tkj)
7.     if Tij=Ø then exit (inconsistency)
8.     Q ← Q ∪ {(i,j,k) | 1≤k≤n, k≠i,j}
9.   end-if
10. end-while
11. Output: path consistent STPP
(Figure: updating the constraint on Xi, Xj through Xk; the new constraint replaces the old one.)

Solving tractable STPPs with path consistency
Given a tractable STPP, path consistency is sufficient to find an optimal solution without backtracking.
Proof: semi-convex functions are closed under intersection and composition. After enforcing path consistency, if no inconsistency is found, all the preference functions have the same maximum preference level M. The subintervals mapped into M form an STP in minimal form such that an assignment is a solution of the STP iff it is an optimal solution of the STPP.

Path-Solver
1. Input STPP P
2. STPP Q ← STPP_PC-2(P)
3. Build an STP PM considering only the intervals mapped into the best preference level M
4.
Output STP PM
(Figure: a) the STPP P is transformed by STPP_PC-2 into Q; b) the STP PM is built from the best preference level M; there is a 1:1 correspondence between the optimal solutions of P and the solutions of PM.)

Complexity of Path-solver
A tractable STPP can be solved in O(n³rl) relaxations, or more precisely O(n³r³l) arithmetic operations, where n = number of variables, r = max size of an interval, l = number of different preference levels.
(Figure: an STPP in input and after STPP_PC-2.)

Random Generator of STPPs
It generates an STPP on the fuzzy semiring with semi-convex parabolas as preference functions.
Why semi-convex parabolas? Because:
- they are easily parametrized
- they are representative of many temporal relations (they include linear functions)
- they will be useful for the learning module
Parameters of the generator:
- n = number of variables
- r = range of the first solution
- d = density
- max = maximum expansion of the intervals
- pa, pb and pc = percentage of perturbation for the parabolas

Experimental Results for Path-solver
Results on randomly generated problems with n=30, r=100, pa=20, pb=20, pc=30 (x-axis: density; y-axis: seconds).

Solving STPPs with a decomposition approach
Given a tractable STPP and a preference level y, the intervals of elements with preference above y form an STP: Py. The highest level opt at which the STP Popt is consistent is such that an assignment is a solution of Popt iff it is an optimal solution of the STPP.

Basic Iteration of Chop-Solver
STPP P → Chopper → STP P' → STP solver → either consistent (return the STP solution) or inconsistent.

Chop-Solver Algorithm
1) Input STPP P;
2) Input Precision;
3) Real lb=0, ub=1, y=0, n=0;
4) if STP_y is consistent
5)   y=1,
6)   if STP_y is consistent return solution;
7)   else
8)     y=0.5, n=n+1;
9)     while n ≤ Precision
10)      if STP_y is consistent
11)        lb=y, y=y+(ub-lb)/2, n=n+1;
12)      else
13)        ub=y, y=y-(ub-lb)/2, n=n+1;
14)    end of while;
15)    return solution;
16) else exit (inconsistent);
Chop-Solver performs a binary search for the highest level at which chopping the STPP gives a consistent STP.
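The chop-and-bisect idea can be sketched compactly. The sketch below assumes a fuzzy STPP given as tuples (i, j, a, b, f); the interval {x | f(x) ≥ y} is recovered on a finite grid, a simplification that relies on the semi-convexity of f. All helper names are illustrative.

```python
def stp_consistent(n, edges):
    # edges: (i, j, a, b) meaning a <= Xj - Xi <= b; Floyd-Warshall on the
    # distance graph, consistent iff there is no negative cycle.
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, a, b in edges:
        d[i][j] = min(d[i][j], b)
        d[j][i] = min(d[j][i], -a)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return all(d[i][i] >= 0 for i in range(n))

def chop(stpp, y, samples=1000):
    # Keep, for each soft constraint, the subinterval where f >= y
    # (an interval, by semi-convexity); None if some constraint empties.
    hard = []
    for i, j, a, b, f in stpp:
        xs = [a + (b - a) * t / samples for t in range(samples + 1)]
        ok = [x for x in xs if f(x) >= y]
        if not ok:
            return None
        hard.append((i, j, min(ok), max(ok)))
    return hard

def chop_solve(n, stpp, precision=30):
    # Binary search for the highest chop level giving a consistent STP.
    lb, ub, best = 0.0, 1.0, None
    for _ in range(precision):
        y = (lb + ub) / 2
        hard = chop(stpp, y)
        if hard is not None and stp_consistent(n, hard):
            lb, best = y, hard
        else:
            ub = y
    return lb, best
```

For example, two soft constraints on the same pair with preferences x/10 and 1-x/10 trade off against each other, and the search settles at level 0.5, where only x=5 remains feasible.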
Precision is the number of steps allowed in the search.

Complexity of Chop-solver
A tractable STPP can be solved using Chop-solver in:
- O(precision × n³) if we use Floyd-Warshall to solve the STPs
- O(precision × n³ × r) if we use PC-2 to solve the STPs
where n = number of variables and r = max size of an interval.

Experimental results for Chop-Solver
(x-axis: number of variables; y-axis: time in seconds)
Fixed parameters: range of first solution 100000, max expansion 50000, perturbation on a, b, c: 5% each. Varying: density 20%, 40%, 60%, 80%. Mean over 10 examples.

Path-solver vs Chop-solver
                            Path-solver   Chop-solver
  Constraint representation  discrete      continuous
  Performance                slow          very fast
  Representational power     unlimited     limited but useful for learning
Time to solve a problem with 40 variables, r=100, max=50, pa=pb=10% and pc=5%:
  Density   Path-solver   Chop-solver
  40%       1019.44 sec   0.03 sec
  60%        516.24 sec   0.03 sec
  80%        356.71 sec   0.03 sec

From Fuzzy to Pareto Optimality
In fuzzy CSPs the global preference is the minimum preference associated with any of the projections (drowning effect).
- Fuzzy optimal: the maximum minimum preference.
- Pareto optimal: no other solution with higher preferences on all constraints.
Example: solution S with <f1(S1)=0.2, f2(S2)=0.3, f3(S3)=0.2> and solution S' with <f1(S'1)=0.8, f2(S'2)=0.9, f3(S'3)=0.2>. Fuzzy optimals: S, S'. Pareto optimals: S'.

Pareto Optimal Solutions (Khatib, Morris, Morris, Venable '03)
The WLO+ algorithm finds a Pareto-optimal solution of an STPP. After applying Chop-solver to the problem it identifies special constraints, the weakest links; it modifies the weakest links; it reapplies Chop-solver.

Weakest Link
A constraint is a weakest link at the optimal chopping level opt if the maximum reached by its preference function on the minimal interval is opt.
(Figure: two constraints with their minimal intervals; the weakest links are those whose preference tops out at opt.)

WLO+
1) Apply Chop-solver and identify the weakest links
2) While there are weakest links:
3)   modify each weakest link:
3a)    interval ← minimal interval
3b)    preference function ← y=1
4)   apply Chop-solver again
(Figure: on a CPU/activity example, the Pareto-optimal solutions returned by WLO+ are a subset of the fuzzy-optimal solutions returned by Chop-solver.)

WLO+: Experimental Results (1)
(x-axis: variables; y-axis: seconds) Fixed parameters: r=50, max=50, density = number of variables. Throughout all the experiments: 1) 10% utilitarian improvement; 2) 75% wins w.r.t. the earliest optimal solution returned by Chop-solver.

Stratified Egalitarianism and Utilitarian Criteria (P. Morris et al. '04)
The set of solutions returned by WLO+ is a subset of the Pareto-optimal solutions. It can be characterized by the Stratified Egalitarianism criterion: given a solution S of an STPP, let uS=<uS1,…,uSm> be the associated vector of preferences (one for each constraint). Solution S SE-dominates solution S' iff there is a preference level α such that:
- uSi ≥ uS'i for every i with uS'i < α;
- uSi > uS'i for some i with uS'i < α;
- uSi ≥ α for every i with uS'i ≥ α.

Learning Simple Temporal Problems with Preferences

Learning local from global
It can be difficult to have precise knowledge of the preference function for each constraint. Instead it may be easier to tell how good a solution is.
Global information (some solutions + global preference values) → Local information (shape of the preference functions).

Learning STPPs
- Inductive learning: the ability of a system to induce the correct structure of a map t known only for particular inputs. Example: (x, t(x)).
- Computational task: given a collection of examples (training set), return a function h that approximates t.
- Approach: given an error function E(h,t), minimize it by modifying h.
• In our context:
  • x: a solution
  • t: the rating on solutions given by an expert
  • preference function of constraint Ci: a parabola ai x² + bi x + ci
  • error: E = E(a1,b1,c1,…,an,bn,cn)
  • learning technique: gradient descent

Gradient Descent
- Define h=hW, where W is the set of internal parameters; thus E=EW.
- Initialize W to small random values (t=0).
- Update W according to the Delta rule: W(t+1) = W(t) + ΔW(t), with ΔW(t) = -η ∂E/∂W(t).
- Stop when satisfied with the level of minimization reached.
- Test the results on a set of new examples, the test set.

The Learning Module
The training set pairs solutions with their global preference values, e.g. (5,11,7,12) → 0.6, (7,8,6,11) → 0.8, … The module takes the STP underlying the STPP and produces learned preference functions on its constraints.

The Implemented Learning Module
- Works with parabolas f(x)=ax²+bx+c as preference functions
- Fuzzy semiring <[0,1],max,min,0,1> as underlying structure
- Smooth version of the min function
- Performs incremental gradient descent on the sum-of-squares error
    E = 1/2 Σ_{s∈T} (t(s) - h(s))²
  where t(s) is the preference value of solution s in the training set and h(s) is the preference value guessed for s by the current network.

The Learning Algorithm
1) Read a solution s and its preference value t(s) from the training set
2) Compute the preference value h(s) of s according to the current network
3) Compare h(s) and t(s) using the error function
4) Adjust the parameters a, b, c of each preference function of each constraint to make the error smaller
5) Compute the global error; if below threshold, exit, otherwise go back to 1)

Adjustment of the parameters (Delta rule, with η the learning rate and m the number of constraints):
  ãi = ai - η ∂E/∂ai(a1,b1,c1,…,am,bm,cm)
  b̃i = bi - η ∂E/∂bi(a1,b1,c1,…,am,bm,cm)
  c̃i = ci - η ∂E/∂ci(a1,b1,c1,…,am,bm,cm)
Semi-convexity is maintained during the whole learning process: if ai ≤ 0 then ãi ≤ 0.

Stopping the learning phase
Parabolas or fuzzy-parabolas? ……both!
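A minimal sketch of this incremental gradient descent, under simplifying assumptions: a plain min in place of the slides' smooth min (so only the argmin constraint receives a gradient on each example), and a clamp a ≤ 0 as the semi-convexity maintenance step. All names are illustrative.

```python
def learn_parabolas(train, m, lr=0.1, epochs=30000):
    # Each constraint i carries a parabola f_i(x) = a_i x^2 + b_i x + c_i.
    # Global preference of a solution = min_i f_i(x_i) (fuzzy semiring).
    # train = list of (xs, t): xs[i] is the solution's projection on
    # constraint i, t is the expert's rating of the solution.
    params = [[-0.01, 0.0, 0.5] for _ in range(m)]  # small initial values

    def f(p, x):
        return p[0] * x * x + p[1] * x + p[2]

    for _ in range(epochs):
        for xs, t in train:
            vals = [f(params[i], xs[i]) for i in range(m)]
            i = min(range(m), key=lambda k: vals[k])  # argmin constraint
            err = t - vals[i]                         # delta-rule error
            a, b, c = params[i]
            x = xs[i]
            # Delta rule on the sum-of-squares error, argmin constraint only
            a += lr * err * x * x
            b += lr * err * x
            c += lr * err
            params[i] = [min(a, 0.0), b, c]  # keep a <= 0: semi-convexity
    return params
```

On a realizable target (ratings produced by a single semi-convex parabola) this recovers the local preference function up to small error, mirroring the role of the learning module in the slides.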
Monitored errors on both kinds of parabolas: Sum of squares error parabola Absolute maximum error 1 Absolute mean error Fuzzy-parabola Stop criterion: 100 consecutive failure of improving of at least 70% the abs. mean error computed with fuzzy parabolas Errors computed on test set for final evaluation: Sum of squares error Absolute maximum error Absolute mean error Experimental results on randomly generated problems •Varying parameters: • density (D) • maximum range of interval expansion (max). •Fixed parameters : • number of variables n=25 • range for the initial solution r=40 • parabolas perturbations pa=10, pb=10 and pc=5. •Displayed: absolute mean error (0<ame<1) on a test set (mean on 30 examples). • 357<=iterations<=3812 • 2’ 31’’<=time required<=8’ 18’’ Density Number of examples of D=40 D=60 D=80 Maximum training and test Range set. max=20 0.017 0.007 0.0077 500 max=30 0.022 0.013 0.015 600 max=40 0.016 0.012 0.0071 700 An example with maximum lateness Problem: 8 activities to be scheduled in 24 hours Given: Duration intervals for each activity Constraint graph Aim: Minimize the ending time of the last activity scheduled. Procedure: 1) Solve the hard constraint problem: 900 solutions 2) Rate each solution with a function that gives higher preference to schedules that end sooner: 37 optimal solutions 3) Select 200 solutions for the training set, 8 optimal solutions, and 300 for the test set. 4) Perform learning: 1545 iterations. Results: Absolute mean error on test set: 0.01 Maximum absolute error on test set: 0.04 Number of optimal solutions of the learned problem: 252 all rated highly by the original function. Number of unseen optimal solutions recognized by the learned problem: 29. Disjunctive Temporal Problems with Preferences B. Peintner and M. E. 
Pollack, "Low-Cost Addition of Preferences to DTPs and TCSPs," 19th National Conference on Artificial Intelligence (AAAI-04), July 2004

Disjunctive Temporal Problems with Fuzzy Preferences
Quantitative temporal constraint problems:
- Variables: time events
- Domains: allowed occurrence times
- Constraints: disjunctions of simple temporal constraints with preferences
  C: (X1-Y1 ∈ [a1,b1], f1) v … v (Xn-Yn ∈ [an,bn], fn), with fi: [ai,bi] → [0,1] (fuzzy case; → A in general)

Example of TCSPP
Sometimes hard constraints aren't expressive enough. We may think that:
- It's better for the picture to be taken as late as possible and as fast as possible.
- It's better if the analysis starts around 7 and lasts as long as possible.
- The analysis must start from 2 to 4 units before or after the end of the picture, preferably the further.
(Figure: the rover network where the constraint between End_p and Start_a is the disjunction [-4,-2] v [2,4], with preferences growing away from 0.)

Example of DTPP
Alice gets up between 9 and 10, and she prefers to sleep late. She will have lunch with Bob at 12:00. In the morning she wants to visit with her friend Carla for a couple of hours, or go swimming for two hours, or possibly both. The more time she spends with Carla the better. As for swimming, Alice gets tired after 1 hour. She can meet Carla or be at the pool starting 40 minutes after she gets up, the earlier the better. She should leave Carla or the pool 30 minutes before lunch, the earlier the better.

Corresponding DTP
Variables (all with domain [0,12]): W = wake-up time, Ls = lunch starts, Vs = visit starts, Ve = visit ends, Ss = swim starts, Se = swim ends.
Constraints:
- C1: W-X0 ∈ [9,10], with increasing preference (sleep late)
- C2: Ls-W ∈ [2,3]
- C3: preferences on the durations: on Se-Ss over [1,2] (Alice gets tired after 1 hour) v on Ve-Vs (the more time with Carla the better)
- C4: Ve-Vs ∈ [2,2] v Se-Ss ∈ [1,2]
- C5:….
0 0 40’ 12 (Ss-W) 40’ 12 (Vs-W) Chopping a DTPP Given a DTPP Q and a preference level p chopping Q at p means • For every constraint Ci: ci1vci2v…vcik • Replace each disjunct cij: (Xij-Yij∈[aij,bij],fij) • With {Xij-Yij∈[ahij,bhij]| for all h x∈ [ahij,bhij] iff fij(x)≥p} Ci p V (xi1-yi1) (xi2-yi2) Summarizing Given preference level p Chop(semi-conv STPP, p) STP Chop(STPP,p) TCSP Chop(TCSPP,p) TCSP Chop(DTPP,p)DTP Constraints with the same variables Possibly more disjuncts on each constraint Solving fuzzy DTPPs Discretize preferences For each preference p∈{0.1,…,0.9, 1} obtain the corresponding DTP: DTPp Search for the highest p such that the corresponding DTPp is consistent That is, search within the DTPs for a consistent STP Optimal solution: assignment to all variables such that its preference is maximal Definitions STP constraint family: the set of STP constraints obtained chopping at ALL preference levels a DTPP disjunct 0.6 0.3 0.1 2 4 5 6 7 DTP0.6 DTP0.3 STP constraint family DTP0.1 2 4 5 6 7 Definitions STP1 is tighter than STP2 if each constraint in STP1 corresponds to unique tighter constraint in STP2 STP1 is related to STP2 if each constraint in STP2 has a constraint family member in STP1 Intuitively, the two STPs work on the same disjuncts Properties Given two preference levels p, q such that q<p, each component STP of DTPp is related to and tighter than a component STP of DTPq Downward consistency: If there is a consistent component STP of DTPp, then there is a consistent STP component of every DTPq for q<p Upward inconsistency: if a component STP of DTPq is inconsistent, then any related component STP of DTPp , with q<p, is also inconsistent Solve-DTPP Solve-DTPP-main Solve-DTPP Input: DTPP P Input: index i, preference p 1. ∀i,∀p [Dpi,Spi]project(Di,p) 1. If p> 1 return bestSolution 2. bestSolution=Ø 2. select[i] select[i]+1; 3. Solve-DTPP(1,0.1) 3. If (select[i]>Spi) 4. If (i=1) return bestSolution 5. select[i]0 6. 
Solve-DTPP(i-1,p) Dpi =set of disjuncts 7. If STP-consistency(Dp1(select[1]), …, Dpi(select[i])) obtained projecting 8. If (i = n) constraints Ci at 9. bestSoln= Min Net of selected STP preference level i 10. pp+0.1 11. select[i] select[i]-1 12. Solve-DTPP(i,p) Spi =number of 13. else disjuncts in D ip 14. Solve-DTPP(i+1,p) 15. else select[i] = index of 16. Solve-DTPP(i,p) selected disjunct of constraint i Comments of Solve-DTPP Solve-DTPP-main, where the DTPP is projected, is less expensive than the second part (Solve-DTPP) where search is performed Starts searching at the lowest preference level (0.1) If a consistent STP is found moves up of one preference level starting from an STP related to the one found Complexity of solving Fuzzy DTPPs Naïve Algorithm: Project DTPs at all preference levels Solve each of them O(number of preferences x complexity of Epilitis) Binary search: Project DTPs at all preference levels Solve the DTPs following a binary search O(log(no. of prefs) x complexity of Epilitis) Solve-DTPP O((no. of prefs x solving STP) + complexity of Epilitis) Improving Solve-DTPP When a disjunct splits then lower indexes should be given to parts persisting at higher levels Intersect an STP at level k+1 with the minimal net of the unique related wider STP at level k If the intersection is empty inconsistency If the intersection produces the minimal net at level k+1 consistency Otherwise I can skip that STP at that level Uncertainty in quantitative temporal problems Ioannis Tsamardinos, Thierry Vidal, Martha E. Pollack: CTP: A New Constraint-Based Formalism for Conditional, Temporal Planning. Constraints 8(4): 365-388 (2003) Thierry Vidal, Hélène Fargier: Handling contingency in temporal constraint networks: from consistency to controllabilities. J. Exp. Theor. Artif. Intell. 11(1): 23-45 (1999) Paul H. Morris, Nicola Muscettola: Temporal Dynamic Controllability Revisited. AAAI 2005: 1193-1198 Paul H. 
Morris, Nicola Muscettola, Thierry Vidal: Dynamic Control Of Plans With Temporal Uncertainty. IJCAI 2001

Formalisms
- Conditional temporal problems (CTP)
- Simple temporal problems with uncertainty (STPU)

Activating labels
The main idea of CTPs is to attach a label to each variable representing a time event. The variable will be executed iff the label is true.
- Proposition: a variable which can be true or false: A, B, C, …
- Literal: a proposition or its negation: A, ¬A
- Label: a conjunction of literals. Ex: ABC (A and B and C)
- Inconsistent pair of labels: Inc(l1,l2)=true iff l1∧l2 is unsatisfiable. Ex: l1=AB, l2=¬BC
- Consistent pair of labels: not Inc(l1,l2). Ex: l1=AB, l2=C
- l1 subsumes l2: l1 implies l2. Ex: ABC subsumes AC
- Label universe: the set of all possible labels defined on a set of propositions P (also denoted P*)

Conditional temporal problems
A CTP is a tuple <V,E,L,OV,O,P>:
- P: finite set of propositions
- V: set of variables (representing time-points) {x,y,z,…}
- E: set of constraints defined on V
- L: V → P* attaches a label to each node, e.g. L(x)=AB
- OV ⊆ V: set of observation nodes
- O: P → OV: a bijection attaching each proposition to an observation node; the observation node provides (when executed) the truth value of the proposition. Ex: O(A)=x, O(B)=y.

Types of CTPs
A CTP <V,E,L,OV,O,P>:
- is a CSTP if the constraints in E are of STP type
- is a CTCSP if the constraints in E are of TCSP type
- is a CDTP if the constraints in E are of DTP type

Example
We want to go skiing, either at Park City or at Snowbird, starting from home. If we go to Snowbird we want to get there after 1 p.m.
for the discount If we go to Park City we want to be at point C before 11 am due to traffic The road between point B and Snowbird and/or the road between C and Park City might be impassable due to snow Park City Snowbird B C Home CSTP corresponding to the example [0,11] A A A A [0,0] BCs BCf CPs CPs [1,1] [1,1] [2,2] X0 HBs HBf O(A) [1,1] [0,0] BSs BSf A A [13,+∞] Default interval [0,+∞] Scenarios Execution scenario s: label partitioning the variables in V into two sets V1={x∈V|Sub(s,L(x))} executed vars V2={x∈V|Inc(s,L(x))} not executed vars Scenario projection of CTP Q=<V,E,L,OV,O,P> and secenario s, is the non conditional temporal problem Pr(s)=<V1,E1>, where V1= {x∈V|Sub(s,L(x))} E1= {(x,y)∈E|x,y ∈V1} Example CTP Pr(ABC) Pr(ABC) AB AB [1,1] [1,1] A w A w y AB y [1,1] [1,1] true [1,1] u true true X0 AC X0 X0 [1,1] [1,1] v [1,1] z AC z AC A A [1,1] [1,1] q q O(A)=X0 O(B)=y O(C)=z Equivalent scenarios Two execution scenarios are equivalent if they induce the same partition of the nodes Scenario equivalence is a equivalence relation (rifl, sym . and transit.), defining corresponding equivalence classes Minimum execution scenario: minimum scenario w.r.t. the number of propositions within its equivalence class Scenarios equivalence: example CTP Pr(ABC) Pr(ABC) Pr(AB) AB A w AB AB AB y AB w A A w w A true u y y y X0 AC true true true v X0 X0 X0 z AC A q Pr(ABC) ≡ Pr(ABC) ≡ Pr(AB) O(A)=X0 O(B)=y AB is minimum O(C)=z Consistency Notions For CTPs 3 notions of consistency are defined: Strong Consistency: if whatever the observations will be the plan will be consistent Dynamic Consistency: given a partial assignment and a set of observations up to a certain time, if they are consistent, then the assignment can be extended to a consistent complete one, no matter what the future observations will be. Weak Consistency: given any scenario, its projection is consistent Formal definitions Schedule T of CTP Q: T: V ℝ, time assignment to nodes in V. 
T(x): the time assigned by schedule T to node x.
Execution strategy St: SC → T, where SC is the set of scenarios and T the set of schedules: it associates a schedule to each scenario.
Viable execution strategy: St is viable if, for every scenario s in SC, the schedule St(s) is consistent.
Observation history of node x w.r.t. scenario s and schedule T, H(x,s,T): the set of observation outcomes in s obtained before T(x).

Schedules and Strategies: example
(Figure: a strategy St with St(s)=T for every scenario s ∈ {AB, A¬B, ¬AC, ¬A¬C}, where the schedule T assigns X0=0, y=1, z=1, u=2, v=2, w=2, q=2; here O(A)=X0, O(B)=y, O(C)=z, and H(w,AB,T)={O(A)=true, O(B)=true}.)

Strong Consistency
A CTP is Strongly Consistent iff there exists a viable execution strategy St such that, for every pair of scenarios s1 and s2 and for every node x executed in both scenarios, [St(s1)](x)=[St(s2)](x).

Weak Consistency
A CTP is Weakly Consistent iff there exists a viable execution strategy St for it; that is, for every scenario s, its projection Pr(s) is consistent (in the STP, TCSP, or DTP sense).

Dynamic Consistency
A CTP is Dynamically Consistent iff there exists a viable execution strategy St such that, for every pair of scenarios s1 and s2 and for every node x:
if Con(s2,H(x,s1,St(s1))) or Con(s1,H(x,s2,St(s2))) then [St(s1)](x)=[St(s2)](x).

Consistency of the skiing example
The skiing CSTP is not strongly consistent and not dynamically consistent, but it is weakly consistent.

Relation between consistency notions
Strong Consistency ⇒ Dynamic Consistency ⇒ Weak Consistency

Testing strong consistency (1)
A CTP <V,E,L,OV,O,P> is strongly consistent iff the non-conditional temporal problem (V,E) is consistent.
Proof
Assume the CTP is Strongly Consistent.
Then exists a viable execution strategy St such that, for every pair of scenarios s1 and s2, and for every node x executed in both scenarios [St(s1)](x)=[St(s2)](x) Thus for any node x, the values assigned to x is unique, regardless the scenario. Denote such value with T(x) Assignment T(x) satisfies every subset of constraints, Ei, corresponding to any scenario si. Thus, it satisfies the union i Ei=E If the temporal problem (V,E) is consistent, then it has a least a consistent assignment to all the nodes in V. Denote such assignment with T. Posing [St(si)](x)=T(x), for all x in V, and every scenario si, we obtain a viable execution strategy. In fact, T(x) is consistent with the whole set of constraints in E and, thus, it is consistent with any of its subsets corresponding to scenarios Testing strong consistency(2) Thus given a CTP, to test if it is strongly consistent, we forget the labels and: if it is CSTP use Floyd-Warshall (polynomial) If it is a CSTP or a CDTP use Epilitis (exponential) Complexity Weak Consistency (1) Testing Weak Consistency is Co-NP-Complete Proof Reduction SAT Co-problem of Weak Cons. of CSTP The co-pbl of Weak Consistency is to find a projection that is inconsistent Generic SAT problem with Boolean variables B={x,…,y} Clauses : Ci=(x v … v y v z v … v w) i=1…K Reduces to CSTP=<V,E,L,ON,O,P> Propositions P=B={x,…,y} Variables V For each Ci For each occurrence of x in Ci if x occurs V=V {X}, L(X)=x if x occurs V=V {X}, L(X)=x Clause(Ci)= nodes of the CSTP corresponding to clause Ci Since we are testing weak consistency, O and ON don’t matter Constraints E Add a constraint between each variable, X, in Clause(Ci) and each variable, Y, in Clause(Ci+1) (resp. 
CK and C1), with consistent labels, of the form Y − X = −1, i.e. with interval [−1,−1].
Complexity of Weak Consistency (2)
[Figure: the CSTP corresponding to a SAT formula with four clauses over x, y, z (three 3-literal clauses and one 2-literal clause over y and z): nodes X1–X3 for the occurrences of x, Y1–Y4 for y, Z1–Z4 for z, with all STP constraints having interval [−1,−1].]
Complexity of Weak Consistency (3)
Assume SAT has a solution {x=T, …, y=T, z=F, …, w=F}. Since it is a solution, it makes at least one literal true in each clause, so at least one time point in each Clause(Ci) is activated. Let Li be a time point activated in Clause(Ci). By construction there is a constraint between Li and Li+1, so L1, L2, …, LK, L1 form a negative cycle. Thus, if SAT has a solution, the CTP is not weakly consistent.
Complexity of Weak Consistency (4)
A complete assignment to the SAT variables corresponds to a CTP scenario. If SAT has no solution, then for every assignment there is at least one clause Ci that is not satisfied, so none of the time points of Clause(Ci) are activated. By construction, any cycle has to pass through at least one node in each Clause(Ci); thus the CTP has no negative cycle in any scenario. Hence, if the SAT problem has no solution, the CTP is weakly consistent.
Testing Weak Consistency
A brute-force algorithm is to consider every possible scenario and test the consistency of the corresponding temporal problem (STP, TCSP, or DTP). Obviously, equivalent scenarios yield the same temporal problem, so we identify equivalent scenarios and pick one scenario for every class, possibly the minimal one.
Computing an approximation of the minimal scenarios
Technique based on building a scenario tree.
MakeScenarioTree
Input: set of propositions P, current label L, set of labels Labels.
First call: P = all propositions, L = true, Labels = all labels in the problem.
1. Node = new-tree-node(L)
2. Node.Left = nil
3. Node.Right = nil
4. If L does not subsume all labels in Labels
5.   p = Choose-next(P, L, Labels)
6.   LabelsLeft = {l ∈ Labels | Con(L∧p, l) ∧ notSub(L, l)}
7.   PLeft = propositions of LabelsLeft
8.   Node.Left = MakeScenarioTree(PLeft, L∧p, LabelsLeft)
9.
LabelsRight = {l ∈ Labels | Con(L∧¬p, l) ∧ notSub(L, l)}
10.  PRight = propositions of LabelsRight
11.  Node.Right = MakeScenarioTree(PRight, L∧¬p, LabelsRight)
12. endIf
13. return Node
Example
[Figure: the scenario tree built by the call MakeScenarioTree(P, true, Labels) for a CTP with P = {A, B, C} and six labels over the propositions A, B, and C.]
Testing Weak Consistency
WeakConsistency
Input: scenario-tree node T, CSTP C, STP S.
1. T.TimePoints = {nodes n of C | L(n) = T.L}
2. S' = S plus the nodes in T.TimePoints and all constraints defined on such nodes
3. If S' is inconsistent
4.   return false
5. Else
6.   return WeakConsistency(T.Left, C, S') ∧ WeakConsistency(T.Right, C, S')
7. EndIf
Dynamic Consistency
A CTP is Dynamically Consistent iff there exists a viable execution strategy St such that, for every pair of scenarios s1 and s2 and for every node x: if Con(s2, H(x,s1,St(s1))) ∨ Con(s1, H(x,s2,St(s2))) then [St(s1)](x) = [St(s2)](x).
Properties
N(x,s): node x in Pr(s).
Diff_s2(s1): the set of nodes {N(v,s1)} where v provides an observation whose outcome in s1 differs from the corresponding one in s2.
Example
[Figure: for s1 = AB and s2 = A¬B in the CTP with observations O(A)=X0, O(B)=y, O(C)=z, the projections Pr(s1) and Pr(s2) contain N(X0,s1), N(y,s1), N(w,s1) and N(X0,s2), N(y,s2), N(u,s2) respectively; Diff_s2(s1) = {N(y,s1)} and Diff_s1(s2) = {N(y,s2)}.]
Properties for Dynamic Consistency
Con(s2, H(x,s1,St(s1))) holds iff no node providing an observation of a proposition on which the two scenarios differ has occurred yet. Formally:
Con(s2, H(x,s1,St(s1))) ↔ ∀ N(v,s1) ∈ Diff_s2(s1): N(x,s1) ≤ N(v,s1) (where ≤ means "precedes in time").
Similarly for Con(s1, H(x,s2,St(s2))).
The DC(x,s1,s2) condition
A CTP is Dynamically Consistent iff for every pair of scenarios s1 and s2, and for every node x:
( ∀ N(v,s1) ∈ Diff_s2(s1): N(x,s1) ≤ N(v,s1) ) OR ( ∀ N(v,s2) ∈ Diff_s1(s2): N(x,s2) ≤ N(v,s2) )
implies N(x,s1) = N(x,s2).
Testing Dynamic Consistency
DCons
Input: CTP Ct
1. Compute all projected scenarios Pr(si) = ⟨Vi, Ei⟩
2. Build the DTP D = ⟨V = ∪i Vi, C = ∪i Ei⟩
3. For each pair of scenarios s1, s2
4.   For each v ∈ Pr(s1) ∩ Pr(s2)
5.
C = C ∧ DC(v,s1,s2)
6. If D is consistent return Dynamic-Consistent
7. Else return non-Dynamic-Consistent
Example
[Figure: the earlier CTP with nodes X0, BCs, BCf, CPs, HBs, HBf, BSs, BSf, Z, labels A and ¬A, and intervals [0,11], [0,0], [1,1], [2,2], [13,+∞]. For s1 = A and s2 = ¬A: Diff_¬A(A) = {N(Z,A)} and Diff_A(¬A) = {N(Z,¬A)}; since N(HBs,A) ≤ N(Z,A), the DC condition forces N(HBs,A) = N(HBs,¬A), and similarly for HBf.]
Example
[Figure: a CTP with observations O(A)=x, O(B)=y and the four scenarios s1 = AB, s2 = A¬B, s3 = ¬AB, s4 = ¬A¬B; each projection Pr(si) has its own copies X0i, xi, yi of the shared nodes, an edge x → z means z − x ∈ [1,1], and among the added DC constraints are C1: y2 − x1 ∈ [−∞,−1] ∨ x2 − x1 ∈ [0,0] and C2: y1 − x2 ∈ [−∞,−1] ∨ x2 − x1 ∈ [0,0].]
Comments
The complexity is the same as that of solving a DTP. How big is this DTP? It can have up to O(|V|·|SC|) variables, and |SC| can be exponential in |P|; moreover, an exponential number of new disjuncts can be added.
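The brute-force weak-consistency test described above (enumerate every scenario and check the consistency of the corresponding projection) can be sketched in the same style. Here labels are represented as tuples of (proposition, value) literals, a node is executed in a scenario iff its label is satisfied, and all names are illustrative rather than taken from the slides:

```python
from itertools import product

def stp_consistent(n, constraints):
    """Floyd-Warshall negative-cycle test; (i, j, a, b) means a <= Xj - Xi <= b."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, a, b in constraints:
        d[i][j] = min(d[i][j], b)
        d[j][i] = min(d[j][i], -a)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return all(d[i][i] >= 0 for i in range(n))

def weakly_consistent(propositions, nodes, constraints):
    """Brute-force weak consistency of a CSTP.

    nodes: dict node-name -> label, a tuple of (proposition, value) pairs
           (the empty tuple is the label "true").
    constraints: list of (u, v, a, b) meaning a <= Xv - Xu <= b.
    The CSTP is weakly consistent iff every scenario's projection
    (the executed nodes plus the constraints among them) is consistent.
    """
    for values in product([True, False], repeat=len(propositions)):
        scenario = dict(zip(propositions, values))
        # A node is executed iff the scenario satisfies its label.
        executed = [n for n, label in nodes.items()
                    if all(scenario[p] == v for p, v in label)]
        index = {n: i for i, n in enumerate(executed)}
        projection = [(index[u], index[v], a, b)
                      for u, v, a, b in constraints
                      if u in index and v in index]
        if not stp_consistent(len(executed), projection):
            return False
    return True
```

Because the loop ranges over all 2^|P| scenarios, this exhibits exactly the exponential blow-up noted in the complexity discussion; pruning equivalent scenarios via the scenario tree is what MakeScenarioTree adds on top of this.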