					Constraint Propagation

1. Motivating examples (P. Winston)

1.1    Numerical constraint nets.

Any set of equations defines a numeric constraint net. In such a net, variables are
represented as variable boxes, while numerical operations on those variables are
represented as operator boxes (e.g. multiply boxes, adder boxes, etc.). The operator
boxes are linked to the variable boxes on which the operator needs to be performed.

In a numerical constraint net, values for variables can propagate (or flow) in various
directions. Similar to mathematical equations, which do not impose a direction on how the
equation is used procedurally, the data-flow in a numeric constraint net is not fixed. In
fact, many different procedural behaviors can be attached to the same net. This is
essentially one of the main characteristics of constraint solving in general: various
procedural behaviors can operate on the same set of constraints to allow fast
propagation of values to determine problem solutions.

Consider the constraint net associated with the three equations in the slide. If we assign
the value 3000 to the variable box A, this may activate a data-flow through the first multiply
box, instantiating the box for C to 3300. In turn, this value may now be propagated
further through the second multiply box, instantiating the box for D to 3630.

If, instead, we had started by giving the box for D the value 3630, then this
information could have propagated in the inverse direction, instantiating C and A to 3300
and 3000 respectively, by using the multiply boxes as divisors instead of multipliers.

Even more complicated propagation can occur, as we will illustrate further on.

1.2    Spreadsheets.

Spreadsheets are a (very weak) application of numerical constraint nets. Typically, a
spreadsheet can be used in 2 modes. In a first mode, the user identifies relevant
constants, variables and equations between the variables occurring in his problem. This
can be seen as defining an equation theory, or equivalently, defining a constraint net. In
the second mode, often referred to as the ‘what if’-mode, the spreadsheet propagates
the available values through the equations in order to compute values for the dependent
variables. This is similar to, but less powerful than, constraint propagation. In particular, it
is weaker because most spreadsheets do not allow propagation to occur in different
directions. It is usually impossible in a spreadsheet to assign a value to a field that
is already occupied by an equation. If that were possible, then, as in
the constraint nets, the equations might be usefully activated in an inverse
direction (e.g. changing multipliers into divisors, as in the previous example) to produce a
richer functionality.

1.3    More advanced numerical propagation.

Reconsider the constraint net of Section 1.1. Assume that we are given the information
that the value for variable A can be either 2000 OR 3000, while the value for variable D
can be either 3630 OR 4840. Another type of propagation can now take place. We can
select the possible value 4840 for variable D and try to check whether this value allows
us to construct a consistent solution for the entire net. Consistency here means that,
given the possible values we have put forward for the different variables in the net,
there is at least 1 such value for each variable in the net such that all equations become
satisfied in combination with the selected value 4840 for D.
Again, we propagate the value 4840 for D through the net and we observe that the
corresponding value obtained for A should be 4000, which is not one of the possibilities
we had for A. Thus, the value 4840 for D cannot be part of a solution of all equations,
given the restriction on the values for A. It may therefore be eliminated as a possibility
for D. Similarly, the possible value 2000 for A can be propagated through the multiply
boxes, allowing us to eliminate this value as a possibility for A, due to inconsistency with
the remaining value (3630) for D. At this point, no further value elimination can occur,
since both A and D have only 1 possible value left. Propagation of any of these two values
will show that they are indeed consistent, and we have reached a solution.
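
The elimination steps above can be sketched in a few lines of Python. This is only an illustration: the equations themselves are on the slide, so the sketch assumes that each multiply box multiplies by 1.1 (which matches the values 3000, 3300 and 3630 used in the text). The names FACTOR and prune are hypothetical.

```python
from fractions import Fraction

# Assumption: each multiply box multiplies by 1.1 (inferred from 3000 -> 3300 -> 3630).
# Fractions are used so that the arithmetic is exact.
FACTOR = Fraction(11, 10)

def prune(domain_a, domain_d):
    """Keep only the values of A and D that are consistent with each other."""
    gain = FACTOR * FACTOR          # going from A via C to D multiplies by 1.21
    kept_a = {a for a in domain_a if a * gain in domain_d}
    kept_d = {d for d in domain_d if d / gain in domain_a}
    return kept_a, kept_d

print(prune({2000, 3000}, {3630, 4840}))   # ({3000}, {3630})
```

The value 4840 is dropped because it would require A = 4000, and 2000 is dropped because it would require D = 2420.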



1.4 Initial conclusions.

Detecting, representing, exploiting and propagating the natural constraints of a problem
you wish to solve is considered the most crucial part of any form of problem solving, be
it in software engineering or in knowledge engineering. In the remainder of this chapter,
we will first study constraint propagation problems in more technical detail. We will
formally introduce a fairly generic class of constraint problems. Then we will study a
collection of methods that can be used to propagate information through such
constraints. A first class of methods will be based on backtracking techniques. It will
turn out that there is a very rich class of algorithms and techniques, all variants of
standard backtracking schemes, which can provide efficient solutions to these problems.
Then, we study relaxation techniques, also called arc-consistency techniques. These are
usually not complete, in the sense that they are not always sufficient to generate a
solution, but they help very much in reducing the possibilities. Then, we study techniques
that combine backtracking techniques with relaxation techniques. Such hybrid
techniques turn out to be extremely useful and efficient in practice, especially in the
context of scheduling, rostering and planning problems. Next we look at some
applications. We investigate a problem in computer vision: how to provide a 3-dimensional
interpretation of a (2-dimensional) line drawing. It turns out that this is very naturally
described as a constraint problem. In particular, the solution that Waltz proposed for
this problem formed the basis for the first constraint propagation algorithm ever. It was
later generalized to other application domains. We also study a problem in understanding
the meaning of natural language sentences. Again, constraint propagation forms the key
to success (although we will only cover this application in high-level terms, not identifying
the formal representation of it as a constraint problem).



As a final remark, notice that even artificial neural networks can be
considered as a form of (numerical) constraint propagation networks. Numerical values in
the inputs to the neural network are propagated, using some multiply and adder boxes,
over intermediate nodes in the net to output values for the net. However, in the context
of neural nets, other issues than the methods of propagation are the important ones.

2. Constraint problem solving: introducing the notions. (Nadel)

2.1    Definition of a constraint problem.

On the slide, we give a formal definition of a constraint problem (also called consistent
labeling problem). It consists of: a finite set of variables; for each variable: a finite set
of possible values for that variable (referred to as the domain of the variable); for each
pair of variables: a constraint (or, otherwise stated, a relation) which should hold between the
values for those 2 variables.

A solution to a constraint problem is a selection of 1 value from the domain of each
variable, such that all the constraints are fulfilled with these selected values. The
problem we address here is to find efficient methods that produce solutions to
constraint problems. In some cases, we may be interested in techniques that compute
just 1 solution to such a problem. In other cases, we may want the techniques to generate
all solutions. In yet other cases, we may be interested in techniques that compute an
‘optimal’ solution, where optimal means that we are given some additional function,
defined on the variables, which should reach its maximum (or minimum) in the computed
solution.
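
As a minimal sketch of this definition (not the notation of the slide), a binary constraint problem can be held in a dictionary of domains plus one predicate that plays the role of all constraints c(zi,zj). The function name is_solution, the argument order of the predicate and the toy problem are assumptions made for illustration only.

```python
from itertools import combinations

def is_solution(assignment, domains, constraint):
    """True if every variable takes a value from its domain and every
    pair of variables satisfies the binary constraint."""
    in_domain = all(assignment[z] in domains[z] for z in domains)
    return in_domain and all(
        constraint(zi, assignment[zi], zj, assignment[zj])
        for zi, zj in combinations(sorted(domains), 2))

# Hypothetical toy problem: three variables that must take pairwise different values.
domains = {"z1": {1, 2, 3}, "z2": {1, 2, 3}, "z3": {1, 2, 3}}
different = lambda zi, vi, zj, vj: vi != vj

print(is_solution({"z1": 1, "z2": 3, "z3": 2}, domains, different))  # True
print(is_solution({"z1": 1, "z2": 1, "z3": 2}, domains, different))  # False
```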

The stated definition of a constraint problem is not as general as it could be. For
instance, we have restricted the attention to ‘binary’ constraint problems. This means: we
only consider constraints (relations) between pairs of variables. In principle, there can be
many practical problems in which there are natural constraints expressible only in terms
of 3 or more variables at once. The reason why we restrict to binary constraints only, is
that this will considerably reduce the notational complexity in our discussions, examples
and algorithms. However, most of the techniques we will discuss can be extended to more
general constraint problems (sometimes straightforwardly, sometimes with more
technical difficulty).
Another restriction is the one on the finiteness of the domains. This restriction delimits
the applicability of the methods that we will study to a specific class of problems: finite
domain constraint problems. There are other types. For instance, we could allow the
domains of the variables to range over the natural numbers, or the integers, or the
rational or real numbers. The techniques that we will discuss in this chapter are often not
(easily) extendible to such more general constraint problems. For instance, the
backtracking variants that we will study in the next section have no obvious counterpart
when moving to infinite domains. One way in which a lot of the techniques we study here
can be ported to infinite domains is to reason about finite sets of (possibly infinitely
large) intervals.
Just to illustrate the concept: consider the 3-equation example again and assume that we
know that the value for A must be in the range [2000,3000], while the value for D must
be in the range [3630,4840]. Just as in the earlier example,
the bounds on these intervals can be propagated through the constraint net (either from
D to A, or from A to D). Propagating the bounds for D backward in the net gives us the
resulting interval bounds 3000 and 4000 for A. Comparing these with the given interval
[2000, 3000] for A, we notice that only the value 3000 remains consistent. Thus: the
interval [3630,4840] contains only 1 point (3630) which is consistent with the possible
values for A. We can therefore reduce the possible values for D to just that point:
[3630,3630].
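
The interval reasoning above can be sketched as follows. As before, this assumes that each multiply box multiplies by 1.1, so that D = 1.21 * A; the helper names intersect and propagate are hypothetical, and intervals are represented as (low, high) pairs.

```python
from fractions import Fraction

FACTOR = Fraction(11, 10)   # assumed multiplier of each multiply box

def intersect(first, second):
    """Intersection of two closed intervals given as (low, high) pairs."""
    low, high = max(first[0], second[0]), min(first[1], second[1])
    if low > high:
        raise ValueError("the intervals are inconsistent")
    return low, high

def propagate(interval_a, interval_d):
    gain = FACTOR * FACTOR
    # Backward: the bounds on D imply bounds on A.
    interval_a = intersect(interval_a, (interval_d[0] / gain, interval_d[1] / gain))
    # Forward: the narrowed bounds on A imply bounds on D.
    interval_d = intersect(interval_d, (interval_a[0] * gain, interval_a[1] * gain))
    return interval_a, interval_d

a, d = propagate((2000, 3000), (3630, 4840))
print([int(x) for x in a], [int(x) for x in d])   # [3000, 3000] [3630, 3630]
```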

2.2    Examples: q-queens and confused q-queens.

We will illustrate most points and techniques on 2 famous toy examples: the q-queens
puzzle and the confused q-queens puzzle. Assume you are given a chess board of
dimensions q x q, where q is some integer number. The problem is to place a total number
of q different queen pieces on this chess board, in such a way that no 2 of these q queens
attack each other. Note that 2 queens on some board are said to attack each other if the
2 queens are either on the same row, or on the same column, or on the same diagonal of the
board. The slide shows an example of a solution for the q-queens puzzle, for q = 4
(abbreviated from here on as the 4-queens puzzle).
The confused q-queens puzzle is identical to the previous one, except that we
are now looking for placements of q queens on the board, such that every 2 queens do
attack each other. Again, some examples of solutions for dimension 4 are given on the
slide.

This being said, we haven’t actually defined any constraint problems as yet. In particular:
we didn’t specify what the variables are, what the domains are and what the constraints
are that link them. There are in fact a number of different constraint problems
associated to q-queens or confused q-queens, depending on the choice of the variables
and the domains. One possibility is to introduce 1 variable for each queen that needs to
be placed, and to associate to each of these variables the same domain: namely, the set
of all pairs of integers (n,m), where n denotes the row-number and m the column-number
of a possible position on the q x q-chess board. Each constraint between two variables zi
and zj would then take the form described on the slide. Note that the last line in this
constraint expresses that the 2 queens are not on the same diagonal.

Although this is a perfectly correct representation, we will use a different, slightly
better representation of the problem. To introduce it, note that in the q-queens problem,
it makes no sense to try combinations of queen placements in which 2 queens are on the
same row. We exploit this in our representation by assigning each queen (and its
corresponding variable) to a specific row from the start. Say that zi denotes the variable
that is associated to the queen on row i. Then, for each i, we define the domain of zi to be
the set {1,...,q}, that is to say: all the possible column-positions for the queen. This
representation (and the new form that the constraints take under it) is expressed on the
slide. The drawing shows a solution to the problem, which - in this representation - would
correspond to z1 = 2, z2 = 4, z3 = 1 and z4 = 3. A main reason for moving to this second
representation is that both the domains and the constraints are more easily expressed. A
(possibly even more important) second motivation is that the total number of possible
queen-placements has been drastically reduced (from (q x q)^q to q^q combinations), making
the problem easier to solve.
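
The second representation is easy to write down in code. The sketch below assumes the usual form of the constraint in this representation (different columns and different diagonals); the exact formulation is on the slide. The names queens_ok and solutions are hypothetical, and the enumeration is deliberately naive: it simply tries every combination of column positions.

```python
from itertools import product

def queens_ok(i, zi, j, zj):
    """The queens on rows i and j (in columns zi and zj) do not attack each other."""
    return zi != zj and abs(zi - zj) != abs(i - j)

def solutions(q, constraint):
    """All tuples (z1, ..., zq) of column positions satisfying every pairwise constraint."""
    rows = range(1, q + 1)
    return [z for z in product(rows, repeat=q)
            if all(constraint(i, z[i - 1], j, z[j - 1])
                   for i in rows for j in rows if i < j)]

print(solutions(4, queens_ok))   # [(2, 4, 1, 3), (3, 1, 4, 2)]
```

The first tuple is the solution z1 = 2, z2 = 4, z3 = 1, z4 = 3 mentioned in the text.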

Note that if we also move to this second representation for the confused q-queens
problem, then we change the conceptual problem of that puzzle. Indeed, in this second
representation, a number of solutions to the original formulation of the puzzle are no
longer solutions. Specifically, placing all the queens on the same row is no longer possible in
this representation. Still, we will consider this second representation as our formal
definition for the confused q-queens problem from now on.
As a final comment, the confused q-queens may seem like a trivial and dumb puzzle for
humans. Our graphical interpretation of the problem allows humans to come up with the
solutions very quickly. The reason why we want to study the puzzle anyway is that: 1) for
a general constraint solving algorithm, there is no real big difference between the
inequality constraints of the q-queens problem and the corresponding equality
constraints in the confused q-queens one. In other words, the amount of search or
computation involved (for general-purpose algorithms) should be roughly the same; 2) the
q-queens puzzle has a very irregular behavior. For example, the puzzle has 10 solutions
for q = 5 but only 4 solutions for q = 6. The
number of solutions is not proportional to the dimensions of the problem. This is a
highly undesirable feature if we aim to study the efficiency of algorithms working on this
problem, because the efficiency will behave irregularly on scale-ups of the problem
dimensions. Luckily, confused q-queens does not suffer from this problem. The confused
q-queens puzzle always has q + 2 solutions: one for each column, plus the positioning on
the 2 main diagonals (see the slide). There is only 1 exception to this: the case in which q
= 3. In this case, we get 4 additional solutions (see slide).
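
The solution counts claimed above can be checked by brute force for small q. The sketch assumes that, in the row-based representation, two queens attack each other exactly when they share a column or a diagonal; the names confused_ok and count_confused are hypothetical.

```python
from itertools import product

def confused_ok(i, zi, j, zj):
    """The queens on rows i and j (in columns zi and zj) attack each other."""
    return zi == zj or abs(zi - zj) == abs(i - j)

def count_confused(q):
    """Number of placements in which every 2 queens attack each other."""
    rows = range(q)
    return sum(
        all(confused_ok(i, z[i], j, z[j]) for i in rows for j in rows if i < j)
        for z in product(range(1, q + 1), repeat=q))

for q in (3, 4, 5):
    print(q, count_confused(q))   # 3 9, then 4 6, then 5 7
```

For q = 4 and q = 5 the count is q + 2, while q = 3 gives 9, that is, q + 2 plus the 4 additional solutions.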

2.3 Representing the search.

A last item of discussion, before we can address some problem solving methods, is on how
we will represent the search for solutions. There are 3 different ways of representing
the search space for constraint problems: the OR-tree representation, the network
representation and the domain-array representation.

2.3.1. The OR-tree representation.

Let z1,…, zn be the variables in our problem, while aij denotes the j-th value in the domain
of zi and c(zi,zj) is the binary constraint between zi and zj. We assume that an order on
the variables z1,…, zn has been fixed (in particular, assume that it corresponds to the
order of their indices). We also fix an order on the values in each domain, say ai1,…,aim.

The search space is represented by a tree. The root of the tree has one branch for each
value in the domain of z1. The leaves of these initial branches are labeled by the
different values in the domain of z1. Then, each of these initial leaves branches once
again: again, one branch for each value in the domain of z2. At this point, at every leaf of
the current tree, we verify the value of the constraint c(z1,z2). The values resulting
from the evaluation of this constraint are added as additional labels to these leaves. In
particular, the values can be either ‘true’, represented as v (for victory?), or ‘false’,
represented as x.
Next, all the leaves for which the constraint value is v are further extended. Note that
there is no point in extending the leaves with label x, because the values for z1 and z2 on
the corresponding branch already violate the first constraint c(z1,z2). Thus, further
extending these assignments can never lead to a solution of the entire constraint
problem.
So, the next step will create branchings for all nodes labeled by v, constructing one
branch for each value in the domain of z3. Again, at the resulting leaves, we test the
constraints c(z1,z3) and c(z2,z3), which are all the constraints that relate the previously
encountered variables with the new variable z3. Again, the next layer will only be built for
those leaves for which both results for these constraint tests are v. This process
continues up to the variable zn, including constraint tests for all the constraints c(z1,zn)
up to c(zn-1,zn) in the final leaves. Those final leaves in which all the final constraint tests
result in v represent solutions to the problem. It suffices to combine all the values for
the variables on the branch leading to such a leaf to get the solution. See the
corresponding slide for the general layout.

Note that this representation does not fix a search strategy. It is only a representation
of the search space that needs to be searched. We can still use various strategies to
construct this tree (depth-first, breadth-first, etc.) in actually searching for a solution.
But the techniques using this representation will all be of the ‘backtracking-type’.
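
To make the OR-tree concrete, the following sketch enumerates its nodes for the 4-queens problem. It is a simplification of the layout on the slide: each node carries a single combined label ('v' if all its constraint tests succeed, 'x' otherwise) instead of one label per constraint, and the nodes for the first variable are labeled 'v' because there is nothing to test. The function names and the 0-based depth numbering are assumptions.

```python
def queens_ok(i, zi, j, zj):
    return zi != zj and abs(zi - zj) != abs(i - j)

def or_tree(domains, constraint, prefix=()):
    """Yield (partial assignment, label) for every node below `prefix`."""
    depth = len(prefix)
    if depth == len(domains):
        return
    for value in domains[depth]:
        node = prefix + (value,)
        # Test the new value against all previously assigned variables.
        ok = all(constraint(j, node[j], depth, value) for j in range(depth))
        yield node, "v" if ok else "x"
        if ok:                      # nodes labeled 'x' are never extended
            yield from or_tree(domains, constraint, node)

nodes = list(or_tree([range(1, 5)] * 4, queens_ok))
print([node for node, label in nodes if label == "v" and len(node) == 4])
# [(2, 4, 1, 3), (3, 1, 4, 2)]
```

The generator happens to visit the nodes depth-first, but the same set of nodes could be produced in any other order.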

2.3.2 The network representation.

In a second representation, we construct a network. The nodes in the network
correspond to the variables in the constraint problem and are labeled by these variables.
An additional label placed on each of these nodes is the domain of the variable. Every pair of
nodes is connected by an arc. Each arc is labeled by the constraint c(zi,zj) which is
imposed on the variables zi and zj that label its two nodes.

The network representation is intended to be used with a ‘relaxation’ or ‘arc consistency’
method to solve the constraint problem. Roughly, relaxation proceeds as follows. Select a
value, say aij, in a domain of some variable, say zi. Select also a constraint involving that
same variable, say c(zi,zk). Now check whether there exists any value in the domain of zk, say
akl, such that c(aij, akl) is true. If there does not exist any value in zk’s domain for which
this is the case, then the value aij is inconsistent with all the remaining values for zk.
Thus: aij cannot occur in a solution of the problem. Then: remove aij from the domain of
zi. We can now select a new value in some domain, and a constraint, and continue in the
same way as above. The process is continued until no more inconsistent values can be
removed from any domain. We return to relaxation or arc consistency later on.
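
The relaxation loop just described might look as follows. This is a naive sketch, not one of the arc-consistency algorithms discussed later: it simply sweeps over all constraints, in both directions, until nothing changes. The function name relax, the data layout and the toy problem (three variables with x < y < z) are assumptions for illustration.

```python
def relax(domains, constraints):
    """domains: dict variable -> set of values (modified in place).
    constraints: dict (x, y) -> predicate(value_of_x, value_of_y)."""
    arcs = dict(constraints)
    for (x, y), pred in constraints.items():
        # Each constraint must also be usable in the opposite direction.
        arcs.setdefault((y, x), lambda vy, vx, p=pred: p(vx, vy))
    changed = True
    while changed:
        changed = False
        for (x, y), pred in arcs.items():
            # Values of x for which no value of y satisfies the constraint.
            unsupported = {vx for vx in domains[x]
                           if not any(pred(vx, vy) for vy in domains[y])}
            if unsupported:
                domains[x] -= unsupported
                changed = True
    return domains

less = lambda a, b: a < b
result = relax({"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}},
               {("x", "y"): less, ("y", "z"): less})
print(result)   # {'x': {1}, 'y': {2}, 'z': {3}}
```

In this toy problem relaxation alone happens to leave exactly one value per variable. In general it only reduces the domains.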

2.3.3 The domain-array representation.

This third representation is a syntactic variant of the previous one. For each variable
there is an array including all the values of the domain of that variable as elements. For
any two arrays, there is an arc connecting the arrays, labeled by the constraint between
these variables.
It should be clear that this is essentially the same representation as the previous one.
Again, relaxation is used as the problem solving technique.

3. Backtracking, backjumping and backmarking (Nadel).

3.1 The basic backtracking scheme.

Basic backtracking is the depth-first, left-to-right traversal of the OR-tree
representation of the problem. The initial part of the traversal of the tree for the
4-queens problem is drawn on the slide. This initial part is traversed in the usual way:
depth-first and backtracking from left to right.
The algorithm is shown on the next slide. It is described in a Pascal-like syntax.
Backtr(<input>) is a recursive procedure. <input> is a number that corresponds to the
depth in the OR-tree that needs to be dealt with next. Initially, Backtr is called with
<input> = 1 : we need to construct the backtrack search, starting with the first level in
the tree. To that end, we need to construct 1 branch for every possible value in the
domain of z1. Thus, we get a ‘For’-loop in which we assign the values a11 to a1n1 to z1 (in
that order). The next part of the algorithm (including the ‘While’-loop) checks whether
all constraints involving the current value for z1 are consistent with the values already
assigned to previous variables (in other words, whether all constraints c(z1,..) hold).
Because z1 is the first variable, no checking needs to be done. There are no constraints
c(z1,zj), with j < 1. For z1, Consistent will therefore be true after the ‘While’-loop. If 1 is
not the last variable, then we will increase the depth-variable to 2 and move on to
Backtr(2).
At later stages, the ‘While’-loop will check all constraints c(zi,zj), where zi is the variable
of the current depth and zj is any previously encountered variable. Only if all checks
evaluate to true will Consistent remain true, thus allowing the algorithm to go to
the next level in the tree.
If the maximal depth is reached, the values for all the variables are returned.
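
A Python rendering of this scheme is sketched below. It follows the description above rather than the Pascal-like code on the slide, which is not reproduced here: depths are 0-based, the procedure collects all solutions instead of returning the first one, and the names backtr, visit and queens_ok are assumptions. The sketch also assumes that there is at least one variable.

```python
def queens_ok(i, zi, j, zj):
    return zi != zj and abs(zi - zj) != abs(i - j)

def backtr(domains, constraint):
    """Depth-first, left-to-right traversal of the OR-tree; returns all solutions."""
    n = len(domains)
    values, found = [None] * n, []

    def visit(depth):
        for v in domains[depth]:
            values[depth] = v
            consistent, j = True, 0
            # Check the new value against every previously assigned variable.
            while consistent and j < depth:
                consistent = constraint(j, values[j], depth, v)
                j += 1
            if consistent:
                if depth == n - 1:
                    found.append(tuple(values))   # maximal depth reached
                else:
                    visit(depth + 1)

    visit(0)
    return found

print(backtr([range(1, 5)] * 4, queens_ok))   # [(2, 4, 1, 3), (3, 1, 4, 2)]
```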

There are a number of problems with the standard backtrack algorithm. All of these
problems have to do with a notion called ‘thrashing’. We explain the notion in the next
slides and show how it can be dealt with by changing the algorithm.

3.2 Backjumping.

Consider the part of the OR tree traversed by the standard backtrack algorithm for the
confused 4-queens puzzle, shown on the next slide. The value for z1 is fixed to 2 for this
entire segment of the search space, while we backtrack over the values for z2, z3 and z4.
Consider the point in the tree where the algorithm assigns the value 3 to z3. At this
point, it has just tried the value 2 for z3 and got some surprising (and extremely
informative) results for the consistency tests. So look back at the values it obtained for
the constraint-checks for the assignment 2 to z3. It appears that all descending
assignments to z4 made the constraints fail. But, the constraints didn’t fail in just any
kind of way: there is something special about the failure. Notice that all the tests failed
already for the first two constraints, c(z1,z4) and c(z2,z4). The constraint c(z3,z4)
wasn’t even tried! This has one very important implication: the reason why all assignments
to z4 caused failure of the constraints has nothing to do with the current value of z3!
The values of z1 and z2 were already completely incompatible with any possible value for
z4. This means that backtracking to try the next value for z3 makes no sense at all. If we
backtrack to the next value for z3, without changing the values for z1 or z2, then the
same constraint checks for z4 will fail at exactly the same locations as for the previous
value for z3. You can see this in the drawing: the tests for c(z1,z4) and c(z2,z4) fail again
under the assignment z3 = 4, and at exactly the same locations. Note that we don’t get
the same duplicated tests for z3 = 3 because this assignment simply fails even earlier: we
don’t even get the chance here to see how it would fail for z4-assignments.
In conclusion, we know in advance that the indicated part of the tree represents a
completely useless computation. Another, completely similar redundant part of the tree
is more to the left of the drawing, again related to assignments to z3.
The behavior we notice here is a form of ‘thrashing’. Thrashing means that, after
detecting that a certain assignment fails to produce a solution, you are spending time
resetting values for a variable that had nothing to do with the reason for the failure.
Evidently, if computation proceeds, the failure will occur again, (at the latest) at the
point of occurrence of the previous failure.
What we really would want to do in the examples, is not to backtrack to the next value of
z3, but to backtrack to a higher level: namely, the last (deepest) level that was involved in
the reason for the failure. In both examples shown in the drawing, this last (deepest)
level was that of z2. So, the algorithm would need to ‘jump’ back to a higher level.

In order to achieve this in an improved version of the backtrack algorithm, we need to
keep track of some extra information: the ‘deepest fail-level’ (also
referred to in this text as the ‘backjump-depth’). This notion is explained on the next slide.
Assume we have completely checked all the different possibilities for the assignment of
one particular variable, say zi, at one particular location in the tree. Also assume that all
the assignments caused failure of one of the constraint checks (so: they all end in an ‘x’).
Now, let c(zk,zi) be the deepest of these constraint checks that failed. This means:
there is no zj, with j>k, such that c(zj,zi) failed (no check beyond zk was even reached). Then, we define the backjump-depth at
that point in the tree to be k. Obviously, backtracking will now need to occur, because all
alternatives for zi led to failure. When returning to the level of zi-1, we will check
whether i-1 is equal to backjump-depth. If it is, we take the next value for zi-1 and
proceed. If i-1 > backjump-depth, then we backtrack further to i-2 and check whether
this is the backjump-depth. We continue going up the tree, until we reach the
backjump-depth, before we restart assigning values. Note that the backjump-depth is
trivially smaller than i.

On the next slide we show the backjump algorithm. The structure of the algorithm is
completely the same as that of Backtr. There are 2 new variables that are used to
compute and pass on the backjump-depth. The variable checkdepthk stores the depth at
which the constraint checks failed for 1 particular branch (= one particular value) for the
current variable. We compute the maximum over all the different checkdepthk values:
this is the deepest level at which the constraints failed, considering all the different
branches (= all possible values) for the current variable. The variable jumpback records
this maximum. It is another name for backjump-depth. Note that if there is a value for
which none of the constraint checks fails, then the corresponding checkdepthk is set to
i-1, the depth of the variable that precedes the current variable zi. Consequently, jumpback
is also set to i-1 in this case.

A final change with respect to Backtr is that jumpback is included as an extra argument
to the procedure. Note that the declaration ‘VAR’ in front of the argument ‘jumpback’ for
BackJ indicates that this argument is (also) an output variable. Any assignment made to
that variable in 1 particular invocation of the procedure will be returned to the previous
invocation of it. This is used in the statement ‘If jumpback < depth then RETURN’. It
means: if we just backtracked to a previous level and jumpback is still smaller than that
level, backtrack even more to the next previous level.
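
The idea can be sketched in Python as follows. This is not the BackJ procedure of the slide: instead of a VAR argument, each invocation returns its jumpback value to its caller, depths are 0-based (so -1 stands for "above the first variable"), and all solutions are collected. The names backj, visit and confused_ok are assumptions.

```python
def confused_ok(i, zi, j, zj):
    return zi == zj or abs(zi - zj) == abs(i - j)

def backj(domains, constraint):
    n = len(domains)
    values, found = [None] * n, []

    def visit(depth):
        jumpback = -1                     # deepest fail level seen at this node
        for v in domains[depth]:
            values[depth] = v
            checkdepth = depth - 1        # kept if every check succeeds
            consistent = True
            for j in range(depth):
                if not constraint(j, values[j], depth, v):
                    checkdepth, consistent = j, False
                    break
            jumpback = max(jumpback, checkdepth)
            if consistent:
                if depth == n - 1:
                    found.append(tuple(values))
                else:
                    returned = visit(depth + 1)
                    if returned < depth:  # the failure was caused higher up:
                        return returned   # skip the remaining values here
        return jumpback

    visit(0)
    return found

print(backj([range(1, 5)] * 4, confused_ok))
# [(1, 1, 1, 1), (1, 2, 3, 4), (2, 2, 2, 2), (3, 3, 3, 3), (4, 3, 2, 1), (4, 4, 4, 4)]
```

As soon as one value of the current variable passes all its checks, jumpback equals depth - 1, so the procedure falls back to ordinary backtracking at that node.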

3.3 Backmarking.

Backjumping solves the thrashing problem of backtracking only in part. There is still
considerable redundancy left. Consider the example on the confused 4-queens puzzle on
the slide. Look at the results of the constraint checks for the assignments to z3 rooted
at the assignment z2 = 1 and compare them with the results of the constraint checks for
the same assignments to z3 rooted at the assignment z2 = 2. All the outcomes of these
checks are identical in these two cases. Even for the results of those same checks, but
now rooted at the assignment z2 = 3, we again get the same results. The reason is
basically the same as in the problem we discussed for backjumping. At z2 = 1, we get the
results: x, v, x, v for the constraint c(z1,z3). Then, we backtrack over z2 and give it the
value z2 = 2. Note though that z2 does not occur in the constraint c(z1,z3). Therefore,
assigning the values 1,2,3 and 4 again to z3 MUST result in the same results x, v, x, v for
the constraint c(z1,z3) again, because the value for z1 has not changed! The same
problem occurs in the assignments to z4 (even at different levels). All these tests are
redundant, because we could predict the outcome of the checks without computing them
again.

There are two possible approaches to avoid computing these redundant checks. The first
solution is to use tabulation. Tabulation is a general technique that performs ‘lemma
generation’ and ‘lemma application’. Lemma generation means that a table is constructed
in which all the different constraint checks performed so far are kept as entries,
together with the result of the checks. Lemma application means that, when a new
constraint check needs to be computed, the algorithm first checks whether this check is
already present in the table. If it is, the result from the table is used for the check
(without computing the check again). Otherwise, the check is computed and added to the
table with its result.
Tabulation only partly solves our problem. It may improve the search to some extent, but it
causes the overhead of checking and building tables. In some cases it may impose high
storage requirements. Most importantly: it does not avoid looking at
redundant checks altogether. It merely avoids recomputing them.
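
Tabulation of constraint checks can be sketched with an ordinary dictionary. The wrapper below is a hypothetical helper (the names tabulated, check and stats are assumptions): it performs lemma generation when a check is first computed and lemma application on every later request for the same check.

```python
def confused_ok(i, zi, j, zj):
    return zi == zj or abs(zi - zj) == abs(i - j)

def tabulated(constraint):
    """Wrap a constraint so that every distinct check is computed only once."""
    table = {}
    stats = {"computed": 0, "looked_up": 0}

    def check(i, vi, j, vj):
        key = (i, vi, j, vj)
        if key in table:                  # lemma application
            stats["looked_up"] += 1
        else:                             # lemma generation
            stats["computed"] += 1
            table[key] = constraint(i, vi, j, vj)
        return table[key]

    return check, stats

check, stats = tabulated(confused_ok)
check(1, 2, 3, 4)     # computed and stored in the table
check(1, 2, 3, 4)     # answered from the table
print(stats)          # {'computed': 1, 'looked_up': 1}
```

Note that the second call still costs a table lookup, which is exactly the remaining overhead discussed above.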

Backmarking was introduced as a variant of backtracking to avoid computing these
redundant checks completely. Redundant checks of the type illustrated in the drawing
will simply be jumped over by the algorithm. It saves time at virtually
no overhead, while it only requires modest space overhead. The space overhead consists
of 2 new variables, both arrays, that need to be maintained. We explain their meaning
below.

Checkdepth(k,l) is a 2-dimensional array of sizes k: 1->n, where n is the number of
variables in the problem, and l:1->M, where M is the maximum of the sizes of the domains
of the variables. So, for each variable zk, and for each possible assignment to that
variable zk, say akl, Checkdepth will contain a value at its position (k,l). What is the value
of this position? It is precisely the value of the variable checkdepthl that we computed in
the Backjump algorithm for that particular variable and value. In other words, it is the
depth of the deepest constraint check that was performed when we last assigned the
value akl to the variable zk. See the slide for a (symbolic) illustration. The depth of
checking ak1 reaches only 1, that for ak2 reaches 2. If all checks are successful (the
case of the next to last branch in the picture), we get k-1. For the last branch, we again
get 2.

Backup(k) is a 1-dimensional array of size k: 1->n. It has one entry for each variable. Its
value on position k contains the shallowest depth in the tree that we backtracked to
since we last visited the values for the variable zk.

Now let us see how we can use these two new variables. They have 2 very interesting and
practical properties.
For the first of these, assume that for some k and l we have that Checkdepth(k,l) <
Backup(k). See the slide for an illustration of this. In this situation, first note that the
constraint checking for the value akl must have ended in an ‘x’, not in a ‘v’. The reason is
that Backup(k) is at most k-1, since we backtracked at least one level. If all constraint
checks for akl had been successful (= true or ‘v’), then Checkdepth(k,l) should have been
equal to k-1, which contradicts our assumption that Checkdepth(k,l) < Backup(k). Second,
because of the inequality, we know that we have not backtracked over any of the
variables that were involved in the earlier checks. So all the results of the
constraint checks we did for akl MUST be the same as they were when we previously
checked them. In particular, the final check MUST still fail.

What can we conclude if we assume the inverse of the inequality, Checkdepth(k,l) >=
Backup(k)? Look again at the drawing. In this case, the checks for akl were at least as
deep as Backup(k). This means that all the checks up to depth Backup(k)-1 must have been
successful (= true or ‘v’) and also that they must all still be true on our next visit to akl,
because we have not backtracked over the variables involved in those checks.

These two properties are exploited in the BackM algorithm on the next slide. It is again
a slight variant of the Backtr algorithm, but includes keeping track of the two new
variables and using the two properties we discussed. We will not go into the details of
this algorithm, but only show where and how these properties are used. The first
property is used at the very start of the ‘For’-loop. If the first property holds, then we
know that assigning this value to this variable will lead to failure (as it did before). Thus,
it makes no sense to do anything for this value.
In the algorithm, this is expressed by only doing the remainder of the work if the first
property does NOT hold. Otherwise, we just go to the end of the ‘For’-loop and pick the
next value for this variable.
The second property is used to determine the scope of the constraint checking. In the
standard Backtr algorithm, we check constraints for all variables with smaller index,
starting from z1. Here, we just start from the variable with index Backup(depth) instead.
This means that we skip checking the constraints for variables lower than Backup(depth).
This makes sense, because property 2 tells us that these constraint checks were true
and are still true at this point.
The remaining changes to the Backtr algorithm all concern how to update
and how to pass on the two new variables. We will skip these issues here.

Finally note that the algorithm needs to be called with the variables Checkdepth and
Backup both completely initialized with 1’s. This makes sure that we are not cutting away
any checks at the start of the algorithm.
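
Since the BackM algorithm itself appears only on the slides, the following is a sketch of how the two tables and the two properties could fit together, not a reproduction of the slide. It assumes the n-queens problem as the constraint set, and the names `backmark`, `queens_ok` and `extend` are illustrative. Both tables start filled with 1's, as noted above.

```python
def queens_ok(j, aj, k, ak):
    """Constraint c(zj, zk) for n-queens: queens in rows j and k do not attack each other."""
    return aj != ak and abs(aj - ak) != abs(j - k)


def backmark(n, ok=queens_ok):
    """Return (all solutions, number of constraint checks) for z1..zn with domains 1..n."""
    # Both tables are 1-indexed (index 0 is unused) and initialized with 1's.
    checkdepth = [[1] * (n + 1) for _ in range(n + 1)]
    backup = [1] * (n + 1)
    value = [0] * (n + 1)
    solutions = []
    checks = 0

    def extend(k):
        nonlocal checks
        for l in range(1, n + 1):
            # Property 1: the check that failed last time is unaffected, so skip this value.
            if checkdepth[k][l] < backup[k]:
                continue
            value[k] = l
            good = True
            # Property 2: checks against variables below backup[k] are known to still hold.
            for j in range(backup[k], k):
                checks += 1
                checkdepth[k][l] = j
                if not ok(j, value[j], k, l):
                    good = False
                    break
            if good:
                if k == n:
                    solutions.append(tuple(value[1:]))
                else:
                    extend(k + 1)
        # All values of zk are exhausted: we backtrack to depth k-1.
        backup[k] = k - 1
        for j in range(k + 1, n + 1):
            backup[j] = min(backup[j], k - 1)

    extend(1)
    return solutions, checks


solutions, checks = backmark(6)
```

Note that the set of visited nodes is the same as in plain backtracking. Only the number of computed checks goes down, which matches the observation that the savings of BackM come entirely from skipped checks.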

3.4 Results and discussion.

It is hard to compare different optimizations of Backtr because it is always possible to
find some application on which one performs better than the other. Nadel compared
them on a number of applications that were more or less randomly chosen. Let us here
just consider the confused 4-queens puzzle. A table with results is included on the slides.
The first 2 rows are of little relevance: they give results for a fragment of the search
space. Let us consider the last 2 rows instead. The first of these gives the number of
nodes visited by the Backtr, BackJ and BackM algorithms respectively. Note that only
the BackJ algorithm saves on visiting nodes. The last row gives the number of constraint
checks computed by each algorithm. BackJ doesn’t really explicitly save checks with
respect to Backtr, but, because it saves visiting 2 nodes, it saves checking the 21
constraints that corresponded to those nodes in Backtr. The savings by BackM are
entirely due to saving checks: 70 in total for this example. On the whole, BackM tends to
be the best algorithm for most applications: more checks are saved, only causing
relatively low overhead. An interesting question is whether the optimizations of BackJ
and BackM could be integrated into 1 single optimized algorithm. Work has been devoted
to this and solutions were found. However, the matter is never clear-cut. Because both
these algorithms avoid computing some parts of the search tree, they also avoid
computing some of the information that the other algorithm needs to do its optimization.
As a result, integrated BackJ-BackM algorithms never achieve the full optimization of
either of them, only part of it.

In the remainder of this chapter we study some further alternatives in backtracking
methods. In particular, we will study Intelligent backtracking and Dynamic search
rearrangement. A further alternative that exists and has been used intensively in some
knowledge-based systems is ‘Dependency-directed backtracking’ (Doyle). This has mostly
been used for search in logical reasoning systems that deal with incompletely
represented knowledge (part of the information about the world is unknown). In those
cases, it is necessary to perform some form of hypothetical reasoning and the
propagation of the effects of forming a hypothesis needs to be traced. Although
introduced in a very different context than Intelligent backtracking,
Dependency-directed backtracking has essentially the same technical characteristics.
Therefore, we restrict our further discussion to Intelligent backtracking only.

4. Intelligent backtracking (Bruynooghe)

Intelligent backtracking is a general framework for defining more clever backtracking
schemes. It is not one specific algorithm, as BackJ or BackM are. Instead, it gives a
general strategy that may give rise to many different algorithms, depending on the
choices that are made.

The key idea in intelligent backtracking is the notion of the ‘no-good’. A no-good is a set
of assignments of values to variables that cannot co-exist in any solution. We will
illustrate this with examples below. The use of these no-goods in the approach is roughly
as follows: During the construction of the OR-tree, collect, infer and store some
no-goods. Later, upon backtracking, use these no-goods to improve the backtrack
behavior of the algorithm.

Consider the first example in the slides. It represents the state of affairs at a
particular point in the search for the 8-queens problem. Note that, compared to our
earlier representation, the horizontal and vertical axes have been swapped here (for no
particular reason). The drawing represents a situation in which z1, z2, z3, z4 and z5 have
obtained a value. We are in the process of assigning a value to z6. Each line in the
drawing represents a no-good. In this case, these no-goods are trivial, because each
corresponds to 1 specific constraint in the problem, which would be violated if the 2
assignments occurred together. These 8 different ‘basic’ no-goods are written out
explicitly on the slide in the form of sets of non-allowed simultaneous assignments.
Next, we infer another no-good. Note that z6 needs to obtain a value, and that there are
only 8 possible values. Thus, we know that z6 = 1 or z6 = 2 or ... or z6 = 8. Now consider the
set of assignments {z1 = 1, z2 = 3, z3 = 5, z4 = 2}. These are all the assignments for the
variables z1, z2, z3 and z4 that occur in the basic no-goods listed before. This set is
clearly a no-good, because, assuming that all these assignments would exist together,
then all the left-hand side assignments in all the basic no-goods would hold. Thus, none of
the right-hand side assignments in these basic no-goods are possible. Therefore, z6
cannot obtain a value.

How can this no-good be used? It essentially gives us the backtracking
improvement discussed under backjumping. The no-good states that there is no point in
backtracking over the value of z5, because the assignments of z1 to z4 themselves
cannot co-exist in a solution. Thus: backtrack to the next value of z4 instead.
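
The inference in this first example can be sketched in a few lines. The sketch assumes z5 = 4, a value the text does not state but that is consistent with z1 to z4. It also assumes that, when several earlier queens block the same value of z6, the earliest one is taken as the culprit. The names `attacks` and `infer_nogood` are illustrative.

```python
def attacks(j, aj, k, ak):
    """True if queens in rows j and k (columns aj and ak) attack each other."""
    return aj == ak or abs(aj - ak) == abs(j - k)


def infer_nogood(assigned, k, n=8):
    """Return a no-good explaining why zk has no value left, or None if a value is still free.

    assigned maps a variable index to its value, for the variables before zk.
    """
    nogood = {}
    for value in range(1, n + 1):
        # Basic no-good {zj = aj, zk = value}: pick the earliest assigned culprit.
        culprit = next((j for j in sorted(assigned)
                        if attacks(j, assigned[j], k, value)), None)
        if culprit is None:
            return None                         # this value is still possible for zk
        nogood[culprit] = assigned[culprit]
    return nogood


# State of the first example; z5 = 4 is an assumed value.
state = {1: 1, 2: 3, 3: 5, 4: 2, 5: 4}
nogood = infer_nogood(state, 6)                 # {1: 1, 2: 3, 3: 5, 4: 2}
jump_to = max(nogood)                           # 4: the deepest variable in the no-good
```

Because z5 does not occur in the inferred no-good, the search can return directly to z4.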

A second example is shown on the next slide. Again, we have a state of the search in
which z1 to z5 have obtained a value and we are at the point of assigning a value to z6.
Again, we can derive 8 basic no-goods, due to individual constraints of the problem that
would be violated if the 2 assignments would be made together. As before, we have that
z6 needs to get one of the 8 possible values. As before, the set {z2 = 1, z3 = 4, z4 = 6, z5
= 3} is a no-good. If they would co-exist, then the left-hand sides of all basic no-goods
would already be assigned, thus, there is no more possibility to assign a value to z6.
In this case, this no-good can be used later on when we backtrack over the value of z1. If
for a later assignment to z1 we again come to a situation where z2 to z5 obtain these
same values, then we can backtrack immediately (disregarding z6). In a way, this is a
simulation of the effect of backmarking. A bad partial-assignment to the variables is
remembered and stored, so that constraints do not have to be checked again when this
partial-assignment re-occurs. Note that it is not really like BackM, because it does a
form of tabulation.
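
Storing and consulting such a no-good can be sketched as a subset test. The representation (a list of frozensets of (variable, value) pairs), the function name `blocked_by` and the later value z1 = 8 are assumptions made for illustration only.

```python
def blocked_by(assignment, nogoods):
    """Return the first stored no-good contained in the partial assignment, or None."""
    current = set(assignment.items())
    for nogood in nogoods:
        if nogood <= current:                   # every assignment of the no-good is present
            return nogood
    return None


# The no-good of the second example, stored as a set of (variable, value) pairs.
store = [frozenset({(2, 1), (3, 4), (4, 6), (5, 3)})]

# A later state with a different (hypothetical) value for z1 but the same z2 .. z5.
later = {1: 8, 2: 1, 3: 4, 4: 6, 5: 3}
hit = blocked_by(later, store)                  # the stored no-good: backtrack at once
miss = blocked_by({1: 8, 2: 1, 3: 4, 4: 7}, store)   # None: nothing is known yet
```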

Of course, this only illustrates the concept of intelligent backtracking. The main
problems that are left are: Which no-goods should we store? How should we
systematically try to infer new no-goods from these? When should we consult and use
them? These problems are hard and may be solved differently in different types of
applications. We will not discuss them here.

5. Dynamic search rearrangement.

Yet another enhancement in backtracking techniques is dynamic search rearrangement.
Similar to intelligent backtracking, dynamic search rearrangement is not a specific
algorithm, but a general strategy. It can be applied to most other backtracking variants.
The idea is that we drop the assumption that the variables are ordered in a fixed way.
Instead, we allow the order of the variables to be selected dynamically during the
construction of the tree. What we hope to gain by this is a reduction in the size of the tree
that needs to be explored. In particular, we aim to reduce the branching factor and/or
cut off failing branches as soon as possible.

The underlying principle that will be used to find a good ordering of the variables is the
first-fail principle. This principle states that:
       If the assignment of a value from domain Di to zi is more likely to cause failure
       than the assignment of a value from domain Dj to zj,
       Then: assign to zi first.

The intuition behind this heuristic principle is illustrated in the next slides. The first
slide shows a situation in which our guess that assignment to zi would cause failure was
correct. Assume that assignment to zj would not lead to (immediate) failure (in the
example, Dj has 2 values for zj for which the constraints do not fail). In this case the
gain of selecting zi first is obvious from the drawing. On the left, only the values for zi
need to be enumerated to detect the failure. On the right, we need to construct a tree
with approximately three times as many nodes to reach the same conclusion.

The next slide shows a situation in which our guess was only partly correct. Namely,
assume that assignment to zi does not result in immediate failure (as we hoped),
but that zi allows fewer values for which the constraints hold than zj. Even in this case we
gain. In the drawing, zi allows 1 successful value, while zj allows 3 successful values.
Selecting zi first clearly leads to a smaller tree. In the selection on the right, the
subtree for zi is repeated three times. In the one on the left, we only do these constraint
checks once.

Of course, the problem is to decide how to apply the first-fail principle: What are good
guesses as to whether one variable will lead to failure sooner than another? There are
general heuristics to help us with this, as well as application-specific ones. At the level of
the general heuristics, there are two basic ones. The first is:
        Select that zi with the smallest domain Di.
Because zi has fewer possible values, it is likely to have fewer successful values too. The
successful branches are a subset of the branches.
A second heuristic is:
        Select that zi which occurs in the highest number of non-trivial constraints.
This may seem strange at first sight. We had agreed that a constraint problem would
have 1 constraint for each 2 different variables. So, all variables would seem to occur in
equally many constraints. However, for most problems, a large number of constraints
c(zi,zj) will just be the constraint ‘true’, which always holds. In other words: in most
problems there are only constraints between some of the pairs zi, zj. Introducing the
constraints c(zk,zl) = ‘true’ for all others is just a matter of getting uniformity in our
presentation of the methods. By selecting the variable occurring in the largest number of
non-trivial constraints, we reduce the chances of that variable having successful
values (at least, given that the size of its domain is the same as for the other
variables).
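
The two general heuristics can be combined in one selection function. The sketch below is only one possible combination, chosen for illustration: it takes the smallest domain first and uses the number of non-trivial constraints as a tie-breaker. The name `select_variable` and the example problem are hypothetical.

```python
def select_variable(unassigned, domains, constraints):
    """First-fail choice: smallest domain first, ties broken by most non-trivial constraints."""
    def degree(z):
        # Number of non-trivial constraints in which z occurs.
        return sum(1 for pair in constraints if z in pair)

    return min(unassigned, key=lambda z: (len(domains[z]), -degree(z)))


# Hypothetical problem: only the listed pairs carry a non-trivial constraint.
domains = {"z1": {1, 2, 3}, "z2": {1, 2}, "z3": {1, 2}, "z4": {1, 2, 3, 4}}
constraints = {("z1", "z2"), ("z1", "z3"), ("z3", "z4")}
choice = select_variable(["z1", "z2", "z3", "z4"], domains, constraints)   # "z3"
```

Here z2 and z3 both have 2 values left, and z3 wins because it occurs in 2 non-trivial constraints instead of 1.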

Apart from these general heuristics, the problem at hand may give you additional
information on the crucial (more constrained) variables. It is important to extract such
information from your specification and apply it to control your problem solving method.

An interesting question to think about is whether or not a backtracking algorithm
augmented with dynamic search rearrangement is actually still complete. What is meant
here is: considering that after backtracking to a previous variable you are free to choose
a completely different variable than before, are we still considering all the possibilities
for assignments? The answer is yes. It is a good exercise to convince yourself of this.

As a final comment, apart from reordering the variables to decrease the size of the
search space, we can also change the order in which values from the domains are assigned
to the variables. Given the domain {1,2,3,4}, we could take the standard order of
assigning them in the order that they occur, we could reverse this order, we could assign them
in a random order, etc. In some cases, this can again affect the efficiency of the search
very much. In the case of n-queens for instance, random order most often improves the
speed at which a first solution is found. However, finding good heuristics for this order is
usually very difficult. Experimentation with different orders is often the only way to
optimize.

6. Arc-consistency or relaxation techniques.

The principle of arc-consistency or relaxation was already explained in the subsection on
the network representation. We aim to eliminate some values from the domains of
certain variables. We do so by verifying that such a value is inconsistent with all
the values in the domain of another variable, under the basic constraint imposed
between them.
However, there are many ways in which one can use this principle. In fact, there are some
10 (or more?) arc-consistency algorithms around. They are usually referred to as AC1,
AC2, AC3, …. Two very simple ones will be illustrated further on.

Let us start with an example to illustrate the ideas and methods. The example is the
4-houses puzzle. It is completely described on the slide. In the next slide, we represent
the problem as a constraint problem, defining the variables, domains and constraints.
Note that this is slightly different from the definition of a constraint problem that we
provided earlier: there are constraints here that are defined on only 1 variable. In
particular: the constraints that C =/= 4 and D =/= 2. We will deal with these in a
preprocessing phase that eliminates them, mapping the problem fully into our earlier
definition. On the same slide we see the network representation of the problem. Again,
there is a slight change, in that constraints defined on 1 single variable are added as
extra labels on the nodes.

Moving to the next slide, we see how these 1-variable constraints are dealt with. This is
done in a phase ensuring ‘node-consistency’ or 1-consistency. For each variable in the
problem, we eliminate all the values in its domain that do not satisfy the constraint on
that variable. Concretely, for this example, the value 4 is removed from the domain of C
and the value 2 is removed from the domain of D. This gives us a new network
representation, which is now completely within the scope of our formal definition of a
constraint problem.
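
The node-consistency phase amounts to one filter per variable. In the sketch below, the two unary constraints are the ones mentioned in the text. The initial domains {1, 2, 3, 4} for A, B, C and D are an assumption (the puzzle itself is described only on the slides), and the name `node_consistent` is illustrative.

```python
def node_consistent(domains, unary):
    """Return new domains without the values that violate a 1-variable constraint."""
    return {
        var: {a for a in values if all(test(a) for test in unary.get(var, []))}
        for var, values in domains.items()
    }


# Assumed initial domains, with the two unary constraints C =/= 4 and D =/= 2.
domains = {v: {1, 2, 3, 4} for v in "ABCD"}
unary = {"C": [lambda c: c != 4], "D": [lambda d: d != 2]}
reduced = node_consistent(domains, unary)
```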

On the next slide, we show the most trivial arc-consistency (or 2-consistency) algorithm.
It is appropriately referred to as AC1 (numbers added to AC tend to increase with the
level of refinement of these algorithms). The method was developed by Mackworth, just
as the next one that we will see later on.

We omit comments on the forward check, the look ahead check and AC1. These should be
clear from the slides. There are some comments explaining the forward check and the
look ahead check below.

AC1 is illustrated on the 4-houses puzzle in the next slides. We need three traversals
through the queue. At the third traversal, nothing changes anymore: we have reached a
consistent set of domains. Note that the result of AC1 is NOT a solution to the
constraint problem. Each domain still contains 2 elements. Although we know that for
each of these elements there is a value in other domains making the linking constraints
true, we do not know which combination of these values makes ALL the constraints true.
In fact, it could very well be that there is no solution to the constraint problem, in spite
of the fact that AC1 returns non-empty domains. AC1 ensures local consistency, but not
global consistency. Of course, if we now activate a backtracking search on the reduced
domains, the search space is much smaller than before, making the problem easier to
solve.
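
A minimal sketch of AC1 follows. It does not use the 4-houses puzzle, whose full constraint set is only on the slides, but a hypothetical 3-variable problem with the constraints B = A + 1 and C > B. The names `revise` and `ac1` are illustrative.

```python
def revise(domains, x, y, holds):
    """Drop values of x and y without a partner under c(x, y); report whether anything changed."""
    new_x = {a for a in domains[x] if any(holds(a, b) for b in domains[y])}
    new_y = {b for b in domains[y] if any(holds(a, b) for a in new_x)}
    changed = new_x != domains[x] or new_y != domains[y]
    domains[x], domains[y] = new_x, new_y
    return changed


def ac1(domains, constraints):
    """Re-check every constraint until a full pass changes nothing; return the number of passes."""
    passes = 0
    while True:
        passes += 1
        changed = False
        for (x, y), holds in constraints.items():
            if revise(domains, x, y, holds):
                changed = True
        if not changed:
            return passes


# Hypothetical problem (not the 4-houses puzzle): B = A + 1 and C > B.
domains = {v: {1, 2, 3, 4} for v in "ABC"}
constraints = {
    ("A", "B"): lambda a, b: b == a + 1,
    ("B", "C"): lambda b, c: c > b,
}
passes = ac1(domains, constraints)
```

On this small problem the run also takes three passes and leaves 2 values in each domain (A: {1, 2}, B: {2, 3}, C: {3, 4}). Not every combination of the remaining values is a solution: A = 1 with B = 3 violates B = A + 1.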

The AC1 algorithm is very inefficient. At each pass through the ‘Repeat’-loop, all
constraints are checked again. There may be no need to do this. If we removed values
from the domain of zi, due to checking the consistency of c(zi,zj), there is no immediate
reason to reconsider a constraint c(zk,zl) in which zi does not occur.
The second algorithm, AC3, is more economical in the way it adds previously visited
constraints back to the queue. Initially, the queue again contains all the constraints. In
the ‘While’-loop, we again take 1 constraint out of the queue, say c(zi,zj). Again, we
remove all values from the domains of zi and zj which are inconsistent with the other
domain. Finally, if the domain of either zi or zj has changed, we add all the constraints in
which that variable (for which deletions occurred) occurs to the queue. Then, if the
queue is not empty, we restart with the current queue.
The next slides revisit the 4-houses puzzle. Note that ‘add constraints to the queue’
actually means: check whether the constraint is already in the queue and, if it isn’t, add
it. In the concrete version illustrated here, constraints are added to the back of the
queue. Note that the algorithm performs far fewer constraint checks. In particular: AC3
visits 9 constraints in total, while AC1 visits 18 (in this example).
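
The queue handling of AC3 can be sketched as follows, again on a hypothetical 3-variable problem (B = A + 1 and C > B) rather than on the 4-houses puzzle. One design choice here is an assumption and not taken from the slides: the constraint that was just processed is not re-added, because revising both of its domains leaves it consistent. The names `revise` and `ac3` are illustrative.

```python
from collections import deque


def revise(domains, x, y, holds):
    """Drop values of x and y without a partner under c(x, y); return the changed variables."""
    new_x = {a for a in domains[x] if any(holds(a, b) for b in domains[y])}
    new_y = {b for b in domains[y] if any(holds(a, b) for a in new_x)}
    changed = [v for v, new in ((x, new_x), (y, new_y)) if new != domains[v]]
    domains[x], domains[y] = new_x, new_y
    return changed


def ac3(domains, constraints):
    """Process a queue of constraints; return how many constraints were visited."""
    queue = deque(constraints)                  # initially all constraints
    visited = 0
    while queue:
        x, y = queue.popleft()
        visited += 1
        for var in revise(domains, x, y, constraints[(x, y)]):
            for pair in constraints:
                # Re-queue other constraints on a changed variable, unless already queued.
                if var in pair and pair != (x, y) and pair not in queue:
                    queue.append(pair)          # added to the back of the queue
    return visited


# Hypothetical problem (not the 4-houses puzzle): B = A + 1 and C > B.
domains = {v: {1, 2, 3, 4} for v in "ABC"}
constraints = {
    ("A", "B"): lambda a, b: b == a + 1,
    ("B", "C"): lambda b, c: c > b,
}
visited = ac3(domains, constraints)
```

On this problem AC3 visits 3 constraints, where an AC1-style loop needs three passes over the 2 constraints, so 6 visits, to reach the same domains.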

In principle, the problem of ‘locality’ of arc-consistency can be reduced.
Node-consistency is a very local step: it only considers 1 individual variable and makes
sure that the domain of that variable contains only values consistent with the constraint
on that variable. Arc-consistency (also called 2-consistency) is still very local: only the
relations imposed by individual constraints between 2 variables are enforced on the
domains. We could go further. We could pick out 3 variables, say zi, zk and zl, from the
problem and consider all the binary constraints that relate these 3 variables. Then, we
could define a value aij for zi to be consistent, if there exist values akn and alm for zk
and zl, such that all the binary constraints connecting the 3 variables hold. If aij is not
consistent in this sense, then aij is removed from the domain. This can of course be
further generalized to k-consistency, with k any natural number larger than 0. Note that
4-consistency trivially solves the 4-houses puzzle, in the sense that, if all domains remain
non-empty after checking 4-consistency, then the problem has a solution. This does not
necessarily mean that the domains returned by 4-consistency would be singletons though.
One thing should be observed: computing k-consistency, for k > 2, is very complex. There
are no known efficient techniques for doing this. Thus, in practice, people restrict themselves to
arc-consistency.

7. Hybrid Backtrack-Consistency techniques.

It should be clear by now that neither the backtrack techniques, nor the consistency
techniques by themselves are optimal for dealing with these problems. Backtracking
usually needs to search very large OR-trees and, in the absence of consistency methods,
if the domains become large, we get exponential behavior rooted in very large
branching factors. Remember that with n variables and b values for each variable, the
number of nodes in the OR-tree is of the order of b^n, where b is also the branching factor of the tree.
Consistency techniques alone aren’t that powerful either, since, in general, they do not
result in a solution.

What we need is a combination of backtracking and consistency techniques. We already
mentioned before that we could solve a problem by first applying arc-consistency and
then applying backtracking on the resulting domains. The alternative, which is more
commonly applied, is to combine them the other way around: do a backtrack search, but
after each assignment, interrupt the backtrack search to perform a consistency check.
In such hybrid backtrack-consistency techniques, the consistency checking is usually
reduced to a simpler and less powerful check than AC1 or AC3. The reason for this is that
methods like AC1 and AC3 are computationally rather expensive. You do not want to
activate such an expensive consistency check after each assignment of a value to a
variable. Most likely, the number of values removed from the domains will not
justify the computational cost of the AC activation.

We will illustrate 2 hybrid BT-consistency algorithms: forward checking and lookahead
checking. There are many more around, but the ones we will study here are very well known
and tend to be useful in many different applications.

To introduce the forward checking algorithm, let us first discuss the ‘simplified’
consistency technique it relies on. The check is forward check(zi). This check activates
every constraint in which zi occurs just once. More specifically, for every constraint
c(zi,zj) or c(zj,zi), it removes all the values from the domain Dj for zj which are not
consistent with the value ai for zi. Note that this is a very weak and very inexpensive
check (compared to AC1 and AC3).

Forward checking now works as follows: apply standard backtracking, but, after each
assignment of a value ai to a variable zi, apply forward-check(zi).
The algorithm is applied to the 4-houses puzzle in the slides. Note how the domains of B,
C and D are already very strongly reduced in size after the first assignment to A.
Eventually, we end up with an OR-tree with only 9 nodes before the first success is
obtained (we only have a few more in the entire tree). Observe also that in this type of
algorithm, backtracking occurs as soon as some domain becomes empty. As a final
observation, note that the checking of the constraints c(zi,zj) that we did at each level in
the definition of the OR-tree has disappeared here. There is no more need for it. Once
we assign a value to a variable, then it is already consistent with the values assigned to
variables at earlier stages, because the forward-check removes the inconsistent values
from all domains of variables that remain to be assigned.
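
Forward checking can be sketched as a recursive search that filters the domains of the remaining variables after each assignment. The sketch uses the 4-queens problem as a stand-in for the 4-houses puzzle, and the names `forward_checking` and `queens_ok` are illustrative assumptions.

```python
def queens_ok(x, a, y, b):
    """Constraint c(x, y) for n-queens: queens in rows x and y do not attack each other."""
    return a != b and abs(a - b) != abs(x - y)


def forward_checking(variables, domains, ok):
    """Yield every solution as a dict; ok(x, a, y, b) is the constraint between x and y."""
    def search(i, doms, assignment):
        if i == len(variables):
            yield dict(assignment)
            return
        x = variables[i]
        for a in sorted(doms[x]):
            # forward-check(x): filter the domains of all variables still to be assigned.
            future = {y: {b for b in doms[y] if ok(x, a, y, b)}
                      for y in variables[i + 1:]}
            if all(future.values()):            # backtrack as soon as some domain is empty
                assignment[x] = a
                yield from search(i + 1, future, assignment)
                del assignment[x]

    yield from search(0, domains, {})


n = 4
solutions = list(forward_checking(list(range(1, n + 1)),
                                  {z: set(range(1, n + 1)) for z in range(1, n + 1)},
                                  queens_ok))
```

There is no check against earlier variables anywhere in `search`: a value that survives in a domain is already consistent with all assignments made so far.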

As a second hybrid BT-consistency technique, we discuss looking ahead. The consistency
check here is look ahead check. Look ahead is more expensive than the forward check,
but on the other hand, it does more work. As a result, more values tend to be removed by
the check and the branching factor of the backtrack search is further reduced.

Lookahead activates every constraint c(zi,zj) of the problem exactly once, and removes
all the inconsistent values from Di and Dj for that constraint. The best way to
understand this is to look at the AC1 algorithm, but to imagine that it would stop after
having traversed the queue just once.

Lookahead checking then proceeds as follows: First do a look ahead check. Then, apply
standard backtracking, but after each assignment, apply look ahead check. The algorithm
is illustrated on the 4-houses puzzle in the next slide. Note that we now have only 6 nodes
left in the OR-tree. There is a trade-off here. Forward checking spends little effort at
each node, imposing only a very weak form of consistency. This is at the cost of a slightly
larger search tree for the backtrack part. Lookahead checking does more work to get
a stronger consistency at each node, with a smaller resulting tree.
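
Lookahead checking can be sketched in the same style. The check visits every pair of variables once, like a single pass of AC1, and the search runs it once up front and again after each assignment. As before, n-queens is used as a stand-in problem and the names `lookahead_check` and `lookahead_search` are illustrative assumptions.

```python
from itertools import combinations


def queens_ok(x, a, y, b):
    """Constraint c(x, y) for n-queens: queens in rows x and y do not attack each other."""
    return a != b and abs(a - b) != abs(x - y)


def lookahead_check(domains, ok):
    """Visit every constraint once; return the reduced domains, or None if one becomes empty."""
    doms = {z: set(values) for z, values in domains.items()}
    for x, y in combinations(doms, 2):
        doms[x] = {a for a in doms[x] if any(ok(x, a, y, b) for b in doms[y])}
        doms[y] = {b for b in doms[y] if any(ok(x, a, y, b) for a in doms[x])}
        if not doms[x] or not doms[y]:
            return None
    return doms


def lookahead_search(domains, ok):
    """Yield every solution; the check runs once up front and after each assignment."""
    variables = list(domains)

    def search(i, doms):
        if i == len(variables):
            yield {z: next(iter(doms[z])) for z in variables}
            return
        x = variables[i]
        for a in sorted(doms[x]):
            reduced = lookahead_check({**doms, x: {a}}, ok)
            if reduced is not None:             # otherwise: backtrack
                yield from search(i + 1, reduced)

    start = lookahead_check(domains, ok)
    if start is not None:
        yield from search(0, start)


solutions = list(lookahead_search({z: set(range(1, 5)) for z in range(1, 5)}, queens_ok))
```

Compared with forward checking, each node costs a pass over all constraints instead of only those of the last assigned variable, which is the trade-off described above.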
Which of these is best depends on the specific problem and is very hard to predict. One
general heuristic is that, if the constraints are such that the value of one variable
constrains very heavily the possible values of the other variable (for instance, the
constraint B = A + 1 in the example), then it might be better to apply a stronger
consistency check. This may pay off in getting many more domain values removed. If
the constraints are relatively weak, it may be better to propagate the effect of the
last assigned variable to its neighboring variables only (= forward-check).

The above techniques were both defined in terms of standard backtracking. In principle
it is also possible to combine more advanced backtrack schemes with consistency
checking. In particular, the use of dynamic search rearrangement in forward checking or
looking ahead is strongly recommended. Because of the dynamic elimination of values
from domains, some domains may become much smaller than others at some points in the
computation. Of course, it pays off a lot to select such variables first. Also other
optimizations, such as picking a good strategy for which values to assign to the variable
first, are frequently used in combination with the above methods.

The hybrid BT-consistency techniques discussed here are very competitive with other
methods for solving complex combinatorial problems. There are several alternative
approaches that we did not discuss here. If the dimensions of the problem are not
excessively high and assuming that all constraints are linear equations, a valid alternative
is to apply linear programming techniques. Specifically in the context where the solution
needs to be optimized in terms of some maximization or minimization function, linear
programming may be a very good alternative. In the context of non-linear constraints or
excessive problem dimensions, the above techniques tend to be the only feasible option.
Typically, in scheduling or rostering problems (think of scheduling of trains, flights,
exams, or of building time-tables for the personnel of a large company), these techniques
are increasingly applied in practice.

As a final concluding comment, the selection of the appropriate algorithm (which
combination of techniques to apply for a specific application at hand: Should we use
forward checking or looking ahead? Should we use dynamic search rearrangement and, if
so, with which strategy? Should we order the values in domains in a particular way? Etc.)
may seem problematic. The developer cannot be expected to first write a program with
a number of these choices in mind and then, if the choices turn out to be bad, redevelop
the system for completely different choices. The answer to this is given by the
programming languages that support constraint problem solving. In Constraint Logic
Programming languages, the language for defining the constraints in the problem is kept
completely separate from the language for selecting the constraint solving method. The
constraints themselves are defined in logic formulae, in particular, in Horn clause logic
(see other parts of this course). The constraint solving technique is selected with a
separate declaration language. As such, it is easy to test one particular problem solving
strategy on your problem and, if it is not satisfactory, adapt only some declarations to
experiment with another. This separation of the logic and the control is essential for the
success of these techniques in the considered application domains.

8.     Non-numerical constraint processing.

Constraint processing methods are in no way restricted to numerical applications. In
some applications, the possible values that variables can take are described as just sets
of (possibly symbolic) data. As long as there is some way of accurately describing the
constraints that relate the variables, then most techniques studied here are extendible.
In particular, we will briefly study some applications in symbolic constraint processing
for 3-dimensional interpretation of line drawings and for disambiguation of semantics of
natural language sentences at the end of this chapter.

One (tiny) example that moves in the direction of non-numerical constraint processing is
in theorem proving or logical reasoning systems.
Consider the slides on truth-propagation nets. In these nets, variable boxes are related
to propositional logic formulas. For each propositional formula, we have an associated
variable box. The value of the variable box can be either: unknown (this variable did not
receive any value as yet), true or false. Instead of adder boxes or multiply boxes, we now
have truth-propagation boxes, which represent the relations between propositional
formulas and their sub-formulas. In the example, the truth-propagation boxes are both
related to the implication symbol. They connect an implication to its antecedent and
consequent. Again, these boxes can be activated in various directions. We can even
compute a complete set of propagation rules that give us all cases in which truth-values
of some of the connecting boxes allow propagation to others. These are presented in the
slide’s overlay. Note that constraint propagation through truth-propagation nets has
been the basis of a widely influential technique for building logical reasoning systems,
called ‘truth-maintenance’ (or ‘assumption-based truth-maintenance’) systems.
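
For a single implication box, the propagation rules can be written out directly from the truth table of P -> Q. The sketch below is an illustration under that assumption and is not copied from the slide. `None` stands for ‘unknown’, and the function name `propagate_implication` is hypothetical.

```python
def propagate_implication(p, q, imp):
    """Propagate truth values through one box for (P -> Q); None means unknown.

    Returns the updated triple (p, q, imp) and raises ValueError on a contradiction.
    """
    def settle(old, new):
        if old is not None and old != new:
            raise ValueError("contradiction")
        return new

    changed = True
    while changed:
        before = (p, q, imp)
        if imp is True and p is True:           # implication and antecedent give consequent
            q = settle(q, True)
        if imp is True and q is False:          # implication and false consequent
            p = settle(p, False)
        if p is False or q is True:             # the implication holds trivially
            imp = settle(imp, True)
        if p is True and q is False:            # the only case that falsifies it
            imp = settle(imp, False)
        if imp is False:                        # a false implication fixes both sides
            p = settle(p, True)
            q = settle(q, False)
        changed = (p, q, imp) != before
    return p, q, imp


forward = propagate_implication(True, None, True)     # (True, True, True)
backward = propagate_implication(None, False, True)   # (False, False, True)
```

The two calls show the same box being activated in two different directions, in the same way as the multiply boxes in the numerical nets.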

Other examples of non-numerical constraint propagation are illustrated in some
additional case studies later on.

9. Bayesian networks and probability nets.

In some applications, the variables in the problem representation may describe the
probabilities with which certain properties hold. In such cases, variable boxes for
interdependent properties may be connected by yet another type of propagation box:
‘probability propagation boxes’. These boxes express the probability laws that
connect the dependent concepts. Depending on the direction in which the
propagation is performed, again, different probability equations may be used to enforce
the given probability law. Constraint propagation nets of this type are referred to as
Bayesian nets or probability nets.

On the Geninfer slide, we show one particular application of probability nets. Geninfer is
a system that gives advice on the probability that individuals have hemophilia.

Hemophilia is a genetically carried disease. It is carried by X chromosomes only. Women
have 2 X chromosomes. If one of these is a hemophilia-defective X chromosome, then
the woman is a carrier of the disease, but she does not show any signs of the disease
herself. Men have 1 X and 1 Y chromosome. If they have a hemophilia-defective X
chromosome, then they are hemophiliacs.

Because every child inherits 1 of its mother’s X chromosomes, if the mother is a carrier,
then there is a probability of 0.5 that the defective X chromosome is passed on to the
child. So, for female children: 0.5 probability of becoming a carrier; for male children:
0.5 probability of having the disease.

The problems given to Geninfer are of the following type. Suppose that for at least one
ancestor in a family tree it is known that he/she was/is a hemophiliac or a carrier. What
is the chance that a newborn in the family will have/carry the disease?

The propagation is clearly over probabilities. For at least one person in the family tree, it
is known that he/she has/carries the disease. Thus, the variable representing the
probability for this person has the value 1. Most often, for a number of other people in
the family tree it will be known that they do not have/carry the disease, for instance a
grandfather who did not show signs of the disease. In such cases the corresponding
variable has the value 0. For yet other people in the tree, it is unknown whether they
had/carried the disease or not. The variables corresponding to these people are
uninstantiated at first. It is the job of Geninfer to propagate the values from the known
variables to probabilities for the unknown variables.

As a very simple example, considered in the slide: if the known information is that the
great uncle is diseased and the grandfather and father are ok, then the probability of the
grandmother being a carrier is 0.5, of the mother being a carrier is 0.25, and of the
child being a carrier/having the disease is 0.125. The propagation becomes much more
complex when it is also known that the child has healthy uncles and brothers. In that
case, more involved probability rules reduce the probabilities for the grandmother, mother
and child (for the child: 0.028 if the uncles are ok and nothing is known about the
brothers; 0.007 if both uncles and brothers are ok).
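
The figures above can be checked with a small enumeration in Python. This is a sketch
under stated assumptions, not Geninfer's actual method: the text does not say how many
uncles and brothers there are, so the counts used below (three healthy uncles, two healthy
brothers) are assumptions, chosen because they happen to reproduce the quoted 0.028 and
0.007. The sketch also assumes that the diseased great uncle is the grandmother's brother,
so that the great-grandmother is certainly a carrier, and that healthy fathers never pass
on a defective X chromosome. The function name child_probability is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def child_probability(healthy_uncles=0, healthy_brothers=0):
    """Probability that the child inherits the defective X, given the evidence."""
    mother_and_evidence = Fraction(0)  # P(mother is a carrier AND evidence)
    evidence = Fraction(0)             # P(evidence)
    for grandmother in (True, False):
        # The great-grandmother is a carrier, so each case has prior 0.5.
        p_grandmother = HALF
        # Each uncle of a carrier grandmother is healthy with probability 0.5.
        p_uncles = HALF ** healthy_uncles if grandmother else Fraction(1)
        for mother in (True, False):
            if grandmother:
                p_mother = HALF
            else:
                # Assumption: no carrier grandmother means no carrier mother.
                p_mother = Fraction(0) if mother else Fraction(1)
            # Each son of a carrier mother is healthy with probability 0.5.
            p_brothers = HALF ** healthy_brothers if mother else Fraction(1)
            weight = p_grandmother * p_uncles * p_mother * p_brothers
            evidence += weight
            if mother:
                mother_and_evidence += weight
    # Backward step (Bayes' rule), then one forward step from mother to child.
    return HALF * mother_and_evidence / evidence

print(child_probability())       # 1/8
print(child_probability(3))      # 1/36
print(child_probability(3, 2))   # 1/138
```

Under these assumptions 1/8 = 0.125, 1/36 is about 0.028 and 1/138 is about 0.007. The
healthy relatives act as evidence that flows backward to the mother and grandmother before
the result flows forward again to the child.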



10. An illustration: interpretation of line drawings (Winston).

See the enclosed extract from Winston’s book.

11. An illustration: disambiguation of natural language (Winston).

See the enclosed extract from Winston’s book.



