# Bayesian Filters

Dr. Olivier Aycard

Laboratoire d’Informatique de Grenoble

Grenoble, FRANCE

http://emotion.inrialpes.fr/aycard

aycard@inrialpes.fr

Content
• Fundamentals of Bayesian Techniques
• Logical propositions
• Discrete random variables
• Rules of Bayesian calculus
• Bayesian Filters

Logical proposition
A logical proposition is a statement which is either true or false.
It is denoted by a lowercase letter:
d = “we are in December”
t = “the temperature is 7 degrees”
We use the usual logical operators:
d ∧ t
d ∨ t
¬d

Probability of a logical proposition
We associate a probability with a logical proposition to summarize
our knowledge about this proposition. We denote it:
P(d) ∈ [0,1]
We are also interested in the negation of a logical proposition:
P(¬d) ∈ [0,1]
Finally, we are interested in the probability of a logical proposition t
conditioned on the value of another proposition d:
P(t | d) ∈ [0,1]
which represents the probability that the temperature is 7 degrees
given that we are in December.
Content
• Fundamentals of Bayesian Techniques
• Logical propositions
• Discrete random variables
• Rules of Bayesian calculus
• Bayesian Filters

Discrete Random Variables (1/2)
• Variables are denoted by a name starting with an
uppercase letter: X
• By definition, a discrete variable X is a set of
propositions xi
•   Mutually exclusive: i ≠ j ⇒ xi ∧ xj = false
•   Exhaustive: at least one of them is true
•   ∀i, P(xi) ≥ 0
•   ∑i P(xi) = 1
The cardinal of X (its number of possible values) is denoted |X|

[Figure: bar chart of a discrete random variable x: probability on the y-axis, possible values X1..X5 on the x-axis]
Discrete Random Variables (2/2)
• Example
• D = “we are in December”, defined by the logical proposition d
and its negation;
• We write D = T (respectively D = F), or d (respectively ¬d), for “we are
in December” (respectively for “we are not in December”).
• T = “the temperature is i degrees”, defined by the 61 logical
propositions:
ti = “the temperature is i degrees” with -20 ≤ i ≤ 40;
• We write T = 12 (respectively T = 5), or t12 (respectively t5), for
“the temperature is 12 degrees” (respectively for “the temperature is
5 degrees”).

Probability distribution associated with a discrete random
variable

• We say that P(X) is a probability distribution associated with
a discrete random variable X defined by the xi with 1 <= i <=
N, if:
• ∀i, 0 <= P(xi) <= 1
• ∑i P(xi) = 1

• This probability distribution represents our knowledge about
the variable X.
• Example
• P(D = T) = 1/12
• P(¬d) = 11/12
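A minimal Python sketch of these definitions (our own illustration, not from the slides; the uniform temperature distribution is an assumption chosen just to have a second example):

```python
# A discrete probability distribution as a dict {value: probability}.
# P(D): probability of being in December, from the slide above.
P_D = {True: 1 / 12, False: 11 / 12}

def is_distribution(p, tol=1e-9):
    """Check the two defining conditions: 0 <= P(xi) <= 1 and sum_i P(xi) = 1."""
    return all(0.0 <= v <= 1.0 for v in p.values()) and abs(sum(p.values()) - 1.0) < tol

assert is_distribution(P_D)

# P(T): a uniform distribution over the 61 temperatures (an assumption).
P_T = {i: 1 / 61 for i in range(-20, 41)}
assert is_distribution(P_T)
```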

Conditional discrete random variables

• A discrete random variable can be defined conditionally on
the value of another discrete random variable.
• We write X | Y (X knowing Y).

• Example
• T | D, which defines “the temperature in degrees when we are (or are
not) in December”
• t5 | D = T: “the temperature is 5 degrees when we are in
December”;
• T = 12 | ¬d: “the temperature is 12 degrees when we are not in
December”;

Probability distribution associated with a conditional discrete
random variable

• We say that P(X | Y) is the probability distribution
associated with a discrete random variable X conditionally on
a variable Y defined by the yj with 1 <= j <= M, if:
• ∀i,j, 0 <= P(xi | yj) <= 1
• ∀j, ∑i P(xi | yj) = 1
• This distribution represents our knowledge about the variable X
conditionally on the variable Y.
• Example
• P(T = 12 | d) = 0.06
• P(t5 | D = F) = 0.05
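In code, a conditional distribution is naturally one distribution over X per value of Y. A sketch with a toy three-temperature table (only the two entries above come from the slides; the other numbers are made up so that each row sums to 1):

```python
# P(T | D) as a dict of dicts: one distribution over T per value of D.
P_T_given_D = {
    True:  {5: 0.30, 7: 0.64, 12: 0.06},   # P(T | d): December
    False: {5: 0.05, 7: 0.15, 12: 0.80},   # P(T | ¬d)
}

# Defining property: for every value y of Y, sum_x P(x | y) = 1.
for y, row in P_T_given_D.items():
    assert all(0.0 <= p <= 1.0 for p in row.values())
    assert abs(sum(row.values()) - 1.0) < 1e-9
```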

Content
• Fundamentals of Bayesian Techniques
• Logical propositions
• Discrete random variables
• Rules of Bayesian calculus
• Bayesian Filters

Normalization and Conjunction
Postulates

P(a) + P(¬a) = 1

P(a ∧ b) = P(a) × P(b | a)
         = P(b) × P(a | b)

Conjunction rule
(or Bayes' rule)

∀xi ∈ X, ∀yj ∈ Y,
P(xi ∧ yj) = P(xi) × P(yj | xi) = P(yj) × P(xi | yj)

P(X ∧ Y) = P(X) × P(Y | X) = P(Y) × P(X | Y)

P(X ∧ Y | Z) = P(X | Z) × P(Y | X ∧ Z) = P(Y | Z) × P(X | Y ∧ Z)

P(X | Y) = P(X ∧ Y) / P(Y)
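A quick numerical check of the rule on the December/temperature example (P(d) and P(t12 | d) come from the slides; P(t12 | ¬d) is an illustrative assumption):

```python
P_d = 1 / 12                      # P(d), from the slides
P_t12_given_d = 0.06              # P(t12 | d), from the slides
P_t12_given_not_d = 0.04          # P(t12 | ¬d), assumed for the example

# P(d ∧ t12) = P(d) × P(t12 | d)
P_d_and_t12 = P_d * P_t12_given_d

# The rule also reads P(d ∧ t12) = P(t12) × P(d | t12); check it by
# computing P(t12) by marginalization and P(d | t12) by division.
P_t12 = P_d * P_t12_given_d + (1 - P_d) * P_t12_given_not_d
P_d_given_t12 = P_d_and_t12 / P_t12
assert abs(P_t12 * P_d_given_t12 - P_d_and_t12) < 1e-12
```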

Normalisation rule

∑X P(X) = 1

∀y ∈ Y, ∑X P(X | y) = 1

Marginalisation rule

∑X P(X ∧ Y) = P(Y)

∑X P(X ∧ Z | Y) = P(Z | Y)

Proof: apply the conjunction rule inside the sum, then the
normalisation rule: ∑X P(X ∧ Y) = ∑X P(X | Y) × P(Y) = P(Y).
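The same rule in code, on a small joint table (illustrative numbers):

```python
# Marginalisation over X of a joint distribution P(X ∧ Y).
P_joint = {                      # P(X = x ∧ Y = y)
    ('x1', 'y1'): 0.10, ('x1', 'y2'): 0.20,
    ('x2', 'y1'): 0.30, ('x2', 'y2'): 0.40,
}

# P(Y) = ∑X P(X ∧ Y)
P_Y = {}
for (x, y), p in P_joint.items():
    P_Y[y] = P_Y.get(y, 0.0) + p

assert abs(P_Y['y1'] - 0.40) < 1e-12
assert abs(P_Y['y2'] - 0.60) < 1e-12
assert abs(sum(P_Y.values()) - 1.0) < 1e-12   # P(Y) is itself normalized
```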

Content
• Fundamentals of Bayesian Techniques
• Bayesian Filters
• Definition & interests
•   Definition
•   Inference
•   Example of localization: Markov localization
•   Conclusion
• Implementations
• Conclusion

Bayesian Filters
•    Filtering is the problem of sequentially estimating the state of a system as a
stream of actions and observations becomes available on-line (from one or
several sensors)

•    St or Xt: state at time t
•    Ot or Zt: observation at time t
•    At or Ut: action at time t

•     Hypotheses:
• Order-1 Markov model
• P(St|St-1,At): dynamic model
• P(Ot|St): sensor model

•    Goal: compute the posterior distribution P(St|O0:t,A1:t)

Recursive Inference
Question (1/2)
P(ST | O0:T, A1:T) = α P(ST, O0:T, A1:T)

P(ST | O0:T, A1:T) = α P(OT | ST, O0:T−1, A1:T) × P(ST, O0:T−1, A1:T)
P(ST | O0:T, A1:T) = α P(OT | ST) × P(ST, O0:T−1, A1:T)

P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST, ST−1, O0:T−1, A1:T)

P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, O0:T−1, A1:T) × P(ST−1, O0:T−1, A1:T)

P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1, O0:T−1, A1:T)

Recursive Inference
Question (2/2)
P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1, O0:T−1, A1:T)

P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T) × P(O0:T−1, A1:T)

P(ST | O0:T, A1:T) = α′ P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T)

P(ST | O0:T, A1:T) = α′ P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T−1)

Recursive Inference
Initialization:
P(S0 | O0) = (1/Z) × P(S0) × P(O0 | S0)

Prediction (approximation of the integral):

P(ST | O0:T−1, A1:T) = ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T−1)

Estimation: confrontation of the observation with the prediction

P(ST | O0:T, A1:T) = α′ P(OT | ST) × P(ST | O0:T−1, A1:T)

Algorithm for Bayesian filtering
Initialization:
P(S0 | O0) = (1/Z) × P(S0) × P(O0 | S0)

Input: P(ST−1 | O0:T−1, A1:T−1) (previous probability distribution), AT, OT
for all s ∈ ST
    P(ST = s | O0:T−1, A1:T) = ∫ST−1 P(ST = s | ST−1, AT) × P(ST−1 | O0:T−1, A1:T−1)   (prediction)
    P(ST = s | O0:T, A1:T) = α′ P(OT | ST = s) × P(ST = s | O0:T−1, A1:T)   (estimation: confrontation prediction - observation)
Endfor
return P(ST | O0:T, A1:T)
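For a discrete state space, the integral becomes a sum and the algorithm fits in a few lines of Python. A minimal sketch (the function and argument names are ours, not from the slides):

```python
def bayes_filter_step(prior, action, observation,
                      dynamic_model, sensor_model, states):
    """One step of the discrete Bayesian filter.

    prior[s]:                P(S_{T-1} = s | O_{0:T-1}, A_{1:T-1})
    dynamic_model(s, sp, a): P(S_T = s | S_{T-1} = sp, A_T = a)
    sensor_model(o, s):      P(O_T = o | S_T = s)
    """
    # Prediction: sum over the previous state (the discrete 'integral').
    predicted = {
        s: sum(dynamic_model(s, sp, action) * prior[sp] for sp in states)
        for s in states
    }
    # Estimation: weight each prediction by the observation likelihood...
    posterior = {s: sensor_model(observation, s) * predicted[s] for s in states}
    # ...then normalize (this is the role of the constant α').
    z = sum(posterior.values())
    return {s: p / z for s, p in posterior.items()}
```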

Discrete Bayesian filters or Markov localization (1/2)

• We want to know the position of a robot in an environment
(i.e., localize it)
• We have a map of the environment (i.e., the positions of the
doors in the corridor are known)
• The environment is static: there are no dynamic entities in
the environment
• The mobile robot performs an action to move
• After each action, the robot is able to observe its
environment

Discrete Bayesian filters or Markov localization (2/2)

• Bayesian filters are typically used to solve this problem
• The environment is supposed to be discrete
• Discrete Bayesian filters or Markov localization
• 3 main hypotheses regarding the initial position of the
robot:
• initial position known: local localization or tracking;
• initial position unknown: global localization;
• initial position wrong: kidnapping problem.

• To apply Bayesian filters, we have to define:
• St, Zt or Ot, and At
• P(St|St-1,At) and P(Zt|St)

Definition of variables and models (1/2)

• A mobile robot moves one meter at a time (At=1) in a
corridor 20 meters long (St={1..20})
• The mobile robot is able to observe the doors and the walls
of the corridor (Ot={d, w})
• We have a map of the corridor: we know where the doors are
located; there is a door at 6 meters (St=6), at 8 meters
(St=8) and at 14 meters (St=14)

Definition of variables and models (2/2)

• Dynamic model: the action is not perfect!
• P(St=s+1|St-1=s, At=1) = 0.8
• P(St=s|St-1=s, At=1) = 0.2
• Sensor model: observations are not perfect!
• P(Ot=d|St=s) = 0.7 for s ∈ {6, 8, 14}
• P(Ot=w|St=s) = 0.3 for s ∈ {6, 8, 14}

• P(Ot=w|St=s) = 0.6 for s ∈ {1..5, 7, 9..13, 15..20}
• P(Ot=d|St=s) = 0.4 for s ∈ {1..5, 7, 9..13, 15..20}
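These two models translate directly into code. A sketch that plugs into the bayes_filter_step function above (our own code; like the slides, it ignores what happens at the end of the corridor):

```python
STATES = range(1, 21)          # corridor positions, in meters
DOORS = {6, 8, 14}             # map: where the doors are

def dynamic_model(s, sp, a):
    """P(St = s | St-1 = sp, At = a) for a one-meter move."""
    if a == 1:
        if s == sp + 1:
            return 0.8         # the move succeeds
        if s == sp:
            return 0.2         # the move fails, the robot stays put
    return 0.0

def sensor_model(o, s):
    """P(Ot = o | St = s): 'd' = door, 'w' = wall."""
    if s in DOORS:
        return 0.7 if o == 'd' else 0.3
    return 0.4 if o == 'd' else 0.6
```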

Tracking (1/3)

•   Local localization or tracking
•   Initially, the mobile robot is located at 5 meters
•   P(S0 = 5) = 1 and P(S0 = s) = 0 for s ∈ {1..4, 6..20}

•   It moves by one meter: A1=1 (prediction)
•   P(S1| A1 = 1)
= ΣS0 P(S1| S0, A1 = 1) x P(S0)
= P(S1| S0 = 5, A1 = 1)
• P(S1 = 5| A1 = 1) = 0.2
• P(S1 = 6| A1 = 1) = 0.8

•   It observes a door: O1=d (estimation)
•   P(S1| O1 = d, A1 = 1)
= αP(O1 = d| S1) x P(S1| A1 = 1)
• P(S1 = 5| O1 = d, A1 = 1) = α x 0.4 x 0.2 = 8/64 = 1/8
• P(S1 = 6| O1 = d, A1 = 1) = α x 0.7 x 0.8 = 56/64 = 7/8

Tracking (2/3)

•    It moves again by one meter: A2=1 (prediction)
•    P(S2| A2 = 1, O1 = d, A1 = 1)
= ΣS1 P(S2| S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
• P(S2 = 5| A2 = 1, O1 = d, A1 = 1)
= ΣS1 P(S2 = 5| S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
= P(S2 = 5| S1 = 5, A2 = 1) x P(S1 = 5| O1 = d, A1 = 1)
= 0.2 x 1/8 = 1/40

• P(S2 = 6| A2 = 1, O1 = d, A1 = 1)
= ΣS1 P(S2 = 6| S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
= P(S2 = 6| S1 = 5, A2 = 1) x P(S1 = 5| O1 = d, A1 = 1) +
P(S2 = 6| S1 = 6, A2 = 1) x P(S1 = 6| O1 = d, A1 = 1)
= 0.8 x 1/8 + 0.2 x 7/8 = 11/40

• P(S2 = 7| A2 = 1, O1 = d, A1 = 1)
= ΣS1P(S2 = 7|S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
= P(S2 = 7| S1 = 6, A2 = 1) x P(S1 = 6| O1 = d, A1 = 1)
= 0.8 x 7/8 = 28/40 = 7/10


Tracking (3/3)

• It observes a door: O2 = d (estimation)
• P(S2| O2 = d, A2 = 1, O1 = d, A1 = 1)
= αP(O2 = d| S2) x P(S2| A2 = 1, O1 = d, A1 = 1)
•   P(S2 = 5| O2 = d, A2 = 1, O1 = d, A1 = 1)
= αP(O2 = d| S2 = 5) x P(S2 = 5| A2 = 1, O1 = d, A1 = 1)
= α x 0.4 x 1/40 = α x 1/100 ≈ 0.02
•   P(S2 = 6| O2 = d, A2 = 1, O1 = d, A1 = 1)
= αP(O2 = d| S2 = 6) x P(S2 = 6| A2 = 1, O1 = d, A1 = 1)
= α x 0.7 x 11/40 = α x 77/400 ≈ 0.40
•   P(S2 = 7| O2 = d, A2 = 1, O1 = d, A1 = 1)
= αP(O2 = d| S2 = 7) x P(S2 = 7| A2 = 1, O1 = d, A1 = 1)
= α x 0.4 x 7/10 = α x 7/25 ≈ 0.58
(with α = 1/(1/100 + 77/400 + 7/25) ≈ 2.07)

• The most probable position (S2 = 7) is not in front of a door:
the second observation most probably corresponds to a sensor error!
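The two tracking steps can be reproduced with the bayes_filter_step function and the corridor models sketched above (our own code, not the author's):

```python
# Initial belief: the robot is known to be at 5 meters.
belief = {s: (1.0 if s == 5 else 0.0) for s in STATES}

# Step 1: move one meter, observe a door.
belief = bayes_filter_step(belief, 1, 'd', dynamic_model, sensor_model, STATES)
# belief[5] = 1/8, belief[6] = 7/8

# Step 2: move one meter, observe a door again.
belief = bayes_filter_step(belief, 1, 'd', dynamic_model, sensor_model, STATES)
# belief[5] ≈ 0.02, belief[6] ≈ 0.40, belief[7] ≈ 0.58
```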

Global localization
• Initial position unknown
• P(S0 = s) = 1/20 for each s
• Progressively, we are able to find the most
probable position of the robot

[Figure: evolution of the belief during global localization, from the PhD thesis of Dieter Fox (1998)]

Conclusion
• Only a conceptual solution
• The integrals are seldom tractable
• Practical solutions are only possible under certain hypotheses:
• Discretization of the state space:
• Markov localization or discrete Bayesian filters
• Markov chains & Hidden Markov models (see M2R “knowledge
representation & reasoning”)
• Representation of the state space by a Gaussian:
• Kalman filters, Extended Kalman filters
• Representation of the state space by particles:
• Particle filters

• Different types of filters:
• Markov chain: no actions, no observations;
• Hidden Markov models: no actions;
• Markov localization, Kalman filters, particle filters: actions and
observations

• We also have to define the dynamic model and the sensor model

Content
• Fundamentals of Bayesian Techniques (E. Sucar)
• Bayesian Filters (O. Aycard)
• Definition & interests
• Implementations
• Kalman filters
• Particle filters


Kalman Filter
•   The Kalman Filter is an implementation of Bayesian Filters
•   The state is represented by a Gaussian distribution

•   St or Xt: state at time t (St is a Gaussian distribution represented by N(µt, ∑t))
•   Ot or Zt: observation at time t
•   At or Ut: action at time t

•   Hypotheses:
• Order-1 Markov model
• P(St | St-1,At): St is a linear combination of St-1 and At. The noise
associated with this model is Gaussian.
• P(Ot | St): Ot is a linear combination of St. The noise associated with
this model is Gaussian.

•   Goal: compute the posterior distribution P(ST | O0:T,A1:T)

Example: initial position

[Figure: two Gaussian initial beliefs, both with mean µ0 = 1, one with
variance ∑0 = 2 and one with ∑0 = 4, plotted over positions -5 to 10]

Kalman filter: assumptions (1/2)
•   St is a Gaussian distribution represented by N(µt, ∑t)
• The dynamic model is linear and the associated noise is Gaussian
• St = PtSt-1 + BtAt + εt or Xt = f(Xt-1, Ut)
•     St is a vector of size n
•     Pt is an n x n matrix
•     At is a vector of size m
•     Bt is an n x m matrix
•     εt is a Gaussian distribution represented by N(0, Qt), modeling the
noise associated with the dynamic model

P(ST | ST−1, AT) = 1/((2π)^(n/2) √det(QT)) × exp(−½ (ST − PT ST−1 − BT AT)ᵀ QT⁻¹ (ST − PT ST−1 − BT AT))

Kalman filter: assumptions (2/2)
•     The sensor model is linear and the associated noise is Gaussian
•   Ot = CtSt + wt or Zt = h(Xt) + wt
•   Ot is a vector of size k
•   Ct is a k x n matrix
•   wt is a Gaussian distribution represented by N(0, Rt), modeling the noise
associated with the observation model

P(OT | ST) = 1/((2π)^(k/2) √det(RT)) × exp(−½ (OT − CT ST)ᵀ RT⁻¹ (OT − CT ST))

• St+1 is a Gaussian distribution represented by N(µt+1, ∑t+1)
• Only the mean and the covariance are needed;
• Exact inference using simple linear equations
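The two densities above are ordinary multivariate Gaussians and can be evaluated directly; a small numpy sketch (our own helper names):

```python
import numpy as np

def gaussian_density(x, mean, cov):
    """Multivariate normal density N(x; mean, cov)."""
    x, mean, cov = np.atleast_1d(x), np.atleast_1d(mean), np.atleast_2d(cov)
    diff = x - mean
    norm = (2 * np.pi) ** (x.size / 2) * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm)

def dynamic_density(s, s_prev, a, P, B, Q):
    """P(St = s | St-1 = s_prev, At = a) for the linear-Gaussian model."""
    return gaussian_density(s, P @ s_prev + B @ a, Q)
```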

Kalman filter: equations

µt+1 = Pt µt + Bt At
∑t+1 = Pt ∑t Ptᵀ + Qt                      }  Prediction

Kt+1 = ∑t+1 Ctᵀ / (Ct ∑t+1 Ctᵀ + Rt)       Kalman gain
(computed with the predicted covariance ∑t+1)

µt+1 = µt+1 + Kt+1(Ot+1 − Ct µt+1)
∑t+1 = (I − Kt+1 Ct)∑t+1                   }  Estimation
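For the one-dimensional example that follows, all these quantities are scalars; a minimal Python sketch (our own code; the default values of P, B, C, Q, R are the ones used in the example below):

```python
def kalman_step(mu, sigma, a, o, P=1.0, B=1.0, C=1.0, Q=4.0, R=1.0):
    """One prediction + estimation step of a scalar Kalman filter.

    (mu, sigma): previous Gaussian belief N(mu, sigma)
    a: action, o: observation.
    """
    # Prediction
    mu_pred = P * mu + B * a
    sigma_pred = P * sigma * P + Q
    # Kalman gain (with the predicted covariance)
    K = sigma_pred * C / (C * sigma_pred * C + R)
    # Estimation: confront the observation with the prediction
    mu_new = mu_pred + K * (o - C * mu_pred)
    sigma_new = (1 - K * C) * sigma_pred
    return mu_new, sigma_new
```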

Example (1/6)

• A mobile robot is initially located 1 meter from a wall:
=> (µ0 = 1, ∑0 = 2);
• It moves away from this wall with discrete translations of 3
meters (At = 3);
• It is equipped with a sensor measuring its distance to the wall
(Ot).

Kalman filter: matrices & noise
•     Pt = 1
•     Bt = 1      (action space and state space are identical)
•     Ct = 1      (observation space and state space are identical)
•     Qt = 4      (we know that the noise associated with the actuators is large)
•     Rt = 1      (we know that the noise associated with the observations is small)

µt+1 = 1 × µt + 1 × 3 = µt + 3
∑t+1 = 1 × ∑t × 1 + 4 = ∑t + 4             }  Prediction

Kt+1 = ∑t+1 × 1 / (1 × ∑t+1 × 1 + 1) = ∑t+1 / (∑t+1 + 1)       Kalman gain

Kalman filter: equations

µt+1 = 1 × µt + 1 × 3 = µt + 3
∑t+1 = 1 × ∑t × 1 + 4 = ∑t + 4             }  Prediction

Kt+1 = ∑t+1 / (∑t+1 + 1)                   Kalman gain

µt+1 = µt+1 + Kt+1(Ot+1 − 1 × µt+1) = µt+1 + Kt+1(Ot+1 − µt+1)
∑t+1 = (1 − Kt+1 × 1)∑t+1 = (1 − Kt+1)∑t+1                     }  Estimation

Example: initial stage (2/6)
µ0 = 1
∑0 = 2

Example (3/6)
µ1 = µ0 + 3 = 4

[Figure: the dynamic model shifts the mean of the belief]

Example (4/6): prediction stage

∑1 = ∑0 + 4 = 2 + 4 = 6

[Figure: the predicted belief P(S1|A1), wider than the initial one]

Example (5/6): estimation stage
O1 = 5
K1 = ∑1/(∑1 + 1) = 6/(6 + 1) = 6/7

[Figure: the sensor observation at time t]

Example (6/6): estimation stage
µ1 = µ1 + K1(O1 − µ1) = 4 + 6/7 × (5 − 4) = 4 + 6/7 ≈ 4.86
∑1 = (1 − K1)∑1 = (1 − 6/7) × 6 = 6/7

[Figure: the posterior pdf at time t, P(S1|A1,O1)]
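The whole example can be checked with the kalman_step sketch from above:

```python
mu1, sigma1 = kalman_step(mu=1.0, sigma=2.0, a=3.0, o=5.0)
# mu1 = 4 + 6/7 ≈ 4.857 and sigma1 = 6/7 ≈ 0.857, matching the slides.
```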

Conclusion
• The KF is limited to linear models (sensor and dynamic)
• The EKF linearizes the estimation around the current estimate
• Local linearization using a Taylor expansion
• Similar to the KF (i.e., Gaussian distribution for the state space)
• The KF and EKF are popular and successful
• The KF is limited to Gaussian noise
• The KF can only represent unimodal distributions: only
tracking
• Derived models:
• Unscented Kalman Filter (non-linear dynamic model);
• Multi-Hypothesis Kalman Filter.
