                                 Bayesian Filters

                           Dr. Olivier Aycard

                  Laboratoire d’Informatique de Grenoble

                             Grenoble, FRANCE

                      http://emotion.inrialpes.fr/aycard







                                 Content
    • Fundamentals of Bayesian Techniques
         • Logical propositions
         • Discrete random variables
         • Rules of bayesian calculus
    • Bayesian Filters




                      Logical proposition
 A logical proposition is a statement that is either true or false.
 It is denoted by a lowercase letter:
            d = "we are in December"
            t = "the temperature is 7 degrees"
 We use the usual logical operators:
                        d ∧ t
                        d ∨ t
                        ¬d





          Probability of a logical proposition
We associate a probability with a logical proposition to summarize
our knowledge about this proposition. We write it:
                        P(d) ∈ [0, 1]
We are also interested in the negation of a logical proposition:

                        P(¬d) ∈ [0, 1]
Finally, we are interested in the probability of a logical proposition t
conditioned on another proposition d:

                        P(t | d) ∈ [0, 1]
  which represents the probability that the temperature is 7 degrees
  given that we are in December.




                                          Content
    • Fundamentals of Bayesian Techniques
         • Logical propositions
         • Discrete random variables
         • Rules of bayesian calculus
    • Bayesian Filters








             Discrete Random Variables(1/2)
    • Variables are denoted by a name starting with an
      uppercase letter: X
    • By definition, a discrete variable X is a set of
      propositions xi
         • Mutually exclusive: i ≠ j ⇒ xi ∧ xj = false
         • Exhaustive: at least one is true
         • ∀i P(xi) ≥ 0
         • ∑i P(xi) = 1
    • The cardinal of X is denoted |X|

    [Figure: bar chart of the probabilities P(xi) of a random variable X
    over its possible values X1..X5]
             Discrete Random Variables(2/2)
• Example
    • D = "we are in December", defined by the logical proposition d
      and its negation;
         • We write D = T (respectively D = F), or d (respectively ¬d), for "we are
           in December" (respectively for "we are not in December").
    • T = "the temperature is i degrees", defined by the 61 logical
      propositions ti = "the temperature is i degrees" with -20 ≤ i ≤ 40;
         • We write T = 12 (respectively T = 5), or t12 (respectively t5), for
           "the temperature is 12 degrees" (respectively for "the temperature is
           5 degrees").






      Probability distribution associated with a discrete random
                                 variable

  • We say that P(X) is the probability distribution associated with
    a discrete random variable X defined by xi with 1 ≤ i ≤ N, if:
       • ∀i 0 ≤ P(xi) ≤ 1
       • ∑i P(xi) = 1

  • This probability distribution represents our knowledge about
    the variable X.
  • Example
       • P(D = T) = 1/12
       • P(¬d) = 11/12


                      Conditional discrete random variables

   • A discrete random variable can be defined conditionally on
     the value of another discrete random variable.
   • We write X | Y (X given Y).

   • Example
        • T | D, which defines "the temperature in degrees when we are (or are
          not) in December";
        • t5 | D = T: "the temperature is 5 degrees given that we are in
          December";
        • T = 12 | ¬d: "the temperature is 12 degrees given that we are not in
          December".







         Probability distribution associated with a conditional
                       discrete random variable

    • We say that P(X | Y) is the probability distribution
      associated with a discrete random variable X conditionally on
      a variable Y defined by yj with 1 ≤ j ≤ M, if:
         • ∀i,j 0 ≤ P(xi | yj) ≤ 1
         • ∀j ∑i P(xi | yj) = 1
         • This distribution represents our knowledge about the variable X
           conditionally on the variable Y.
    • Example
         • P(T = 12 | d) = 0.06
         • P(t5 | D = F) = 0.05

                               Content
    • Fundamentals of Bayesian Techniques
         • Logical propositions
         • Discrete random variables
         • Rules of bayesian calculus
    • Bayesian Filters








             Normalization and Conjunction
                      Postulates

                            P(a) + P(¬a) = 1


                          P (a ∧ b) = P (a) × P (b | a)
                                       = P (b) × P(a | b)




                           Conjunction rule
                            or Bayes' rule
                      ∀xi ∈ X, ∀yj ∈ Y:
                      P(xi ∧ yj) = P(xi) × P(yj | xi) = P(yj) × P(xi | yj)


                 P(X ∧ Y) = P(X) × P(Y | X) = P(Y) × P(X | Y)

        P(X ∧ Y | Z) = P(X | Z) × P(Y | X ∧ Z) = P(Y | Z) × P(X | Y ∧ Z)

                          P(X | Y) = P(X ∧ Y) / P(Y)





                         Normalization rule

                              ∑X P(X) = 1

                       ∀y ∈ Y: ∑X P(X | y) = 1




                         Marginalization rule

                          ∑X P(X ∧ Y) = P(Y)

                       ∑X P(X ∧ Z | Y) = P(Z | Y)

  Proof: ∑X P(X ∧ Y) = ∑X P(Y) × P(X | Y) = P(Y) × ∑X P(X | Y) = P(Y),
  by the conjunction rule and the normalization rule.
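  These rules can be checked numerically on the running example. Below is a
  minimal Python sketch, assuming an illustrative (made-up) joint distribution
  over D (December) and a few temperature values; only P(d) = 1/12 comes from
  the slides, the conditional table is my own assumption:

  import numpy as np

  temps = [5, 7, 12]
  P_D = np.array([1/12, 11/12])              # P(d), P(not d): normalization postulate
  P_T_given_D = np.array([[0.5, 0.3, 0.2],   # P(T | d)     (illustrative numbers)
                          [0.2, 0.3, 0.5]])  # P(T | not d) (illustrative numbers)

  P_joint = P_D[:, None] * P_T_given_D       # conjunction rule: P(D and T) = P(D) x P(T | D)
  assert np.isclose(P_joint.sum(), 1.0)      # normalization rule over the joint

  P_T = P_joint.sum(axis=0)                  # marginalization rule: sum over D gives P(T)
  P_D_given_T = P_joint / P_T                # Bayes' rule: P(D | T) = P(D and T) / P(T)
  assert np.allclose(P_D_given_T.sum(axis=0), 1.0)

  print("P(t12) =", P_T[temps.index(12)])
  print("P(d | t12) =", P_D_given_T[0, temps.index(12)])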








                                       Content
    • Fundamentals of Bayesian Techniques
    • Bayesian Filters
         • Definition & interests
              •   Definition
              •   Inference
              •   Example of localization: Markov localization
              •   Conclusion
         • Implementations
         • Conclusion




                                     Bayesian Filters
    •    Filtering is the problem of sequentially estimating the state of a system as a
         set of actions and observations becomes available on-line (from one or
         several sensors)

    •    St or Xt: state at time t
    •    Ot or Zt: observation at time t
    •    At or Ut: action at time t

    •    Hypotheses:
          • Order-1 Markov model
              • P(St|St-1,At): dynamic model
          • Sensor model
              • P(Ot|St): sensor model

    •    Goal: compute the posterior distribution P(St|O0:t,A1:t)








                               Recursive Inference
                                 Question(1/2)
   P(ST | O0:T, A1:T) = α P(ST, O0:T, A1:T)

   P(ST | O0:T, A1:T) = α P(OT | ST, O0:T−1, A1:T) × P(ST, O0:T−1, A1:T)
   P(ST | O0:T, A1:T) = α P(OT | ST) × P(ST, O0:T−1, A1:T)

   P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST, ST−1, O0:T−1, A1:T)
   P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, O0:T−1, A1:T) × P(ST−1, O0:T−1, A1:T)

   P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1, O0:T−1, A1:T)




                             Recursive Inference
                               Question(2/2)
 P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1, O0:T−1, A1:T)

 P(ST | O0:T, A1:T) = α P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T) × P(O0:T−1, A1:T)

 P(ST | O0:T, A1:T) = α' P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T)

 P(ST | O0:T, A1:T) = α' P(OT | ST) ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T−1)








                             Recursive Inference
  Initialization:

  P(S0 | O0) = (1/Z) × P(S0) × P(O0 | S0)

  Integral approximation (prediction):

  P(ST | O0:T−1, A1:T) = ∫ST−1 P(ST | ST−1, AT) × P(ST−1 | O0:T−1, A1:T−1)

  Estimation: confrontation of observation and prediction

  P(ST | O0:T, A1:T) = α' P(OT | ST) × P(ST | O0:T−1, A1:T)

                          Algorithm for bayesian filtering
  Initialization:

  P(S0 | O0) = (1/Z) × P(S0) × P(O0 | S0)


  Input: P(ST−1 | O0:T−1, A1:T−1) (previous probability distribution), AT, OT
  for all s ∈ ST
      P(ST = s | O0:T−1, A1:T) = ∫ST−1 P(ST = s | ST−1, AT) × P(ST−1 | O0:T−1, A1:T−1)   (prediction)
      P(ST = s | O0:T, A1:T) = α' × P(OT | ST = s) × P(ST = s | O0:T−1, A1:T)   (estimation: confrontation of prediction and observation)
  endfor
  return P(ST | O0:T, A1:T)
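  In a discrete state space the integral becomes a sum over the previous
  states, and the algorithm above fits in a few lines. A minimal Python sketch
  (the function name, array layout and dictionary-based models are illustrative
  assumptions, not the author's code):

  import numpy as np

  def bayes_filter_step(belief, action, observation, dynamic_model, sensor_model):
      # belief[s] = P(S_{T-1} = s | O_{0:T-1}, A_{1:T-1}), one entry per state
      # dynamic_model[a][s_prev, s] = P(S_T = s | S_{T-1} = s_prev, A_T = a)
      # sensor_model[o][s]          = P(O_T = o | S_T = s)
      predicted = dynamic_model[action].T @ belief       # prediction: sum over S_{T-1}
      posterior = sensor_model[observation] * predicted  # estimation: confront with O_T
      return posterior / posterior.sum()                 # normalization constant alpha'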







         Discrete bayesian filters or Markov localization(1/2)

      • We want to know the position of a robot in an environment
        (i.e., localize it)
      • We have a map of the environment (i.e., the positions of the
        doors in the corridor are known)
      • The environment is static: there are no dynamic entities in
        the environment
      • The mobile robot performs actions to move
      • After each action, the robot is able to observe its
        environment




     Discrete bayesian filters or Markov localization(2/2)

    • Bayesian filters are typically used to solve this problem
         • The environment is assumed to be discrete
         • Discrete bayesian filters or Markov localization
    • 3 main hypotheses regarding the initial position of the
      robot:
       • initial position known: local localization or tracking;
       • initial position unknown: global localization;
       • initial position false: kidnapping problem.

    • To apply bayesian filters, we have to define:
       • St, Zt or Ot, and At
       • P(St|St-1,At) and P(Zt|St)





               Definition of variables and models(1/2)

    • A mobile robot moves one meter at a time (At=1) in a
      corridor of 20 meters (St ∈ {1..20})
    • The mobile robot is able to observe the doors and the walls
      of the corridor (Ot ∈ {d, w})
    • We have a map of the corridor: we know where the doors are
      located; there are doors at 6 meters (St=6), at 8 meters
      (St=8) and at 14 meters (St=14)




               Definition of variables and models(2/2)

    • Dynamic model: actions are not perfect !!!
         • P(St=s+1|St-1=s, At=1) = 0.8
         • P(St=s|St-1=s, At=1) = 0.2
    • Sensor model: observations are not perfect !!!
         • P(Ot=d|St=s) = 0.7 for s ∈ {6, 8, 14}
         • P(Ot=w|St=s) = 0.3 for s ∈ {6, 8, 14}

         • P(Ot=w|St=s) = 0.6 for s ∈ {1..5, 7, 9..13, 15..20}
         • P(Ot=d|St=s) = 0.4 for s ∈ {1..5, 7, 9..13, 15..20}
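    These corridor models are easy to encode as arrays for the
    bayes_filter_step sketch given earlier. A possible encoding (the array
    layout is my own choice, and the end of the corridor is made absorbing,
    which the slides do not specify):

    import numpy as np

    N = 20                                   # corridor positions St in {1..20}
    doors = {6, 8, 14}

    # Dynamic model for At = 1: advance one meter with probability 0.8, stay with 0.2
    dyn = np.zeros((N, N))                   # dyn[s_prev, s] = P(St = s | St-1 = s_prev, At = 1)
    for s in range(N):
        dyn[s, s] += 0.2
        dyn[s, min(s + 1, N - 1)] += 0.8     # position 20 keeps its mass (assumption)
    dynamic_model = {1: dyn}

    # Sensor model: P(Ot = d | door) = 0.7 and P(Ot = d | no door) = 0.4;
    # observing a wall 'w' has the complementary probabilities 0.3 and 0.6
    p_door = np.array([0.7 if s + 1 in doors else 0.4 for s in range(N)])
    sensor_model = {'d': p_door, 'w': 1.0 - p_door}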







                                     Tracking(1/3)

    •   Local localization or tracking
    •   Initially, the mobile robot is located at 5 meters
    •   P(S0 = 5) = 1 and P(S0 = s) = 0 for s ∈ {1..4, 6..20}

    •   It moves one meter: A1 = 1 (prediction)
    •   P(S1 | A1 = 1)
        = ΣS0 P(S1 | S0, A1 = 1) x P(S0)
        = P(S1 | S0 = 5, A1 = 1)
         • P(S1 = 5 | A1 = 1) = 0.2
         • P(S1 = 6 | A1 = 1) = 0.8

    •   It observes a door: O1 = d (estimation)
    •   P(S1 | O1 = d, A1 = 1)
        = αP(O1 = d | S1) x P(S1 | A1 = 1)
         • P(S1 = 5 | O1 = d, A1 = 1) = α x 0.4 x 0.2 = 8/64 = 1/8
         • P(S1 = 6 | O1 = d, A1 = 1) = α x 0.7 x 0.8 = 56/64 = 7/8
                                        Tracking(2/3)

    •    It moves one meter again: A2 = 1 (prediction)
    •    P(S2| A2 = 1, O1 = d, A1 = 1)
         = ΣS1 P(S2| S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
          • P(S2 = 5| A2 = 1, O1 = d, A1 = 1)
            = ΣS1 P(S2 = 5| S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
            = P(S2 = 5| S1 = 5, A2 = 1) x P(S1 = 5| O1 = d, A1 = 1)
            = 0.2 x 1/8 = 1/40

          • P(S2 = 6| A2 = 1, O1 = d, A1 = 1)
            = ΣS1 P(S2 = 6| S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
            = P(S2 = 6| S1 = 5, A2 = 1) x P(S1 = 5| O1 = d, A1 = 1) +
              P(S2 = 6| S1 = 6, A2 = 1) x P(S1 = 6| O1 = d, A1 = 1)
            = 0.8 x 1/8 + 0.2 x 7/8 = 11/40

          • P(S2 = 7| A2 = 1, O1 = d, A1 = 1)
            = ΣS1P(S2 = 7|S1, A2 = 1) x P(S1| O1 = d, A1 = 1)
            = P(S2 = 7| S1 = 6, A2 = 1) x P(S1 = 6| O1 = d, A1 = 1)
            = 0.8 x 7/8 = 28/40 = 7/10





                                        Tracking(3/3)

• It observes a door: O2 = d (estimation)
• P(S2 | O2 = d, A2 = 1, O1 = d, A1 = 1)
  = αP(O2 = d | S2) x P(S2 | A2 = 1, O1 = d, A1 = 1)
    •   P(S2 = 5 | O2 = d, A2 = 1, O1 = d, A1 = 1)
        = αP(O2 = d | S2 = 5) x P(S2 = 5 | A2 = 1, O1 = d, A1 = 1)
        = α x 0.4 x 1/40 = α x 1/100 ≈ 0.02
    •   P(S2 = 6 | O2 = d, A2 = 1, O1 = d, A1 = 1)
        = αP(O2 = d | S2 = 6) x P(S2 = 6 | A2 = 1, O1 = d, A1 = 1)
        = α x 0.7 x 11/40 = α x 77/400 ≈ 0.40
    •   P(S2 = 7 | O2 = d, A2 = 1, O1 = d, A1 = 1)
        = αP(O2 = d | S2 = 7) x P(S2 = 7 | A2 = 1, O1 = d, A1 = 1)
        = α x 0.4 x 7/10 = α x 7/25 ≈ 0.58

    • The most probable position is now 7, where there is no door: the second
      door observation probably corresponds to a sensor error !!!
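    Running two steps of the bayes_filter_step sketch with the corridor models
    encoded earlier reproduces these values once the normalization constant α
    is applied consistently:

    belief = np.zeros(N)
    belief[4] = 1.0                          # tracking: P(S0 = 5) = 1
    for obs in ['d', 'd']:                   # A1 = A2 = 1 and O1 = O2 = d
        belief = bayes_filter_step(belief, 1, obs, dynamic_model, sensor_model)
    print(np.round(belief[4:7], 3))          # P(S2 = 5), P(S2 = 6), P(S2 = 7) ~ 0.021, 0.399, 0.580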



                           Global localization
• Initial position unknown
• P(S0 = s) = 1/20 for each s
• Progressively, we are
  able to find the most
  probable position of the
  robot




            [Figure: global localization in a corridor, from the PhD thesis of Dieter Fox (1998)]




                                    Conclusion
    • Only a conceptual solution
         • Integrals are seldom tractable
    • Practical solutions are only possible under some hypotheses:
         • Discretization of the state space:
              • Markov localization or discrete bayesian filters
              • Markov chains & Hidden Markov Models (see M2R “knowledge
                representation & reasoning”)
         • Representation of the state space by a Gaussian:
              • Kalman filters, Extended Kalman filters
         • Representation of the state space by particles:
              • Particle filters

    • Different types of filters:
         • Markov chain: no actions, no observations;
         • Hidden Markov models: no actions;
         • Markov localization, Kalman filters, particle filters: actions and
           observations.

    • We also have to define the dynamic model and the sensor model

                                        Content
    • Fundamentals of Bayesian Techniques (E. Sucar)
    • Bayesian Filters (O. Aycard)
         • Definition & interests
         • Implementations
              • Kalman filters
              • Particle filters








                                      Kalman Filter
    •   The Kalman Filter is an implementation of Bayesian Filters
    •   The state space is represented by a Gaussian distribution

    •   St or Xt: state at time t (St is a Gaussian distribution represented by N(µt, ∑t))
    •   Ot or Zt: observation at time t
    •   At or Ut: action at time t

    •   Hypotheses:
        • Order-1 Markov model
            • P(St | St-1, At): St is a linear combination of St-1 and At. The noise
              associated with this model is Gaussian.
        • Sensor model
            • P(Ot | St): Ot is a linear combination of St. The noise associated with
              this model is Gaussian.

    •   Goal: compute the posterior distribution P(ST | O0:T, A1:T)

                              Example: initial position
                    µ0 = 1, ∑0 = 2                µ0 = 1, ∑0 = 4

      [Figure: two Gaussian densities with mean µ0 = 1; the curve with
      ∑0 = 4 is flatter and wider than the one with ∑0 = 2]




             Kalman filter: assumptions(1/2)
    •   St is a Gaussian distribution represented by N(µt, ∑t)
    •   The dynamic model is linear and the associated noise is Gaussian
         • St = PtSt-1 + BtAt + εt   (or Xt = f(Xt-1, Ut))
              •   St is an n-vector
              •   Pt is an n × n matrix
              •   At is an m-vector
              •   Bt is an n × m matrix
              •   εt is a Gaussian distribution N(0, Qt) modeling the noise
                  associated with the dynamic model

        P(ST | ST−1, AT) = (2π)^(−n/2) × det(QT)^(−1/2)
                           × exp(−(1/2) (ST − PTST−1 − BTAT)^T QT^(−1) (ST − PTST−1 − BTAT))


                    Kalman filter: assumptions(2/2)
    •   The sensor model is linear and the associated noise is Gaussian
         •   Ot = CtSt + wt   (or Zt = h(Xt) + wt)
         •   Ot is a k-vector
         •   Ct is a k × n matrix
         •   wt is a Gaussian distribution N(0, Rt) modeling the noise
             associated with the observation model

        P(OT | ST) = (2π)^(−k/2) × det(RT)^(−1/2)
                     × exp(−(1/2) (OT − CTST)^T RT^(−1) (OT − CTST))

    •   St+1 is a Gaussian distribution represented by N(µt+1, ∑t+1)
         • Only the mean and covariance are needed;
         • Exact inference using simple linear equations




                                 Kalman filter: equations
    µt+1 = Ptµt + BtAt
    ∑t+1 = Pt∑tPt^T + Qt                      }  Prediction


    Kt+1 = ∑t+1Ct^T / (Ct∑t+1Ct^T + Rt)          Kalman gain
    (∑t+1 here is the predicted covariance)


    µt+1 = µt+1 + Kt+1(Ot+1 − Ctµt+1)
    ∑t+1 = (I − Kt+1Ct)∑t+1                   }  Estimation (updates the predicted mean and covariance)
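    A minimal one-dimensional Python sketch of these equations (scalar state,
    so all matrices reduce to numbers; the function name and the defaults are
    mine, with the noise values taken from the example that follows):

    def kalman_step(mu, sigma, action, obs, P=1.0, B=1.0, C=1.0, Q=4.0, R=1.0):
        # Prediction from the previous belief N(mu, sigma)
        mu_pred = P * mu + B * action
        sigma_pred = P * sigma * P + Q
        # Kalman gain, computed from the predicted covariance
        K = sigma_pred * C / (C * sigma_pred * C + R)
        # Estimation: confront the prediction with the observation
        mu_new = mu_pred + K * (obs - C * mu_pred)
        sigma_new = (1 - K * C) * sigma_pred
        return mu_new, sigma_new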




                                                            Example (1/6)




• A mobile robot is initially located at 1 meter from a wall:
  => (µ0 = 1, ∑0 = 2);
• It is moving away from this wall with discrete translations of 3
  meters (At = 3);
• It is equipped with a sensor to measure its distance to the wall
  (Ot).





                        Kalman filter: matrix & noise
    •     Pt = 1
    •     Bt = 1    (action space and state space are identical)
    •     Ct = 1    (observation space and state space are identical)
    •     Qt = 4    (we know that the noise associated with the actuators is important)
    •     Rt = 1    (we know that the noise associated with the observations is small)


    µt+1 = 1×µt + 1×3 = µt + 3
    ∑t+1 = 1×∑t×1 + 4 = ∑t + 4                }  Prediction


    Kt+1 = ∑t+1×1 / (1×∑t+1×1 + 1) = ∑t+1 / (∑t+1 + 1)     Kalman gain




                               Kalman filter: equations
    µt+1 = 1×µt + 1×3 = µt + 3
    ∑t+1 = 1×∑t×1 + 4 = ∑t + 4                }  Prediction


    Kt+1 = ∑t+1×1 / (1×∑t+1×1 + 1) = ∑t+1 / (∑t+1 + 1)     Kalman gain


    µt+1 = µt+1 + Kt+1(Ot+1 − 1×µt+1) = µt+1 + Kt+1(Ot+1 − µt+1)
    ∑t+1 = (I − Kt+1Ct)∑t+1 = (1 − Kt+1×1)∑t+1 = (1 − Kt+1)∑t+1     }  Estimation








                           Example: initial stage(2/6)
                                   µ0 = 1
                                   ∑0 = 2

                      [Figure: initial Gaussian belief N(1, 2)]
                                Example (3/6)
                      µ1 = µ0 + 3 = 4




                                    [Figure: dynamic model]





              Example(4/6): prediction stage

                        ∑1 = ∑0 + 4 = 2 + 4 = 6




                                    [Figure: predicted pdf P(S1 | A1)]

             Example (5/6): estimation stage
                      O1 = 5
                      K1 = ∑1 / (∑1 + 1) = 6 / (6 + 1) = 6/7




                            [Figure: sensor observation at time t]





             Example (6/6): estimation stage
                µ1 = µ1 + K1(O1 − µ1) = 4 + 6/7 × (5 − 4) = 4 + 6/7 ≈ 4.86
                ∑1 = (1 − K1)∑1 = (1 − 6/7) × 6 = 6/7




                            [Figure: posterior pdf at time t, P(S1 | A1, O1)]
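    The same numbers fall out of the kalman_step sketch given earlier:

    mu, sigma = kalman_step(mu=1.0, sigma=2.0, action=3.0, obs=5.0)
    print(mu, sigma)                         # ~ 4.857 (= 4 + 6/7) and ~ 0.857 (= 6/7)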
                                Conclusion
    • The KF is limited to linear models (sensor and dynamic)
    • The EKF linearizes the estimation around the current estimate
         • Local linearization using a Taylor expansion
         • Similar to the KF (i.e., Gaussian distribution for the state space)
    • The KF and EKF are popular and successful
    • The KF is limited to Gaussian noise
    • The KF can only represent unimodal distributions: tracking only
    • Derived models:
         • Unscented Kalman Filter (nonlinear dynamic model);
         • Multi Hypothesis Kalman Filter.



				