Variational Methods for Graphical Models

					        CS498-EA
      Reasoning in AI
       Lecture #14

        Professor: Eyal Amir
        Fall Semester 2009

* Some slides due to Fei-Fei Li (Stanford U)


 Summary So Far in Our Class
• We saw motivating applications
• We discussed two methods for
  propositional-logical reasoning
• We studied properties of graphical models
  of probability distributions
• We learned 2 kinds of probabilistic
  inference methods in graphical models
• We examined 2 methods for learning
  parameters of graphical models
 The Road Ahead in Our Class
• Variational Approximations
• Models and inference with dynamic
  (temporal) systems: logical, probabilistic
• More expressive representations and
  inference:
  – First-Order Logic (FOL)
  – Relational/First-Order Probabilistic Models
  – Semantic Web and Description Logics
• Cross-cutting issues
      Before we Continue…
• Applications of methods we’ve learned
• Review ideas and techniques
• Reinvigorate our search for more
  methods…




   Memories from Lecture 2…
• Applications of reasoning in AI
  – Econometrics
  – Social Networks
  – Verification of Circuits and Programs
  – Natural Language Processing
  – Robotics
  – Vision
  – Computer Security

 Econometrics Example: A Recession Model of a Country
– What is the probability of recession when a bank (b_m) goes into
    bankruptcy?

–   Recession: Recession of a country, in [0,1]
–   Market[X]: Quarterly market (X) index
–   Loss[X,Y]: Loss of a bank (Y) in a market (X)
–   Revenue[Y]: Revenue of a bank (Y)
Experiments




                            Social Networks
• Example: school friendships and their effects
  [Figure: graphical model with nodes Friend(A,B), Friend(A,C), Friend(B,C); Attr(A), Attr(B), Attr(C); Measuremt(A), Measuremt(B), Measuremt(C)]

  $$\Pr(f(A,B), f(A,C), f(B,C), a(A), a(B), a(C), m(A), m(B), m(C)) =
    \frac{1}{Z}\, \phi_1(f(A,B), a(A), a(B)) \cdot \phi_2(f(A,C), a(A), a(C)) \cdot \phi_3(f(B,C), a(B), a(C)) \cdot
    \phi_4(a(A), m(A)) \cdot \phi_5(a(B), m(B)) \cdot \phi_6(a(C), m(C))$$

  f(.,.), a(.), m(.): shorthand for Friend(.,.), Attr(.), and Measuremt(.)
  \phi_1 ... \phi_6: potential functions
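The factorization above can be sketched numerically for the three-person case. This is a minimal illustration, assuming all nine variables are binary; the potential values in `phi_friend` and `phi_measure` are hypothetical (the slides do not give them), and brute-force enumeration is used only because the model is tiny.

```python
import itertools

PEOPLE = ["A", "B", "C"]
PAIRS = [("A", "B"), ("A", "C"), ("B", "C")]

def phi_friend(f, a_x, a_y):
    # Hypothetical potential (phi_1..phi_3): friendship is more
    # compatible with equal attributes, non-friendship with unequal ones.
    return 2.0 if (f == 1) == (a_x == a_y) else 1.0

def phi_measure(a, m):
    # Hypothetical potential (phi_4..phi_6): the measurement tends
    # to agree with the underlying attribute.
    return 3.0 if a == m else 1.0

def unnormalized(f, a, m):
    """Product of the six potentials for one full assignment."""
    score = 1.0
    for x, y in PAIRS:
        score *= phi_friend(f[(x, y)], a[x], a[y])
    for x in PEOPLE:
        score *= phi_measure(a[x], m[x])
    return score

def all_assignments():
    """Enumerate all 2**9 assignments of the nine binary variables."""
    for bits in itertools.product([0, 1], repeat=9):
        f = dict(zip(PAIRS, bits[0:3]))
        a = dict(zip(PEOPLE, bits[3:6]))
        m = dict(zip(PEOPLE, bits[6:9]))
        yield f, a, m

# Partition function Z: sum of the unnormalized scores.
Z = sum(unnormalized(f, a, m) for f, a, m in all_assignments())

def prob(f, a, m):
    """Pr of one full assignment: (1/Z) times the product of potentials."""
    return unnormalized(f, a, m) / Z

def conditional_friend_ab(m_a, m_b):
    """Pr(f(A,B)=1 | m(A)=m_a, m(B)=m_b), computed by enumeration."""
    num = den = 0.0
    for f, a, m in all_assignments():
        if m["A"] == m_a and m["B"] == m_b:
            w = unnormalized(f, a, m)
            den += w
            if f[("A", "B")] == 1:
                num += w
    return num / den

print(conditional_friend_ab(1, 1))  # 13/24, about 0.542
```

Enumeration costs 2^9 terms here and grows exponentially with the number of people, which is one reason approximate inference becomes attractive for larger networks.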
[Figure: grounded network over six people (bob, joe, tom, ann, lia, val), with a pair of nodes f(x;y) and f(y;x) for each of the 15 pairs of people, and one h node and one b node per person]
Scaling-Up: Computing Pr(f(x,y))
[Figure 5 (caption truncated in source: "Computation time for"): plot of Time vs Number of People; x-axis: Number of People, 0 to 200,000; y-axis: Computation Time in Seconds, 0 to 50,000]
Application: Hardware Verification
 Circuit (inputs x1, x2, x3):
   f1 = x1 AND x2
   f2 = NOT x2
   f3 = NOT f1
   f4 = f2 OR x3
   f5 = f3 AND f4

 Question: Can we set this Boolean circuit to TRUE?

 f5(x1,x2,x3) = a function of the input signals
 SAT(f5)?

 $$f_5(x_1,x_2,x_3) = f_3 \land f_4 = \lnot f_1 \land (f_2 \lor x_3) = \lnot(x_1 \land x_2) \land (\lnot x_2 \lor x_3)$$

 A satisfying assignment: M[x1]=FALSE, M[x2]=FALSE, M[x3]=FALSE
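The satisfiability question for f5 can be checked directly for a circuit this small. The sketch below evaluates the gates on every input and collects the satisfying assignments; exhaustive enumeration is an assumption made for illustration and is not the propositional reasoning method covered in class.

```python
from itertools import product

def f5(x1, x2, x3):
    """Evaluate the circuit gate by gate."""
    f1 = x1 and x2   # AND gate
    f2 = not x2      # NOT gate
    f3 = not f1      # NOT gate
    f4 = f2 or x3    # OR gate
    return f3 and f4 # output AND gate

# All assignments (x1, x2, x3) under which the circuit outputs TRUE.
models = [bits for bits in product([False, True], repeat=3) if f5(*bits)]

for m in models:
    print(m)
```

The all-FALSE assignment M from the slide appears among the models, so f5 is satisfiable. With n inputs this loop visits 2^n assignments, so it only serves as a reference check on small circuits.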
        Hardware Verification
• Questions in logical circuit verification
  – Equivalence of circuits
  – Arrival of the circuit to a state (requires a
    temporal model, which gets propositionalized)
  – Achieving an output from the circuit
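The first question in the list, equivalence of circuits, reduces to asking whether any input makes the two outputs differ. A minimal sketch, assuming small combinational circuits given as Python functions; `find_difference`, `circuit`, and `candidate` are hypothetical names, and `candidate` is a hand-simplified form of the f5 formula used earlier in the lecture.

```python
from itertools import product

def circuit(x1, x2, x3):
    # f5 = NOT(x1 AND x2) AND (NOT x2 OR x3)
    return (not (x1 and x2)) and ((not x2) or x3)

def candidate(x1, x2, x3):
    # Hypothetical simplified form: NOT x2 OR (NOT x1 AND x3)
    return (not x2) or ((not x1) and x3)

def find_difference(g, h, n_inputs):
    """Return an input on which g and h disagree, or None if they agree everywhere."""
    for bits in product([False, True], repeat=n_inputs):
        if g(*bits) != h(*bits):
            return bits
    return None

# None means the two circuits are equivalent on all 2**3 inputs.
print(find_difference(circuit, candidate, 3))
```

A returned tuple is a counterexample: an input on which the circuits disagree. In practice the same question is usually posed as unsatisfiability of the XOR of the two outputs and handed to a SAT solver.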




 Natural-Language Processing
• Logical semantics
• Probabilistic choice between meanings
• Inference over time




Vision: Variability within a category
      Intrinsic              Deformation




                 Constellation model of
                   object categories




Burl, Leung, Weber, Welling, Fergus, Fei-Fei, Perona, et al.
Burl, Leung, et al. ’96 ’98; Weber, Welling, et al. ’98 ’00; Fergus, et al. ’03
Goal
• Use prior knowledge of other objects
• Estimate uncertainties in models
• Do full Bayesian learning
• Reduce the number of training examples
Variational Approximation Outline
• Motivation
• Outline of the Variational Approximation
  approach
• Loopy Belief Propagation
• Variational methodology
  – Sequential approach
  – Block approach


     Variational Inference
        (in three easy steps…)
1. Choose a family of variational
   distributions Q(H).
2. Use Kullback-Leibler divergence
   KL(Q||P) as a measure of ‘distance’
   between P(H|V) and Q(H).
3. Find Q which minimises divergence.
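The three steps can be sketched on a toy problem. This is an illustration under stated assumptions, not the method developed in the lecture: H consists of two binary hidden variables, the table `P_TILDE` of unnormalized values P(h1, h2, V=v) is hypothetical, the family in step 1 is the fully factorized (mean-field) family Q(H) = q1(h1) q2(h2), and step 3 uses coordinate-wise updates, each of which minimizes KL(Q||P) = sum_H Q(H) log(Q(H)/P(H|V)) over one factor with the other held fixed.

```python
import math

# Hypothetical unnormalized joint P(h1, h2, V=v) for the observed v.
P_TILDE = {(0, 0): 4.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}
EVIDENCE = sum(P_TILDE.values())  # P(V=v)
POSTERIOR = {h: p / EVIDENCE for h, p in P_TILDE.items()}  # P(H | V=v)

def kl_q_p(q1, q2):
    """Step 2: KL(Q || P) for Q(h1, h2) = q1[h1] * q2[h2]."""
    total = 0.0
    for (h1, h2), p in POSTERIOR.items():
        q = q1[h1] * q2[h2]
        if q > 0.0:
            total += q * math.log(q / p)
    return total

def normalize(weights):
    s = sum(weights)
    return [w / s for w in weights]

def update_q1(q2):
    # q1(h1) is proportional to exp(E_{q2}[log P~(h1, h2)]).
    return normalize([
        math.exp(sum(q2[h2] * math.log(P_TILDE[(h1, h2)]) for h2 in (0, 1)))
        for h1 in (0, 1)
    ])

def update_q2(q1):
    # q2(h2) is proportional to exp(E_{q1}[log P~(h1, h2)]).
    return normalize([
        math.exp(sum(q1[h1] * math.log(P_TILDE[(h1, h2)]) for h1 in (0, 1)))
        for h2 in (0, 1)
    ])

def mean_field(n_iters=50):
    """Step 3: alternate the two updates and record KL after each sweep."""
    q1, q2 = [0.6, 0.4], [0.6, 0.4]  # step 1: an arbitrary member of the family
    history = [kl_q_p(q1, q2)]
    for _ in range(n_iters):
        q1 = update_q1(q2)
        q2 = update_q2(q1)
        history.append(kl_q_p(q1, q2))
    return q1, q2, history

q1, q2, history = mean_field()
print(q1, q2, history[0], history[-1])
```

The recorded KL values never increase from one sweep to the next. Because this posterior is not a product of its marginals, the final KL stays strictly positive: the factorized family cannot represent P(H|V) exactly.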