                     EE562
            ARTIFICIAL INTELLIGENCE
                FOR ENGINEERS
                    Lecture 16, 6/1/2005

                  University of Washington,
            Department of Electrical Engineering
                         Spring 2005
             Instructor: Professor Jeff A. Bilmes

            Uncertainty & Bayesian Networks
                  Chapter 13/14




                     Outline
• Inference
• Independence and Bayes' Rule
• Chapter 14
     – Syntax
     – Semantics
     – Parameterized Distributions
     – Inference in Bayesian Networks


                   On the final
• Same format as midterm
• closed book/closed notes
• Might test on all material of the quarter,
  including today (i.e., chapters 1-9, 13, 14)
     – but will not test on fuzzy logic.
• Will be weighted towards the latter half of the
  course, though.

                   Homework
• Last HW of the quarter
• Due next Wed, June 1st, in class:
     – Chapter 13: 13.3, 13.7, 13.16
     – Chapter 14: 14.2, 14.3, 14.10




            Bayesian Networks

                Chapter 14




                Bayesian networks
• A simple, graphical notation for conditional
  independence assertions and hence for compact
  specification of full joint distributions

• Syntax:
     – a set of nodes, one per variable
     – a directed, acyclic graph (link ≈ "directly influences")
     – a conditional distribution for each node given its parents:
                                 P (Xi | Parents (Xi))

• In the simplest case, conditional distribution represented
  as a conditional probability table (CPT) giving the
  distribution over Xi for each combination of parent values
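
To make the CPT idea concrete, here is a minimal Python sketch; the dictionary layout and helper name are illustrative choices, and the numbers follow the textbook's burglary example used later in these slides.

    # A CPT for Boolean node Alarm with Boolean parents (Burglary, Earthquake):
    # one entry per combination of parent values, giving P(Alarm=true | B, E).
    alarm_cpt = {
        (True,  True):  0.95,
        (True,  False): 0.94,
        (False, True):  0.29,
        (False, False): 0.001,
    }

    def p_alarm(a, b, e):
        """Return P(Alarm=a | Burglary=b, Earthquake=e)."""
        p_true = alarm_cpt[(b, e)]
        return p_true if a else 1.0 - p_true

    print(p_alarm(True, True, False))   # 0.94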
            Example contd.




                             Semantics
The full joint distribution is defined as the product of the local
  conditional distributions:
            P(X1, …, Xn) = ∏_{i=1}^{n} P(Xi | Parents(Xi))


e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e)

   = P(j | a) P(m | a) P(a | ¬b, ¬e) P(¬b) P(¬e)
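
As a sketch of how such an entry is evaluated in code (CPT values are the textbook's for the burglary network; the function name is mine):

    # Evaluate one entry of the full joint as the product of local conditionals.
    P_B = 0.001                                  # P(Burglary=true)
    P_E = 0.002                                  # P(Earthquake=true)
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}   # P(Alarm=true | B, E)
    P_J = {True: 0.90, False: 0.05}              # P(JohnCalls=true | A)
    P_M = {True: 0.70, False: 0.01}              # P(MaryCalls=true | A)

    def joint(b, e, a, j, m):
        """P(B=b, E=e, A=a, J=j, M=m) as the product of the five CPT entries."""
        p  = P_B if b else 1 - P_B
        p *= P_E if e else 1 - P_E
        p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
        p *= P_J[a] if j else 1 - P_J[a]
        p *= P_M[a] if m else 1 - P_M[a]
        return p

    # the slide's entry: P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) ≈ 0.00063
    print(joint(b=False, e=False, a=True, j=True, m=True))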




                Local Semantics
Local semantics: each node is conditionally independent of its
  nondescendants given its parents




Thm: Local semantics ⇔ global semantics
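
The forward direction can be spelled out in one line. Number the variables in a topological order, so parents precede children; then every variable conditioned on in the chain rule is a nondescendant of Xi, and local semantics drops all of them except the parents:

    P(X_1, \ldots, X_n)
      = \prod_{i=1}^{n} P(X_i \mid X_{i-1}, \ldots, X_1)        % chain rule
      = \prod_{i=1}^{n} P(X_i \mid \mathrm{Parents}(X_i))       % local semantics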


            Example: car diagnosis
• Initial evidence: car won’t start
• Testable variables (green), “broken, so fix it” variables (orange)
• Hidden variables (gray) ensure sparse structure, reduce parameters.




            Example: car insurance




            compact conditional dists.
•   CPT grows exponentially with number of parents
•   CPT becomes infinite with continuous-valued parent or child
•   Solution: canonical distributions that are defined compactly
•   Deterministic nodes are the simplest case:
     – X = f(Parents(X)), for some deterministic function f (could be logical
       form)
• E.g., boolean functions
     – NorthAmerican ⇔ Canadian ∨ US ∨ Mexican
• E.g., numerical relationships among continuous variables
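
A deterministic node needs no table at all; a minimal sketch of the boolean example above (the function name is illustrative):

    # A deterministic node is just a function of its parents: the implied
    # "CPT" puts probability 1 on f(parents) and 0 everywhere else.
    def north_american(canadian: bool, us: bool, mexican: bool) -> bool:
        # NorthAmerican ⇔ Canadian ∨ US ∨ Mexican
        return canadian or us or mexican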




            compact conditional dists.
• “Noisy-OR” distributions model multiple non-interacting causes:
     – 1) Parents U1, …, Uk include all possible causes
     – 2) Independent failure probability qi for each cause alone
     – ⇒ X ⇔ U1 ∨ U2 ∨ … ∨ Uk
     – P(X | U1, …, Uj, ¬Uj+1, …, ¬Uk) = 1 - ∏_{i=1}^{j} qi
• Number of parameters is linear in the number of parents.
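
A minimal noisy-OR sketch (the qi values in the usage line are illustrative placeholders, not numbers from the slides): X fails to fire only if every cause that is present independently fails.

    def noisy_or(q_active):
        """P(X=true) given the failure probabilities q_i of the causes
        that are present; absent causes contribute nothing."""
        p_fail = 1.0
        for q in q_active:
            p_fail *= q
        return 1.0 - p_fail

    # two causes present, with illustrative q's:
    print(noisy_or([0.1, 0.3]))   # 1 - 0.1*0.3 = 0.97

The full CPT over k parents has 2^k rows, but all of them are determined by the k numbers qi, which is where the linear parameter count comes from.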




     Hybrid (discrete+cont) networks
• Discrete (Subsidy? and Buys?); continuous (Harvest and Cost)




• Option 1: discretization – large errors and large CPTs
• Option 2: finitely parameterized canonical families
     – Gaussians, Logistic Distributions (as used in Neural Networks)
• Continuous variables, discrete+continuous parents (e.g., Cost)
• Discrete variables, continuous parents (e.g., Buys?)
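
A sketch of the two standard parameterizations (every numeric parameter below is a made-up placeholder): a linear-Gaussian density for a continuous child with mixed parents, like Cost, and a logistic curve for a discrete child with a continuous parent, like Buys?.

    import math

    def gaussian_pdf(x, mu, sigma):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    # Cost | Harvest=h, Subsidy=s: Gaussian with mean linear in h,
    # one (slope, intercept, sigma) triple per value of the discrete parent s.
    COST_PARAMS = {True:  (-0.5, 10.0, 1.0),
                   False: (-0.5,  6.0, 1.5)}

    def p_cost(c, h, subsidy):
        a, b, sigma = COST_PARAMS[subsidy]
        return gaussian_pdf(c, a * h + b, sigma)

    # Buys? | Cost=c: probability of buying falls along a logistic curve
    # as the cost rises past the midpoint.
    def p_buys(c, midpoint=5.0, steepness=1.0):
        return 1.0 / (1.0 + math.exp(steepness * (c - midpoint)))

    print(p_cost(4.0, 10.0, True), p_buys(4.0))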

                Inference
•   by enumeration
•   by variable elimination
•   by stochastic simulation
•   by Markov chain Monte Carlo




                  Inference Tasks
• Simple queries: compute posterior marginal, P(Xi|E=e)
     – e.g., P(NoGas|Gauge=empty,Lights=on,Starts=false)
• Conjunctive queries:
     – P(Xi,Xj|E=e) = P(Xi|E=e)P(Xj|Xi,E=e)
• Optimal Decisions: decision networks include utility
  information; probabilistic inference required for
  P(outcome|action,evidence)
• Value of information: which evidence to seek next?
• Sensitivity analysis: which probability values are most
  critical?
• Explanation: why do I need a new starter motor?

        Inference By Enumeration
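
A rough sketch in the spirit of the textbook's ENUMERATION-ASK, specialized to the burglary network (a general version would walk an arbitrary network): for each value of the query variable, sum the full-joint product over every assignment to the hidden variables, then normalize.

    from itertools import product

    # Burglary-network CPTs (textbook values), as in the earlier sketches.
    P_B, P_E = 0.001, 0.002
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}
    P_J = {True: 0.90, False: 0.05}
    P_M = {True: 0.70, False: 0.01}

    def joint(b, e, a, j, m):
        p  = P_B if b else 1 - P_B
        p *= P_E if e else 1 - P_E
        p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
        p *= P_J[a] if j else 1 - P_J[a]
        p *= P_M[a] if m else 1 - P_M[a]
        return p

    def ask_burglary(j, m):
        """P(Burglary | JohnCalls=j, MaryCalls=m) by brute-force enumeration."""
        dist = {b: sum(joint(b, e, a, j, m)              # sum out hidden E, A
                       for e, a in product((True, False), repeat=2))
                for b in (True, False)}
        z = sum(dist.values())
        return {b: p / z for b, p in dist.items()}

    print(ask_burglary(j=True, m=True))   # P(b | j, m) ≈ 0.284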




            Enumeration Algorithm




            Evaluation Tree




   Inference by Variable Elimination




  Variable Elimination: Basic operations
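
Variable elimination is built from two factor operations, pointwise product and summing out. A minimal sketch over Boolean variables; the factor representation here (a tuple of variable names plus a value table) is my own choice, not the slides'.

    from itertools import product

    def pointwise_product(f, g):
        """Multiply factors f and g; a factor is (var_names, table)."""
        f_vars, f_tab = f
        g_vars, g_tab = g
        out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
        out_tab = {}
        for vals in product((True, False), repeat=len(out_vars)):
            env = dict(zip(out_vars, vals))
            out_tab[vals] = (f_tab[tuple(env[v] for v in f_vars)] *
                             g_tab[tuple(env[v] for v in g_vars)])
        return out_vars, out_tab

    def sum_out(var, f):
        """Eliminate var from factor f by summing over its values."""
        f_vars, f_tab = f
        i = f_vars.index(var)
        out_vars = f_vars[:i] + f_vars[i + 1:]
        out_tab = {}
        for vals, p in f_tab.items():
            key = vals[:i] + vals[i + 1:]
            out_tab[key] = out_tab.get(key, 0.0) + p
        return out_vars, out_tab

    # e.g.: multiply a factor over (A, B) with one over (B, C), then sum out B
    f = (("A", "B"), {(True, True): 0.3, (True, False): 0.7,
                      (False, True): 0.9, (False, False): 0.1})
    g = (("B", "C"), {(True, True): 0.2, (True, False): 0.8,
                      (False, True): 0.6, (False, False): 0.4})
    h = sum_out("B", pointwise_product(f, g))   # factor over (A, C)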




            Variable Elimination: Algorithm




            Irrelevant variables




            Irrelevant variables continued:




            Complexity of exact inference




      Inference by stochastic simulation




            Sampling from empty network
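
A rough sketch of the standard procedure: with no evidence, sample each node in topological order, conditioning on the already-sampled values of its parents (burglary network again; CPT values are the textbook's).

    import random

    P_B, P_E = 0.001, 0.002
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}
    P_J = {True: 0.90, False: 0.05}
    P_M = {True: 0.70, False: 0.01}

    def prior_sample(rng=random):
        """Draw one event (b, e, a, j, m) from the network's prior."""
        b = rng.random() < P_B
        e = rng.random() < P_E
        a = rng.random() < P_A[(b, e)]          # parents already sampled
        j = rng.random() < P_J[a]
        m = rng.random() < P_M[a]
        return b, e, a, j, m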




            Example




            Rejection Sampling
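
A sketch of rejection sampling as a generic wrapper around any prior sampler, such as prior_sample above: keep only the samples consistent with the evidence and read the query frequency off the survivors.

    def rejection_sample_posterior(sampler, consistent, query, n=100_000):
        """Estimate P(query | evidence): among samples where consistent(s)
        holds, return the fraction where query(s) holds."""
        kept = matched = 0
        for _ in range(n):
            s = sampler()
            if consistent(s):
                kept += 1
                matched += query(s)
        return matched / kept if kept else float("nan")

    # e.g., P(Burglary | JohnCalls=true, MaryCalls=true), with s = (b, e, a, j, m):
    # rejection_sample_posterior(prior_sample,
    #                            consistent=lambda s: s[3] and s[4],
    #                            query=lambda s: s[0])

The weakness is visible here: the evidence j ∧ m has probability around 0.002, so roughly 499 of every 500 samples are discarded, which is what motivates likelihood weighting.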




            Analysis of rejection sampling




            Likelihood weighting
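
A sketch of likelihood weighting for the same query: evidence nodes are clamped to their observed values rather than sampled, and each event is weighted by the probability that the evidence would have come out as observed.

    import random

    P_B, P_E = 0.001, 0.002
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}
    P_J = {True: 0.90, False: 0.05}
    P_M = {True: 0.70, False: 0.01}

    def lw_burglary(n=100_000, rng=random):
        """Estimate P(Burglary=true | JohnCalls=true, MaryCalls=true)."""
        num = den = 0.0
        for _ in range(n):
            b = rng.random() < P_B              # non-evidence nodes: sample
            e = rng.random() < P_E
            a = rng.random() < P_A[(b, e)]
            w = P_J[a] * P_M[a]                 # evidence nodes: clamp and weight
            num += w * b
            den += w
        return num / den                        # ≈ 0.28 for large n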




            MCMC




                        Summary
• Bayesian networks provide a natural representation for
  (causally induced) conditional independence
• Topology + CPTs = compact representation of joint
  distribution
• Generally easy for domain experts to construct
• Exact inference by variable elimination
     – polytime on polytrees, NP-hard on general graphs
     – space can be exponential as well
     – sampling approaches can help, as they only do approximate
       inference.
• Take my Graphical Models class if you're interested:
  it goes into much more theoretical depth
