					         Uncertainty in Artificial
     Intelligence Research at USC:
       Research Presentation for
           Graduate Students
                   September 10, 2004
                     Marco Valtorta
                      SWRG 3A55
                     mgv@cse.sc.edu
UNIVERSITY OF SOUTH CAROLINA
                               Department of Computer Science and Engineering
          Uncertainty in Artificial
               Intelligence
• Artificial Intelligence (AI)
   – [Robotics]
   – Automated Reasoning
        • [Theorem Proving, Search, etc.]
        • Reasoning Under Uncertainty
           – [Fuzzy Logic, Possibility Theory, etc.]
           – Normative Systems
               » Bayesian Networks
               » Influence Diagrams (Decision Networks)

               Research Interests
• Algorithms for Probability Update in BNs
   – factor tree method, with Mark Bloemeke
• Modeling of uncertain evidence
   – observation variables, with Young-Gyun Kim and Jirka
     Vomlel
• Soft Evidential Update in BNs
   – and the big clique algorithm, with Young-Gyun Kim and
     Jirka Vomlel
• Causal Bayesian networks
• Learning
   – CB algorithm, with Moninder Singh and Bing Xia
   – the effect of data quality on learning, with Valerie
     Sessions
        Algorithms and Modeling
• Algorithms for probability update in BNs
   – factor tree method, with Mark Bloemeke
• Modeling of uncertain evidence with observation
  variables, with Young-Gyun Kim and Jirka Vomlel
• Soft evidential update in BNs and the big clique
  algorithm, with Young-Gyun Kim and Jirka Vomlel
• Causal Bayesian networks, with Yimin Huang



         Correlation vs. Causation
• The genotype theory (Fisher, 1958) of smoking
  and lung cancer: smoking and lung cancer are
  both effects of a genetic predisposition
• Three-node network: U → X and U → Y, with no arrow from X to Y
• X (smoking) and Y (lung cancer) are in lockstep
• X precedes Y in time (smoking before cancer)
• But X does not cause Y: if we set X, Y does not change, because
  Y changes only according to the value of U (the genotype)




                                  Causality: Models, Reasoning and Inference Chapter 3
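The argument on this slide can be checked on a tiny model. The following is a minimal sketch, assuming a deterministic version of the genotype theory in which X and Y both simply copy U. The value P(U=1) = 0.3 and the helper names are hypothetical and chosen only for illustration.

```python
# Hypothetical prior on the genotype; not taken from the slides.
P_U1 = 0.3


def p_u(u):
    """P(U=u) for a binary genotype U."""
    return P_U1 if u == 1 else 1.0 - P_U1


def observe():
    """Observational joint P(X, Y): both X and Y copy U."""
    return {(u, u): p_u(u) for u in (0, 1)}


def intervene(x):
    """Joint P(X, Y) under do(X=x): X is forced, Y still copies U."""
    return {(x, u): p_u(u) for u in (0, 1)}


def p_y1_given_x(joint, x):
    """P(Y=1 | X=x) computed from a joint table keyed by (x, y)."""
    p_x = sum(p for (xv, _), p in joint.items() if xv == x)
    p_xy = sum(p for (xv, yv), p in joint.items() if xv == x and yv == 1)
    return p_xy / p_x


if __name__ == "__main__":
    for x in (0, 1):
        print("observed   P(Y=1 | X=%d)     =" % x, p_y1_given_x(observe(), x))
        print("intervened P(Y=1 | do(X=%d)) =" % x, p_y1_given_x(intervene(x), x))
```

In the observational table X predicts Y perfectly, while under either intervention the probability of Y=1 stays at P(U=1), which is the sense in which X does not cause Y in this model.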
            An Example [Cochran through Pearl, 2000]
Soil fumigants (X) are used to increase oat crop yields (Y) by
controlling the eelworm population (Z).
Last year’s eelworm population (Z0) is an unknown quantity
that is strongly correlated with this year’s population.
Through laboratory analysis of soil samples, we can determine
the eelworm populations before and after the treatments (Z1 and
Z2). Furthermore, we assume that the fumigants do not affect
the growth of the eelworms that survive the treatment. Instead,
the eelworms’ growth depends on the population of birds (B),
which is correlated with last year’s eelworm population and
hence with the treatment itself. Z3 represents the eelworm
population at the end of the season.

We wish to assess the total effect of the fumigants on yields, but controlled
randomized experiments are infeasible and Z0 is unknown.
Given a correct model, can we obtain a consistent estimate of the target
quantity (the total effect of the fumigants on yields) from observations alone?
                       Nonidentifiability
• The identifiability of the effect of X on Y ensures that it is
  possible to infer the effect of the action do(X=x) on Y from
  passive observations and the causal graph G, which
  specifies which variables participate in the determination
  of each variable in the domain
• To prove nonidentifiability, it is sufficient to present two
  sets of structural equations that induce identical
  distributions over the observed variables but have different
  causal effects
• X and Y are observable; U is not. All three are binary
  variables
• Let P(X=0|U) = (0.5,0.5)
• P(Y=0|X,U) is given by the table below
• We cannot observe U, so we do not know P(U)
• When P(U=0) = 0.5, P(Y|X=0) = (.45,.55)
• When P(U=0) = 0.1, P(Y|X=0) = (.73,.27)
• So, P(Y|do(X)) is non-identifiable

[Figure: causal graph over U, X, and Y; U is unobserved]

P(Y=0|X,U):
          X=0    X=1
   U=0    0.1    0.2
   U=1    0.8    0.7
                                              Causality: Models, Reasoning and Inference Chapter 3
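The two numerical bullets above can be reproduced by averaging the table for P(Y=0|X,U) over U. The following is a minimal sketch; the function name and the dictionary layout are hypothetical, and only the four table entries and the two priors on U come from the slide.

```python
# P(Y=0 | X=x, U=u), keyed by (x, u), as given in the slide's table.
P_Y0_GIVEN_XU = {
    (0, 0): 0.1,
    (1, 0): 0.2,
    (0, 1): 0.8,
    (1, 1): 0.7,
}


def p_y_setting_x(x, p_u0):
    """Return (P(Y=0), P(Y=1)) when X is set to x, averaging over U.

    p_u0 is the assumed value of P(U=0); U itself is never observed.
    """
    p_y0 = p_u0 * P_Y0_GIVEN_XU[(x, 0)] + (1.0 - p_u0) * P_Y0_GIVEN_XU[(x, 1)]
    return (p_y0, 1.0 - p_y0)


if __name__ == "__main__":
    for p_u0 in (0.5, 0.1):
        p0, p1 = p_y_setting_x(0, p_u0)
        print("P(U=0)=%.1f -> (%.2f, %.2f)" % (p_u0, p0, p1))
```

Because the slide sets P(X=0|U) to 0.5 for both values of U, X is independent of U in this parameterization, so conditioning on X=0 and setting X=0 give the same numbers. A complete argument along the lines of the second bullet would also need the two parameterizations to agree on the observed joint P(X, Y), which this sketch does not check.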
      Smoking and the genotype theory
• Consider the relation between
  smoking (X) and lung cancer (Y).
• The tobacco industry has managed to
  forestall antismoking legislation by
  arguing that the observed correlation
  between smoking and lung cancer
  could be explained by some sort of
  carcinogenic genotype (U) that
  involves an inborn craving for nicotine.
• Suppose that Z is the amount of tar
  deposited in a person's lungs and we
  believe in the causal model shown on
  the right.
• Can we now recover P(y | x̂), that is, P(y | do(x)), from
  observational data only?
                               Causality: Models, Reasoning and Inference Chapter 3
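The model figure did not survive extraction. Assuming it is the front-door graph discussed in Chapter 3 of the cited book (X → Z → Y, with the unobserved U pointing into both X and Y), the front-door formula P(y | do(x)) = Σ_z P(z | x) Σ_x' P(y | x', z) P(x') uses observed quantities only. The sketch below checks this numerically; every parameter value and helper name is hypothetical and serves only as an illustration.

```python
from itertools import product

# Hypothetical parameters for binary U, X, Z, Y (not from the slides).
P_U = [0.7, 0.3]                      # P(U=u)
P_X1_U = [0.2, 0.8]                   # P(X=1 | U=u)
P_Z1_X = [0.1, 0.9]                   # P(Z=1 | X=x)
P_Y1_ZU = [[0.1, 0.5], [0.4, 0.9]]    # P(Y=1 | Z=z, U=u), indexed [z][u]


def bern(p1, v):
    """Probability that a binary variable with P(1)=p1 takes value v."""
    return p1 if v == 1 else 1.0 - p1


def joint(u, x, z, y):
    """Full joint P(u, x, z, y) implied by the assumed graph."""
    return (P_U[u] * bern(P_X1_U[u], x) * bern(P_Z1_X[x], z)
            * bern(P_Y1_ZU[z][u], y))


# Observed joint P(x, z, y): the hidden U is summed out.
P_OBS = {
    (x, z, y): sum(joint(u, x, z, y) for u in (0, 1))
    for x, z, y in product((0, 1), repeat=3)
}


def p_obs(x=None, z=None, y=None):
    """Marginal probability of whichever observed variables are fixed."""
    return sum(
        p for (xv, zv, yv), p in P_OBS.items()
        if (x is None or xv == x)
        and (z is None or zv == z)
        and (y is None or yv == y)
    )


def front_door(y, x):
    """P(y | do(x)) computed from the observed joint alone."""
    total = 0.0
    for z in (0, 1):
        p_z_given_x = p_obs(x=x, z=z) / p_obs(x=x)
        inner = sum(
            p_obs(x=xp, z=z, y=y) / p_obs(x=xp, z=z) * p_obs(x=xp)
            for xp in (0, 1)
        )
        total += p_z_given_x * inner
    return total


def truth(y, x):
    """P(y | do(x)) computed with access to the hidden U, for comparison."""
    return sum(
        P_U[u] * bern(P_Z1_X[x], z) * bern(P_Y1_ZU[z][u], y)
        for u in (0, 1) for z in (0, 1)
    )


if __name__ == "__main__":
    for x in (0, 1):
        print(x, front_door(1, x), truth(1, x))
```

Under the assumed graph the two columns printed for each x agree, which suggests that the answer to the slide's question is yes for that model.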
                       Learning
• Parallel learning with background knowledge, with
  Bhaskara Moole
• CB algorithm, with Moninder Singh and Bing Xia
• Effect of data quality on learning, with Valerie
  Sessions




         An Example of Learning:
               Chernobyl




      A Bayesian Network Model




                      Simulation




          Simulation File Conversion
Hugin simulation case file (dated 8/2/2002):
   X,B,S,T,L,E,D,A
   yes,no,yes,no,no,no,yes,no
   no,yes,no,yes,yes,yes,no,yes

Visual CB simulation case file (dated 8/2/2002):
   8
   5000
   visit to Asia
   bronchitis
   ….
   12122212
   21211121

Pipeline: Hugin simulation file → Hugin to Visual CB Simulation File
Conversion Utility → Visual CB simulation file

Utility functions: countNodes(), renameNodes(), countCases(), assignValues()
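The conversion shown on this slide can be sketched in a few lines. This is an illustration only, not the actual utility: the four function names mirror those on the slide, but their bodies, the `convert` driver, and the `names` mapping are hypothetical. The sketch assumes that the Visual CB file lists the node count, the case count, one node name per line, and then one digit string per case, with yes coded as 1 and no as 2, as the two sample rows suggest.

```python
import csv
import io

# Assumed value coding, inferred from the two sample rows on the slide.
VALUE_CODES = {"yes": "1", "no": "2"}

SAMPLE = (
    "X,B,S,T,L,E,D,A\n"
    "yes,no,yes,no,no,no,yes,no\n"
    "no,yes,no,yes,yes,yes,no,yes\n"
)


def count_nodes(header):
    """Number of variables, taken from the header row."""
    return len(header)


def rename_nodes(header, names):
    """Replace short labels with long names where a mapping is supplied."""
    return [names.get(label, label) for label in header]


def count_cases(cases):
    """Number of simulated cases (data rows)."""
    return len(cases)


def assign_values(case):
    """Encode one case as a digit string, e.g. yes,no -> '12'."""
    return "".join(VALUE_CODES[value.strip().lower()] for value in case)


def convert(hugin_text, names=None):
    """Convert Hugin-style comma-separated cases to the assumed layout."""
    rows = [row for row in csv.reader(io.StringIO(hugin_text)) if row]
    header = [label.strip() for label in rows[0]]
    cases = rows[1:]
    lines = [str(count_nodes(header)), str(count_cases(cases))]
    lines += rename_nodes(header, names or {})
    lines += [assign_values(case) for case in cases]
    return "\n".join(lines)


if __name__ == "__main__":
    print(convert(SAMPLE))
```

Applied to the two sample cases, the sketch emits the digit strings 12122212 and 21211121 that appear in the Visual CB file on the slide. The mapping from short labels to long names is left to the caller because the slide elides most of the names.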


                      Sample(s)
                                                    Key

                                          Yes: 1
                                          Read: 1
                                          Received: 1
                                          Heard: 1
                                          Received: 1

                                          No: 2

                       Visual CB
• CB [Singh and Valtorta, 1993; 1995]
   – implemented in Visual C++ by Bing Xia (MS, 2002)




                       Learning




     Result on Chernobyl Example




                       Results II




                       Results III




                     Results IV




                    Applications
• Assessment of the risk of mental retardation in
  infants, with Subramani Mani and Suzanne
  McDermott
• Agent-based intrusion detection with soft evidence,
  with Vaibhav Gowadia and Csilla Farkas
• Support for intelligence analysis, with Michael
  Huhns, Hrishi Goradia, Jiangbo Dang, and
  Jingshan Huang
• Modeling damage in critical resources, with Yimin
  Huang and Bill Full


                       MENTOR




           The OmniSeer Project
• Represent prior knowledge to support intelligence
  analysis
• Explicate formerly tacit knowledge for use and
  collaboration
• Support relevance analysis, evidence gathering,
  and novelty detection
• …with Bayesian networks!


       OmniSeer Functional Architecture
[Figure: architecture diagram. Components: Massive Data (Events, Messages,
Tasks, Documents), Tagged messages, Modified Text, Matcher, Evidence,
Bayesian networks, BN Fragments, Instantiated Fragments, Outdated fragments,
Forgetter, Composer, Situation Specific Scenarios, Bayesian Reasoning Service,
Value of Information, Sensitivity Analyzer, Surprise Detector, Explanation,
Analysis, Visualization, Tacit Knowledge, Analyst]

Sample tagged messages:
   <Date>2002-09-20</Date>
   <Person>John Doe</Person>
   <Place>London</Place>
   …
   <Date>2002-09-27</Date>
   <Person>John Doe</Person>
   …

Annotations:
• The massive data might be filtered by preferences and interests
  specified in the UConn User Model
• The noun-phrase analyzer from UConn processes messages; a 3rd-party
  tagger processes news feeds
• BN fragments represent an analyst’s prior knowledge about terrorist
  activities or other domains of interest specified in the UConn user model
• Relevant facts extracted from the documents and messages fill in the
  details of the BN fragments of interest
• Outdated fragments are removed periodically from the set of partially
  instantiated fragments
• Instantiated BN fragments are composed into scenarios specific to the
  situation at hand
• The analyst explores which information should be acquired to reduce
  uncertainty and assesses the robustness of conclusions
• The analyst is notified of surprises and interesting situations, as
  specified in the UConn User Model
• Differences between an analyst’s conclusion and the situation-specific
  scenario lead to explication of formerly tacit knowledge, represented as
  new BN fragments
      Competence and Resources
• Several faculty members in the CSE department
  have worked in normative probabilistic reasoning
  for many years
   – Some colleagues and students in the Statistics
     department are also interested
• Tools for editing BNs and IDs, propagation,
  interface with relational databases, soft evidential
  update, learning, etc., have been acquired or
  developed and used in projects and courses (CSCE
  582 and CSCE 822)
      Some Local UAI Researchers
    (Notably Missing: Juan Vargas)
• Billy Turkett, Ph.D. (Wake Forest)
• Young-Gyun Kim, Ph.D. (S.C. State)
• Wayne Smith, Ph.D. (Presbyterian College)
• Miguel Barrientos, Ph.D.
• Clif Presser, Ph.D. (Gettysburg College)
      Judea Pearl and Finn V. Jensen




            Additional Information
• Bayesian networks journal club
   – meets every two weeks on Wednesdays: next
     meeting on September 15 at 1pm in 3A75
   – http://www.cse.sc.edu/~mgv/BNSeminar/index.html
• 3A55, 777-4641
• mgv@cse.sc.edu
• www.cse.sc.edu/~mgv
