

Data Mining
Bayesian Belief Networks

Amani Sami Al_Shannag
Dr. Qasem Radaideh
        What is a Bayesian Classifier?

   - Bayesian classifiers are statistical classifiers
     based on Bayes' theorem.
   - They can predict the probability that a
     particular sample is a member of a particular class.
   - Perhaps the simplest Bayesian classifier is
     known as the Naive Bayesian classifier.
        Naive Bayesian Classification

   - Assumes that the effect of an attribute
     value on a given class is independent
     of the values of the other attributes. This
     assumption is known as class
     conditional independence.
   - In practice, however, there can be
     dependencies between attribute
     values. To handle these we use a
     Bayesian Belief Network, which provides
     a joint conditional probability distribution.
        Bayesian Belief Networks

   A BBN consists of two components:
   1. a directed acyclic graph
   2. a conditional probability table (CPT) for each variable
        Bayesian Belief Networks

   - The first component is a directed acyclic graph in which:
     - each node represents a variable; variables may
       correspond to actual data attributes or to "hidden"
       variables
     - each arc represents a probabilistic dependence
     - each variable is conditionally independent of its
       non-descendants, given its parents
        Bayesian Belief Networks

   Example network (arcs point from parent to child):

       FamilyHistory       Smoker
               \           /    \
              LungCancer      Emphysema
               /        \
       PositiveXRay   Dyspnea

   CPT for LungCancer (FH = FamilyHistory, S = Smoker):

               FH,S   FH,~S   ~FH,S   ~FH,~S
        LC      0.8    0.5     0.7      0.1
        ~LC     0.2    0.5     0.3      0.9
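As a concrete illustration, the LungCancer CPT above can be stored as a simple lookup keyed by the values of the parent nodes. This is only a minimal sketch; the variable and function names are illustrative, not from any particular library.

```python
# Sketch: the LungCancer CPT from the slide as a nested dict, keyed by
# the parent values (FamilyHistory, Smoker). Names are illustrative.
lung_cancer_cpt = {
    # (FH, S): {"LC": P(LC | FH, S), "~LC": P(~LC | FH, S)}
    (True, True):   {"LC": 0.8, "~LC": 0.2},
    (True, False):  {"LC": 0.5, "~LC": 0.5},
    (False, True):  {"LC": 0.7, "~LC": 0.3},
    (False, False): {"LC": 0.1, "~LC": 0.9},
}

def p_lung_cancer(lc, family_history, smoker):
    """Look up P(LungCancer = lc | FamilyHistory, Smoker) in the CPT."""
    return lung_cancer_cpt[(family_history, smoker)]["LC" if lc else "~LC"]

print(p_lung_cancer(True, True, True))   # P(LC | FH, S) = 0.8
```

Note that each row of the table (each fixed combination of parent values) sums to 1, since it is a full conditional distribution over the values of LungCancer.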
        Bayesian Belief Networks

   - The second component of a BBN is a conditional
     probability table (CPT) for each variable Z, which
     gives the conditional distribution P(Z | Parents(Z)),
     i.e. the conditional probability of each value of Z for
     each possible combination of values of its parents.
   - e.g. for the node LungCancer we may have

     P(LungCancer = "True" | FamilyHistory = "True", Smoker = "True") = 0.8
     P(LungCancer = "False" | FamilyHistory = "False", Smoker = "False") = 0.9
   - The joint probability of any tuple (z1, ..., zn)
     corresponding to variables Z1, ..., Zn is

     P(z1, ..., zn) = ∏_{i=1}^{n} P(zi | Parents(Zi))
   - By the chain rule of probability, the joint
     probability of all the nodes in the sprinkler network,
     where W = WetGrass, C = Cloudy, R = Rain, S = Sprinkler, is

     P(C, S, R, W) = P(C) * P(S|C) * P(R|C) * P(W|S,R)

   - Example: P(W, ~R, S, C)
     = P(W|S,~R) * P(~R|C) * P(S|C) * P(C)
     = 0.9 * 0.2 * 0.1 * 0.5 = 0.009
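The factorization above is easy to evaluate mechanically. The sketch below computes the sprinkler example; the CPT entries not quoted on the slide (e.g. P(W|S,R) for the other parent combinations) are filled in from the standard sprinkler example and are assumptions here.

```python
# Joint probability in the sprinkler network via the factorization
# P(C,S,R,W) = P(C) * P(S|C) * P(R|C) * P(W|S,R).
# Entries not shown on the slide follow the standard sprinkler example.
p_c = 0.5                                       # P(C = True)
p_s_given_c = {True: 0.1, False: 0.5}           # P(S = True | C)
p_r_given_c = {True: 0.8, False: 0.2}           # P(R = True | C)
p_w_given_sr = {                                # P(W = True | S, R)
    (True, True): 0.99, (True, False): 0.90,
    (False, True): 0.90, (False, False): 0.0,
}

def bernoulli(p_true, value):
    """P(X = value) for a binary variable with P(X = True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    """P(C=c, S=s, R=r, W=w) as the product of the four CPT factors."""
    return (bernoulli(p_c, c)
            * bernoulli(p_s_given_c[c], s)
            * bernoulli(p_r_given_c[c], r)
            * bernoulli(p_w_given_sr[(s, r)], w))

# The slide's example: P(W, ~R, S, C) = 0.9 * 0.2 * 0.1 * 0.5
print(joint(c=True, s=True, r=False, w=True))   # approximately 0.009
```

Summing `joint` over all 16 assignments gives 1, as it must for a proper joint distribution.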
        Training BBNs

   - If the network structure is known and all the variables are
     observable, then training the network simply requires the
     calculation of the conditional probability table entries.
   - When the network structure is given but some of the
     variables are hidden (variables believed to exert influence but
     not observable), a gradient descent method can be
     used to train the BBN from the training data. The
     aim is to learn the values of the CPT entries.
   - The case of hidden data is also referred to as missing
     values or incomplete data.
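In the fully observable case, each CPT entry is just a relative frequency: count how often the variable takes a value together with each parent configuration. A minimal sketch, using a made-up toy data set (the tuples below are illustrative, not from the slides):

```python
# Sketch: estimating a CPT entry by counting, when the structure is known
# and all variables are observed. Each tuple is
# (FamilyHistory, Smoker, LungCancer); the data are made up for illustration.
data = [
    (True, True, True), (True, True, True), (True, True, True),
    (True, True, False), (False, False, False), (False, False, True),
    (False, False, False), (False, False, False), (False, False, False),
]

def cpt_entry(data, fh, s, lc):
    """Estimate P(LungCancer = lc | FamilyHistory = fh, Smoker = s)
    as (# tuples matching fh, s, lc) / (# tuples matching fh, s)."""
    parent_count = sum(1 for d in data if d[0] == fh and d[1] == s)
    joint_count = sum(1 for d in data if d == (fh, s, lc))
    return joint_count / parent_count

print(cpt_entry(data, True, True, True))   # 3/4 = 0.75
```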
        Training BBNs

   - Training means that we must learn the values of the CPT entries.
   - Let D be a training set of s data tuples X1, X2, ..., Xs.
   - Let wijk be a CPT entry for the variable Yi = yij having
     parents Ui = uik.
     - e.g. from our example, Yi may be LungCancer, yij
       its value "True", Ui the list of parents of Yi, e.g.
       {FamilyHistory, Smoker}, and uik the values of
       the parent nodes, e.g. {"True", "True"}
        Training BBNs

   - Algorithms also exist for learning the network
     structure itself from the training data given
     observable variables (this is a discrete
     optimization problem).
   - In this sense, BBNs are an unsupervised
     technique for knowledge discovery.
        Bayesian Belief Networks

   - Bayesian belief networks allow combining
     prior knowledge about (in)dependence
     among variables with observed data.
   - A Bayesian belief network infers the
     probability distribution of the target variable
     given the observed values of the other variables.
        Bayesian Belief Networks

   Example network: A and B are root nodes, C is a child of A and B,
   and D and E are children of C.

     P(A=T) = 0.1, P(A=F) = 0.9        P(B=T) = 0.2, P(B=F) = 0.8

     P(C | A, B):               P(D | C):          P(E | C):
       A B   C=T   C=F            C   D=T  D=F       C   E=T  E=F
       T T   0.9   0.1            T   0.9  0.1       T   0.8  0.2
       T F   0.6   0.4            F   0.2  0.8       F   0.1  0.9
       F T   0.3   0.7
       F F   0.2   0.8

   Observed tuple (C hidden):
       A  B  C  D  E
       F  F  ?  F  T
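The "?" entry for C can be inferred by enumeration: evaluate the product of CPT factors for C = True and C = False and normalize. A minimal sketch using the CPT values from the example (only the needed row of P(C | A, B) is included; the function name is illustrative):

```python
# Sketch: inferring the hidden value C in the observed tuple
# (A=F, B=F, C=?, D=F, E=T) by enumerating both values of C and
# normalizing. CPT values are taken from the example network.
p_c_given_ab = {(False, False): 0.2}   # P(C=T | A=F, B=F); other rows omitted
p_d_given_c = {True: 0.9, False: 0.2}  # P(D=T | C)
p_e_given_c = {True: 0.8, False: 0.1}  # P(E=T | C)

def posterior_c(a, b, d, e):
    """P(C = True | A=a, B=b, D=d, E=e), computed by enumeration over C."""
    scores = {}
    for c in (True, False):
        p_c = p_c_given_ab[(a, b)] if c else 1.0 - p_c_given_ab[(a, b)]
        p_d = p_d_given_c[c] if d else 1.0 - p_d_given_c[c]
        p_e = p_e_given_c[c] if e else 1.0 - p_e_given_c[c]
        scores[c] = p_c * p_d * p_e   # unnormalized P(C=c, D=d, E=e | A=a, B=b)
    return scores[True] / (scores[True] + scores[False])

print(posterior_c(a=False, b=False, d=False, e=True))   # approximately 0.2
```

Here the unnormalized scores are 0.2 * 0.1 * 0.8 = 0.016 for C = True and 0.8 * 0.8 * 0.1 = 0.064 for C = False, so the evidence leaves C = True fairly unlikely.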
        Bayesian Belief Networks

   - Gradient ascent for Bayes nets
   - Let wijk denote the conditional probability that
     the network variable Yi will take on the value yij
     given that its immediate parents Ui take on the
     values given by uik.

     e.g. Yi  = Campfire
          Ui  = <Storm, BusTourGroup>
          yij = True
          uik = <False, False>
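The slides define wijk but stop short of the update rule. Following Mitchell's standard treatment of gradient ascent for Bayes nets (the Campfire example above is from that treatment), each weight is moved uphill on ln P(D | w) with learning rate η:

```latex
w_{ijk} \leftarrow w_{ijk}
  + \eta \sum_{d \in D} \frac{P(Y_i = y_{ij},\, U_i = u_{ik} \mid d)}{w_{ijk}}
```

After each step, the wijk are renormalized so that each stays in [0, 1] and, for every parent configuration uik, the entries sum to 1 over j.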
        Bayesian Belief Networks

   - We are interested in Bayesian nets because the
     Naive Bayes assumption of conditional
     independence is too restrictive.
   - But learning is intractable without some such
     independence assumptions.
   - Bayesian belief networks describe
     conditional independence among
     subsets of variables.
        Advantages of the Bayesian Approach

   - Bayesian networks can readily handle
     incomplete data sets.
   - Bayesian networks allow one to learn
     about causal relationships.
   - Bayesian networks readily facilitate use of
     prior knowledge.

Thank you!
