Posted on: 3/28/2012
Data Mining: Bayesian Belief Networks
Amani Sami Al_Shannag (2006930036)
Dr. Qasem Radaideh

What is a Bayesian Classifier?
Bayesian classifiers are statistical classifiers based on Bayes' theorem. They can predict the probability that a particular sample is a member of a particular class. Perhaps the simplest Bayesian classifier is known as the naive Bayesian classifier.

Naive Bayesian Classification
Naive Bayes assumes that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is known as class conditional independence. In practice, however, there can be dependences between attribute values. To model these dependences we use a Bayesian belief network (BBN), which provides a joint conditional probability distribution.

Bayesian Belief Networks
A BBN consists of two components:
1. a directed acyclic graph
2. a conditional probability table (CPT) for each variable

The first component is a directed acyclic graph in which:
- each node represents a variable; variables may correspond to actual data attributes or to "hidden variables"
- each arc represents a probabilistic dependence
- each variable is conditionally independent of its non-descendants, given its parents

Example: a network over the variables FamilyHistory (FH), Smoker (S), LungCancer (LC), Emphysema, PositiveXRay, and Dyspnea. The CPT for LungCancer, whose parents are FamilyHistory and Smoker:

          FH, S    FH, ~S   ~FH, S   ~FH, ~S
  LC       0.8      0.5      0.7      0.1
  ~LC      0.2      0.5      0.3      0.9

The second component of a BBN is a conditional probability table (CPT) for each variable Z, which gives the conditional distribution P(Z | Parents(Z)), i.e. the conditional probability of each value of Z for each possible combination of values of its parents. E.g.,
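The CPT above is easy to represent directly in code. Here is a minimal sketch in Python (the function and variable names are mine, not from the slides) that stores P(LungCancer | FamilyHistory, Smoker) as a dictionary keyed by parent values and answers point queries against it:

```python
# CPT for LungCancer given its parents (FamilyHistory, Smoker), taken from
# the table above. Keys are (family_history, smoker) value pairs; each entry
# stores P(LungCancer = True | parents).
lung_cancer_cpt = {
    (True, True): 0.8,
    (True, False): 0.5,
    (False, True): 0.7,
    (False, False): 0.1,
}

def p_lung_cancer(value, family_history, smoker):
    """Return P(LungCancer = value | FamilyHistory, Smoker)."""
    p_true = lung_cancer_cpt[(family_history, smoker)]
    return p_true if value else 1.0 - p_true

print(p_lung_cancer(True, True, True))    # P(LC=T | FH=T, S=T) -> 0.8
print(p_lung_cancer(False, False, False)) # P(LC=F | FH=F, S=F) -> 0.9
```

Note that only P(Z = True | parents) needs to be stored explicitly: for a binary variable the complementary row of the table (~LC) follows as 1 minus the stored entry.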
for node LungCancer we may have:
  P(LungCancer = "True" | FamilyHistory = "True", Smoker = "True") = 0.8
  P(LungCancer = "False" | FamilyHistory = "False", Smoker = "False") = 0.9
  ...
The joint probability of any tuple (z1, ..., zn) corresponding to variables Z1, ..., Zn is

  P(z1, ..., zn) = Π (i = 1..n) P(zi | Parents(Zi))

Bayesian Belief Network Example
[Figure: the sprinkler network — Cloudy is a parent of Sprinkler and Rain; Sprinkler and Rain are parents of WetGrass.]
By the chain rule of probability, the joint probability of all the nodes in the graph above is:
  P(C, S, R, W) = P(C) * P(S|C) * P(R|C) * P(W|S,R)
where W = WetGrass, C = Cloudy, R = Rain, S = Sprinkler.
Example:
  P(W ∩ ¬R ∩ S ∩ C) = P(W|S,¬R) * P(¬R|C) * P(S|C) * P(C) = 0.9 * 0.2 * 0.1 * 0.5 = 0.009

Training BBNs
If the network structure is known and all the variables are observable, then training the network simply requires calculating the conditional probability tables. When the network structure is given but some of the variables are hidden (variables believed to have an influence but not observable), a gradient descent method can be used to train the BBN on the training data. The aim is to learn the values of the CPT entries. The case of hidden data is also referred to as missing values or incomplete data.

Training BBNs
Training means that we must learn the values of the CPT entries.
- Let D be a training set of data tuples X1, X2, ..., X|D|.
- Let wijk be the CPT entry for the variable Yi = yij having parents Ui = uik.
E.g., from our example, Yi may be LungCancer, yij its value "True", Ui lists the parents of Yi, e.g. {FamilyHistory, Smoker}, and uik lists the values of the parent nodes, e.g. {"True", "True"}.

Training BBNs
Algorithms also exist for learning the network structure from the training data given observable variables (this is a discrete optimization problem). In this sense BBNs are an unsupervised technique for the discovery of knowledge.

Bayesian Belief Networks
Bayesian belief networks allow combining prior knowledge about (in)dependence among variables with observed data.
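The factored joint probability above can be checked numerically. A minimal sketch in Python (variable names are mine; the factor values are exactly the ones used in the worked sprinkler example):

```python
# Factors taken from the worked example:
# P(C) = 0.5, P(S|C) = 0.1, P(R|C) = 0.8 (so P(~R|C) = 0.2), P(W|S,~R) = 0.9.
p_c = 0.5
p_s_given_c = 0.1
p_not_r_given_c = 0.2
p_w_given_s_not_r = 0.9

# Chain rule over the network:
# P(W, ~R, S, C) = P(W|S,~R) * P(~R|C) * P(S|C) * P(C)
joint = p_w_given_s_not_r * p_not_r_given_c * p_s_given_c * p_c
print(round(joint, 6))  # 0.009
```

This is the whole point of the factorization: instead of storing one probability per joint assignment (2^4 entries here), the network stores a few small CPTs and multiplies the relevant entries together.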
A Bayesian belief network infers the probability distribution for the target variable given the observed values of other variables.

Bayesian Belief Networks
Example: a network over the binary variables A, B, C, D, E, where (as implied by the conditioning in the tables) A and B are parents of C, and C is a parent of D and E. Each row below gives P(var = T) and P(var = F):

  P(A):  T 0.1   F 0.9
  P(B):  T 0.2   F 0.8

  P(C | A, B):
    A=T, B=T:  T 0.9   F 0.1
    A=T, B=F:  T 0.6   F 0.4
    A=F, B=T:  T 0.3   F 0.7
    A=F, B=F:  T 0.2   F 0.8

  P(D | C):
    C=T:  T 0.9   F 0.1
    C=F:  T 0.2   F 0.8

  P(E | C):
    C=T:  T 0.8   F 0.2
    C=F:  T 0.1   F 0.9

Gradient Ascent for Bayes Nets
Let wijk denote the conditional probability that the network variable Yi will take on the value yij given that its immediate parents Ui take on the values given by uik. E.g.:
  Yi = Campfire, Ui = <Storm, BusTourGroup>, yij = True, uik = <False, False>

Why Bayesian Belief Networks?
We are interested in Bayesian networks because:
- The naive Bayes assumption of conditional independence is too restrictive, but the problem is intractable without some such assumptions.
- Bayesian belief networks describe conditional independence among subsets of variables.

Advantages of the Bayesian Approach
- Bayesian networks can readily handle incomplete data sets.
- Bayesian networks allow one to learn about causal relationships.
- Bayesian networks readily facilitate the use of prior knowledge.

Any questions?
Thank you!
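As a closing worked example of the inference claim made earlier (a BBN infers the target variable's distribution from observed values of other variables), here is a minimal sketch over the A–B–C–D–E example network. It computes P(C = T | D = T) by brute-force enumeration over all joint assignments; all names are mine, and exhaustive enumeration is used purely for illustration, since it scales exponentially with the number of variables:

```python
from itertools import product

# CPTs from the A-B-C-D-E example network; each entry is P(var = True | parents).
p_a = 0.1
p_b = 0.2
p_c_given_ab = {(True, True): 0.9, (True, False): 0.6,
                (False, True): 0.3, (False, False): 0.2}
p_d_given_c = {True: 0.9, False: 0.2}
p_e_given_c = {True: 0.8, False: 0.1}

def bernoulli(p_true, value):
    """P(var = value) for a binary variable with P(var = True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(a, b, c, d, e):
    # Chain rule over the network: P(A,B,C,D,E) = P(A) P(B) P(C|A,B) P(D|C) P(E|C)
    return (bernoulli(p_a, a) * bernoulli(p_b, b)
            * bernoulli(p_c_given_ab[(a, b)], c)
            * bernoulli(p_d_given_c[c], d)
            * bernoulli(p_e_given_c[c], e))

# P(C=T | D=T) = P(C=T, D=T) / P(D=T), summing the joint over all other variables.
num = sum(joint(a, b, True, True, e) for a, b, e in product([True, False], repeat=3))
den = sum(joint(a, b, c, True, e) for a, b, c, e in product([True, False], repeat=4))
print(round(num / den, 4))  # P(C=T | D=T), approximately 0.6175
```

Observing D = T raises the probability of C = T from its prior of 0.264 to roughly 0.62: evidence at one node propagates through the network to update beliefs about its neighbours.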