					      Conflicts in Bayesian Networks


                    January 23, 2007
                     Marco Valtorta
                     mgv@cse.sc.edu


UNIVERSITY OF SOUTH CAROLINA
                               Department of Computer Science and Engineering
                                Example: Case Study #4
                           Bayesian Network Fragment Matching

Report text:

1) Report Date: 1 April, 2003. FBI: Abdul Ramazi is the owner of the
Select Gourmet Foods shop in Springfield Mall, Springfield, VA. (Phone
number 703-659-2317). First Union National Bank lists Select Gourmet
Foods as holding account number 1070173749003. Six checks totaling
$35,000 have been deposited in this account in the past four months and
are recorded as having been drawn on accounts at the Pyramid Bank of
Cairo, Egypt and the Central Bank of Dubai, United Arab Emirates. Both
of these banks have just been listed as possible conduits in money
laundering schemes.

Partially-instantiated Bayesian network fragment (RDF, from the BN
fragment repository):

  <Protege:Person rdf:about="&Protege;Omniseer_00135"
    .....
    Protege:familyName="Ramazi"
    Protege:givenName="Abdulla" rdfs:label="Abdulla Ramazi"/>
  .....
  <Protege:Bank rdf:about="&Protege;Omniseer_00614"
    Protege:alternateName="Pyramid Bank of Cairo"
    rdfs:label="Pyramid Bank of Cairo">
    <Protege:address rdf:resource="&Protege;Omniseer_00594"/>
    <Protege:note rdf:resource="&Protege;Omniseer_00625"/>
  </Protege:Bank>
  .....
  <Protege:Report rdf:about="&Protege;Omniseer_00626"
    Protege:abstract="Ramazi's deposit in the past 4 months (1)"
    rdfs:label="Ramazi's deposit in the past 4 months (1)">
    <Protege:reportedFrom rdf:resource="&Protege;Omniseer_00501"/>
    <Protege:detail rdf:resource="&Protege;Omniseer_00602"/>
    <Protege:detail rdf:resource="&Protege;Omniseer_00612"/>
  </Protege:Report>
  </rdf:RDF>
              Example: Case Study #4
          Bayesian Network Fragment Composition

[Figure: several Bayesian network fragments are composed ("+") into a
single situation-specific scenario.]
             Value of Information
• An item of information is useful if acquiring it leads to a better
  decision, that is, to a more useful action
• An item of information is useless if the actions taken after
  acquiring it are no more useful than before acquiring it
• In particular, information is useless if the actions taken after
  acquiring it are the same as before acquiring it
• In the absence of a detailed model of the utility of actions, the
  decrease in uncertainty about a variable of interest is taken as a
  proxy for the increase in utility: the best item of information to
  acquire is the one that most reduces the uncertainty about the
  variable of interest
• Since the value that a new item of information will take is not known
  in advance, we average over its possible values
• Uncertainty is measured by entropy; reduction in uncertainty is
  measured by reduction in entropy
                       Example: Case Study #4
                Computing Value of Information and Surprise

This is the output of the VOI program on a situation-specific scenario
for Case Study #4 (Sign of the Crescent).

Ramazi performed illegal banking transactions. Is Ramazi a terrorist?
Yes. Would it help to know whether he traveled to sensitive locations?

Variable Travel (which represents suspicious travel) is significant for
determining the state of variable Suspect (whether Ramazi is a
terrorist), even when it is already known that Ramazi has performed
suspicious banking transactions.
       Value of Information: Formal Definition
• Let V be a variable whose value affects the actions to be taken by an
  analyst. For example, V indicates whether a bomb is placed on a
  particular airliner
• Let p(v) be the probability that variable V has value v
• The entropy of V is: H(V) = -Σ_{v∈V} p(V=v) log p(V=v)
• Let T be a variable whose value we may acquire (by expending
  resources). For example, T indicates whether a passenger is a known
  terrorist
• The entropy of V given that T has value t is:
  H(V|t) = -Σ_{v∈V} p(V=v|T=t) log p(V=v|T=t)
• The expected entropy of V given T is: EH(V|T) = Σ_{t∈T} p(T=t) H(V|t)
• The value of information is: -(EH(V|T) - H(V)) = H(V) - EH(V|T)
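The entropy and expected-entropy formulas above can be sketched
directly. A minimal Python sketch, assuming a hypothetical two-state
target V and test T with an invented joint distribution (none of the
numbers come from the case study):

```python
import math

# Hypothetical joint distribution P(V, T); all numbers are illustrative.
joint = {("yes", "yes"): 0.08, ("yes", "no"): 0.02,
         ("no", "yes"): 0.10, ("no", "no"): 0.80}

def entropy(dist):
    """H = -sum p log2 p over a dict of probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Marginals p(V=v) and p(T=t)
p_v, p_t = {}, {}
for (v, t), p in joint.items():
    p_v[v] = p_v.get(v, 0.0) + p
    p_t[t] = p_t.get(t, 0.0) + p

# Expected entropy EH(V|T) = sum_t p(T=t) H(V|t)
eh = 0.0
for t, pt in p_t.items():
    cond = {v: joint[(v, t)] / pt for v in p_v}  # p(V=v | T=t)
    eh += pt * entropy(cond)

voi = entropy(p_v) - eh  # value of information = H(V) - EH(V|T)
print(f"H(V) = {entropy(p_v):.4f}, EH(V|T) = {eh:.4f}, VOI = {voi:.4f}")
```

Averaging over the values of T, as the slide prescribes, guarantees
that the value of information is never negative.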
             Surprise Detection
 • Surprise is the situation in which evidence (a set of findings)
   and a situation-specific scenario are incompatible
 • Since situation-specific scenarios are Bayesian networks, it
   is very unusual for an outright inconsistency to occur
 • In some cases, however, the evidence is very unlikely in a
   given scenario; this may be because a rare case has been
   found, or because the scenario cannot explain the evidence
 • To distinguish these two situations, we compare the
   probability of the evidence in the situation-specific scenario
   to the probability of the evidence in a scenario in which all
   events are probabilistically independent and occur with the
   same prior probability as in the situation-specific scenario

                              Example: Case Study #4
                                          Computing Surprise
  The VALUE OF INFORMATION of the test node C for the target node A is 0.0
  Parsing the XMLBIF file 'ssn.xml' ...
  done!


      PROBABILITY FOR JOINT FINDINGS = 5.0E-4


  Prior probability for NODE: Suspicious Person=yes is 0.01
  Prior probability for NODE: Unusual Activities=yes is 0.0656
  Prior probability for NODE: Stolen Weapons=yes is 0.05
      PROBABILITY FOR INDIVIDUAL FINDINGS = 3.28E-5


  No conflict was detected.



This shows the output of the surprise-detection program. In this case,
the user is informed that no conflict was detected, i.e., the scenario
is likely to be a good interpretive model for the evidence received.
                Surprise Detection: Formal Definition
• Let the evidence be a set of findings: e = {V_i = v_i | i ∈ I}
• The probability of the evidence in the situation-specific scenario is
  P^S(e), where P^S(·) is the distribution represented in the
  situation-specific scenario
• The probability of the evidence in the model in which all variables
  are independent is P^I(e) = Π_{i∈I} P^S(V_i = v_i)
• The evidence is surprising if P^S(e) < P^I(e)
• The conflict index is defined as c_s = log[P^I(e) / P^S(e)]
• The probability under P^S that c_s is greater than K is at most 2^{-K}
• Proof [Laskey, 1991]:
  1 = Σ_e P^I(e) = Σ_e P^S(e) · [P^I(e) / P^S(e)]
    ≥ Σ_{e : P^I(e)/P^S(e) ≥ 2^K} P^S(e) · 2^K
    = 2^K · P^S(c_s ≥ K),
  so P^S(c_s ≥ K) ≤ 2^{-K}



• If the conflict index is high, it is unlikely that the findings could
  have been generated by sampling the situation-specific scenario
• It is reasonable to inform the analyst that no good explanatory
  model of the findings exists, and we are in the presence of a
  novel or surprising situation
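The conflict index is straightforward to compute once P^S(e) and the
individual priors are available. A minimal Python sketch that reuses
the numbers shown in the surprise-detection output earlier (joint
probability 5.0E-4; priors 0.01, 0.0656, 0.05):

```python
import math

# Marginal priors P^S(V_i = v_i) of the individual findings, and the
# joint probability P^S(e) of all findings together, taken from the
# surprise-detection output shown on the earlier slide.
individual_priors = [0.01, 0.0656, 0.05]
p_joint = 5.0e-4                          # P^S(e)

p_indep = math.prod(individual_priors)    # P^I(e) = product of priors

# Conflict index c_s = log[P^I(e) / P^S(e)]; positive means conflict.
conflict_index = math.log2(p_indep / p_joint)

if conflict_index > 0:
    print(f"Conflict detected (c_s = {conflict_index:.3f})")
else:
    print(f"No conflict detected (c_s = {conflict_index:.3f})")
```

With these numbers P^I(e) = 3.28E-5 is smaller than P^S(e), so the
index is negative and the program reports no conflict, agreeing with
the output on the slide.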
     The Independent Straw Model




•   In the absence of conflict, the joint probability of all the evidence
    variables is greater than the product of their individual probabilities.
    This is normally the case because findings that support a common scenario
    are positively correlated: P(x|y) > P(x), so P(x,y) = P(x|y)P(y) > P(x)P(y).

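The claim above can be checked numerically. A minimal Python sketch
with invented numbers for two positively correlated findings x and y:

```python
# Hypothetical numbers: finding x is much more likely when y holds.
p_y = 0.3
p_x_given_y = 0.6
p_x_given_not_y = 0.1

p_x = p_x_given_y * p_y + p_x_given_not_y * (1 - p_y)  # marginal P(x)
p_xy = p_x_given_y * p_y                               # joint P(x, y)

# Positive correlation makes the joint exceed the product of marginals.
print(f"P(x,y) = {p_xy:.3f} > P(x)P(y) = {p_x * p_y:.3f}")
```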
         Straw Models in Diagnosis




A bipartite straw model is obtained by eliminating some variables from
a given model. In diagnosis by heuristic classification, one can divide
the variables into three sets: Target, Evidence, and Other.

How to Compute the Conflict Index (I)

• The marginal probability of each finding is the normal result of any
  probability computation algorithm
   How to Compute the Conflict Index (II)
• The probability of the evidence is a by-product of probability
  update computed using the variable elimination or junction tree
  algorithms
    P(e) from the Variable Elimination Algorithm

[Bucket-elimination trace; the Greek variable names on the original
slide were lost in extraction. The query posterior is computed by
summing the product of all CPTs over the non-query variables, with the
evidence variables fixed to "yes". Each bucket collects the factors
that mention one variable and eliminates it, producing a message
H_n(u) = Σ_{x_n} Π_{i=1..j} C_i(x_n, u_{s_i}) that is placed in the
bucket of a later variable. The final bucket yields the unnormalized
posterior, which is scaled by a normalizing constant k; the probability
of the evidence is then P(e) = 1/k.]
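How P(e) falls out of probability update can be illustrated on the
smallest possible network. A minimal Python sketch, assuming a
hypothetical two-node network A → B with evidence B = yes (all numbers
invented): multiplying the evidence potential into P(A) gives an
unnormalized result whose sum is P(e), so the normalizing constant k
satisfies P(e) = 1/k.

```python
# Hypothetical two-node network A -> B; all numbers are illustrative.
p_a = {"yes": 0.2, "no": 0.8}                          # P(A)
p_b_given_a = {("yes", "yes"): 0.9, ("no", "yes"): 0.1,
               ("yes", "no"): 0.3, ("no", "no"): 0.7}  # P(B|A), key (b, a)

evidence_b = "yes"

# Multiply the evidence potential P(B=yes | A) into the prior P(A).
unnormalized = {a: p_a[a] * p_b_given_a[(evidence_b, a)] for a in p_a}

p_e = sum(unnormalized.values())   # P(e) = P(B=yes), the by-product
k = 1.0 / p_e                      # normalizing constant: P(e) = 1/k
posterior = {a: k * q for a, q in unnormalized.items()}  # P(A | B=yes)

print(f"P(e) = {p_e:.2f}, posterior = {posterior}")
```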
             Sensitivity Analysis
• Sensitivity analysis assesses how much the posterior
  probability of some event of interest changes with respect
  to the value of some parameter in the model
• We assume that the event of interest is the value of a
  target variable. The parameter is either a conditional
  probability or an unconditional prior probability
• If the sensitivity of the target variable's posterior to the
  parameter is low, then the analyst can be confident in the results,
  even if the analyst is not very confident in the precise value
  of the parameter
• If the sensitivity of the target variable to a parameter is
  very high, it is necessary to inform the analyst of the need
  to qualify the conclusion reached or to expend more
  resources to become more confident in the exact value of
  the parameter

                       Example: Case Study #4
                                 Computing Sensitivity




This is the output of the Sensitivity Analysis program on a situation-specific scenario for Case Study
#4.

In the context of the information already acquired, i.e., travel to dangerous places, large transfers of
money, etc., the parameter that links financial irregularities to being a suspect is much more
important for assessing the belief in Ramazi being a terrorist than the parameter that links dangerous
travel to being a suspect. The analyst may want to concentrate on assessing the first parameter
precisely.
     Sensitivity Analysis: Formal Definition

• Let the evidence be a set of findings: e = {V_i = v_i | i ∈ I}
• Let t be a parameter in the situation-specific scenario
• Then P(e)(t) = αt + β [Castillo et al., 1997; Jensen, 2000]
• α and β can be determined by computing P(e) for two values of t
• More generally, if t is a set of parameters, then P(e)(t) is a linear
  function in each parameter in t, i.e., it is a multi-linear function of t
• Recall that P(a | b) = P(a, b) / P(b)
• Then P(A = a | e)(t) = P(A = a, e)(t) / P(e)(t)
• We can therefore compute the sensitivity of a target variable V to a
  parameter t by repeating the same computation with two values for the
  evidence set, viz. e and e ∪ {V = v}




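The two-point determination of α and β can be sketched as follows. A
minimal Python sketch with a hypothetical one-parameter model: P(e) is
evaluated at two values of the parameter t, and the linear coefficients
of P(e)(t) = αt + β are recovered from the two results.

```python
def p_e_given_t(t):
    """P(e) as a function of the parameter t, for a hypothetical model:
    t = P(S=yes | F=yes), with P(F=yes) = 0.1, P(S=yes | F=no) = 0.05,
    and evidence e = {S=yes}. By construction, P(e)(t) is linear in t."""
    return 0.1 * t + 0.9 * 0.05

# Evaluate P(e) at two values of t; any two distinct values suffice.
t0, t1 = 0.2, 0.8
y0, y1 = p_e_given_t(t0), p_e_given_t(t1)

# Recover the linear coefficients of P(e)(t) = alpha * t + beta.
alpha = (y1 - y0) / (t1 - t0)
beta = y0 - alpha * t0
print(f"P(e)(t) = {alpha:.3f} * t + {beta:.3f}")
```

In practice the two evaluations would be two runs of the probability
update algorithm with the parameter set to t0 and t1; here the function
stands in for those runs.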

				