Chapter 2 part 1

Pattern Classification


All materials in these slides were taken from
Pattern Classification (2nd ed) by R. O.
Duda, P. E. Hart and D. G. Stork, John Wiley
& Sons, 2000
with the permission of the authors and the
publisher
             Chapter 2 (Part 1):
          Bayesian Decision Theory
             (Sections 2.1-2.2)


• Introduction
• Bayesian Decision Theory–Continuous Features

                          Introduction
• The sea bass/salmon example
  • State of nature, prior
     • State of nature is a random variable
     • The catch of salmon and sea bass is equiprobable
         •   $P(\omega_1) = P(\omega_2)$ (uniform priors)

         •   $P(\omega_1) + P(\omega_2) = 1$ (exclusivity and exhaustivity)




                                                       Pattern Classification, Chapter 2 (Part 1)




• Decision rule with only the prior information
   • Decide $\omega_1$ if $P(\omega_1) > P(\omega_2)$; otherwise decide $\omega_2$

• Use of the class-conditional information
• $P(x \mid \omega_1)$ and $P(x \mid \omega_2)$ describe the difference in
  lightness between the populations of sea bass and salmon








• Posterior, likelihood, evidence
  • $P(\omega_j \mid x) = P(x \mid \omega_j)\,P(\omega_j) \,/\, P(x)$     (Bayes formula)

  • Where, in the case of two categories,

    $$P(x) = \sum_{j=1}^{2} P(x \mid \omega_j)\,P(\omega_j)$$


  • Posterior = (Likelihood * Prior) / Evidence
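
As a minimal sketch of the Bayes formula above, the function below turns class-conditional likelihoods and priors into posteriors. The function name `posteriors` and the numeric values are illustrative assumptions, not taken from the slides.

```python
def posteriors(likelihoods, priors):
    """Return [P(w_j | x)] given [P(x | w_j)] and [P(w_j)] (Bayes formula)."""
    # Numerators: likelihood * prior for each class.
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    # Evidence P(x): the sum of the numerators over all classes.
    evidence = sum(joint)
    if evidence == 0.0:
        raise ValueError("P(x) is zero; the posteriors are undefined")
    return [j / evidence for j in joint]


# Made-up example: equal priors, x is three times as likely under class 1.
post = posteriors([0.6, 0.2], [0.5, 0.5])
print(post)
```

Because the evidence is only a normalizing constant, the posteriors always sum to 1, and with equal priors they are proportional to the likelihoods.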




•   Decision given the posterior probabilities

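    $x$ is an observation for which:

    if $P(\omega_1 \mid x) > P(\omega_2 \mid x)$    decide that the true state of nature is $\omega_1$
    if $P(\omega_1 \mid x) < P(\omega_2 \mid x)$    decide that the true state of nature is $\omega_2$

    Therefore, whenever we observe a particular $x$, the probability of
    error is:
               $P(\text{error} \mid x) = P(\omega_1 \mid x)$ if we decide $\omega_2$
               $P(\text{error} \mid x) = P(\omega_2 \mid x)$ if we decide $\omega_1$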




• Minimizing the probability of error
• Decide $\omega_1$ if $P(\omega_1 \mid x) > P(\omega_2 \mid x)$;
  otherwise decide $\omega_2$

  Therefore:
           $P(\text{error} \mid x) = \min[P(\omega_1 \mid x), P(\omega_2 \mid x)]$
                     (Bayes decision)
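
The two-category rule above can be sketched in a few lines. The function name `bayes_decide` and the example posteriors are assumptions made for illustration; ties go to class 2, following the "otherwise" branch of the rule.

```python
def bayes_decide(p1, p2):
    """Two-category Bayes decision from the posteriors P(w1 | x) and P(w2 | x).

    Returns (decided class, P(error | x)).
    """
    decision = 1 if p1 > p2 else 2
    # The error probability is the posterior of the class we did not pick.
    p_error = min(p1, p2)
    return decision, p_error


print(bayes_decide(0.75, 0.25))
```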


        Bayesian Decision Theory – Continuous Features

• Generalization of the preceding ideas
  • Use of more than one feature
  • Use of more than two states of nature
  • Allowing actions other than merely deciding on the state of
      nature
  • Introducing a loss function, which is more general than
      the probability of error






• Allowing actions other than classification primarily
  allows the possibility of rejection
   • Rejection in the sense of abstention
   • Don’t make a decision if the alternatives are too close
   • This must be tempered by the cost of indecision


• The loss function states how costly each action
  taken is
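
One way to picture rejection is sketched below. It assumes a zero-one loss for misclassification and a fixed cost for abstaining; the parameter `reject_cost` and the function name are hypothetical, not defined in the slides. Under those assumptions, deciding the most probable class has conditional risk 1 − max posterior, so abstaining is cheaper whenever that risk exceeds `reject_cost`.

```python
def decide_or_reject(posteriors, reject_cost):
    """Return the index of the chosen class, or None to abstain.

    Assumes zero-one loss for a wrong decision and a constant
    loss `reject_cost` for abstaining (both are assumptions).
    """
    best = max(range(len(posteriors)), key=lambda j: posteriors[j])
    risk_of_deciding = 1.0 - posteriors[best]
    if risk_of_deciding > reject_cost:
        return None  # alternatives too close: indecision is cheaper
    return best


print(decide_or_reject([0.55, 0.45], reject_cost=0.2))  # abstains
print(decide_or_reject([0.90, 0.10], reject_cost=0.2))  # decides class index 0
```

A smaller `reject_cost` makes abstention more attractive; a cost of 1 or more means the classifier never abstains under this loss.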





Let $\{\omega_1, \omega_2, \ldots, \omega_c\}$ be the set of $c$ states of nature
(or “categories”)

Let $\{\alpha_1, \alpha_2, \ldots, \alpha_a\}$ be the set of possible actions

Let $\lambda(\alpha_i \mid \omega_j)$ be the loss incurred for taking
action $\alpha_i$ when the state of nature is $\omega_j$




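Overall risk

$R$ is the expected loss of a decision rule $\alpha(x)$, obtained by averaging
the conditional risk over all $x$:

$$R = \int R(\alpha(x) \mid x)\, P(x)\, dx$$

Minimizing $R$ $\Longleftrightarrow$ minimizing the conditional risk $R(\alpha_i \mid x)$ over $i = 1, \ldots, a$ for every $x$

$$R(\alpha_i \mid x) = \sum_{j=1}^{c} \lambda(\alpha_i \mid \omega_j)\, P(\omega_j \mid x)$$

for each action $\alpha_i$ ($i = 1, \ldots, a$)
Note: This is the risk specifically for observation $x$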




Select the action $\alpha_i$ for which $R(\alpha_i \mid x)$ is minimum

         The overall risk $R$ is then minimum, and this minimum is called the
         Bayes risk: the best performance that can be achieved.
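
The conditional risk and the minimum-risk selection can be sketched as follows. The layout `loss[i][j]` for the loss of action i under state j, the function names, and the numeric values are illustrative assumptions.

```python
def conditional_risks(loss, posteriors):
    """Return [R(alpha_i | x)], where loss[i][j] = lambda(alpha_i | omega_j)."""
    return [
        sum(l_ij * p_j for l_ij, p_j in zip(row, posteriors))
        for row in loss
    ]


def bayes_action(loss, posteriors):
    """Return the index of the action with the smallest conditional risk."""
    risks = conditional_risks(loss, posteriors)
    return min(range(len(risks)), key=lambda i: risks[i])


# Made-up loss matrix: mistaking state 2 for state 1 costs twice as much
# as the opposite mistake; correct decisions cost nothing.
loss = [[0.0, 2.0],
        [1.0, 0.0]]
risks = conditional_risks(loss, [0.75, 0.25])
print(risks, bayes_action(loss, [0.75, 0.25]))
```

With these numbers action 0 is chosen, but at posteriors of (0.6, 0.4) the asymmetric loss already tips the choice to action 1 even though state 1 is still the more probable one.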





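• Two-category classification
       $\alpha_1$ : deciding $\omega_1$
       $\alpha_2$ : deciding $\omega_2$
       $\lambda_{ij} = \lambda(\alpha_i \mid \omega_j)$ :
       loss incurred for deciding $\omega_i$ when the true state of nature is $\omega_j$

Conditional risk:

              $R(\alpha_1 \mid x) = \lambda_{11} P(\omega_1 \mid x) + \lambda_{12} P(\omega_2 \mid x)$
              $R(\alpha_2 \mid x) = \lambda_{21} P(\omega_1 \mid x) + \lambda_{22} P(\omega_2 \mid x)$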




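Our rule is the following:
                  if $R(\alpha_1 \mid x) < R(\alpha_2 \mid x)$,
            action $\alpha_1$: “decide $\omega_1$” is taken

Substituting the definition of $R(\cdot)$, we have:
decide $\omega_1$ if:
             $\lambda_{11} P(\omega_1 \mid x) + \lambda_{12} P(\omega_2 \mid x) < \lambda_{21} P(\omega_1 \mid x) + \lambda_{22} P(\omega_2 \mid x)$

               and decide $\omega_2$ otherwise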


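We can rewrite
           $\lambda_{11} P(\omega_1 \mid x) + \lambda_{12} P(\omega_2 \mid x) < \lambda_{21} P(\omega_1 \mid x) + \lambda_{22} P(\omega_2 \mid x)$

As
      $(\lambda_{21} - \lambda_{11})\, P(\omega_1 \mid x) > (\lambda_{12} - \lambda_{22})\, P(\omega_2 \mid x)$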
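
The rearrangement only collects the terms belonging to each posterior on its own side of the inequality. Written out step by step, in the same notation as the conditional risks:

```latex
\begin{aligned}
\lambda_{11} P(\omega_1 \mid x) + \lambda_{12} P(\omega_2 \mid x)
  &< \lambda_{21} P(\omega_1 \mid x) + \lambda_{22} P(\omega_2 \mid x) \\
\lambda_{12} P(\omega_2 \mid x) - \lambda_{22} P(\omega_2 \mid x)
  &< \lambda_{21} P(\omega_1 \mid x) - \lambda_{11} P(\omega_1 \mid x) \\
(\lambda_{12} - \lambda_{22})\, P(\omega_2 \mid x)
  &< (\lambda_{21} - \lambda_{11})\, P(\omega_1 \mid x)
\end{aligned}
```

No division is involved, so the direction of the inequality is preserved whatever the signs of the loss differences.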





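Finally, we can rewrite
                $(\lambda_{21} - \lambda_{11})\, P(\omega_1 \mid x) > (\lambda_{12} - \lambda_{22})\, P(\omega_2 \mid x)$

using Bayes formula to replace the posterior probabilities with
likelihoods and priors (the common factor $1/P(x)$ cancels) to get:
decide $\omega_1$ if:

              $(\lambda_{21} - \lambda_{11})\, P(x \mid \omega_1)\, P(\omega_1) > (\lambda_{12} - \lambda_{22})\, P(x \mid \omega_2)\, P(\omega_2)$

               and decide $\omega_2$ otherwise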


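If $\lambda_{21} > \lambda_{11}$, then we can express our rule as a
   likelihood ratio:

The preceding rule is equivalent to the following rule:

   if
   $$\frac{P(x \mid \omega_1)}{P(x \mid \omega_2)} > \frac{\lambda_{12} - \lambda_{22}}{\lambda_{21} - \lambda_{11}} \cdot \frac{P(\omega_2)}{P(\omega_1)}$$

                Then take action $\alpha_1$ (decide $\omega_1$)
              Otherwise take action $\alpha_2$ (decide $\omega_2$)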
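
A minimal sketch of the likelihood-ratio rule follows. The function name, the argument order, and the example numbers are assumptions; `loss[i][j]` is taken to hold the loss of action i+1 when the true state is j+1. The comparison is written without dividing by the second likelihood so that a zero likelihood does not raise an error.

```python
def likelihood_ratio_decision(lik1, lik2, prior1, prior2, loss):
    """Decide class 1 or 2 by comparing the likelihood ratio to a threshold.

    lik1, lik2: P(x | w1), P(x | w2); loss[i][j]: lambda(alpha_{i+1} | omega_{j+1}).
    Requires loss[1][0] > loss[0][0], the condition under which the
    ratio form of the rule is valid.
    """
    (l11, l12), (l21, l22) = loss
    if l21 <= l11:
        raise ValueError("the ratio form needs lambda_21 > lambda_11")
    # The threshold depends on losses and priors only, not on x.
    threshold = (l12 - l22) / (l21 - l11) * (prior2 / prior1)
    # Equivalent to lik1 / lik2 > threshold when lik2 > 0.
    return 1 if lik1 > threshold * lik2 else 2


zero_one = [[0.0, 1.0], [1.0, 0.0]]
print(likelihood_ratio_decision(0.6, 0.2, 0.5, 0.5, zero_one))
```

With a zero-one loss and equal priors the threshold is 1, so the rule reduces to picking the class with the larger likelihood.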





Optimal decision property

“If the likelihood ratio exceeds a threshold value
independent of the input pattern x, we can take
optimal actions”




                         Exercise

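Select the optimal decision where:
$\Omega = \{\omega_1, \omega_2\}$
$P(x \mid \omega_1) \sim N(2, 0.5)$ (Normal distribution)
$P(x \mid \omega_2) \sim N(1.5, 0.2)$

$P(\omega_1) = 2/3$
$P(\omega_2) = 1/3$

$$\lambda = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$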