					Theoretical Foundations of Active Learning

              Steve Hanneke
               Machine Learning Department
                Carnegie Mellon University
                  shanneke@cs.cmu.edu
    Passive Learning
[Diagram: Learning Algorithm, Data Source, Expert / Oracle]
                   Labeled data points
             Algorithm outputs a classifier
    Active Learning (Sequential Design)
[Diagram: Learning Algorithm, Data Source, Expert / Oracle]
               Request for the label of a data point
                     The label of that point
            Request for the label of another data point
                     The label of that point
                            ...
               Algorithm outputs a classifier

    Label Complexity: How many label requests are required to learn?
    e.g., Das04, Das05, DKM05, BBL06, Kaa06, Han07a&b, BBZ07, DHM07, BHW08
      Active Learning Sometimes Helps
      An Example: 1-dimensional threshold functions.

         Take m unlabeled examples.
         Repeatedly request the label of the median point among those between the rightmost known - and the leftmost known +.
         Take any threshold consistent with the observed labels.


-      - -        - - -          - - -- -+ +               ++ + +          +

    This uses only about log(m) label requests,
    but yields a classifier consistent with all m examples!
    Exponential improvement over passive!
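
The binary-search procedure above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `active_learn_threshold`, the `oracle` callback, and the convention that a threshold t labels x as +1 exactly when x >= t are all assumptions, and labels are assumed to be noiseless.

```python
import random


def active_learn_threshold(points, oracle):
    """Binary search for a threshold consistent with every point.

    points: unlabeled real numbers.
    oracle: callable returning -1 or +1; assumed to be a noiseless
            threshold (every -1 point lies left of every +1 point).
    Returns (threshold, number_of_label_requests), where the learned
    classifier labels x as +1 iff x >= threshold.
    """
    xs = sorted(points)
    # Invariant: everything at index <= lo is -1, everything at index >= hi is +1.
    lo, hi = -1, len(xs)
    requests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2          # median of the still-uncertain points
        requests += 1
        if oracle(xs[mid]) > 0:
            hi = mid
        else:
            lo = mid
    threshold = xs[hi] if hi < len(xs) else float("inf")
    return threshold, requests


# Hypothetical usage: 1000 uniform points and a target threshold at 0.37.
rng = random.Random(0)
pts = [rng.random() for _ in range(1000)]
target = lambda x: 1 if x >= 0.37 else -1
learned_t, n_requests = active_learn_threshold(pts, target)
```

With m = 1000 points the uncertain range halves on each request, so the loop makes at most ceil(log2(m + 1)) = 10 requests, whereas a passive learner would need all 1000 labels to guarantee consistency with every point.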



Outline
 Formal Model
 Analysis of Uncertainty-based Active Learning
 Strict Improvements Over Passive Learning
 Open Problems




Formal Model




       CAL
       A simple idea from Cohn, Atlas & Ladner (1994).




Assuming zero noise (the realizable case), CAL produces a perfectly labeled data set,
which we can feed into any passive algorithm!
So we get a natural fallback guarantee.
Can we characterize the label complexity achieved by CAL?
Can we generalize it to handle label noise or non-separable data?
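
A minimal sketch of the CAL idea as it is usually described: request a label only when the hypotheses still consistent with past answers disagree on the point, and otherwise infer the label. The finite hypothesis list, the helper `make_threshold`, and the grid of 101 thresholds are illustrative assumptions, not part of the talk. The sketch also assumes the target lies in the class and labels are noiseless.

```python
import random


def cal(stream, oracle, hypotheses):
    """CAL-style selective sampling over a finite hypothesis class.

    Assumes the target is one of `hypotheses` and labels are noiseless,
    so the version space never becomes empty.
    Returns (labeled, requests): a label for every point in the stream,
    and the number of labels actually requested from the oracle.
    """
    version_space = list(hypotheses)
    labeled, requests = [], 0
    for x in stream:
        predictions = {h(x) for h in version_space}
        if len(predictions) > 1:
            # x is in the region of disagreement: ask the oracle.
            y = oracle(x)
            requests += 1
            version_space = [h for h in version_space if h(x) == y]
        else:
            # Every consistent hypothesis agrees, so the label is inferred.
            (y,) = predictions
        labeled.append((x, y))
    return labeled, requests


def make_threshold(t):
    """Hypothetical helper: the threshold classifier x -> +1 iff x >= t."""
    return lambda x: 1 if x >= t else -1


# Hypothetical usage: thresholds on a grid over [0, 1], target at 0.37.
grid = [make_threshold(k / 100) for k in range(101)]
target = make_threshold(37 / 100)
rng = random.Random(0)
stream = [rng.random() for _ in range(500)]
labeled, requests = cal(stream, target, grid)
```

Every requested label removes at least one hypothesis from the version space, so with 101 hypotheses the sketch makes at most 100 requests while still returning a correct label for all 500 points. That fully labeled set is what can be handed to a passive algorithm.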


Disagreement Coefficient [Hanneke,07]




                                 (for our purposes,
                                 take r0 = )

        DIS(B(f,r))
          f

       Concepts in B(f,r)
       look like this
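
For reference, the disagreement coefficient of [Hanneke,07] is usually written as below. The symbols used here (C for the concept space, P for the data distribution, and the subscript on theta) are notational assumptions rather than text recovered from the slide.

```latex
\[
B(f,r) = \{\, h \in C : P(h(X) \neq f(X)) \le r \,\}, \qquad
\mathrm{DIS}(V) = \{\, x : \exists\, h, g \in V,\ h(x) \neq g(x) \,\}
\]
\[
\theta_f = \sup_{r > r_0} \frac{P\big(\mathrm{DIS}(B(f,r))\big)}{r}
\]
```

As a worked instance, take thresholds on [0,1] under the uniform distribution: B(f,r) contains only thresholds within r of f, so DIS(B(f,r)) is an interval of length at most 2r and the ratio is at most 2. For intervals with an all-negative f, B(f,r) contains every interval of width at most r, so DIS(B(f,r)) is all of [0,1] and the ratio is 1/r, which grows as r shrinks toward r0.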

 Characterizes CAL’s Performance




What about Noise?




      Activized Learning
[Diagram: “Activizer” Meta-algorithm, Data Source, Expert / Oracle, Passive Learning Algorithm (Supervised / Semi-Supervised)]
                       Request for the label of a data point
                             The label of that point
                    Request for the label of another data point
                             The label of that point
                                    ...
                       Algorithm outputs a classifier

      Are there general-purpose activizers that strictly improve the label
      complexity of any passive algorithm?
Formal Model




Uncertainty-based Sampling Doesn’t Activize

Intervals

              -           -       - -      - - -              -
   0                                                                     1
                  Suppose the target labels everything “-1”




  Uncertainty-based sampling requests every label.
  No improvements over passive.
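
The failure can be reproduced with a small simulation. This is a sketch under stated assumptions: the hypothesis class is all closed intervals [a, b] plus the empty interval, labels are noiseless, and the names `in_disagreement` and `cal_intervals` are hypothetical. A point is requested only when two intervals consistent with the labels seen so far would label it differently.

```python
def in_disagreement(x, negatives, positives):
    """True if some consistent intervals label x +1 and others label it -1."""
    if not positives:
        # The empty interval says -1; the tiny interval [x, x] says +1
        # and is consistent unless x itself was already labeled -1.
        return x not in negatives
    p_lo, p_hi = min(positives), max(positives)
    if p_lo <= x <= p_hi:
        return False                   # every consistent interval covers x
    # Nearest known negatives on each side bound how far an interval can reach.
    left = max((z for z in negatives if z < p_lo), default=float("-inf"))
    right = min((z for z in negatives if z > p_hi), default=float("inf"))
    return left < x < right


def cal_intervals(stream, oracle):
    """Run disagreement-based sampling; return the number of label requests."""
    negatives, positives, requests = set(), set(), 0
    for x in stream:
        if in_disagreement(x, negatives, positives):
            requests += 1
            (positives if oracle(x) > 0 else negatives).add(x)
    return requests


# Hypothetical usage on 200 distinct points.
stream = list(range(200))
n_all_negative = cal_intervals(stream, lambda x: -1)
n_interval = cal_intervals(stream, lambda x: 1 if 80 <= x <= 120 else -1)
```

When the target labels everything -1, no positive point is ever observed, so every new point could still sit inside some tiny unseen interval and all 200 labels are requested. Once the target has a visible positive region, requests stop for points beyond the nearest known negatives, and fewer than 200 labels are needed.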




What’s Wrong? (formally)




How Can We Fix It?




A Simple Activizer

                     So, for whichever of the
                     2^k classifications can’t
                     be realized by V, look
                      at the label of x and
                       take the opposite.




This Works for Any C!




                        [HLW94] passive
                          algorithm has
                         O(1/ε) sample
                           complexity.




Dealing with Noise and Misspecification




                              Recall passive gets O(1/ε^2)
                                      (minimax)




Open Questions
 Question:
  What can we activize with noise?

 Question:
  Can we give more detailed bounds on a when >>1?

 Question:
  Is there a labeled/unlabeled trade-off under arbitrary DXY?



Thank You



       A Simple Activizer




Intervals revisited

       - - - - - - - - - - - - - - -- - -- - - -- - - -
   0                                                                            1
                                               x1
                      Again, suppose the target labels everything “-1”
  Passive algorithm trained on (n2) samples. Improved label complexity. 

 Efficiency?
 m = # unlabeled examples used by the algorithm.
 Suppose we can test separability of O(n) points in poly(n) time.
 Then SimpleActivizer runs in poly(n)·m time
  (plus the time of the passive algorithm).

 For most learning problems, we can set a poly(n) limit on m in the
  algorithm without losing our guarantees.



