# A Brief Introduction to AdaBoost

Hongbo Deng
6 Feb, 2007

Some of the slides are borrowed from Derek Hoiem & Jan Šochman.
## Outline

- Background
- Theory/Interpretations
## Advantages of AdaBoost

- Can be used with many different classifiers
- Improves classification accuracy
- Commonly used in many areas
- Simple to implement
- Not prone to overfitting
## A Brief History

- Bootstrapping: resampling for estimating a statistic
- Bagging: resampling for classifier design
- Boosting (Schapire 1989): resampling for classifier design
## Bootstrap Estimation

- Repeatedly draw n samples from D
- For each set of samples, estimate a statistic
- The bootstrap estimate is the mean of the individual estimates
- Used to estimate a statistic (parameter) and its variance
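As a concrete illustration, here is a minimal bootstrap sketch in Python; NumPy is assumed, and the sample data and the choice of the median as the statistic are invented for the example.

```python
import numpy as np

def bootstrap_estimate(data, statistic, n_boot=1000, rng=None):
    """Bootstrap estimate of a statistic and its variance.

    Draws len(data) samples with replacement n_boot times, computes the
    statistic on each resample, and returns the mean and variance of the
    individual estimates.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    estimates = np.array([
        statistic(rng.choice(data, size=n, replace=True))
        for _ in range(n_boot)
    ])
    return estimates.mean(), estimates.var()

# Example: bootstrap the median of a small (made-up) sample.
data = np.array([2.3, 1.9, 3.1, 2.8, 2.2, 3.5, 1.7])
est, var = bootstrap_estimate(data, np.median)
```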
## Bagging - Aggregate Bootstrapping

- For i = 1 .. M:
  - Draw n* < n samples from D with replacement
  - Learn classifier Ci
- Final classifier is a vote of C1 .. CM
- Increases classifier stability / reduces variance

*(Figure: bootstrap samples D1, D2, D3 drawn from D.)*
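A minimal bagging sketch in the same spirit, assuming scikit-learn decision trees as the base classifier; M, the sub-sample size n*, and all names are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, M=25, n_star=None, rng=None):
    """Train M classifiers C1 .. CM, each on a bootstrap sample of size n* < n."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    n_star = n_star or int(0.8 * n)         # illustrative choice of n* < n
    ensemble = []
    for _ in range(M):
        idx = rng.choice(n, size=n_star, replace=True)   # with replacement
        ensemble.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return ensemble

def bagging_predict(ensemble, X):
    """Final classifier is a majority vote of C1 .. CM."""
    votes = np.stack([clf.predict(X) for clf in ensemble])
    # Majority vote per test sample; assumes non-negative integer labels.
    return np.apply_along_axis(
        lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
```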
## Boosting (Schapire 1989)

Consider creating three component classifiers for a two-category problem through boosting (a code sketch follows this list):

- Randomly select n1 < n samples from D without replacement to obtain D1; train weak learner C1
- Select n2 < n samples from D, such that half of them are misclassified by C1, to obtain D2; train weak learner C2
- Select all remaining samples from D on which C1 and C2 disagree; train weak learner C3
- Final classifier is a vote of the three weak learners

*(Figure: training sets D1, D2, D3 drawn from D for a two-class (+/−) problem.)*
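The selection scheme above as a rough Python sketch; depth-1 scikit-learn trees stand in for the weak learners, and the set sizes and names are illustrative. It assumes C1 makes some mistakes, so that D2 can mix correct and incorrect samples.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_three(X, y, n1, n2, rng=None):
    """Schapire-style boosting with three component classifiers."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)

    # D1: n1 samples drawn without replacement; train C1.
    i1 = rng.choice(n, size=n1, replace=False)
    c1 = DecisionTreeClassifier(max_depth=1).fit(X[i1], y[i1])

    # D2: half misclassified by C1, half classified correctly; train C2.
    p1 = c1.predict(X)
    wrong = np.flatnonzero(p1 != y)
    right = np.flatnonzero(p1 == y)
    rng.shuffle(wrong)
    rng.shuffle(right)
    i2 = np.concatenate([wrong[:n2 // 2], right[:n2 // 2]])
    c2 = DecisionTreeClassifier(max_depth=1).fit(X[i2], y[i2])

    # D3: all remaining samples on which C1 and C2 disagree; train C3.
    i3 = np.flatnonzero(p1 != c2.predict(X))
    c3 = DecisionTreeClassifier(max_depth=1).fit(X[i3], y[i3]) if len(i3) else c1
    return c1, c2, c3

def vote(classifiers, X):
    """Final classifier: majority vote of C1, C2, C3 (labels in {-1, +1})."""
    return np.sign(sum(c.predict(X) for c in classifiers))
```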
## AdaBoost

- Instead of resampling, AdaBoost uses training-set re-weighting
  - Each training sample carries a weight that determines its probability of being selected for a training set
- AdaBoost is an algorithm for constructing a “strong” classifier as a linear combination of “simple” “weak” classifiers
- The final classification is based on a weighted vote of the weak classifiers
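In symbols, with notation defined on the next slide:

$$
f(x) = \sum_{t=1}^{T} \alpha_t h_t(x), \qquad H(x) = \operatorname{sign}\big(f(x)\big),
$$

where the coefficients $\alpha_t$ are the vote weights.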
## Terminology

- $h_t(x)$ … “weak” or basis classifier (classifier = learner = hypothesis)
- $H(x)$ … “strong” or final classifier
- Weak classifier: less than 50% error over any distribution
- Strong classifier: thresholded linear combination of the weak classifier outputs
Each training sample has a weight, which determines the probability of being selected for training the component classifier.
## Find the Weak Classifier
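In the standard AdaBoost formulation, the weak classifier picked at round $t$ is the one with the smallest error under the current weight distribution $D_t$:

$$
h_t = \arg\min_{h}\ \epsilon_t(h), \qquad
\epsilon_t(h) = \sum_{i=1}^{m} D_t(i)\,\mathbb{1}\{h(x_i) \neq y_i\}.
$$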
## The algorithm core
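For reference, the core loop of discrete AdaBoost (Freund & Schapire), which the recapitulation below steps through round by round. Given $(x_1, y_1), \dots, (x_m, y_m)$ with $y_i \in \{-1, +1\}$, initialize $D_1(i) = 1/m$. For $t = 1, \dots, T$:

1. Train a weak classifier $h_t$ using distribution $D_t$.
2. Compute its weighted error $\epsilon_t = \sum_i D_t(i)\,\mathbb{1}\{h_t(x_i) \neq y_i\}$.
3. Set $\alpha_t = \frac{1}{2}\ln\frac{1 - \epsilon_t}{\epsilon_t}$.
4. Update $D_{t+1}(i) = D_t(i)\,e^{-\alpha_t y_i h_t(x_i)} / Z_t$, where $Z_t$ normalizes $D_{t+1}$ to a distribution.

Output the final classifier $H(x) = \operatorname{sign}\big(\sum_{t=1}^{T} \alpha_t h_t(x)\big)$.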
## Reweighting

- If $y_i\,h_t(x_i) = +1$ (the sample is classified correctly), its weight decreases.
- If $y_i\,h_t(x_i) = -1$ (the sample is misclassified), its weight increases.

Since $\epsilon_t < 1/2$ implies $\alpha_t > 0$, the update factor $e^{-\alpha_t y_i h_t(x_i)}$ is less than 1 for correctly classified samples and greater than 1 for misclassified ones. In this way, AdaBoost “focuses on” the informative or “difficult” examples.
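A quick worked example of the update: if $\epsilon_t = 0.3$, then $\alpha_t = \frac{1}{2}\ln(0.7/0.3) \approx 0.42$, so correctly classified samples have their weights multiplied by $e^{-0.42} \approx 0.65$ and misclassified ones by $e^{0.42} \approx 1.53$, before renormalization by $Z_t$.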
## Algorithm recapitulation

Starting from $t = 1$, each round fits a weak classifier under the current weights $D_t$, scores it by its weighted error, and reweights the training set; a complete code sketch follows.
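Putting the loop together, a compact NumPy sketch of discrete AdaBoost; the decision-stump weak learner, the guard constants, and all names are illustrative rather than taken from the slides.

```python
import numpy as np

def best_stump(X, y, w):
    """Exhaustively pick the threshold stump with the lowest weighted error."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):                # feature
        for thr in np.unique(X[:, j]):         # threshold
            for s in (+1, -1):                 # polarity
                pred = np.where(X[:, j] <= thr, s, -s)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, ((j, thr, s), pred)
    return best

def adaboost_fit(X, y, T=50):
    """Discrete AdaBoost; y must contain labels in {-1, +1}."""
    n = len(X)
    w = np.full(n, 1.0 / n)                    # D_1(i) = 1/n
    ensemble = []
    for _ in range(T):
        stump, pred = best_stump(X, y, w)
        eps = w[pred != y].sum()               # weighted error
        if eps >= 0.5:                         # no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))
        w *= np.exp(-alpha * y * pred)         # reweighting step
        w /= w.sum()                           # normalize (the Z_t factor)
        ensemble.append((stump, alpha))
    return ensemble

def adaboost_predict(ensemble, X):
    """H(x) = sign(sum_t alpha_t h_t(x))."""
    score = np.zeros(len(X))
    for (j, thr, s), alpha in ensemble:
        score += alpha * np.where(X[:, j] <= thr, s, -s)
    return np.sign(score)
```

The stump search is brute force for clarity; any learner achieving less than 50% weighted error can stand in for it.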
## Pros and Cons

- Very simple to implement
- Does feature selection, resulting in a relatively simple classifier
- Fairly good generalization
- Suboptimal solution
- Sensitive to noisy data and outliers
## References

- Duda, Hart, Stork – Pattern Classification
- Freund – “An adaptive version of the boost by majority algorithm”
- Freund, Schapire – “Experiments with a new boosting algorithm”
- Freund, Schapire – “A decision-theoretic generalization of on-line learning and an application to boosting”
- Friedman, Hastie, Tibshirani – “Additive Logistic Regression: A Statistical View of Boosting”
- Jin, Liu, et al. (CMU) – “A New Boosting Algorithm Using Input-Dependent Regularizer”
- Li, Zhang, et al. – “FloatBoost Learning for Classification”
- Opitz, Maclin – “Popular Ensemble Methods: An Empirical Study”
- Rätsch, Warmuth – “Efficient Margin Maximization with Boosting”
- Schapire, Freund, et al. – “Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods”
- Schapire, Singer – “Improved Boosting Algorithms Using Confidence-rated Predictions”
- Schapire – “The Boosting Approach to Machine Learning: An Overview”
- Zhang, Li, et al. – “Multi-view Face Detection with FloatBoost”
## Appendix

- Bound on training error
## Bound on Training Error (Schapire)
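With $\gamma_t = \tfrac{1}{2} - \epsilon_t$ denoting the edge of the $t$-th weak classifier over chance, the standard statement of the bound is

$$
\frac{1}{m}\sum_{i=1}^{m} \mathbb{1}\{H(x_i) \neq y_i\}
\;\le\; \prod_{t=1}^{T} Z_t
\;=\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t(1-\epsilon_t)}
\;=\; \prod_{t=1}^{T} \sqrt{1 - 4\gamma_t^2}
\;\le\; \exp\Big(-2\sum_{t=1}^{T}\gamma_t^2\Big),
$$

so the training error drops exponentially fast as long as every weak classifier beats random guessing by some margin $\gamma_t \ge \gamma > 0$.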
## Bound on Training Error (Friedman’s wording)
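In the exponential-loss view of the referenced “Additive Logistic Regression” paper, the same bound follows from $\mathbb{1}\{H(x) \neq y\} \le e^{-y f(x)}$: the training error is at most the average exponential loss $\frac{1}{m}\sum_i e^{-y_i f(x_i)} = \prod_t Z_t$, and AdaBoost can be read as stagewise minimization of the criterion

$$
J(F) = E\big[e^{-y F(x)}\big],
$$

whose population minimizer is half the log-odds, $F^*(x) = \tfrac{1}{2}\log\frac{P(y=1 \mid x)}{P(y=-1 \mid x)}$.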
## Bound on Training Error (Freund and Schapire’s wording)
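In the wording of the decision-theoretic paper cited in the references, the error of the final hypothesis is at most

$$
2^{T} \prod_{t=1}^{T} \sqrt{\epsilon_t\,(1 - \epsilon_t)},
$$

which is the same quantity as $\prod_t 2\sqrt{\epsilon_t(1-\epsilon_t)}$ above, merely factored differently.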
## Weighted Predictions (Real AdaBoost)
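In Real AdaBoost (the confidence-rated scheme of Schapire & Singer, cited in the references), each weak hypothesis outputs a real value whose magnitude encodes confidence; for a domain-partitioning weak learner, the optimal prediction on a cell is the half log-ratio of the weighted class probabilities,

$$
h_t(x) = \frac{1}{2} \log \frac{P_{D_t}(y = +1 \mid x)}{P_{D_t}(y = -1 \mid x)}.
$$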
## Friedman’s Variants

- LogitBoost
  - Solves the additive logistic regression problem by Newton steps (minimizing $E[\log(1 + e^{-2 y F(x)})]$)
  - Requires care to avoid numerical problems
- GentleBoost
  - Update is $f_m(x) = P_w(y{=}1 \mid x) - P_w(y{=}0 \mid x)$ instead of the half log-ratio $\tfrac{1}{2}\log\frac{P_w(y=1 \mid x)}{P_w(y=0 \mid x)}$
  - Bounded: each probability lies in $[0, 1]$, so $f_m(x) \in [-1, 1]$
## LogitBoost (Friedman)
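A sketch of the two-class LogitBoost step from the same paper (with $y^* \in \{0, 1\}$ and $p(x) = P(y^* = 1 \mid x)$, updated as the algorithm proceeds): compute the working response and weights

$$
z_i = \frac{y_i^* - p(x_i)}{p(x_i)\,\big(1 - p(x_i)\big)}, \qquad
w_i = p(x_i)\,\big(1 - p(x_i)\big),
$$

fit $f_m$ by weighted least-squares regression of $z_i$ on $x_i$, then update $F(x) \leftarrow F(x) + \tfrac{1}{2} f_m(x)$ and $p(x) \leftarrow e^{F(x)} / \big(e^{F(x)} + e^{-F(x)}\big)$. The numerical care mentioned above is needed because $z_i$ blows up as $p(x_i)$ approaches 0 or 1, so implementations clamp $z_i$ and $w_i$.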
## GentleBoost (Friedman)
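The corresponding Gentle AdaBoost step, for comparison: with weights $w_i = e^{-y_i F(x_i)}$ and $y_i \in \{-1, +1\}$, fit $f_m$ by weighted least-squares regression of $y_i$ on $x_i$ (so $f_m(x)$ estimates the bounded quantity $P_w(y{=}1 \mid x) - P_w(y{=}{-}1 \mid x)$), then update $F(x) \leftarrow F(x) + f_m(x)$. Each step adds a bounded probability difference rather than an unbounded log-ratio, which is what makes the update numerically gentler.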
Thanks!!!