Feature Selection for Image Retrieval
By Karina Zapién Arreola

January 21st, 2005



                    Introduction
 Variable and feature selection have become the focus of much
  research in areas of application for which datasets with many
  variables are available:
  Text processing
  Gene expression
  Combinatorial chemistry


                    Motivation
 The objective of feature selection is three-fold:
       Improving the prediction performance of the predictors
       Providing faster and more cost-effective predictors
       Providing a better understanding of the underlying process
       that generated the data


  Why use feature selection in CBIR?
    Different users may need different features for image retrieval
    From each selected sample, a specific feature set can be chosen




                    Boosting
    Method for improving the accuracy of any
    learning algorithm
    Use of “weak algorithms” for single rules
    Weighting of the weak algorithms
    Combination of weak rules into a strong
    learning algorithm



              AdaBoost Algorithm
  AdaBoost is an iterative boosting algorithm
 Notation
   Samples (x1, y1), …, (xn, yn), where yi ∈ {−1, +1}
   There are m positive samples and l negative samples (n = m + l)
   Weak classifiers hj
   For iteration t, the error of the chosen weak classifier is defined as
        εt = minj (1/2) Σi ωi |hj(xi) − yi|
   where ωi is the weight of sample xi.

              AdaBoost Algorithm
    Given samples (x1, y1), …, (xn, yn), where yi ∈ {−1, +1}
    Initialize ω1,i = 1/(2m) if yi = +1, and ω1,i = 1/(2l) if yi = −1
    For t = 1, …, T:
       Normalize ωt,i = ωt,i / (Σj ωt,j)
       Train each base learner hj using the distribution ωt
       Choose the ht that minimizes εt, with per-sample errors ei
       (ei = 0 if xi is classified correctly, 1 otherwise)
       Set βt = εt / (1 − εt) and αt = log(1/βt)
       Update ωt+1,i = ωt,i · βt^(1−ei)
    Output the final classifier H(x) = sign( Σt αt ht(x) )
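
A minimal sketch of this loop in Python (an illustration added to these notes, not the original implementation), assuming NumPy and a pool of candidate weak classifiers exposing hypothetical train(X, y, w) and predict(x) methods; the mean-based weak classifier used in this work is sketched further below.

import copy
import numpy as np

def adaboost(X, y, weak_candidates, T=30):
    # X: (n_samples, n_features) feature matrix, y: labels in {-1, +1}
    m = int(np.sum(y == 1))                       # number of positive samples
    l = int(np.sum(y == -1))                      # number of negative samples
    w = np.where(y == 1, 1.0 / (2 * m), 1.0 / (2 * l))   # initial weights
    chosen, alphas = [], []
    for t in range(T):
        w = w / w.sum()                           # normalize the weights
        best_eps, best_h, best_e = None, None, None
        for h in weak_candidates:                 # train every candidate on the
            h.train(X, y, w)                      # current weight distribution
            pred = np.array([h.predict(xi) for xi in X])
            e = (pred != y).astype(float)         # e_i = 1 if misclassified, else 0
            eps = float(np.sum(w * e))            # weighted error of this candidate
            if best_eps is None or eps < best_eps:
                best_eps, best_h, best_e = eps, copy.deepcopy(h), e
        best_eps = min(max(best_eps, 1e-10), 1 - 1e-10)   # guard against division by zero
        beta = best_eps / (1.0 - best_eps)        # beta_t = eps_t / (1 - eps_t)
        alphas.append(np.log(1.0 / beta))         # alpha_t = log(1 / beta_t)
        w = w * beta ** (1.0 - best_e)            # down-weight correctly classified samples
        chosen.append(best_h)
    def H(x):                                     # strong classifier: weighted vote
        return int(np.sign(sum(a * h.predict(x) for a, h in zip(alphas, chosen))))
    return H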

             AdaBoost Application
    Searching for similar groups
       A particular image class is chosen
       A positive sample is drawn randomly from this class
       A negative sample is drawn randomly from the rest of the images
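
A minimal sketch of this sampling step (the names are illustrative only; the 15/100 defaults are the "stable" parameters reported later in the talk):

import random

def sample_training_set(images, labels, target_class, n_pos=15, n_neg=100):
    # Positives come from the chosen class, negatives from all other images.
    pos_pool = [img for img, lab in zip(images, labels) if lab == target_class]
    neg_pool = [img for img, lab in zip(images, labels) if lab != target_class]
    positives = random.sample(pos_pool, n_pos)
    negatives = random.sample(neg_pool, n_neg)
    return positives + negatives, [1] * n_pos + [-1] * n_neg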




           Checklist for Feature Selection
    Domain knowledge
    Commensurate features
    Interdependence of features
    Pruning of input variables
    Assess features individually
    Dirty data
    Predictor – linear predictor
    Comparison
    Stable solution

                    Domain knowledge
 Features used:
    colordb_sumRGB_entropy_d1, col_gpd_hsv, col_gpd_lab, col_gpd_rgb,
    col_hu_hsv2, col_hu_lab2, col_hu_lab, col_hu_rgb2, col_hu_rgb,
    col_hu_seg2_hsv, col_hu_seg2_lab, col_hu_seg2_rgb, col_hu_seg_hsv,
    col_hu_seg_lab, col_hu_seg_rgb, col_hu_yiq, col_ngcm_rgb,
    col_sm_hsv, col_sm_lab, col_sm_rgb, col_sm_yiq,
    text_gabor, text_tamura, edgeDB, waveletDB,
    hist_phc_hsv, hist_phc_rgb, Hist_Grad_RGB,
    haar_RGB, haar_HSV, haar_rgb, haar_hmmd


           Checklist for Feature Selection
    Domain knowledge
    Commensurate features

    Normalize features to an appropriate range
    AdaBoost treats each feature independently, so it is not
    necessary to normalize them


           Checklist for Feature Selection
    Domain knowledge
    Commensurate features
    Interdependence of features
    Pruning of input variables
    Assess features individually
    Dirty data
    Predictor – linear predictor
    Comparison
    Stable solution

     Feature construction and space
        dimensionality reduction
    Clustering
    Correlation coefficient
    Supervised feature selection
    Filters




           Checklist for Feature Selection
    Domain knowledge
    Commensurate features
    Interdependence of features
    Pruning of input variables

 Features with the same value for all samples (variance = 0)
 were eliminated: of the 4912 linear features, 3583 were selected.
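
A minimal sketch of this pruning step, assuming the features are arranged as a samples-by-features NumPy matrix (names are illustrative):

import numpy as np

def drop_constant_features(X):
    # Keep only columns whose value varies across the samples (variance > 0).
    keep = X.var(axis=0) > 0
    return X[:, keep], keep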
           Checklist for Feature Selection
    Domain knowledge
    Commensurate features
    Interdependence of features
    Pruning of input variables
    Assess features individually

 When there is no assessment method available, use a variable
 ranking method. In AdaBoost this is not necessary.
                    Variable Ranking
       Preprocessing step
       Independent of the choice of the predictor


    Correlation criteria
       It can only detect linear dependencies (see the sketch below)
    Single variable classifiers
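
A minimal sketch of correlation-based ranking (an illustration added here, not part of the original slides): each feature is scored by the absolute Pearson correlation between its values and the ±1 labels, which, as noted above, only detects linear dependencies.

import numpy as np

def rank_by_correlation(X, y):
    # Score = |Pearson correlation| between each feature column of X and the labels y.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    scores = np.abs(Xc.T @ yc) / np.maximum(denom, 1e-12)
    return np.argsort(scores)[::-1]       # feature indices, highest score first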



                    Variable Ranking
    Noise reduction and better classification may be obtained by
    adding variables that are presumably redundant
    Perfectly correlated variables are truly redundant in the sense
    that no additional information is gained by adding them. This
    does not mean absence of variable complementarity
    Two variables that are useless by themselves
    can be useful together
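
A classic illustration of that last point (added here for clarity; not from the original slides): in an XOR-like problem each variable alone is uncorrelated with the label, yet the two together determine it exactly.

import numpy as np

x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])
y = np.where(x1 != x2, 1, -1)                     # label: +1 when the variables disagree

print(np.corrcoef(x1, y)[0, 1])                   # 0.0 - x1 alone is useless
print(np.corrcoef(x2, y)[0, 1])                   # 0.0 - x2 alone is useless
print(np.all(np.where(x1 != x2, 1, -1) == y))     # True - together they classify perfectly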

           Checklist for Feature Selection
    Domain knowledge
    Commensurate features
    Interdependence of features
    Pruning of input variables
    Assess features individually
    Dirty data
    Predictor – linear predictor
    Comparison
    Stable solution

              AdaBoost Algorithm
    Given samples (x1, y1), …, (xn, yn), where yi ∈ {−1, +1}
    Initialize ω1,i = 1/(2m) if yi = +1, and ω1,i = 1/(2l) if yi = −1
    For t = 1, …, T:
       Normalize ωt,i = ωt,i / (Σj ωt,j)
       Train each base learner hj using the distribution ωt
       Choose the ht that minimizes εt, with per-sample errors ei
       (ei = 0 if xi is classified correctly, 1 otherwise)
       Set βt = εt / (1 − εt) and αt = log(1/βt)
       Update ωt+1,i = ωt,i · βt^(1−ei)
    Output the final classifier H(x) = sign( Σt αt ht(x) )

                    Weak classifier
    Each weak classifier hi is defined as follows:
       hi.pos_mean – mean value for positive samples
       hi.neg_mean – mean value for negative samples
    A sample is classified as:
       +1 if it is closer to hi.pos_mean
       −1 if it is closer to hi.neg_mean


                    Weak classifier
       hi.pos_mean – mean value for positive samples
       hi.neg_mean – mean value for negative samples

       [Figure: sample feature values with hi.pos_mean and hi.neg_mean marked]

       A linear classifier was used
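
A minimal sketch of this weak classifier in Python, matching the train/predict interface assumed in the AdaBoost sketch earlier. Each classifier looks at a single feature; weighting the class means by the boosting weights is an assumption made here, since the slides do not say whether the means are weighted.

import numpy as np

class MeanWeakClassifier:
    # Single-feature classifier: +1 if the value is closer to the positive-class
    # mean, -1 if it is closer to the negative-class mean.
    def __init__(self, feature_index):
        self.feature_index = feature_index
    def train(self, X, y, w):
        f = X[:, self.feature_index]
        pos, neg = (y == 1), (y == -1)
        self.pos_mean = np.average(f[pos], weights=w[pos])
        self.neg_mean = np.average(f[neg], weights=w[neg])
    def predict(self, x):
        v = x[self.feature_index]
        return 1 if abs(v - self.pos_mean) < abs(v - self.neg_mean) else -1

In this setting, the feature indices of the classifiers AdaBoost keeps can be read off as the selected feature set.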
           Checklist for Feature Selection
    Domain knowledge
    Commensurate features
    Interdependence of features
    Pruning of input variables
    Assess features individually
    Dirty data
    Predictor – linear predictor
    Comparison
    Stable solution

  AdaBoost experiments and results

    [Figure: retrieval results using 10 positive samples vs. 4 positive samples]



             Few positive samples
    Use of 4 positive samples




            More positive samples
    Use of 10 positive samples

    [Figure: retrieval results with 10 positive samples; a false positive is marked]




                    Training data
    Use of 10 positive samples

    [Figure: results on the training data and the test data; a false negative is marked]
           Changing number of training iterations

    The number of iterations used ranged from 5 to 50;
    Iterations = 30 was set.


           Changing sample size

    [Figure: results when the number of positive samples is varied over 5, 10, 15, 20, 25, 30, and 35]



            Few negative samples
    Use of 15 negative samples




           More negative samples
    Use of 75 negative samples




           Checklist for Feature Selection
    Domain knowledge
    Commensurate features
    Interdependence of features
    Pruning of input variables
    Assess features individually
    Dirty data
    Predictor – linear predictor
    Comparison (ideas, time, computational resources, examples)
    Stable solution

                    Stable solution
    For AdaBoost it is important to have a representative sample
    Chosen parameters:
       Positive samples: 15
       Negative samples: 100
       Number of iterations: 30
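
A hypothetical end-to-end run with these parameters, reusing the sketches given earlier in these notes (the image database is replaced by synthetic random features, so this only illustrates the wiring, not real retrieval results):

import numpy as np

rng = np.random.default_rng(0)
labels = np.array(["Roses"] * 50 + ["Other"] * 500)        # synthetic class labels
features = rng.random((550, 200))                          # synthetic feature matrix

# 15 positives from the chosen class, 100 negatives from the rest
pos_idx = rng.choice(np.where(labels == "Roses")[0], size=15, replace=False)
neg_idx = rng.choice(np.where(labels != "Roses")[0], size=100, replace=False)
X = features[np.concatenate([pos_idx, neg_idx])]
y = np.concatenate([np.ones(15), -np.ones(100)])

X, keep = drop_constant_features(X)                        # prune constant features
candidates = [MeanWeakClassifier(i) for i in range(X.shape[1])]
H = adaboost(X, y, candidates, T=30)                       # 30 iterations, as chosen above
print(H(X[0]))                                             # classify one sample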




  Stable solution with more samples and iterations

    [Figure: results per image class: Dinosaurs, Roses, Buses, Horses, Buildings, Elephants, Humans, Food, Mountains, Beaches]



       Stable solution for Dinosaurs
Use of:
• 15 Positive samples
• 100 Negative samples
• 30 Iterations




           Stable solution for Roses
Use of:
• 15 Positive samples
• 100 Negative samples
• 30 Iterations




           Stable solution for Buses
Use of:
• 15 Positive samples
• 100 Negative samples
• 30 Iterations




        Stable solution for Beaches
Use of:
• 15 Positive samples
• 100 Negative samples
• 30 Iterations




          Stable solution for Food
Use of:
• 15 Positive samples
• 100 Negative samples
• 30 Iterations




                    Unstable Solution




      Unstable solution for Roses
Use of:
• 5 Positive samples
• 10 Negative samples
• 30 Iterations




   Best features for classification
    Humans
    Beaches
    Buildings
    Buses
    Dinosaurs
    Elephants
    Roses
    Horses
    Mountains
    Food

   And the winner is…


                    Feature frequency

    [Figure: bar chart showing how often each feature was selected ("Appearance times", from 0 to 0.2). The features on the axis include haar (RGB, HSV, hmmd), hist_Grad and hist_phc histograms, col_gpd, col_sm, col_hu and col_ngcm color features, text_gabor, text_tamura, edgeDB, and waveletDB.]
                      Extensions
    Searching for similar images (see the sketch below)
       Pairs of images are built
       The difference for each feature is calculated
       Each difference is classified as:
            +1 if both images belong to the same class
            −1 if the images belong to different classes
    Multiclass AdaBoost
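
A minimal sketch of building such a pairwise training set (illustrative names; the absolute difference is an assumption, since the slide only says "difference"):

import itertools
import numpy as np

def build_pair_dataset(features, labels):
    # features: (n_images, n_features) matrix, labels: class label per image.
    diffs, pair_labels = [], []
    for i, j in itertools.combinations(range(len(labels)), 2):
        diffs.append(np.abs(features[i] - features[j]))          # per-feature difference
        pair_labels.append(1 if labels[i] == labels[j] else -1)  # same class -> +1
    return np.array(diffs), np.array(pair_labels)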


                     Extensions
    Use of another weak classifier
       Design a weak classifier using multiple features → classifier fusion
       Use a different weak classifier such as an SVM, NN, threshold function, etc.
    Different feature selection method: SVM



                     Discussion
    It is important to add feature selection to image retrieval
    A good methodology for selecting features should be used
    AdaBoost is a learning algorithm → it is data dependent
    It is important to have representative samples
    AdaBoost can help to improve the classification potential of
    simple algorithms
            Thank you !

