# Feature selection

Feature Selection for Image Retrieval
Karina Zapién Arreola

January 21, 2005

Feature Selection                          1
Introduction
Variable and feature selection have become the focus of much research in application areas where datasets with many variables are available:
• Text processing
• Gene expression
• Combinatorial chemistry

Feature Selection                             2
Motivation
The objective of feature selection is threefold:
• Improving the prediction performance of the predictors
• Providing faster and more cost-effective predictors
• Providing a better understanding of the underlying process that generated the data

Feature Selection                                    3
Why use feature selection in CBIR?
• Different users may need different features for image retrieval
• From each selected sample, a specific feature set can be chosen

Feature Selection                             4
Boosting
• A method for improving the accuracy of any learning algorithm
• Uses “weak” algorithms to form single rules
• Weights the weak algorithms
• Combines the weak rules into a strong learning algorithm

Feature Selection                               5
AdaBoost
An iterative boosting algorithm.
Notation
• Samples (x1,y1), …, (xn,yn), where yi ∈ {−1, 1}
• There are m positive samples and l negative samples
• Weak classifiers hj
• For iteration t, the error is defined as
  εt = minj ½ Σi ωi |hj(xi) − yi|
  where ωi is the weight of sample xi

Feature Selection                                6
Given samples (x1,y1), …, (xn,yn), where yi ∈ {−1, 1}
Initialize ω1,i = 1/(2m) if yi = 1, and ω1,i = 1/(2l) if yi = −1
For t = 1, …, T:
• Normalize ωt,i = ωt,i / (Σj ωt,j)
• Train one base learner hj per feature using the distribution ωt
• Choose the ht that minimizes εt; let ei = 0 if ht classifies xi correctly, and ei = 1 otherwise
• Set βt = εt / (1 − εt) and αt = log(1/βt)
• Update ωt+1,i = ωt,i · βt^(1−ei)
Output the final classifier H(x) = sign( Σt αt ht(x) )

Feature Selection                                            7
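
To make the loop above concrete, here is a minimal Python sketch (an illustration, not the author's code). It assumes NumPy, labels in {−1, +1}, a `train_weak(X, y, w)` helper (defined later, alongside the weak classifier) that returns the weak classifier with minimal weighted error, and 0 < εt < 1 so that βt is well defined.

```python
import numpy as np

def adaboost(X, y, train_weak, T=30):
    """Boosting loop from the slide. X: (n_samples, n_features) array,
    y: labels in {-1, +1}. train_weak(X, y, w) is assumed to return an
    object whose predict(X) gives labels in {-1, +1}."""
    m = np.sum(y == 1)             # number of positive samples
    l = np.sum(y == -1)            # number of negative samples
    w = np.where(y == 1, 1.0 / (2 * m), 1.0 / (2 * l))  # initial weights
    classifiers, alphas = [], []
    for t in range(T):
        w = w / w.sum()                        # normalize the weights
        h = train_weak(X, y, w)                # weak learner with minimal error
        e = (h.predict(X) != y).astype(float)  # e_i = 0 if correct, 1 if wrong
        eps = np.sum(w * e)                    # weighted error epsilon_t
        beta = eps / (1.0 - eps)
        w = w * beta ** (1.0 - e)              # down-weight correctly classified samples
        classifiers.append(h)
        alphas.append(np.log(1.0 / beta))
    def H(X_query):                            # strong classifier: sign of weighted vote
        return np.sign(sum(a * h.predict(X_query)
                           for a, h in zip(alphas, classifiers)))
    return H
```
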
Searching for similar groups
• A particular image class is chosen
• A positive sample is drawn randomly from this class
• A negative sample is drawn randomly from the rest of the images

Feature Selection                                   8
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables
• Assess features individually
• Dirty data
• Predictor – linear predictor
• Comparison
• Stable solution

Feature Selection                  9
Domain knowledge
Features used:
colordb_sumRGB_entropy_d1, col_gpd_hsv, col_gpd_rgb, col_hu_hsv2, col_hu_lab, col_hu_lab2, col_hu_rgb, col_hu_rgb2, col_hu_seg2_hsv, col_hu_seg2_lab, col_hu_seg2_rgb, col_hu_seg_hsv, col_hu_seg_lab, col_hu_yiq, col_ngcm_rgb, col_sm_hsv, col_sm_lab, col_sm_rgb, col_sm_yiq, hist_phc_hsv, hist_phc_rgb, haar_RGB, haar_HSV, haar_rgb, haar_hmmd, text_gabor, text_tamura, edgeDB, waveletDB
Feature Selection                                          10
Feature Selection Checklist
• Domain knowledge
• Commensurate features

The features already lie within an appropriate range, so it is not necessary to normalize them.

Feature Selection                          11
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables
• Assess features individually
• Dirty data
• Predictor – linear predictor
• Comparison
• Stable solution

Feature Selection                  12
Feature construction and space dimensionality reduction
• Clustering
• Correlation coefficient
• Supervised feature selection
• Filters

Feature Selection                     13
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables

Features with the same value for all samples (variance = 0) were eliminated: of 4912 linear features, 3583 were selected.
Feature Selection                                    14
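
A minimal sketch of this pruning step, assuming the features sit in a NumPy array with one row per sample (the 4912 → 3583 counts are the deck's, not recomputed here):

```python
import numpy as np

def prune_constant_features(X):
    """Drop features whose value is identical for all samples (variance = 0)."""
    keep = X.var(axis=0) > 0          # boolean mask of informative features
    return X[:, keep], keep           # pruned matrix, plus the mask for reuse
```

The returned mask can be reused to apply the same pruning to query images at retrieval time.
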
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables
• Assess features individually

When no assessment method is available, use a variable-ranking method. In AdaBoost this is not necessary.
Feature Selection                        15
Variable Ranking
• A preprocessing step
• Independent of the choice of predictor
Ranking criteria:
• Correlation criteria – can only detect linear dependencies
• Single-variable classifiers

Feature Selection                                   16
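
As a sketch of the correlation criterion, assuming NumPy, labels in {−1, +1}, and that constant features were already pruned (so no division by zero). Being a linear criterion, it misses exactly the nonlinear and joint dependencies the next slide discusses.

```python
import numpy as np

def rank_by_correlation(X, y):
    """Rank features by |Pearson correlation| with the label y."""
    Xc = X - X.mean(axis=0)           # center each feature
    yc = y - y.mean()                 # center the labels
    r = (Xc.T @ yc) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.argsort(-np.abs(r))     # feature indices, best-ranked first
```
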
Variable Ranking
• Noise reduction and better classification may be obtained by adding variables that are presumably redundant
• Perfectly correlated variables are truly redundant, in the sense that no additional information is gained by adding them; this does not imply an absence of variable complementarity
• Two variables that are useless by themselves can be useful together

Feature Selection                                   17
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables
• Assess features individually
• Dirty data
• Predictor – linear predictor
• Comparison
• Stable solution

Feature Selection                  18
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables
• Assess features individually
• Dirty data
• Predictor – linear predictor
• Comparison
• Stable solution

Feature Selection                  19
Given samples (x1,y1), …, (xn,yn), where yi ∈ {−1, 1}
Initialize ω1,i = 1/(2m) if yi = 1, and ω1,i = 1/(2l) if yi = −1
For t = 1, …, T:
• Normalize ωt,i = ωt,i / (Σj ωt,j)
• Train one base learner hj per feature using the distribution ωt
• Choose the ht that minimizes εt; let ei = 0 if ht classifies xi correctly, and ei = 1 otherwise
• Set βt = εt / (1 − εt) and αt = log(1/βt)
• Update ωt+1,i = ωt,i · βt^(1−ei)
Output the final classifier H(x) = sign( Σt αt ht(x) )

Feature Selection                                            20
Weak classifier
Each weak classifier hi is defined by two values:
• hi.pos_mean – mean feature value over the positive samples
• hi.neg_mean – mean feature value over the negative samples
A sample is classified as:
• +1 if it is closer to hi.pos_mean
• −1 if it is closer to hi.neg_mean

Feature Selection                                      21
Weak classifier
• hi.pos_mean – mean feature value over the positive samples
• hi.neg_mean – mean feature value over the negative samples

[Diagram: samples on one feature axis, with hi.pos_mean and hi.neg_mean marked]

A linear classifier was used.
Feature Selection                                      22
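
A sketch of this nearest-mean weak classifier and the per-iteration selection step, again illustrative: whether the original used weighted or plain means is not stated, so weighted means (matching the boosting distribution) are an assumption. `train_weak` below is the helper assumed in the earlier boosting sketch.

```python
import numpy as np

class MeanStump:
    """Weak classifier from the slide: one feature; classify a sample +1
    if its value is closer to pos_mean than to neg_mean, else -1."""
    def __init__(self, feature):
        self.feature = feature
    def fit(self, X, y, w):
        f = X[:, self.feature]
        self.pos_mean = np.average(f[y == 1], weights=w[y == 1])
        self.neg_mean = np.average(f[y == -1], weights=w[y == -1])
        return self
    def predict(self, X):
        f = X[:, self.feature]
        closer_pos = np.abs(f - self.pos_mean) < np.abs(f - self.neg_mean)
        return np.where(closer_pos, 1, -1)

def train_weak(X, y, w):
    """Fit one stump per feature and keep the one with minimal weighted error.
    The chosen feature is what boosting effectively selects each iteration."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        h = MeanStump(j).fit(X, y, w)
        err = np.sum(w * (h.predict(X) != y))
        if err < best_err:
            best, best_err = h, err
    return best
```
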
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables
• Assess features individually
• Dirty data
• Predictor – linear predictor
• Comparison
• Stable solution

Feature Selection                  23

[Retrieval results: 10 positives vs. 4 positives]

Feature Selection                     24
Few positive samples
[Retrieval results using 4 positive samples]

Feature Selection                   25
More positive samples
[Retrieval results using 10 positive samples; a false positive is marked]

Feature Selection                       26
[Retrieval results using 10 positive samples, on training data vs. test data; a false negative is marked]
Feature Selection                             27
Changing the Number of Training Iterations

The number of iterations was varied from 5 to 50; 30 iterations were chosen.

Feature Selection                      28
Changing Sample Size

[Plot: performance when using 5, 10, 15, 20, 25, 30, and 35 positive samples]

Feature Selection                           29
Few negative samples
[Retrieval results using 15 negative samples]

Feature Selection                 30
More negative samples
[Retrieval results using 75 negative samples]

Feature Selection                 31
Feature Selection Checklist
• Domain knowledge
• Commensurate features
• Interdependence of features
• Pruning of input variables
• Assess features individually
• Dirty data
• Predictor – linear predictor
• Comparison (ideas, time, computational resources, examples)
• Stable solution

Feature Selection                               32
Stable solution
For AdaBoost it is important to have a representative sample.
Chosen parameters:
• Positive samples: 15
• Negative samples: 100
• Iterations: 30

Feature Selection                         33
Stable solution with more samples and iterations

[Results per image class: Dinosaurs, Roses, Buses, Horses, Buildings, Elephants, Humans, Food, Mountains, Beaches]

Feature Selection                          34
Stable solution for Dinosaurs
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations

Feature Selection                      35
Stable solution for Roses
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations

Feature Selection                      36
Stable solution for Buses
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations

Feature Selection                      37
Stable solution for Beaches
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations

Feature Selection                     38
Stable solution for Food
Use of:
• 15 positive samples
• 100 negative samples
• 30 iterations

Feature Selection                    39
Unstable Solution

Feature Selection                       40
Unstable solution for Roses
Use of:
• 5 positive samples
• 10 negative samples
• 30 iterations

Feature Selection                   41
Best features for classification
Humans
Beaches
Buildings
Buses
Dinosaurs
Elephants
Roses
Horses
Mountains
Food

Feature Selection                     42
And the winner is…

Feature Selection       43
[Bar chart: each feature's frequency of appearance ("Feature's Frequency" / "Appearance times"), y-axis from 0 to 0.2. Features on the axis include haar_RGB, haar_HSV, haar_hmmd, hist_phc_hsv, hist_phc_rgb, col_gpd_hsv, col_gpd_rgb, col_gpd_lab, col_sm_yiq, col_sm_lab, col_sm_rgb, col_hu_yiq, col_hu_seg_hsv, col_hu_seg_lab, col_ngcm_rgb, text_tamura, text_gabor, edgeDB, and waveletDB]

Feature Selection                                          44
Extensions
Searching for similar images
• Pairs of images are built
• The difference for each feature is calculated
• Each difference is classified as:
  +1 if both images belong to the same class
  −1 if the images belong to different classes

Feature Selection                                          45
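
A sketch of this pairwise construction, assuming NumPy; whether the deck used signed or absolute differences is not stated, so absolute differences are an assumption here. Note that the set grows quadratically with the number of images.

```python
import numpy as np
from itertools import combinations

def pairwise_difference_set(X, labels):
    """One training example per image pair: the feature-wise difference,
    labeled +1 if the images share a class and -1 otherwise."""
    D, y = [], []
    for i, j in combinations(range(len(labels)), 2):
        D.append(np.abs(X[i] - X[j]))     # absolute difference (assumption)
        y.append(1 if labels[i] == labels[j] else -1)
    return np.array(D), np.array(y)
```
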
Extensions
Use of other weak classifiers
• Design weak classifiers that use multiple features → classifier fusion
• Use different weak classifiers such as SVM, NN, threshold functions, etc.
• Use a different feature selection method: SVM

Feature Selection                                   46
Discussion
• It is important to add feature selection to image retrieval
• A good methodology for selecting features should be used → it is data dependent
• It is important to have representative samples
• AdaBoost can help improve the classification power of simple algorithms
Feature Selection                                     47
Thank you !

Feature Selection         48
