Hand Detection with a Cascade of Boosted Classifiers Using Haar-like Features

Qing Chen
Discover Lab, SITE, University of Ottawa
May 2, 2006

Outline
1. Introduction
2. Haar-like features
3. Adaboost
4. The Cascade of Classifiers
5. Preliminary Results
6. Future Work

1. Introduction

A hand-based Human Computer Interface (HCI) should meet the requirements of real-time performance, accuracy and robustness.
Haar-like features are used to meet the real-time requirement.
The cascade of AdaBoost (Adaptive Boosting) classifiers is used to achieve both accuracy and speed.
The algorithm has been used for face detection, where it achieved high detection accuracy while running approximately 15 times faster than previous approaches.
The algorithm is a generic object detection/recognition method.

2. Haar-Like Features

Each Haar-like feature consists of two or three adjoining “black” and “white” rectangles:

Figure 1: A set of basic Haar-like features.
Figure 2: A set of extended Haar-like features.

The value of a Haar-like feature is the difference between the sums of the pixel gray level values within the black and white rectangular regions:

$$ f(x) = \sum_{\text{black rectangle}} (\text{pixel gray level}) - \sum_{\text{white rectangle}} (\text{pixel gray level}) $$

Compared with raw pixel values, Haar-like features can reduce the in-class variability and increase the out-of-class variability, which makes classification easier.

2. Haar-Like Features (cont’d)

Rectangular Haar-like features can be computed rapidly using the “integral image”. The integral image at location (x, y) contains the sum of the pixel values above and to the left of (x, y), inclusive:

$$ P(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y') $$

The sum of pixel values within “D”: let A, B, C and D be four adjacent rectangles (A top-left, B top-right, C bottom-left, D bottom-right), and let P_1, P_2, P_3, P_4 be the integral image values at the bottom-right corners of A, B, C and D respectively. Then

$$ P_1 = A, \quad P_2 = A + B, \quad P_3 = A + C, \quad P_4 = A + B + C + D $$

$$ P_1 + P_4 - P_2 - P_3 = A + (A + B + C + D) - (A + B) - (A + C) = D $$

2. Haar-Like Features (cont’d)

To detect the hand, the image is scanned by a sub-window containing a Haar-like feature.
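The integral image, the four-lookup rectangle sum and the sub-window scan can be sketched in a few lines. This is a minimal illustration assuming NumPy; the function names (`integral_image`, `rect_sum`, `two_rect_feature`, `scan_feature`) and the particular two-rectangle layout (black left half, white right half) are hypothetical choices for the example, not taken from the slides.

```python
import numpy as np

def integral_image(gray):
    """P(x, y): sum of all pixels above and to the left of (x, y), inclusive."""
    return gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in one rectangle from four lookups: P1 + P4 - P2 - P3."""
    bottom, right = top + height - 1, left + width - 1
    p4 = ii[bottom, right]
    p2 = ii[top - 1, right] if top > 0 else 0            # region above the rectangle
    p3 = ii[bottom, left - 1] if left > 0 else 0         # region left of the rectangle
    p1 = ii[top - 1, left - 1] if top > 0 and left > 0 else 0
    return int(p1 + p4 - p2 - p3)

def two_rect_feature(ii, top, left, height, width):
    """Assumed layout: black left half minus white right half of the window."""
    half = width // 2
    black = rect_sum(ii, top, left, height, half)
    white = rect_sum(ii, top, left + half, height, half)
    return black - white

def scan_feature(ii, win_h, win_w, step=1):
    """Slide the sub-window over the image and yield the feature value at each position."""
    rows, cols = ii.shape
    for top in range(0, rows - win_h + 1, step):
        for left in range(0, cols - win_w + 1, step):
            yield top, left, two_rect_feature(ii, top, left, win_h, win_w)

gray = np.arange(16).reshape(4, 4)      # toy 4x4 "image"
ii = integral_image(gray)
print(rect_sum(ii, 1, 1, 2, 2))         # 5 + 6 + 9 + 10 = 30
```

Once the integral image is built, every rectangle sum costs the same four lookups regardless of rectangle size, which is why the feature stays cheap as the sub-window is scaled.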
Based on each Haar-like feature f_j, a weak classifier h_j(x) is defined as:

$$ h_j(x) = \begin{cases} 1 & \text{if } p_j f_j(x) < p_j \theta_j \\ 0 & \text{otherwise} \end{cases} $$

where x is a sub-window, θ_j is a threshold, and p_j is a parity term indicating the direction of the inequality sign.

3. Adaboost

The computation cost of using Haar-like features, by example:
Original image size: 320×240
Sub-window size: 24×24
Frame rate: 15 frames/second
Total number of sub-windows evaluated per second with one Haar-like feature: (320-24+1) × (240-24+1) × 15 = 966,735

Once the scaling factor and the total number of Haar-like features are taken into account, the computation cost is huge.

AdaBoost (Adaptive Boosting) is an iterative learning algorithm that constructs a “strong” classifier using only a training set and a “weak” learning algorithm.
At each iteration, the learning algorithm selects the “weak” classifier with the minimum classification error.
AdaBoost is adaptive in the sense that later classifiers are tuned in favor of the sub-windows misclassified by previous classifiers.

3. Adaboost (cont’d)

The algorithm:
1. Start with a uniform distribution of “weights” over the training examples. The weights tell the learning algorithm the importance of each example.
2. Obtain a weak classifier h_j(x) from the weak learning algorithm.
3. Increase the weights on the training examples that were misclassified.
4. Repeat steps 2 and 3.

At the end, carefully make a linear combination of the weak classifiers obtained at all iterations:

$$ f_{\mathrm{final}}(x) = \lambda_{\mathrm{final},1} h_1(x) + \cdots + \lambda_{\mathrm{final},n} h_n(x) $$

where λ_final,k is the coefficient assigned to weak classifier h_k.

4. The Cascade of Classifiers

A series of classifiers is applied to every sub-window.
The first classifier eliminates a large number of negative sub-windows and passes almost all positive sub-windows (at the price of a high false positive rate) with very little processing.
Subsequent layers eliminate additional negative sub-windows (those passed by the first classifier) but require more computation.
After several stages of processing, the number of negative sub-windows has been reduced radically.
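The boosting loop and the early-rejection behaviour of the cascade can be sketched together as follows. This is an illustrative sketch only, under several assumptions: NumPy is used, feature values are precomputed into a matrix `F` (one row per sub-window, one column per Haar-like feature), labels and weak-classifier outputs use the common textbook ±1 convention, and the coefficient formula is the standard discrete AdaBoost one. All function names and the per-stage thresholds are hypothetical.

```python
import numpy as np

def stump_predict(f, theta, p):
    """Weak classifier on one feature: +1 if p * f < p * theta, else -1."""
    return np.where(p * f < p * theta, 1, -1)

def best_stump(F, y, w):
    """Exhaustively pick the (feature, threshold, parity) with minimum weighted error."""
    best = None
    for j in range(F.shape[1]):
        for theta in np.unique(F[:, j]):
            for p in (1, -1):
                err = w[stump_predict(F[:, j], theta, p) != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, theta, p)
    return best

def adaboost(F, y, rounds):
    """Return a list of (coefficient, feature index, threshold, parity) tuples."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # uniform initial weights
    model = []
    for _ in range(rounds):
        err, j, theta, p = best_stump(F, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        coef = 0.5 * np.log((1 - err) / err)
        h = stump_predict(F[:, j], theta, p)
        w *= np.exp(-coef * y * h)               # misclassified examples gain weight
        w /= w.sum()
        model.append((coef, j, theta, p))
    return model

def strong_score(model, f_row):
    """Weighted linear combination of the weak classifiers for one sub-window."""
    return sum(c * stump_predict(f_row[j], theta, p) for c, j, theta, p in model)

def cascade_accepts(stages, f_row):
    """stages: list of (model, stage_threshold). Reject at the first failing stage."""
    for model, threshold in stages:
        if strong_score(model, f_row) < threshold:
            return False                         # later stages are never evaluated
    return True
```

In a cascade the early stages would typically use few weak classifiers and a permissive threshold, so that most negative sub-windows are discarded before the more expensive stages run.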
The Cascade of Classifiers (cont’d)

Negative samples: non-object images. Negative samples are taken from arbitrary images. These images must not contain any representation of the object.
Positive samples: images that contain the object (a hand in our case). The hand in each positive sample must be marked out for classifier training.

5. Preliminary Results

Number of positive samples: 144
Number of negative samples: 3142
Sample resolution: 640×480
Initial sub-window size: 15×30
Scale factor: 1.3
Cascade obtained: 12 stages

6. Future Work

- Extended Haar-like features: will they improve the detection accuracy (still an open problem), and what is the performance tradeoff?
- Parallel cascades for multiple hand gestures.
- How to select the hand gesture configurations that can be detected more effectively with the employed Haar-like feature set?
- Improve the robustness against hand rotation.
- How much improvement can be achieved with more training samples? For comparison, the Intel face detection classifier used 5000 positive and 10000 negative samples and reached 98% accuracy.

References:

Wu Bo, et al., “A Multi-View Face Detection Based on Real Adaboost Algorithm,” Computer Research and Development, 42(9), pp. 1612-1621, 2005.
Paul Viola and Michael J. Jones, “Robust Real-time Object Detection,” Technical Report, Cambridge Research Lab, Compaq, 2001.
Cynthia Rudin, Robert E. Schapire, Ingrid Daubechies, “Analysis of Boosting Algorithms using the Smooth Margin Function: A Study of Three Algorithms,” 2004.
Rainer Lienhart, Alexander Kuranov, Vadim Pisarevsky, “Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection,” MRL Technical Report, May 2002.
Andre L. C. Barczak, Farhad Dadgostar, “Real-time Hand Tracking Using a Set of Cooperative Classifiers and Haar-Like Features,” Research Letters in the Information and Mathematical Sciences, ISSN 1175-2777, Vol. 7, pp. 29-42, 2005.
Mathias Kölsch and Matthew Turk, “Robust Hand Detection,” Proc. IEEE Intl.
Conference on Automatic Face and Gesture Recognition, May 2004.
Intel OpenCV Documents.

Acknowledgement goes to Urtho’s training data for eye detection and F. Dadgostar’s hand palm database.

Thank you. Any questions?