Document Sample

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 17, NO. 4, APRIL 2008, page 608

Bayesian Foreground and Shadow Detection in Uncertain Frame Rate Surveillance Videos

Csaba Benedek, Student Member, IEEE, and Tamás Szirányi, Senior Member, IEEE

Abstract—In this paper, we propose a new model regarding foreground and shadow detection in video sequences. The model works without detailed a priori object-shape information, and it is also appropriate for low and unstable frame rate video sources. The contribution is presented in three key issues: 1) we propose a novel adaptive shadow model, and show the improvements versus previous approaches in scenes with difficult lighting and coloring effects; 2) we give a novel description for the foreground based on spatial statistics of the neighboring pixel values, which enhances the detection of background- or shadow-colored object parts; 3) we show how microstructure analysis can be used in the proposed framework as additional feature components improving the results. Finally, a Markov random field model is used to enhance the accuracy of the separation. We validate our method on outdoor and indoor sequences, including real surveillance videos and well-known benchmark test sets.

Index Terms—Foreground, Markov random field (MRF), shadow, texture.

I. INTRODUCTION

Foreground detection is an important early vision task in visual surveillance systems. Shape, size, number, and position parameters of the foreground objects can be derived from an accurate silhouette mask and used by many applications, like people or vehicle detection, tracking, and event classification.

The presence of moving cast shadows on the background makes it difficult to estimate the shape [1] or behavior [2] of moving objects. Since under some illumination conditions 40%-50% of the nonbackground points may belong to shadows, methods without shadow filtering [3]-[5] can be less efficient in scene analysis.

In this paper, we deal with an image segmentation problem with three classes: foreground objects, background, and shadow of the foreground objects being cast on the background. We exploit information from local pixel levels, microstructural features, and neighborhood connections. We assume a stable, or stabilized [6], static camera, since this is available for several applications. Note that there are papers [3], [7], [8] focusing on the presence of dynamic background and camera ego-motion instead of the various shadow effects.

Another important issue is related to the properties of the video flow. For several video surveillance applications, high-resolution images are crucial. Due to the high bandwidth requirement, the sequences are often captured at a low [9] or unsteady frame rate depending on the transmission conditions. These problems appear, especially, if the system is connected to the video sources through narrow band radio channels or oversaturated networks. For another example, quick offline evaluation of the surveillance videos is necessary after a criminal incident. Since all the video streams corresponding to a given zone should be continuously recorded, these videos may have a frame rate lower than 1 fps to save storage resources.

For these reasons, a large variety of temporal information, like pixel state transition probabilities [10]-[12], periodicity calculus [2], [13], temporal foreground description [3], or tracking [14], [15], is often hard to derive, since it usually needs a permanently high frame rate. Thus, we focus on using frame rate independent features to ensure graceful degradation if the frame rate is low or unbalanced. On the other hand, our model also exploits temporal information for background and shadow modeling.

A technique used widely for background subtraction is the adaptive Gaussian mixtures method of [4], which can be used together with shadow filters of, e.g., [16]-[18]. These methods classify each pixel independently, and morphology is used later to create homogeneous regions in the segmented image. That way, the shape of the silhouettes may be strongly corrupted, as is shown in [12] and [19].

An alternative segmentation schema is a Bayesian approach [12]. The background, shadow, and foreground classes are considered to be stochastic processes which generate the observed pixel values according to locally specified distributions. The spatial interaction constraint of the neighboring pixels can be modelled by Markov random fields (MRFs) [20].

Some previous Bayesian methods [21], [22] detect foreground objects by building adaptive models regarding the background and shadow, and the foreground pixels are purely recognized as nonmatching points to these models. That way, background- or shadow-colored object parts cannot be recognized. Spatial object description has been used both for interactive [23] and unsupervised image segmentation [24]. However, in the latter case, only large objects with typical color or texture are detected, since the model [24] penalizes the small segmentation classes.

Manuscript received July 18, 2006; revised December 2, 2007. This work was supported in part by the EU project MUSCLE (FP6-567752). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Anil Kokaram. The authors are with the Distributed Events Analysis Research Group, Computer and Automation Research Institute, Hungarian Academy of Sciences, H-1111 Budapest, and also with the Faculty of Information Technology,
The authors in [3] have characterized the Pázmány Péter Catholic University, H-1083 Budapest, Hungary (e-mail: bcsaba@sztaki.hu; sziranyi@sztaki.hu) foreground by assuming temporal persistence of the color and Digital Object Identiﬁer 10.1109/TIP.2008.916989 smooth changes in the place of the objects. Nevertheless, in 1057-7149/$25.00 © 2008 IEEE BENEDEK AND SZIRÁNYI: BAYESIAN FOREGROUND AND SHADOW DETECTION IN UNCERTAIN FRAME RATE SURVEILLANCE VIDEOS 609 the case of low frame rate, fast motion, and overlaying objects, tistics; therefore, the model performance is reasonable also on appropriate temporal information is often not available. the pixel positions where motion is rare. Our method (partly introduced in [25]) is a Bayesian tech- Color space choice is a key issue in several corresponding nique which uses spatial color information instead of temporal methods. We have chosen the CIE space for two well statistics to describe the foreground. It assumes that foreground known properties: we can measure the perceptual distance objects consist of spatially connected parts and these parts can between colors with the Euclidean distance [32], and the color be characterized by typical color distributions. Since these dis- components are approximately uncorrelated with respect to tributions can be multimodal, the object parts should not be ho- camera noise and changes in illumination [33]. Since we derive mogenous in color or texture, while we exploit the spatial infor- the model parameters in a statistical way, there is no need mation without segmenting the foreground components. for accurate color calibration and we use the common CIE In the literature, different approaches are available regarding D65 standard. It is not critical to consider the exact physical shadow detection. 
Although there are some methods [26], [27] meaning of the color components, which is usually environment which attempt to ﬁnd and remove shadows in the single frames dependent [29]; we use only an approximate interpretation of independently, their performance may be degraded [26] in video the , , components and show the validity of the model via surveillance, where we must expect images with poor quality experiments. and low resolution, while the computational complexity is too Besides the color values, we exploit microstructure infor- high for practical use [27]. mation to enhance the accuracy of the segmentation. In some For the above reasons, we focus on video-based shadow mod- previous works [7], [8], texture was used as the only feature for eling techniques in the following. Here, the “shadow invariant” background subtraction. That choice can be justiﬁed in case of methods convert the images into an illumination invariant fea- strongly dynamic background (like a surging lake), but it gives ture space: they remove shadows instead of detecting them. This lower performance than pixel value comparison in a stable task is often performed by color space transformation. Widely environment. We ﬁnd a solution for integrating intensity and used illumination-invariant color spaces are, e.g., the normal- texture differences for frame differencing in [34]. However, ized rgb [16], [28] and spaces [29]. [30] exploits hue that is a slightly different task than foreground detection, since constancy under illumination changes to train a weak classiﬁer we should compare the image regions to background/shadow as a key step of a more sophisticated shadow detector. We ﬁnd models. 
With respect to the background class, our color-texture an overview of the illumination invariant approaches in [29] in- fusion process is similar to the joint segmentation approach of dicating that several assumptions are needed regarding the re- [12], which integrates gray level and local gradient features. We ﬂecting surfaces and the light sources. These assumptions are extend it by using different and adaptively chosen microstruc- usually not fulﬁlled in a real-world environment. Outdoors, for tural kernels, which suit better the local scene properties. example, the illumination is the composition of the direct sun- Moreover, we show how this probabilistic approach can be light, the diffused light corresponding to the blue sky, and var- used to improve our shadow model. ious additional light components reﬂected from the ﬁeld objects For validation, we use real surveillance video shots and also with signiﬁcantly different spectral distributions. Moreover, the test sequences from a well-known benchmark set [35]. Table I camera sensors may be saturated, especially in the case of dark summarizes the different goals and tools regarding some of shadows; therefore, the measured colors cannot be calculated by the above mentioned state-of-the-art methods and the proposed simpliﬁed physical models. Since some of these color spaces model. For a detailed comparison, see also Section VII. In sum- ignore the luminance components of the color, the resulting mary, the main contributions of this paper can be divided into models become sensitive to noise. three groups. We introduce a statistical shadow model which is In a “local” shadow model [31], independent shadow processes robust regarding the forthcoming artifacts in real-world surveil- are proposed for each pixel. 
The local shadow parameters are lance scenes (Section III-B), and a corresponding automatic trained using a second mixture model similarly to the background parameter update procedure, which is usually missing from pre- in [4]. In this way, the differences in the light absorption-reﬂection vious similar methods (Section V-B). We introduce a nonobject properties of the scene points can be notably considered. How- based, spatial description of the foreground which enhances the ever, a single pixel should be shadowed several times till its es- segmentation results also in low frame rate videos (Section IV). timated parameters converge, whilst the illumination conditions Meanwhile, we show how microstructure analysis can improve should stay unchanged. This hypothesis is often not satisﬁed in the segmentation in this framework (Section III-C). outdoor surveillance environments; therefore, this local process We also have a few assumptions in the paper. First, the camera based approach is less effective in our case. stands in place and it has no signiﬁcant ego-motion. Second, we We follow another approach: shadow is characterized with expect static background objects (e.g., there is no waving river “global” parameters in an image (or in each subregion, in case in the background). The third assumption is related to the illu- of videos having separated scene areas with different lightings), mination: we deal with one emissive light source in the scene; and the model describes how the background values of the dif- however, we consider the presence of additional diffused and ferent sites change, when shadow is projected on them. We con- reﬂected light components. sider the transformation between the shadowed and background values of the pixels as a random transformation; hence, we take II. FORMAL MODEL DESCRIPTION several illumination artifacts into consideration. 
An image is considered to be a 2-D grid of pixels (sites) S, with a neighborhood system on the lattice. The procedure assigns to each pixel a label from the label set {fg, bg, sh}, corresponding to the three possible classes: foreground (fg), background (bg), and shadow (sh). Therefore, the segmentation is equivalent to a global labeling L. As is typical, the label field is modelled as an MRF based on [20]. The image data at pixel s is characterized by a 4-D feature vector

x(s) = [x_L(s), x_u(s), x_v(s), x_T(s)]  (1)

where the first three elements are the color components of the pixel in the CIE L*u*v* space, and x_T(s) is a microstructural response which we introduce in Section III-C in detail. The set X marks the global image data.

TABLE I. COMPARISON OF DIFFERENT CORRESPONDING METHODS AND THE PROPOSED MODEL. NOTES: TEMPORAL FOREGROUND DESCRIPTION, PIXEL STATE TRANSITIONS.

We use a maximum a posteriori (MAP) estimator for the label field, where the optimal labeling L*, corresponding to the optimal segmentation, maximizes the probability

L* = argmax_L P(L | X)  (2)

We assume that the observed image data at the different pixel positions is conditionally independent given a labeling [36]: P(X | L) = prod_{s in S} P(x(s) | lambda(s)), where lambda(s) denotes the label of pixel s.

Considering the low correlation between the color components [33], we approximate the joint distribution of the features by a 4-D Gaussian density function with diagonal covariance matrix for each class omega in {fg, bg, sh}. Accordingly, the distribution parameters are the mean, mu_omega(s), and standard deviation, sigma_omega(s), vectors. With this "diagonal" model, we avoid matrix inversion and determinant computation during the calculation of the probabilities, and the P(x(s) | omega) terms can be directly derived from the 1-D marginal probabilities:

-log P(x(s) | omega) = sum_{i in {L,u,v,T}} [ (x_i(s) - mu_{omega,i}(s))^2 / (2 sigma_{omega,i}^2(s)) + log(sigma_{omega,i}(s) sqrt(2 pi)) ]  (3)

According to (3), each feature contributes with its own additive term to the energy calculus.
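Concretely, with a diagonal covariance the energy of (3) is just a sum of 1-D Gaussian negative log-likelihoods. A minimal sketch (the function names are ours, not the paper's):

```python
import math

def feature_energy(x, mu, sigma):
    # -log of a 1-D Gaussian density: one marginal term of (3)
    return (x - mu) ** 2 / (2.0 * sigma ** 2) + math.log(sigma * math.sqrt(2.0 * math.pi))

def class_energy(x_vec, mu_vec, sigma_vec):
    # With a diagonal covariance matrix, the 4-D energy is simply the sum
    # of the per-feature (L, u, v, T) terms; no matrix inversion is needed.
    return sum(feature_energy(x, m, s) for x, m, s in zip(x_vec, mu_vec, sigma_vec))
```

Comparing class_energy of a pixel against the background, shadow, and foreground parameter vectors yields the data terms used later in the MRF optimization.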
Therefore, the model is modular: the 1-D model parameters, mu_{omega,i}(s) and sigma_{omega,i}(s), can be estimated separately. Meanwhile, to obtain smooth, connected regions in the segmented image, the a priori probability of a labeling, P(L), is defined by the Potts model [37].

The key point in the model is to define the conditional density functions P(x(s) | omega), for all s and omega. For example, P(x(s) | bg) is the probability that the background process generates the observed feature value at pixel s. Later on, the background value at s will also be treated as a random variable with a corresponding probability density function. We define the conditional density functions in Sections III-V, and the segmentation procedure will be presented in Section VII in detail. Before continuing, note that, in fact, we minimize the minus-log of (2); therefore, in the following, we use the local energy terms epsilon_omega(s) = -log P(x(s) | omega) for easier notation.

III. PROBABILISTIC MODEL OF THE BACKGROUND AND SHADOW PROCESSES

A. General Model

We model the distribution of feature values in the background and in the shadow by Gaussian density functions, like, e.g., [11], [12], and [35].

B. Color Features

The use of a Gaussian distribution to model the observed color of a single background pixel is well established in the literature, with the corresponding parameter estimation procedures, such as in [4] and [38]. We train the color components of the background parameters in a manner similar to the conventional online K-means algorithm [4]: the vector mu_bg(s) estimates the mean background color of pixel s measured over the recent frames, while sigma_bg(s) is an adaptive noise parameter. An efficient outlier filtering technique [4] excludes most of the nonbackground pixel values from the parameter estimation process, which works without user interaction.

As we have stated in the introduction, we characterize shadows by describing the background-shadow color value transformation in the images. The shadow calculus is based on the illumination-reflection model [39], which was originally introduced for constant lighting and flat, Lambertian reflecting surfaces.
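Stepping back to the segmentation framework: the MAP labeling under the Potts prior can be approximated, for instance, by iterated conditional modes (ICM). The following is a minimal sketch under that assumption; ICM and the names here are our illustrative choices, not necessarily the paper's optimizer:

```python
import numpy as np

BG, SH, FG = 0, 1, 2  # background, shadow, foreground labels

def icm_segment(data_energy, beta=1.0, n_iter=5):
    """data_energy: (H, W, 3) array of -log P(x(s) | class) values.
    Greedily minimizes data energy plus a Potts smoothness term that
    charges beta for each 4-neighbor with a different label."""
    H, W, _ = data_energy.shape
    labels = data_energy.argmin(axis=2)          # pixelwise ML start
    for _ in range(n_iter):
        for y in range(H):
            for x in range(W):
                best, best_e = labels[y, x], np.inf
                for c in (BG, SH, FG):
                    e = data_energy[y, x, c]
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W and labels[ny, nx] != c:
                            e += beta            # Potts penalty
                    if e < best_e:
                        best, best_e = c, e
                labels[y, x] = best
    return labels
```

With a sufficiently strong beta, isolated pixels whose data term weakly prefers another class are absorbed by their neighborhood, which is exactly the smoothing effect the Potts prior is meant to provide.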
Usually, our scene does not fulfill these requirements. The presented novelty is that we use a probabilistic approach to describe the deviation of the scene from the ideal surface assumptions, and so obtain a more robust shadow detection.

Fig. 1. Illustration of two illumination artifacts (the frame in the left image has been chosen from the "Entrance pm" test sequence). 1: light band caused by a non-Lambertian reflecting surface (a glass door); 2: dark shadow part between the legs (several object parts change the reflected light). The constant ratio model (middle image) causes errors, while the proposed model (right image) is more robust.

1) Measurement of Color in the Lambertian Model: According to the illumination model [39], the response of a given image sensor placed at pixel s is determined by the illumination function e, a factor rho depending on the surface albedo and geometry, and the sensor sensitivity q (4). In the "background," the illumination function is the composition of a direct and some diffused-reflected light components, while a shadowed surface point is illuminated by the diffused-reflected light only.

With further simplifications [39], (4) implies the well-known "constant ratio" rule: namely, the ratio of the shadowed and illuminated values of a given surface point is considered to be constant over the image. The "constant ratio" rule has been used in several applications [11], [12], [21].

Fig. 2. Histograms of the psi_L, psi_u, and psi_v values for shadowed and foreground points collected over a 100-frame period of the video sequence "Entrance pm" (frame rate: 1 fps). Each row corresponds to a color component.
There, the shadow and background Gaussian terms corresponding to the same pixel are related via a globally constant linear density transform. In this way, the results may be reasonable when all the direct, diffused, and reflected light can be considered constant over the scene. However, the reflected light may vary over the image in the case of several static or moving objects, and the reflecting properties of the surfaces may differ significantly from the Lambertian model (see Fig. 1).

The efficiency of the constant ratio model is also restricted by several practical factors, like quantization errors of the sensor values, saturation of the sensors, imprecise estimation of the ratio, or video compression artifacts. Based on our experiments (Section VII), these inaccuracies cause poor detection rates in some outdoor scenes.

2) Proposed Model: The previous section suggests that the ratio of the shadowed and background luminance values of the pixels may be useful, but it is not powerful enough as a descriptor of the shadow process. Instead of constructing a more difficult illumination model (for example, in 3-D with two cameras), we overcome the problems with a statistical model. For each pixel s, we introduce the variable

psi_L(s) = x_L(s) / mu_L(s)  (5)

where, as defined earlier, x_L(s) is the observed luminance value at s, and mu_L(s) is the mean value of the local Gaussian background term estimated over the previous frames [4]. Thus, if the psi_L(s) value is close to the estimated shadow darkening factor, s is more likely to be a shadowed point. More precisely, in a given video sequence, we can estimate the distribution of the shadowed psi_L values globally over the video parts. Based on experiments with manually generated shadow masks, a Gaussian approximation seems to be reasonable regarding the distribution of the shadowed psi_L values (Fig. 2 shows the global statistics regarding a 100-frame period of the outdoor test sequence "Entrance pm"). For comparison, we have also plotted the statistics for the foreground points, which follow a significantly different, more uniform distribution.

Due to the spectral differences between the direct and ambient illumination, cast shadows may also change the u and v color components [40].
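The luminance part of this shadow description can be sketched as follows; the darkening-factor and deviation values below are illustrative placeholders, not the paper's trained parameters:

```python
import math

def shadow_ratio_energy(x_L, mu_L, mu_psi=0.75, sigma_psi=0.1):
    # psi = observed luminance / background mean luminance; under shadow
    # it concentrates around a global darkening factor mu_psi, so we score
    # it with the -log of a global Gaussian.
    psi = x_L / mu_L
    return (psi - mu_psi) ** 2 / (2.0 * sigma_psi ** 2) \
        + math.log(sigma_psi * math.sqrt(2.0 * math.pi))
```

Low energy (high likelihood) near the darkening factor marks a likely shadowed point, while an unchanged pixel (ratio near 1) scores poorly.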
We have found an offset between the shadowed and background u values of the pixels, which can be efficiently modelled by a global Gaussian term in a given scene (similarly for the v component). Hence, we define psi_u(s) (and analogously psi_v(s)) as the difference of the observed value and the background mean:

psi_u(s) = x_u(s) - mu_u(s)  (6)

As Fig. 2 shows, the shadowed psi_u and psi_v values also follow approximately normal distributions.

Consequently, the shadow color process is characterized by a 3-D Gaussian random variable. According to (5) and (6), the color values in the shadow at each pixel position are also generated by Gaussian distributions, whose L-component parameters follow from the background mean scaled by the global darkening statistics (7), (8); regarding the v (and similarly the u) component, the shadow mean is the background mean shifted by the global offset (9). The estimation and the time dependence of the parameters are discussed in Section V-B.

This assumption is inaccurate near the border of the objects, but it is a reasonable approximation if the kernel size (and, thus, the size of the set Omega_s) is small enough; to ensure this condition, we use 3x3 kernels in the following. Accordingly, with respect to (10), x_T(s) in the background (and similarly in the shadow) can be considered a linear combination of Gaussian random variables from the set Phi_s = {x_L(r) : r in Omega_s} (11). We assume that these variables have a joint normal distribution; therefore, x_T(s) is also Gaussian with parameters mu_T(s) and sigma_T(s). The mean value can be determined directly [41] by

mu_T(s) = sum_{r in Omega_s} a_s(r) mu_L(r)  (12)

On the other hand, to estimate the sigma_T(s) parameter, we should model the correlation between the elements of Phi_s. In effect, the variables in Phi_s are not independent, since fine alterations in global illumination or camera white balance cause correlated changes of the neighboring pixel values.
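Combining the ratio term for L with the offset terms for u and v gives the 3-D shadow color description discussed above. A sketch with made-up global parameters (in the paper these are trained, cf. Section V-B):

```python
import math

def gauss_energy(v, mu, sigma):
    # -log of a 1-D Gaussian density
    return (v - mu) ** 2 / (2.0 * sigma ** 2) + math.log(sigma * math.sqrt(2.0 * math.pi))

def shadow_color_energy(x, mu_bg, shadow_params):
    """x, mu_bg: (L, u, v) observed value and background mean at a pixel.
    shadow_params: dict of global (mu, sigma) pairs for the darkening ratio
    psi_L = L / mu_L and the offsets psi_u = u - mu_u, psi_v = v - mu_v
    (keys and values here are illustrative assumptions)."""
    psi = (x[0] / mu_bg[0], x[1] - mu_bg[1], x[2] - mu_bg[2])
    return sum(gauss_energy(p, *shadow_params[k]) for p, k in zip(psi, ("L", "u", "v")))
```

A darkened pixel with nearly unchanged chrominance scores a much lower shadow energy than a pixel identical to the background, which is what separates shadow from background and foreground hypotheses.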
C. Microstructural Features

In this section, we define the fourth dimension of the pixels' feature vectors (1), which contains local microstructural responses.

1) Definition of the Used Microstructural Features: Pixels covered by a foreground object often have local textural features different from those of the background at the same location; moreover, texture features may identify foreground points with background- or shadow-like color. In our model, texture features are used together with the color components, and they enhance the segmentation results as an additional component in the feature vector. Therefore, we make restrictions regarding the texture features: we search for components that can be obtained with low additional computing time from the existing model elements, in exchange for some accuracy.

According to our model, the textural feature is retrieved from a color feature channel by using microstructural kernels. For practical reasons, and following the fact that the human visual system mainly perceives textures as changes in intensity, we use texture features only for the "L" color component.

However, very high correlation is not usual, since strongly textured details or simply the camera noise result in some independence of the adjacent pixel levels. While previous methods have ignored this phenomenon, e.g., by considering the features to be uncorrelated [12], our goal is to give a more appropriate statistical model by estimating the order of correlation for a given scene. We model the correlation factor between the "adjacent" pixel values by a constant over the whole image. Let r and q be two sites in the neighborhood Omega_s of s, and denote the correlation coefficient between x_L(r) and x_L(q) by c(r, q). Accordingly, c(r, q) = 1 if r = q, and c(r, q) = c otherwise, where c is a global constant. To estimate c, we randomly choose some pairs of neighboring sites. For each selected site pair (r, q), we make a set of the time stamps corresponding to common background occurrences of pixels r and q. Thereafter, we calculate the normalized cross correlation between the time series of x_L(r) and x_L(q) over these time stamps. Finally, we approximate c by the average of the collected correlation coefficients over all selected site pairs.
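The estimation of the global correlation factor described above can be sketched as follows (the array layout is an assumption of ours):

```python
import numpy as np

def estimate_neighbor_correlation(stack, pairs):
    """stack: (T, H, W) array of luminance values over frames in which the
    inspected pixels are background. pairs: list of ((y1, x1), (y2, x2))
    randomly chosen neighboring site pairs. Returns the average normalized
    cross correlation, i.e., the global factor c."""
    coeffs = []
    for (y1, x1), (y2, x2) in pairs:
        a, b = stack[:, y1, x1], stack[:, y2, x2]
        coeffs.append(np.corrcoef(a, b)[0, 1])   # normalized cross correlation
    return float(np.mean(coeffs))
```

Neighboring background pixels driven by the same illumination fluctuations yield coefficients close to 1, while strong texture or camera noise pushes the estimate down.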
A novelty of the proposed model (as explained in Section III-C3) is that we may use different kernels at different pixel locations. More specifically, there is a set of kernel coefficients for each site s: {a_s(r) : r in Omega_s}, where Omega_s is the set of pixels around s covered by the kernel. The feature x_T(s) is defined by

x_T(s) = sum_{r in Omega_s} a_s(r) x_L(r)  (10)

2) Analytical Estimation of the Distribution Parameters: Here, we show that, with some further reasonable assumptions, the features defined by (10) also have a Gaussian distribution, and the distribution parameters mu_T(s), sigma_T(s) can be determined analytically. As a simplification, we exploit that the neighboring pixels usually have the same labels, and calculate the probabilities accordingly.

Thereafter, we can calculate sigma_T(s) according to the variance theorem for sums of random variables [41]:

sigma_T^2(s) = sum_{r in Omega_s} a_s^2(r) sigma_L^2(r) + c sum_{r != q} a_s(r) a_s(q) sigma_L(r) sigma_L(q)  (13)

Similarly, the Gaussian shadow parameters regarding the microstructural component follow by using (7), (8), and (12) [(14), (15)].
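The kernel response (10) and its analytically derived Gaussian parameters (12), (13) can be sketched as follows, under the stated assumption of a single pairwise correlation factor c:

```python
import math
import numpy as np

def kernel_response(patch_L, kernel):
    # x_T(s): inner product of the 3x3 kernel with the 3x3 luminance patch, (10)
    return float(np.sum(kernel * patch_L))

def response_moments(kernel, mu_patch, sigma_patch, c):
    """Analytic mean and deviation of the response, assuming the patch values
    are jointly normal with pairwise correlation c (cf. (12) and (13))."""
    mu_T = float(np.sum(kernel * mu_patch))
    a, s = kernel.ravel(), sigma_patch.ravel()
    var = float(np.sum(a ** 2 * s ** 2))
    for i in range(a.size):
        for j in range(a.size):
            if i != j:
                var += c * a[i] * a[j] * s[i] * s[j]
    return mu_T, math.sqrt(max(var, 0.0))
```

Note the two limiting cases: with c = 0 the variance is the plain weighted sum of the pixel variances, while with c = 1 a zero-mean kernel applied to equal deviations has zero response variance, since fully correlated fluctuations cancel.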
The key point is that we identify point on an “untextured” background. some pixels which certainly correspond to the foreground: these An evident enhancement uses several kernels which can rec- are the pixels having signiﬁcantly different levels from the lo- ognize several patterns. However, increasing the number of the cally estimated background and shadow values; thus, they can microstructural channels would intensify the noise, because at a be found by a simple thresholding given pixel position all the “inadequate” kernels give irrelevant responses, which are accumulated in the energy term (3). if AND (16) To overcome the above problem, we use one microstructural otherwise channel only [see (1)], and we use the most appropriate kernel where is a threshold (which is analogous with the uniform at each pixel. Our hypothesis is: if the kernel response at is signiﬁcant in the background, the kernel gives more informa- value in previous models [22] choosing ), and is tion for the segmentation there. Therefore, after we have de- a “preliminary” segmentation label of . ﬁned a kernel set for the scene, at each pixel position , the Next, we estimate for each pixel the local color distribution kernel having the highest absolute response in the background of the foreground, using the certainly foreground pixels in the centered at is used. According to our experiments, different neighborhood of . The procedure is demonstrated in Fig. 4 (for kernel-sets, e.g., corresponding to the Laws-ﬁlters [42], or the easier visualization with 1-D grayscale feature vectors). We use Chebyshev polynomials [42], [43], produce similar results. In the following notations: denotes the set of pixels marked as Sections IV–VII, we use the kernels shown in Fig. 3, which we certainly foreground elements in the preliminary mask have found reasonable for the scenes. 
Regarding the "Entrance pm" sequence, each kernel of the set corresponds to a significant number of background points according to our choice strategy (distributed as 44-19-22-15%), showing that each kernel is valuable.

IV. FOREGROUND PROBABILITIES

The description of the background and shadow characterizes the scene and illumination properties; consequently, it has been possible to collect statistical information about them in time. In our case, the color distribution regarding the foreground areas is not predictable in the same way. If the frame rate is very low and unbalanced, we must consider consecutive images containing different scenarios with different objects. Previous works [21], [22] used a uniform distribution to describe the foreground process, which agrees with the long-term color statistics of the foreground pixels (Fig. 2) but presents a weak description of the class. Since the observed feature values generated by the foreground, shadow, and background processes overlap strongly in numerous real-world scenes, many foreground pixels are misclassified that way.

Instead of temporal statistics, we use spatial color information to overcome this problem, based on the following assumption: whenever s is a foreground pixel, we should find foreground pixels with similar color in its neighborhood. Consequently, if we can estimate the color statistics of the nearby foreground sites, we can decide whether a pixel with a given color is likely part of the foreground. Unfortunately, when we want to assign a probability value to a given pixel describing its foreground membership, the positions of the nearby foreground pixels are also unknown.

Note that F may be a coarse estimation of the foreground [Fig. 4(b)]. Let V_s be the set of the neighboring pixels around s, considering a rectangular neighborhood with window size m [Fig. 4(a)]. Thereafter, F_s is defined with respect to s as the set of neighboring pixels determined as "foreground" by the preprocessing step: the intersection of F and V_s [Fig. 4(c)].

The foreground color distribution around s can be characterized by a normalized histogram h_s over F_s [Fig. 4(d)]. However, instead of using the noisy h_s directly, we approximate it by a "smoothed" probability density function f_s, and determine the foreground probability term as f_s(x(s)).¹ To deal with multicolored or textured foreground components, the estimated function should be multimodal [see a bimodal case in Fig. 4(d)]. Note that we use f_s only to calculate the foreground probability value of s as f_s(x(s)); thus, it is enough to estimate the parameters of the mode of f_s which covers x(s) [see Fig. 4(e)].
Therefore, we consider f_s as the mixture of a weighted Gaussian term and a residual term, for which we only prescribe that f_s remains a probability density function (the Gaussian weight is a factor between 0 and 1).

¹In the spatial foreground model, we must ignore the textural component of x, since different kernels are used at different pixel locations, and the microstructural responses of the various pixels may be incomparable. Thus, in this section, x is considered to be a 3-D color vector, and h a 3-D histogram.

Fig. 4. Determination of the foreground conditional probability term for a given pixel s (demonstrated in grayscale). a) Video image, marking s and its neighborhood V (with window side m = 45). b) Noisy preliminary foreground mask. c) Set F: preliminary detected foreground pixels in V (pixels of V outside F are marked with white). d) Histogram of F, marking x(s) and its neighborhood. e) Result of fitting a weighted Gaussian term to the part of the histogram around x(s); here, 2.71 would be the foreground probability value of each pixel according to the "uniform" model, but the procedure increases the foreground probability to 4.03. f) Segmentation result of the model optimization with the uniform foreground calculus. g) Segmentation result by the proposed model.

TABLE II. FOREGROUND PARAMETER SETTINGS.
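The local foreground term can be sketched as below for a 1-D grayscale feature, fitting a Gaussian to the nearby "certainly foreground" samples around the pixel's own value and mixing it with a uniform residual. The parameter names and default values are our illustrative assumptions, not the paper's settings:

```python
import numpy as np

def foreground_probability(x_s, neighbor_fg_values, delta=10.0, kappa=0.25, u=1.0 / 256.0):
    """x_s: feature value of the pixel; neighbor_fg_values: values of the
    'certainly foreground' pixels in its neighborhood window. Fits a Gaussian
    to the samples within +/- delta of x_s (the mode covering x_s) and mixes
    it with a uniform residual term of density u."""
    v = np.asarray(neighbor_fg_values, dtype=float)
    near = v[np.abs(v - x_s) <= delta]
    if len(near) < 2:
        return u                       # fall back to the uniform model
    mu, sigma = near.mean(), max(near.std(), 1e-3)
    w = kappa * len(near) / len(v)     # weight of the fitted mode
    gauss = np.exp(-(x_s - mu) ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))
    return w * gauss + (1.0 - w) * u
```

When the neighborhood contains many similarly colored foreground samples, the fitted mode boosts the foreground probability well above the uniform value; with no nearby samples, the term degrades gracefully to the uniform model of [21], [22].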
Accordingly, the foreground probability value of site s is statistically characterized by the distribution of its neighborhood in the color domain. The steps of the foreground energy calculation are detailed in Fig. 5. We can speed up the algorithm if we calculate the Gaussian parameters by considering only some randomly selected pixels in F_s [19]. We describe the parameter settings in Section V-A and in Table II.

Fig. 5. Algorithm for the estimation of the foreground probability term. Notations are defined in Section IV.

V. PARAMETER SETTINGS

Our method works with scene-dependent and condition-dependent parameters. Scene-dependent parameters can be considered constant for a specific field and are influenced by, e.g., camera settings, a priori knowledge about the appearing objects, or reflection properties. We provide strategies on how to set these parameters when a surveillance environment is given. Condition-dependent parameters vary in time within a scene; therefore, we use adaptive algorithms to follow them.

We emphasize two properties of the presented model. Regarding the background and shadow processes, only the 1-D marginal distribution parameters should be estimated (Section III-A). Moreover, we should estimate the color-distribution parameters only, since the mean and deviation values corresponding to the microstructural component are determined analytically (see Section III-C2).

Fig. 6. Different periods of the day in the "Entrance" sequence, segmentation results. Above left: in the morning ("am"); above right: at noon; below left: in the afternoon ("pm"); below right: wet weather.

Fig. 7. Shadow statistics on four sequences recorded by the "Entrance" camera of our University campus: histograms of the occurring psi_L, psi_u, and psi_v values of shadowed points. Rows correspond to video shots from different parts of the day. We can observe that the peak of the psi_L histogram strongly depends on the illumination conditions, while the change in the other two shadow parameters is much smaller.

A. Background and Foreground Model Parameters
A. Background and Foreground Model Parameters

The background parameter estimation and update procedure is automated, based on the work in [4]; it presents reasonable results, and it is computationally more effective than the standard EM algorithm.

The foreground model parameters (Section IV) correspond to a priori knowledge about the scene, e.g., the expected size of the appearing objects and the contrast. These features exploit basically low-level information and are quite general; therefore, the method is able to consider a large variety of moving objects in a scene. In our experiments, we set these parameters empirically. Table II gives a detailed overview of the foreground parameters and how to set them. Notes on the threshold parameter are given in Section VII and in Fig. 15.

B. Shadow Parameters

The changes in the global illumination significantly alter the shadow properties (Fig. 6). Moreover, the changes can occur rapidly: indoors due to switching different light sources on or off, and outdoors due to the appearance of clouds. Regarding the shadow parameter settings, we discriminate between parameter initialization and re-estimation.

1) Re-Estimation of the Shadow Mean and Deviation Parameters: The procedure is similar to the one used in [22]. We show it for one color component only, since the other components are updated in the same way. We re-estimate the parameters at fixed time intervals. Consider the set of observed values collected over the pixels detected as shadow between two consecutive re-estimation instants; the empirical mean and the standard deviation of this set drive the update of the corresponding model parameters.
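The interval-wise re-estimation of a shadow mean/deviation pair can be sketched as follows. The saturating weight `kappa = n / (n + c)` is our illustrative stand-in for the paper's weighting term, chosen only to satisfy the stated requirement that more detected shadow points increase the influence of the empirical statistics; function and parameter names are hypothetical.

```python
import statistics

def reestimate_shadow_param(mu, sigma, samples, c=500):
    """One re-estimation step for a shadow parameter pair (cf. Sec. V-B1).

    samples -- values observed over pixels detected as shadow during the
               last re-estimation interval
    c       -- saturation constant of the illustrative weighting scheme
    """
    n = len(samples)
    if n < 2:
        return mu, sigma              # too few observations: keep the old values
    m = statistics.fmean(samples)     # empirical mean over the interval
    d = statistics.pstdev(samples)    # empirical standard deviation
    kappa = n / (n + c)               # more samples -> stronger update
    return (1 - kappa) * mu + kappa * m, (1 - kappa) * sigma + kappa * d
```

With many consistent detections the parameters track the empirical statistics closely; with few detections the previous estimates dominate, which is the qualitative behavior the weighting term is meant to achieve.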
From a practical point of view, initialization may be supervised: shadowed regions are marked by hand in a few video frames, once, after switching on the system. Based on the training data, we can calculate maximum likelihood estimates of the shadow parameters. On the other hand, there is usually no opportunity for continuous user interaction in an automated surveillance environment; thus, the system must adapt to the illumination changes, which calls for an automatic re-estimation procedure.

For the above reasons, we use supervised initialization, and focus on the parameter adaptation process in the following. The presented method is built into a 24-h surveillance system of our university campus. We validate our algorithm via four manually evaluated ground truth sequences captured by the same camera under different illumination conditions (Fig. 6).

According to Section III-B, the shadow parameters are six scalars: the three components of the mean vector and the three components of the deviation vector. Fig. 7 shows the 1-D histograms of the occurring values of the shadow components over the shadowed points for each video shot.

In the update formula, a weighting term depending on the number of collected samples is used: a greater number of detected shadow points increases the influence of the empirical mean and deviation terms, respectively.

2) Re-Estimation of the Luminance Darkening Factor: This parameter corresponds to the average background luminance darkening factor of the shadow. Except for windowless rooms with constant lighting, it is strongly condition dependent. Outdoors, it can vary between 0.6 in direct sunlight and 0.95 in overcast weather. The simple re-estimation from the previous section does not work in this case, since the illumination properties may rapidly change a lot between two re-estimation instants, which would result in absolutely false detected shadow values in the collected set, yielding false mean and deviation parameters for the re-estimation procedure.
Regarding the energy function (19) of Section VI: the minimum is searched over all the possible segmentations of a given input frame. The first part of (19) contains the sum of the local class-energy terms over the pixels of the image [see (3) and (18)]. The second part is responsible for the smooth segmentation: the pairwise term is zero if the two pixels are not neighboring; otherwise it is a constant Potts penalty applied when neighboring pixels receive different labels. In applications using Potts-MRF models, the quality of the segmentation depends both on the appropriate probabilistic model of the classes and on the optimization technique which finds a good global labeling with respect to (19).

Returning to the shadow parameters, we can observe in Fig. 7 that while the variation of two of the three shadow components is low, the luminance darkening factor varies in time significantly. Therefore, we update the parameters in two different ways. We derive the actual darkening factor from the statistics of all nonbackground pixels (here, the background filtering should be done by a good approximation only; we use the Stauffer–Grimson algorithm). In Fig. 8, we can observe that the peaks of the "nonbackground" histograms are approximately in the same location as they were in Fig. 7. The video shots corresponding to the first and second rows were recorded around noon, when the shadows were relatively small; however, the peak is still in the right place in the histogram.

Fig. 8. Statistics for all nonbackground pixels: histograms of the occurring values of the nonbackground pixels in the same sequences as in Fig. 7.

Fig. 9. Updating algorithm for the darkening-factor parameter.

These experiments encourage us to identify the darkening factor with the location of the peak of the "nonbackground" histogram for the scene. The update algorithm is as follows. We define a data structure which contains a value together with its timestamp. We store the "latest" occurring value–timestamp pairs of the nonbackground points in a set, and update the histogram of the values in this set continuously. The key point is the management of this set: we define MAX and MIN parameters which control its size.
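Identifying the darkening factor with the histogram peak of the nonbackground statistics can be sketched as below. The timestamped MAX/MIN set management of Fig. 9 is omitted, and the bin count and value range are illustrative assumptions.

```python
import numpy as np

def darkening_factor_from_peak(nonbg_values, bins=40, value_range=(0.0, 1.0)):
    """Locate the peak of the "nonbackground" darkening-ratio histogram
    (cf. Fig. 8); the updated parameter is identified with this location.

    nonbg_values -- luminance darkening ratios of all nonbackground pixels
    """
    hist, edges = np.histogram(nonbg_values, bins=bins, range=value_range)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])     # center of the modal bin
```

Because foreground colors spread widely over the ratio domain while shadow pixels concentrate around the true darkening factor, the mode of the pooled nonbackground histogram stays close to the shadow peak of Fig. 7 even when only a rough background filter is available.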
The queue management algorithm, which is introduced in Fig. 9, follows four intentions.
• The set always contains the latest available values.
• The algorithm keeps the size of the set between the prescribed bounds MAX and MIN, ensuring the topicality and relevancy of the data contained.
• The actual size of the set is around MAX in case of cluttered scenarios.
• In the case of little or no motion in the scene, the size of the set decreases until MIN. This increases the influence of the forthcoming elements and causes quicker adaptation, since it is faster to modify the shape of a smaller histogram.

The corresponding deviation parameter is updated similarly, but only in the time periods when the peak location does not change significantly. Note that the above update process may fail in scenarios free of shadows. However, that case occurs mostly under artificial illumination conditions, where the shadow detector module can be switched off using a priori knowledge.

Regarding the optimization of (19), the latter factor is a key issue, since finding the global optimum is NP-hard [44]. On the other hand, stochastic optimizers using simulated annealing (SA) [20], [45] and graph cut techniques [44], [46] have proved to be practically efficient, offering a ground to validate different energy models. The results shown in Section VII have been generated by an SA algorithm which uses the Metropolis criteria [47] for accepting new states,2 while the cooling strategy changes the temperature after a fixed number of iterations. The relaxation parameters are set by trial and error, aiming at maximal quality. Comparing the proposed model to reference MRF methods is done using the same parameter settings.

After verifying our model with the above stochastic optimizer, we have also tested some quicker techniques for practical purposes. We have found the deterministic modified Metropolis dynamics (MMD) [36] relaxation algorithm similarly efficient but significantly faster for this task: processing 320×240 images runs at 1 fps.
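An energy of the form (19) and a simple deterministic relaxation can be sketched as follows. The Potts penalty `beta`, the 4-neighborhood, and the greedy coordinate-wise sweep (an ICM-style pass [48], not the SA or MMD optimizers described above) are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def potts_energy(labels, class_energy, beta=1.0):
    """Energy of the form (19): local class-energy terms plus a Potts
    smoothness term over 4-neighborhoods.
    class_energy[i, j, c] -- data term of class c at pixel (i, j)."""
    h, w = labels.shape
    e = class_energy[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    e += beta * (labels[1:, :] != labels[:-1, :]).sum()   # vertical pairs
    e += beta * (labels[:, 1:] != labels[:, :-1]).sum()   # horizontal pairs
    return e

def icm_sweep(labels, class_energy, beta=1.0):
    """One greedy sweep: each pixel takes the label minimizing its local
    data + smoothness cost, so the total energy never increases."""
    h, w, n_classes = class_energy.shape
    for i in range(h):
        for j in range(w):
            costs = class_energy[i, j].copy()
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    costs += beta * (np.arange(n_classes) != labels[ni, nj])
            labels[i, j] = int(np.argmin(costs))
    return labels
```

Such a greedy pass converges quickly but only to a local minimum, which matches the trade-off reported for ICM: roughly three times the speed of MMD, in exchange for some degradation in segmentation quality.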
We note that a coarse but quick MRF optimization method is the ICM algorithm [48]. If we use ICM with our model, the running speed is 3 fps, in exchange for some degradation in the segmentation results.

VI. MRF OPTIMIZATION

The MAP estimator in (2) is realized by combining a conditionally independent random field of signals and an unconditional Potts model [37]. The optimal segmentation corresponds to the global labeling defined by the minimum-energy configuration of (19).

2A state is a candidate for the optimal segmentation.

VII. RESULTS

The goal of this section is to demonstrate the benefit of the introduced contributions of the paper: the novel foreground calculus, the shadow model, and the textural features. The demonstration is done in two ways: in Figs. 10–15, we show segmented images by the proposed and previous methods, while on three sequences we perform numerical evaluation.

Fig. 10. Synthetic example to demonstrate the benefits of the microstructural features. a) Input frame; i)–v) enlarged parts of the input. b)–d) Result of foreground detection based on: b) gray levels; c) gray levels with vertical and horizontal edge features [12]; d) proposed model with adaptive kernel.

Fig. 11. Shadow model validation: comparison of different shadow models in three video sequences (from above: "Laboratory," "Highway," "Entrance am"). Column 1: video image; column 2: illumination invariants based on the C1C2C3 space [29]; column 3: "constant ratio model" of [21] (without object-based postprocessing); column 4: proposed model.

A. Test Sequences

We have validated our method on several test sequences. Here, we show results regarding the following seven videos.
• "Laboratory" test sequence from the benchmark set [35]. This shot contains a simple environment where previous methods [12] have already produced accurate results.
• "Highway" video [35]. This sequence contains dark shadows, but a homogenous background without illumination artifacts. In contrast with [21], our method reaches the appropriate results without postprocessing, which is strongly environment dependent.
• "Corridor" indoor surveillance video. Although on the face of it this is a simple office environment, the bright objects and background elements often saturate the image sensors, and it is hard to accurately separate the white shirts of the people from the white walls in the background.
• Four surveillance video sequences captured by the "Entrance" (outdoor) camera of our university campus in different lighting conditions (Fig. 6). These sequences contain difficult illumination and reflection effects and suffer from sensor saturation (dark objects and shadows). Here, the presented model improves the segmentation results significantly versus previous methods.

B. Demonstration of the Improvements Via Segmented Images

In the introduction, we gave an overview of the state-of-the-art methods (Table I), indicating their way of 1) shadow detection, 2) foreground modeling, and 3) textural analysis.

1) Comparison of Shadow Models: Results of different shadow detectors are demonstrated in Fig. 11. For the sake of comparison, we have implemented in the same framework an illumination invariant ("II") method based on [29], and a constant ratio model ("CR"), similarly to [21]. We have observed that the results of the previous and the proposed methods are similar in simple environments, but our improvements become significant in the surveillance scenes.
• In the "Laboratory" sequence, the "II" approach is reasonable, while the "CR" and the proposed method are similarly accurate.
• Regarding the "Highway" video, although the "II" and "CR" methods find the objects approximately without shadows, the results are much noisier than with our model.
• On the "Entrance am" surveillance video, the "II" method fails completely: shadows are not removed, while the foreground component is also noisy due to the lack of luminance features in the model. The "CR" model also produces poor results: due to the long shadows and various field objects, the constant ratio model becomes inaccurate. Our model handles these artifacts robustly.
The improvements of the proposed method versus the "CR" model can also be observed in Fig. 14 (second and fifth rows).

2) Comparison of Foreground Models: In this paper, we have proposed a basically new approach to foreground modeling, which needs neither a high frame rate, in contrast to [3], [11], and [12], nor high-level object descriptors [15]. Previous models [21], [22] that have used the uniform calculus for expressing the foreground may generate any color in a given domain with the same probability. As shown in Figs. 12–14 (third and fifth rows), the uniform model is often a coarse approximation, and our method is able to improve the results significantly. Moreover, we have observed that our model is robust with respect to fine changes in the threshold parameter (Fig. 15, third row). On the other hand, the uniform model is highly sensitive to setting this parameter appropriately, even in scenarios which can be segmented properly with an adequate uniform value (Fig. 15, second row).

Fig. 12. Foreground model validation: segmentation results on the "Highway" sequence. Row 1: video image; row 2: results by the uniform foreground model; row 3: results by the proposed model.

Fig. 13. Foreground model validation regarding the "Corridor" sequence. Column 1: video image; column 2: result of the preliminary detector; column 3: result with uniform foreground calculus; column 4: proposed foreground model.

3) Microstructural Features: Complementing the pixel-level feature vector with the microstructural component enhances the segmentation result if the background or the foreground is textured. To demonstrate the additional information, Fig. 10 shows a synthetic example. Consider Fig. 10a) as a frame of a sequence where the bright rectangle in the middle corresponds to the foreground (image v shows an enlarged part of it). The background consists of four equal rectangular regions; each of them has a particular texture, enlarged in images i)–iv). Similarly to the real-world case, the observed pixel values are affected by Gaussian noise. Below, we can see results of background subtraction. First (image b), the feature vector only consists of the gray value of the pixel. Second (image c), we complete it with horizontal and vertical edge detectors, similarly to [12]. Finally (image d), we use the kernel set of Fig. 3, with the proposed kernel selection strategy, providing the best results. In Fig. 14, the fourth and fifth rows show the segmentation results with and without the textural components; improvements are observable in the fine details, especially near the legs of the people in the magnified regions.

C. Numerical Evaluation

The quantitative evaluations are done through manually generated ground truth sequences. Since the goal is foreground detection, the crossover between shadow and background does not count as an error. Denote the number of correctly identified foreground pixels of the evaluation sequence by TP (true positives). Similarly, we introduce FP for misclassified nonforeground points, and FN for misclassified foreground points. The evaluation metrics consist of the Recall rate and the Precision of the detection:

Recall = TP / (TP + FN),    Precision = TP / (TP + FP).

For numerical validation, we used 100 frames from the "Entrance pm" sequence and 50–50 frames from the "Highway" and "Entrance am" video shots.
Fig. 14. Validation of all improvements in the segmentation regarding the "Entrance pm" video sequence. Row 1: video frames; row 2: ground truth; row 3: segmentation with the "constant ratio" shadow model [21]; row 4: our shadow model with "uniform foreground" calculus [22]; row 5: the proposed model without microstructural features; row 6: segmentation results with our final model.

TABLE III: VALIDATION OF THE MODEL ELEMENTS. RESULTS WITH (#1) "CONSTANT RATIO" SHADOW MODEL WITH THE "UNIFORM" FOREGROUND MODEL; (#2) "CONSTANT RATIO" SHADOW MODEL WITH THE PROPOSED FOREGROUND MODEL; (#3) "UNIFORM" FOREGROUND MODEL WITH THE PROPOSED SHADOW MODEL; (#4) RESULTS WITH OUR PROPOSED SHADOW AND FOREGROUND MODEL.

Advantages of using MRFs versus morphology-based approaches were examined previously [12], [19]; therefore, we focus here on the state-of-the-art MRF models. The evaluation of the improvements is done by exchanging our new model elements, one by one, for the latest similar solutions in the literature, and comparing the segmentation results. Regarding shadow detection, the "CR" model is the reference, and we compare the foreground model to the "uniform" calculus again.

In Table III, we compare the shadow and foreground models to the reference methods. The results confirm that our shadow calculus improves the precision rate, since it significantly decreases the number of false negative shadow detections. Due to the proposed foreground model, the recall rate increases through detecting several background- or shadow-colored foreground parts. If we ignore both improvements, both evaluation parameters decrease (#1 in Table III).

VIII. CONCLUSION

This paper has introduced a general model for foreground segmentation without any restrictions on a priori probabilities, image quality, or the objects' shapes and speed. The frame rate of the source videos might also be low or unstable, and the method is able to adapt to the changes in lighting conditions. We have contributed to the state of the art in three areas: 1) we have introduced a more accurate, adaptive shadow model; 2) we have developed a novel description for the foreground based on spatial statistics of the neighboring pixel values; 3) we have shown how different microstructure responses can be used in the proposed framework as additional feature components improving the results. We have compared each contribution of our model to previous solutions in the literature and observed its superiority. The proposed method now works in a real-life surveillance system (see Fig. 6), and its efficiency has been validated.

Fig. 15. Effect of changing the foreground threshold parameter. Row 1: preliminary masks; row 2: results with the uniform foreground calculus using a fixed threshold; row 3: results with the proposed model. Note: for the uniform model, 2.5 is the optimal threshold value with respect to the whole video sequence.

ACKNOWLEDGMENT

The authors would like to thank Z. Kato, L. Kovács, and Z. Szlávik for their kind remarks, as well as the anonymous reviewers for their valuable comments and suggestions.

REFERENCES

[1] S. C. Zhu and A. L. Yuille, "A flexible object recognition and modeling system," Int. J. Comput. Vis., vol. 20, no. 3, 1996.
[2] L. Havasi, Z. Szlávik, and T. Szirányi, "Higher order symmetry for non-linear classification of human walk detection," Pattern Recognit. Lett., vol. 27, pp. 822–829, 2006.
[3] Y. Sheikh and M. Shah, "Bayesian modeling of dynamic scenes for object detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 11, pp. 1778–1792, Nov. 2005.
[4] C. Stauffer and W. E. L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 747–757, Aug. 2000.
[5] Y. Zhou, Y. Gong, and H. Tao, "Background segmentation using spatial-temporal multi-resolution MRF," in Proc. Workshop on Motion and Video Computing, 2005, pp. 8–13.
[6] A. Licsár, L. Czúni, and T. Szirányi, "Adaptive stabilization of vibration on archive films," in Proc. CAIP, 2003, vol. LNCS 2756, pp. 230–237.
[7] M. Heikkila and M. Pietikainen, "A texture-based method for modeling the background and detecting moving objects," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 4, pp. 657–662, Apr. 2006.
[8] J. Zhong and S. Sclaroff, "Segmenting foreground objects from a dynamic textured background via a robust Kalman filter," in Proc. IEEE Int. Conf. Computer Vision, 2003, pp. 44–50.
[9] S. Chaudhuri and D. Taur, "High-resolution slow-motion sequencing: How to generate a slow-motion sequence from a bit stream," IEEE Signal Process. Mag., vol. 22, no. 2, pp. 16–24, Feb. 2005.
[10] J. Kato, T. Watanabe, S. Joga, L. Ying, and H. Hase, "An HMM/MRF-based stochastic framework for robust vehicle tracking," IEEE Trans. Intell. Transport. Syst., vol. 5, no. 3, pp. 142–154, Mar. 2004.
[11] J. Rittscher, J. Kato, S. Joga, and A. Blake, "An HMM-based segmentation method for traffic monitoring," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 9, pp. 1291–1296, Sep. 2002.
[12] Y. Wang, K.-F. Loe, and J.-K. Wu, "A dynamic conditional random field model for foreground and shadow segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 2, pp. 279–289, Feb. 2006.
[13] R. Cutler and L. Davis, "Robust real-time periodic motion detection, analysis, and applications," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 781–796, Aug. 2000.
[14] L. Czúni and T. Szirányi, "Motion segmentation and tracking with edge relaxation and optimization using fully parallel methods in the cellular nonlinear network architecture," Real-Time Imag., vol. 7, no. 1, pp. 77–95, 2001.
[15] A. Yilmaz, X. Li, and M. Shah, "Object contour tracking using level sets," in Proc. Asian Conf. Computer Vision, Jaju Islands, Korea, 2004.
[16] A. Cavallaro, E. Salvador, and T. Ebrahimi, "Detecting shadows in image sequences," in Proc. Eur. Conf. Visual Media Production, Mar. 2004, pp. 167–174.
[17] R. Cucchiara, C. Grana, G. Neri, M. Piccardi, and A. Prati, "The Sakbot system for moving object detection and tracking," in Video-Based Surveillance Systems—Computer Vision and Distributed Processing, 2001, pp. 145–157.
[18] K. Siala, M. Chakchouk, F. Chaieb, and O. Besbes, "Moving shadow detection with support vector domain description in the color ratios space," in Proc. Int. Conf. Pattern Recognition, 2004, vol. 4, pp. 384–387.
[19] Cs. Benedek and T. Szirányi, "Markovian framework for foreground-background-shadow separation of real world video scenes," in Proc. Asian Conf. Computer Vision, Hyderabad, India, Jan. 2006, vol. LNCS 3851, pp. 898–907.
[20] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-6, no. 6, pp. 721–741, Nov. 1984.
[21] I. Mikic, P. Cosman, G. Kogut, and M. M. Trivedi, "Moving shadow and object detection in traffic scenes," presented at the Int. Conf. Pattern Recognition, 2000.
[22] Y. Wang and T. Tan, "Adaptive foreground and shadow detection in image sequences," in Proc. Int. Conf. Pattern Recognition, 2002, pp. 983–986.
[23] A. Blake, C. Rother, M. Brown, P. Perez, and P. Torr, "Interactive image segmentation using an adaptive GMMRF model," in Proc. Eur. Conf. Computer Vision, 2004, pp. 456–468.
[24] Z. Kato, T. C. Pong, and G. Q. Song, "Multicue MRF image segmentation: Combining texture and color," in Proc. Int. Conf. Pattern Recognition, Quebec, QC, Canada, Aug. 2002, pp. 660–663.
[25] Cs. Benedek and T. Szirányi, "A Markov random field model for foreground-background separation," in Proc. Joint Hungarian-Austrian Conf. Image Processing and Pattern Recognition, Veszprém, Hungary, May 2005, pp. 103–110.
[26] G. D. Finlayson, S. D. Hordley, C. Lu, and M. S. Drew, "On the removal of shadows from images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 1, pp. 59–68, Jan. 2006.
[27] C. Fredembach and G. D. Finlayson, "Hamiltonian path based shadow removal," in Proc. Brit. Machine Vision Conf., 2005, pp. 970–980.
[28] N. Paragios and V. Ramesh, "A MRF-based real-time approach for subway monitoring," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2001, vol. 1, pp. 1034–1040.
[29] E. Salvador, A. Cavallaro, and T. Ebrahimi, "Cast shadow segmentation using invariant color features," Comput. Vis. Image Understand., no. 2, pp. 238–259, 2004.
[30] F. Porikli and J. Thornton, "Shadow flow: A recursive method to learn moving cast shadows," in Proc. IEEE Int. Conf. Computer Vision, 2005, vol. 1, pp. 891–898.
[31] N. Martel-Brisson and A. Zaccarin, "Moving cast shadow detection from a Gaussian mixture shadow model," in Proc. IEEE Computer Soc. Conf. Computer Vision and Pattern Recognition, Jun. 2005, vol. 2, pp. 643–648.
[32] Y. Haeghen, J. Naeyaert, I. Lemahieu, and W. Philips, "An imaging system with calibrated color image acquisition for use in dermatology," IEEE Trans. Med. Imag., vol. 19, no. 7, pp. 722–730, Jul. 2000.
[33] M. G. A. Thomson, R. J. Paltridge, T. Yates, and S. Westland, "Color spaces for discrimination and categorization in natural scenes," in Proc. Congr. Int. Colour Association, Jun. 2002, pp. 877–880.
[34] L. Li and M. Leung, "Integrating intensity and texture differences for robust change detection," IEEE Trans. Image Process., vol. 11, no. 2, pp. 105–112, Feb. 2002.
[35] A. Prati, I. Mikic, M. M. Trivedi, and R. Cucchiara, "Detecting moving shadows: Algorithms and evaluation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 7, pp. 918–923, Jul. 2003.
[36] Z. Kato, J. Zerubia, and M. Berthod, "Satellite image classification using a modified metropolis dynamics," in Proc. Int. Conf. Acoustics, Speech and Signal Processing, Mar. 1992, pp. 573–576.
[37] R. Potts, "Some generalized order-disorder transformation," Proc. Cambridge Philosoph. Soc., no. 48, pp. 106–109, 1952.
[38] D. S. Lee, "Effective Gaussian mixture learning for video background subtraction," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 5, pp. 827–832, May 2005.
[39] D. A. Forsyth, "A novel algorithm for color constancy," Int. J. Comput. Vis., vol. 5, no. 1, pp. 5–36, 1990.
[40] E. A. Khan and E. Reinhard, "Evaluation of color spaces for edge classification in outdoor scenes," in Proc. Int. Conf. Image Processing, Genoa, Italy, Sep. 2005, vol. 3, pp. 952–955.
[41] W. Feller, An Introduction to Probability Theory and Its Applications, 2nd ed. New York: Wiley, 1966, vol. 1.
[42] W. K. Pratt, Digital Image Processing, 2nd ed. New York: Wiley, 1991.
[43] R. Haralick, "Digital step edges from zero crossing of second directional derivatives," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-6, no. 1, pp. 58–68, Jan. 1984.
[44] Y. Boykov, O. Veksler, and R. Zabih, "Fast approximate energy minimization via graph cuts," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 11, pp. 1222–1239, 2001.
[45] E. Aarts and J. Korst, Simulated Annealing and Boltzmann Machines. New York: Wiley, 1990.
[46] Y. Boykov and V. Kolmogorov, "An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 9, pp. 1124–1137, Sep. 2004.
[47] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., vol. 21, pp. 1087–1092, 1953.
[48] J. Besag, "On the statistical analysis of dirty pictures," J. Roy. Statist. Soc., vol. 48, pp. 259–302, 1986.

Csaba Benedek (S'04) received the M.Sc. degree in computer sciences from the Budapest University of Technology and Economics, Budapest, Hungary, in 2004. He is currently pursuing the Ph.D. degree at the Pázmány Péter Catholic University, Budapest. He is a member of the Distributed Events Analysis Research Group at the Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest. As a visitor, he has recently worked with the ARIANA project at INRIA Sophia-Antipolis, France. His research interests include Bayesian image segmentation, change detection, video surveillance, and aerial image processing.

Tamás Szirányi (SM'91) received the Ph.D. degree in electronics and computer engineering and the D.Sci. degree from the Hungarian Academy of Sciences, Budapest, in 1991 and 2001, respectively. He was appointed to a Full Professor position at Veszprém University, Hungary, in 2001, and at the Pázmány Péter Catholic University, Budapest, in 2004. He is currently a Scientific Advisor at the Computer and Automation Research Institute, Hungarian Academy of Sciences, where he is the head of the Distributed Events Analysis Research Group. His research activities include texture and motion segmentation, surveillance systems for panoramic and multiple camera systems, measuring and testing image quality, digital film restoration, Markov random fields and stochastic optimization, and image rendering and coding. Dr. Szirányi was the founder and first president (1997 to 2002) of the Hungarian Image Processing and Pattern Recognition Society. He is an Associate Editor of the IEEE TRANSACTIONS ON IMAGE PROCESSING. He was honored with the Master Professor award in 2001.
