Normalized Cross-Correlation for Tracking Object and Updating the Template: Exploration with Extensive Dataset

(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 10, No. 02, February 2012

M.H. Sidram, Department of E & EE, Sri Jayachamarajendra College of Engineering, Mysore, Karnataka, India
Nagappa U. Bhajantri, Department of CS&E, Government Engineering College, Chamarajanagar, Karnataka, India
                 mhsidram@g m                                                 m

ISSN 1947-5500

Abstract- Tracking is the process of estimating the path of an object as it moves through the
scene in the image plane. In other words, it is a strategy to detect and track a moving object
through a sequence of frames. Here an attempt is made to use the Normalized Cross-Correlation
strategy for both matching and updating the template while tracking an object in an outdoor
environment. The proposed method is explored on an extensive benchmark dataset. The resulting
system is able to track a single object or multiple objects under varied illumination
conditions, and the outcome supports the worth of the proposed system.

Keywords- Object tracking, Normalized Cross-Correlation, Frame difference, Template updating.

                       I.    INTRODUCTION

    Object tracking is basically an attention-drawing mechanism. It is also the process of
establishing the correspondence of objects across a sequence of frames. It has many
applications, the most important being video surveillance, traffic monitoring and robot
vision. There is no dearth of literature on tracking an object in a moderately complex scene.
Tracking can be achieved through spatial or appearance-based models, several methods have
evolved from the frequency domain, and hybrid approaches also perform effectively. The main
approaches to tracking an object in a scene are point tracking, kernel tracking and silhouette
tracking. Template matching is a sub-class of kernel tracking [8]. Object tracking is made
complex by changes in color and illumination, noise in the images, abrupt motion of the
objects and the computational demands of real-time processing [8]. Today a prime research
topic in computer vision is the detection and tracking of objects. One such application is the
analysis of traffic scenes. Vehicle detection is thus important for civilian as well as
military use, especially in aerial and ordinary traffic scenes, since vehicles are a vital
part of human life.
    This paper proposes a system which tracks the object robustly using the correlation
between the object and a template, and which updates the template with the help of normalized
cross-correlation. The proposed process rests on three building blocks. First, the template is
correlated with the image, and matching is driven by the correlation score. Secondly, the
frame differencing algorithm is employed to produce the motion regions. Finally, sub-images
corresponding to the motion regions are cropped from the frames and stored. The existing
template is then correlated with the sub-images, and the best match replaces it as the new
template. This process is repeated at every fixed interval of frames. Experiments have been
conducted exhaustively on benchmark datasets, namely PETS 2001 (clips 1, 2 and 3) and VISOR
(video for traffic surveillance clip); their details are tabulated in Table I.

TABLE I. Showing the Dataset

  Sr No   Dataset         # of frames   Contents                 Camera
  1       PETS 2001 (1)   2343          Human, Cars and People   Side fixed, Moving tree
  2       PETS 2001 (2)                 and People               Top-Down fixed
  3       PETS 2001 (3)                 Human, Cars and People   Side fixed
  4       VISOR           1495          Human, Cars              Side fixed

    The paper is arranged as follows. Section 2 deals with the related work. Section 3
presents the proposed method. The experiments and results are discussed in Section 4. The
conclusion and future work are portrayed in Section 5.

              II.      COMPREHENSION OF RELATED WORK

    J.P. Lewis [4] demonstrates the potential of normalized cross-correlation based template
matching in the spatial domain.
    Alan J. Lipton et al. [1] employ a combination of frame differencing and template
matching to highlight the object in a scene. The template matching is guided by temporal
differencing and image-based correlation to make the tracking process robust. Further, an
Infinite Impulse Response (IIR) filter is used to update the template; this is known as the
adaptive template matching method. Hieu T. Nguyen et al. [2] tried to model the tracking
process for a rigid object through a Kalman filter, and
consequently updating the template to adapt to changing illumination and orientation of the
object is achieved via an adaptive Kalman filter.
    Longin Jan Latecki et al. [6] proposed a strategy based on a selective hypothesis tracking
algorithm. It includes motion regions, image alignment and minimum cost estimation to update
the template dynamically. In other words, minimum cost matching is established through
association between the motion region and the aligned template, and the motion vector is
updated accordingly.
    Karan Gupta et al. [5] described dynamic template matching and control of the camera's
field of view by PTZ, using a frame difference approach with a properly chosen threshold. This
strategy updates the template instantly, although it is limited to a single object.
    Xue Mei et al. [9] used a probabilistic algorithm for tracking, which includes template
matching and incremental subspace update. The templates are modeled using mixed probabilities
and updated when the object appearance changes considerably. The kernel Gram matrix is
augmented with a row and a column.

    Jiyan Pan et al. [11] address template drift, the gradual shifting of the template away
from the object during tracking. Their work observes carefully where template drift occurs and
updates the template accordingly, employing a Kalman Appearance Filter [11].

    Wenhui Liao et al. [10] introduced a method called Case Based Reasoning (CBR) to maintain
an accurate template of the object automatically. In other words, the algorithm dynamically
updates the case base (template). With this, a real-time face tracker is built that tracks the
face robustly under different orientations and conditions.

    The literature surveyed so far has encouraged us to propose a system based on normalized
cross-correlation to track the object and update the template.
    Hence, we propose template updating through the combination of frame differencing and
normalized cross-correlation as a novel strategy, which is expected to yield the best possible
outcomes. In other words, this work concentrates on a hybrid model for updating the template.
Further, the proposed work ensures the tracking of single and multiple objects in a scene, and
the proposed system addresses the limitations observed in the literature.

                III.       PROJECTED PROCESS

    This section is dedicated to presenting the proposed work, which aims to track the object
and update the template. The simplified block diagram of the general system is shown in
Figure 1.

[Flowchart blocks: Pre-process the acquired image and set Count=0; Compute cross-correlation
of image and template; Localize the object & advance the Count; Count = k? (Yes: Crop the
sub-images corresponding to moving objects detected by frame differencing; Cross-correlate the
sub-images and template to renew template; Template, Count=0); End of video.]

                     Figure 1: Proposed system
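The control flow of the Figure 1 flowchart can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `track_video` and the two callables `locate` and `renew` are hypothetical stand-ins for the localization and template-renewal blocks, and the rule of skipping renewal until a previous frame exists is an assumption.

```python
from typing import Callable, Iterable, List, Tuple


def track_video(frames: Iterable, template, k: int,
                locate: Callable, renew: Callable) -> Tuple[List, object]:
    """Follow the Figure 1 flowchart: localize in every frame, renew every k frames.

    `locate(frame, template)` and `renew(previous, frame, template)` are
    hypothetical callables supplied by the caller; the paper does not name them.
    """
    results = []
    count = 0            # "set Count=0"
    previous = None      # last frame seen, needed later for frame differencing
    for frame in frames:                          # runs until "End of video"
        results.append(locate(frame, template))   # cross-correlate and localize
        count += 1                                # "advance the Count"
        if count >= k and previous is not None:   # "Count = k? Yes"
            template = renew(previous, frame, template)
            count = 0                             # "Template, Count=0"
        previous = frame
    return results, template


# Toy run with stand-in callables: the "template" is just a version number.
boxes, final_template = track_video(
    frames=range(10), template=0, k=5,
    locate=lambda frame, tmpl: (frame, tmpl),
    renew=lambda prev, frame, tmpl: tmpl + 1,
)
print(final_template)  # 2
```

Passing the two steps in as callables keeps the loop independent of how matching and renewal are actually carried out.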

    The computation of normalized cross-correlation is carried out through the mathematical
expressions given in equations 1, 2 and 3. The location where the correlation score reaches
its maximum is then determined, and that location is the best match. It thus gives the
evidence needed to put the bounding box over the object.

$$ d_{f,t}^{2}(u,v) = \sum_{x,y} [f(x,y) - t(x-u, y-v)]^{2} \qquad (1) $$

where f(x, y) is the image and t(x-u, y-v) is the template positioned at (u, v).
$d_{f,t}^{2}(u,v)$ is the squared Euclidean distance, and the summation is done over x and y.
Expanding the square gives

$$ d_{f,t}^{2}(u,v) = \sum_{x,y} [f^{2}(x,y) - 2 f(x,y)\, t(x-u, y-v) + t^{2}(x-u, y-v)] $$

If the terms $\sum_{x,y} f^{2}(x,y)$ and $\sum_{x,y} t^{2}(x-u, y-v)$ are treated as
constants, the remaining term gives the approximate measure called cross-correlation:

$$ C(u,v) = \sum_{x,y} f(x,y)\, t(x-u, y-v) \qquad (2) $$

It is used as a measure of similarity between the image and the template.
    Several difficulties are noticed with equation (2): the image energy varies with position
and can lower the correlation score at the true match, the range of C(u,v) depends on the
template size, and equation (2) is not invariant to changes in illumination. These
difficulties are eliminated through normalization. Therefore the normalized cross-correlation
(γ) is expressed through equation 3 as follows.

$$ \gamma(u,v) = \frac{\sum_{x,y} [f(x,y) - \bar{f}_{u,v}]\,[t(x-u, y-v) - \bar{t}]}{\sqrt{\sum_{x,y} [f(x,y) - \bar{f}_{u,v}]^{2} \, \sum_{x,y} [t(x-u, y-v) - \bar{t}]^{2}}} \qquad (3) $$

where $\bar{f}_{u,v}$ and $\bar{t}$ are the means of the image (over the region under the
template) and of the template.

    Further, the template must be updated, as discussed, and this is achieved through
equation 4. To obtain the absolute difference that reveals the moving object by frame
differencing, the equation below is used.

$$ D = |f_{m} - f_{m-1}| $$

$$ P(i,j) = \begin{cases} f(i,j), & D \ge T \\ 0, & D < T \end{cases} \qquad (4) $$

    The difference image is converted to binary form by selecting a suitable threshold and is
post-processed at a later stage using morphological operations. Then, connected component
analysis helps to label the moving objects, and the centroids of the moving objects are
estimated.
    The proposed strategy crops sub-images around the centroids and then computes the
cross-correlation score between the template and the sub-images. The best match becomes the
new template, and the updating process is repeated at every k-frame interval. As observed
empirically in the experiments, the value of k varies with the dataset, as portrayed in the
plot shown in Figure 2. The entire process is presented below in two phases: the first
algorithm performs the object tracking task, and the second updates the template, which in
turn supports the tracker with enhanced knowledge of the object.

                         Algorithm-I
  1. Convert the input video into frames.
  2. Apply a median filter to obtain noise-free frames.
  3. Initialize the template.
  4. Read the rth frame and the template and compute the correlation score. Put the bounding
     box over the object at the best match.
  5. Generate and update the template after every fixed interval of frames using Algorithm-II.
  6. Repeat steps 4 and 5 for n frames.

                         Algorithm-II
  Generate and update the template after every fixed interval
  1. Initialize the count for the k-frame interval.
  2. Get the absolute difference by subtracting the mth frame from the (m-1)th frame.
  3. Using a threshold, convert the difference image to binary form.
  4. Label the moving objects using connected component analysis.
  5. Determine the centroids of the moving objects.
  6. Crop the sub-images corresponding to the centroids.
  7. Declare a new template using the correlation between the template and the sub-images.

                   IV.    RESULTS AND EXPERIMENTS

    We have conducted experiments to corroborate the performance of the normalized
cross-correlation approach. The computational complexity of the evolved method turns out to be
polynomial, of order O(n^6). The method was tested on a machine with a Pentium(R) Dual-Core
CPU T4200 @ 2.00 GHz and 2.83 GB of RAM at 1.20 GHz.
    The experiments were conducted on the PETS 2001 (video clips 1, 2 and 3) and VISOR (video
for traffic surveillance clip) video datasets, for different sets of frames that include
different objects. Individual objects are tracked using their respective templates; a few of
those selected for the experiments are tabulated in Table VI. A single object as
small as 50 pixels is tracked efficiently. The template is updated every k frames, with k
chosen empirically, which yields better performance. In the experiments k denotes the
interval, in frames, between template updates and is also known as the updating frequency.
This is summarized in Table II to Table V and Figure 2. We have noticed some interesting
observations here, which make us keen on further exploration in future work.
    It is observed that updating the template at every alternate frame becomes computationally
expensive. On the other hand, updating after many frames causes the tracking to fail. Hence a
suitable update frequency k is chosen empirically, for the sake of stability. It is further
noticed by experimentation that the tracking performance increases with the size of the
template; in other words, the larger the template, the better the tracking. The proposed
system has performed robustly over different sets of frames comprising different objects and
varied illumination conditions. Table VI shows that the mis-tracking rate is minimal. The
tracking results can be observed in Figure 5 (a), (b), (c) and (d), which show the human, car
(dark), car (white) and people respectively.

TABLE IV. Showing the effect of template updating upon tracking for PETS 2001 (3) with 52 frames.

  Sl. No   Updating frequency   No. of updates   Tracking   Mis-tracking
  1        1                    52               52         0
  2        2                    26               14         38
  3        5                    10               46         6
  4        6                    8                49         3
  5        10                   5                10         42
  6        15                   3                15         37
  7        20                   2                20         32
  8        25                   2                25         27
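Every tracking and mis-tracking count in these tables rests on the matching step of Section III: evaluate the normalized cross-correlation of equation 3 at each placement of the template and take the maximum. The NumPy sketch below is a brute-force illustration under stated assumptions (grayscale 2-D arrays, template fully inside the image, (u, v) taken as the top-left corner of the placement); the function names `ncc_map` and `best_match` are hypothetical, and this is not the authors' code.

```python
import numpy as np


def ncc_map(image, template):
    """Return gamma(u, v) of equation 3 for every placement of the template.

    out[v, u] is the score with the template's top-left corner at column u, row v.
    """
    f = np.asarray(image, dtype=float)
    t = np.asarray(template, dtype=float)
    th, tw = t.shape
    t0 = t - t.mean()                       # template minus its mean
    t_norm = np.sqrt((t0 ** 2).sum())
    out = np.zeros((f.shape[0] - th + 1, f.shape[1] - tw + 1))
    for v in range(out.shape[0]):
        for u in range(out.shape[1]):
            window = f[v:v + th, u:u + tw]
            w0 = window - window.mean()     # image minus its mean under the template
            denom = np.sqrt((w0 ** 2).sum()) * t_norm
            if denom > 0:                   # flat windows keep a score of 0
                out[v, u] = (w0 * t0).sum() / denom
    return out


def best_match(image, template):
    """Locate the maximum correlation score; returns ((u, v), score)."""
    scores = ncc_map(image, template)
    v, u = np.unravel_index(np.argmax(scores), scores.shape)
    return (int(u), int(v)), float(scores[v, u])


# The template is cut from the image, then the image is brightened and rescaled.
rng = np.random.default_rng(0)
image = rng.random((40, 50))
template = image[12:20, 30:41].copy()
position, score = best_match(2.0 * image + 5.0, template)
print(position)  # (30, 12)
```

The example scales and offsets the image intensities before matching to show the property that motivates normalization: the best match stays at the same location with a score close to 1.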

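The template renewal whose frequency k is varied in these tables follows the steps of Algorithm-II: frame differencing, thresholding, connected component labeling, centroids, cropping and correlation. The sketch below is one possible reading of those steps, not the authors' implementation. It assumes grayscale NumPy frames, 4-connectivity, a crop the same size as the current template, and no morphological post-processing; all function names are hypothetical.

```python
from collections import deque

import numpy as np


def motion_mask(prev_frame, frame, threshold):
    """Threshold the absolute frame difference D = |f_m - f_(m-1)| (equation 4)."""
    d = np.abs(np.asarray(frame, dtype=float) - np.asarray(prev_frame, dtype=float))
    return d >= threshold


def region_centroids(mask):
    """Label 4-connected regions of a boolean mask; return (row, col) centroids."""
    seen = np.zeros(mask.shape, dtype=bool)
    centroids = []
    for r, c in zip(*np.nonzero(mask)):
        if seen[r, c]:
            continue
        queue, ys, xs = deque([(r, c)]), [], []
        seen[r, c] = True
        while queue:                              # breadth-first flood fill
            y, x = queue.popleft()
            ys.append(y)
            xs.append(x)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not seen[ny, nx]):
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        centroids.append((float(np.mean(ys)), float(np.mean(xs))))
    return centroids


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a0, b0 = a - a.mean(), b - b.mean()
    denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
    return float((a0 * b0).sum() / denom) if denom > 0 else 0.0


def renew_template(prev_frame, frame, template, threshold):
    """Crop a template-sized sub-image at each motion centroid; keep the best match.

    Returns (new_template, score); the old template and None if nothing moved.
    """
    frame = np.asarray(frame, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    best, best_score = template, None
    for cy, cx in region_centroids(motion_mask(prev_frame, frame, threshold)):
        top = int(round(cy - (th - 1) / 2))       # centre the crop on the centroid
        left = int(round(cx - (tw - 1) / 2))
        top = min(max(top, 0), frame.shape[0] - th)   # keep the crop inside the frame
        left = min(max(left, 0), frame.shape[1] - tw)
        sub = frame[top:top + th, left:left + tw]
        score = ncc(sub, template)
        if best_score is None or score > best_score:
            best, best_score = sub.copy(), score
    return best, best_score
```

Frame differencing marks both the place an object left and the place it arrived at, so correlating every cropped sub-image with the existing template is what selects the sub-image that actually contains the object.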
TABLE II. Showing the effect of template updating upon tracking for PETS 2001 (1) with 52 frames.

  Sl. No   Updating frequency   No. of updates   Tracking   Mis-tracking
  1        1                    52               52         0
  2        2                    26               14         38
  3        5                    10               46         6
  4        6                    8                49         3
  5        10                   5                10         42
  6        15                   3                15         37
  7        20                   2                20         32
  8        25                   2                25         27

TABLE III. Showing the effect of template updating upon tracking for PETS 2001 (2) with 70 frames.

  Sl. No   Updating frequency   No. of updates   Tracking   Mis-tracking
  1        1                    70               30         40
  2        2                    35               60         10
  3        5                    14               52         18
  4        6                    11               48         22
  5        8                    8                61         09
  6        10                   7                60         10
  7        15                   5                43         27
  8        20                   3                20         50
  9        25                   2                50         20

TABLE V. Showing the effect of template updating upon tracking for VISOR with 23 frames.

  Sl. No   Updating frequency   No. of updates   Tracking   Mis-tracking
  1        1                    23               23         0
  2        2                    11               20         3
  3        5                    4                20         3
  4        6                    3                20         3
  5        9                    2                23         0
  6        10                   2                20         3
  7        15                   1                15         87
  8        20                   1                20         3
  9        25                   1                0          23

[Figure 2 plot. Title: Template update frequency v/s Dataset. Vertical axis: update frequency
count k. Horizontal axis: dataset (PETS (1), PETS (2), PETS (3), VISOR).]

Figure 2: Shows the variations of update frequency with

[Figure 3 PETS 2001 (1): (a) Frame-1, Frame-100; (b) Frame-5, Frame-99; (c) Frame-1, Frame-35;
(d) Frame-2, Frame-68.]

[Figure 4 PETS 2001 (2): panels (a) to (d); frame labels shown: Frame-1, Frame-100, Frame-1,
Frame-35, Frame-7, Frame-94.]


[Figure 5 PETS 2001 (3): (a) Frame-468, Frame-540; (b) Frame-2575, Frame-2665; (c) Frame-830,
Frame-875; (d) Frame-340, Frame-430.]

[Figure 6 VISOR (video for traffic surveillance): (a) Frame-10, Frame-60; (b) Frame-8,
Frame-24; (c) Frame-10, Frame-10; (d) Frame-13.]

Table VI- Object tracking on the PETS 2001 (3) video: type and number of objects, number of frames, and tracked and mis-tracked counts.

 Sl. No.   Scene objects          No. of objects   Object type   Frames   Tracked   Mis-tracked
   1       Human                        1          Human           90        90          0
   2       & Car                        2          Car             90        89          1
   3       Car &                        2          Car (white)     90        90          0
   4       Human, Car & People          3          People          45        45          0

     This paper establishes that the normalized cross-correlation feature can be used to track multiple objects. The procedure is able to track objects as small as 50 pixels, and the template update frequency is decided empirically as k frames. It is observed that tracking is better with a larger template and poorer with a smaller one. Experimental results on the PETS 2001 and VISOR video datasets reveal that the approach is capable of spotting and tracking the object correctly. Future work can focus on tracking objects in different sets of videos and on handling partial and full occlusions. Hence, many future avenues can be pursued based on the success reported in this paper.

                        ACKNOWLEDGMENT

     Thanks are due to the “JSS Research Foundation” for constant support in carrying out the research work.

                        REFERENCES

[1] Alan J. Lipton, Hironobu Fujiyoshi and Raju S. Patil, “Moving Object Classification and Tracking from Real-Time Video”, submitted to IEEE WACV 98, 1998. Fourth edition, 2005.
[2] Hieu T. Nguyen, Marcel Worring and Rein van den Boomgaard, “Occlusion Robust Adaptive Template Tracking”, Intelligent Sensory Information Systems, University of Amsterdam, Faculty of Science, Kruislaan 403, NL-1098 SJ, Amsterdam, The Netherlands.
[3] S2001/pets2001-dataset.html
[4] J. P. Lewis, “Fast Normalized Cross-Correlation”, Vision Interface, pp. 120-123, 1995.
[5] Karan Gupta and Anjali V. Kulkarni, “Implementation of an Automated Single Camera Object Tracking System Using Frame Differencing and Dynamic Template Matching”.
[6] Longin Jan Latecki and Roland Miezianko, “Object Tracking with Dynamic Template Update and Occlusion Detection”.
[7] O. Javed and M. Shah, “Tracking and Object Classification for Automated Surveillance”, The Seventh European Conference on Computer Vision, Copenhagen, May 2002.
[8] A. Yilmaz, O. Javed and M. Shah, “Object tracking: A survey”, ACM Computing Surveys, 38, 4, Article 13, Dec. 2006.
[9] Xue Mei, Shaohua Kevin Zhou and Fatih Porikli, “Probabilistic Visual Tracking via Robust Template Matching and Incremental Subspace Update”, Mitsubishi Electric Research Laboratories.
[10] Wenhui Liao, Yan Tong, Zhiwei Zhu and Qiang Ji, “Robust Object Tracking with a Case-base Updating Strategy”, IJCAI-07.
[11] Jiyan Pan, Student Member, IEEE, and Bo Hu, Member, IEEE, “Robust object tracking against template drift”.
[12] - video for traffic surveillance.

                        AUTHORS PROFILE

     M.H. Sidram is currently pursuing a PhD in Electronics under the University of Mysore, Karnataka, India. He did his M.Tech in CEDT, Indian Institute of Science, Bangalore, in 2003. He is working as Associate Professor in the E&EE department at Sri Jayachamarajendra College of Engineering, Mysore, Karnataka, India. His area of research is Image and Video processing.

     Nagappa U. Bhajantri has completed a PhD under the University of Mysore, Karnataka, India. He did his M.Tech in Computer Technology (Electrical Engineering Department) from Indian Institute of Delhi, India, in 1999. His areas of interest are Image, Video and Melody processing. He is currently working as Professor and HOD of Computer Science and Engineering at Government Engineering College, Chamarajanagar, Karnataka, India.
