The AIT 3D Multimodal Person Tracker for CLEAR 2007

Nikos Katsarakis, Fotios Talantzis, Aristodemos Pnevmatikakis & Lazaros Polymenakos
{nkat,fota,apne,lcp}@ait.edu.gr

May 8, 2007

Visual 3D PT: Input and Association
 • Run the 2D AIT body tracker to find bodies, and the face
   tracker to find faces within them, in 4 camera views. Use the
   face bounding boxes (BBs) as input for 3D association
 • Span the 3D space using a cube of 5 cm edge (125 cm³)
   – Map the cube to all camera views
   – Collect faces if the cube center falls inside a face BB
 • Consider only cubes with at least 2 cameras contributing a face
 [Figure: room layout with x, y, z axes, a Presentation Area, markers labeled A, B, C, D and MK3, the coordinates (250,360,170) cm, (110,110,135) cm, (270,80,130) cm and (200,40,135) cm, and a 125 cm³ cube spanning the 3D space]
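
The cube-spanning association on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`project`, `inside`, `candidate_cubes`), the 3x4 projection-matrix camera model, and the `(left, top, right, bottom)` bounding-box layout are all assumptions made for the example.

```python
from itertools import product

def project(P, X):
    """Project a 3D point X (cm) with a 3x4 camera matrix P (assumed model).
    Returns pixel coordinates (u, v), or None if the point is not in front."""
    x, y, z = X
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0:
        return None
    return h[0] / h[2], h[1] / h[2]

def inside(bb, uv):
    """True if pixel uv lies inside bb = (left, top, right, bottom)."""
    left, top, right, bottom = bb
    return left <= uv[0] <= right and top <= uv[1] <= bottom

def candidate_cubes(cameras, face_bbs, bounds, edge=5.0, min_cams=2):
    """Scan the room with cubes of the given edge and keep those whose
    center projects into a face BB in at least min_cams camera views.

    cameras  : list of 3x4 matrices, one per camera
    face_bbs : list (one entry per camera) of lists of face BBs
    bounds   : ((x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi)) in cm
    Returns a list of (cube_center, {camera_index: [face indices]}).
    """
    axes = []
    for lo, hi in bounds:
        n = int((hi - lo) // edge)
        axes.append([lo + edge * (i + 0.5) for i in range(n)])
    hits = []
    for center in product(*axes):
        faces = {}
        for cam, (P, bbs) in enumerate(zip(cameras, face_bbs)):
            uv = project(P, center)
            if uv is None:
                continue
            matched = [i for i, bb in enumerate(bbs) if inside(bb, uv)]
            if matched:
                faces[cam] = matched
        if len(faces) >= min_cams:
            hits.append((center, faces))
    return hits
```

Each returned entry pairs a 3D cube center with the faces that support it, which is the kind of candidate association the following slide validates.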
Visual 3D PT: Validate Associations
 • For multiple people, find sets of mutually
   exclusive associations
 • Select one set of associations based on track
   consistency
   – as indicated by the 3D Kalman trackers that maintain
     the tracks
 • Eliminate tracks that are too short, which removes
   wrong associations
 • Optionally, run the AIT body tracker on the panoramic
   camera and require the found 3D positions to map
   inside the detected bodies
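
A rough sketch of the validation step follows. It assumes each candidate association is a pair `(position, faces)`, where `faces` is a set of `(camera, face_index)` tuples, and that the 3D Kalman trackers supply one predicted position per track. The slide does not say how track consistency is scored, so the distance-based cost here is an assumption, and all function names are hypothetical.

```python
from itertools import combinations, permutations
from math import dist

def exclusive_sets(cands, k):
    """All k-subsets of candidates in which no face is used twice."""
    out = []
    for combo in combinations(cands, k):
        used = [f for _, faces in combo for f in faces]
        if len(used) == len(set(used)):
            out.append(combo)
    return out

def consistency_cost(combo, predictions):
    """Smallest total distance between candidate positions and predicted
    track positions over all one-to-one assignments (assumed criterion)."""
    return min(
        sum(dist(pos, pred) for (pos, _), pred in zip(perm, predictions))
        for perm in permutations(combo)
    )

def select_associations(cands, predictions):
    """Pick the mutually exclusive set that best agrees with the tracks."""
    sets_ = exclusive_sets(cands, len(predictions))
    if not sets_:
        return ()
    return min(sets_, key=lambda c: consistency_cost(c, predictions))

def drop_short_tracks(tracks, min_len):
    """Keep only tracks with at least min_len positions."""
    return {tid: pts for tid, pts in tracks.items() if len(pts) >= min_len}
```

The exhaustive enumeration is only practical for a handful of people and candidates, which is the setting the slide describes.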
Audio 3D PT
 • State-space approach based on particle filters (PFs)
   – The PF assumes that the source moves according to a
     model that is consistent across time frames
   – The PF uses time delay estimates from pairs of
     microphones as input
 • A Voice Activity Detector is integrated to deal with
   short pauses in speech
 • An external PF is initialized at every frame to deal
   with cases where the active speaker changes or
   particles get trapped in a spurious location
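
To make the particle-filter idea concrete, here is a minimal sketch of one predict/update/resample step driven by time delays between microphone pairs. It is not the authors' tracker: the 2D state, the random-walk motion model, the Gaussian likelihood, and the parameter values (`motion_std`, `tdoa_std`) are assumptions chosen for illustration, and the Voice Activity Detector and external PF are left out.

```python
import math
import random

C = 343.0  # speed of sound in m/s (room-temperature approximation)

def tdoa(src, mic_a, mic_b):
    """Time difference of arrival (s) of a source at two microphones."""
    return (math.dist(src, mic_a) - math.dist(src, mic_b)) / C

def pf_step(particles, measured, mic_pairs, rng,
            motion_std=0.05, tdoa_std=1e-3):
    """One bootstrap particle-filter step.

    particles : list of (x, y) hypotheses in metres
    measured  : one time-delay estimate (s) per microphone pair
    mic_pairs : list of ((xa, ya), (xb, yb)) microphone positions
    rng       : random.Random instance
    Returns (resampled particles, weighted-mean position estimate).
    """
    # Predict: random-walk motion model (an assumed stand-in).
    moved = [(x + rng.gauss(0, motion_std), y + rng.gauss(0, motion_std))
             for x, y in particles]
    # Update: weight each particle by how well it explains the delays.
    weights = []
    for p in moved:
        err2 = sum((tdoa(p, a, b) - m) ** 2
                   for (a, b), m in zip(mic_pairs, measured))
        weights.append(math.exp(-0.5 * err2 / tdoa_std ** 2))
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to uniform weights.
        weights = [1.0 / len(moved)] * len(moved)
    else:
        weights = [w / total for w in weights]
    estimate = (sum(w * p[0] for w, p in zip(weights, moved)),
                sum(w * p[1] for w, p in zip(weights, moved)))
    # Resample in proportion to the weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate
```

Calling `pf_step` once per audio frame, feeding the resampled particles back in, gives a basic tracker; a particle set that has collapsed onto a wrong location is the failure case the external PF on the slide is meant to address.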
Audiovisual 3D PT
 • Use video and synchronized audio positions
 • No audio: Track last target from video
 • Noise or not single speaker: no output
 • Audio position close to video: output video position
 • Audio position far from video: output audio position
 [Figure: decision flowchart. Boxes: 3D positions (video tracker); 3D position (audio tracker); Synchronization; Track video positions; Audio data available?; Multiple speakers or noise?; Previously tracked target?; Find closest video match; Closer than D?; Output audio position; Output video position; No output]
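
The decision logic on this slide can be written down as a short function. This is a sketch under assumptions: the function name `fuse`, the dictionary of video tracks keyed by track id, and the handling of the remembered target after each branch are illustrative choices not stated on the slide. `max_dist` plays the role of the threshold D.

```python
from math import dist

def fuse(video_tracks, audio_pos, audio_is_clean, last_target, max_dist):
    """Combine synchronized video and audio positions for one frame.

    video_tracks   : {track_id: (x, y, z)} from the video tracker
    audio_pos      : (x, y, z) from the audio tracker, or None if no audio
    audio_is_clean : False for noise or more than one speaker
    last_target    : track id followed in the previous frame, or None
    max_dist       : distance threshold D
    Returns (output position or None, track id to remember or None).
    """
    if audio_pos is None:
        # No audio: keep following the previously tracked video target.
        if last_target in video_tracks:
            return video_tracks[last_target], last_target
        return None, None
    if not audio_is_clean:
        # Noise or not a single speaker: no output.
        return None, last_target
    if not video_tracks:
        return audio_pos, None
    # Find the video track closest to the audio position.
    tid = min(video_tracks, key=lambda t: dist(video_tracks[t], audio_pos))
    if dist(video_tracks[tid], audio_pos) < max_dist:
        # Audio close to video: output the video position.
        return video_tracks[tid], tid
    # Audio far from every video track: output the audio position.
    return audio_pos, None
```

For example, with two video tracks and an audio position within D of the first, the function returns the first track's video position and remembers that track for frames without audio.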
3D PT: Results
 [Figure: bar chart of MOTA (%) for the Visual, Audio and A/V trackers, grouped as All, AIT, IBM, ITC, UKA and UPC; vertical axis from -30 to 70]
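
For readers unfamiliar with the metric, the sketch below computes MOTA (Multiple Object Tracking Accuracy) following the standard CLEAR MOT definition: one minus the ratio of all errors (misses, false positives, identity mismatches) to the number of ground-truth objects, summed over frames. Whether the chart uses exactly this variant is an assumption; the function name and argument layout are illustrative. The definition also shows why the axis extends below zero: when the errors outnumber the ground-truth objects, MOTA is negative.

```python
def mota(misses, false_positives, mismatches, ground_truth):
    """MOTA in percent from per-frame error and ground-truth counts.

    Each argument is a sequence with one count per frame.
    """
    total_gt = sum(ground_truth)
    if total_gt == 0:
        raise ValueError("no ground-truth objects in the sequence")
    errors = sum(misses) + sum(false_positives) + sum(mismatches)
    return 100.0 * (1.0 - errors / total_gt)
```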
Visual 3D PT: Relation to 2D FT
 [Figure, three panels: (1) PT MOTA - FT MOTA (%) for AIT, IBM, ITC, UKA and UPC; (2) 3D person tracking MOTA (%) versus face tracking false positive rate (%); (3) 3D person tracking MOTA (%) versus face tracking miss rate (%)]
 • MOTA improvement over 2D FTs
 • Most important role in 3D MOTA is the 2D
   miss rate
   – large slope in linear fit
								