
Object Recognition Using Alignment

     Brian J. Stankiewicz
Approaches to Human Object Recognition
• Alignment Approach
  – Store image(s) in memory
  – Use image transformations to bring a new view
    into alignment with a stored image.
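The alignment idea can be illustrated with a small sketch. It is not a model from the lecture: the point-set "images", the mismatch measure, and the 5-degree search step are all assumptions chosen to keep the example short. A stored shape is kept in memory, and a new view is rotated through candidate angles until it lines up with the stored shape.

```python
import math

def rotate(points, degrees):
    """Rotate 2D points counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def mismatch(a, b):
    """Sum, over points in a, of the distance to the closest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a)

def align(stored, view, step=5):
    """Try each candidate rotation of the new view and keep the best one."""
    best = min(range(0, 360, step),
               key=lambda d: mismatch(rotate(view, d), stored))
    return best, mismatch(rotate(view, best), stored)

# Hypothetical stored "image": an L-shaped point set with no rotational symmetry.
STORED = [(0, 0), (0, 1), (0, 2), (1, 0)]
# The same shape seen from a new view, rotated by 40 degrees in the picture plane.
NEW_VIEW = rotate(STORED, 40)

angle, error = align(STORED, NEW_VIEW)
# angle is 320: rotating the new view a further 320 degrees undoes the 40-degree turn.
```

A plain template match would compare NEW_VIEW to STORED directly and fail; the search over rotations is what brings the two into register.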
Approaches to Human Object Recognition
• Alignment Approach
[Figure: template matching failures]
Approaches to Human Object Recognition
• Alignment Approach
  – A category of object has many different exemplars.
    How does one handle this type of variability?
Approaches to Human Object Recognition
• Structural Description
  – Pre-process image before storing in memory
  – Decompose object into simple parts
  – Describe the object’s shape in terms of its
    parts
     • Parts are described using specific non-accidental
       properties
Structural Descriptions
• Objects are decomposed into “parts”.
• Objects are described by specifying the configuration
  of parts and their relations.
Structural Descriptions
• Each part is described by specifying the values
  of particular shape parameters.
• Varying a parameter varies the shape.
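One way to picture a structural description is as a small data structure. The sketch below is only an illustration: the parameter names (axis, cross_section) and the relation labels are hypothetical, since the slides do not list the actual shape parameters.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Part:
    name: str
    axis: str            # hypothetical parameter: "straight" or "curved"
    cross_section: str   # hypothetical parameter: "round" or "square"

@dataclass(frozen=True)
class Relation:
    kind: str            # hypothetical relation label, e.g. "side-attached"
    first: str
    second: str

@dataclass(frozen=True)
class StructuralDescription:
    parts: frozenset
    relations: frozenset

body = Part("body", axis="straight", cross_section="round")
handle = Part("handle", axis="curved", cross_section="round")

mug = StructuralDescription(
    parts=frozenset({body, handle}),
    relations=frozenset({Relation("side-attached", "handle", "body")}),
)

# Varying one parameter of one part yields a different shape description.
boxy_body = replace(body, cross_section="square")
boxy_mug = replace(mug, parts=frozenset({boxy_body, handle}))

# Keeping the same parts but changing their relation also yields a different object.
top_handle = replace(
    mug, relations=frozenset({Relation("top-attached", "handle", "body")})
)
```

Note that nothing in the description records viewpoint: the same parts and relations describe the object from any view in which the parts are visible.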
Structural Descriptions
• Challenges
  – How do you decompose an image into objects and
    objects into parts?
  – How do you determine the shape parameters of
    a part given an image?
     • This topic will be covered next week in the Biederman
       and Biederman & Cooper papers.
                        Today…
• Begin by investigating the effect of viewpoint on
  object recognition.
   – Look for evidence of alignment approach
   – Shepard & Metzler
      • Mental rotation of 3D shapes
      • Picture Plane and Depth rotations
   – Tarr & Pinker
      • Mental rotation of 2D shapes
      • Picture plane rotation only
      • Multiple-Views Hypothesis
          Shepard & Metzler
• Wanted to understand how humans
  recognize different views of the same
  object.
  – Different images of same 3D shape can be
    produced by manipulating viewpoint
• Investigated the effect of depth and picture-plane
  rotations.
Same/Different Paradigm
[Figure: a pair of “R” stimuli]
Shepard & Metzler: Stimuli
• “Novel” stimuli: subjects have little previous experience with them
• Fairly difficult task
   – Cannot rely on simple features
• Able to carefully control view information.
  Shepard & Metzler: Procedure
• Two images presented simultaneously
  – Images of identical or “mirror reflected” objects
• Subjects indicated whether two images
  depicted same object
  – Responded by pulling a “lever”
• Record response times
Shepard & Metzler: Results
• Response times increased linearly with orientation
• Suggests that subjects are “mentally rotating” images
  to determine match.
[Figure: RT to “same” responses as a function of angle of rotation]
Shepard & Metzler: Results
• Reaction times increased linearly with depth orientation
• Suggests a similar mechanism
Shepard & Metzler: Results
• Not only do reaction times increase linearly for both depth
  and picture-plane rotations, but the two functions have
  very similar slopes.
• Suggestive of a single “mental rotation” mechanism.
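The slope argument can be made concrete with a least-squares fit. The numbers below are made up for illustration and are not Shepard & Metzler's data: the sketch assumes a constant rotation rate of 60 degrees per second and two arbitrary base times, then shows that the fitted slopes agree even though the intercepts differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def simulated_rt(angle, base_s, rate_deg_per_s):
    """RT in seconds if the subject rotates at a constant rate, then responds."""
    return base_s + angle / rate_deg_per_s

ANGLES = list(range(0, 181, 20))   # angular difference between the two images
RATE = 60.0                        # assumed rotation rate (illustrative only)

# Assumed base times (illustrative only); same rate in both conditions.
picture_plane = [simulated_rt(a, 1.0, RATE) for a in ANGLES]
depth = [simulated_rt(a, 1.2, RATE) for a in ANGLES]

slope_pp, intercept_pp = fit_line(ANGLES, picture_plane)
slope_depth, intercept_depth = fit_line(ANGLES, depth)

# The reciprocal of the slope is the implied rotation rate in degrees per second.
implied_rate = 1.0 / slope_pp
```

In this sketch, equal slopes mean equal implied rotation rates, which is the sense in which similar slopes point to a single mechanism.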
           Object recognition
• Two fundamental approaches to human
  object recognition
  – Alignment approaches
     • Object recognition through alignment process
  – Structural description approach
     • Decomposition of features included in an object
     • Describe the object’s shape in terms of its parts
       and the relations among the parts.
What is alignment?
• Definition
  – A process that transforms an image so that a
    new view is brought into alignment with a stored image.
• Why do we need alignment?
  – We cannot reliably recognize objects by
    template matching alone
  – Need some process that transforms input
    images or data → alignment
Two studies of alignment approaches
• Shepard & Metzler
  – Mental rotation of 3D object shapes
  – A single mental rotation mechanism
  – Evidence*: same results from depth-rotated and
    picture-plane-rotated pairs.
• Tarr & Pinker
  – Multiple-views hypothesis (?)
                Tarr & Pinker
• Wanted to investigate “mental rotation” in
  more detail
  – Two hypotheses
     • Single canonical image stored in memory and all
       new images are aligned to that single representation
     • Multiple-Views stored in memory.
        – Align new view to closest stored view
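The two hypotheses make different predictions about how far a test view must be rotated. The sketch below is a simplification under stated assumptions: rotation is taken along the shortest path, the canonical view is placed at 0 degrees, and the stored views (0, 120, 240) are hypothetical values rather than the orientations used in the experiments.

```python
def angular_distance(a, b):
    """Smallest rotation in degrees (0 to 180) between two orientations."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def rotation_single_view(test, canonical=0):
    """Single-canonical-view hypothesis: always rotate to the one stored view."""
    return angular_distance(test, canonical)

def rotation_multiple_views(test, stored_views):
    """Multiple-views hypothesis: rotate to the closest stored view."""
    return min(angular_distance(test, v) for v in stored_views)

STORED_VIEWS = [0, 120, 240]   # hypothetical stored orientations

predictions = {
    test: (rotation_single_view(test),
           rotation_multiple_views(test, STORED_VIEWS))
    for test in (0, 60, 120, 180)
}
# predictions == {0: (0, 0), 60: (60, 60), 120: (120, 0), 180: (180, 60)}
```

If response time grows with the amount of rotation, the single-view account predicts RT rising steadily with distance from the canonical view, while the multiple-views account predicts dips at every stored orientation.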
       Tarr & Pinker: Method
• Train subjects to recognize small set of
  novel, letter-like objects.
  – Did a “handedness” task
  – Is the image the trained image (standard) or its
    mirror reversal?
Tarr & Pinker: Stimuli
• Novel, letter-like images.
• Subjects trained on 3 of the images
   – Reduces stimulus-specific effects
       Tarr & Pinker: Procedure
• Trained subjects on 4 different orientations
   – (0°, 45°, -90°, 135°)
• Tested on trained and “surprise orientations”
• Measured response times
Tarr & Pinker: Exp. 1 Results
• Initial reaction times similar to Shepard & Metzler
• Surprise orientations slower than trained
• Performance improves after 13 blocks
   – Blocks 1 to 12: practice
   – Block 13: practice + surprise
Tarr & Pinker: Exp. 1 Results
• Compute the best-fitting line to obtain the slope
• Rotation required for each surprise orientation
   – 90°: 45°
   – -135°: 45°
   – -45°: 45°
   – but 180°: 90°
• “4 different orientation images stored in memory?”
Tarr & Pinker: Exp. 1 Results
High slope = much rotation = single canonical image
 Tarr & Pinker: Exp. 1 Summary
• Stimuli showed a similar result to previous
  findings
  – Increased RT with disparate orientations from
    training
  – Subjects showed improvement following
    training
  – Even after training, subjects were slower on
    non-trained (intermediate) orientations
Tarr & Pinker: Exp. 2 Motivation
• Demonstrated an improvement in
  recognition times with training.
  – Not a demonstration of canonical or multiple
    views.
  – Experiment 2: train on a few orientations and
    test on multiple orientations.
  – See if there is evidence for rotating to the
    “nearest” trained orientation.
      Tarr & Pinker: Methods
• Similar to Experiment 1
  – However, classification task rather than
    “handedness” task.
     • Three objects: “Kip”, “Kef”, “Kor”, and distractors
  – Record response times
Tarr & Pinker: Exp. 2 Procedure
• Train on 3 orientations
• Test on multiple intervening orientations
• Look for rotation functions to nearest trained orientation
Tarr & Pinker: Exp. 2 Results
 Tarr & Pinker: Exp. 2 Summary
• Investigated whether subjects show a
  linearly increasing RT to canonical view or
  closest trained view.
  – Showed mixed evidence.
  – For 0° and 210° it appears that there is a dip in
    the surrounding RTs
     • Suggests rotation to nearest orientation
  – For 105° no evidence of alignment.
Tarr & Pinker: Exp. 2 Results
• Mental rotation in Block 1
• By Block 13, trained orientations are fast
• Mental rotation rate for untrained orientations is slower.
         Tarr & Pinker: Study 3
• Wanted to see if “handedness” played a role in
  recognition times.
  – Experiment 1 showed effect for handedness
    judgment.
  – Subjects might engage in handedness judgment
    unnecessarily.
  – Trained on both “standard” and “reversed” images
  – Tested on both sets of images
     • No handedness judgment required
Tarr & Pinker: Exp. 3 Results
[Figure: results plot; surviving labels mark the 180°, -135°, 90°, and -45° orientations]

				