# Adaptive Pose Priors for Pictorial Structures


```					Adaptive Pose Priors for Pictorial Structures

Benjamin Sapp, Chris Jordan and Ben Taskar

Presented by Arpit Mittal

June 29, 2010

Introduction

[Figure: test image, global pose priors, adaptive pose priors]

Overview

Contributions

The Adaptive Pictorial Structures model (APS)
A simple greedy procedure to obtain a sparse set of exemplars
An effective kernel based on shape information
State-of-the-art performance on the Buffy and ETHZ Pascal datasets

Basic PS Model
A pictorial structures (PS) model is a graphical model in which the nodes represent object parts and the edges encode pairwise geometric relationships.
For human pose, the PS model decomposes into a tree structure of unary and pairwise potentials.

Here $\psi_i(l_i, x)$ denotes the features of the image x at the location $l_i$. The location $l_i$ of the part $L_i$ is defined as $l_i = [l_{ix}\;\; l_{iy}\;\; l_{iu}\;\; l_{iv}]^T$.
Inference can be performed in $\Theta(|L_i|^2)$ using dynamic programming. MAP estimates can be computed efficiently using a generalized distance transform in $\Theta(|L_i|)$ time, where $L_i$ denotes the set of positions for a part.
APS Model
The APS model has the following canonical form:

Here $p_0(l|x)$ is the fixed portion of the model, learned discriminatively a priori.
$\mu(x)$ and $\phi(l)$ include both the unary and pairwise factors.
All of the parameters (unary and pairwise) depend on the data x.

The components of $\mu$ are defined as the kernel regression estimates of features of the labels on the training set $\{(x^t, l^t)\}_{t=1}^T$.

Pairwise Potential

Pairwise features are defined, for each pair of connected parts, as
$\phi_{ij}(l) = l_i - l_j = [l_{ix} - l_{jx}\;\; l_{iy} - l_{jy}\;\; l_{iu} - l_{ju}\;\; l_{iv} - l_{jv}]^T$.
The parameter $\mu_{ij}(x)$ is expressed in a locally parametric form as a kernel-weighted average of the displacements in the training set:
$\mu_{ij}(x) = \sum_{t=1}^T K_{ij}^t(x)\,\phi_{ij}(l^t) \,/\, \sum_{t=1}^T K_{ij}^t(x)$

Unary Potential

Unary features $\phi_{il}(l)$ are defined for each state location l of each part i as:

where $\mathrm{bin}_{uv}(l)$ denotes the angular bin that l falls into and $\omega$ is a scalar that incorporates neighbourhood.

The corresponding unary parameters are:
$\mu_{il}(x) = \sum_{t=1}^T K_i^t(x)\,\phi_{il}(l^t) \,/\, \sum_{t=1}^T K_i^t(x)$

Global Location Prior vs. Adaptive Location Prior

[Figure: global location prior and adaptive location prior, side by side]

Exemplar Selection

The kernel $K_i(x, (x^t, l^t)) \equiv K_i^t(x)$ denotes the similarity between x and training example t.
A subset of the training examples is selected that can provide good kernel regression estimates for the whole training set:

where err(.) is some error function between the features of the ground truth, $f(l^t)$, and their kernel regression estimate.

The selection vector s is a binary vector whose components indicate whether the corresponding training examples are selected.
Exemplar Selection
The original problem is solved approximately with a simple greedy forward selection of training examples:

Find the example t in the set of unselected examples that reduces J(s) the most when added to the selected set.
Set $s_t \leftarrow 1$.
Repeat until s = 1 (every example is selected).
Choose the final s, from all vectors s seen during the algorithm, as the one with the smallest value of J(s).

[Figure: sample exemplars]
Kernels for human pose estimation

In the APS model, the aim is to find good training examples via the kernels $K_i^t(x)$ and $K_{ij}^t(x)$, which can lead to query-specific improvements in the unary and pairwise terms.
For every training instance $(x^t, l^t)$, a set of examples $\{(x_a^t, l_a^t)\}$ is generated via affine transformations $a \in A$ of varying scale and location.
The affine transforms are filtered using a quick, coarse region support distance $d_{region}(x, (x_a^t, l_a^t))$ to get a shortlist of plausible affine transformations, A

[Figure: training image and its affine transformations]
Region distance

The query image x is segmented into superpixels.

[Figure: test image and its superpixelization]
Ground truth from the training examples (A) is converted into a binary template mask $M_a^t$ by rendering the upper and lower arms as rectangles.

Region distance

If at least 10% of the limb mask $M_a^t$ is contained in a superpixel, that superpixel is considered to support the ground-truth hypothesis.
The union of all such supporting superpixels yields a binary superpixel mask for the query image x.
The superpixel mask is scored against the template mask using the intersection-over-union measure to obtain the region distance:

where (r, c) indexes all (row, column) pairs in the masks.
Contour distance

After the initial filtering of affine transforms using the region distance, the contour distance $d_{contour}(x, (x_a^t, l_a^t))$ is used to define the kernel value:
$K^t(x) = \exp(-d_{contour}(x, t) / \sigma_K^2)$
Salient contours in each image are extracted using the probability-of-boundary (Pb) detector [$C^x$ for the query image; $C_a^t$ for the training examples].
Contours are refined by enforcing a minimum length and consistency with the foreground hypothesized by the ground truth.
Contour sets are lists of points whose orientations are discretized into 8 angular bins: $C = [c_1 \ldots c_{|C|}]$, $c_i = [c_{ix}\;\; c_{iy}\;\; c_{i\theta}]^T$

Contour distance

Histograms are built over orientations, placed at different coordinates (x, y) and over varying radii of support r:
$h_\theta(C) \equiv h_\theta(C; x, y, r) = \sum_{c \in C} \mathbf{1}(c_\theta = \theta) \cdot \mathbf{1}(\|[c_x; c_y] - [x; y]\|_2 < r)$

[Figure: contour alignments and histogram generation]
The contour distance is defined as:

Parameter Learning

Unary Potential

For each part, a GentleBoost classifier is learned discriminatively using HOG features.
This is done by maximizing the conditional likelihood of the training set.
The unary terms give $p_0$, the non-adaptive part of the model.
Pairwise-Potential

i,j is   learned discriminatively as above.
i   are inferred using cross-validation.
All other parameters, $\mu_{ij}(x)$ and $\mu_i(x)$, are estimated using the kernel regression framework for APS.

Inference

The marginal distribution $p(L_i = l_i | x)$ of each part is estimated by forward-backward sum-product message passing.
The location and direction of each body part are inferred as the max-marginal:

$l_i^* = \arg\max_{l_i \in L_i} p(L_i = l_i | x)$

Results

Methods
PS: Baseline pictorial structures model
PS+global-lp: PS model with global location priors
Sparse-APS: APS model with exemplar selection
APS+K_gt: APS model with an oracle kernel that measures the distance between the ground-truth arm locations in the test and training examples
Template: Kernel defined as the weighted sum of template matches (best-matched affine transform) from the training poses

Results

Evaluation Measures
PCP (Percentage of Correct Parts): Percentage of parts whose distance from the ground-truth part endpoints is less than some fraction (0.5) of the length of the ground-truth part.
NJE (Normalized Joint Error): Euclidean distance to the ground-truth endpoints divided by the length of the ground-truth segments.

Results

[Figure columns: test image, top 5 nearest neighbours, initial belief, APS marginals, PS marginals]

Conclusion

Data-dependent unary and pairwise terms allow the model to adapt its parameters to the test image.
Sparse exemplar set selection reduced the training data to 16% of its original size and improved performance.
Shape-based kernels provide a better resemblance score.


```
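
The kernel regression estimate on the Pairwise Potential slide can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function names `pairwise_features` and `adaptive_pairwise_mean` are hypothetical, and the kernel values $K_{ij}^t(x)$ are assumed to be precomputed and non-negative.

```python
import numpy as np

def pairwise_features(l_i, l_j):
    """phi_ij(l) = l_i - l_j for connected parts, each given as [x, y, u, v]."""
    return np.asarray(l_i, dtype=float) - np.asarray(l_j, dtype=float)

def adaptive_pairwise_mean(kernel_values, train_l_i, train_l_j):
    """Kernel-weighted average of the training displacements.

    kernel_values: shape (T,), assumed precomputed K_ij^t(x) for the query x
    train_l_i, train_l_j: shape (T, 4), ground-truth locations of parts i and j
    """
    k = np.asarray(kernel_values, dtype=float)
    phi = pairwise_features(train_l_i, train_l_j)      # shape (T, 4)
    total = k.sum()
    if total <= 0:
        raise ValueError("at least one kernel value must be positive")
    return (k[:, None] * phi).sum(axis=0) / total      # shape (4,)

# Two training examples; the first is three times as similar to the query.
mu_ij = adaptive_pairwise_mean(
    [3.0, 1.0],
    [[2, 0, 1, 0], [6, 0, 1, 0]],
    [[0, 0, 1, 0], [0, 0, 1, 0]],
)
```

With equal kernel values the estimate reduces to the plain mean of the training displacements, which corresponds to a global prior; concentrating the weight on a few similar examples gives a query-specific one.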
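
The greedy forward selection on the Exemplar Selection slide can be sketched as follows. The objective J(s) did not survive in the text, so this sketch makes several assumptions: err is taken to be squared error, each example is excluded from its own estimate, `K` is a precomputed T x T matrix of non-negative kernel values, and an example with no supporting exemplar gets a zero estimate. All function names are hypothetical.

```python
import numpy as np

def regression_error(K, F, selected):
    """Assumed J(s): squared error of kernel regression from the selected exemplars."""
    idx = np.flatnonzero(selected)
    W = K[:, idx].astype(float)               # weights from every example to each exemplar
    W[idx, np.arange(idx.size)] = 0.0         # assumption: an exemplar does not predict itself
    denom = W.sum(axis=1, keepdims=True)
    est = (W @ F[idx]) / np.where(denom > 0, denom, 1.0)  # zero estimate without support
    return float(((F - est) ** 2).sum())

def greedy_exemplar_selection(K, F):
    """Add one example at a time; return the best selection vector seen and its J."""
    T = K.shape[0]
    s = np.zeros(T, dtype=bool)
    best_s, best_J = None, np.inf
    while not s.all():
        candidates = np.flatnonzero(~s)
        scores = []
        for t in candidates:
            trial = s.copy()
            trial[t] = True
            scores.append(regression_error(K, F, trial))
        pick = int(np.argmin(scores))         # example that lowers J(s) the most
        s[candidates[pick]] = True
        if scores[pick] < best_J:
            best_s, best_J = s.copy(), scores[pick]
    return best_s, best_J

# Toy data: two clusters of two examples each, with a block-diagonal kernel.
K = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)
F = np.array([[0.0], [0.0], [10.0], [10.0]])
best_s, best_J = greedy_exemplar_selection(K, F)
```

Each step re-evaluates J(s) for every unselected candidate, so this naive version is only practical for small training sets; it is meant to show the control flow rather than an efficient implementation.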