# Introduction to Sequential Monte Carlo Methods

Jochen Voss, University of Warwick (UK)

Outline:

- stochastic modelling
- filtering
- statistical tools
- sequential methods

## Part I: stochastic modelling
Many quantities in nature, the economy, etc. can be described as random
phenomena.

examples:

- weather
- stock prices
- election results
- throwing dice

Randomness is used either

- because the system has a random component, or
- to describe missing information.

Statistical quantities to describe randomness:

probability: $P(A) \approx \dfrac{\#\text{ events occurring}}{\#\text{ trials}}$ when the experiment is repeated very often.
For numerical random quantities we also use:

- expectation/mean, the average value over many trials: $\mu \approx \frac{1}{N} \sum_{i=1}^{N} X_i$
- variance, the average squared distance from the mean over many trials: $\sigma^2 \approx \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2$
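These empirical estimators are easy to try out in a few lines; a minimal sketch in Python (the die example is my own choice, not from the slides):

```python
import random

random.seed(1)
N = 100_000

# N throws of a fair die
throws = [random.randint(1, 6) for _ in range(N)]

# empirical mean: average value over many trials
mu = sum(throws) / N

# empirical variance: average squared distance from the mean
sigma2 = sum((x - mu) ** 2 for x in throws) / N

print(mu)      # close to the exact mean 3.5
print(sigma2)  # close to the exact variance 35/12 ≈ 2.917
```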
Probability distributions are used to completely describe a random
quantity; they give the probability for every event.

example: The Gaussian normal distribution is described by

$$P(X \in A) = \int_A \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \, dx$$

(figure: density of the standard normal distribution, plotted on $[-3, 3]$)
Probability distributions describe uncertainty or lack of information.
More complex examples of random objects are stochastic processes:
quantities which depend both on randomness and on time.
examples:

- the series of results when throwing a die
- the temperature at a given spot as a function of time
- stock prices
complex example: the change in time of an oceanic velocity field can be described as a stochastic process.

## Part II: filtering
### conditional probabilities
Conditional probabilities are used when partial information is available
about a random system: if it is already known that event B occurs,
then the conditional probability that event A occurs is

$$P(A \mid B) = \frac{P(A, B)}{P(B)}$$

- used to incorporate partial information into a model
- can be extended to the case P(B) = 0
- extremely useful

examples (the chain rule):

$$P(A, B) = P(A \mid B)\, P(B)$$
$$P(A, B, C) = P(A \mid B, C)\, P(B \mid C)\, P(C)$$
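The definition $P(A \mid B) = P(A, B)/P(B)$ can be checked by direct enumeration; a small sketch with two dice (the particular events A and B are my own choices for illustration):

```python
from itertools import product

# sample space: all 36 outcomes of throwing two dice
omega = list(product(range(1, 7), repeat=2))

A = {(d1, d2) for (d1, d2) in omega if d1 + d2 >= 10}  # event A: the sum is at least 10
B = {(d1, d2) for (d1, d2) in omega if d1 == 6}        # event B: the first die shows 6

p_AB = len(A & B) / len(omega)   # P(A, B) = 3/36
p_B = len(B) / len(omega)        # P(B)   = 6/36
p_A_given_B = p_AB / p_B

print(p_A_given_B)  # 0.5: knowing the first die shows 6 makes a high sum much more likely
```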
example:

- X is a random “signal”
- we only observe Y = X + ε, where ε is a random perturbation
- given Y, the value of X is still uncertain; it is described by the conditional distribution

(figure: the observation Y plotted alongside the signal X)
### filtering

Filtering is the task of finding the conditional distribution of an
(unobserved) signal, given an incomplete observation.

- filtering updates a model by incorporating information
- the probability distribution of the model before the observations are taken into account is called the prior distribution
- the conditional distribution which incorporates the observations is called the posterior distribution
- since the observations remove uncertainty, the posterior typically has smaller variance than the prior
complex example: can one find the posterior of the random velocity field, given the path of a floater?

## Part III: statistical tools
### Bayes’ Rule

One of the fundamental tools for computing conditional probabilities is
Bayes’ rule:

$$P(A \mid B) = P(B \mid A)\, \frac{P(A)}{P(B)}$$

Typically used when A is the signal and B is the observation:

$$P(\text{signal} \mid \text{observation}) \propto P(\text{observation} \mid \text{signal})\, P(\text{signal}).$$

- since the observation is known, P(observation) is a constant
- P(signal|observation) is often difficult to compute
- P(observation|signal) is often easy to compute
example:

- prior: P(X = 1) = 0.6, P(X = 2) = 0.2, P(X = 3) = 0.2
- Y is Gaussian, centered around X
- now we observe Y = 2.05; using Bayes’ rule we can compute the posterior of X

(figure: prior bars p = 0.60, 0.20, 0.20 and posterior bars p = 0.22, 0.67, 0.11, with the observation marked)
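The posterior in this example can be reproduced directly from Bayes’ rule; a sketch assuming observation noise with standard deviation 0.5 (the slide does not state this value, but it reproduces the quoted posterior):

```python
import math

prior = {1: 0.6, 2: 0.2, 3: 0.2}
y = 2.05     # the observation
sigma = 0.5  # assumed standard deviation of the Gaussian observation noise

def likelihood(y, x):
    """P(Y ≈ y | X = x), up to a constant: Gaussian density centered around x."""
    return math.exp(-((y - x) ** 2) / (2 * sigma**2))

# Bayes' rule: P(X = x | Y = y) ∝ P(Y = y | X = x) P(X = x)
unnorm = {x: likelihood(y, x) * p for x, p in prior.items()}
Z = sum(unnorm.values())  # the constant P(observation)
posterior = {x: w / Z for x, w in unnorm.items()}

for x, p in posterior.items():
    print(x, round(p, 2))  # posterior 0.22, 0.67, 0.11 as on the slide
```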
### Monte-Carlo Methods

Distributions can be approximated by “clouds” of particles:

$$P(X \in A) \approx \frac{\#\text{ particles in } A}{\#\text{ particles}}, \qquad E(X) \approx \frac{1}{N} \sum_{i=1}^{N} X^{(i)}$$

example: with 766 out of 1000 uniform particles landing inside the quarter circle,

$$\frac{\pi}{4} \approx \frac{766}{1000} \implies \pi \approx 3.064$$
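The π example can be reproduced with a particle cloud over the unit square; a minimal sketch (my own particle count and seed, so the estimate will differ slightly from the slide’s 3.064):

```python
import random

random.seed(0)
N = 100_000

# throw N particles uniformly into the unit square [0, 1]^2
inside = 0
for _ in range(N):
    x, y = random.random(), random.random()
    if x * x + y * y <= 1:  # particle lands inside the quarter circle
        inside += 1

# the quarter circle has area pi/4, so the fraction of particles inside ≈ pi/4
pi_estimate = 4 * inside / N
print(pi_estimate)  # close to 3.1416
```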
### Importance Sampling

Sometimes plain Monte Carlo methods don’t work well. An extension is
importance sampling:

- we are interested in the target distribution f
- we sample particles from the “wrong” distribution g (the proposal distribution)
- to compensate, we have to add weights to the particles:

$$w^{(i)} = \frac{f(X^{(i)})}{g(X^{(i)})}$$

- we use the weights in approximations:

$$E(X) \approx \frac{1}{N} \sum_{i=1}^{N} w^{(i)} X^{(i)}$$
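A small sketch of this weighting scheme (the particular target and proposal, a N(2, 1) target with a N(0, 2) proposal, are my own choices for illustration):

```python
import math
import random

random.seed(0)
N = 100_000

def normal_pdf(x, mu, sd):
    return math.exp(-((x - mu) ** 2) / (2 * sd**2)) / (sd * math.sqrt(2 * math.pi))

def f(x):
    return normal_pdf(x, 2.0, 1.0)  # target distribution f: N(2, 1)

def g(x):
    return normal_pdf(x, 0.0, 2.0)  # proposal distribution g: N(0, 2)

# sample particles from the "wrong" distribution g
particles = [random.gauss(0.0, 2.0) for _ in range(N)]

# compensate with weights w^(i) = f(X^(i)) / g(X^(i))
weights = [f(x) / g(x) for x in particles]

# weighted approximation of E(X) under f (the exact value is 2)
estimate = sum(w * x for w, x in zip(weights, particles)) / N
print(estimate)  # close to 2.0
```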
## Part IV: sequential methods

filtering problem:

- (unobserved) signal: $x_n \sim p(x_n \mid x_{n-1})$
- observations: $y_n \sim p(y_n \mid x_n)$

(diagram: Markov chain $x_0 \to x_1 \to x_2 \to x_3 \to \cdots$, with each observation $y_n$ branching off the corresponding $x_n$)

task: use the observations $y_1, \ldots, y_N$ to find the posterior $p(x_n \mid y_1, \ldots, y_n)$
There are two commonly used setups:

- offline filtering: all the observations are available from the start
- online filtering: we want to compute the posterior after every observation

Sequential methods are most useful in the second case, i.e. for online
filtering.
We use importance sampling to approximate the posterior:

- $X_{n-1}^{(i)}$ with weights $w_{n-1}^{(i)}$ is our approximation for $p(x_{n-1} \mid y_1, \ldots, y_{n-1})$
- prior: $p(x_n \mid x_{n-1}, y_1, \ldots, y_{n-1})$
- posterior: $p(x_n \mid x_{n-1}, y_1, \ldots, y_{n-1}, y_n)$
- updated weights: $w_n^{(i)} = w_{n-1}^{(i)} \cdot p(y_n \mid x_n^{(i)})$

technical trick (resampling): after a while most weights become very small, so

- throw away particles with very small weights
- replace particles with big weights by several particles with smaller weights
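Putting the pieces together, the weight update and the resampling trick give a basic particle filter; a sketch on a toy linear-Gaussian model (the model, all parameters, and the ESS-based resampling rule are my own choices for illustration):

```python
import math
import random

random.seed(0)

N = 1_000     # number of particles
T = 50        # number of time steps
sig_x = 0.5   # process noise standard deviation
sig_y = 0.5   # observation noise standard deviation

# simulate a hidden signal x_n = 0.9 x_{n-1} + noise and noisy observations y_n
xs, ys = [], []
x = 0.0
for _ in range(T):
    x = 0.9 * x + random.gauss(0.0, sig_x)
    xs.append(x)
    ys.append(x + random.gauss(0.0, sig_y))

def likelihood(y, x):
    """p(y_n | x_n), up to a constant: Gaussian observation density."""
    return math.exp(-((y - x) ** 2) / (2 * sig_y**2))

# sequential importance sampling with resampling
particles = [0.0] * N
weights = [1.0 / N] * N
estimates = []
for y in ys:
    # propagate each particle through the signal dynamics (sampling from the prior)
    particles = [0.9 * p + random.gauss(0.0, sig_x) for p in particles]
    # weight update: w_n = w_{n-1} * p(y_n | x_n)
    weights = [w * likelihood(y, p) for w, p in zip(weights, particles)]
    total = sum(weights)
    weights = [w / total for w in weights]
    # posterior mean estimate for this time step
    estimates.append(sum(w * p for w, p in zip(weights, particles)))
    # resampling trick: replace small-weight particles by copies of big-weight ones
    ess = 1.0 / sum(w * w for w in weights)  # effective sample size
    if ess < N / 2:
        particles = random.choices(particles, weights=weights, k=N)
        weights = [1.0 / N] * N

rmse = math.sqrt(sum((e - x) ** 2 for e, x in zip(estimates, xs)) / T)
print(rmse)  # the filter tracks the hidden signal
```

Resampling only when the effective sample size drops below N/2 is one common way to apply the trick; resampling at every step also works but adds unnecessary Monte Carlo noise.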
