Heavy Tails and Financial
Time Series Models
Richard A. Davis
Columbia University
www.stat.columbia.edu/~rdavis
Thomas Mikosch
University of Copenhagen
Oxford-Man 2008
Outline
Financial time series modeling
General comments
Characteristics of financial time series
Classical extreme value theory
• Extremal types
• Extension to stationary time series
• Extremal index
Regular variation
Multivariate case
Point processes
Applications
GARCH and stochastic volatility processes
Limit behavior of sample correlations
Wrap-up
Financial Time Series Modeling
2005 Neyman Lecture: "Dynamic Indeterminism in Science" by
Brillinger contains the following quote from Neyman.
"The essence of dynamic indeterminism in science consists in an
effort to invent a hypothetical chance mechanism, called a
'stochastic model', operating on various clearly defined hypothetical
entities, such that the resulting frequencies of various possible
outcomes correspond approximately to those actually observed."
—Neyman (1960), JASA
Financial Time Series Modeling (cont)
Two strategies for thinking about modeling extremes in time series:
1. Fit a model to the entire data set (e.g., GARCH and SV for financial
time series) and study the extreme value behavior associated with the
fitted model as truth.
2. Construct and fit models only to the extremes (e.g., observations
exceeding a large threshold).
Do fitted models actually capture the desired characteristics of the data?
• How do we assess "fitted" (expected) against "observed"?
• Need a mechanism for measuring extremal dependence.
Goal of this talk: Focus on strategy 1 and contrast some of the features of
GARCH and SV models as they relate to extremes including:
• Regular-variation of finite dimensional distributions
• Extreme value behavior
• Sample ACF behavior
Financial Time Series Modeling (cont)
Bonus quote from Brillinger's paper:
"It seems to me that the proper way of approaching economic
problems mathematically is by equations of the above type, infinite
or infinitesimal differences, with coefficients that are not constants,
but random variables; or what is called random or stochastic
equations. . . . The theory of random differential and other
equations, and the theory of random curves are just starting."
— Neyman (1938), JASA
Characteristics of financial time series
Define $X_t = \ln(P_t) - \ln(P_{t-1})$ (log returns)
• heavy tailed
$P(|X_1| > x) \sim \mathrm{RV}(-\alpha)$, $0 < \alpha < 4$.
• uncorrelated
$\hat\rho_X(h)$ near 0 for all lags h > 0
• $|X_t|$ and $X_t^2$ have slowly decaying autocorrelations
$\hat\rho_{|X|}(h)$ and $\hat\rho_{X^2}(h)$ converge to 0 slowly as h increases.
• process exhibits 'volatility clustering'.
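The characteristics listed above can be checked on any price series with a few lines of code. The following is a minimal sketch, assuming NumPy is available; the function names `log_returns` and `sample_acf` are illustrative and not from the talk.

```python
import numpy as np

def log_returns(prices):
    """Log returns X_t = ln(P_t) - ln(P_{t-1}) of a positive price series."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0, 1, ..., max_lag (mean-corrected)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    n = len(xc)
    return np.array([np.dot(xc[: n - h], xc[h:]) / denom for h in range(max_lag + 1)])

# Usage on a hypothetical price series: compare the ACF of the returns
# with the ACF of their absolute values and squares.
prices = np.array([100.0, 101.0, 99.5, 102.0, 101.2, 103.4, 102.8, 104.0])
x = log_returns(prices)
acf_x = sample_acf(x, 3)
acf_abs = sample_acf(np.abs(x), 3)
acf_sq = sample_acf(x ** 2, 3)
```

For real return data one would expect `acf_x` to be near zero at positive lags while `acf_abs` and `acf_sq` decay slowly, as described in the list above.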
Example: Pound-Dollar Exchange Rates
(Oct 1, 1981 – Jun 28, 1985; Koopman website)
[Figure: four panels. Log returns (exchange rates) against day (0 to 800); ACF, ACF of squares, and ACF of abs values, each against lag (0 to 40).]
Example: Pound-Dollar Exchange Rates
Hill's estimate of alpha (Hill Horror plots-Resnick)
[Figure: Hill estimate against m (0 to 150).]
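A Hill plot of this kind can be produced from the upper order statistics of the data. The sketch below assumes the standard form of the Hill estimator, with m read as the number of upper order statistics used; the function name `hill_estimates` and the simulated Pareto sample are illustrative assumptions, not the data from the talk.

```python
import numpy as np

def hill_estimates(x, m_max):
    """Hill estimates of alpha for m = 1, ..., m_max upper order statistics.

    Uses H_m = (1/m) * sum_{i=1}^{m} log(X_(i) / X_(m+1)), alpha_hat = 1 / H_m,
    where X_(1) >= X_(2) >= ... are the ordered (positive) observations.
    """
    xs = np.sort(np.asarray(x, dtype=float))[::-1]   # descending order
    logs = np.log(xs[: m_max + 1])                   # needs the top m_max + 1 values > 0
    m = np.arange(1, m_max + 1)
    h = np.cumsum(logs[:m_max]) / m - logs[1 : m_max + 1]
    return 1.0 / h

# Illustrative check on simulated Pareto data with alpha = 3.
rng = np.random.default_rng(0)
sample = (1.0 - rng.random(20_000)) ** (-1.0 / 3.0)  # P(Z > x) = x^(-3), x >= 1
hill = hill_estimates(sample, 500)
```

Plotting `hill` against m = 1, ..., 500 gives the Hill plot; for returns one would typically pass the absolute values or the positive part of the series.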
Example: Amazon-returns (May 16, 1997 – June 16, 2004)
[Figure: four panels. Log returns against time (0 to 1500); ACF, ACF of squares, and ACF of abs values, each against lag (0 to 40).]
Example: Amazon-returns
Hill's estimate of alpha (Hill Horror plots-Resnick)
[Figure: Hill estimate against m (0 to 300).]
Simulated Realizations for the Amazon Data
15 realizations from GARCH model fitted to Amazon + exchange rate data. Which one is the real data?
[Figure: 4 x 4 grid of panels, each showing exchange returns against time.]
ACF Plots for Amazon
ACF of the squares from the 15 realizations from the GARCH
model on previous slide.
[Figure: 4 x 4 grid of ACF panels, each against lag (0 to 40).]
Two models for log(returns)-cont
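$X_t = \sigma_t Z_t$ (observation eqn in state-space formulation)
(i) GARCH(1,1) (Generalized AutoRegressive Conditional
Heteroscedastic – observation-driven specification):
$$X_t = \sigma_t Z_t, \quad \sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2 + \beta_1 \sigma_{t-1}^2, \quad \{Z_t\} \sim \mathrm{IID}(0,1)$$
(ii) Stochastic Volatility (parameter-driven specification):
$$X_t = \sigma_t Z_t, \quad \log\sigma_t^2 = \phi_0 + \phi_1 \log\sigma_{t-1}^2 + \varepsilon_t, \quad \{\varepsilon_t\} \sim \mathrm{IID}\ N(0, \sigma^2)$$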
Main question:
What intrinsic features in the data (if any) can be used to
discriminate between these two models?
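To experiment with the two specifications, both recursions can be simulated directly. This is a minimal sketch under stated assumptions: Gaussian noise for $Z_t$ (the slide only requires IID(0,1)), a burn-in period, and the parameter names `phi0`, `phi1`, `s` for the SV log-volatility recursion; the parameter values in the usage lines are illustrative, not fitted values.

```python
import numpy as np

def simulate_garch11(n, alpha0, alpha1, beta1, rng, burn=500):
    """X_t = sigma_t Z_t, sigma_t^2 = alpha0 + alpha1 X_{t-1}^2 + beta1 sigma_{t-1}^2."""
    if alpha1 + beta1 >= 1.0:
        raise ValueError("this sketch starts at the stationary variance; needs alpha1 + beta1 < 1")
    z = rng.standard_normal(n + burn)          # assumption: Gaussian noise
    x = np.empty(n + burn)
    sig2 = alpha0 / (1.0 - alpha1 - beta1)     # stationary variance as starting value
    for t in range(n + burn):
        x[t] = np.sqrt(sig2) * z[t]
        sig2 = alpha0 + alpha1 * x[t] ** 2 + beta1 * sig2
    return x[burn:]

def simulate_sv(n, phi0, phi1, s, rng, burn=500):
    """X_t = sigma_t Z_t, log sigma_t^2 = phi0 + phi1 log sigma_{t-1}^2 + eps_t, eps_t ~ N(0, s^2)."""
    if abs(phi1) >= 1.0:
        raise ValueError("needs |phi1| < 1 for a stationary log-volatility")
    eps = s * rng.standard_normal(n + burn)
    log_sig2 = np.empty(n + burn)
    log_sig2[0] = phi0 / (1.0 - phi1)          # stationary mean as starting value
    for t in range(1, n + burn):
        log_sig2[t] = phi0 + phi1 * log_sig2[t - 1] + eps[t]
    z = rng.standard_normal(n + burn)          # noise independent of the volatility
    return (np.exp(0.5 * log_sig2) * z)[burn:]

rng = np.random.default_rng(42)
x_garch = simulate_garch11(2000, 1e-5, 0.10, 0.85, rng)
x_sv = simulate_sv(2000, -0.5, 0.95, 0.3, rng)
```

The structural difference is visible in the code: in the GARCH recursion the volatility is driven by past observations, while in the SV recursion it is driven by its own noise sequence.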
Classical EVT— Extremal Types Theorem
Setup:
• $\{X_t\} \sim \mathrm{IID}(F)$
• $M_n = \max\{X_1, \dots, X_n\}$
Convergence of types: Now taking $u_n = a_n x + b_n$, $a_n > 0$,
$$P(a_n^{-1}(M_n - b_n) \le x) = F^n(a_n x + b_n) \to G(x)$$
if and only if
$$n(1 - F(a_n x + b_n)) \to -\log G(x).$$
Theorem. If G is a nondegenerate distribution, then G has to be one
of the three types,
1. $G(x) = \exp(-e^{-x})$ (Gumbel)
2. $G(x) = \exp(-x^{-\alpha})$, $x \ge 0$ (Fréchet)
3. $G(x) = \exp(-(-x)^{\alpha})$, $x \le 0$ (Weibull)
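As a worked instance of the "if and only if" statement above, consider a pure Pareto tail. The normalizing constants $a_n = n^{1/\alpha}$, $b_n = 0$ are an illustrative choice for this case, not a general recipe.

```latex
\begin{align*}
1 - F(x) &= x^{-\alpha}, \quad x \ge 1, \qquad a_n = n^{1/\alpha},\ b_n = 0, \\
n\bigl(1 - F(a_n x)\bigr) &= n\,(n^{1/\alpha}x)^{-\alpha} = x^{-\alpha}
  \quad \text{for } x \ge n^{-1/\alpha}, \\
-\log G(x) &= x^{-\alpha} \;\Longrightarrow\; G(x) = \exp(-x^{-\alpha}), \quad x > 0
  \quad \text{(Fr\'echet)}.
\end{align*}
```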
Classical EVT— Domains of Attraction
Domains of attraction: There are necessary and sufficient
conditions for $F \in D(G)$ for the three extreme value distributions.
The heavy-tailed Fréchet, which is perhaps the most commonly
used extreme value distribution, has the easiest necessary and
sufficient condition to state (and check!). In this case,
$F \in D(\exp(-x^{-\alpha}))$ if and only if $\bar F = 1 - F$ is RV($-\alpha$) for some $\alpha > 0$.
Regular variation: $\bar F$ is RV($-\alpha$) if and only if
$$\frac{\bar F(tx)}{\bar F(t)} = \frac{P(X > tx)}{P(X > t)} \to x^{-\alpha} \quad \text{as } t \to \infty,$$
for every x > 0.
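Regular variation is an asymptotic property: the tail ratio need not equal the power law at moderate t. The short sketch below illustrates this numerically for a hypothetical tail $P(X > x) = (1 + x)^{-\alpha}$, $x \ge 0$, chosen only because its survival function is available in closed form.

```python
def tail(x, alpha):
    """Survival function of an illustrative Pareto-type law: P(X > x) = (1 + x)^(-alpha)."""
    return (1.0 + x) ** (-alpha)

def tail_ratio(t, x, alpha):
    """Ratio P(X > t x) / P(X > t) that appears in the definition of regular variation."""
    return tail(t * x, alpha) / tail(t, alpha)

alpha, x = 3.0, 2.0
for t in (1.0, 10.0, 100.0, 1e4):
    # The ratio approaches x^(-alpha) as t grows.
    print(t, tail_ratio(t, x, alpha), x ** (-alpha))
```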
Extension to Stationary Time Series
Let $(X_t)$ be a strictly stationary sequence with common df $F \in D(G)$, i.e.,
$F^n(a_n x + b_n) \to G(x)$.
Theorem. If $(X_t)$ satisfies a mixing condition (like strong mixing) and
$P(a_n^{-1}(M_n - b_n) \le x) \to H(x)$,
H nondegenerate, then there exists a $\theta \in (0,1]$ such that
$H(x) = G^\theta(x)$.
The parameter θ is called the extremal index and is a measure of
extremal clustering.
Extension to Stationary Time Series—Extremal Index
$F^n(a_n x + b_n) \to G(x)$ and $P(a_n^{-1}(M_n - b_n) \le x) \to G^\theta(x)$.
Properties
• θ < 1 implies clustering of exceedances
• 1/θ is the mean cluster size of exceedances.
• In a certain sense, one can view θ as a measure of statistical
efficiency relative to the iid case. That is, one needs 1/θ times as many
observations to match the behavior of the iid case. Specifically,
$P(M_{n/\theta} \le x) \sim F^n(x)$
• Suppose c is a threshold such that $F^n(c) \sim .95$ and θ = .5. Then
$P(M_n \le c) \sim .95^{1/2} = .975$
Extension to Stationary Time Series—Example
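Example (max-moving average) Let $(Z_t)$ be iid with a Pareto
distribution, i.e., $P(Z_1 > x) = x^{-\alpha}$ for $x \ge 1$, and set
$X_t = \max(Z_t, \phi Z_{t-1})$, $\phi \in [0,1]$.
Then
$nP(X_1 > x n^{1/\alpha}) \to (1+\phi^\alpha)x^{-\alpha}$ and $F^n(a_n x) \to \exp(-(1+\phi^\alpha)x^{-\alpha})$.
On the other hand
$P(n^{-1/\alpha} M_n \le x) = P(n^{-1/\alpha} \max(\phi Z_0, Z_1, \dots, Z_n) \le x) \to \exp(-x^{-\alpha})$.
Thus $\theta = 1/(1+\phi^\alpha)$.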
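The two limits in the max-moving average example can be compared by simulation. The following sketch, assuming NumPy, draws many independent stretches of the process, computes the normalized maximum of each, and backs out an estimate of the extremal index as the ratio of the two log-limits. The name `phi` for the coefficient on $Z_{t-1}$ and the settings (alpha = 3, phi = 1, evaluation point x0 = 1) are illustrative choices.

```python
import numpy as np

def simulate_max_ma_maxima(n, reps, alpha, phi, rng):
    """Return `reps` normalized maxima n^(-1/alpha) * M_n of X_t = max(Z_t, phi * Z_{t-1})."""
    u = 1.0 - rng.random((reps, n + 1))         # uniform on (0, 1]
    z = u ** (-1.0 / alpha)                     # Pareto: P(Z > x) = x^(-alpha), x >= 1
    x = np.maximum(z[:, 1:], phi * z[:, :-1])   # X_1, ..., X_n in each row
    return x.max(axis=1) / n ** (1.0 / alpha)

rng = np.random.default_rng(0)
alpha, phi, n, reps, x0 = 3.0, 1.0, 500, 4000, 1.0
maxima = simulate_max_ma_maxima(n, reps, alpha, phi, rng)

p_hat = np.mean(maxima <= x0)                               # empirical P(n^(-1/alpha) M_n <= x0)
iid_limit = np.exp(-(1.0 + phi ** alpha) * x0 ** (-alpha))  # limit if the X_t were iid
theta_hat = np.log(p_hat) / np.log(iid_limit)               # from H = G^theta
```

With phi = 1 the estimate should land near 1/2, so the maxima behave as if only half as many independent observations were available.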
Extension to Stationary Time Series—Example
[Figure: two simulated series, t = 0 to 100. Left: iid (Pareto, $\alpha = 3$), $\theta = 1$. Right: max-moving average ($\phi = 1$), $\theta = 1/2$.]
Note that cluster size is exactly 2 in this case.
Extension to Stationary Time Series—Mixing Conditions
Strong Mixing:
$$\sup_{A \in \sigma(X_s,\, s \le 0),\ B \in \sigma(X_s,\, s \ge k)} |P(A \cap B) - P(A)P(B)| = \alpha_k \to 0 \quad \text{as } k \to \infty.$$
Remarks:
• Since mixing is defined via σ-fields, measurable functions of $(X_t)$
inherit the same mixing property. For example, if the stationary
sequence $(X_t)$ is strongly mixing, so are $(|X_t|)$ and $(X_t^2)$ with a rate
function of similar order.
• If $(\alpha_k)$ decays to zero at an exponential rate, $(X_t)$ is strongly
mixing with geometric rate, i.e., the memory between past and
future dies out exponentially fast.
• Strong mixing is much stronger than Leadbetter's dependence
condition $D(u_n)$.
Extension to Stationary Time Series—D'
Anti-clustering condition $D'(u_n)$: Think of $u_n$ as $a_n x + b_n$.
$$\limsup_{n \to \infty}\; n \sum_{t=2}^{[n/k]} P(X_1 > u_n, X_t > u_n) = O(1/k)$$
as $k \to \infty$.
Theorem: If $(X_t)$ satisfies D and D', $F \in D(G)$, then θ = 1 (i.e., no
clustering).
Remarks:
• If $(X_t)$ is iid, then the lim sup of the sum is
$\limsup_n (n^2/k) P^2(X_1 > u_n) = O(1/k)$.
• If $(X_t)$ is a stationary Gaussian process with ACF $\rho(h) = o(1/\log h)$, then
D and D' hold and there is no clustering for Gaussian processes.
Extension to Stationary Time Series—Example
[Figure: two simulated series, t = 0 to 200. Left: IID $N(0, 1/(1-.9^2))$. Right: AR(1), $X_t = .9X_{t-1} + Z_t$, $(Z_t) \sim$ IID N(0,1).]
• Even though θ = 1, there appears to be some clustering for small n.
• Hsing, Hüsler, Reiss (1996) overcome this problem for Gaussian
processes by considering a triangular array of rvs.
Point Process Example—baby steps
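In particular, for one-dependent sequences,
$P(X_2 > x \mid X_1 > x) \to 1 - \theta = \phi^\alpha/(1+\phi^\alpha)$.
Point process convergence (max-moving average): With $a_n = n^{1/\alpha}$,
$nP(Z_1 > a_n x) \to x^{-\alpha}$ and $nP(X_1 > a_n x) \to (1+\phi^\alpha)x^{-\alpha}$.
Define the sequence of point processes by
$$N_n^* = \sum_{t=1}^{n} \varepsilon_{a_n^{-1}(Z_t, Z_{t-1})}.$$
From the convergence
$$\sum_{t=1}^{n} \varepsilon_{a_n^{-1}Z_t} \to_d \sum_{k=1}^{\infty} \varepsilon_{\Gamma_k^{-1/\alpha}}, \qquad \Gamma_k = E_1 + \cdots + E_k,$$
one can show
$$N_n^* = \sum_{t=1}^{n} \varepsilon_{a_n^{-1}(Z_t, Z_{t-1})} \to_d \sum_{k=1}^{\infty} \bigl(\varepsilon_{(\Gamma_k^{-1/\alpha},\, 0)} + \varepsilon_{(0,\, \Gamma_k^{-1/\alpha})}\bigr).$$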
Point Process Example—baby steps
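Applying the continuous mapping theorem (need to be careful) to
$$N_n^* = \sum_{t=1}^{n} \varepsilon_{a_n^{-1}(Z_t, Z_{t-1})} \to_d \sum_{k=1}^{\infty} \bigl(\varepsilon_{(\Gamma_k^{-1/\alpha},\, 0)} + \varepsilon_{(0,\, \Gamma_k^{-1/\alpha})}\bigr),$$
we have
$$N_n = \sum_{t=1}^{n} \varepsilon_{a_n^{-1}X_t} = \sum_{t=1}^{n} \varepsilon_{a_n^{-1}\max(Z_t,\, \phi Z_{t-1})} \to_d \sum_{k=1}^{\infty} \bigl(\varepsilon_{\max(\Gamma_k^{-1/\alpha},\, 0)} + \varepsilon_{\max(0,\, \phi\Gamma_k^{-1/\alpha})}\bigr) = \sum_{k=1}^{\infty} \bigl(\varepsilon_{\Gamma_k^{-1/\alpha}} + \varepsilon_{\phi\Gamma_k^{-1/\alpha}}\bigr) =: N.$$
[Figure: points on a line starting at 0. Red = $k^{-1/\alpha}$, k=1,…,5; Blue = $.75\,k^{-1/\alpha}$, k=1,…,5.]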
Regular Variation — multivariate case
Multivariate regular variation of $X = (X_1, \dots, X_m)$: There exists a
random vector $\theta \in S^{m-1}$ such that
$$P(|X| > tx,\ X/|X| \in \cdot)/P(|X| > t) \to_v x^{-\alpha} P(\theta \in \cdot)$$
($\to_v$ vague convergence on $S^{m-1}$, unit sphere in $R^m$).
• $P(\theta \in \cdot)$ is called the spectral measure
• α is the index of X.
Equivalence:
$$\frac{P(X \in t\,\cdot)}{P(|X| > t)} \to_v \mu(\cdot)$$
μ is a measure on $R^m$ which satisfies for x > 0 and A bounded away
from 0,
$\mu(xA) = x^{-\alpha}\mu(A)$.
Regular Variation — multivariate case
Examples: 1. If $X_1$ and $X_2$ are iid RV($-\alpha$), then $X = (X_1, X_2)$ is
multivariate regularly varying with index α and spectral distribution
(assuming symmetry)
$P(\theta = \pi k/2) = 1/4$, k=1,2,3,4 (mass on axes).
Interpretation: Unlikely that $X_1$ and $X_2$ are very large at the same
time.
[Figure: Independent Components. Plot of $(X_{t1}, X_{t2})$ for realization of 10,000; axes x_1 and x_2.]
2. If $X_1 = X_2 > 0$, then $X = (X_1, X_2)$ is multivariate regularly varying
with index α and spectral distribution
$P(\theta = \pi/4) = 1$.
3. AR(1): $X_t = .9X_{t-1} + Z_t$, $\{Z_t\} \sim$ IID t(3)
$P(\theta = \pm\arctan(.9)) = .9898$, $P(\theta = \pm\pi/2) = .0102$
[Figure: simulated AR(1) series against t (0 to 10000).]
[Figure: AR(1), X_{t+1} vs. X_t. Plot of $(X_t, X_{t+1})$ for realization of 10,000 from $X_t = .9X_{t-1} + Z_t$.]
Estimation of the spectral distribution of θ
Based on the relation
$$P(|X| > tx,\ X/|X| \in \cdot)/P(|X| > t) \to_v x^{-\alpha} P(\theta \in \cdot)$$
a naïve estimate of the distribution of θ is based on the angular
components $X_t/|X_t|$ in the sample. One simply uses the empirical
distribution of these angular pieces for which the modulus $|X_t|$
exceeds some large threshold. In the examples given below, we
use a kernel density estimate of these angular components for
those observations whose moduli exceed some large threshold.
Here we only consider two components, i.e., θ is one dimensional.
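The naive estimate described above amounts to keeping the angles of the observations with the largest moduli. A minimal sketch for the bivariate case follows, assuming NumPy; the function name `extreme_angles`, the Euclidean modulus, and the 99% quantile threshold are illustrative choices rather than the settings used in the talk.

```python
import numpy as np

def extreme_angles(pairs, quantile=0.99):
    """Angles (in (-pi, pi]) of the bivariate observations whose modulus exceeds a high quantile."""
    pairs = np.asarray(pairs, dtype=float)
    modulus = np.hypot(pairs[:, 0], pairs[:, 1])     # Euclidean norm |X_t|
    threshold = np.quantile(modulus, quantile)
    keep = modulus > threshold
    return np.arctan2(pairs[keep, 1], pairs[keep, 0])

# Illustrative check: completely dependent components X1 = X2 > 0,
# for which all the angular mass sits at pi/4.
rng = np.random.default_rng(1)
z = (1.0 - rng.random(10_000)) ** (-1.0 / 2.0)       # Pareto with alpha = 2
angles = extreme_angles(np.column_stack([z, z]))

# A histogram (or a kernel density estimate) of the angles estimates the spectral distribution.
counts, edges = np.histogram(angles, bins=24, range=(-np.pi, np.pi))
```

For pairs $(X_t, X_{t+1})$ from a time series, the same function can be applied to `np.column_stack([x[:-1], x[1:]])`.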
Estimation of the spectral distribution of θ
[Figure: Independent Components. Left: scatter plot of x_2 against x_1. Right: estimated density of theta.]
Estimation of θ
[Figure: AR(1). Left: scatter plot of X_{t+1} vs. X_t. Right: estimated density of theta.]
Vertical lines on right are at arctan(.9) and arctan(.9) - π
Examples of Processes that are Regularly Varying
GARCH(1,1): $X_t = (\alpha_0 + \alpha_1 X_{t-1}^2 + \beta_1\sigma_{t-1}^2)^{1/2} Z_t$, $\{Z_t\} \sim$ IID.
α found by solving $E|\alpha_1 Z^2 + \beta_1|^{\alpha/2} = 1$.
ARCH(1) case:
$\alpha_1$   .312   .577   1.00   1.57
α      8.00   4.00   2.00   1.00
Distr of θ:
$P(\theta \in \cdot) = E\{\|(B,Z)\|^{\alpha} I(\arg((B,Z)) \in \cdot)\}/E\|(B,Z)\|^{\alpha}$
where
P(B = 1) = P(B = -1) = .5
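In the ARCH(1) case the moment equation for the tail index can be solved numerically. The sketch below assumes standard normal noise Z, an assumption that reproduces the tabulated pairs (for example $\alpha_1 = 1$ gives index 2), and uses the closed form $E|Z|^{\kappa} = 2^{\kappa/2}\,\Gamma((\kappa+1)/2)/\sqrt{\pi}$ with plain bisection. The function names and the bracketing interval are illustrative.

```python
import math

def log_moment_condition(kappa, alpha1):
    """log E|alpha1 * Z^2|^(kappa/2) for Z ~ N(0,1); equals zero at the tail index."""
    # E|Z|^kappa = 2^(kappa/2) * Gamma((kappa + 1)/2) / sqrt(pi)
    return (0.5 * kappa * math.log(2.0 * alpha1)
            + math.lgamma(0.5 * (kappa + 1.0))
            - 0.5 * math.log(math.pi))

def arch1_tail_index(alpha1, lo=0.01, hi=100.0, tol=1e-10):
    """Positive root of E|alpha1 Z^2|^(kappa/2) = 1, found by bisection on [lo, hi]."""
    if not (log_moment_condition(lo, alpha1) < 0.0 < log_moment_condition(hi, alpha1)):
        raise ValueError("root not bracketed; adjust lo and hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_moment_condition(mid, alpha1) < 0.0:
            lo = mid      # the condition is negative below the root
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Tail indices for the ARCH coefficients listed in the table.
for a1 in (0.312, 0.577, 1.00, 1.57):
    print(a1, round(arch1_tail_index(a1), 2))
```

Bisection is used here only for transparency; any scalar root finder would do, and for GARCH(1,1) with $\beta_1 > 0$ the expectation would need numerical integration or Monte Carlo instead of the closed form.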
Examples of Processes that are Regularly Varying
Example of ARCH(1): $\alpha_0 = 1$, $\alpha_1 = 1$, $\alpha = 2$, $X_t = (\alpha_0 + \alpha_1 X_{t-1}^2)^{1/2} Z_t$, $\{Z_t\} \sim$ IID
Figures: plots of $(X_t, X_{t+1})$ and estimated distribution of θ for
realization of 10,000.
[Figure: two panels. Scatter plot of x_{t+1} against x_t; estimated density of theta.]
Examples of Processes that are Regularly Varying
Example: SV model $X_t = \sigma_t Z_t$
Suppose $Z_t \sim$ RV($-\alpha$) and
$$\log\sigma_t^2 = \sum_j \psi_j \varepsilon_{t-j}, \quad \sum_j \psi_j^2 < \infty, \quad \{\varepsilon_t\} \sim \mathrm{IID}\ N(0, \sigma^2).$$
Then $Z_n = (Z_1, \dots, Z_n)'$ is regularly varying with index α and so is
$X_n = (X_1, \dots, X_n)' = \mathrm{diag}(\sigma_1, \dots, \sigma_n) Z_n$
with spectral distribution concentrated on (±1,0), (0,±1).
[Figure: plot of $(X_t, X_{t+1})$ for realization of 10,000; axes x_1 and x_2.]
Extremes for GARCH and SV processes
Setup
$X_t = \sigma_t Z_t$, $\{Z_t\} \sim \mathrm{IID}(0,1)$
$X_t$ is RV($-\alpha$)
Choose $\{a_n\}$ s.t. $nP(X_t > a_n) \to 1$
Then
$P^n(a_n^{-1}X_1 \le x) \to \exp\{-x^{-\alpha}\}$.
Then, with $M_n = \max\{X_1, \dots, X_n\}$,
(i) GARCH:
$P(a_n^{-1}M_n \le x) \to \exp\{-\theta x^{-\alpha}\}$,
θ is extremal index (0 < θ < 1).
(ii) SV model:
$P(a_n^{-1}M_n \le x) \to \exp\{-x^{-\alpha}\}$,
extremal index θ = 1, no clustering.
Extremes for GARCH and SV processes (cont)
[Figure: Absolute values of ARCH against time (0 to 60), with asterisks marked along the time axis.]
Extremes for GARCH and SV processes (cont)
[Figure: Absolute values of SV process against time (0 to 50), with asterisks marked along the time axis.]
Summary of results for ACF of GARCH(p,q) and SV models
GARCH(p,q)
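$\alpha \in (0,2)$:
$$(\hat\rho_X(h))_{h=1,\dots,m} \to_d (V_h/V_0)_{h=1,\dots,m},$$
$\alpha \in (2,4)$:
$$(n^{1-2/\alpha}\hat\rho_X(h))_{h=1,\dots,m} \to_d \gamma_X^{-1}(0)\,(V_h)_{h=1,\dots,m}.$$
$\alpha \in (4,\infty)$:
$$(n^{1/2}\hat\rho_X(h))_{h=1,\dots,m} \to_d \gamma_X^{-1}(0)\,(G_h)_{h=1,\dots,m}.$$
Remark: Similar results hold for the sample ACF based on $|X_t|$ and
$X_t^2$.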
Summary of results for ACF of GARCH(p,q) and SV models (cont)
SV Model
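$\alpha \in (0,2)$:
$$(n/\ln n)^{1/\alpha}\,\hat\rho_X(h) \to_d \frac{\|\sigma_1\sigma_{h+1}\|_\alpha}{\|\sigma_1\|_\alpha^2}\,\frac{S_h}{S_0}.$$
$\alpha \in (2,\infty)$:
$$(n^{1/2}\hat\rho_X(h))_{h=1,\dots,m} \to_d \gamma_X^{-1}(0)\,(G_h)_{h=1,\dots,m}.$$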
Sample ACF for GARCH and SV Models (1000 reps)
[Figure: two panels, (a) GARCH and (b) SV model.]
Sample ACF for Squares of GARCH (1000 reps)
[Figure: two panels. (a) GARCH(1,1) Model, n=10000; (b) GARCH(1,1) Model, n=100000.]
Sample ACF for Squares of SV (1000 reps)
[Figure: two panels. (c) SV Model, n=10000; (d) SV Model, n=100000.]
Example: Amazon-returns (May 16, 1997 – June 16, 2004)
[Figure: three panels. Log returns against time (0 to 1500); ACF of abs values and ACF of squares, each against lag (0 to 40).]
Amazon returns (GARCH model)
GARCH(1,1) model fit to Amazon returns:
$\alpha_0 = .00002493$, $\alpha_1 = .0385$, $\beta_1 = .957$, $X_t = (\alpha_0 + \alpha_1 X_{t-1}^2 + \beta_1\sigma_{t-1}^2)^{1/2} Z_t$, $\{Z_t\} \sim$ IID t(3.672)
Simulation from GARCH(1,1) model
[Figure: two panels, ACF of squares and ACF of abs values, each against lag (0 to 40).]
Amazon returns (SV model)
Stochastic volatility model fit to Amazon returns: simulation based on
fitted model.
[Figure: two panels, ACF of abs values and ACF of squares, each against lag (0 to 40).]
Application to Crystal River
River flow rate for Crystal River, located in the mountains of western
Colorado (see Cooley et al. (2007)). After deseasonalizing the data,
we obtain 728 weekly observations from Oct 1, 1990 to Oct 1,
2005.
[Figure: two panels. The series crystal.river against t; its ACF against lag h (0 to 40).]
Application to Crystal River
Estimates of α and the distribution of θ for bivariate pairs $(X_{t-1}, X_t)$
[Figure: three panels. Hill estimate against m (0 to 140); scatter plot of X_t against X_{t-1}; estimated density of theta, with vertical lines at π/4 and π/4 - π.]
Application to Crystal River
Extremogram for Crystal River, A = B = (1,∞)
[Figure: extremogram against lag (0 to 40).]
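The definition of the extremogram is not part of this text, so the following is only a sketch under an assumption: with A = B = (1,∞), the sample version is taken to be the proportion of threshold exceedances at time t that are followed by an exceedance at time t + h. The function name `sample_extremogram` and the threshold argument `u` are illustrative.

```python
import numpy as np

def sample_extremogram(x, u, max_lag):
    """Estimate P(X_{t+h} > u | X_t > u) for h = 0, ..., max_lag from one series."""
    x = np.asarray(x, dtype=float)
    exceed = x > u
    n_exceed = exceed.sum()
    if n_exceed == 0:
        raise ValueError("no observations exceed the threshold u")
    n = len(x)
    out = np.empty(max_lag + 1)
    for h in range(max_lag + 1):
        # count time points where both X_t and X_{t+h} exceed u
        out[h] = np.sum(exceed[: n - h] & exceed[h:]) / n_exceed
    return out

# Tiny illustration on a hypothetical series with threshold u = 1.
series = np.array([0.0, 5.0, 5.0, 0.0, 0.0, 5.0, 0.0])
ext = sample_extremogram(series, 1.0, 2)
```

In practice `u` would be a high empirical quantile of the series, for example `np.quantile(x, 0.95)`, and the result would be plotted against the lag as in the figure.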
Application to Crystal River
Fit an AR(6) model to the data (remove all appreciable
autocorrelation in the data). Now we estimate the distribution of θ
and the extremogram based on the residuals.
[Figure: two panels. Estimated density of theta, with vertical lines at -π/2, 0, and π/2; extremogram against lag (0 to 40).]
Application to Crystal River
There is still a touch of autocorrelation in the absolute values and
squares of the residuals. We remove these by fitting a GARCH
model to these residuals. The degrees of freedom for the noise
was 3.43.
[Figure: Hill estimate against m (0 to 140).]
Wrap-up
• Regular variation is a flexible tool for modeling both dependence
and tail heaviness.
• Useful for establishing point process convergence of heavy-tailed
time series.
• Extremal index < 1 for GARCH and 1 for SV.
• ACF has faster convergence for SV.