Lecture 5 Estimation of time series by slappypappy129


									                                                                                    Lecture 5, page 1



Lecture 5: Estimation of time series

Outline of lesson 5 (chapter 4)
(Extended version of the book):
a.)   Model formulation
      Exploratory analyses
      Model formulation
b.)   Model estimation
      Identification of order
      Estimation of parameters
c.)   Model checking
      Residual checking
      (forecasting)




                          ___________________________________________
         C:\Kyrre\studier\drgrad\Kurs\Timeseries\lecture 05 010123.doc, KL, 26.09.02, page 1 of 1



a.) Model formulation

Exploratory analysis
• Always plot the data
- Best way to discover features of the data (e.g. non-stationarity) that you
  might want to take into consideration
          (variance change, trends, seasonality, normality)
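# One plot panel
par(mfrow=c(1,1))
# Plot the raw series; the y-axis starts below zero to leave room for the remainder
ts.plot(airpass, main="International Airline Passengers (Airpass.dat)",
xlab="Year", ylab="(thousands)", ylim=c(-100, 600))
ts.points(airpass, pch=28, col=8)
# Overlay the seasonal and remainder components
# (airpass.stl is assumed to hold a seasonal decomposition of airpass computed beforehand)
ts.lines(airpass.stl$seas, col=4)
ts.lines(airpass.stl$rem, lty=1, col=3)
# Click in the plot to position the legend
legend(locator(1), legend=c("Data", "Seasonal effects", "remainder"), lty=1,
col=c(1,4,3))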

[Figure: International Airline Passengers (Airpass.dat). y-axis: (thousands), -100 to 600; x-axis: Year, Jan 54 to Jan 64. Lines: Data, Seasonal effects, remainder.]




• Estimate the ACF and PACF
(They tell you about trends. How?)
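One way to see how the ACF reveals a trend is to compare the sample autocorrelations of pure noise with those of the same noise plus a linear trend. The sketch below is in Python rather than S-PLUS and is only an illustration: the function name `sample_acf`, the simulated series and the slope 0.05 are assumptions made for this example, not part of the lecture data.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients r_0, ..., r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n          # lag-0 autocovariance
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf

random.seed(1)
# Hypothetical series: white noise, and the same noise on top of a linear trend
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
trended = [0.05 * t + e for t, e in enumerate(noise)]

acf_noise = sample_acf(noise, 10)
acf_trend = sample_acf(trended, 10)

print("lag  noise   trended")
for k in range(11):
    print(f"{k:3d}  {acf_noise[k]:6.2f}  {acf_trend[k]:6.2f}")
```

For the trended series the coefficients stay high and decay slowly with the lag, whereas for the noise they drop close to zero after lag 0. A slowly decaying correlogram is therefore a sign that a trend should be removed before modelling.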

[Figure, two panels. Left: "Seasonal components (Airpass.dat)", ln(thousands) against Year, Jan 54 to Jan 64. Right: ACF against Lag for Series : airpass.ln.stl$rem.]




• Decide on transformations, trend removal etc.
[Figure, two panels, both for Series : airpass.resid, Jan 54 to Jan 64. Left: "Residual time series w/linear trend (Airpass.dat)". Right: "Detrended time series".]





Model formulation
-> We will return to this theme later. In this context:
What kind of lag structure is likely/possible in your data?

[Diagram: a population followed over the time steps t-2, t-1, t and t+1. Mature adults give rise to progeny; the surviving adults and new adults in turn give rise to new progeny. Each step is affected by its own "disturbance" (weather etc.) at time t-1, t and t+1, marked εt-1, εt in the diagram.]



- How many time steps back is it reasonable to expect a biological
  relation? Generation time?
- How far back can we expect external processes to have an influence?
Remember that the previous generation is also influenced by disturbance,
which is independent of the disturbance of the current generation. We are
looking only for direct influence on "this" generation.







b.) Model estimation

Identification of order
-> Should be based on biology: What is your biological assessment of the
most comprehensive model that you can believe?



Our formal tool for choosing the appropriate model is AIC.
To understand AIC, we need some background.
(Try to understand the concepts.)


Remember: We wanted our data to be normally distributed
⇒ so that we can do Maximum Likelihood (ML) estimation





Maximum Likelihood and AIC
If we know the distribution of the noise, we can write out the probability
density function for the data.
We write:
$\varepsilon_1, \dots, \varepsilon_n \ \text{iid} \sim f(x, \theta)$
(The noise is independent and identically distributed according to a density f()
that depends on the data x and some parameters θ.)
If so, we can find the most likely values of the parameters, given these
data. The function we maximise to find them is denoted L(θ, x): the likelihood.
If we can assume normality, the likelihood is given by

$$L = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2}\,\frac{\varepsilon_t^2}{\sigma^2}\right)$$

where $\varepsilon_t$ depends on some parameters, θ.
We want to find the estimate $\hat{\theta}$ of the parameters that is most likely, given
the data and the distribution.
As many distributions are exponential, we take the logarithm of the likelihood,
and for technical reasons the negative is often taken. Thus:
We want to minimise -log L with respect to the parameters θ (by some
numerical optimisation routine); the minimiser is $\hat{\theta}$.
-log L is called the NLLH, the Negative Log Likelihood.
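To make the NLLH concrete, the following Python sketch (an illustration, not the S-PLUS routine) evaluates -log L for normally distributed noise and minimises it over σ by a crude grid search. The residual values, the grid and the function name `neg_log_lik` are made up for the example. For the normal likelihood the minimum is known in closed form, σ² = Σ ε_t² / T, which gives something to compare the numerical answer with.

```python
import math

def neg_log_lik(resid, sigma):
    """Negative log likelihood of iid N(0, sigma^2) noise values."""
    n = len(resid)
    ss = sum(e * e for e in resid)
    # -log of the product of normal densities
    return n * math.log(math.sqrt(2 * math.pi) * sigma) + ss / (2 * sigma ** 2)

# Hypothetical noise values, for illustration only
resid = [0.3, -1.2, 0.8, 0.1, -0.5, 1.1, -0.7]

# Closed-form minimiser of the NLLH with respect to sigma
sigma_hat = math.sqrt(sum(e * e for e in resid) / len(resid))

# Crude numerical minimisation: try sigma = 0.05, 0.10, ..., 5.00
grid = [0.05 * k for k in range(1, 101)]
best = min(grid, key=lambda s: neg_log_lik(resid, s))

print("closed form:", round(sigma_hat, 3), " grid search:", round(best, 2))
```

A real optimiser replaces the grid by a smarter search and works on all parameters at once, but the principle is the same: try parameter values and keep those giving the smallest NLLH.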


The AIC – Akaike's Information Criterion is:
$$\mathrm{AIC} = -2\,\mathrm{loglik} + 2\,(\text{number of independent parameters; here } p+q+1)$$
Corrected AIC:
$$\mathrm{AICC} = -2\,\mathrm{loglik} + \frac{2(p+q+1)\,n}{n-p-q-2}$$





In practice: We estimate parameters for different models, compare the AIC
values of the models, and adopt the model with the lowest AIC value (as a
rule of thumb, a difference of about 2 is taken to be meaningful).
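The comparison can be sketched as follows in Python (an illustration, not S-PLUS). The functions `aic` and `aicc` implement the two formulas with p+q+1 parameters; the candidate orders and their -2 loglik values are invented for the example, and n = 144 is the length of the airline series.

```python
def aic(minus2loglik, p, q):
    """AIC = -2 loglik + 2 (p + q + 1)."""
    return minus2loglik + 2 * (p + q + 1)

def aicc(minus2loglik, p, q, n):
    """Corrected AIC = -2 loglik + 2 (p + q + 1) n / (n - p - q - 2)."""
    return minus2loglik + 2 * (p + q + 1) * n / (n - p - q - 2)

n = 144
# Hypothetical -2 loglik values for three candidate ARMA(p, q) orders
candidates = {(1, 0): -520.0, (2, 0): -531.0, (2, 2): -533.0}

scores = {order: aicc(m2ll, order[0], order[1], n)
          for order, m2ll in candidates.items()}
best_order = min(scores, key=scores.get)

for order in sorted(scores):
    print(order, round(scores[order], 2))
print("chosen order (p, q):", best_order)
```

Note that the largest model has the lowest -2 loglik, as it always will when models are nested, but the penalty term makes the smaller ARMA(2,0) model come out best in this made-up example.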






Estimation of parameters
Two alternatives:
1.) If you know (i.e. are sure) that you are to fit an AR-process:
AR-estimation:
-> The parameters can be fitted by least squares estimation (as we do in
  ordinary regression) by minimising:

$$S = \sum_{t=p+1}^{N} \left[ x_t - \mu - \alpha_1 (x_{t-1} - \mu) - \cdots - \alpha_p (x_{t-p} - \mu) \right]^2$$


(If we assume normality, these are also the maximum likelihood estimates,
and AIC can be used to decide on the number of lags. This is what S-PLUS
does.)
In the book: The Partial Autocorrelation Coefficient can be used to
determine the order of the process.
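For p = 1 the least squares problem has a simple closed-form solution, which the Python sketch below uses (an illustration, not the S-PLUS routine). It makes one simplifying assumption that should be stated clearly: µ is fixed at the sample mean instead of being estimated jointly with α_1. The function name `fit_ar1`, the simulated series and the true values 0.6 and 5.0 are invented for the example.

```python
import random

def fit_ar1(x):
    """Least squares fit of x_t - mu = alpha (x_{t-1} - mu) + e_t.

    Assumption: mu is set to the sample mean, then alpha minimises S.
    """
    n = len(x)
    mu = sum(x) / n
    num = sum((x[t] - mu) * (x[t - 1] - mu) for t in range(1, n))
    den = sum((x[t - 1] - mu) ** 2 for t in range(1, n))
    alpha = num / den
    # Residual sum of squares S at the fitted values
    s = sum((x[t] - mu - alpha * (x[t - 1] - mu)) ** 2 for t in range(1, n))
    return mu, alpha, s

# Simulate a hypothetical AR(1) series with known parameters
random.seed(0)
true_mu, true_alpha = 5.0, 0.6
x = [true_mu]
for _ in range(1999):
    x.append(true_mu + true_alpha * (x[-1] - true_mu) + random.gauss(0.0, 1.0))

mu_hat, alpha_hat, s_min = fit_ar1(x)
print("mu:", round(mu_hat, 2), " alpha:", round(alpha_hat, 2))
```

With a long simulated series the estimates should land close to the values used in the simulation. For higher orders the same idea becomes an ordinary multiple regression of x_t on its p lagged values.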


2.) If MA-terms are involved (alone or in addition to the AR-terms),
iterative methods must be used to fit the parameters.
Concept, least squares: Different values for the parameters are tried
successively until the lowest deviation from the observed values is
obtained (page 59).
In practice: Maximum likelihood, optimised by some exact method in the
software.
(S-PLUS: arima.mle)
Then AIC can be used to choose the appropriate order of the process.
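The "try different values successively" concept can be sketched for an MA(1) process in Python (an illustration only; arima.mle uses maximum likelihood, not this grid search). Assumptions made for the example: the series has mean zero, the model is written x_t = e_t + β e_{t-1} (sign conventions for MA terms differ between books and software), and the noise before the first observation is set to zero. The function name `css_ma1` and the simulated data are hypothetical.

```python
import random

def css_ma1(x, beta):
    """Sum of squared residuals for x_t = e_t + beta * e_{t-1}, with e_0 = 0."""
    e_prev, s = 0.0, 0.0
    for v in x:
        e = v - beta * e_prev      # residual implied by this value of beta
        s += e * e
        e_prev = e
    return s

# Simulate a hypothetical MA(1) series with beta = 0.5
random.seed(0)
true_beta = 0.5
noise = [random.gauss(0.0, 1.0) for _ in range(2001)]
x = [noise[t] + true_beta * noise[t - 1] for t in range(1, 2001)]

# Try beta = -0.95, -0.94, ..., 0.95 and keep the value with the smallest sum
grid = [k / 100 for k in range(-95, 96)]
beta_hat = min(grid, key=lambda b: css_ma1(x, b))
print("estimated beta:", beta_hat)
```

The residuals cannot be written as a linear function of β, because each residual depends on the previous one. That is why no closed-form regression solution exists and an iterative search is needed.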







c.) Model checking
As in all other statistics: We check the residuals:
Residual = observation - fitted value


In time series, the fitted value is the one-step-ahead forecast (we will
return to forecasting later).
E.g., for an AR(1) model:
$$\hat{\varepsilon}_t = x_t - \hat{\alpha}_1 x_{t-1}$$

To check the residuals, the book recommends:
- Plot the residuals (in time)
- Calculate the correlogram
- (test)


When fitting models using arima.mle, all this can be taken care of by
arima.diag(). Here the test is that of Ljung and Box (1978).
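As a rough idea of what such a test computes, the Python sketch below evaluates the Ljung-Box statistic Q = n(n+2) Σ r_k² / (n-k), summed over lags k = 1, ..., m, for two simulated series. It is an illustration and not a reimplementation of arima.diag(); the function name `ljung_box_q`, the series and the choice m = 10 are assumptions for the example. No p-value is computed, since that needs the chi-squared distribution.

```python
import random

def ljung_box_q(resid, max_lag):
    """Ljung-Box statistic based on the first max_lag autocorrelations."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

random.seed(2)
# "Good" residuals: white noise
white = [random.gauss(0.0, 1.0) for _ in range(300)]
# "Bad" residuals: autocorrelation left in (hypothetical AR(1), alpha = 0.7)
correlated = [white[0]]
for t in range(1, 300):
    correlated.append(0.7 * correlated[-1] + white[t])

q_white = ljung_box_q(white, 10)
q_corr = ljung_box_q(correlated, 10)
print("Q, white noise:", round(q_white, 1))
print("Q, correlated :", round(q_corr, 1))
# For reference: the 95% point of chi-squared with 10 df is about 18.3
```

Q is compared with a chi-squared distribution; for residuals from a fitted ARMA(p, q) model the degrees of freedom are usually reduced by p+q. A large Q (small p-value) means the residuals are still autocorrelated and the model is inadequate.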

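# Fit an ARMA(2,2) model to the detrended series by maximum likelihood
airpass.arima22 <- arima.mle(airpass.resid, model =
list(order=c(2,0,2)),n.cond=6)
# Corrected AIC. loglik is added without the factor -2, so the loglik component
# returned by arima.mle is assumed to already be -2 times the log-likelihood
aicc <- function(loglik, p, q, n)
{
         a <- loglik + ((2 * (p + q + 1) * n)/(n - p - q - 2))
         return(a)
}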
> airpass.arima22$aic
[1] -531.36
> aicc(airpass.arima22$loglik, p=2, q=2, n=144)
[1] -528.93


par(mfrow=c(1,2))
arima.diag(airpass.arima22)





[Figure: ARIMA Model Diagnostics: airpass.resid, for the ARIMA(2,0,2) Model with Mean 0. Four panels: Plot of Standardized Residuals (1953 to 1965); ACF Plot of Residuals; PACF Plot of Residuals; P-values of Ljung-Box Chi-Squared Statistics, against Lag.]



