New Method for Time Series Forecasting by slappypappy128


									                  New Method for Time Series Forecasting
                                           THEODOR D. POPESCU
                       National Institute for Research and Development in Informatics,
                              8-10 Maresal Averescu Avenue, 71316 Bucharest

  Abstract: - The paper presents a method for multivariate time series forecasting using inde-
  pendent component analysis, as a preprocessing tool. The ideea of this approach is to do the
   forecasting in the space of independent components (sources), and then transforming back to
  the original time series. The forecasting can be done separately and with a different method for
 each component, depending on its time structure. The method has been applied in simulation, to
  an artificial multivariate time series with five components, generated from three sources and a
 randomly mixing matrix.

 Key-Words: - Independent component analysis, multivariate time series modeling and forecas-
 ting, algorithms, second order statistics, high order statistics, Box - Jenkins approach, simulation.

1 Independent Component                                     by the mixing equation
                                                                              ܴص      ״ص                (2)
1.1 Problem Formulation                                     where ״ص     ×½ ´Øµ     ×Ò ´Øµ Ì is an Ò ¢ ½ col-
Independent Component Analysis (ICA) is a sta-              umn vector collecting the source signals, vector
tistical and computational technique, that can be           ܴص similary collects the Ò observed signals and
seen as an extension to Principal Component Anal-           the square Ò ¢ Ò ”mixing matrix” contains the
ysis (PCA) and Factor Analysis (FA)[1]. ICA is a            mixture coefficients. The ICA problem consists
much more powerful technique, capable of find-               in recovering the source vector ״ص using only
ing the underlying factors or sources when these            the observed data ܴص, the assumption of inde-
classic methods fail completely. The data anal-             pendence between the entries of the input vector
ysed by ICA could originate from many differ-               ״ص and possible some a priori information about
ent kinds of application fields, including digital           the probability distribution of the inputs. It can
images, economic indicators and psychometric                be formulated as the computation of an Ò ¢ Ò
measurements.                                               ”separating matrix” whose output ״ص
     The simple ICA model assumes the existence
of Ò independent signals ×½ ´Øµ      ×Ò ´Øµ and the                           ״ص      Ü ´Øµ               (3)
observation of as many mixtures ܽ ´Øµ        ÜÒ ´Øµ,       is an estimate of the vector ״ص of the source sig-
these mixtures being linear and instantaneous, i.e.         nals
                          Ò                                      ICA is closely related to the method Blind
               Ü ´Øµ              × ´Øµ         (1)         Source Separation (BSS) or blind signal separa-
                              ½                             tion. A ”source” means here an original signal,
for each     ½   Ò. This is compactly represented           i.e. independent component. ”Blind” means that

we no very little, if anything, on the mixing ma-           the corresponding column in        by the
trix, and make little assumptions on the source             same scalar. As a consequence we may
signals. ICA is one method, perhaps the most                quite as well fix the magnitudes of the in-
widely used, for performing blind source separa-            dependent components; as they are random
tion.                                                       variables, the most natural way to do this
     In many applications, it would be more re-             is to assume that each has unit variance:
alistic to assume that there is some noise in the              ×¾      ½. Then the matrix     will be
measurements, which would mean adding a noise               adapted in the ICA solution methods to take
term in the model:                                          into account this restriction.

                                                         2. We cannot determine the order of the in-
             ݴص         ״ص                 (4)          dependent components. The reason is that,
             ܴص       ݴص · Ҵص                         again both × and being unknown, we can
                                                            freely change the order of the terms in the
                                                            sum (1), and call any of the independent
1.2 Identificability of the ICA model                        components the first one.
The identificability of the noise-free ICA model
has been treated in Comon [2]. By imposing the        1.3 Algorithms for ICA
following fundamental restrictions (in addition to
the basic assumption of statistical independence),    Independent Component Analysis is mainly per-
the identifiability of the model can be assured:       formed using the information on signal statistics.
                                                      When the signals are temporal coherent, it is pos-
   1. All the independent components × with the       sible to solve the problem using only the second-
      possible exception of one component, must       order statistics, but if the signals are temporal
      be non-Gaussian.                                white or have identical normalized spectral den-
                                                      sities, without any information on a priori source
   2. The number of the observed linear mix-          distributions, the solution will need using of high-
      tures Ñ must be at least as large as the        order statistics for the received signals. If the
      number of independent components Ò, i.e.        source signal distributions are known, the prob-
      Ñ Ò.                                            lem could be solved by maximum likelihood me-
                                                      thod. We underline that in the case of source
   3. The matrix     must be of full column rank.
                                                      signals temporal white and Gaussian, the blind
                                                      source separation problem has not solution.
     Usually, it is also assumed that Ü and × are
                                                           In the next section we present an approache
centered. If Ü and × are interpreted as stochas-
                                                      that supposes the signals temporal coherent and
tic processes instead of simply random variables,
                                                      exploits the second-order statistics using interco-
additional restrictions are necessary. At the min-
                                                      variance matrix of observations. Other approaches
imum, one has to assume that the stochastic pro-
                                                      that suppose the signals white temporal and ex-
cesses are stationary in the strict sense. Some
                                                      ploits statistics of high-order, using non-linear func-
restriction of ergodicity with respect to the quan-
                                                      tions are described in [1].
tities estimated are also necessary.
     In the ICA model of eq. (2), it is easy to see
that the following ambiguities will hold:             2 ICA Using Second-Order
   1. We cannot determine the variances (ener-
      gies) of the independent components. The
                                                      2.1 Second-Order Information
      reason is that, both × and being unknown,
      any scalar multiplier in one of the sources     The first step of the procedure [3] consists of pre-
      × could always be cancelled by dividing         whitening the signal part ݴص of the observation.
This is done via a whitening matrix Ï, i.e. a           2. Perform the eigendecomposition of the ÊÜ ´¼µ
Ò ¢ Ñ matrix (we consider Ò sources and Ñ mix-             covariance matrix
tures) such that Ïݴص is spatially white. The
whiteness condition is                                                          ÊÜ ´¼µ               À¡ÀÌ                            (9)
      ÁÒ    ÏÊÝ ´¼µÏÌ            Ï     Ì ÏÌ   (5)

where ÁÒ denotes the Ò ¢ Ò identity matrix. Equa-                               À                ½               Ñ
tions (5) implies that Ï is a unitary matrix:
for any whitening matrix Ï, it then exists a uni-
tary matrix Í such that Ï         Í. As a conse-
quence, matrix can be factored as                                           ¡                        ½               Ñ

                                                             with          for       . The number of
             Ï Í        Ï Ù½            ÙÒ    (6)            sources can be estimated starting from the
                                                             spectrum ¡ [4], [5].
where # denotes the pseudoinverse and Í is uni-
tary. The use of second-order information - in the      3. Estimate noise variance ¾ as the average
form of an estimate of ÊÝ ´¼µ which is used to             of the Ñ   Ò smallest eigenvalues of ¡
solve for Ï in (5) - reduces the determination of
the Ñ ¢ Ò mixing matrix to the determination
                                                                            ¾                ½
of a unitary Ò ¢ Ò matrix Í. The whitened pro-                                              Ñ Ò
cess ÜÛ ´Øµ Ïܴص still obeys a linear model:
                                                        4. Compute the whitening matrix Ï as:
 ÜÛ ´Øµ    Ïܴص            Ï´ ״ص · Ҵصµ                                         Ï               ¡
                            Í״ص · ÏҴص (7)
    The signal part of the whitened process now
is a unitary mixture of the source signals. Note
                                                                            ´ ½             ¾
                                                                                                  ½      ¾
                                                                                                                     ´   Ò       ¾
                                                                                                                                       ½   ¾
that all the information contained in the covari-
ance is ’exhausted’ after the whitening, in the              and
sense that changing Í in (7) to any other uni-
tary matrix leaves unchanged the covariance of

                                                                                                 ½               Ò
ÜÛ ´Øµ.
                                                        This resulted matrix is used to obtain the white-
2.2 Whitening Matrix Computation                     ned process
This step is implemented via eigendecomposi-
tion of the sample covariance matrix ÊÜ ´¼µ. We
consider here that the noise covariance is of the          ÜÛ ´Øµ       Ïܴص                Ø           ½           Ì           (12)
form ÊÒ ´¼µ       ¾ Á . The whitening procedure
is the following:                                    2.3 Intercovariance Matrix Estimation

   1. Estimate the covariance matrix ÊÜ ´¼µ us-      Starting from the whitened process ÜÛ ´Øµ, à in-
      ing Ì samples of the observations:             tercovarince matrices of this process are computed:

                             Ì                                                          Ì
              ÊÜ ´¼µ              Ü´ØµÜ´ØµÌ   (8)     ÊÛ ´
                                                                                             ÜÛ ´ØµÜÛ ´Ø                     Ì (13)
                        Ì   Ø ½
                                                                    Ì           Ø       ·½
where ½            Ã . The resulted matrices are     3 Time Series Forecasting by ICA
of Ò ¢ Ò dimension, and the computation effort
does not depend of number of sensors, Ñ. The         The idea of this approach [9] is to forecast the
value of à will be selected to realize a trade off   signal Ü ´Øµ using a ICA method, doing the fore-
between the statistic efficiency and computation      casting in the space of sources, and then trans-
effort. The value of the delays used in compu-       forming back to the original time series. The
tation depends also on the length of the signal      forecasting can be done separately and with a dif-
correlations. If we have a priori information on     ferent method for each component, depending on
spectral density of sources, the value of à can be   its time structure. The following procedure was
optimal chosen.                                      used:

                                                        1. After substracting the mean of each time
2.4 Joint Diagonalization
                                                           series, the independent components × ´Øµ
Let ÊÛ        ÊÛ ´ µ ½           Ã be a set of             and the mixing matrix are estimated.
à matrices with common size Ò ¢ Ò. A joint
diagonalizer of the set ÊÛ is defined as a unitary       2. For each independent component × ´Øµ a
maximizer of the criterion                                 suitable filtering procedure is applied to re-
                                                           duce the effect of noise-smoothing for com-
                                                           ponents that contain very low frequency
               Ã                                           (trend, slow cyclical variations), and high-
      ´Íµ                ´Í
                              Ì Ê ´ µÍµ ¾
                                 Û           (14)          pass filtering for components containing high
                                                           frequencies and/or sudden shocks.
where       ´¡µ is the norm of the vector build
                                                        3. Each smoothed independent component is
from the diagonal of the matrix argument. The
                                                           predicted separately, for instance using some
problem is solved by a generalization of Jacobi
                                                           method of autoregressive (AR) or autore-
technique [6], [7], [8].
                                                           gressive and moving average (ARMA) mod-
                                                           eling. The prediction is done for a number
2.5 Mixing Matrix and Source Signals                       of steps into the future.
                                                        4. The predictions for each independent com-
Let Í      Ù½       ÙÒ be the unitary matrix re-           ponent are combined by weighting them
sulted by joint diagonalization. If the objective          with the mixing coefficients, , thus ob-
of the blind identification is source separation, a         taining the predictions, ÜÔ ´Øµ for the origi-
brute estimation of these can be computed by:              nal time series Ü ´Øµ.

                ״ص    ÍÌ ÜÛ ´Øµ            (15)
    To estimate the mixing matrix need to inverse
                                                     4 Experimental Results
the effect of whitening, and the mixing matrix       To test the method, we applied the procedure to
can be estimated by                                  an artificial multivariate time series with five com-
                                                     ponents, generated from three sources and a mix-
                        Ï Í                  (16)    ing matrix , randomly generated.
   To obtain at the output of the separator a max-       The original sources used in this application
imum signal/noise ratio the source signals are es-   are given in Fig. 1. The mixing signals - the
timated by                                           synthetic data - resulted are given in Fig. 2.
                                                         The sources signals have been estimated from
            ״ص       Ì Ê ´¼µ ½ ܴص                the mixing signals using JADE algorithm. The
                          Ü                  (17)
                                                     estimated sources are represented in Fig. 4. The
                                                     estimated sources, have been assimilated to three
40                                                                                      10


10                                                                                       2
      0     10       20       30       40     50   60   70   80         90       100         0    10        20      30        40   50   60        70   80         90   100

1.5                                                                                    −6

 1                                                                                     −8

0.5                                                                                    −10

 0                                                                                     −12
      0     10       20       30       40     50   60   70   80         90       100         0    10        20      30        40   50   60        70   80         90   100

50                                                                                       6


20                                                                                     −2
      0     10       20       30       40     50   60   70   80         90       100         0    10        20      30        40   50   60        70   80         90   100

                          Fig. 1: Original sources                                                               Fig. 3: Estimated sources
−10                                                                                     10

−20                                                                                      8

      0     10       20       30         40   50   60   70   80           90     100
 50                                                                                      4

  0                                                                                      2
                                                                                             0         20                40        60        80             100        120

−50                                                                                    −6
      0     10       20       30         40   50   60   70   80           90     100
      0     10       20       30         40   50   60   70   80           90     100
150                                                                                    −12
                                                                                             0         20                40        60        80             100        120

 50                                                                                      4
      0     10       20       30         40   50   60   70   80           90     100

−60                                                                                    −2
      0     10       20       30         40   50   60   70   80           90     100         0         20                40        60        80             100        120

                     Fig. 2: Time series analysed                                                Fig. 4: Forecasting of estimated sources

monovariable time series and have been analysed                                               ¾
                                                                                       with ¯ ¼ ½
with TSYSTEM program package [10]. The fol-                                                Starting from the resulted models we perfor-
lowing ARIMA models resulted for s1, s2 and s3,                                        med forecasting for the estimated sources, for an
respectively, where ¯Ø is innovation and ÜØ                                            horizont of prediction of 12 steps using TSYS-
ÜØ   ÜØ ½ :                                                                            TEM program package. The confidence interval
                                                                                       for prediction has been chosen 95%. The resulted
      ´½         µ×½Ø             ´½    ¼     ¼¾      ¼ ¾¿        ¾
                                                                      µ¯Ø      (18)    are presented in Fig. 4. Based on the forecast-
                                                                                       ing results of the estimated sources and knowing
with ¯                ¼ ¿¿¼                                                            the mixing matrix it was possible to determine
                                                                                       the forecasting values of the multivariate time se-

      ×¾Ø                 ¼       ·´½·¼              ¼ ½              µ¯Ø      (19)
                                                                                       ries investigated. The forecasting results with the
                                                                                       confidence intervals are given in Fig. 5.
with ¯                ¼           ¾

                                  µ×¿Ø                                         µ¯Ø
      ´½ · ¼ ½ ½                              ½¼   · ´½ · ¼           ¾
                                                                    Trans. on ASSP, vol. 35, no. 11, pp. 1533-
                                                                    1538, 1987.
      0        20      40      60      80     100       120

                                                              [6]   G.H. Golub and C.F.V. Loan, Matrix Com-
                                                                    putation, The John Hopkins University Press,
      0        20      40      60      80     100       120
−10                                                                 1989.

      0        20      40      60      80     100       120
                                                              [7]   A. Souloumiac and J.F. Cardoso, ”Com-
                                                                    paraison de methodes de separation de sour-
                                                                    ces”, Proc. GRETSI, Juan les Pines, 1991.
      0        20      40      60      80     100       120
                                                              [8]   A. Souloumiac and J.F. Cardoso, ”Givens

                                                                    angles for simultaneous diagonalization”,
      0        20      40      60      80     100       120
                                                                    SIAM J. Matrix Anal. Appl., 1994.
          Fig. 5: Forecasting of original time series         [9] S. Malaroiu, K. Kiviluoto and E. Oja, ”Time
                                                                   series prediction with independent compo-
5 Conclusions                                                      nent analysis, Proc. Int. Conf. on Ad-
                                                                   vanced Investment Technology, Gold Coast,
The results are promising as the ICA based fore-                   Australia, 2000.
casting is expected to work better than direct fore-
casting. It is known that multivariate time series            [10] Th. Popescu, Th., Time Series. Applica-
modeling put many problems concerning canon-                       tions in System Analysis, Technical Pub-
ical representation of these series. The presented                 lishing House, Bucharest, 2000.
approach will need to be intensive investigated
for other real time series, and the results com-
pared with those provided by other classical meth-


[1]                a
            A. Hyv¨ rinen, J. Karhunen and E. Oja, In-
            dependent Component Analysis, John Wi-
            ley & Sons, Inc, 2001.
[2]         P. Comon, ”Independent Component anal-
            ysis - a new concept ?”, Signal Processing,
            vol. 36, pp. 287-314, 1994.
[3]         A. Belouchrani, K. Abed Meraim, J.F. Car-
            doso and E. Moulines, ”A blind source sep-
            aration technique based on second order
            statistics”, IEEE Trans. on Signal Process-
            ing, vol. 45, no. 2, pp. 434-444, 1997.
[4]         M. Wax and T. Kailath, ”Determining the
            number of signals by information theoretic
            criteria”, Workshop on spectral estimation
            II, Florida, pp. 192-196, 1983.
[5]         Y. Yin and P. Krishnaiah, ”Methods for de-
            tection of the number of signals”, IEEE

To top