A study of techniques for mining data from the Australian Stock Exchange

Mark B Barnes, Russell J Rimmer and Kai Ming Ting¹
School of Computing and Mathematics
Deakin University
Clayton, Victoria 3168, Australia

ABSTRACT
Next-day predictions of the All Ordinaries Index of the Australian Stock Exchange are generated using lagged values of the index. Three approaches are compared: exponential moving average, linear regression and neural network. The first predicts 'tomorrow' best when little account is taken of previous index values other than the value 'yesterday'. Sliding windows are used to do one-day-ahead prediction for the other approaches. In two periods when the All Ordinaries Index behaves differently, neural networks did not perform as well as the other techniques. Performance, as measured by mean absolute percentage error, depends on which day is being predicted. It appears that linear regression and exponential moving averages better account for relationships between index values on successive days.

1. INTRODUCTION
Technical analysts aim to predict values of financial instruments using only previous values or prices of the instrument. Proponents of technical systems 'know' they work and may rely on little else. Seasoned brokers and fundamentals analysts look at other information, but are often mindful of technical forecasts. Considerable effort is expended to predict changes in the All Ordinaries Index (All Ords) of the Australian Stock Exchange (ASX). Three approaches are investigated in this paper: exponential moving averages, linear regression and neural networks.

Neural networks can capture nonlinear relationships in financial markets. In [9] lagged values of the Kuala Lumpur Stock Exchange Index are augmented with values derived in technical analysis to identify important features. The researchers trained on data for 1984 to 1988 and forecast after 1991, having validated using intervening values. Predictions are nearly three years beyond the training period. In [5] neural-network forecasting of the Hang Seng index is augmented with autocorrelation analysis to decide the number of lagged inputs. Training was on 500 weeks of data. Predictions were formed for the following 20 through to 50 weeks.

Researchers distinguish short- and long-term prediction. Short term is taken to mean one step ahead. For the Hang Seng analysis one step is a week. (In one exchange-rate study [8] the short term is one hour.) The approach below is short-term technical. That is, next-day predictions of the All Ords closing values are generated using only lagged values of the index. In Section 2 the All Ords is reviewed briefly. Section 3 contains a discussion of the three approaches to technical analysis. Short-term prediction results for two periods are reported in Section 4. Further research is discussed in Section 5.

2. AUSTRALIA'S ALL ORDINARIES INDEX
All Ords closing values are investigated for trading days in 1986-8 and 1994-7. Over these years the All Ords was a weighted average of more than 250 share prices. It accounted for about 95 percent of the total market capitalisation of listed entities and is a widely cited indicator of activity on the ASX. The Sydney Futures Exchange hosts very active trading in contracts based on the All Ords.

Generally ASX activity was buoyant over 1994-7. The All Ordinaries grew from 1,846 to 2,881. There was increased activity in initial public offerings. Many fundamentals were favourable, among them five reductions in the official cash rate and the election of a Commonwealth government committed to budgetary restraint. The period 1986-8 encompassed a severe downturn. In many weeks during 1986-8 the All Ords closing value fell each day from Monday to Friday. Over October 1987 the All Ords lost approximately 40 percent of its value.

¹ The authors would like to thank Mr. Charles Arcovitch of Unica Technologies, Inc. for providing Pattern Recognition Workbench (PRW).
3. THE PREDICTION TECHNIQUES

3.1 Exponential moving average (EMA)
This is the convex combination of the actual closing value and the prediction for today. That is,

    $E_t = \alpha X_{t-1} + (1 - \alpha) E_{t-1}, \qquad 0 \le \alpha \le 1,$

where $E_t$ is tomorrow's estimated value, $X_{t-1}$ is the actual closing value today and $E_{t-1}$ is the prediction for today. $E_0$ was taken to be $X_0$. In this approach the prediction for tomorrow (time $t$) lies in the interval between the actual and predicted values today (time $t-1$). This technique will therefore be in error whenever tomorrow's index is outside this interval. If $\alpha = 1$ then no account is taken of earlier predictions; to put this another way, the index tomorrow is best predicted using today's value. When $\alpha = 0$, the actual value is of no importance and the prediction never changes. EMA involves all of the available data when $0 < \alpha < 1$; however, as $t$ advances, data from the 'distant' past have a diminishing effect. In the other techniques a fixed number of earlier values of $X_t$ is used.
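For illustration, the recursion above can be sketched in a few lines (Python is our choice here; the paper's own computations used PRW, and the function and variable names are ours):

    def ema_predictions(closes, alpha):
        # One-day-ahead EMA: E_t = alpha*X_{t-1} + (1 - alpha)*E_{t-1},
        # with E_0 taken to be X_0, as in the text.
        preds = [closes[0]]                  # E_0 = X_0
        for x in closes[:-1]:                # X_{t-1} feeds E_t
            preds.append(alpha * x + (1 - alpha) * preds[-1])
        return preds                         # preds[t] is E_t

For $\alpha$ near one each prediction tracks the most recent close; for $\alpha$ near zero the prediction never moves from $E_0$.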
3.2 Neural network
Backpropagation neural networks are capable of producing smooth nonlinear mappings between input and output variables. A simple neural network consists of three layers: input, hidden and output. Each layer contains one or more nodes. The numbers of nodes in the input and output layers equal the numbers of input and output variables in the problem under consideration.

A sliding window [4] is used to do one-step-ahead prediction. Data $X_0, X_1, X_2, \ldots$ are treated as a continuous stream, with training size, test size and slide size pre-defined. A window consists of a training set and a test set, and is obtained by sliding across the stream of data by the slide size. This process is repeated until the data are exhausted. We are interested in predicting the index value 'tomorrow' given the value 'today'. In this case the test and slide sizes are one. On day $t-1$ (call it 'today') the $s$ pairs

    $(X_{t-s-1}, X_{t-s}),\ (X_{t-s}, X_{t-s+1}),\ \ldots,\ (X_{t-3}, X_{t-2}),\ (X_{t-2}, X_{t-1})$

of the form (value one day, next day's value) are used to train a backpropagation neural network in Pattern Recognition Workbench (PRW) [4]. The network consists of an input layer with one node for the closing value 'today', $X_{t-1}$; a single hidden layer; and one output node for the closing value 'tomorrow', $X_t$.

For our data the PRW software suggests that six hidden nodes be used. The PRW documentation notes that the number of nodes in the hidden layer should be smaller than the square root of the number of training patterns. The Baum-Haussler rule suggests that the hidden layer in our configuration could have as few as two nodes [5]. One, two, three, six, seven and eight were tried for 1986-8 and 1994-7.

Training sets of size $s$ = 5, 10, 20, 65, 130 and 260 were tried. When selecting training sets, the final 450 observations in each period were reserved as test cases. To do this, initial training sets of size $s$ were drawn as the $s$ observations immediately before the first of the 450 test items. Thus for all training sizes, the $r$th prediction concerns the same day among the 450, for $r$ = 1, 2, …, 450.
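The window mechanics can be sketched as follows. This is illustrative only: scikit-learn's MLPRegressor stands in for PRW's backpropagation network, so it is a sketch of the procedure rather than the implementation actually used; the hidden-node count, learning rate and momentum shown are the values reported for our configuration.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def sliding_window_forecasts(closes, s, hidden_nodes=6):
        # closes: stream X_0, X_1, ...; s: training-set size.
        # Test and slide sizes are one, as in the text.
        closes = np.asarray(closes, dtype=float)
        preds = []
        for t in range(s + 1, len(closes)):
            # s training pairs (X_{j-1}, X_j), ending with (X_{t-2}, X_{t-1})
            x_train = closes[t - s - 1:t - 1].reshape(-1, 1)
            y_train = closes[t - s:t]
            net = MLPRegressor(hidden_layer_sizes=(hidden_nodes,),
                               solver='sgd', learning_rate_init=0.1,
                               momentum=0.1, max_iter=2000)
            net.fit(x_train, y_train)
            # predict 'tomorrow' (X_t) from 'today' (X_{t-1})
            preds.append(net.predict(closes[t - 1:t].reshape(-1, 1))[0])
        return np.array(preds)

Because the slide size is one, the network is retrained for every test day, always on the $s$ observations immediately preceding it.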
3.3 Linear regression
PRW also provides a linear forecasting process. To predict tomorrow's value, $E_t$, the pairs

    $(X_{t-s-1}, X_{t-s}),\ (X_{t-s}, X_{t-s+1}),\ \ldots,\ (X_{t-3}, X_{t-2}),\ (X_{t-2}, X_{t-1})$

are first used to estimate the coefficients $\hat{a}$ and $\hat{b}$ in $\hat{X}_j = \hat{a} + \hat{b} X_{j-1}$. Then set $X_{j-1} = X_{t-1}$ (the closing value today) and compute $E_t = \hat{a} + \hat{b} X_{t-1}$. An obvious misspecification is that the actual model is

    $X_t = a + b_1 X_{t-1} + b_2 X_{t-2} + \cdots + b_m X_{t-m} + u_t.$

This was tried with our data for $m$ = 2, 3, …, 10. However, predictions one day ahead were not improved. Our finding for EMA in Section 4 supports this dependence on only one lagged value.
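A minimal sketch of the one-lag forecast, assuming NumPy's least-squares fit in place of PRW's routine (names are ours):

    import numpy as np

    def regression_forecast(closes, s):
        # OLS on the last s pairs (X_{j-1}, X_j); predict E_t from X_{t-1}.
        x = np.asarray(closes[-s - 1:-1], dtype=float)   # X_{t-s-1} ... X_{t-2}
        y = np.asarray(closes[-s:], dtype=float)         # X_{t-s}   ... X_{t-1}
        b_hat, a_hat = np.polyfit(x, y, 1)               # slope and intercept
        return a_hat + b_hat * closes[-1]                # E_t = a-hat + b-hat X_{t-1}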
3.4 Comparing predictions
Prediction accuracy is measured with a single indicator, the mean absolute percentage error [6],

    $\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|X_t - E_t|}{X_t} \times 100,$

where, as before, $X_t$ is the true value and $E_t$ is the predicted value at time $t$. Changing patterns in the data are assessed with the mean absolute difference between actual closing values on successive days,

    $\mathrm{MAD} = \frac{1}{n} \sum_{t=1}^{n} |X_t - X_{t-1}|.$

MAPE and MAD are averages over $n$ test sets. Specification of $n$ requires care. For example, we report the mean of the absolute percentage errors in predicting the closing value each Monday. In the 450 items reserved as test data there are 90 Monday closing values; hence $n$ = 90. There are 90 Tuesday closing values also. So, like MAPE, MAD is a mean over 90 items.
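Both indicators are straightforward to compute; a sketch (again Python, with names of our choosing):

    import numpy as np

    def mape(actual, predicted):
        # Mean absolute percentage error over the n test items.
        actual = np.asarray(actual, dtype=float)
        predicted = np.asarray(predicted, dtype=float)
        return np.mean(np.abs(actual - predicted) / actual) * 100

    def mad(today, yesterday):
        # Mean absolute difference between closes on successive days.
        # For the weekday tables, 'today' might be the 90 Monday closes
        # and 'yesterday' the 90 preceding Friday closes.
        today = np.asarray(today, dtype=float)
        yesterday = np.asarray(yesterday, dtype=float)
        return np.mean(np.abs(today - yesterday))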
4. RESULTS
Predictions of the All Ords with the three techniques are compared. Data for 1994-7 are examined first (Table 1 and Figure 1). Only the smallest MAPEs are given. For the regression the smallest error arose when $s$ = 260. Minimum errors for the neural network consistently arose when $s$ = 130. The number of hidden nodes producing the minima is shown in brackets in Table 1. The learning rate and momentum were set to 0.1. Sensitivity to reductions in these is reported below.

The EMA approach minimised MAPE when $E_t = 0.99 X_{t-1} + 0.01 E_{t-1} \approx X_{t-1}$. That is, EMA performed best when tomorrow's expected value is derived almost exclusively from today's value. On each day the MAPE is greatest for the neural net and (except for Monday) least for regression. Each approach returned its smallest MAPE when predicting the closing value on Thursdays. This is apparent in Figure 1. At worst (predicting Thursday's close) the neural-net MAPE is greater than the regression error by 11.5 percent. At best (predicting the Monday closing value) the neural-net MAPE is greater by 5.9 percent. Other research finds in favour of neural nets [2, 8].
                                              MAPE
    Successive days          MAD      Neural net          Linear regression   Exp'l moving average
                                      (s = 130; hidden    (s = 260)           (α = 0.99)
                                      nodes in brackets)
    Friday & Monday          11.315   0.560 (2)           0.529               0.513
    Monday & Tuesday         11.388   0.528 (3)           0.496               0.513
    Tuesday & Wednesday      11.354   0.541 (8)           0.502               0.519
    Wednesday & Thursday      9.967   0.466 (6)           0.418               0.457
    Thursday & Friday        12.199   0.573 (7)           0.525               0.559

    Table 1  MADs and MAPEs for the All Ordinaries Index, 1994-7.

    [Figure 1 omitted: line chart of MAPEs by weekday (Monday to Friday) for NN min, Linear Regression and EMA; vertical axis runs from 0.40 to 0.60.]

    Figure 1  MAPEs for the prediction of the All Ordinaries, 1994-7.
The EMA returns a constant error for Monday and Tuesday, while the MAPEs for the other methods decrease over the two days. All three MAPEs decrease from Wednesday to Thursday and increase again on Friday. What explains this uniformity among the percentage errors in prediction? Consider Thursday. Recall that after training the neural net or estimating the regression, Thursday's closing value is predicted from Wednesday's value. Note in Table 1 that the MAD is minimised for Wednesday and Thursday. Index values on these days are usually clustered around the median value in any week [3]. That is, on these days the maximum or minimum weekly values are relatively unlikely compared with other pairs of successive trading days. On Wednesday the median value for the week is most likely, while the second-highest or second-lowest weekly values are most likely on Thursday. With six hidden nodes the neural net predicted about as well as EMA on Thursday.

Now consider Friday. The MAPEs increase from Thursday to Friday, and the MAD is greatest for these days. While closing values on Thursday tend not to fall at the weekly extremes, on Friday the weekly maximum or minimum is likely [3]. This may be one reason prediction on Friday seems difficult. The neural network does best with seven hidden nodes.
The MAD for Friday and Monday is 7.8 per cent less than for Thursday and Friday. Yet the MAPE on Monday with the neural network is down by only 2.3 per cent. For EMA the error declined 8.2 per cent, close to the reduction in the MAD. As for Friday, weekly highs and lows are more likely on Monday. When α = 0.99, EMA effectively predicts Monday using only Friday's close. If a high (low) on Friday is generally followed by a high (low) on Monday, EMA will capture that. Whatever the pattern, with two hidden nodes the neural network does not predict as well. (Nor does it do better on Monday with more hidden nodes and/or reductions in learning rate and momentum.)

Between Monday and Tuesday the MAD increases marginally. On Tuesday the commonest closing value is the second lowest for the week, followed by the minimum [3]. Regression and the neural network have improved prediction success compared with Monday.

From generally low weekly values on Tuesday, Wednesday is characterised by values most commonly at the weekly median. Further, values either side of the median are less common, but about equally likely [3]. The MAPEs increased from their Tuesday levels, even though there is little change (in fact a small reduction) in the MAD. Reductions in learning rate and momentum did not markedly improve the neural-network predictions compared with linear regression and EMA.
Are the predictive outcomes replicated for 1986-8? Now EMA predicts best. MAPEs are again largest for the neural network. As in the earlier table and figure, the neural-net results in Table 2 and Figure 2 are for a learning rate and momentum set at 0.1. The largest MAPE in Table 1 is 0.573. In Table 2 it is 1.407, more than twice as large. The smallest MAPE in Table 2 is 0.715, which exceeds the largest error in Table 1. Given the rapidity and magnitudes of changes in the All Ords, it is probably not surprising that the errors are much greater than during the more ebullient times of 1994-7.

Now compare Figures 1 and 2. Over 1994-7 MAPEs are: (i) generally greater on Monday and Friday; and (ii) lowest on Thursday. But for 1986-8 MAPEs are: (i) low on Monday, Wednesday and Friday; and (ii) highest on Tuesday. Prediction success on Thursday for 1994-7 appears to be associated with features of the closing-value distributions on Wednesday and Thursday that are absent in the earlier data. Rather, in 1986-8 the pairs of distributions on Friday and Monday, Tuesday and Wednesday, and Thursday and Friday have common features. This further contrasts with 1994-7, where the first and third pairs of distributions differ in important respects. Greater detail for 1986-8 is available in [3].
                                              MAPE
    Successive days          MAD      Neural net          Linear regression   Exp'l moving average
                                      (s = 130; hidden    (s = 260)           (α = 0.99)
                                      nodes in brackets)
    Friday & Monday          11.638   1.176 (7)           0.921               0.794
    Monday & Tuesday         14.514   1.407 (6)           1.176               0.986
    Tuesday & Wednesday      11.065   1.051 (3)           0.822               0.755
    Wednesday & Thursday     12.456   1.147 (7)           0.937               0.866
    Thursday & Friday        10.453   1.057 (2)           0.786               0.715

    Table 2  MADs and MAPEs for the All Ordinaries, 1986-8.

    [Figure 2 omitted: line chart of MAPEs by weekday (Monday to Friday) for NN min, Linear Regression and EMA; vertical axis runs from 0.70 to 1.50.]

    Figure 2  MAPEs for the All Ordinaries Index, 1986-8.
5. CONCLUSION
Predictions one day ahead for the All Ords index of the ASX were obtained using a neural network, linear regression and exponential moving averages for the years 1986-8 and 1994-7.

Broadly it seems the neural net responded to patterns in the closing values, but not as sensitively as either linear regression or EMA. Nevertheless, the neural-network percentage errors approximately tracked the profiles of the errors for the other techniques. This was accomplished by varying the number of nodes in a single hidden layer. The number of hidden nodes in 1986-8 was different on each day. For 1994-7 the number of hidden nodes coincided on only Monday and Thursday. Further, the number of hidden nodes for any day in 1986-8 differs from the number required on the same weekday in 1994-7. The networks were also fine-tuned with learning rate and momentum values ranging from 0.1 down to 0.01. In addition, inputs were pre-processed [1]. For none of these did the neural net outperform EMA in 1986-8 or linear regression in 1994-7 on the MAPE criterion.

In 1986-8 the ranked MAPEs for regression and EMA are highly correlated with the ranked MADs. Thus, as the MAD increases there is always an increase in these MAPEs. For the neural net the rank-order correlation is 0.8. In 1994-7 the corresponding correlations are positive, but smaller. Moreover, the largest rank-order correlation is between the MAD and the MAPE for the neural net. Thus what drives the MAPEs is not completely captured by the MADs. Therefore one direction for further inquiry is how robust our results are when the indicator of prediction success is changed from MAPE and the indicator of difference on successive days is no longer MAD. Another strategy is to employ multiple indicators of predictive success and of data patterns on successive days.
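These rank-order correlations are Spearman coefficients. As an illustration (SciPy assumed as the computing tool), the Table 2 columns reproduce the reported neural-net figure:

    from scipy.stats import spearmanr

    # MADs and neural-net MAPEs for 1986-8, read off Table 2
    mads  = [11.638, 14.514, 11.065, 12.456, 10.453]
    mapes = [1.176, 1.407, 1.051, 1.147, 1.057]
    rho, _ = spearmanr(mads, mapes)   # rank-order correlation
    print(round(rho, 1))              # 0.8, as reported above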
In the tradition of technical analysis, we have used as inputs for each technique only lagged values of the series we forecast. Some researchers derive additional time series from the index and use them as inputs also [9]. Others, including financial economists and many stock-market agents, use evidence from economic and company accounts, other financial markets and social, political and international news. These are avenues we will explore. There are precedents in the neural-network literature [8]. At another level, suppose that the data for the period 1986 to 1997 were amalgamated. Our results suggest that short-term prediction should be handled differently in the two sub-periods. This might be associated in part with different expectations and psychological outlooks in the two intervals. Further research is required to investigate whether in longer time series these 'animal spirits' might improve forecasting accuracy. To an extent the judicious use of technical series derived from the original data may serve this purpose. Alternatively, additional inputs might be sought from among the many surveys of business and investor confidence.

                   6. REFERENCES
[1] E. Azoff, Neural Network Time Series Forecasting
      of Financial Markets, Chichester: John Wiley,
      1994.
[2]   S. Avouyi-Dovi and R. Caulet, "Using artificial
      neural networks to forecast prices of financial
      assets", in A. Refenes, Y. Abu-Mostafa, J. Moody
      and A. Weigend, Neural Networks in Financial
      Engineering, Proceedings of the Third International
      Conference on Neural Networks in the Capital
      Markets, Singapore: World Scientific, 1996.
[3]   M. Barnes, R. Rimmer and K. Ting, "Data patterns
      and estimating the All Ordinaries Index",
      Melbourne: Deakin University, 1999.
[4]   R. Kennedy, Y. Lee, B. Roy, C. Reed and R.
      Lippman, Solving Data Mining Problems through
      Pattern Recognition, Upper Saddle River: Prentice
      Hall, 1998.
[5]   F. Lin, H. Xing, S. Gregor and R. Irons, "Time
      series forecasting with neural networks",
      Complexity International, Vol. 2, 1995, pp. 1-12.
[6]   S. Makridakis, S. Wheelwright and V. McGee,
      Forecasting: methods and applications, Chichester:
      John Wiley, 1983.
[7]   T. Mills, The econometric modelling of financial
      time series, Cambridge: Cambridge University
      Press, 1993.
[8]   A. Refenes and M. Azema-Barac, "Neural network
      applications in financial asset management",
      Neural Computation and Applications, Vol. 2,
      1994, pp. 13-39.
[9]   J. Yao and H. Poh, "Equity forecasting: a case
      study on the KLSE index", in A. Refenes, Y. Abu-
      Mostafa, J. Moody and A. Weigend, Neural
      Networks in Financial Engineering, Proceedings of
      the Third International Conference on Neural
      Networks in the Capital Markets, Singapore: World
      Scientific, 1996.
