
                                             Li Lin, Longbing Cao, Chengqi Zhang
                     1. Faculty of Information Technology, University of Technology Sydney, NSW 2007
                                         2. Capital Market CRC, Sydney, NSW 2000

ABSTRACT

In stock markets and other financial market systems, technical trading rules are widely used to generate buy and sell alert signals. Each rule has many parameters. Users often want the best signal series from the in-sample set (here, "best" means the highest profit, return, Sharpe Ratio, etc.), but the best in-sample choice is often not the best in the out-of-sample set; sometimes it no longer works at all. In this paper, the authors assign each parameter a sub-range of values instead of a single value. Within the sub-range, every value gives a better prediction in the out-of-sample set. The improved result is robust and performs better in practice.

KEY WORDS
Stock Market Data Mining; Technical Trading Rules; Genetic Algorithms; Robust; Optimization

1. Introduction

Trading models are algorithms proposing trading recommendations for financial assets. In our approach we limit this definition to a set of trading rules based on historical financial data. The financial data, which are typically series of prices, bid, ask, volume, time, date, etc., enter the trading model in the form of indicators corresponding to various kinds of averages. Although progress has been made in understanding financial markets, there is no definitive prescription on how to build a successful trading model and how to define the indicators. Automatic search and optimization techniques can be considered when addressing this problem.
However, optimizing trading models for financial assets without over-fitting is a very difficult task because the scientific understanding of financial markets is still very limited. Over-fitting means building the indicators to fit a set of past data so well that they are no longer of general value: instead of modeling the principles underlying the price movements, they model the specific movements observed during a particular time period. Such a model usually exhibits a different behavior (or may fail to trade successfully at all) when tested out-of-sample. This difficulty is related to the fact that many financial time series do not show stability of their statistical behavior over time, especially when they are analyzed intra-daily.
To minimize over-fitting during optimization, the optimization process must include the following important ingredients:
(1) A good measure of the trading model performance,
(2) Indicator evaluation for different time series,
(3) Large data samples,
(4) A robust optimization technique,
(5) Strict testing procedures.
On the foreign currency exchange (FX) markets since 1995, thus providing large data samples for developing trading models. It has also produced a good trading model technology that has found applications in successful real-time trading models for the major FX rates.
The new element we want to present in this paper is a way to automatically search for improved trading models. Genetic Algorithms (GA) offer a promising approach for addressing such problems. A GA considers a population of possible solutions to a given problem and evolves it according to mechanisms borrowed from natural genetic evolution: reproduction, crossover, mutation and selection. The criterion for selecting an individual is based on its fitness to the environment or, more precisely, on the quality of the solution it bears. A possible solution is coded as a gene, which is formally the data structure containing the values of the quantities characterizing the solution.
In the framework of the present application, a gene will contain the indicator parameters, for example time horizons and a weighting function for the past, and also the type of operations used to combine them. The fitness function will be based on the return obtained when following the recommendations of a given trading model.

2. The technical trading rule models

In this section, we review the different ingredients that constitute the basis of a trading model and reformulate them in terms of simple quantities that can be used in
conjunction with a genetic algorithm. We first need to specify an appropriate universe of trading rules to which the GA can be applied. Real trading models can be quite complicated and may require many different rules that also depend on the model's own trading history. Here we limit ourselves to simple models that depend essentially on a set of indicators that are pure functions of the price history or of the current return, e.g. Moving Average (MA) rules [10].
Define the following parameters for a simple Moving Average rule:
Short run (term/day): sr
Long run (term/day): lr
Short run average: s, the average trading price over the last sr days;
Long run average: l, the average trading price over the last lr days;
Fix band: x.
A buy alert signal is generated when s - l > x;
A sell alert signal is generated when l - s > x (see Figure 1).

   Figure 1. Moving Average trading rule.

We divide the data set into two parts: an in-sample set (training set) and an out-of-sample set (testing set). In the training set, we find the robust parameters, and we use the same parameters in the testing set to evaluate the result. In this paper, we use one year of data as the training set and the following year of data as the testing set.
We have tested four trading rules: Filter Rules, Moving Average (MA), Support and Resistance, and Channel Break-outs [10].
The evaluation measure used in this paper is the Sharpe Ratio [16], which is defined by $(R_p - R_f) / \sigma_p$, where $R_p$ is the expected portfolio return, $R_f$ is the risk-free rate and $\sigma_p$ is the portfolio standard deviation.
We use the Sharpe Ratio because it considers both return and risk at the same time, and more and more researchers and traders are adopting it.

3. Genetic Algorithms

A Genetic Algorithm [3-9] is a heuristic method for optimization in cases where the extreme of the function (i.e., the minimum or maximum) cannot be established analytically. A population of potential solutions is refined iteratively by employing a strategy inspired by Darwinian evolution or natural selection. Genetic Algorithms promote “survival of the fittest”. This type of heuristic has been applied in many different fields, including construction of neural networks and finance.
We represent the parameters of a trading rule with a one-dimensional vector called a “chromosome”; each element is called a “gene”, and the set of all “chromosomes” is called the “population”. Here, each gene stands for a parameter value; each chromosome is the set of parameters of one trading rule.
Generally, genetic operations include “crossover”, “mutation” and “selection”.
“Crossover” operator. Suppose $S_1 = \{s_{11}, s_{12}, \ldots, s_{1n}\}$ and $S_2 = \{s_{21}, s_{22}, \ldots, s_{2n}\}$ are two chromosomes. Select a random integer $0 \le r \le n$; $S_3$ and $S_4$ are the offspring of crossover($S_1$, $S_2$):
$S_3 = \{s_i \mid \text{if } i \le r,\ s_i \in S_1, \text{ else } s_i \in S_2\}$,
$S_4 = \{s_i \mid \text{if } i \le r,\ s_i \in S_2, \text{ else } s_i \in S_1\}$.
“Mutation” operator. Suppose $S_1 = \{s_{11}, s_{12}, \ldots, s_{1n}\}$ is a chromosome. Select a random integer $0 \le r \le n$; $S_3$ is a mutation of $S_1$:
$S_3 = \{s_i \mid \text{if } i \ne r,\ s_i = s_{1i}, \text{ else } s_i = \mathrm{random}(s_{1i})\}$.
“Selection” operator. Suppose there are m individuals; we select [m/2] individuals and erase the others. The ones we select are the fitter ones, meaning their profits are greater.
Genetic Algorithm.
1. InitializePopulation: Produce a number of individuals randomly; each individual is a chromosome, which is an array of length n, where n is the number of parameters.
2. Test whether one of the stopping criteria (running time, fitness, generations, etc.) holds. If yes, stop the genetic procedure.
3. Selection: Select the better chromosomes, i.e., those whose parameters yield a greater profit.
4. Apply the genetic operators, such as “crossover” and “mutation”, to the selected parents to generate offspring.
5. Recombine the offspring and the current population to form a new population with the “selection” operator.
6. Repeat steps 2-5.
The GA can also be written as follows:
  P ← InitializePopulation();
  While (not stop(P)) do
      Parents[1..2] ← SelectParents(P);
      Offspring[1] ← Crossover(Parents[1]);
      Offspring[2] ← Mutation(Parents[2]);
      P ← Selection(P, Parents[1..2], Offspring[1..2]);
  Endwhile.
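As an illustration only, the genetic operators and loop of Section 3 can be sketched in Python as follows. The sketch assumes real-valued genes with per-gene bounds, a fixed generation count as the only stopping criterion, and a caller-supplied fitness function; the names run_ga, bounds, pop_size and generations are hypothetical and are not taken from the system described in this paper.

```python
import random

def crossover(s1, s2, rng):
    """One-point crossover: genes up to a random cut r come from one parent,
    the rest from the other (0 <= r <= n)."""
    r = rng.randint(0, len(s1))
    return s1[:r] + s2[r:], s2[:r] + s1[r:]

def mutate(s, bounds, rng):
    """Redraw one randomly chosen gene uniformly inside its bounds (assumed)."""
    r = rng.randrange(len(s))
    child = list(s)
    lo, hi = bounds[r]
    child[r] = rng.uniform(lo, hi)
    return child

def select(population, fitness):
    """Keep the fitter half of the population (at least one individual)."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[:max(1, len(ranked) // 2)]

def run_ga(fitness, bounds, pop_size=20, generations=30, seed=0):
    """Hypothetical driver; needs pop_size >= 4 so that two parents exist."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):            # stopping criterion: generations
        parents = select(pop, fitness)
        offspring = []
        while len(parents) + len(offspring) < pop_size:
            a, b = rng.sample(parents, 2)
            c1, c2 = crossover(a, b, rng)
            offspring += [mutate(c1, bounds, rng), c2]
        pop = (parents + offspring)[:pop_size]  # parents survive unchanged
    return max(pop, key=fitness)
```

Because the selected parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next. In the setting of this paper, the fitness argument would be the Sharpe Ratio obtained with the parameters stored in the chromosome.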
To find a robust result, we add a filter to the GA to remove isolated peak points. For each point, we compute the average of its neighborhood points; if its value is far from that average, we discard it. When we search for the fittest point, we also consider its neighborhood points.
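A minimal sketch of such a filter is given below for a single parameter whose fitness has been evaluated on a grid. It is an assumption-laden illustration, not the filter used in the experiments: the neighborhood radius, the threshold max_gap, the one-sided test (only points far above their neighbors are treated as peaks) and all function names are hypothetical.

```python
def neighbor_mean(values, i, radius, exclude):
    """Mean of the values within `radius` of index i, ignoring `exclude`."""
    lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
    picked = [values[j] for j in range(lo, hi) if j not in exclude]
    return sum(picked) / len(picked) if picked else None

def filter_isolated_peaks(values, radius=1, max_gap=2.0):
    """Return the indices that survive: a point is discarded when it lies
    more than max_gap above the average of its neighbors."""
    kept = []
    for i, v in enumerate(values):
        mean = neighbor_mean(values, i, radius, exclude={i})
        if mean is None or v - mean <= max_gap:
            kept.append(i)
    return kept

def robust_best(values, radius=1, max_gap=2.0):
    """Pick the surviving point whose neighborhood (itself plus surviving
    neighbors) has the highest average fitness."""
    kept = filter_isolated_peaks(values, radius, max_gap)
    dropped = set(range(len(values))) - set(kept)
    return max(kept, key=lambda i: neighbor_mean(values, i, radius, dropped))

# Illustrative fitness values containing one isolated spike at index 2.
sharpe = [0.5, 0.7, 28.251, 0.6, 0.8, 1.0, 0.4]
print(filter_isolated_peaks(sharpe))  # the spike at index 2 is removed
print(robust_best(sharpe))            # index 4, whose neighbors are also good
```

A plain arg-max over the same values would return the spike at index 2, which is exactly the kind of point the filter is meant to reject.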
Algorithm 1. Finding the best sub-range for each trading rule
Step 1. For every parameter, set an initial size $s_1$ and a step $t$ for the first parameter sub-range;
Step 2. Compute the Sharpe Ratio with the GA in every sub-range combination;
Step 3. If there is a best robust sub-range, in which all the values are positive and better than in the others, then output the sub-range and finish the algorithm; else go to Step 4.
Step 4. Set a new sub-range size $s_2 = s_1 / 2$ and repeat Steps 2 and 3.
If $s_2 \le t$ (every sub-range then has the minimum size, generally only one value), the algorithm reduces to the ordinary algorithm.
We can use Algorithm 1 to find the robust optimized point for the trading rules.

   Figure 3. The result before optimization. The best points are discrete, so we cannot find a better range. (“+”: positive value; “o”: zero value; “-”: negative value)
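The halving search of Algorithm 1 can be sketched as follows for a single parameter. This is a simplified illustration under stated assumptions: the Sharpe Ratio is evaluated exhaustively on a grid rather than with the GA, sizes and steps are counted in grid points, and "better than in the others" is read as "highest average Sharpe Ratio among the all-positive windows". The function best_subrange and the score callable are hypothetical.

```python
def best_subrange(grid, score, size, step):
    """Return (low, high) of the best all-positive window, or None.

    grid  : sorted candidate parameter values
    score : hypothetical callable, parameter value -> Sharpe Ratio
    size  : initial window size in grid points (s1 in Algorithm 1)
    step  : shift between consecutive windows in grid points (t)
    """
    values = [score(p) for p in grid]
    while True:
        best = None
        for start in range(0, len(grid) - size + 1, step):
            window = values[start:start + size]
            if min(window) > 0:                  # every value must be positive
                mean = sum(window) / size
                if best is None or mean > best[0]:
                    best = (mean, grid[start], grid[start + size - 1])
        if best is not None:
            return best[1], best[2]
        if size == 1:                            # nothing positive at all
            return None
        size = max(1, size // 2)                 # Step 4: halve the sub-range

# Illustrative scores: only the values 4, 5 and 6 form a positive run.
scores = {1: -0.2, 2: 0.1, 3: -0.1, 4: 0.5, 5: 0.6,
          6: 0.4, 7: -0.3, 8: 0.2, 9: -0.1, 10: 0.3}
print(best_subrange(list(range(1, 11)), scores.get, size=4, step=1))
```

With an initial size of 4 no window is entirely positive, so the size is halved to 2 and the window covering the values 4 and 5 is returned. Once the size reaches 1 the search degenerates into picking a single best value, which corresponds to the ordinary algorithm.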

4. Experiments
We show the simple GA and robust GA results in the following figures. The data are real stock prices from the ASX (Australian Stock Exchange). The in-sample data cover one year, from 01/01/2000 to 31/12/2000, and the out-of-sample data cover the year immediately following the in-sample period.
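To make the evaluation concrete, the following sketch computes Moving Average alert signals and the Sharpe Ratio of the resulting returns for one parameter combination. It rests on several assumptions that are not stated in the paper: the rule compares the short-run and long-run averages s and l against the fix band x, an alert is held as a long (+1) or short (-1) position for the next day, returns are simple daily returns, and the risk-free rate defaults to zero. All function names are hypothetical.

```python
import statistics

def ma_signals(prices, sr, lr, x):
    """Return +1 (buy alert), -1 (sell alert) or 0 for each day from
    index lr - 1 onwards, using sr-day and lr-day average prices."""
    signals = []
    for t in range(lr - 1, len(prices)):
        s = sum(prices[t - sr + 1:t + 1]) / sr   # short-run average
        l = sum(prices[t - lr + 1:t + 1]) / lr   # long-run average
        if s - l > x:
            signals.append(1)
        elif l - s > x:
            signals.append(-1)
        else:
            signals.append(0)
    return signals

def strategy_returns(prices, sr, lr, x):
    """Next-day simple returns of holding the position given by each signal."""
    sig = ma_signals(prices, sr, lr, x)
    out = []
    for k in range(len(sig) - 1):
        t = lr - 1 + k
        out.append(sig[k] * (prices[t + 1] - prices[t]) / prices[t])
    return out

def sharpe_ratio(returns, risk_free=0.0):
    """(mean return - risk-free rate) / sample standard deviation.
    Needs at least two returns; a zero deviation yields 0.0 by convention."""
    sd = statistics.stdev(returns)
    return (statistics.mean(returns) - risk_free) / sd if sd > 0 else 0.0
```

To mirror the protocol above, one would choose (sr, lr, x) on the in-sample year and then evaluate sharpe_ratio(strategy_returns(out_of_sample_prices, sr, lr, x)) on the following year without changing the parameters.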

   Figure 2. Before optimization, the algorithm finds the best Sharpe Ratio, 28.251. The normal value lies between -4 and +4, so the value 28.251 is obviously noise, but it is indeed the “best” one mathematically.

   Figure 4. Results before optimization: the system generated only three signals.

The above figures show the results before optimization. They are the best results in terms of mathematical computation, but they do not make sense in terms of financial definitions and applications (the average value is far from the original data, so it cannot be used for prediction). Moreover, some users often want to adjust some values slightly according to their experience, but these algorithms do not give them any suggestions or ranges, so they have to test values at random.
After optimization, the algorithm gives the best parameter combination together with the sub-range in which any value yields a positive profit. Moreover, experienced users can choose a value from the sub-range.

   Figure 7. After optimization, the system generated the alert signals.

In these figures, we find that the “best” individual is not really a reasonable value, whereas the robust algorithm obtains a better result than before optimization. It is a known feature of trading rules that a value that is reasonable in the training set will also be reasonable in the testing set [10]. We have also carried out experiments to confirm this [9].

   Figure 5. The robust parameter combination and ranges. (Fix Band range is [0.006, 0.053], Short run is [4, 5] and Long run is [18, 19].)

   Figure 6. The optimized sub-range, in which every parameter combination gives a positive Sharpe Ratio.

5. Conclusions
In every stock market analysis system, the most important task is to give buy/sell alert signals to the users. The traditional way to do this is technical analysis, which makes predictions from historical data. However, the historical data may contain noisy, random, occasional trends and signals during the trading period, which may mislead the analysis system into producing wrong information or alerts.
In this paper, we have presented an optimization method that obtains a robust result: a sub-range for every parameter instead of a single value. Using this sub-range removes the noisy disturbance.
The sub-range can be obtained in a very short time with the GA, only 10 to 20 seconds, which is acceptable in a real-time trading system.
Another advantage of the new algorithm is that users can adjust the parameter values slightly within the sub-range according to their own experience and still get a positive profit on the out-of-sample data. This idea is being accepted by more and more users and systems.
As a next step, we want to improve the algorithms to obtain the result more quickly and precisely when the data set is huge or there is more than one range. We will also consider optimizing the sizes of the training set and the testing set, so that the new algorithm can find the best size automatically for different data and different stocks.

The authors would like to thank Dr. SC Zhang, JQ Wang, WL Chen and JR Ni for their helpful discussion, and Capital Market CRC [17], SIRCA [18] and AC3 [19] for funding this project and providing large realistic data of ASX for this research. All support and feedback from the finance experts Prof. Mike Aitken of UNSW, Prof. Alex Frino of Sydney University, Australia, and others are much appreciated.


[1] Dacorogna M. M., et al, A geographical model for
     the daily and weekly seasonal volatility in the FX
     market, Journal of International Money and
     Finance, 1993. 12(4), 413-438.
[2] Pictet O. V., Dacorogna M. M. et al, Real-time
     trading models for foreign exchange rates, Neural
     Network World, 1992. 2(6), 713-744.
[3] Allen F., Karjalainen R., Using genetic algorithms to
     find technical trading rules, Working Paper. The
     Rodney L. White Centre for Financial Research, The
     Wharton School, University of Pennsylvania, 1993.
     20-93, 1-17.
[4] Yin X., Germay N., A fast genetic algorithm with
     sharing scheme using cluster analysis methods in
     multimodal function optimization, Proc. Inter. Conf.
     Artificial Neural Nets and Genetic Algorithms,
     Innsbruck, Austria, 1993. 450-457.
[5] Chen, S. H., Genetic Algorithms and Genetic
     Programming in Computational Finance (Boston,
     MA: Kluwer. 2002.)
[6] Neely, C., Weller, P., Dittmar, R., Is Technical
     Analysis in the Foreign Exchange Market
     Profitable? A Genetic Programming Approach.
     Journal of Financial and Quantitative Analysis.
     1997, 32:405-26.
[7] Thomas, J., Sycara, K., The Importance of
     Simplicity and Validation in Genetic Programming
     for Data Mining in Financial Data. Proceedings of
     the Joint AAAI-1999 and GECCO-1999 Workshop
     on Data Mining with Evolutionary Algorithms.
[8] Allen, F., Karjalainen, R., Using Genetic Algorithms
     to Find Technical Trading Rules. Journal of
     Financial Economics. 1999. 51:245-271. Studies
[9] Li, L., Longbing, C., et al, The Applications of
     Genetic Algorithms in Stock Market Data Mining
     Optimisation. The 5th International Conference on
     Data Mining, Text Mining and their Business
     Applications, Malaga, Spain. 2004. 273-280.
[10] Ryan, S., Allan, T., Halbert, W., Data-snooping,
     Technical Trading Rule Performance, and the
     Bootstrap. The Journal of Finance, 1999. 54,
[11] Mitchell, M., An Introduction to Genetic
     Algorithms (MIT Press, Cambridge, USA. 1996.)
[12] Whitley, D., An Overview of Evolutionary
     Algorithm: Practical Issues and Common Pitfalls.
     Information and Software Technology. 2001.43:
[13] So, M.K.P., Lam, K., et al, Forecasting exchange
     rate volatility using autoregressive random variance
     model. Applied Financial Economics, 1999. 9: 583-
[14] Lam, K., Lam, K.C., Forecasting for the generation
     of trading signals in financial markets. Journal of
     Forecasting, 2000, 19: 39-52.
[15] Robert, E., John, M., Technical analysis of stock
     trends (Seventh edition, New York, 1997. page 4.)
[16], 2005
[17], 2005
[18], 2005
[19], 2005
