
Using a Rule-Based Expert System to Solve a Hypothetical Stock Market Problem

Dressler

Abstract: In this paper, I propose a design for a rule-based expert system that attempts to maximize profits while playing a simulated stock market which, unlike the real market, actually contains embedded patterns within the stock price data. This expert system will use linear regression analysis in conjunction with other methods to break down recent stock history trends into graphical models and then use those models to predict future stock prices. Using these predictions, the system can determine which stocks are most likely to yield the highest profit and invest accordingly.

1: Introduction

Number crunching is the digital computer's finest (and perhaps its only) trait. Not long after the computer's invention, the stock market became a target of this useful ability. Many techniques have since been refined to sift through the incomprehensibly large amounts of data needed to predict stock prices. More and more sophisticated systems are being built as people realize that predicting a future stock price is not as simple as looking at a graph of the data and plotting the next point. The truth is that there are countless economic factors that influence the real stock price and they are not always obvious [1].

But let us instead suppose for a moment that patterns were truly embedded within the stock price data rather than being dependent upon external economic factors. Then, if provided a complete history of closing stock prices, perhaps the next stock price could be predicted accurately. It seems natural to use the number-cruncher's ability to do the job, but how exactly might such a system be built and operated?

In addition to there being patterns in the stock price data, let us also assume that we have $25,000 and that we know in advance how many days we have to attempt to make a profit. Suppose also that we are provided with a fixed number of fabricated stocks (with no prior history) and are provided on a daily basis with each stock's closing price. Additionally, at the end of each day, we have the option to buy or sell stocks, where each trade costs $5. Given these rules, how should we then proceed to maximize our profit?

A rule-based expert system is a natural choice. It allows a system to incorporate expert-level knowledge without having to learn anything. Many rival systems would implement neural networks or genetic algorithms, but given the particulars of our experimental situation, rule-based systems have an enormous advantage over these other systems: they do not need to learn how to recognize patterns. This means that even with a very short stock history, we can begin our investments because the ability to recognize patterns is already built into the system as part of its expertise.

Given the problem at hand and the advantages of the rule-based expert system, this paper will explore a variety of possible solutions, assess their pertinence, narrow down an approach, and then finally lay out the details for a system that will solve the problem.

2: Rule Standardizing

In developing a stock trading expert system, a large part of the challenge is choosing which rules to use, especially when the rules may come from very different prediction environments. The knowledge engineer must then tackle the problem of standardizing these rules in such a way that they may all interact with one another in a single rule-base.

One particular system used 350 trading rules where each rule was manipulated to return -1 to indicate to sell, 0 to indicate to do nothing, and 1 to indicate to buy [2]. Many of its other rules returned values falling between -1 and 1 to indicate some degree of confidence in favor of one direction or the other [2]. The Stochastic Oscillator, Ease of Movement, and Relative Strength Index indicators were each independently refined to return values appropriate to the [-1, 1] scale and were subsequently incorporated into the rule-base [2]. So, even though the system incorporated different indicators, each of them was standardized such that its rules produced values from -1 to 1.

Note that scaling the output using a methodology such as this is important. It puts all of the rules on the same playing field and allows them to interact with one another's output even though the rules may have been derived from very different environments. This concept will play a key role in developing the final solution.

3: Playing a Make-Believe Stock Market

Note that our make-believe stock market is in a way very unlike the real stock market problem. In our problem, stock prices are not influenced by external factors such as quarterly earnings, projected profits, general market trends, and dividend times. The problem is therefore no longer finding the pattern between stock prices, the value of the company, and the surrounding economy. The patterns are instead embedded directly within the provided stock-price data. We only need to know how to intelligently sift through it so as to maximize profits.

One challenge in creating such a stock trading system is finding rules that rely solely on stock price. In one case, an expert system estimated the direction in which the stock price was moving by fitting the past five days' worth of stock prices to a line using a linear regression technique [3]. Once an equation was computed, the slope of the line was used to extrapolate the expected future price [3]. At first this seems like a very powerful technique, one capable of predicting the stock price on its own. However, the resulting line equation rarely fits the data perfectly, so one important variable here is the measure of error in how well the line fits the data. This will become more important when discussing the final solution to the problem.

Another pertinent system, which used stock prices as its primary means for judging which stocks to buy and sell, used Granville's Law as its expertise [4]. As in the previous system, graphical models representing the recent stock data are used. One of the main differences is that this system relies heavily upon a moving average line (MAL). Using both a stock's moving end price line (MEL) and its MAL, a stock's graph is broken down into a sort of simple grammar that describes how the MAL compares to the MEL and also describes their trends as a whole [4]. For example, mal(d(2),o,i(1),b) would mean "The MAL and MEL are decreasing at a rate of 2 (where the MAL is over the MEL) and then increasing at a rate of 1 (where the MAL is below the MEL)" [4]. Once the stock price is represented using such a grammar, rules such as the following can be applied:

IF mal(d(-),o, c, -, i(3), b) THEN buy

This is a particularly compelling example because it shows how graphical models can be represented without using formulas.

4: Rule-Based/Genetic Hybrids

Rule-based expert systems also offer another huge advantage: flexibility. They allow us to manipulate with ease the heuristics by which a system makes its decision to buy or sell certain make-believe stocks. If the rules are too strict or too relaxed, they can be tweaked at will to optimize gains. If certain rules do not perform well or other rules might result in higher profit, they can be removed from or added to the rule-base with little effort.

Here lies a particularly interesting piece of the puzzle. Such subtle tweaks over time to a large rule-base can last indefinitely if the goal is to achieve an optimized rule-base, which is why other systems have been developed that incorporate genetic algorithms. One system uses genetic algorithms in conjunction with an expert system to select the good rules using historical data [1]. Instead of manually tweaking the system's underlying rule-base over time, all possible rules are available in a rule-base [1]. From this rule-base, subset rule-bases are selected and incorporated iteratively into the system [1]. Each new rule set is then analyzed for performance against the historical data [1]. Using this evolutionary system, rule subsets are both created and tested programmatically [1]. The advantage here is that there is no ambiguity as to which rules need to be added or removed to improve the performance. Using brute force, the system continues to test for a superior rule subset.

The above approach may be a bit radical in that each rule is either accepted or rejected as being part of the system; there is no middle ground. An entirely different genetic/rule-based hybrid approach assigned weights to each rule denoting its importance [2, 5]. The resulting weighted rule-set was referred to as an expert and was generated using a genetic algorithm [2]. At runtime, the system would consult various experts and judge each expert's performance based on certain criteria [2]. Once an expert best suited to the stocks available at the time was selected, the expert was used to pick which stocks to buy and sell [2].

When genetic algorithms are hybridized with expert systems, the goal is frequently to programmatically generate sub-rule-sets and evaluate their performance. The advantage here is that the computer does all the number crunching. The disadvantage, based on the previous two approaches, is that these genetic approaches do not manipulate the rules in any kind of intelligent way. If a specific rule is undervalued or uninformed, there is nothing in the system to improve the rule itself or create completely new ones. The system can only improve its performance by weakening a rule's importance or by removing it entirely.

5: The Solution

The perfect strategy for solving our imaginary stock market problem would seem to involve fitting a formula to the plotted points representing the stock price and then using the resulting formula to extrapolate the future stock price. Using this projected price, a system could then invest all funds on a daily basis into the stock that is predicted to yield the highest profit for that day.

5.1: Using Quadratic Regression Analysis

Since this is clearly an advantageous approach, this system will use quadratic regression analysis on the most recent ten days or so of stock price data to derive a formula that may be used to predict how the stock price will behave one step into the future.

Although this is an extraordinarily attractive approach, there are serious disadvantages. Fitting a graph to data is not a trivial task and the results tend to be less than perfect. For example, figure 0 demonstrates what happens when quadratic regression analysis is used to attempt to fit a formula to three points that fall on a line. Not only does the resulting function contain a great deal of unnecessary curvature, but it also does not fit the points perfectly (not obvious in figure 0).

Figure 0. Quadratic Regression Analysis poorly fits three points that fall on a line.

It also takes a lot of computing power to fit a function to a complex series of points using polynomial regression. The more inflection points there are in the best-fitting function, the more sophisticated the resulting polynomial function must be and the longer it takes to compute a good fit.

Note that the use of quadratic regression analysis means that the resulting formula that fits the data will quickly move toward either negative or positive infinity (depending on the sign of its leading coefficient) after the most current stock price. Naturally, projecting the price an entire day into the future using the resulting formula may produce an erroneous result. Therefore, the slope of the tangent line at the stock's most recent closing price will be used. The system will assume that the next stock price will fall on this line.

Figure 0.1. Using the tangent line from the last stock price to predict the next stock price.

5.2: Two Supplemental Systems

To compensate for the many disadvantages of quadratic regression, the system will also utilize two other techniques for predicting future stock prices. The first technique will attempt to detect any approximately linear relationships among the points comprising the recent stock data graph (line fitting). The second technique will attempt to recognize and take advantage of any radical, but predictable, directional changes in the stock price (zero-sloped tangent lines and asymptotes) as the data approaches the current stock price.

5.2.1: Linear Relationship Detection

As previously discussed, the system will incorporate a mechanism for recognizing stock prices that fall more or less on a line. Linear regression analysis (or a similar procedure) will be used to fit each stock's recent price history graph to a line. Once the stock price values are fitted to a line, the expert system can judge what to do with the stock based on the line's slope.

This at first seems like a complete technique for creating such a system. The problem is that the line that best fits the points is not always a good fit. One of the results of attempting to fit a line to a set of points is some value denoting the degree of error in the fit. Figure 1 demonstrates that linear regression analysis will find a good fit when the stock price is increasing or decreasing at a relatively constant rate. Conversely, this technique will not find a good fit when there is a great deal of fluctuation in the increase/decrease of stock price, as in figure 2.

Figure 1. Fitting a line to linear stocks.

Figure 2. Fitting lines to non-linear graphs.

It would seem then that linear regression analysis would be very good only for evaluating stocks that show an overall linear increase/decrease in stock price. However, this is not necessarily true. If the system is modified to fit the line to only a small subset of the points, such as those comprising only the last five days [3] as in figure 3, then linear regression analysis may still allow for a close fit to the data even though there may be many inflection points in the data.

Note, however, that as the system selects fewer points, it also makes the fit more volatile. Although there is a good fit in figure 3, the line does not foresee the upcoming increase in stock price. In fact, in figure 4, when the system should be most inclined to buy, it actually begins to compute a large amount of error in the fit. This is one of the unavoidable weaknesses of using linear regression analysis.

Figure 3. Shortening the history.

Figure 4. Predicting a direction change.

5.2.2: Flexible Rule Representation For Graphs

The best way to compensate for the weaknesses of linear regression analysis is to use an alternative approach for when linear regression analysis yields too high a fitting error. Just as one system previously mentioned used a grammar to represent the graphed stock prices [4], so shall this solution.

The primary problem with linear regression analysis is its inability to detect radical changes in direction of the graphed stock prices. To address this problem, suppose the system concentrates on the most recent five points in the stock's history, P1, P2, P3, P4, and P5, where P5 is the current end price. Then, the system computes the slope between each pair of adjacent points to create S1, S2, S3, and S4. Now, suppose that S1 is used only for comparison purposes and S1 is compared to S2. If S2 minus S1 is positive, then the system denotes the difference with a P. If the difference is zero, then the points fall on a straight line and the system denotes the difference with an O. If the difference is negative, the system denotes the difference with an N. The system will then compare each of the adjacent slopes one by one and save off their corresponding P, O, or N. Finally, once all the slopes are compared, the system will concatenate the three resulting values to convey the trend of the points in a fairly straightforward and readable manner. PPP, for example, means that the stock price is increasing beyond a linear rate, whereas NNN would mean that the stock price is decreasing beyond a linear rate.

Figure 5. Detecting an increase.

Note that a progressive increase in the slopes alone is inadequate to determine whether to buy or sell. As shown in figure 6, any five adjacent points along the curve will be encoded by this algorithm as PPP. However, it is only at the bottom few points that the system should actually perform an action, namely to buy.

Figure 6. Any adjacent 5 points will form an increase.

There therefore has to be another factor to help determine whether to buy, hold, or sell. Once again, the system shall use the most recent five points and compare each adjacent pair, but this time it will compare their y values to one another. D will be used to indicate a decrease in price, C will be used to indicate no change in price, and I will be used to indicate an increase in price. Therefore, the middle five points of figure 6 would be denoted as DDII.

So, just as Granville's Law was represented using a series of letters that formed a linguistic representation of a graph [4], this system will use slope comparisons and y-value comparisons to form a flexible rule set that will be integrated into the rule-base as the antecedents. For example, the points from figure 7 would be represented as DPDPIPI.

Figure 7. A rule representation of the graph.

Rules created using this sort of grammatical representation might look like this:

IF GRAPH_DPDPIPI then BUY

If desired, the system will accept varying numbers of points by lengthening or shortening the pattern of rules like this:

IF GRAPH_DPIPI then BUY

Furthermore, the rules may even exclude either the slope-comparison values or the y-value comparison values like this:

IF GRAPH_NNN then SELL

This sort of rule representation for specifying a graph is incredibly powerful. It allows individuals who modify the rule-base to specify all sorts of unforeseen graphs (including asymptotes and zero-sloped tangent lines) that would be matched with potentially very poorly fitting lines if plugged into a linear regression analysis.

5.3: The Rule Set

The starting rules in the system shall look like this:

IF FIT_LVL_3 and SLOPE_LVL_POS_3 then STRONG_BUY
IF FIT_LVL_2 and SLOPE_LVL_POS_3 then BUY
IF FIT_LVL_3 and SLOPE_LVL_POS_2 then STRONG_BUY
IF FIT_LVL_3 and SLOPE_LVL_POS_1 then BUY
IF FIT_LVL_3 and SLOPE_LVL_CONST then SELL
IF FIT_LVL_3 and SLOPE_LVL_NEG_3 then SELL
IF FIT_LVL_3 and SLOPE_LVL_NEG_2 then SELL
IF FIT_LVL_3 and SLOPE_LVL_NEG_1 then SELL
IF FIT_LVL_2 and SLOPE_LVL_NEG_3 then SELL
IF FIT_LVL_2 and SLOPE_LVL_NEG_2 then SELL
IF FIT_LVL_2 and SLOPE_LVL_NEG_1 then SELL
IF GRAPH_IPIPI then BUY
IF GRAPH_CPIPI then BUY
IF GRAPH_ININD then SELL
IF GRAPH_INDND then SELL
IF GRAPH_INCND then SELL
IF GRAPH_IPDND then SELL
IF GRAPH_IPCND then SELL
IF GRAPH_DNDND then SELL
IF GRAPH_DCDCD then SELL

Here, FIT_LVL_0, FIT_LVL_1, FIT_LVL_2, and FIT_LVL_3 denote how successful the linear regression analysis was at fitting a line to the data, from worst to best. Also, SLOPE_LVL_NEG_3, SLOPE_LVL_NEG_2, SLOPE_LVL_NEG_1, SLOPE_LVL_CONST, SLOPE_LVL_POS_1, SLOPE_LVL_POS_2, and SLOPE_LVL_POS_3 denote the slope of the fitted line, from most negative to most positive.

Observe that these rules are very much weighted towards selling in order to avoid taking losses. On a SELL, the system sells everything that has been invested in the stock. On a BUY, however, the expert system uses only a portion of all available funds to invest in the stock. Clearly, much more will be invested on a STRONG_BUY than on a BUY.

6: Conclusion

In order to address the stock market problem, most expert systems use a hybrid approach. However, due to the particulars of our problem, namely finding patterns in stock price data, a hybrid route is not necessary and may even interfere with the results.

Instead, this system uses a standard rule-based expert system with standardized rules among its two major subsystems: linear regression analysis and grammatical graph representation.

Using these two critical systems, stock price trends can be predicted and exploited without the use of multiple regression analysis. The resulting system is extremely flexible, scalable, and robust.

Since this system has not yet been constructed, the results are unavailable at this time.

7: References

[1] Lam, S.S. "A genetic fuzzy expert system for stock market timing." Proceedings of the 2001 Congress on Evolutionary Computation, Volume 1, 27-30 May 2001, pp. 410-417.
[2] Korczak, J.J.; Lipinski, P. "Evolutionary building of stock trading experts in a real-time system." Congress on Evolutionary Computation, 2004 (CEC2004), Volume 1, 19-23 June 2004, pp. 940-947.
[3] Simutis, R. "Fuzzy logic based stock trading system." Proceedings of the IEEE/IAFE/INFORMS 2000 Conference on Computational Intelligence for Financial Engineering (CIFEr), 26-28 March 2000, pp. 19-21.
[4] Yamaguchi, T.; Tachibana, Y. "A technical analysis expert system with knowledge refinement mechanism." Proceedings of the First International Conference on Artificial Intelligence on Wall Street, 9-11 Oct. 1991, pp. 86-91.
[5] Zargham, M.R.; Sayeh, M.R. "A Web-based information system for stock selection and evaluation." International Conference on Advance Issues of E-Commerce and Web-Based Information Systems (WECWIS), 8-9 April 1999, pp. 81-83.
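
Since the system has not yet been constructed, the following is only a minimal Python sketch of the two supplemental subsystems described in the paper, not a definitive implementation. It fits a least-squares line to recent closing prices, reporting the root-mean-square error as the measure of fit, and it encodes recent prices with the y-value letters D, C, I interleaved with the slope-comparison letters N, O, P. The function names, the tolerance `tol`, the HOLD default, and the choice to match rules against only the last four closing prices are assumptions made for illustration. The rule table copies a few of the GRAPH rules from the starting rule set.

```python
from typing import Dict, Sequence, Tuple


def fit_line(prices: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares line through (0, p0), (1, p1), ...

    Returns (slope, intercept, rmse); rmse is the fit-error measure.
    """
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two prices to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(prices) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(prices))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sq_err = sum((intercept + slope * x - y) ** 2 for x, y in enumerate(prices))
    return slope, intercept, (sq_err / n) ** 0.5


def _letter(delta: float, neg: str, zero: str, pos: str, tol: float) -> str:
    """Map a signed difference to one of three letters."""
    if delta > tol:
        return pos
    if delta < -tol:
        return neg
    return zero


def encode_graph(prices: Sequence[float], tol: float = 1e-9) -> str:
    """Encode prices as y-value letters interleaved with slope-change letters.

    With one day between points, the slope between adjacent points is just
    the price difference. Five points give four y letters and three slope
    letters, e.g. DPDPIPI.
    """
    steps = [b - a for a, b in zip(prices, prices[1:])]
    y_letters = [_letter(s, "D", "C", "I", tol) for s in steps]
    s_letters = [_letter(b - a, "N", "O", "P", tol)
                 for a, b in zip(steps, steps[1:])]
    out = []
    for i, y in enumerate(y_letters):
        out.append(y)
        if i < len(s_letters):
            out.append(s_letters[i])
    return "".join(out)


# A few of the starting GRAPH rules from the paper (pattern -> action).
GRAPH_RULES: Dict[str, str] = {
    "IPIPI": "BUY",
    "CPIPI": "BUY",
    "ININD": "SELL",
    "INDND": "SELL",
    "DNDND": "SELL",
}


def decide(prices: Sequence[float]) -> str:
    """Look up the pattern of the last four prices; HOLD is an assumed default."""
    if len(prices) < 4:
        return "HOLD"
    return GRAPH_RULES.get(encode_graph(prices[-4:]), "HOLD")
```

For example, `encode_graph([10, 7, 6, 7, 10])` yields `DPDPIPI`, the pattern the paper gives for figure 7, and `decide([5, 6, 8, 11])` matches the rule IF GRAPH_IPIPI then BUY. Mapping the fit error and slope to the FIT_LVL and SLOPE_LVL levels is left out because the paper does not specify the thresholds.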

Description:
College Final Paper for Artificial Intelligence 400 level class
