PART A: FORECASTING REAL-WORLD DATASETS WITH SPREADSHEET SOFTWARE

Forecasting is an important concept for stock investors, since they want to invest in the stock with the highest expected gain. In this part of the project, a 20-day forecast for 3 different stocks is carried out using different time series forecasting methods. The details of the methods are included in the Appendix. The given datasets contain 89 daily closing prices of Microsoft (MS), General Electric (GE) and Intel stocks from 7/12/1999 to 11/12/1999. However, only the first 69 values are used for the forecast analysis of the following 20 days; the data for the last 20 days are used to calculate the error and analyze the sensitivity of the forecast. Using "actual value vs. forecasted value" graphs and MAD, MAPE, MSE and RMSE calculations for each method applied to all three datasets, the 7 different time series forecasting methods are compared and their performance is tested. Next, the seasonality estimates of the stocks are compared on a line graph, and control charts for the first 69 days are developed to assess the quality of the forecast analysis. Lastly, the correlation between the stocks is calculated by developing associative regression models.

1.1. Microsoft

1.1.1. Naïve Forecast

The naïve forecast is the simplest and easiest of the 7 methods that will be carried out. Only the 2 previous data points are necessary to make a prediction for the following day. Since in this project the actual values of the last 20 days are supposed to be unknown, the forecasted values are used for the forecasts after the 70th day. Only for the first forecast (18/10/1999) are the actual closing prices of the 2 previous days known. Figure 1 shows the relation between the given data and the forecasted values. For further information on the formula, see Appendix A.
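The project itself was done in Excel; as an illustration (our addition), the naïve update can be sketched in Python. The report says the 2 previous data points are used, which we read as a trend-adjusted naïve rule; the plain rule F(t) = A(t-1) is shown for comparison. The function names are our own.

```python
def naive(history):
    """Plain naive rule: the next forecast repeats the latest value."""
    return history[-1]

def naive_trend(history):
    """Trend-adjusted naive rule using the 2 previous data points,
    one reading of the method described in the text:
    F(t) = A(t-1) + (A(t-1) - A(t-2))."""
    return history[-1] + (history[-1] - history[-2])

def forecast_ahead(history, step, horizon=20):
    """After day 69 the actuals are treated as unknown, so each new
    forecast is fed back in as if it were an actual value, as in the
    report."""
    values = list(history)
    out = []
    for _ in range(horizon):
        f = step(values)
        out.append(f)
        values.append(f)
    return out

print(forecast_ahead([92.0, 93.5, 91.8], naive, horizon=3))
print(forecast_ahead([92.0, 93.5, 91.8], naive_trend, horizon=3))
```

Note that when forecasts are fed back in, the plain naïve rule repeats the last known value forever, while the trend-adjusted rule extrapolates a straight line; neither can track the rises and falls of the unseen actuals, which is exactly the weakness discussed below.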
ENS 108, Fall 2006, Project 01. Dicle Evliyaoğlu, Deniz Say.

Figure 1
Figure 2 – Actual & forecasted values and error calculations

In figure 2, the values in the pink column are the actual values and the ones in the light turquoise column are the forecasted values. The orange cells show the errors for the last 20 days. The MAD, MSE, MAPE and RMSE values are calculated using these data, and they are presented in the same way for all the forecast methods in this project. See Appendix B for details on these calculations.

Figure 3 – Actual values & forecast graphs

In this chart, it is obvious that the naïve method is not suitable for a 20-day forecast, although it is reasonable for the first 20 days, for which the actual values are given.

1.1.2. Moving Average (n=3)

The moving average method is also based on the previous actual values, but it doesn't merely trace the actual data like the naïve method; it also smooths the series by using a longer range of data. In this project, the number of data points (n) is taken as 3. See Appendix C for details.

Figure 4

In figure 4, it can be seen that the forecasted value depends on the 3 previous values. For the last 20 days, previous forecast values are used instead of actual values, since the actual values are supposed to be unknown.

Figure 5 – Actual & forecasted values and error calculations
Figure 6 – Actual vs. forecast graph

In this chart, it can be seen that the moving average method gives smoother values than the naïve method, but the forecast values still lack accuracy without actual values. The smoothness of the forecast increases when the number of periods (n) increases, but consequently the forecast becomes less responsive to real changes.
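A quick sketch (our addition) of the moving average just described, together with the weighted variant treated in the next section; the example weights here are illustrative only, not the values found by trial and error in the report:

```python
def moving_average(history, n=3):
    """Simple moving average: mean of the n most recent values."""
    return sum(history[-n:]) / n

def weighted_moving_average(history, weights):
    """Weighted moving average: more recent values get larger weights.

    `weights` are ordered oldest-to-newest; per the Solver constraints
    described in the report, they should each lie in [0, 1] and sum to
    at most 1.
    """
    recent = history[-len(weights):]
    return sum(w * a for w, a in zip(weights, recent))

prices = [90.0, 92.0, 91.0]
print(moving_average(prices, n=3))                      # mean of the 3 values
print(weighted_moving_average(prices, [0.2, 0.3, 0.5]))  # newest weighted most
```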
1.1.3. Weighted Moving Average (n=3)

The weakness of the moving average is that it gives the same weight to all values. In the weighted moving average method this problem is fixed by giving more weight to the more recent values in the series. See Appendix D for the formula. In order to find the optimum weight values, the Excel add-in Solver was used: we tried to minimize the RMSE (root mean square error) calculated for the first 69 days (for which the actual values were used to forecast), under the constraints that the sum of all weights must be smaller than or equal to 1 and each weight must have a value between 0 and 1. Excel Solver didn't work properly and was indifferent to the given weights, so the optimum weights were found through trial and error, by observing the change in RMSE with different weights.

Figure 7
Figure 8 – Using Solver to minimize RMSE
Figure 9 – Actual & forecasted values and error calculations
Figure 10 – Actual vs. forecast graphs

In this graph, it can be seen that the weighted moving average method gives values closer to the actual values, but there is still a lack of response to real changes. For the last 20 days (where the forecast was based on previous forecast values), none of the rises and falls of the actual values can be observed.

1.1.4. Exponential Smoothing

Exponential smoothing goes one step beyond the weighted moving average by adding a percentage of the forecasting error to each new forecast. See Appendix E for details.

Figure 11

The yellow cell in figure 11 shows the starting forecast. In this project, the actual value of the first day is taken as the starting forecast, but the average of the first 3 actual values or the naïve approach might also be used.
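The exponential smoothing update can be sketched as follows (our addition; Appendix E is not reproduced here, so this assumes the standard recursion F(t+1) = F(t) + α(A(t) - F(t)) with the first actual value as the starting forecast, as described above):

```python
def ses_forecasts(history, alpha):
    """Simple exponential smoothing.

    Returns the one-step-ahead forecast for every period, plus one
    final entry: the forecast for the period after the data ends.
    The starting forecast is the first actual value, as in the report.
    """
    f = history[0]
    forecasts = []
    for a in history:
        forecasts.append(f)
        f = f + alpha * (a - f)   # add a fraction of the forecast error
    return forecasts + [f]

print(ses_forecasts([90.0, 92.0, 91.0], alpha=0.5))
```

A large α makes the forecast responsive but jumpy; a small α makes it smooth but sluggish, which is the trade-off Solver is asked to balance below.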
Solver was used to calculate the optimum value of the smoothing constant α, which minimizes the RMSE and gives a forecast with the desired smoothness and response to real changes. The RMSE mentioned above is the RMSE of the first 69 days, for which the actual values are supposed to be known and used to perform the forecast.

Figure 12 – Using Solver to minimize the RMSE of the first 69 days
Figure 13 – Actual & forecasted values and error calculations
Figure 14 – Actual values & exponential smoothing graphs

Here, in figure 14, it can be seen once more that the forecasted values for the last 20 days don't show any of the rises and falls of the actual values. That is because previous forecast values were used to forecast the following period. However, the first 5 forecasts (06.Oct.1999 – 16.Oct.99) are close to the actual values and they also respond well to real changes.

1.1.5. Holt's Exponential Smoothing

In Holt's exponential smoothing method there is an additional smoothing constant, β, which is used to calculate another component of the forecast, the trend Tt. See Appendix F for details. Solver is used to minimize the RMSE of the first 69 days in order to find the optimum values for α and β, under the constraint that α and β must have values between 0 and 1. In figure 15, it can be seen that Solver found the solutions α = 0.942545 and β = 0. Since β = 0, Tt = 0 for all periods. The yellow cell in figure 16 shows the initial value of Tt, which is reasonably estimated to be 0.
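Holt's recursion can be sketched like this (our addition; since Appendix F is not reproduced here, this assumes the standard level-plus-trend form, with the initial trend taken as 0 as in the text):

```python
def holt(history, alpha, beta):
    """Holt's exponential smoothing: level F(t) plus trend T(t).

    Returns the one-step-ahead forecast H = F + T for the period
    after the data ends. Initialization follows the report: the level
    starts at the first actual value and the trend at 0.
    """
    level, trend = history[0], 0.0
    for a in history[1:]:
        prev_level = level
        level = alpha * a + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend

print(holt([90.0, 92.0, 91.0, 93.0], alpha=0.9, beta=0.1))
```

Note that with β = 0 the trend never moves from its initial value of 0 and the method collapses to simple exponential smoothing, which is why Solver's result β = 0 gives Tt = 0 for all periods.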
Figure 15 – Using Solver to minimize RMSE and to find the optimum values for alpha and beta
Figure 16 – The relations between alpha, beta, closing prices and Ft, Tt and Ht
Figure 17

Since the actual values are unknown for the last 20 periods, it isn't possible to calculate the Ft and Tt values for these periods. Instead, the last (69th) values of these components are used to calculate Ht. Strictly speaking, T69 is multiplied by the period number for these last 20 periods, but since T69 = 0 this doesn't make a difference, and it can be seen in figure 17 that Holt's forecast value is constant for the last 20 days. Consequently, in figure 19, the forecasted values for the last 20 days form a straight line. It should be noticed, however, that in Holt's method these values are almost the average of the actual values; in other words, they are closer to the rises and falls. It can be said that Holt's exponential smoothing is the most responsive to real changes of the methods used so far.

Figure 18 – Actual & forecasted values and error calculations
Figure 19 – Actual values & Holt's exponential smoothing graphs

1.1.6. Linear Regression

In the linear regression method, we try to find a linear equation for which the sum of squared errors is minimized. See Appendix G for details.

Figure 20 – The precedents of the slope b
Figure 21 – The precedents of the y-intercept a
Figure 22 – The precedents of F(t) = a + b·t
Figure 23 – The relation of a, b and the actual values with the forecasted values

In figure 23, it can be seen that the a and b values are constant for the last 19 days.
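The a and b in figure 23 are the ordinary least-squares coefficients. As a sketch (our addition, using the textbook closed-form formulas rather than the spreadsheet's cell layout):

```python
def linear_fit(ys):
    """Least-squares fit F(t) = a + b*t over periods t = 1..n.

    Returns the intercept a and slope b that minimize the sum of
    squared errors, corresponding to the spreadsheet's a and b cells.
    """
    n = len(ys)
    ts = range(1, n + 1)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    b = (sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
         / sum((t - mean_t) ** 2 for t in ts))
    a = mean_y - b * mean_t
    return a, b

a, b = linear_fit([1.0, 3.0, 5.0, 7.0])   # a perfect line: y = -1 + 2t
print(a, b)
```

The same least-squares fit, with one stock's prices as x instead of the period number t, produces the y(x) = ax + b lines used in the correlation analysis of section 4.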
When the actual values are known, the a and b of the second previous period are used. However, the actual values for the last 20 days are assumed to be unknown; that is why no new a and b values are calculated for these periods. Hence, the latest a and b values, shown in the light blue cells, are used in forecasting the last 19 days.

Figure 24 – Actual & forecasted values and error calculations
Figure 25 – Actual values & linear regression graphs

In figure 25, small changes in the forecasted values corresponding to real changes can be observed, but for the last 19 days, for which the a and b values are fixed, there is a straight line. This shows once again the difference in accuracy without information on actual values.

1.1.7. Winter's Exponential Smoothing

In Winter's exponential smoothing method, there is an additional component in the calculation of the forecasted value: the seasonality St. Seasonality helps to integrate periodic changes into the forecast. Another constant, gamma (γ), is used for calculating the seasonality. See Appendix H for detailed formulas. Figure 26 is a screenshot of Solver being used to calculate the optimum alpha, beta and gamma values that minimize the RMSE for the first 69 days, but it didn't work properly; instead it kept the initial values of these 3 constants. Therefore, we determined the optimum values through trial and error.

Figure 26 – Using Solver to minimize the RMSE of the first 69 days to calculate the optimum alpha, beta and gamma values
Figure 27a – The precedents of Ft and St

In figure 27a, the initial values of Ft and St are shown in yellow cells. The initial value of Ft is simply the average of the actual values of the previous 5 days. The number of days was chosen to be 5 on purpose: a season is made up of 5 days in our example (the stock market).
The initial St values are calculated using the formula:

St = At / [(A1 + A2 + A3 + A4 + A5) / 5]

Again, the reason for taking the average of the first 5 actual values is that 5 days are considered as 1 season for the stock market.

Figure 27b – The precedents of Tt and Wt

In figure 27b, the initial trend value Tt, shown in the yellow cell, is taken as 0, since it is assumed that there is no trend at the beginning. For calculating Wt, the corresponding St value is the seasonality value of the period (t - p), where p is the number of days that make up a season. For the stock market p = 5, since the stock market operates only on weekdays.

Figure 28 – Actual & forecasted values and error calculations
Figure 29 – Actual values & Winter's exponential smoothing graphs

In figure 28, it can be seen that for the last 20 days the error of the forecast grows larger and the forecast line has a downward slope. There is also a periodic variation after 18.10.99; these cycles are caused by the seasonality component used in Winter's method. Although Winter's method is the most complex method we have examined, surprisingly it isn't any more accurate than the simpler methods. It would, however, have been more responsive to real changes if there were really a seasonal change in the closing prices rather than random variations such as the sudden jump on 21.10.99.

For the following 2 datasets (Intel and GE) the same procedures are applied. Therefore, explanations are only given when there is a particular result.

1.2. Intel

1.2.1. Naïve Method

Figure 30
Figure 31 – Actual & forecasted values and error calculations
Figure 32 – Actual values & naïve forecast graphs
1.2.2. Moving Average (n=3)

Figure 33
Figure 34 – Actual & forecasted values and error calculations
Figure 35 – Actual values & moving average graphs

1.2.3. Weighted Moving Average

Figure 36 – Using Solver to minimize the RMSE of the first 69 days and to find the optimum weight values
Figure 37 – The calculation of the weighted moving average
Figure 38 – Actual & forecasted values and error calculations
Figure 39 – Actual values & weighted moving average graphs

1.2.4. Exponential Smoothing

Figure 40 – Using Solver to minimize the RMSE of the first 69 days and to find the optimum alpha value
Figure 41
Figure 42 – Actual & forecasted values and error calculations
Figure 43 – Actual values & exponential smoothing graphs

1.2.5. Holt's Exponential Smoothing

Figure 44
Figure 45 – Actual & forecasted values and error calculations
Figure 46 – Actual values & Holt's exponential smoothing graphs

1.2.6. Linear Regression

Figure 47 – Calculation of a and b
Figure 48 – Calculation of the forecasted value, given actual values
Figure 49 – Actual & forecasted values and error calculations
Figure 50 – Actual values & linear regression graphs
1.2.7. Winter's Exponential Smoothing

Figure 51 – Using Solver to minimize the RMSE of the first 69 days to find the optimum alpha, beta and gamma values
Figure 52 – Initial Ft and St values
Figure 53 – Tt and St calculation
Figure 54 – Wt calculation
Figure 55 – Actual & forecasted values and error calculations
Figure 56 – Actual values & Winter's exponential smoothing graphs for the last 40 days

1.3. General Electric (GE)

1.3.1. Naïve Method

Figure 57 – Actual & forecasted values and error calculations
Figure 58 – Actual values & naïve forecasting graphs

1.3.2. Moving Average (n=3)

Figure 59 – Actual & forecasted values and error calculations
Figure 60 – Actual values & moving average graphs for the last 40 days

1.3.3. Weighted Moving Average (n=3)

Figure 61 – Actual & forecasted values and error calculations
Figure 62 – Actual values & weighted moving average graphs

1.3.4. Exponential Smoothing

Figure 63 – Using Solver to minimize the RMSE of the first 69 days to get the optimum alpha value
Figure 64 – Actual & forecasted values and error calculations
Figure 65 – Actual values & exponential smoothing graphs

1.3.5. Holt's Exponential Smoothing

Figure 66 – Using Solver to minimize the RMSE of the first 69 days to get the optimum alpha and beta values, and the calculation of Ft, Tt and Ht
Figure 67 – Actual & forecasted values and error calculations for the last 20 days
Figure 68 – Actual values & Holt's exponential smoothing graphs
1.3.6. Linear Regression

Figure 69 – Calculation of the forecasted values with unknown actual values (a and b are constant)
Figure 70 – Actual & forecasted values and error calculations for the last 20 days
Figure 71 – Actual values & linear regression graphs

1.3.7. Winter's Exponential Smoothing

Figure 72 – Calculation of Ft, Tt, St and Wt
Figure 73 – Actual & forecasted values and error calculations for the last 20 days
Figure 74 – Actual values & Winter's exponential smoothing graphs

2. Seasonality

Seasonality refers to regular, repeating variations. In our stock market example, the seasonal variations are expected to occur every 5 days, because the stock market works in 5-day periods. The seasonality values have already been calculated for Winter's exponential smoothing method; those values are used in this part. The seasonality graphs of the 3 companies are compared in order to see the relation between them. Taking the seasonalities for a single 5-day period is enough to make this comparison. In figure 75, the seasonality values between 11.10.99 and 15.10.99 are taken for the 3 companies, but the values don't actually have to be taken from the same dates, as long as the order of the days corresponds across companies (i.e. if the first St is taken from the first day of the 5-day period for one company, it should be taken from the first day for the other companies, the second from the second day, and so on). The dates don't matter, because the seasonal changes are supposed to be the same for every 5-day period (in fact, this is the definition of seasonality).
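The Winter's recursion that produces these seasonality values can be sketched as follows (our addition). The exact formulas are in Appendix H, which is not reproduced here, so this assumes the standard additive-trend, multiplicative-seasonality form; the initialization mirrors the text (level = 5-day average, trend = 0, St = At divided by that average, p = 5 weekdays per season):

```python
def winters(history, alpha, beta, gamma, p=5):
    """Winter's exponential smoothing with a p-day season.

    Returns the one-step-ahead forecast W = (level + trend) * S, where
    S is the seasonal index of the matching day of the season (period
    t - p, as in the report).
    """
    base = sum(history[:p]) / p           # initial level: 5-day average
    level, trend = base, 0.0              # initial trend taken as 0
    season = [a / base for a in history[:p]]   # initial S1..Sp
    for t in range(p, len(history)):
        a, s = history[t], season[t - p]
        prev_level = level
        level = alpha * (a / s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season.append(gamma * (a / level) + (1 - gamma) * s)
    return (level + trend) * season[len(history) - p]

prices = [90.0, 92.0, 91.0, 93.0, 94.0, 91.0, 93.5, 92.0, 94.5, 95.0]
print(winters(prices, alpha=0.5, beta=0.1, gamma=0.2))
```

A seasonal index above 1 marks a day of the week that tends to close above the week's average level, which is exactly what the St lines in figure 75 compare across the three companies.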
Figure 75 – Seasonality values of corresponding periods of MS, Intel and GE
Figure 76 – Closing prices of MS, Intel and GE between 11.10.99 and 16.10.99

In figure 75, it can be seen that the seasonalities of the 3 companies aren't similar. However, the rises and falls of MS and Intel correspond with each other, although these variations are much more pronounced for Intel. On the other hand, GE's seasonality increases after the 4th day, whereas Intel's and MS's are both decreasing. Similarly, between the 3rd and 4th days, the MS and Intel seasonalities are increasing but GE's is decreasing. In figure 76, the closing prices for the same 5 days of MS, Intel and GE are drawn; in this graph it is obligatory to take the values for the same dates. The 3 lines look very similar, but in fact the small changes aren't visible on this graph.

3. Control Charts

It is natural to have errors in forecasts; in fact, there are no forecasts with perfect accuracy. However, it is possible to analyze the errors in order to obtain better forecasts. Errors are acceptable as long as they are random, and control charts are used to check the randomness of errors. See Appendix I for details. In this part, only the control charts for MS are explained in detail. The RMSE values used to determine the upper and lower control limits are those calculated for the first 69 days, for which the actual values are assumed to be known, not the RMSE values calculated at the end of the forecasts, which reflect the error of the forecasts of the last 20 days. The error values (A - F) represented on the charts are those calculated for the last 20 days, for which the actual values are assumed to be unknown. (These errors are shown in the orange cells in figures 2, 5, 9, 13, 18, 24 and 28 for MS.)
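For concreteness, the error measures and the control limits can be sketched in Python (our addition). Appendix I's exact limit formula is not reproduced here, so the 2·RMSE limits below are an assumption; 3·RMSE is another common convention:

```python
def error_metrics(actual, forecast):
    """MAD, MSE, MAPE (in percent) and RMSE of a forecast."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mape = sum(abs(e) / a for e, a in zip(errors, actual)) / n * 100
    rmse = mse ** 0.5
    return mad, mse, mape, rmse

def control_limits(rmse, k=2):
    """Control limits around the zero centerline: LCL = -k*RMSE,
    UCL = +k*RMSE. The multiplier k = 2 is our assumption."""
    return -k * rmse, k * rmse

mad, mse, mape, rmse = error_metrics([100.0, 102.0], [99.0, 104.0])
lcl, ucl = control_limits(rmse)
print(mad, mse, mape, rmse, lcl, ucl)
```

An error series is then judged nonrandom if points fall outside (LCL, UCL), if they sit consistently on one side of the centerline (bias), or if they drift steadily in one direction (trend), which is the vocabulary used in the charts below.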
3.1. Microsoft

3.1.1. Naïve Forecast

Figure 77 – Control chart for the naïve forecasting of MS (RMSE = 2.99925)

In figure 77, it can be seen that naïve forecasting isn't a suitable method for this stock: 18 of the errors are above the upper control limit, and there is also a visible tendency of these errors to increase (a "trend"). This is caused by the fact that the naïve method relies only on the previous actual values. Therefore, when there is no clear information on the actual values (since the actual values are assumed to be unknown for the last 20 days), the naïve forecast lags the real changes.

3.1.2. Moving Average (n=3)

Figure 78 – Control chart for moving average forecasting of MS (RMSE = 2.5101)

In figure 78, we can observe that there aren't any error points outside the control limits, with only 1 point very close to the UCL. What is more, the errors are all situated on the upper side of the centerline (x = 0). This is another kind of nonrandomness (a "bias").

3.1.3. Weighted Moving Average (n=3)

Figure 79 – Control chart for weighted moving average forecasting of MS (RMSE = 2.07097)

In figure 79, it can be seen that there are only 2 error dots outside the upper and lower control limits. However, there are still 6 error values on or very close to the UCL. The distribution of errors looks a lot like the distribution for the moving average, but there are more errors outside the control limits. That is probably because the RMSE, and hence the control boundary, is smaller for the weighted moving average.

3.1.4. Exponential Smoothing

In figure 80, there are 7 error points outside the control limits and the errors are biased (all of them are above the centerline), but the RMSE is smaller than the RMSE of the previous examples.
Figure 80 – Control chart for exponential smoothing of MS (RMSE = 2.0605)

3.1.5. Holt's Exponential Smoothing

Figure 81 – Control chart for Holt's exponential smoothing of MS (RMSE = 2.0605)

There is only 1 error outside the control limits in figure 81. Although the errors are still biased, they seem to be distributed more randomly. It shouldn't be forgotten that the RMSE value is the same as for the exponential smoothing method; therefore, it can be said that Holt's exponential smoothing method has a more random error distribution than the previous methods.

3.1.6. Linear Regression

In figure 82, we observe that there are no errors outside the control limits. The errors seem to have a random distribution despite being biased. It should be noted that the RMSE value is larger for this forecast.

Figure 82 – Control chart for linear regression of MS (RMSE = 5.055539)

3.1.7. Winter's Exponential Smoothing

In figure 83, it can be seen that there are no error points beyond the control limits, and although the points are still biased, they are scattered close to the centerline. Note that the RMSE is very large for this method. It can be said that this method has random errors and is suitable for forecasting this stock.

Figure 83 – Control chart for Winter's exponential smoothing of MS (RMSE = 17.4562)

3.1.8. General Observations

When we observe the control charts for all forecast methods, we conclude that Winter's exponential smoothing is the one with the most random errors; hence it can be said that it is the best method to forecast MS closing prices.

3.2. Intel

3.2.1. Naïve Forecast

Figure 84 – Control chart for naïve forecasting of Intel (RMSE = 3.00594)
3.2.2. Moving Average (n=3)

Figure 85 – Control chart for moving average of Intel (RMSE = 2.32763)

3.2.3. Weighted Moving Average (n=3)

Figure 86 – Control chart for weighted moving average of Intel (RMSE = 1.99513)

3.2.4. Exponential Smoothing

Figure 87 – Control chart for exponential smoothing of Intel (RMSE = 1.95341)

3.2.5. Holt's Exponential Smoothing

Figure 88 – Control chart for Holt's exponential smoothing of Intel (RMSE = 1.97853)

3.2.6. Linear Regression

Figure 89 – Control chart for linear regression of Intel (RMSE = 6.428476)

3.2.7. Winter's Exponential Smoothing

Figure 90 – Control chart for Winter's exponential smoothing of Intel (RMSE = 12.2019)

3.3. General Electric

3.3.1. Naïve Forecast

Figure 91 – Control chart for naïve forecast of GE (RMSE = 2.57017)

3.3.2. Moving Average (n=3)

Figure 92 – Control chart for moving average of GE (RMSE = 2.9307)

3.3.3. Weighted Moving Average (n=3)

Figure 93 – Control chart for weighted moving average of GE (RMSE = 1.98847)

3.3.4. Exponential Smoothing

Figure 94 – Control chart for exponential smoothing of GE (RMSE = 1.96155)

3.3.5. Holt's Exponential Smoothing

Figure 95 – Control chart for Holt's exponential smoothing of GE (RMSE = 1.96155)

3.3.6. Linear Regression

Figure 96 – Control chart for linear regression of GE (RMSE = 4.881793)

3.3.7. Winter's Exponential Smoothing

Figure 97 – Control chart for Winter's exponential smoothing of GE (RMSE = 21.2034)
4. Correlation

Correlation is a concept used to determine the strength and direction of a linear relationship between two random variables. In this project the variables are the closing prices of the 3 stocks: MS, Intel and GE. In this part, the dependence of these variables on each other is analyzed using the correlation formula. For details, see Appendix J.

Figure 98 – Using the CORREL function in MS Excel to calculate the correlation values
Figure 99 – Calculation of the dependent values using the equation y(x) = ax + b

In the correlation graphs, the values of the independent variable x are placed on the x-axis and the values of the dependent variable y on the y-axis. For each graph, there are 2 sets of y values: the actual values (drawn in dark blue) and the forecasted values (drawn in pink), which are found by the linear equation y(x) = ax + b. The square of the correlation coefficient (r^2) takes values between 0 and 1. If r^2 is close to 0, the forecasted values aren't close to the actual values; in other words, variations of the independent variable x don't affect the dependent variable in the same way, and the 2 variables aren't correlated. If r^2 is close to 1, the 2 variables are correlated and the forecasted values are close to the real values. This effect can be observed in the correlation graphs: the closer the pink and blue lines are to each other, the more the variables are correlated. The third graph in each pair shows the closing prices of both stocks; these graphs also help in understanding the relation between the 2 variables. We will observe whether or not changes in the price of one stock affect the other.

4.1. Microsoft – Intel Correlation

Figure 100 – Actual MS prices vs. MS prices depending on Intel prices (r^2 = 0.040996)
Figure 101 – Actual Intel prices vs.
Intel prices depending on MS prices (r^2 = 0.040996)

Figure 102 – Actual closing prices of MS and Intel

In figure 102, it can be seen that in the first shaded area (in blue) MS prices are decreasing while Intel prices are increasing at the same time. In the second shaded area (in pink) both stocks have a sudden fall followed by a recovery. Between these two shaded areas, the prices of the 2 stocks show similar variations. In fact r^2 = 0.040996, which is close to 0; this shows that MS and Intel prices are only weakly correlated.

4.2. MS – GE Correlation

Figure 103 – Actual MS prices vs. MS prices depending on GE prices (r^2 = 0.023287)
Figure 104 – Actual GE prices vs. GE prices depending on MS prices (r^2 = 0.023287)
Figure 105 – Actual closing prices of MS and GE

In figure 105, we observe that the variations of the closing prices of the 2 stocks are almost exactly the same, except for the shaded area, where there is a small decrease in the closing price of MS followed by a recovery, while the closing price of GE shows a sudden increase. For these values, r^2 = 0.023287, which is very close to 0; hence we can only say that MS and GE are weakly correlated.

4.3. Intel – GE Correlation

Figure 106 – Actual Intel prices vs. Intel prices depending on GE prices (r^2 = 0.028517)

In figure 106, it is easy to see that the actual values and the forecasted values aren't close to each other at all. This means that Intel prices aren't dependent on GE prices.

Figure 107 – Actual GE prices vs.
GE prices depending on Intel prices (r^2 = 0.028517)

Figure 108 – Actual closing prices of Intel and GE

In figure 108, it can be seen that there isn't a particular relation between the closing prices of GE and Intel, apart from the variation in the shaded area, where the prices of Intel are increasing while the prices of GE are decreasing. Throughout the rest of the graph, the rises and falls of the 2 lines correspond with each other, even though these are very small changes. The value r^2 = 0.028517 is close to 0, so it is possible to say that GE and Intel are not correlated.

PART B

1. NEURAL NETWORKS

1.1.1. What is a Neural Network?

"A neural network is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements working in unison to solve specific problems. (n.d., Stergiou C.; Siganos D.)"

1.1.2. Historical Background

The history of neural networks can be divided into several periods:

First attempts: McCulloch and Pitts (1943) developed models of neural networks based on their understanding of neurology. Other early attempts used computer simulations, by two groups: Farley and Clark (1954), and Rochester, Holland, Haibit and Duda (1956).

Promising and emerging technology: Psychologists and engineers also contributed to neural networks. Rosenblatt (1958) developed the PERCEPTRON, a system that learns to connect outputs to inputs. Another system was the ADALINE (Adaptive Linear Element), developed in 1960 by Widrow and Hoff (of Stanford University). The ADALINE was an analogue electronic device made from simple components.
(Christos Stergiou and Dimitrios Siganos)

Period of frustration and disrepute: In 1969, Minsky and Papert wrote a book that led to the elimination of funding for research into neural network simulations, and considerable prejudice against the field set in. (Christos Stergiou and Dimitrios Siganos)

Innovation: "Grossberg and Carpenter (Stephen Grossberg and Gail Carpenter, 1988) developed the ART (Adaptive Resonance Theory) networks based on biologically plausible models. Anderson and Kohonen developed associative techniques independently of each other. Klopf (A. Harry Klopf, 1972) developed a basis for learning in artificial neurons based on a biological principle of neuronal learning called heterostasis. Werbos (Paul Werbos, 1974) developed and used the back-propagation learning method. Amari (Shun-Ichi Amari, 1967) was involved with theoretical developments, while Fukushima (Kunihiko Fukushima) developed a stepwise-trained multilayered neural network for the interpretation of handwritten characters; the original network was published in 1975 and was called the Cognitron (n.d., Stergiou C.; Siganos D.)."

Re-emergence: Books and conferences provided a forum for people in diverse fields with specialized technical languages, and the response to conferences and publications was very positive. The news media picked up on the increased activity, and tutorials helped disseminate the technology. Academic programs appeared and courses were introduced at most major universities in the US and Europe. Attention then turned to funding levels throughout Europe, Japan and the US, and as this funding became available, several new commercial applications in industry and financial institutions appeared.

Today: "Significant progress has been made in the field of neural networks, enough to attract a great deal of attention and fund further research.
Advancement beyond current commercial applications appears to be possible, and research is advancing the field on many fronts. Neurally based chips are emerging and applications to complex problems are being developed. Clearly, today is a period of transition for neural network technology (n.d., Stergiou C.; Siganos D.)."

1.1.3. Why use neural networks?

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. "A trained neural network can be thought of as an 'expert' in the category of information it has been given to analyse. This expert can then be used to provide projections given new situations of interest and answer 'what if' questions (n.d., Stergiou C.; Siganos D.)." Other advantages include: adaptive learning, self-organisation, real-time operation, and fault tolerance via redundant information coding.

1.1.4. Types of neural networks

There are many different kinds of neural networks, of which only a few will be described here.

Single-layer feedforward network: A feedforward network does not contain any neurons that are connected to themselves or to any neurons earlier in the system. This makes it easy to train, but imposes certain limitations: it cannot solve every given task. Minsky and Papert identified the so-called XOR problem, which is impossible to solve without another layer of neurons.

Multilayer network: The solution to the XOR problem is the multilayer network, which has two or more layers of neurons.

Recurrent networks: A recurrent network, unlike the feedforward networks, has neurons that transport a signal back through the network.

1.2.1.
The Basics of Neural Networks

Neural networks are typically organized in layers. Layers are made up of a number of interconnected 'nodes', each containing an 'activation function'. Patterns are presented to the network via the 'input layer', which communicates with one or more 'hidden layers', where the actual processing is done via a system of weighted 'connections'. The hidden layers then link to an 'output layer', where the answer is output, as shown in the graphic below:

Figure 109

Most neural networks contain some form of 'learning rule' which modifies the weights of the connections according to the input patterns presented to the network. Although there are many different kinds of learning rules used by neural networks, the delta rule is especially important because it serves the most common class of neural networks, called 'backpropagational neural networks' (BPNNs). Backpropagation is an abbreviation for the backwards propagation of error.

"With the delta rule, as with other types of backpropagation, 'learning' is a supervised process that occurs with each cycle through a forward activation flow of outputs, and the backwards error propagation of weight adjustments. More simply, when a neural network is initially presented with a pattern it makes a random 'guess' as to what it might be. It then sees how far its answer was from the actual one and makes an appropriate adjustment to its connection weights (n.d., n.a.)." More graphically, the process looks something like this:

Figure 110

"Within each hidden layer node is a sigmoidal activation function which polarizes network activity and helps it to stabilize (n.d., n.a.)."

Backpropagation performs a gradient descent within the solution's vector space towards a 'global minimum', following the direction of steepest descent on the error surface.
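The guess-measure-adjust cycle just described can be sketched for a single linear neuron trained with the delta rule. This is only an illustrative sketch: the data, learning rate and number of epochs are hypothetical choices, not taken from the project.

```python
import random

def train_delta_rule(samples, lr=0.05, epochs=500, seed=0):
    """Delta-rule training of one linear neuron (weight w, bias b)."""
    random.seed(seed)
    w, b = random.random(), random.random()   # the initial random 'guess'
    for _ in range(epochs):
        for x, target in samples:
            y = w * x + b          # forward activation flow
            err = target - y       # how far the answer was from the actual one
            w += lr * err * x      # backwards adjustment of the connection weight
            b += lr * err
    return w, b

# Hypothetical training pairs following target = 2*x + 1.
data = [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
w, b = train_delta_rule(data)
print(round(w, 2), round(b, 2))
```

A real backpropagation network applies the same error-driven updates layer by layer, with a sigmoidal activation at each hidden node.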
The global minimum is the theoretical solution with the lowest possible error. The error surface itself is a hyperparaboloid, but it is seldom 'smooth', as reflected in the graphic below.

Figure 111

Since the nature of the error surface cannot be known a priori, neural network analysis often requires a large number of individual runs to determine the best solution. Most learning rules have built-in mathematical terms to assist in this process, controlling the 'speed' (beta coefficient) and the 'momentum' of learning. The speed of learning is actually the rate of convergence between the current solution and the global minimum. Momentum helps the network to overcome obstacles (local minima) in the error surface and settle down at or near the global minimum.

"Once a neural network is 'trained' to a satisfactory level it may be used as an analytical tool on other data. To do this, the user no longer specifies any training runs and instead allows the network to work in forward propagation mode only. New inputs are presented to the input pattern where they filter into and are processed by the middle layers as though training were taking place; however, at this point the output is retained and no backpropagation occurs. The output of a forward propagation run is the predicted model for the data which can then be used for further analysis and interpretation (n.d., n.a.)."

1.2.2. Using a Neural Network

How the problem is posed and how the network is trained are important steps in working with neural networks. Neural networks take some input variables and produce some output variables; in other words, they use the available information to estimate unknown information. Some examples are:

Stock market prediction: Using last week's stock prices and today's DOW, NASDAQ or FTSE index, predict tomorrow's stock prices.
Credit assignment: You want to know whether an applicant for a loan is a good or a bad credit risk. You usually know applicants' income, previous credit history, and so on.

Control: Using a robot's camera, decide whether the robot should turn left, turn right, or move forwards in order to reach its target.

In general, the main reason for using a neural network is that people do not know the exact nature of the relationship between inputs and outputs; if they knew the relationship, they would model it directly. The other key feature of neural networks is that they learn the input/output relationship through training. There are two types of training used in neural networks, with different types of networks using different types of training: supervised training, which is the most important, and unsupervised training.

In supervised learning, the network user assembles a set of training data containing examples of inputs together with the corresponding outputs, and the network learns to infer the relationship between the two. Training data is usually taken from historical records. "The neural network is then trained using one of the supervised learning algorithms, which uses the data to adjust the network's weights and thresholds so as to minimize the error in its predictions on the training set. If the network is properly trained, it has then learned to model the (unknown) function that relates the input variables to the output variables, and can subsequently be used to make predictions where the output is not known (n.d., n.a.)."

1.2.3. Gathering Data for Neural Networks

The first step is to choose the variables that you believe may be influential. The second step is to handle numeric and nominal variables; convert other variables to one of these forms, or discard them. The third step is to gather enough cases: hundreds or thousands are required, and the more variables, the more cases.
The fourth step is to deal with imperfect cases: cases with missing values can be used if necessary, but outliers may cause problems, so checking the data is very important. Remove outliers if possible, and if you have sufficient data, discard cases with missing values. The fifth step, if the volume of available data is small, is to consider using ensembles and resampling.

1.3. Application areas of neural networks

There are many applications of neural networks, such as medicine, sales forecasting, industrial process control, customer research, data validation, risk management and target marketing. Beyond these areas, communications is a very important field for neural networks, in two areas: wireless communications and telecommunications networks.

"In recent years, the art of using neural networks (NNs) for wireless-communication engineering has been gaining momentum. Although it has been used for a variety of purposes and in different ways, the basic purpose of applying neural networks is to change from the lengthy analysis and design cycles required to develop high-performance systems to very short product development times. This article overviews the current state of research in this area. Different applications of neural-network techniques for wireless communication front ends are briefly reviewed, stressing the purpose and the way neural networks have been implemented, followed by a description of future avenues of research in this field (June 2004, Patnaik, A.; Anagnostou, D. E.; Mishra, R. K.; Christodoulou, C. G.; Lyke, J. C.)."

"This article discusses the application of neural networks to routing in telecommunications networks under normal and abnormal conditions. Concepts related to optimal routing will be discussed from the theoretical and telecommunications networks perspective to define a translation to the neural network paradigm.
A sample network is then used to determine optimal neural network parameters, which are then tested to determine routing accuracy and performance under normal and abnormal routing conditions. Following the analysis of the results, conclusions and recommendations will be provided (n.d., Collett M.; Pedrycz, W.)."

1.4. Advantages and disadvantages of neural networks

The major advantages and disadvantages of neural networks in modeling applications are as follows:

Advantages: There is no need to assume an underlying data distribution, as is usually done in statistical modeling. Neural networks are applicable to multivariate non-linear problems. The transformations of the variables are automated within the computational process.

Disadvantages: Minimizing overfitting requires a great deal of computational effort. The individual relations between the input variables and the output variables are not developed by engineering judgment, so the model tends to be a black box or an input/output table without analytical basis. The sample size has to be large.

2.1. LITERATURE ON THE USE OF NEURAL NETWORKS FOR FORECASTING

2.1.1. Application title: Application of Neural Networks for Short-Term Load Forecasting

2.1.2. Application domain: The model can forecast daily load profiles with a lead time of one day for the next 24 hours.

2.1.3. Application details:

Reza Afkhami, PRT Inc., Dallas, Texas, http://www.prt-inc.com

F. Mosalman Yazdi, Department of Electrical Engineering, Islamic Azad University of Mehriz, Iran

2.1.4. Description(s) and characteristics of the dataset(s): "In this method, the days of the year are divided into groups using average temperature, according to the linearity of the load curve. The final forecast for each group is obtained by considering weekdays and weekends.
The 24 hours of a day are divided into 3 groups of 8 hours each, and a network must be trained for each of the eight-hour groups. This paper investigates the effects of temperature and humidity on the consumption curve. To forecast the load curve of holidays, the peak and valley are first calculated and then the neural network forecast is re-shaped with the new data. The networks are trained using hourly historical load data and daily historical max/min temperature and humidity data (n.d., Afkhami, R.; Yazdi, F. M.)."

2.1.5. Types of neural networks used: Self-recurrent neural network

2.2.

2.2.1. Application title: Application of a Neural Network Model Combining Information Entropy and Ant Colony Clustering Theory for Short-Term Load Forecasting

2.2.2. Application domain: This paper presented a hybrid neural network model that integrates information entropy theory and ant colony clustering for load forecasting.

2.2.3. Application details:

Wei Sun, School of Business Administration, North China Electric Power University, Baoding 071003, China

Jian-Chang Lu, School of Business Administration, North China Electric Power University, Baoding 071003, China

Yu-Jun He, School of Electronic and Communicational Engineering, North China Electric Power University, Baoding 071003, China

Jian-Qiang Li, Department of Automation, North China Electric Power University, Baoding 071003, China

2.2.4. Description(s) and characteristics of the dataset(s): "First, information entropy theory is used to select relevant ones from all load influential factors and reduce the irrelevant factors, thus reducing the input variables of the neural networks. Next, using the ant colony clustering method, the practical historical load data within one year is divided into several groups. Each group is modeled by a separate neural network module.
The typical samples in each clustered group were selected as the training set for the separate OHF Elman neural network in order to reduce the training time and improve convergence speed. Last, the performance of the presented model is tested using actual load data (August 2005, Sun W.; Lu J.; He Y.; Li J.)."

2.2.5. Types of neural networks used: Elman neural network, a globally feedforward, locally recurrent network model with distinguished dynamical characteristics

2.2.6. Explanation of how well neural networks perform compared to other methods and in comparison to each other: This neural network could provide a considerable improvement in forecasting accuracy.

2.3.

2.3.1. Application title: Information Entropy Based Neural Network Model for Short-Term Load Forecasting

2.3.2. Application domain: This paper presented a hybrid model that integrates information entropy and data mining theory with a neural network to establish a new short-term load forecasting model.

2.3.3. Application details: This work was supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20040079008), the Scientific Research Foundation for Doctoral Degree Teachers of North China Electric Power University (2005), and the Scientific Research Foundation for Young Teachers, North China Electric Power University (20041105).

Wei Sun and Jianchang Lu are with the Department of Economics & Management, North China Electric Power University, Baoding, 071003 China.

Yujun He is with the Department of Electronic and Communication Engineering, North China Electric Power University, Baoding, 071003 China.

2.3.4. Description(s) and characteristics of the dataset(s): "First, information entropy theory is used to select relevant ones from all influential factors; the results are used as inputs of the neural network.
Secondly, according to the features of power load, the typical historical load data samples, having the same weather characteristics as the forecasting day, were selected as the training set by using data mining theory. Finally, an Elman neural network forecasting model is constructed combining the reduced factors and the typical training set (2005, Sun W.; Lu J.; He Y.)."

2.3.5. Types of neural networks used: Elman neural network; hybrid neural network

2.3.6. Explanation of how well neural networks perform compared to other methods and in comparison to each other: The benefit of the hybrid structure was to reduce the training time and improve convergence speed. The performance of the model is presented and evaluated using actual load data. The test results showed that the proposed forecasting method could provide a considerable improvement in forecasting accuracy.

2.4.

2.4.1. Application title: Application of Neural Networks to the Problem of Forecasting the Flow of the River Nile

2.4.2. Application domain: This paper applies multilayer neural networks to the problem of forecasting the flow of the River Nile in Egypt.

2.4.3. Application details:

Amir Atiya, Dept. of Electrical Engineering, Caltech, MS 136-93, Pasadena, CA 91125

Suzan El-Shoura, Electronic Research Institute, Tahrir Street, Dokki, Giza, Egypt

Samir Shaheen, Dept. of Computer Engineering, Cairo University, Giza, Egypt

Mohamed El-Sherif, Electronic Research Institute, Tahrir Street, Dokki, Giza, Egypt

Description(s) and characteristics of the dataset(s): The paper compares four different methods for input and output preprocessing, including a novel method proposed there based on the Discrete Fourier Series.

Types of neural networks used: Multilayer networks

2.5.
Application title: Forecasting with Fuzzy Neural Networks: A Case Study in Stock Market Crash Situations

Application domain: In this paper, a case study describes a comparison of fuzzy neural networks and the classical approach during the stock market crashes of 1987 and 1998.

Application details: Martin Rast, Ludwig-Maximilians-Universität, Mathematisches Institut, Theresienstr. 39/334, 80333 Munich, Germany

Types of neural networks used: Classical neural networks; fuzzy neural networks

Explanation of how well neural networks perform compared to other methods and in comparison to each other: This case study led to the result that fuzzy neural networks can outperform classical approaches in extreme situations, while in standard situations the latter models have higher prediction quality.

2.6.

Application title: A Comparison Between Neural-Network Forecasting Techniques - Case Study: River Flow Forecasting

Application domain: Estimating the flows of rivers can have significant economic impact, as this can help in agricultural water management and in protection from water shortages and possible flood damage.

Application details: Manuscript received May 6, 1997; revised December 10, 1998 and January 4, 1999.

A. Atiya is with the Department of Electrical Engineering, Caltech, Mail Stop 136-93, Pasadena, CA 91125 USA.

S. El-Shoura and M. S. El-Sherif are with the Electronic Research Institute, Tahrir Street, Dokki, Giza, Egypt.

S. I. Shaheen is with the Department of Computer Engineering, Faculty of Engineering, Cairo University, Giza, Egypt (e-mail: sshaheen@frcu.eun.eg).

Publisher Item Identifier S 1045-9227(99)02343-7.

2.7.
Application title: An analysis of neural-network forecasts from a large-scale, real-world stock selection system

Application domain: Estimating the flows of rivers can have significant economic impact, as this can help in agricultural water management and in protection from water shortages and possible flood damage.

Application details:

Ganesh Mani, Kung-Khoon (KK) Quah, Sam Mahfoud, Dean Barr, LBS Capital Management, 311 Park Place Blvd., Suite 330, Clearwater, FL 34619

Description(s) and characteristics of the dataset(s): The first goal of this paper is to apply neural networks to the problem of forecasting the flow of the River Nile in Egypt. The second goal is to use the time series as a benchmark for comparing several neural-network forecasting methods. Four different methods of preprocessing the inputs and outputs are compared, including a novel method proposed in the paper based on the discrete Fourier series. Three different methods for the multistep-ahead forecast problem are also compared: the direct method, the recursive method, and the recursive method trained using a backpropagation-through-time scheme, together with a theoretical comparison between these three methods. The final comparison is between different methods of performing the longer-horizon forecast, including ways to partition the problem into the several subproblems of forecasting K steps ahead.

Types of neural networks used: Elman neural network; hybrid neural network

Explanation of how well neural networks perform compared to other methods and in comparison to each other: The multistep-ahead comparison yielded the direct method as the best method. It also confirmed the fact, well known in the neural-network community, that the backpropagation-through-time approach is what is needed to train a recursive model.
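The difference between the recursive and direct multistep methods compared above can be sketched in a few lines. In the recursive scheme, each prediction is fed back in as the next input; the "model" below is only a stand-in naive drift rule chosen for illustration, not one of the networks used in the paper.

```python
def recursive_forecast(history, steps, model):
    """Multistep-ahead forecast: feed each prediction back in as an input."""
    window = list(history)
    forecasts = []
    for _ in range(steps):
        nxt = model(window)        # one-step-ahead prediction
        forecasts.append(nxt)
        window.append(nxt)         # the prediction becomes an 'observation'
    return forecasts

# Stand-in one-step model: naive drift, next = last + (last - previous).
drift = lambda w: w[-1] + (w[-1] - w[-2])
print(recursive_forecast([10.0, 11.0], steps=3, model=drift))  # [12.0, 13.0, 14.0]
```

A direct method would instead fit one separate model per horizon k, predicting k steps ahead straight from the observed history, so forecast errors are never fed back as inputs.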
Concerning the longer-horizon forecast, forecasting the whole period in one step tends to work better.

3. Forecasting with Neural Network Software

The forecasts of the datasets that were used in Part A will be explained within the performance analysis of each software product.

3.1. Alyuda Forecaster XL 2.3

a. Company Profile

Name: Alyuda Research, Inc.
Website: http://www.alyuda.com
Origin Country: USA

b. Availability

The demo version and the 30-day trial of the software can be downloaded from the official website http://www.alyuda.com without any requirements. The shareware and the demo version are also available on http://www.download.com .

c. Open source/Source hidden

The software is source hidden. It is clearly stated in the licence agreement that the single user "shall not, nor permit any party to reverse engineer, disassemble, decompile, or translate the Software, or otherwise attempt to derive the source code of the Software".

d. Price

The software has a 30-day trial version. The price of the full version is $149 for a single user and $999 for an unlimited site. The company also offers a 30% educational discount and a volume discount for more than 2 purchases.

e. Performance

The overall performance of the software was satisfactory, even though we were not able to input many datasets at the same time because of the limits of the shareware version. The performance report for the forecast of the datasets used in Part A showed that the results were very reasonable. The RMSE for each dataset is shown at the end of the summaries.

Figure 112 – The forecast results and the performance report of Forecaster XL for Intel

Figure 113 – The performance report and forecast values (Ft) for Microsoft

Figure 114 – The performance report and forecast values (Ft) for GE

f.
Automation

The software is automatic; it is sufficient to give it the inputs to produce the forecast. However, when there is a single dataset as input, the user has to choose the time-series option. Afterwards, the software carries out the forecast without asking which method to use.

g. Data import and export capabilities

Forecaster XL is an MS Excel add-in, so it is easy to use with datasets already in Excel. Instead of importing data into the software, Forecaster XL can be used with the datasets like any other MS Excel tool.

h. Graphical Capabilities

By default, the software produces actual-forecast graphs (line and scatter), a deviation graph and an input importance table along with the performance report. In addition, the forecasted values are automatically colored. The training report, which is given on demand, includes additional charts such as an MSE vs. iteration graph.

i. Scalability

In the shareware version of the software, there is a limit on the number of input data. However, with the maximum amount of data allowed (500 observations), the forecast was carried out successfully without running slowly. The number of predictions is also limited in this version, but the software was capable of predicting 500 values, although these predictions were labelled as "bad forecasts".

Figure 115 – The performance report of 500 forecasts based on 500 observations

j. Usability

The software is easy to use with respect to its GUI. Since its interface is embedded in MS Excel, an Excel user does not have the feeling of working with new software; Forecaster XL is just a new icon on the toolbar. The menus and icons are not difficult to understand.

Figure 116 – User interface of Alyuda Forecaster XL

k. Documentation

The readme file gives brief information on the software, introducing the new features of this version.
The help files are very rich in content and very detailed (even Excel shortcuts are noted). The help section can enable even a novice computer user to do time series forecasting with this software. There are also 7 different forecasting examples which help in learning to use the software.

l. Overall score

Availability: 10
Open source/Source hidden: 0
Price: 9
Performance: 10
Automation: 10
Data Import/Export Capabilities: 10
Graphical Capabilities: 8
Scalability: 8
Usability: 10
Documentation: 10
Overall: 85

3.2. NeuralTools

a. Company Profile

Name: Palisade Corporation
Website: http://www.palisade-europe.com/
Origin Country: USA

b. Availability

The demo version and the 10-day unlimited trial version of the software can be downloaded from the official website http://www.palisade-europe.com/ after filling out a form.

c. Open source/Source hidden

The software is source hidden.

d. Price

The software has a 10-day trial version.

NeuralTools Standard: $395 (download only: $380)
NeuralTools Professional: $595 (download only: $580)

The company offers a student discount of up to 90% if the student is enrolled in a programme that recommends the use of NeuralTools.

e. Performance

No difficulties were faced in using NeuralTools with multiple datasets simultaneously. The overall performance of the software was adequate. The forecast results for the datasets given in Part A are reasonable and accurate. The RMSEs are shown in the training reports.

Figure 117 – The forecast results and the training report of NeuralTools for Intel

Figure 118 – The forecast results and the training report of NeuralTools for MS

Figure 119 – The forecast results and the training report of NeuralTools for GE

f. Automation

NeuralTools is automatic.
It is sufficient to give it the inputs to produce the forecast. The user only has to define the dataset to train the neural network, and define it once more to start the forecast. Apart from these steps, the software does not demand much from the user; it chooses the best forecasting method itself.

g. Data import and export capabilities

NeuralTools is also an MS Excel add-in. Instead of importing data into the software, NeuralTools uses the datasets in Excel like any other MS Excel tool, as long as the data and the variables are properly introduced to the data manager.

h. Graphical Capabilities

By default, the software produces actual-forecast graphs (scatter) and a training histogram along with the training report, but these charts are much plainer than Forecaster XL's colorful graphics.

i. Scalability

There is no limit on the number of datasets or observations that can be used in the trial version of NeuralTools. Hence, we were able to try forecasting the gold prices in the US, a dataset with 1108 observations. The results were unsatisfactory: the software could not make reasonable predictions. However, this might be due to irrational variations and lags in the dataset, since 1108 days is a long time period and our data is not very reliable.

Figure 120 – The training report of NeuralTools for the dataset with 1108 observations

j. Usability

The software is easy to use with respect to its GUI. Especially for an Excel user, the interface looks very familiar. In addition to an icon on the menu, NeuralTools adds a toolbar to Excel on which the dataset manager, train, test and predict icons are placed. The menus and icons are easy to use.

Figure 121 – NeuralTools toolbar and menu interface

k. Documentation

The help file gives detailed information on the software, introducing the new features of this version.
The help contents are very detailed and explain forecasting step by step. There are also tutorials and example forecasts to help in learning how to use the software.

l. Overall score

Availability: 10
Open source/Source hidden: 0
Price: 8
Performance: 10
Automation: 10
Data Import/Export Capabilities: 10
Graphical Capabilities: 7
Scalability: 7
Usability: 10
Documentation: 9
Overall: 81

4. Tutorial for Alyuda Forecaster XL

Introduction: Alyuda Forecaster XL is a user-friendly software product with a very simple interface. We will perform a time-series forecast using the data we are given: 89 daily closing prices of Microsoft stock from 7/12/1999 to 11/12/1999, where the last 20 prices are assumed to be unknown and will be forecasted.

Getting Started

Forecaster XL automatically adds a menu icon in MS Excel. First, we have to define the dataset that we will use. Open the dataset in MS Excel. We will not be using the data for the last 20 days, but we can use these actual values to check the accuracy of our forecast. Let's move these data to the nearest column: select them by holding SHIFT while scrolling down, right-click and choose CUT from the menu, then go to cell C71, right-click and choose PASTE. We will have the forecasted values just below the actual value of the last day.

Figure 122

Defining the Input Data

Now that we have moved the actual values to the neighbouring column, we will be able to compare the two sets of values easily. We will perform a time series forecast, so open the Forecaster XL menu and click the time series button.

Figure 123

Figure 124

The data selection method should be "by columns", since our data is placed chronologically in columns. You should fill in the "input columns" cell as seen in figure 124.
To choose the input columns you can either type the cell references or simply select the cells by dragging your mouse over them. The software will automatically pick up your selection if you have clicked once in the "input columns" box before selecting the cells.

The first row is row 1; you can define it by simply clicking cell A1. Make sure that the "Labels in the first row" checkbox is checked. This way the software recognizes all data in the first row as titles and does not take them into account while training the network.

Our data are daily values, and since they are taken from the stock market they have no periodicity. Therefore the period should be set to 1, which means that there are no periods. If we had monthly data the period would have been 12; for quarterly data, 4. We want to forecast the following 20 days, so the "lookahead" value is 20.

Figure 125

If you click the "more" button at the left, you will see a new parameter, the "output cell". You can select any cell you like to place the forecast values in. If you do not specify a cell, the forecast values will be put just below the actual values, starting in cell B70. For now we will leave the "output cell" choice undefined.

Forecasting

Ready to forecast? Just click forecast! You will be automatically directed to the bottom of the dataset, where the forecasted values are placed. The forecasted values are shown in light blue, and right next to them are the actual values of the last 20 days that we moved earlier. At the bottom, you should notice a new Excel sheet named "Performance Report". Forecaster XL prepares a performance report at the end of each forecast by default.
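The accuracy figures in the performance report can also be verified by hand, using the error measures defined in Appendix B. Below is a minimal Python sketch of such a check; the five price pairs are made-up illustrative values, not the actual Microsoft closing prices or Forecaster XL's output.

```python
# Hand-check of forecast accuracy for a held-out period, using the MAD and
# MAPE formulas from Appendix B. The numbers below are illustrative
# placeholders, not the actual Microsoft closing prices.
actual   = [91.2, 90.8, 92.1, 91.5, 93.0]   # known closing prices
forecast = [90.5, 91.0, 91.6, 92.0, 92.4]   # values a forecaster produced

n = len(actual)
errors = [a - f for a, f in zip(actual, forecast)]
mad  = sum(abs(e) for e in errors) / n                             # mean absolute deviation
mape = sum(abs(e) / a for e, a in zip(errors, actual)) * 100 / n   # mean absolute percent error
print(f"MAD = {mad:.3f}, MAPE = {mape:.2f}%")
```

The same comparison can be run on the real last-20-day column once the forecast sheet has been filled in.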
Figure 126

Analysis

Figure 127

By clicking the "reports" button from the Forecaster XL menu, you can get the training and the data reports as well.

Figure 128

Alternative Method

This was the automatic way to do the forecast, in which the software creates the network, trains it and forecasts all at once. If you want to create and train the network yourself, you should click "create network" from the menu.

Figure 129

This time we are obliged to define the target (output) cells, and the output range must have the same number of columns as the input range. Our output cells are going to be B70:C90. We should label cells B70 and C70 as "date" and "close", since we will select the "labels in the first row" box.

The "Forecast empty targets" box should also be checked so that the software knows those cells are empty and are to be forecasted. When you have finished these steps, click "train".

Figure 130

When training is completed, the neural network is prepared and ready to forecast. By clicking "forecast" from the menu and selecting the target cells you can complete the forecast.

References:

Afkhami, R. and Yazdi, F. M. (n.d.). Application of Neural Networks for Short-Term Load Forecasting. Retrieved December 4, 2006, from Information Center database.

Atiya, A., El-Shoura, S., Shaheen, S. and El-Sherif, M. (n.d.). Application of Neural Networks to the Problem of Forecasting the Flow of the River Nile. Retrieved December 5, 2006, from Information Center database.

Collett, M. and Pedrycz, W. (n.d.). Application of Neural Networks for Routing in Telecommunications Networks. Retrieved December 5, 2006, from Information Center database.

El-Shoura, S., El-Sherif, M. S.
and Shaheen, S. I. (n.d.). A Comparison Between Neural Network Forecasting Techniques - Case Study: River Flow Forecasting. Retrieved December 4, 2006, from Information Center database.

Mani, G., Quah, K. K., Mahfoud, S. and Barr, D. (n.d.). An Analysis of Neural-Network Forecasts from a Large-Scale, Real-World Stock Selection System. Retrieved December 3, 2006, from Information Center database.

Patnaik, A., Anagnostou, D. E., Mishra, R. K., Christodoulou, C. G. and Lyke, J. C. (June 2004). Applications of Neural Networks in Wireless Communications. Retrieved December 5, 2006, from Information Center database.

Rast, M. (n.d.). Forecasting with Fuzzy Neural Networks: A Case Study in Stock Market Crash Situations. Retrieved December 5, 2006, from Information Center database.

Stergiou, C. and Siganos, D. (n.d.). "Neural Networks." Retrieved December 5, 2006, from http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html

Sun, W., Lu, J., He, Y. and Li, J. (2005). Application of Neural Network Model Combining Information Entropy and Ant Colony Clustering Theory for Short-Term Load Forecasting. Retrieved December 4, 2006, from Information Center database.

Sun, W., Lu, J. and He, Y. (n.d.). Information Entropy Based Neural Network Model for Short-Term Load Forecasting. Retrieved December 5, 2006, from Information Center database.

(n.a.) (December 2, 2006). "Artificial Neural Network." Retrieved December 5, 2006, from http://en.wikipedia.org/wiki/Artificial_neural_network

(n.a.) (n.d.). "Neural Networks." Retrieved December 5, 2006, from http://www.statsoft.com/textbook/stneunet.html

(n.a.) (n.d.). "A Basic Introduction To Neural Networks." Retrieved December 4, 2006, from http://www.cs.wisc.edu/~bolo/shipyard/neural/local.html

(n.a.), (n.d.).
"Neural Networks: Introduction." Retrieved December 3, 2006, from http://library.thinkquest.org/C007009/introduction/types/types-5.html

Appendix

Appendix A: Naïve forecast is simply the sum of the last value of the series and the difference (negative or positive) between the last two values of the series.

Formula: F(t+1) = A(t) + (A(t) – A(t-1))

Naïve forecasting is more accurate with stable series, where the variations stay around an average; in other words, with a series whose graph has a slope close to 0. This method is preferred for its simplicity and low cost.

Appendix B: MAD, MSE, MAPE and RMSE are commonly used measures of forecast accuracy.

MAD = Σ|A(t) – F(t)| / n (mean absolute deviation)
MSE = Σ(A(t) – F(t))² / (n – 1) (mean squared error)
MAPE = (100 / n) × Σ(|A(t) – F(t)| / A(t)) (mean absolute percent error)
RMSE = √MSE (root mean squared error)

Appendix C: Moving average.

F(t) = MA = (1/n) × Σ A(t-i), summed over i = 1, …, n

i = an index that corresponds to time periods
n = number of periods in the moving average
A(t-i) = actual value in period t-i
MA = moving average
F(t) = forecast for time period t

The formula that was used to fill in the cells: F(t+1) = (A(t) + A(t-1) + A(t-2) + … + A(t-(n-1))) / n

Appendix D: Weighted moving average.

F(t+1) = Σ w(i) × A(t-i+1), summed over i = 1, …, n

that is, F(t+1) = w(1)·A(t) + w(2)·A(t-1) + … + w(n)·A(t-(n-1))

knowing that w(1) + w(2) + … + w(n) = 1

Appendix E: The smoothing constant "α" represents the percentage of the forecast error that is incorporated into the next forecast.

F(t+1) = α·A(t) + (1 – α)·F(t)

knowing that 0 ≤ α ≤ 1

Appendix F: In Holt's exponential smoothing there are two smoothing constants, "α" and "β". These constants are used to calculate F(t) and T(t), which are the precedents of the forecasted value H(t).
Formulas:

F(t+1) = α·A(t) + (1 – α)·(F(t) + T(t)) (smoothed value for the next period)
T(t+1) = β·(F(t+1) – F(t)) + (1 – β)·T(t) (trend value for the next period)
H(t+m) = F(t+1) + m·T(t+1) (Holt's forecast for period (t+m))

knowing that 0 ≤ α ≤ 1 and 0 ≤ β ≤ 1

Appendix G: Linear regression relies on obtaining the equation of a straight line (y = bx + a) with minimum squared errors, where y is the dependent variable and x is the independent variable. With time t as the independent variable:

y(t) = a + b·t (= F(t))

a: y-intercept (value of F(0))
b: slope of the line

b = (n·Σty – Σt·Σy) / (n·Σt² – (Σt)²)
a = (Σy – b·Σt) / n

Appendix H: Formulas for Winters' exponential smoothing:

F(t) = α·A(t)/S(t-p) + (1 – α)·(F(t-1) + T(t-1)) (smoothed value of the period)
T(t) = β·(F(t) – F(t-1)) + (1 – β)·T(t-1) (trend value of the period)
S(t) = γ·A(t)/F(t) + (1 – γ)·S(t-p) (seasonality value of the period)

knowing that 0 ≤ α ≤ 1, 0 ≤ β ≤ 1 and 0 ≤ γ ≤ 1
p: number of periods in a season

Appendix I: A forecast meets the requirements as long as the errors are random. The randomness of the errors is checked by analyzing control charts. On a control chart there are two boundaries: the upper control limit (UCL = 2 × RMSE) and the lower control limit (LCL = –2 × RMSE). The errors (the differences between the actual and forecast values, A – F) must fall between these two boundaries in order to justify randomness. However, there are also three more types of nonrandomness. The errors should not be concentrated on one particular side between the boundaries; having too many points on one side of the centerline is called a bias. The errors should not have a constant downward or upward slope, which would indicate a trend. Finally, the errors should not show cyclical variation.

Appendix J: Correlation formula:

r = (n·Σxy – Σx·Σy) / √[(n·Σx² – (Σx)²) × (n·Σy² – (Σy)²)]
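The appendix formulas above can also be illustrated with a short script. The following Python fragment is a minimal sketch of Appendix E (simple exponential smoothing), Appendices B and I (RMSE and the ±2 × RMSE control limits) and Appendix J (correlation). The price series is a made-up example rather than the project's stock data, α = 0.3 is an arbitrary choice, and the first forecast is seeded with the first actual value, which is one common convention.

```python
import math

# Illustrative price series (not the project's stock data) and an arbitrary alpha.
prices = [90.0, 91.0, 90.5, 92.0, 91.5, 93.0, 92.5]
alpha = 0.3

# Appendix E: F(t+1) = alpha*A(t) + (1 - alpha)*F(t); seed the first forecast
# with the first actual value (one common convention).
forecasts = [prices[0]]
for a_t in prices[:-1]:
    forecasts.append(alpha * a_t + (1 - alpha) * forecasts[-1])

# Appendices B and I: RMSE of the errors (A - F) and the 2*RMSE control limits.
errors = [a - f for a, f in zip(prices, forecasts)]
mse  = sum(e * e for e in errors) / (len(errors) - 1)
rmse = math.sqrt(mse)
ucl, lcl = 2 * rmse, -2 * rmse
in_control = all(lcl <= e <= ucl for e in errors)

# Appendix J: correlation coefficient between two series.
def correlation(x, y):
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx, syy = sum(a * a for a in x), sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

print(f"RMSE = {rmse:.3f}, all errors within control limits: {in_control}")
print(f"r = {correlation(prices, forecasts):.3f}")
```

Replacing the illustrative list with an actual closing-price column reproduces the kind of error and control-chart analysis performed in Part A.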