
THE DEVELOPMENT AND VALIDATION OF A FUZZY LOGIC METHOD FOR TIME-SERIES EXTRAPOLATION

BY

JEFFREY STEWART PLOUFFE

A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN BUSINESS ADMINISTRATION

UNIVERSITY OF RHODE ISLAND
2005

ABSTRACT

It has been established across a large number of studies that statistically simple forecasting methods for the extrapolation of univariate time series of business data provide, in most situations, more accurate ex ante forecasts than statistically sophisticated methods. The problem is that scholars attempting to develop new, more accurate forecasting methods have all but ignored this knowledge about forecast accuracy. Fildes and Makridakis (1998), Makridakis and Hibon (2000), Fildes (2001) and Small and Wong (2002) suggest that what is needed are new, statistically simple extrapolative forecasting methods that are robust to the fluctuations that occur in business data.

This dissertation discusses the development and validation of the Direct Set Assignment (DSA) extrapolative forecasting method. The DSA method is a new, statistically simple, non-linear extrapolative forecasting method that was developed within the Mamdani Development Framework and was designed to mimic the architecture of a fuzzy logic control system. The relative forecast accuracy of the DSA method was established through three forecasting competitions. The time series used in these competitions comprised one hundred thirty-five series drawn from the M3 International Forecasting Competition. These series represent nine subcategories and three categories of data: yearly, quarterly and monthly series, each containing microeconomic, macroeconomic and industry data. In the first competition it was found that the fuzzy set parameter, in the range of two to twenty fuzzy sets, can be manipulated in the DSA method to improve ex ante forecast accuracy.
In Competitions #2 and #3 the most accurate DSA methods from Competition #1 were compared to alternative simple extrapolative methods, including those that were found to produce the most accurate forecasts in the M3 Forecasting Competition held in 2000. The DSA method and its combination with Winter's exponential smoothing provided the highest observed forecast accuracy in seven of the nine subcategories of time series, and were ranked in the top three in terms of observed accuracy in the other two subcategories. In addition, these methods provided the highest observed accuracy in two of the three categories of time series and were ranked in the top three in the remaining category. They also provided the highest observed forecast accuracy across all one hundred thirty-five series used in the competition, as well as the highest observed accuracy for time series with a trend component.

ACKNOWLEDGEMENTS

There are a number of people who have contributed both directly and indirectly to this research and who deserve my thanks. This includes my parents, Robert and Mildred Plouffe, whose love of discovery and incomparable work ethic have served as a constant source of inspiration to me. Thank you also to the exceptional group of scholars whom I have been so fortunate to have as program advisors. Professor Jeffrey Jarrett, my major advisor, mentor and friend for over a decade, provided me extraordinary latitude in pursuing my interests, but was always there to provide a course correction when required. I am forever grateful to him and would not have completed this work without his patience and guidance. Professor Shaw Chen, in countless meetings and discussions, taught me how to operationalize my research ideas and helped me develop an appreciation for analytical rigor. His insights regarding this research were invaluable, and it has been a pleasure to work with him. Professor Jerry Cohen has been a continual source of ideas.
His sage advice is reflected throughout this project, and his attitude toward research and his commitment to quality have set a standard to which I aspire. Professor John Boulmetis has provided me with endless support and encouragement throughout my program. John's enthusiasm and curiosity for research and teaching have influenced and inspired me more than he could possibly know. A special thank you to Professor Choudary Hanumara and Dean Maling Ebrahimpour for their time and their interest in this project. Their support and patience throughout the later stages of my program have been greatly appreciated, and their suggestions for improvements advanced the quality of this research. These two gentlemen represent the very best of the academic profession. I also owe a special debt of gratitude to Dean Maling Ebrahimpour and Professor Paul Mangiamelli for encouraging me to become a member of the Decision Science Institute, whose conferences served as the platform on which this research was developed. I am also especially grateful to Michelle Hibon, Senior Research Fellow at INSEAD Business School, for providing me with, and helping me sort through, the extraordinary amount of data generated by the M3-Competition. Her suggestions led to significant improvements to this project overall.

CHAPTER 1
INTRODUCTION

This chapter provides an overview of research that was conducted to develop and validate a new fuzzy logic based method for time-series extrapolation. This new forecasting method is called the Direct Set Assignment (DSA) method. The first section of this chapter describes the problem, debated by both scholars and practitioners working in the field of business forecasting for nearly two decades, that justifies the need for this research.
Specifically, the problem is that extrapolative forecasting method development during the past two decades has focused on statistically sophisticated methods, despite the fact that research on forecast accuracy conducted during this same period has shown that statistically simple methods, which are robust to the fluctuations that occur in real-world data, produce forecasts that are at least as accurate as those produced by statistically sophisticated methods. A subsequent section offers a hypothesis as to why statistically simple methods produce forecasts that are at least as accurate as those produced by statistically sophisticated methods. Also provided is a discussion of the perceived benefits of fuzzy logic and an argument for why it should serve as the basis for an extrapolative forecasting method. A brief discussion of the major research hypotheses of this study has been included, followed by a discussion of the experimental design that was used to evaluate seven specific null hypotheses. The chapter closes with a few remarks about significant findings from this research and an outline of the remaining chapters in this dissertation.

1.1 Problem Specification and Research Justification

Makridakis and Hibon (1979) were among the first to report that statistically simple extrapolative forecasting methods provide forecasts that are at least as accurate as those produced by statistically sophisticated methods. This conclusion was in conflict with the accepted view at the time and was not received well by the great majority of scholars. In response to these criticisms Makridakis and Hibon held the M-Competition (1982), the M2-Competition (1993) and the M3-Competition (2000).
In each of these additional studies the major findings of the Makridakis and Hibon (1979) study were upheld, including the finding concerning the relative accuracy of statistically simple extrapolative methods. In addition to the M-Competitions, myriad other studies, described as accuracy studies, were conducted utilizing new time series as well as time series from the M-Competitions, and they confirmed the original findings of Makridakis and Hibon regarding the relative accuracy of extrapolative methods. These studies include Geurts and Kelly (1986); Clemen (1989); Fildes (1984); Lusk and Neves (1984); Koehler and Murphree (1988); Armstrong and Collopy (1992); Makridakis et al. (1993) and Fildes et al. (1998). The problem, as reported by Fildes and Makridakis (1998) and Makridakis and Hibon (2000), is that many scholars have all but ignored the empirical evidence that has accumulated across these competitions on the relative forecast accuracy of various extrapolative methods under various conditions. Instead they have concentrated their efforts on building more statistically sophisticated forecasting methods, without regard to the ability of such methods to accurately predict real-life data. Makridakis and Hibon (2000) suggest that future research should focus on exploiting the robustness of simple extrapolative methods, which are less influenced by the real-life behavior of data, and that new statistically simple methods should be developed.

1.2 Forecast Accuracy and Simple Extrapolative Methods

Makridakis and Hibon (2000) suggest that real-life time series are not stationary, that many of them also reflect structural changes resulting from the influence of fads and fashions, and that these events can change established patterns in the time series. Moreover, the randomness in business time series is high, and competitive actions and reactions cannot be accurately predicted.
Also, unforeseen events affecting the series in question can and do occur. In addition, many series are influenced by strong cycles of varying duration and length whose turning points cannot be predicted. It is for these reasons that statistically simple methods, which do not explicitly extrapolate a trend or attempt to model every nuance of the time series, can and do outperform more statistically sophisticated methods.

1.3 Fuzzy Logic

Mukaidono (2002) concluded, "It is a big task to exactly define, formalize and model complicated systems", and it is precisely at this task that fuzzy logic has excelled. In fact, fuzzy logic has routinely been shown to outperform classical mathematical and statistical modeling techniques for many applications involving the modeling of real-world data. For example, fuzzy logic has found wide acceptance in the field of systems control. Fuzzy logic has been used in control applications ranging from controlling the speed of a small electric motor to controlling an entire subway system. In nearly every one of these applications fuzzy logic control systems have been shown to outperform more traditional, yet highly advanced, digital control systems. Fuzzy logic's success in these applications has been attributed to its ability to effectively model real-world data. Mukaidono (2002) suggests that fuzzy logic's success lies in the fact that it offers a "rougher modeling approach". The process of digital control is actually remarkably similar to time series extrapolation. In a digital control system, sensors provide a set of quantitative or qualitative observations as input to the controller. The controller in turn models those inputs and provides either a qualitative or quantitative output to the system that is under control. In time series extrapolation, a set of historical observations on a time series serves as the input data to the forecasting method.
The method then produces an output that, in the case of time series extrapolation, is the forecast or future value of the time series of interest. Given the similarities with respect to the task of modeling complex real-world data, and the structure of the two modeling systems, a fuzzy logic based method for time series extrapolation would appear to be the type of statistically simple method that Makridakis and Hibon (2000) suggest is needed.

1.4 Major Research Hypotheses

It is clear from over two decades of research on the relative accuracy of various extrapolative methods that simple methods will, in most forecasting situations and for most data types, produce the most accurate ex ante forecasts. In this study there are two major hypotheses. The first hypothesis is that the ex ante forecast accuracy of the DSA method will change in response to changes in the fuzzy set parameter. The fuzzy set parameter is the number of fuzzy sets used to model the time series of interest. The second hypothesis is that the DSA method will provide more accurate ex ante forecasts than the traditional extrapolative forecasting methods to which it has been compared.

1.5 Research Approach

Elton and Gruber (1972), Reid (1972), and Newbold and Granger (1974) were among the first to establish the relative accuracy of different forecasting methods across a large sample of time series. However, these early studies compared only a limited number of methods. Makridakis and Hibon (1979) extended this early work by comparing the accuracy of a large number of methods across a large number of heterogeneous, real-life business time series. In 1982 Makridakis and Hibon conducted a second accuracy study. In this study the authors invited forecasting experts, each with expertise in a particular extrapolative method, to participate, thereby creating a forecasting competition.
Since 1982 there have been a number of improvements made to the forecasting competition methodology, particularly in terms of predictive and construct validity. The research that is the subject of this current study relied on the data, methods and procedures of the M3 Forecasting Competition conducted in 2000, as this competition utilized the most recent advances in the forecasting competition methodology. In this current study, three competitions were required to evaluate the research hypotheses and to establish the relative forecast accuracy of the Direct Set Assignment method.

1.6 Important Findings

The results of this research support three of the major findings of the prior forecasting competitions and accuracy studies that have been conducted during the past two decades. Most important, however, is that the findings of this study support both major research hypotheses discussed above. Thus it can be concluded that the DSA method does produce ex ante forecasts that are as accurate as, and in most instances more accurate than, the forecasts produced by the alternative extrapolative methods to which it was compared. These alternative methods are the methods that produced the most accurate forecasts for the identical time series from the M3 Forecasting Competition held in 2000.

1.7 Organization of Dissertation

Chapter 2, Literature Review and Research Hypotheses, begins with a discussion of the importance of the field of business forecasting as well as an overview of the methods that are available for producing forecasts of business data. A section has been devoted to extrapolative forecasting methods, as they are the subject of this study. Background on the use of forecast accuracy as the primary criterion for extrapolative forecast method selection has also been provided.
Also in this chapter is a comprehensive review of forecasting competitions and other studies conducted to establish the relative forecast accuracy of extrapolative methods. Six sections are devoted to a review of the data processing technology fuzzy logic. These sections describe the origin of fuzzy logic, the mechanics of fuzzy logic, its applications, the Mamdani Framework for fuzzy logic method development and the background on the use of fuzzy logic in time series extrapolation. The chapter also presents the specific research hypotheses that have been evaluated in this study. The chapter concludes with a brief summary that describes the linkage between past research and this current research.

Chapter 3, The Direct Set Assignment Method, opens with a discussion of the application of the Mamdani Framework to the development of the DSA method. The next two sections of this chapter contain examples of the DSA method used to produce forecasts of a non-seasonal as well as a seasonal time series. The chapter closes with a summary on the development and use of the DSA method.

Chapter 4, Methodology, opens with a description of the forecasting methods that have been used in this study, including a brief overview of the DSA method, and a description of the six forecast accuracy measures that were used to establish the relative accuracy of the forecasting methods compared in this study. A subsequent section discusses the data used in this study, including complete descriptive statistics. The chapter continues with a discussion of the study's experimental design, referred to as a forecasting competition, and includes specific details on each of the three forecasting competitions that were conducted to test the hypotheses outlined in Section 2.12. The chapter closes with a summary.

Chapter 5, Results, provides summary tables that contain the values of the six accuracy measures, for each method being compared, for multiple forecast horizons.
These tables have been provided for each of the three forecasting competitions conducted in this study. The chapter closes with a brief summary.

Chapter 6, Discussion, contains an evaluation of each of the research hypotheses discussed in Section 2.12 in the context of the results presented in Sections 5.1-5.3, including statements of major findings. These findings include that the DSA method, and the DSA method in combination with Winter's exponential smoothing, were the top performing methods in this competition. The specific hypotheses being tested have been restated in this section for convenience. The chapter continues with an assessment of the contributions of this research, specifically to the investigation of fuzzy logic extrapolative methods and to the theory and practice of forecasting. The chapter closes with suggestions for future research on the DSA method and some concluding remarks.

CHAPTER 2
LITERATURE REVIEW

This chapter opens with a discussion of the important role that business forecasting plays in the operation of many businesses. The chapter continues with a review of a recently introduced taxonomy of business forecasting methods. Special attention is given to extrapolative forecasting methods, the methods that are the focus of this research. The subsequent section provides a review of the role of forecast accuracy in forecasting method selection. A discussion of the measurement of forecast accuracy has been provided as well. Also included is background on the use of a methodology, referred to as a forecasting competition, to establish the relative forecast accuracy of extrapolative forecasting methods for different forecasting situations. The research presented in these sections provides the justification for the development of a new, more accurate, statistically simple extrapolative forecasting method.
Following the discussion of the need for new extrapolative forecasting methods are three sections on the data processing technology fuzzy logic. The first two of these sections highlight why fuzzy logic may be uniquely suited to the job of time series extrapolation. The third describes a theoretical framework that is routinely used to develop fuzzy logic methods for all manner of data processing applications. The second-to-last section provides statements of the hypotheses that will be evaluated in this study, as well as a discussion of the relevance of each hypothesis. The chapter concludes with a summary that integrates the prior research on the accuracy of extrapolative methods with the justification for this research.

2.1 The Need For Business Forecasting

The role of management in all organizations is to oversee the functions of planning, administering and controlling (Daft, 1983; Koontz, 1984; Jarrett, 1991). The planning function, referred to as the first function of management, focuses on the development of strategy, the allocation of resources and the establishment of the policies that guide the operation of the organization into the future. The future, in the context of the planning process, is referred to as the planning horizon, and it can range from a few hours for decisions about production schedules to several years for decisions concerning capital expenditures and enterprise strategy implementation (Ascher, 1978; Armstrong, 1978; Makridakis, 1998). Unfortunately, management must make these crucial planning decisions in an environment of uncertainty about the outcome of the future events that serve as key inputs to the firm's planning processes.
These events include such things as the future levels of product demand and market share, raw material and labor costs, inventories, personnel requirements, and the impact of various market, competitive and economic factors on the organization's performance, to name but a few (Jarrett, 1991; Makridakis, 1998). Business forecasting is a formal process for managing the uncertainty inherent in an organization's planning process by providing a numerical prediction, or forecast, of the future level of the event or key input of interest (Jarrett, 1991; Baines, 1992; Altabet, 1998; Li, Ang and Gray, 1999; Winklhofer, 2002). The field of business forecasting began in earnest in the late 1950s, at a time when many individuals questioned the validity of a discipline aimed at predicting an uncertain future. However, since that time the results of empirical research presented in over one thousand articles and books have demonstrated the efficacy of business forecasting. Today, business forecasting is a discipline with a strong and comprehensive theoretical framework, one that is widely accepted by scholars and routinely applied by practitioners (Chatfield, 1997; Makridakis, 1996; Makridakis, 1998; Ord, 2000; Armstrong, 2001). The widespread adoption of databases and data warehouses, combined with the continued decline in the cost of mass storage, has allowed businesses to capture and store data on virtually every aspect of their operation. Giacomini (2003) suggests that it is for this reason that the literature on, and interest in, business forecasting is experiencing a renaissance.

2.2 Forecast Method Taxonomy and Selection

A large number of business forecasting methods have been developed during the past several decades. Chambers, Mullick and Smith (1971) were among the first to examine the problem of how to select from among available methods. These authors created a chart of six forecasting performance criteria by eighteen forecasting techniques.
The performance criteria included accuracy, application, data required, cost, ease of implementation and robustness. Their rating of each technique on each criterion was based on the authors' general impressions. Reid (1972) advanced the idea of Chambers, Mullick and Smith (1971) by representing the method selection process in the form of a decision tree in which the branches reflected the criteria, and the rating of each model was based on empirical evidence as opposed to general impressions. Jenkins (1974) suggested that a better approach than classifying methods on criteria was to simply use the Box-Jenkins methodology to identify and estimate a model from the ARIMA class of time series models in all forecasting situations. Armstrong (1982) conducted a survey of academics and practitioners at the First International Symposium on Forecasting to solicit their opinions on the criteria they felt were most important for selecting a forecasting method. The survey results indicated that 70% of practitioners believed that accuracy was the most important criterion for selecting a method. The notion that accuracy should be the primary criterion for selecting the appropriate forecast method was reinforced in studies by Newbold and Granger (1974); Reid (1975) and Makridakis and Hibon (1979, 1982). Georgoff and Murdick (1986) were the first to suggest that guidelines, based on prior research findings, could be used to identify the method that would be most accurate for a given forecasting situation. This was important research that laid the groundwork for the current belief that forecasting method accuracy tends to be data and situation specific. Dalrymple (1987) used a mail survey to obtain information about the use of methods for sales forecasting in one hundred thirty-four US companies.
These companies reported that they relied on expert opinion (sales force 44.8%, executives 37.3% and industry experts 14.9%); analogies (leading indicators 18.7%); econometric models (12.7%) and extrapolation (49.6%) to produce their forecasts. He also cited several other studies on the use of forecasting methods that contained similar findings. Rhyne (1989) conducted a survey of the senior management at forty hospitals. It was reported that a "jury of executives" was used to produce a forecast by 87% of respondents, with 67% relying on the forecasts produced by experts. Extrapolative methods were used by 65% of respondents, followed by 12.5% of respondents who used regression analysis. Frank and McCollough (1992) conducted a survey similar to Dalrymple's that included 290 practitioners of the Finance Officer Association for US state governments. They found that the most widely used forecasting method by this group was judgment (82%), followed by trend line (52%), econometric techniques (26%), moving averages (26%) and exponential smoothing (10%). Sanders and Manrodt (1994) found that while knowledge of quantitative methods seemed to be increasing over time, firms still relied heavily on judgmental methods. Yokum and Armstrong (1995) conducted an analysis of previous survey research and concluded that accuracy was the most important selection criterion. Further, these authors highlight that the implications of selecting the most accurate methods are extremely important in practical terms, as even small improvements in the accuracy of a forecast can provide considerable savings to an organization. Makridakis (2000) also reported his observation that forecast accuracy was of primary importance. The increasing focus on forecast accuracy as the primary criterion for forecast method selection described by Makridakis (2000) resulted primarily from the increasing presence of digital computing technology.
The processing power of computers made selection criteria such as cost and time all but irrelevant. Further, the availability of forecasting software provided forecasters with the ability to produce quantitatively derived forecasts as readily as a judgmental forecast or forecasts based on expert opinion. Makridakis (2000), Meade (2000) and Armstrong (2001) report that a number of important conclusions about the relative accuracy of alternative forecasting methods have consistently been reached in prior empirical studies of forecast accuracy. They are: 1) the accuracy of a structured approach, whether data is available or not, is greater than the accuracy of an ad hoc approach; 2) the accuracy of quantitative methods exceeds that of judgmental methods when enough data exists; and 3) the accuracy of extrapolative methods often exceeds that of explanatory variable or causal models, depending on the level of change in the variable of interest. Armstrong (2001), in "Selecting Forecasting Methods", presents a decision tree that allows users to identify the forecasting method that should produce the most accurate forecast given a number of situational factors and conditions. The structure of the tree is based on a method taxonomy in which the myriad forecast models and techniques that have been developed during the past few decades are classified into one of ten methods or method categories. In this taxonomy the ten methods belong to one of two major categories: judgmental methods and quantitative methods. Judgmental methods rely on the forecaster's judgment, the opinion of experts, leading indicators, analogous situations and intuition to produce a forecast value of the variable or event of interest. Quantitative methods, in contrast, rely on statistical relationships within and among data collected on the specific variable or event of interest, as well as on related variables and events, to produce the required forecast.
The category of judgmental methods includes six sub-categories: Expert Forecasting, Judgmental Bootstrapping, Conjoint Analysis, Intentions, Role Playing and Analogies. The sub-categories of quantitative methods are: Time Series Extrapolation, Explanatory, Rule-Based Forecasting and Expert Systems. These sub-categories each contain many alternative models, or alternative specifications of a model, that are used to actually produce the numeric forecast. Armstrong (2001) developed the following rules to select the most accurate method from among the ten alternative methods. Given that there is enough data available, quantitative methods will produce more accurate forecasts than judgmental methods. If quantitative methods are selected, then the forecaster needs to consider whether the causal influences on the variable of interest are known; the amount of change that is expected in that variable; the type and amount of data that is available; the need for policy analysis; and the extent of domain knowledge. If judgmental methods are selected, the forecaster needs to consider whether or not large changes are expected in the value of the variable of interest over the forecast horizon; whether a large number of forecasts will be required; differing views among key decision makers; and policy considerations. Figure 2.1 is the decision tree from Armstrong (2001).

2.3 Traditional Extrapolative Methods

The extrapolative forecasting methods from Figure 2.1 are the focus of this study. These methods are accurate, reliable and easy to automate, and for these reasons they are the most popular quantitative methods. These methods are widely used for producing inventory and production forecasts, demand forecasts, budgeting forecasts, operational planning and some long-term forecasts, as well as forecasts in many other areas of a business's operations (Armstrong, 2001).
Extrapolative forecasting methods produce a forecast, or future value, of a variable of interest by examining the past behavior of that variable. Unlike explanatory variable or econometric methods, extrapolative methods do not attempt to identify the factors that are responsible for the historical levels of the variable of interest.

(Figure 2.1)

In an extrapolative method it is the passage of time that acts as a proxy for whatever is really causing the behavior of the variable. The goal with extrapolative methods is to identify the pattern in the values of the variable of interest, and then extrapolate that pattern into the future. Extrapolative methods have routinely been found to provide more accurate forecasts than econometric methods (Makridakis, 2000; Meade, 2000; Armstrong, 2001). To use an extrapolative method the variable of interest must be organized as a time series. A time series is a collection of historical observations on a quantitative variable that are equidistant with respect to time and are arranged sequentially. For example, a ten-observation time series of annual sales data would be the sales level measured on December 31st for each of the years 1995-2004. While any time interval is possible, in business forecasting most data is captured on a daily, weekly, monthly, quarterly or yearly basis. Although dozens of extrapolative methods have been developed during the past thirty years, practitioners and academics have adopted fewer than twenty methods for regular use. In theory, extrapolative methods are particularly good for producing short-term forecasts where, within reason, the past behavior of the variable of interest is a good predictor of the variable's future behavior. These methods range from simple to complex relative to the statistical procedures required to model the historical observations of a time series. The general form of an extrapolative method is Y(t+1) = f(Y(t), Y(t-1), Y(t-2), ..., Y(0), t).
In this equation Y_{t+1} represents the predicted value of the time series for one time period ahead. The most recent actual historical observation is designated as Y_t. The next most recent actual historical observation is Y_{t-1} and occurs in period (t-1), and so on. This model implies that a predicted value is a function of its previous values and of time (Jarrett, 1991). Time series data has five basic components. The first four components are the average, trend, seasonal and cyclical components. These are referred to as the systematic components. The fifth component is error, and it is the non-systematic component. The relationship can be represented as data = pattern + error = (systematic components) + error. Conventional wisdom suggests that to make accurate forecasts the extent to which each component is present in a given time series must be taken into account. In fact a considerable amount of research has focused on devising ways to disaggregate the components of a time series so that forecasts can be produced for each individual component. The final forecast of the time series overall is produced by aggregating the component forecasts in a systematic way. This process is generally referred to as time series decomposition. The average component, also referred to as the level component, is the sum of the values of the time series divided by the number of time periods. Time series that have only an average component are referred to as stationary series. The trend component is represented by the tendency for the values of the time series to systematically increase or decrease over time. Trends can be linear as well as curvilinear. Time series in which the trend component is present are referred to as non-stationary series. Trends can be identified by conducting a visual inspection of a plot of the data or by fitting a trend line. The seasonal component is any repeating pattern in the time series that has a period of exactly one year for a complete cycle.
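The general form Y_{t+1} = f(Y_t, Y_{t-1}, ..., Y_0, t) can be made concrete with very simple choices of f. The sketch below (hypothetical data, not drawn from the dissertation) shows two such instances: the naive method, where f simply returns the last observation, and a drift method, where f extrapolates the average historical change.

```python
# Two simple instances of the general extrapolative form
# Y_{t+1} = f(Y_t, Y_{t-1}, ..., Y_0, t), using a made-up sales series.

def naive_forecast(series):
    """Naive method: f returns the most recent observation Y_t."""
    return series[-1]

def drift_forecast(series):
    """Drift method: f adds the average per-period change to Y_t."""
    t = len(series) - 1
    avg_change = (series[-1] - series[0]) / t
    return series[-1] + avg_change

sales = [100, 104, 109, 113, 118]   # hypothetical yearly sales data
print(naive_forecast(sales))        # 118
print(drift_forecast(sales))        # 118 + (18 / 4) = 122.5
```

The drift method is one of the simplest ways to capture a trend component; the naive method captures only the level.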
This component represents a predictable increase or decrease in demand depending on the week, month or season of the year. The seasonal component can arise from calendar or climatic influences, as well as from other influences that repeat at approximately the same time each year. Seasonality is most frequently associated with monthly, quarterly and bi-annual series; however it can exist in a series of any time interval except a yearly one. There are several ways to identify seasonality in a time series. The first is to conduct a visual inspection of a line graph of the values of the time series of interest. A second graphical approach is to examine the Autocorrelation Function (ACF) for the series. The pattern in the ACF reveals the presence or absence of a seasonal component. The most widely used approach, however, is to calculate seasonal indices for the time series of interest. One method for doing so is the ratio-to-moving-average method (Jarrett, 1991). The cyclical component is represented by long-term repeating cycles in the series that are not related to seasonal effects. This component arises from two factors. The first is the business cycle, which is influenced by a number of economic factors that cause the economy to go through a repeating pattern of recession and expansion. The second factor is the product life cycle, which reflects demand for a product from its introduction through its decline. The magnitude and duration of cycles is difficult to predict. This difficulty arises from an inability to predict the effects of national and international events, such as elections, wars or political turmoil around the globe. The error component is represented by any fluctuations in the time series that are not classified as one of the four systematic components. In essence these fluctuations are error.
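A simplified sketch of the ratio-to-moving-average idea for quarterly data follows. The demand figures are invented for illustration, and the series is assumed to start in the first season; the dissertation's own treatment (Jarrett, 1991) may differ in detail, e.g. in how ratios are normalized.

```python
# Ratio-to-moving-average seasonal indices for a quarterly series (period=4).
# Each observation is divided by a centered moving average; the ratios are
# then averaged by season to give one index per quarter.

def seasonal_indices(series, period=4):
    """Return the average actual/centered-moving-average ratio per season.
    Assumes the series begins in season 0."""
    n = len(series)
    half = period // 2
    ratios = {s: [] for s in range(period)}
    for t in range(half, n - half):
        # centered moving average: mean of two adjacent period-length means
        ma1 = sum(series[t - half:t + half]) / period
        ma2 = sum(series[t - half + 1:t + half + 1]) / period
        cma = (ma1 + ma2) / 2
        ratios[t % period].append(series[t] / cma)
    return {s: sum(r) / len(r) for s, r in ratios.items() if r}

# two years of hypothetical quarterly demand with a fourth-quarter peak
demand = [80, 100, 95, 125, 84, 105, 100, 131]
print(seasonal_indices(demand))
```

An index above 1 marks a quarter that runs above the deseasonalized level (here the fourth quarter), and an index below 1 marks a below-average quarter.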
This error results from the occurrence of non-periodic, unpredictable and catastrophic events, including strikes, terrorist attacks and stock market crashes. In addition, error can arise from selecting a forecasting model that is incorrectly specified given the nature of a particular time series, from measurement error in the historical observations of the time series, or from the randomness inherent in the series itself. It is due to the presence of the error component that forecasts are always wrong, even though they may still be quite useful for decision-making purposes. The value of the error component can be found as the difference between the actual and forecast values for each time period. There are some general guidelines for the application of extrapolative methods. Within this method category, some methods have been designed to be most appropriate for extrapolating stationary series in which only the average component is present; others are most appropriate for extrapolating series containing a trend; and still others are most appropriate for extrapolating series containing a seasonal component. In practice, however, it is difficult to know which specific extrapolative method will produce the most accurate forecast of a given time series. Thus, a common approach for selecting the most appropriate extrapolative method for a given situation is to compare several alternative methods as to their forecast accuracy. 2.4 Measures of Forecast Accuracy There are numerous measures available to establish the accuracy of the forecasts produced by extrapolative forecasting methods. These measures reflect different approaches to aggregating the individual differences between the observed and forecast values for the same time period of a given time series. During the past several decades a number of measures of forecast accuracy have been proposed.
In some instances the accuracy measures are used to determine the accuracy of the fit of the model to all of the historical observations of the time series in question. This is referred to as in-sample forecast accuracy, or model fit. In other instances these measures are used to establish the accuracy of forecasts of values of a time series that were not used to calibrate the model. This second approach relies on post-sample, or ex ante, forecasts, and is considered to be the preferred approach for assessing the accuracy of an extrapolative forecasting method. In this second approach the actual historical observations of a time series are divided into a training data set and a validation data set. The second set is frequently referred to as the hold-out set. The training data is used to calibrate the model, and the validation set contains the observations to which the forecast values will be compared. Statisticians who focused primarily on theoretical considerations developed the first forecast accuracy measures. One of these accuracy measures is the Mean Square Error (MSE). To calculate this measure the individual difference between each observed and forecast value for a given time period is squared, and the average of these squared errors is taken. This average of the squared errors is the MSE. Another early measure is the Root Mean Square Error (RMSE), which is obtained by taking the square root of the MSE. Yet another early measure is the Mean Absolute Deviation (MAD), which is the average of the absolute value of the difference between each observed and forecast value. In practice MAD provides the most useful interpretation of these three measures, as it describes on average by how much each forecast will be wrong in the actual units of the time series in question.
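The three early measures just described can be stated compactly in code. The sketch below applies them to a hypothetical hold-out set of four actual and four forecast values (the numbers are invented for illustration).

```python
# MSE, RMSE and MAD computed on a hypothetical validation (hold-out) set.
import math

def mse(actual, forecast):
    """Mean Square Error: average of the squared forecast errors."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root Mean Square Error: square root of the MSE."""
    return math.sqrt(mse(actual, forecast))

def mad(actual, forecast):
    """Mean Absolute Deviation: average of the absolute errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

actual   = [120, 130, 125, 140]
forecast = [118, 134, 123, 137]
print(mse(actual, forecast))   # (4 + 16 + 4 + 9) / 4 = 8.25
print(rmse(actual, forecast))  # sqrt(8.25), about 2.87
print(mad(actual, forecast))   # (2 + 4 + 2 + 3) / 4 = 2.75
```

The MAD value of 2.75 has the direct interpretation noted above: on average each forecast misses by 2.75 units of the series.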
Makridakis and Hibon (1979), in a large-scale empirical study, relied on several accuracy measures, including Theil's U, Mean Absolute Percentage Error (MAPE), Percentage Better and Relative Ranking, to determine the relative accuracy of the forecasting methods being evaluated in their study. Carbone and Armstrong (1982), in a survey of forecasting experts, found that Root Mean Square Error (RMSE) was the most preferred measure of forecast accuracy. This preference ran counter to the conventional wisdom at the time that error measures such as RMSE, which are not unit-free, are not reliable measures of relative accuracy. These authors found Mean Absolute Percentage Error (MAPE) to be the most widely used unit-free accuracy measure, and concluded that the choice of error measure used to identify the most accurate forecasting method appeared to be a question of personal taste. Ahlburg (1982) reviewed seventeen papers dealing with the accuracy of population forecasts. The author found that Mean Absolute Percentage Error (MAPE) was used in ten papers; Root Mean Square Error (RMSE) in four papers; Root Mean Square Percentage Error (RMSPE) in three papers; and Theil's U in three papers. Multiple measures were used in four papers. The author observed that no justification for the use of a particular measure was provided in any of these papers. Armstrong and Collopy (1992) conducted research for the purpose of establishing guidelines for the selection of the appropriate accuracy measure. In their study these authors evaluated the relative accuracy of eleven extrapolative forecasting methods, across one hundred ninety-one time series, with six different forecast accuracy measures. These authors concluded that the choice of accuracy measure does indeed make a difference in the identification of the most accurate forecasting method.
They recommended the Geometric Mean of the Relative Absolute Error (GMRAE) accuracy measure when the need is to assess the accuracy of model fit. Further, they concluded that the error measure that should be used to select the most accurate method for producing out-of-sample forecasts is the Median Relative Absolute Error (MdRAE), and in those situations where only a few series are being evaluated the Median Absolute Percentage Error (MdAPE) should be used. They also observed that the Percent Better error measure performed well when many series are being evaluated. Finally, they concluded that RMSE is not reliable and should not be used for comparing the accuracy of alternative methods across series. Fildes (1992) conducted a study similar to that of Armstrong and Collopy (1992). In his study he observed that different forecasting conditions (data type, forecast horizon, component type) affect the ranking on accuracy of alternative forecasting methods by different accuracy measures. In this study the Geometric Root Mean Squared Error (GRMSE) and the Median Absolute Percentage Error performed well, while RMSE was found to be sensitive to values close to zero and MAPE was sensitive to outliers. Makridakis (1993) investigated the concern among forecasters, at that time, as to the selection of the appropriate accuracy measure. In his study the author reviews the prior research on accuracy measures and their selection under various forecasting conditions. He suggests that Theil's U2, as well as the RAE measures (including the Geometric, Mean and Median RAE), are highly problematic because their divisor is the difference between the actual value and the random walk forecast, which in some instances can be zero and in others can be very large. Further, he indicates that the various RAE values are meaningless to most decision makers, and that the geometric means pose the additional problem of being incalculable when working with a large number of series.
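To make the RAE family concrete, the sketch below follows the usual literature definition (a hedged reconstruction, not the dissertation's own formulas): each period's absolute error is divided by the absolute error of a random-walk baseline, and the GMRAE is the geometric mean of those ratios. The random walk here forecasts every validation period as the last training observation; the data is invented.

```python
# Relative Absolute Error against a random-walk baseline, and its
# geometric mean (GMRAE), per the common textbook definitions.
import math

def rae(actual, forecast, last_training_value):
    """Per-period |method error| / |random-walk error|, where the random
    walk forecasts each period as the last observed training value."""
    return [abs(a - f) / abs(a - last_training_value)
            for a, f in zip(actual, forecast)]

def gmrae(raes):
    """Geometric mean of the per-period relative absolute errors."""
    return math.exp(sum(math.log(r) for r in raes) / len(raes))

actual   = [110, 120]
forecast = [108, 116]
errors = rae(actual, forecast, last_training_value=100)  # [0.2, 0.2]
print(gmrae(errors))  # 0.2: the method's errors are one fifth of the baseline's
```

The divisor problem Makridakis (1993) raises is visible here: if an actual value equals the baseline forecast, the ratio's denominator is zero and the measure is undefined.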
Because accuracy measures based on rankings, as well as median-based measures, are not relative measures calculated as a ratio of a proposed model to a baseline model, these measures are not suited for general forecasting use; however they can be used in large-scale accuracy studies. He indicates that the Percentage Better measure is reliable, but should likewise only be used in large-scale empirical studies, and that MSE and RMSE are neither relative nor do they convey much meaning to decision makers. An additional baseline error measure useful for establishing the relative accuracy of alternative methods in large studies is Benchmark. Benchmark is simply the difference between the SMAPE value of a benchmark method such as Naïve 2 and the SMAPE value for each of the other methods. Makridakis suggested further that MAPE is a relative measure that incorporates the best characteristics of the other accuracy measures, and is the only one, other than Percent Better, that leads to a meaningful interpretation by decision makers. MAPE can be used in large-scale studies as well as for general use. This author provides an overview of the problems with the MAPE measure and proposes a modification to the measure to address its shortcomings. This improved MAPE measure was originally referred to as modified MAPE and later came to be known as Symmetric MAPE, or SMAPE. Collopy and Armstrong (2000) conducted an empirical study to re-examine the problems with RAE and to investigate the performance of SMAPE. These authors concluded that the search for the most effective error measure for making comparisons across series is still underway. They acknowledge that the SMAPE error measure introduced by Makridakis (1993) has desirable characteristics, specifically that it is a relative measure and that it is unbiased, and as such further investigation of it is warranted. However, pending the results of further research, a relative error measure such as MdRAE should also be used.
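The modification behind SMAPE can be shown in a few lines. The sketch below uses the common percentage formulation, which may differ in detail from the exact definition used in the dissertation; the data is invented to show the change in the divisor.

```python
# MAPE and the symmetric variant (SMAPE) proposed by Makridakis (1993),
# in their common percentage formulations.

def mape(actual, forecast):
    """Mean Absolute Percentage Error: errors scaled by the actual value."""
    return 100 * sum(abs(a - f) / abs(a)
                     for a, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """Symmetric MAPE: errors scaled by the mean of actual and forecast."""
    return 100 * sum(abs(a - f) / ((abs(a) + abs(f)) / 2)
                     for a, f in zip(actual, forecast)) / len(actual)

actual   = [100, 100]
forecast = [50, 150]          # one under-forecast, one over-forecast
print(mape(actual, forecast)) # both errors are 50% of the actual -> 50.0
print(smape(actual, forecast))  # divisor changes with the forecast level
```

Because the SMAPE divisor averages the actual and forecast values, it stays well defined when the actual value is near zero, which is MAPE's main weakness.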
2.5 Forecasting Competitions Elton and Gruber (1972), Reid (1972), and Newbold and Granger (1974) were among the first to establish the relative accuracy of different forecasting methods across a large sample of time series. However, these early studies compared only a limited number of methods, and in them the accuracy of the various methods was measured as model fit. Elton and Gruber (1972) also used group difference testing to identify the most accurate method. Makridakis and Hibon (1979) compared the relative accuracy of nine forecasting methods across one hundred eleven time series of strictly business and economic data. The accuracy measures MAPE, Theil's U and Percentage Better were used to establish the relative accuracy of the forecast methods using a validation data set. The accuracy measures were aggregated across series in such a way that the most accurate method for each data type in the study could be identified, as could the most accurate method for all series used in the competition. Statistical tests for differences between methods, as applied by Elton and Gruber (1972), were abandoned in this study and in most subsequent large-scale accuracy studies. There are three reasons for this decision. Firstly, in a forecasting competition the methods are all reasonable alternatives for one another. For example, consider two forecasting methods A and B, and suppose that Method A has the higher observed accuracy of the two. If A is the method selected to produce the forecast and a difference exists between A and B, then A would produce the more accurate forecast. If, on the other hand, there is no difference between A and B and A is the selected method, then method A will produce a forecast that will be as accurate as a forecast produced by method B. For this reason observed accuracy is emphasized in method selection. It should be noted that in practice confidence intervals are routinely used.
Secondly, the accuracy measure Percent Better is typically considered to provide more useful information about the real differences between the accuracy of alternative methods and is used in most large-scale studies. Thirdly, studies have shown that the ranking of various methods on forecast accuracy differs according to the accuracy measure used. Therefore, group difference testing based on rankings derived from different measures of forecast accuracy would lead to very different conclusions about which method produced the most accurate forecasts. The major finding of this study was that simple extrapolative methods perform at least as well as more statistically sophisticated ones, such as Winter's Method, which were designed to extrapolate the trend and seasonal components of the time series in addition to the average component. This conclusion was in conflict with the accepted view held by most experts in the late 1970's that statistically sophisticated forecasting methods would outperform statistically simple methods because the more sophisticated methods could more precisely model the time series. Makridakis and Hibon (1982) introduced the first of what would become three international forecasting competitions, referred to generically as the M-competitions. The goal of the competitions was to establish the relative accuracy of established, as well as new, forecasting methods under various business-forecasting conditions and for varying data types. In this study the authors used five measures of forecast accuracy, MAPE, MSE, Average Ranking, Median Absolute Percentage Error (MdAPE) and Percentage Better, to establish the relative accuracy of twenty-one extrapolative methods across one thousand and one time series, comprised of macroeconomic, microeconomic, industry and demographic data captured at yearly, quarterly and monthly intervals. Group difference testing was not used. Each of the time series was divided into a training data set and a validation data set.
The training sets for the one thousand and one series were provided to each of nine contestants, each an expert with one of the twenty-one extrapolative methods. The use of outside experts to produce the forecasts was in response to criticisms of the authors' 1979 study, in which the authors themselves produced all of the forecasts. The experts produced six, eight and eighteen one-period-ahead forecasts for the yearly, quarterly and monthly time series respectively. The forecasts were then compared to the values in the validation data set in a post-sample fashion. Accuracy measures were aggregated for each data type by time period subcategory, each data type category and for all of the series in the study. The results of the M-competition were similar to those of the Makridakis and Hibon 1979 study. The four important findings of this study were that: 1) statistically sophisticated methods do not necessarily provide more accurate forecasts than do simple models; 2) the ranking of methods on relative accuracy differs for different accuracy measures; 3) the combination of the forecasts from alternative models outperforms the accuracy of each of the methods being combined; and 4) the ranking of methods on forecast accuracy differs for different forecast horizons. Hill and Fildes (1984), Lusk and Neves (1984), and Koehler and Murphee (1988) used the M-competition data in what amounted to replications of the M-competition. They reported findings similar to those of Makridakis and Hibon (1982). Gardner and McKenzie (1985), Geurts and Kelly (1986), and Clemen (1989) relied on a portion of the M-competition data to develop and test new extrapolative methods. In particular, the Gardner and McKenzie (1985) model, Damped Trend Exponential Smoothing, has been examined in a number of additional accuracy studies since 1985, and has been found to be very accurate, particularly with yearly time series and with time series that have a trend component.
Armstrong and Collopy (1992, 1993) and Makridakis et al. (1993) applied the notion of a competition, established in the M-competition, to a set of telecommunications time series. The results of this study reconfirmed the four major findings of the 1982 M-competition. Makridakis et al. (1993) discussed the findings of the M2 forecasting competition, held during 1987 and 1988, as a way to advance the study of forecasting accuracy and to address the major criticism of the M-competition. That criticism was that forecasters in real situations could utilize domain knowledge about their business and industry to improve the accuracy of extrapolative methods. The format of the M2-competition therefore was designed to evaluate this hypothesis, as well as hypotheses relating to the four 1982 M-competition findings. This competition consisted of distributing twenty-nine actual time series from four companies to five expert forecasters. The competition was run in real time over the course of a two-year period. The experts could incorporate any information they could obtain from either the company or from secondary sources into their monthly forecasts. The accuracy of the forecasts produced by the experts was measured on a validation data set comprised of the actual values for the twenty-nine time series that were obtained after the beginning of the competition. The accuracy measures used in this study were MAPE, Percent Better and Benchmark. In this study the Benchmark calculations were based on the MAPE values for each method. The primary finding of the M2-competition is that the additional information used by the experts did not result in forecasts that were more accurate than those produced by the quantitative methods alone. Further, the other important findings of this study reproduced the four major findings of the original M-competition.
Makridakis and Hibon (2000) introduced the third and, the authors state, final forecasting competition, the M3-competition. In the words of the authors, "The goal of this study is to respond to those experts who continue to build more sophisticated methods without regard to the ability of such methods to more accurately predict real-life data". The M3-competition was designed to extend the M and M2 competitions, as well as the myriad of other accuracy studies conducted during the past twenty years. The M3-competition established the relative accuracy of twenty-four extrapolative forecast methods across three thousand three time series of macroeconomic, microeconomic, industry, demographic, financial and other data captured over yearly, quarterly and monthly time intervals. The forecasting methods were used to produce forecasts for horizons of six, eight and eighteen periods ahead, for yearly, quarterly and monthly data respectively, for each of the data types. The forecasting methods examined in this study range from the statistically simple random walk method to statistically sophisticated methods that include neural networks, the Box-Jenkins approach, expert systems and Rule-Based Forecasting. Six measures of forecast accuracy were used to establish the relative accuracy of the twenty-four methods. As with the earlier studies, the time series were divided into a training data set and a validation data set. The accuracy measures used in this study were Symmetric Mean Absolute Percentage Error (sMAPE); Average Ranking; Percentage Better; Median Symmetric Absolute Percentage Error (MdAPE); Median Relative Absolute Error (MdRAE) and Benchmark. As was the case in the earlier M-competitions, accuracy measures were aggregated by data type-time interval subcategory, data type category and for all of the series overall.
The four major conclusions of the M3-competition reconfirmed the findings of the previous M-competitions. The decision as to the most accurate method was based on consensus among the error measures. Further, as difference testing was not used, the authors chose to report the top three methods for the various sub-categories, categories and for all of the series evaluated in the competition. Fildes and Makridakis (1998), Makridakis and Hibon (2000), and Fildes (2001) concluded, based on the findings of the M-competitions and other accuracy studies, that simple extrapolative methods that do not explicitly extrapolate, through decomposition, the trend or seasonal components of a time series provide forecasts that are at least as accurate as those produced by sophisticated methods that do explicitly extrapolate these components. These authors conjectured independently that the reason simple methods can outperform more sophisticated methods is that the former are robust to features in real-life time series that confound more complex methods. These features include: 1) structural changes caused by fads and fashions that can result in changes to established patterns; 2) a high level of randomness or uncertainty that results from competitive actions and reactions, and unforeseen events that cannot be accurately predicted; and 3) strong cycles of varying duration and length whose turning points cannot be predicted. These authors argue that future research to improve the accuracy of extrapolative methods should focus on the development of statistically simple methods that can take into account the real-life behavior of time series. As an example, Makridakis and Hibon (2000) cite the introduction of a new method, Theta (Assimakopoulos and Nikolopoulos, 2000). Although this method is not based on strong statistical theory, it performs remarkably well across different types of series, forecasting horizons and accuracy measures.
Makridakis and Hibon (2000) conclude: "Hopefully, new extrapolative methods, similar to Theta, can be identified and brought to the attention of practicing forecasters". 2.6 Is Fuzzy Logic The Solution? Fuzzy Logic is a data processing technology that has received wide acclaim for its ability to model real-world data more accurately than traditional mathematical approaches (Stevens, 1993; McNeil and Freiberger, 1993; Kosko, 1994; Hajek, 2002; Mukaidono, 2002; Mendel, 2001; Nguyen and Walker, 2000). Fuzzy logic is based on both traditional logic and traditional set theory, and was developed in 1965 by Lotfi Zadeh, professor of electrical engineering at the University of California at Berkeley (Zadeh, 1965). Traditional propositional logic is based on the Laws of Thought, as defined by Aristotle and other early Greek philosophers. In this system, a proposition is an ordinary statement that is comprised of a priori defined terms. For example, "It is cold outside today". One of these laws, the Law of the Excluded Middle, states that every proposition must be either true or false and is accordingly associated with a truth-value of 1 or 0 respectively. Meaningful propositions like the one in the above example can be determined to be either true or false. Logical reasoning is the process of combining propositions into other propositions, forming a logical structure that allows the truth or falsity of all propositions in that structure to be determined. Propositions can be combined in many ways, all of which are derived from three fundamental operations: conjunction, disjunction and implication. For two propositions p and q, conjunction (denoted p ∧ q) asserts their simultaneous truth. For example, it is snowing today AND it is cold today. Disjunction (denoted p ∨ q) asserts the truth of either or both propositions. For example, it is snowing today OR it is cold today.
Implication (denoted p → q) asserts a conditional relationship between two propositions in the form of IF-THEN rules. For example, IF it is cold outside today THEN I will wear a warm jacket. In implications the proposition associated with the IF portion of the rule is referred to as the antecedent, and the proposition associated with the THEN portion of the rule is referred to as the consequent. Conjunction and disjunction can also be used to combine additional propositions within the antecedent and consequent of the rules. For example, IF it is cold today AND it is snowing today THEN I will wear a warm jacket AND a warm hat. The German mathematician Georg Cantor introduced traditional set theory in 1884 (McNeil and Freiberger, 1993). He proposed a theory of sets that built very much on the work of the early Greek philosophers. Cantor defined sets as collections of definite, distinguishable objects. Sets can represent people, things, words, or any creation of the human imagination. In Cantor's theory, sets divide the world into IN and OUT, or TRUE and FALSE, with the associated truth-values of 1 or 0, respectively. Each potential member of a set either belongs or does not belong to a given set. For example, given two sets Cold Days and Hot Days, the day December 1st 2004 can be assigned to one and only one of the sets based on the temperature in Fahrenheit on that day. The similarities between these two bodies of thought are illustrative of their common origin. Consider the similarity between the conjunction operation in logic and the intersection operation in set theory. In conjunction, a compound proposition is true overall only if proposition p AND proposition q are both true. In intersection, an element is in the intersection only if the element is a member of Set 1 AND Set 2. The same correspondence exists among many other logic and set operations.
Further, in logic, if a proposition is true it is assigned a truth-value of 1, and if it is false it is assigned a truth-value of 0. In set theory, if an element is a member of a set it receives a membership value of 1, and if it is not a member of a set it receives a membership value of 0. Zadeh relied on these similarities to meld logic with set theory to form fuzzy logic. 2.7 Zadeh's Epiphany Zadeh, who is also the father of modern Systems Theory, began working in the area of complex systems in the 1950's. Zadeh (1962) concluded that, "as the complexity of a system increases, it becomes more difficult and eventually impossible to make a precise statement about its behavior, eventually arriving at a point of complexity where the methods for reasoning and decision making born in humans is the only way to get at the problem". Human beings reason and make decisions based on human language rules that are organized as IF-THEN rules similar to a logical implication (McNeil and Freiberger, 1993; Cox, 1994 and 1995). Zadeh observed, however, that the use of traditional two-valued logic by computers prevented them from manipulating data representing subjective or vague human ideas such as IF the weather is fine today THEN I will wear appropriate clothing. Clearly, to most readers there is some vagueness in the meaning of the word fine and the word appropriate in the above rule. However, these words undoubtedly have a very precise meaning to the individual who spoke them. Vagueness is the condition that exists when the status of an object is a matter of definition. The question becomes one of how to harness this decision-making structure. Zadeh (1962) suggested, "We need a radically different kind of mathematics, the mathematics of fuzzy or cloudy quantities which are not discernable in terms of probability distributions". This appears to have been Zadeh's first reference to what would later become Fuzzy Logic.
Zadeh, however, was only one in a long line of philosophers, mathematicians and scientists who had wrestled with the problem of the excluded middle and its associated vagueness. Plato was one of the first to raise concerns about the appropriateness of the Law of the Excluded Middle, and in so doing laid the groundwork for what would become Fuzzy Logic. He observed that there was a third region beyond TRUE and FALSE where, in his words, these opposites "tumbled about" (Aziz, 1996). Charles Sanders Peirce, the preeminent nineteenth-century philosopher, is reported to have referred to those who split the world into TRUE and FALSE as the "sheep and goat separators" (Nadin, 1983). He suggested instead that all that exists is continuous, and that such continuums govern knowledge. For example, size is a continuum, height is a continuum, and even behaviors such as anger and sadness are continuums. He stated that vagueness "is no more to be done away with in the world of logic than is friction in mechanics" (Burch, 2001). Bertrand Russell, another renowned philosopher, concluded in the early 1900's that both vagueness and precision were features of language, not reality. Russell even challenged the notion of TRUE and FALSE. He concluded that without precise symbols they too are vague; therefore any proposition would have a range of facts that would make it TRUE (Irvine, 2004). For example, the statement "This is a car" could refer to a sports car, an economy car, a racecar or even a toy car. Russell (1923) asserted: "Vagueness, is clearly a matter of degree". Jan Lukasiewicz in the early 1900's, relying on the work of Russell, Peirce and others, introduced what was the first attempt at a formal model of vagueness. Today it is referred to as three-valued logic, and it laid the foundation for the development of Fuzzy Logic.
Lukasiewicz in 1920 introduced a new logic in which the truth-value of 1 still stood for TRUE and the truth-value of 0 still stood for FALSE; however, he added the new truth-value of 1/2, which stood for possible (McNeil and Freiberger, 1993). This represented a gigantic leap in the field of logic in that an assertion and its negation could have the same value. For example, it can be asserted that it is possible that it will snow today. This assertion has a truth-value of 1/2. The negation is that it is possible that it will not snow today. The negation also has a truth-value of 1/2. Lukasiewicz had created partial contradiction and in so doing opened the door to fuzzy logic. Zadeh (1965) set forth the mechanics of Fuzzy Logic, in which he fused the classic set theory of Cantor with the three-valued logic of Lukasiewicz. Zadeh recognized that if it was possible, as Lukasiewicz had described, to have truth-values of 0, 1/2 and 1, then it was also possible to have truth-values of 1/4 and 3/4 as well. In fact, if these truth-values were possible, then there are actually an infinite number of truth-values in the interval [0,1]. He concluded that truth actually exists in degrees. He extended this notion of degrees of truth to set theory and concluded that membership functions can assign elements to sets in degrees of truth in the interval [0,1] as well. This allows elements to belong partly to a set. This is the origin of what Zadeh described as a fuzzy set. Fuzzy sets discriminate much better between and among objects and supply more information. While it is counterintuitive, fuzzy sets are more precise than Cantor's bivalent sets. For example, consider the question: if an individual lives in the state of Rhode Island, USA for half the year and in the state of Florida, USA for half the year, is that individual a resident of Rhode Island? Cantor's sets are unable to represent or answer this question. Fuzzy sets, on the other hand, can answer this question with ease.
The individual has a truth-value in the set Residents of Rhode Island, USA of .5 and in the set Residents of Florida, USA of .5. While these membership assignments have been mistaken for probabilities, they are in fact degrees of truth. Probability values assert the chance that an element belongs entirely to a set, whereas a grade of membership asserts the degree to which an element is a member of a particular set. With the advent of fuzzy sets Zadeh had developed a mechanism through which the vagaries of human thought, or of data, could be captured. As a final step Zadeh combined traditional logical implications with fuzzy sets by replacing the propositions, or portions of propositions, that serve as the antecedents and consequents of IF-THEN rules with fuzzy sets, thus creating fuzzy rules or fuzzy logical implications. The same set-theoretic operations that apply to Cantor's sets, including union and intersection, or analogously the same logic operators, including conjunction and disjunction, can be used to combine these fuzzy logical rules into fuzzy logical rule sets. This is the essence of Fuzzy Logic.

2.8 Mapping a Domain With Fuzzy Logic

Kosko (1994) introduced the Fuzzy Approximation Theorem as an explanation for how fuzzy logic models data. Fuzzy Logic approximates a function by defining its surface initially with fuzzy sets and then in turn by covering its surface with what the author refers to as fuzzy patches. The input and output of the fuzzy method can be associated together using these patches. Figure 2.2 below is an illustration of how patches are used to map a function. In this way Fuzzy Logic provides a much more accurate representation of the way systems behave in the real world.

(Figure 2.2)

Cox (1995) states regarding FAT, "Instead of isolating a point on the function surface, a fuzzy rule localizes a region of space along the function surface."
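The degrees-of-membership idea and the set-theoretic operators described above (union as a logical OR, intersection as a logical AND) can be sketched briefly. The representation below, fuzzy sets as dictionaries of membership grades, and all the names are illustrative assumptions, not part of any method discussed in this chapter.

```python
# A minimal sketch of fuzzy set operations, assuming fuzzy sets are
# represented as dicts mapping element -> membership grade in [0, 1].
# Function names and the residency sets are illustrative only.

def fuzzy_union(a, b):
    """Fuzzy union (logical OR): take the max of the membership grades."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def fuzzy_intersection(a, b):
    """Fuzzy intersection (logical AND): take the min of the grades."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

# The residency example: the individual belongs to each set to degree 0.5,
# something Cantor's bivalent sets cannot express.
rhode_island_residents = {"individual": 0.5}
florida_residents = {"individual": 0.5}
either_state = fuzzy_union(rhode_island_residents, florida_residents)
```

The max/min pair is the standard Zadeh choice of operators; other conjunction and disjunction operators exist but are beyond the scope of this sketch.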
When multiple rules are executed, multiple regions are combined in the same local space to produce a composite region, or a fuzzy rule set. The final point on the surface is found through defuzzification. As an example of the application of fuzzy logic to modeling a real-world phenomenon, consider the decision to adjust the thermostat in a house in response to the temperature in the house. In rule-based fuzzy logic, the general form of the rules is: IF x is A THEN y is B, where A and B are fuzzy sets. First, fuzzy sets need to be established for both the antecedent and consequent of the rules. In this example the sets are established for different domains (ambient temperature in the house and thermostat setting); however, this is not always the case. The number of sets required to describe each domain is based on a number of factors, including the modeler's judgment about each domain, their prior experience, or trial and error. In most instances fuzzy sets overlap, and the amount of overlap need not be the same for all sets in a given model. For example, some sets may overlap by 25% while others overlap by 90%. As with the number of sets, the decision on the amount of overlap is based on the modeler's judgment, prior experience, or trial and error. The overlap between and among sets is the feature of fuzzy logic that allows an object to have a degree of membership in more than one set. In this example three sets (TOO COLD, JUST RIGHT, TOO HOT) will be used to model the room temperature from 40 degrees Fahrenheit to 90 degrees Fahrenheit, and three sets (INCREASE, NOTCHANGE, DECREASE) will model the action to be taken relative to the setting of the thermostat.
These sets translate to the following fuzzy rules and fuzzy rule set:

1) IF the room is too cold THEN the thermostat setting should increase
2) IF the room is just right THEN the thermostat setting should notchange
3) IF the room is too hot THEN the thermostat setting should decrease

The above example is a demonstration of a simple control system designed to regulate the temperature in a home. This is a multi-pass system in that the rules are fired multiple times in order to regulate temperature. It is also possible to have a single-pass system where the rule set is developed from a single modeling, or pass, of the data. Further, in this example the modeler established the fuzzy relationship between sets. For example, the antecedent set "too cold" was matched with the consequent set "increase". In other applications it is preferable to allow the data to dictate the antecedent and consequent of the fuzzy rules.

2.9 Fuzzy Logic Applications

Although Fuzzy Logic was introduced first in the United States, American scientists and academics generally avoided using it, mainly due to its unconventional name. The same was generally true of scientists in Europe. It seems that many scientists refused to be involved with a technology that had a name that sounded so child-like (Kaehler, 1998). On the other hand, many other scientists gave fuzzy logic more serious consideration but nonetheless discounted it as being nothing more than probability theory in disguise (Kosko, 1994; McNeil and Freiberger, 1993). At the same time, researchers in many Asian countries, including China and Japan, enthusiastically accepted this new technology. Japan is currently positioned at the leading edge of applied Fuzzy Logic research.
The US, in contrast, is by some estimates ten years behind in the applied use of this technology (Mendel, 2001). One of the first significant applications of Fuzzy Logic was in the area of automated systems, or machine control, in 1973 (Krantz, 1999). At the University of London, Professor Ebrahim Mamdani and his graduate student Sedrak Assilian were trying to stabilize the speed of a small steam engine. Although they were using the most sophisticated digital control equipment available, they were unable to stabilize the speed of the engine, as it would either overshoot the target speed or be too sluggish in achieving it. Professor Mamdani, as the story goes, had recently read about the control method proposed by Professor Zadeh, and decided to try it. He created a simple fuzzy logic controller that worked better than any of the other systems that they had tried (Sowell, n.d.; Krantz, 1999). The best-known large-scale application of Fuzzy Logic to date is its use as the control system for the subways, constructed in 1987, in Sendai, Japan. It has been reported many times that the trains start and stop without the jolts and tugs of inertia common to most subways. It has been estimated that the Fuzzy Logic controller used on this subway system has resulted in a 10% fuel savings as well. Also in Japan, researchers created a Fuzzy Logic controller that can fly a helicopter that is missing one of its rotor blades, something that not even a human pilot can do. Fuzzy Logic has received widespread acceptance as a technology for automated systems control, and it is gaining acceptance as a technology for many other data processing applications.
In Japan there are several billion dollars of successful Fuzzy Logic based commercial products, including: auto-focusing cameras; washing machines that adjust to how dirty the clothes are; automatic transmission and engine controllers; anti-lock braking system controllers; color film developing systems; and computer programs that successfully trade in the financial markets (Krantz, 1999).

2.10 The Mamdani Development Framework

Fuzzy Logic is a rich discipline in which there is more than one way to skin the proverbial data processing cat. The wide range of fuzzy methods that have evolved for the same application evidences this fact. With that said, most rule-based fuzzy logic methods have four major components, or modules. This four-module framework is attributed to the work of Ebrahim Mamdani (Mendel, 2001). The modules are, in order of operation: fuzzification, inference, composition and defuzzification. Fuzzy IF-THEN rules guide the operation of each module. Figure 2.3 provides a graphical representation of the Mamdani Framework.

(Figure 2.3)

The fuzzification module establishes the fact base for the fuzzy method. The input to this module is the scalar values of the data to be processed. In this module the IF-THEN rules that will be used in all four modules are developed and established. These rules are used in the fuzzification module to associate the input scalar observations with input fuzzy sets; in the inference module to associate input fuzzy sets with other input fuzzy sets; in the composition module to create fuzzy rule sets; and finally in the defuzzification module to associate the output fuzzy sets with output scalar values. In addition, the number of fuzzy sets that will be used to model the data is established, as are the characteristics of the membership function for each set. The membership function is defined by the method developer and is used to determine each scalar observation's membership in each fuzzy set.
Each fuzzy set can have a membership function that is unique to that set. The characteristics of a membership function affect each scalar observation's degree of membership (DOM). Membership functions have six characteristics: shape, height, width, shouldering, center points, and overlap. These functions can be depicted graphically in a Cartesian coordinate system with an x and a y axis. The most common shapes are triangular, bell-shaped, trapezoidal and exponential. The height of the function is normalized, or set at one, so that maximum membership in any set is one. The width of the function is its distance along the x-axis for each set, and the width can vary by set. Shouldering is typically used to lock the height for a given set at the maximum DOM of one. The overlap between sets in many instances is set at 50%; however, overlap can range from zero to nearly 100%, and the overlap between the various functions does not have to be the same. Figure 2.4 provides a graphical depiction of a three-set fuzzy model.

(Figure 2.4)

While there are numerous approaches to assigning an observation's DOM, the most frequently used method for control systems is to first identify the observation's location on the x-axis and then project vertically to identify its location on the membership function. The DOM in a given set can be established by judgment or through a more standardized calculation, but must be a value in the interval [0,1]. The output of the fuzzification module is a fuzzy set, or sets, for each scalar observation. The inference module has as its input the fuzzy sets representing the fuzzified scalar values established in the fuzzification module. In this step the appropriate IF-THEN rules are evaluated, resulting in inferences being made about the relationship between the fuzzy sets. These relationships are often referred to as Mamdani fuzzy relationships.
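The vertical-projection approach to assigning a DOM described above can be sketched for the common triangular shape. The specific three-set geometry below (sets with 50% overlap on a [0, 100] domain) is an illustrative assumption, not taken from Figure 2.4.

```python
# A sketch of reading an observation's DOM off a triangular membership
# function, as in the vertical-projection approach: locate x on the
# x-axis, then project up to the function. Geometry is illustrative.

def triangular_dom(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)   # rising edge
    return (right - x) / (right - peak)     # falling edge

# Three sets with 50% overlap covering the interval [0, 100].
three_sets = {"A1": (0, 25, 50), "A2": (25, 50, 75), "A3": (50, 75, 100)}
doms = {name: triangular_dom(40, *abc) for name, abc in three_sets.items()}
# An observation of 40 belongs partly to A1 (DOM 0.4) and partly
# to A2 (DOM 0.6), illustrating membership in more than one set.
```

Because adjacent sets overlap, a single observation receives a non-zero DOM in two sets at once, which is exactly the feature of fuzzy logic the text highlights.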
These relationships are captured as fuzzy rules in which the fuzzy sets serve as the antecedent and consequent of the rules. The relationship can be determined a priori by the modeler, as was the case in the earlier example of a fuzzy home heating controller in section 2.8, or the relationship can be determined by the data with each firing of the inference module's IF-THEN rules. The composition module has as its input the fuzzy rule set that was the output of the inference module. In this module, firing of its IF-THEN rule results in the creation of composite fuzzy sets that serve as the fuzzy output. Individual fuzzy rules may have different conclusions, so composition is the process in which all rules are considered and combined. The output is a collection of fuzzy sets that summarize the fuzzy relationships between the observations in the original data. The defuzzification module has as its inputs the composite, or combined, fuzzy sets that served as the output of the composition module. In this module an IF-THEN rule converts the fuzzy output into scalar values that can be used by the physical system being modeled. There are many approaches to defuzzification, including Maximum, Centroid, Center-of-Sums, Height, and Center-of-Sets.

2.11 Fuzzy Logic Based Extrapolative Methods

Song and Chissom (1991) introduced an extrapolative forecasting method based on Fuzzy Logic. These authors proposed this method to address uncertainty in the form of what they referred to as fuzziness, or vagueness, in the historical observations of university enrollment data. These authors drew a distinction between uncertainty in the form of fuzziness and uncertainty that results from white noise. In the latter case, they concluded that white noise results from random factors that affect the values of the time series. In the former case, they concluded that fuzziness results from non-random factors such as measurement error.
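Of the defuzzification approaches named in section 2.10, the Centroid method is perhaps the simplest to sketch: the crisp output is the membership-weighted average of sampled points on the composite fuzzy output set. The sample points and grades below are invented for illustration.

```python
# A minimal sketch of Centroid defuzzification: the scalar output is the
# membership-weighted average of sampled points on the composite fuzzy
# output set. The sampled output set below is an invented illustration.

def centroid_defuzzify(points, grades):
    """Return sum(x * mu) / sum(mu) over the sampled fuzzy output."""
    total = sum(grades)
    if total == 0.0:
        raise ValueError("cannot defuzzify an empty fuzzy output")
    return sum(x * mu for x, mu in zip(points, grades)) / total

xs = [0.0, 10.0, 20.0, 30.0, 40.0]
mus = [0.0, 0.5, 1.0, 0.5, 0.0]
crisp = centroid_defuzzify(xs, mus)   # 20.0 for this symmetric output
```

For a symmetric composite set the centroid falls at the set's center, as here; asymmetric composites pull the crisp value toward the region of higher membership.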
They argued that a method based on fuzzy logic would be required to handle uncertainty resulting from these non-random factors. In their study, the APE for each forecast value, as well as the MAPE, were used to establish the relative accuracy of their fuzzy logic method and time series linear regression (TSLR) on a single series of enrollment data. Accuracy was based on model fit using one-period-ahead forecasts, that is, a forecast horizon of one period. In their experiment they altered the values of the time series to simulate measurement error and demonstrated that their method was more robust than TSLR, as it produced forecasts that provided a better fit to the historical observations than did the forecasts produced by the regression method, in which enrollment was regressed against time. Although their method was not designed within a specific method development framework, it can nonetheless be discussed generally within the Mamdani Framework. Again, the four modules in this framework are fuzzification, inference, composition and defuzzification. While the Song and Chissom (1991) method showed promise as a viable approach to extrapolation, overall their method was quite cumbersome to use and it would be difficult to automate their procedures. In their fuzzification module they pre-assigned seven fuzzy elements to each of seven fuzzy sets, resulting in an unusual and highly cumbersome approach to fuzzification. The authors also indicated that they believed that seven sets would yield optimal results; however, they did not provide any empirical evidence to support their conclusion. This claim contradicts earlier findings and the generally held belief among most modelers that there is no single parameter value, in any method, that yields optimal forecasts for all time series.
In addition, their use of matrices and vectors in the composition module to capture the relationship between the historical observations of the time series in question is extraordinarily involved. It would appear that the authors' goal was to create a final matrix that captured or summarized all knowledge on the relationship between the fuzzy sets, much in the same way that a regression model does. Chen (1996) introduced a new method in which he replaced the cumbersome composition module of the Song and Chissom (1991) method. Specifically, he replaced the matrices and the MIN-MAX composition operators in their composition module with Mamdani-style fuzzy logical relationships, and replaced their height defuzzifier with a simpler maximum defuzzifier. In Chen's study, which was in part a replication of the Song and Chissom (1991) study and relied on the same enrollment data, APE and MAPE were used to establish the relative accuracy of the Song and Chissom (1991) method and his method with the new composition and defuzzifier modules. In his method he retained the Song and Chissom fuzzifier module. This was done in apparent support of the assertion on the part of Song and Chissom that a fuzzy set parameter of seven yielded the most accurate forecasts. Chen demonstrated that, relative to model fit using one-period-ahead forecasts, his method produced more accurate forecasts, and was more robust to measurement error, than the Song and Chissom (1991) method. It should be noted, however, that both the Song and Chissom (1991) study and the Chen study were deficient in the number of series evaluated, methods examined, and accuracy measures employed. Jarrett and Plouffe (1996) investigated the use of the Song and Chissom (1991) method to forecast occupancy levels in undergraduate student housing.
These authors believed that the historical occupancy data contained measurement error and that the Song and Chissom (1991) method would be robust under these conditions, resulting in more accurate forecasts. In this study four measures of forecast accuracy were used to establish the relative accuracy of the Song and Chissom (1991) method and six alternative extrapolative methods, which included five smoothing methods and TSLR, across fifteen time series of occupancy data. The measures of forecast accuracy used were MAPE, MAD, MSE and RMSE, and they were used as measures of model fit using one-period-ahead forecasts. Jarrett and Plouffe used the procedures and method of the Song and Chissom (1991) study. The ranking of the methods on forecast accuracy was based on minimizing both MAPE and MAD, with MSE and RMSE reserved for comparison to other studies. The major finding of this study was that the Song and Chissom (1991) method provided more accurate forecasts, based on model fit using one-period-ahead forecasts, than did the alternative extrapolative methods. In addition, and as was the case in the Song and Chissom (1991) study, a visual inspection of the plots of forecasted values produced by the Song and Chissom method and the observed values of the time series revealed how closely these patterns replicated one another. With that said, an apparent lag of one period was observed when a trend was present in the historical values of the time series. Jarrett and Plouffe (1998), in an extension of their 1996 study, used the same four accuracy measures to establish the relative accuracy of the Chen method, in addition to the methods evaluated in 1996, across twenty time series of occupancy data. The fuzzy set parameter for both fuzzy methods was seven. The ranking of the models was again based on minimizing both MAPE and MAD as to model fit using one-period-ahead forecasts (a forecast horizon of one period).
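The four accuracy measures used in these studies can be sketched over paired actual and one-period-ahead forecast values. The function name and the sample numbers below are mine, not taken from the studies.

```python
import math

# A sketch of the four accuracy measures named above (MAPE, MAD, MSE,
# RMSE), computed from paired actual values and one-period-ahead
# forecasts. Function name and sample values are illustrative.

def accuracy_measures(actual, forecast):
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n                 # mean absolute deviation
    mse = sum(e * e for e in errors) / n                  # mean squared error
    rmse = math.sqrt(mse)                                 # root mean squared error
    mape = 100.0 * sum(abs(e) / abs(a)                    # mean absolute % error
                       for e, a in zip(errors, actual)) / n
    return {"MAPE": mape, "MAD": mad, "MSE": mse, "RMSE": rmse}

measures = accuracy_measures([100.0, 200.0], [90.0, 210.0])
# MAD = 10.0, MSE = 100.0, RMSE = 10.0, MAPE = 7.5 for these values
```

Ranking methods by MAPE and MAD, as these studies did, emphasizes typical error size, while MSE and RMSE penalize large individual errors more heavily.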
The additional five time series were combinations of fall semester and spring semester occupancy levels for the periods under investigation. These combined series created a seasonal pattern in the data, as the occupancy level for each spring was lower than for the fall of the same year. The major findings of this study were that the Chen method provided the most accurate forecasts for the original fifteen series; however, the Song and Chissom and Chen methods were outperformed by five and three of the traditional methods, respectively, on the five series that were simulating seasonality. The Chen method was, computationally, much easier to implement than the Song and Chissom method as a result of Chen's development of new composition and defuzzifier modules. The improvement in forecast accuracy of the Chen method over the Song and Chissom method may well be attributed to the use of the Mamdani fuzzy logical relationships rather than the extensive matrices and vectors of the Song and Chissom method; the advantage of the Chen method may reside in the fact that the Mamdani fuzzy logical relationships provide a rougher modeling solution than do those matrices and vectors. With that said, the methods continually overestimated the actual values of the time series when a trend was present in the time series. These authors argue that future research to improve the accuracy of fuzzy logic based extrapolative methods should focus on developing methods that will provide more accurate forecasts than alternative traditional methods when a trend or seasonal component is present in the time series. In addition, a fuzzifier module must be developed that is simple to use, easy to automate, and allows for alternative values of the fuzzy set parameter to be considered. Finally, a center-of-sets defuzzifier should be considered.
Further, they suggest that future experiments conducted to establish the relative accuracy of fuzzy based methods should use ex ante forecasts as opposed to model fit, and that the ex ante forecasts should be produced for multiple forecast horizons. Additionally, they suggest that a broader sample of data types should be examined.

2.12 Research Hypotheses

Twenty years of research on time series extrapolation has demonstrated that statistically simple methods provide forecasts that are as accurate, and in many cases more accurate, than those produced by statistically complex methods. Authors including Fildes and Makridakis (1998), Makridakis and Hibon (2000), Fildes (2001) and Small and Wong (2002) suggest that future research to improve the accuracy of extrapolative methods should focus on the development of statistically simple methods that have the characteristic of being robust to the fluctuations that exist in real-world data resulting from both random and non-random events. Since 1991 four studies have been conducted to extend the initial work of Song and Chissom to develop a fuzzy logic method for time series extrapolation. The results of those studies lend empirical support to the theoretical evidence that a fuzzy logic extrapolative method can provide more accurate forecasts than traditional extrapolative methods. Jarrett and Plouffe (1998) suggest in their conclusion that improving the accuracy of these methods requires that a new fuzzy logic extrapolative method be developed that will have the implicit ability to provide accurate forecasts of time series in which a trend or seasonal component is present, without the need to decompose the time series. Additionally, this method should allow for fuzzy set parameters other than seven. In response, a new fuzzy logic method for time series extrapolation has been developed and introduced in this research. This method builds on the work of Song and Chissom (1991) and Chen (1996).
This new method has a new fuzzifier module that allows scalar values to be simply and directly assigned to fuzzy sets. This module will also capture a trend if one exists in the time series, and further it allows the modeler to specify the value of the fuzzy set parameter. In addition, the inference module from the earlier methods has been modified to capture the seasonal component, of any duration, and a new defuzzifier module has been created that uses the center-of-sets principle. Finally, the composition module that uses Mamdani fuzzy logic relationships, which was used successfully in the Chen method, has been retained in the Direct Set Assignment method. Three forecasting competitions have been designed to validate the relative accuracy of the Direct Set Assignment method. These competitions have used, as required for each competition, the data, accuracy measures, procedures and best-performing simple extrapolative methods from the M3-Competition (Makridakis and Hibon, 2000). To investigate the effect of changes to the fuzzy set parameter, in this study two null hypotheses will be tested:

HO1: The ex ante forecast accuracy of the DSA method will not change in response to a change in the number of fuzzy sets, all other model parameters held constant.

HO2: A fuzzy set parameter of seven in a DSA model will yield the most accurate ex ante forecasts when compared to DSA models with fuzzy set parameters other than seven, in the range of set values from two to twenty, all other model parameters held constant.

There are three findings regarding the relative accuracy of extrapolative forecasting methods that have consistently been affirmed in the forecasting competitions and accuracy studies conducted during the past two decades, including the M3 forecasting competition held in 2000. As the data, accuracy measures and procedures are those of the M3-Competition, it is expected that these same three findings will be reaffirmed in this study as well.
Therefore, in this study the following three null hypotheses will be tested:

HO3: The ranking on forecast accuracy of the DSA method and the traditional methods compared in this study will be the same for all accuracy measures considered.

HO4: The ranking on forecast accuracy of a combination of alternative forecasting methods will be lower than that of the specific forecasting methods being combined.

HO5: The ranking on forecast accuracy of the DSA method and the traditional methods compared in this study does not depend on the length of the forecast horizon.

Small improvements in forecast accuracy can lead to cost reduction, enhanced market penetration, and improvement in both operational efficiency and customer service for many businesses. For this reason, and as indicated above, the goal of this research is to introduce a new extrapolative forecasting method, based on fuzzy logic, that will provide more accurate ex ante forecasts than alternative simple extrapolative methods across a varied selection of business data types and forecasting conditions, including those series in which a statistically significant trend is present.
Therefore, in this study the following three null hypotheses will be tested:

HO6: The ranking on forecast accuracy of the time series specific DSA model will be less than or equal to the ranking on forecast accuracy of both the subcategory and category specific DSA models.

HO7: The ranking on forecast accuracy of the DSA method will be lower than that of the traditional extrapolative methods to which it is being compared in this study, by time series subcategory, by time series category, and for all of the time series evaluated in this study.

HO8: The ranking on forecast accuracy of the DSA method will be lower than that of the traditional extrapolative methods to which it is being compared in this study, on those series in which a statistically significant trend is present.

2.13 Summary

As the 21st century gets underway, the field of business forecasting is experiencing a renaissance. This rebirth can be attributed in part to the emergence of mass storage technologies that allow businesses to capture data on all of their essential activities, and in part to the fact that businesses are compelled to use all the information in their arsenal to gain competitive advantage. Forecasts of a business's essential activities are a critical input to its planning processes and are used for developing competitive responses and improving operational efficiency. When sufficient quantitative data are available in the form of a time series, extrapolative forecasting methods are preferred, as they will provide more accurate forecasts than other available quantitative, as well as qualitative, forecasting methods. Several authors have argued that new statistically simple extrapolative methods are needed to achieve further improvements in forecast accuracy for this category of forecasting methods.
For this reason, this research has focused on the development and validation of a new fuzzy logic based method referred to as the Direct Set Assignment method. The findings from several prior studies on fuzzy logic based extrapolative methods indicate that these methods can provide more accurate forecasts than traditional extrapolative methods. To validate the forecast accuracy of this new method, three forecasting competitions have been conducted using the standards and procedures of the M3 forecasting competition. The competitions in this study were designed to evaluate eight hypotheses relating to this new method's relative accuracy when compared to the most accurate methods from the M3-Competition under different forecast situations. The eight specific hypotheses tested were derived from this study's two major research hypotheses. The first is that the ex ante forecast accuracy of the DSA method will change in response to changes in the fuzzy set parameter. The second is that the DSA method will provide more accurate ex ante forecasts than the traditional extrapolative forecasting methods to which it has been compared. In the next chapter, the development of the DSA method within the Mamdani framework is discussed. Two examples of the implementation of the DSA method on two of the time series evaluated in this study are also provided.

CHAPTER 3
THE DIRECT SET ASSIGNMENT METHOD

This chapter begins with a comprehensive description of the development of the Direct Set Assignment method within the four-module Mamdani development framework. This section is followed by two examples demonstrating the implementation of the DSA forecasting method. The chapter closes with a summary of the application of fuzzy logic to time series extrapolation.
3.1 DSA Method Development

The Direct Set Assignment extrapolative forecasting method was developed within the Mamdani design framework discussed in section 2.10 and has as its primary inspiration the fuzzy logic based extrapolation methods introduced by Song and Chissom in 1991 and Chen in 1996. A discussion of the four design components of the DSA method follows. The inputs to the DSA method are those historical values of the time series of interest that have been selected by the modeler as the training set for that time series. There are four IF-THEN rules used in the DSA method, with one used in each of the four modules. A description of each rule appears in the following sections on each of the four modules that comprise the DSA method. Important new features in the DSA fuzzifier include explicitly describing the membership function as well as the degree of overlap between and among fuzzy sets. This was not done in either the Song and Chissom or Chen methods. This adds two additional model parameters to the DSA method that can be manipulated to improve ex ante forecast accuracy. In the DSA model a triangular membership function was used for all fuzzy sets, and the degree of overlap for successive sets in a particular model is identical. An additional new feature in the DSA fuzzifier is a universe of discourse that reflects an extension of the range of the historical values of the time series of interest. In the DSA fuzzifier the minimum and maximum values of the range are decreased and increased, respectively, by the average of the absolute differences between the values of successive periods in the time series of interest. This provides the DSA method with the implicit ability to produce in-sample as well as out-of-sample forecasts that reflect either growth or decay in the time series. In the DSA method fuzzy sets are defined on the universe of discourse, which serves as both the input and output domain for the method.
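The extension of the universe of discourse just described can be sketched directly from its definition: pad both ends of the training-set range by the average absolute first difference of the series. The function name and sample series below are mine.

```python
# A sketch of the DSA universe-of-discourse extension described above:
# the range of the training set is widened at both ends by the average
# absolute difference between successive observations. Names and the
# sample series are illustrative.

def universe_of_discourse(training_set):
    diffs = [abs(b - a) for a, b in zip(training_set, training_set[1:])]
    pad = sum(diffs) / len(diffs)
    return min(training_set) - pad, max(training_set) + pad

series = [100.0, 110.0, 105.0, 120.0]
lo, hi = universe_of_discourse(series)
# diffs are [10, 5, 15], so pad = 10 and the universe is [90, 130]
```

The padding is what lets forecasts fall outside the historical range, giving the method its stated ability to follow growth or decay in the series.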
While in most fuzzy methods sets receive linguistic labels, in the DSA method simple labels with subscripts suffice. Subscripts with low values are associated with low values of demand, while subscripts with larger values are associated with higher levels of demand. Therefore fuzzy sets have been labeled Ai (i = 1 to n), where n is the number of fuzzy sets selected by the modeler. The minimum number of fuzzy sets is two, as one fuzzy set produces a horizontal-line forecast. While it is possible to evaluate an infinite number of fuzzy sets, in this study the maximum number of sets evaluated is twenty. Beyond twenty sets, fuzzy set intervals converge and, as a result, fuzzy forecast values converge. In the DSA method, unlike earlier fuzzy methods, the number of fuzzy sets defined on the universe of discourse is considered to be a model parameter that can be manipulated to improve ex ante forecast accuracy. Previously it was believed that seven fuzzy sets were optimal (Song and Chissom, 1993). Also, while an observation's degree of membership in a fuzzy set can be established by judgment, in this study membership intervals were defined for each fuzzy set and each interval is associated with a specific degree of membership in the range [0, 1]. The number of intervals defined should be sufficient to differentiate the degree of membership of observations and may differ by model. This parameter is not considered to affect forecast accuracy, but defining it does ensure that the results of this study can be reproduced.

In the final step in the fuzzification module, its IF-THEN rule set is used to assign the historical values of the time series, that is, the values of the training set, to one of the fuzzy sets that were defined on the universe of discourse for the time series in question.
The rule is: IF an observation occurs within an interval of one and only one of the candidate fuzzy sets, THEN that observation is directly assigned to that fuzzy set, exclusively; OR, IF the observation occurs within an interval of more than one fuzzy set, THEN it is directly assigned to the fuzzy set in which it has maximum membership, exclusively. It is from this step in the fuzzifier, in which each historical observation of the training set is directly assigned to a set without reference to a linguistic label, that the DSA method derives its name. This simplified fuzzifier results in one input fuzzy set per historical observation being passed to the inference module. While it is possible to have more than one fuzzy set per observation passed to the inference module, one set was selected, as it is the simplest approach, and as such it represents the best starting point for developing the DSA method.

In the DSA inference module, its IF-THEN rule set is used to make inferences about the relationship between the input fuzzy sets. This results in the creation of fuzzy rules in which the fuzzy input sets from the fuzzifier module serve as the antecedent and consequent of those rules. The output from this module is a fuzzy rule set comprised of the individual fuzzy rules. For each antecedent and consequent pair of sets, the antecedent set is considered to be the current state of demand, while the consequent set is considered to be the future state of demand. Demand is a generic reference to the values of the time series. The pairs of sets cumulatively represent a fuzzy model of the time series. To identify the antecedent and consequent pairs, the periodicity, which is a measure of the seasonal component of the time series, must be known. The periodicity of the time series can be determined either by a visual inspection of a plot of the observations in the training set or from the calculation of seasonal indices.
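Both branches of the fuzzification rule above reduce to selecting the set of maximum membership. A minimal sketch follows, using a continuous triangular membership function for illustration (the study itself discretizes membership into intervals; the function names are assumptions):

```python
def tri_membership(x, fuzzy_set):
    """Triangular membership of x in a set given as (left, apex, right)."""
    left, apex, right = fuzzy_set
    if x <= left or x >= right:
        return 0.0
    if x <= apex:
        return (x - left) / (apex - left)
    return (right - x) / (right - apex)

def fuzzify(x, sets):
    """Directly assign x to the single fuzzy set in which it has
    maximum membership; `sets` maps labels such as 'A1' to triples."""
    return max(sets, key=lambda label: tri_membership(x, sets[label]))
```

With overlapping sets A1 = (0, 5, 10) and A2 = (5, 10, 15), an observation of 6 is assigned to A1, where its membership (0.8) exceeds its membership in A2 (0.2).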
The use of periodicity in a fuzzy extrapolative method is unique to the DSA method and has as its inspiration Winter's seasonal method. The rule is: IF the periodicity is one, so that no seasonality is present, THEN the rules are formed for each (t) and (t+1) pair of fuzzy sets, beginning with the earliest observations in the time series; OR, IF the periodicity is four or eight and the time series is quarterly, so that seasonality is present, THEN the rules are formed for each (t) and (t+4) or (t+8) pair of fuzzy sets, respectively, beginning with the earliest observations in the time series; OR, IF the periodicity is twelve or twenty-four and the time series is monthly, so that seasonality is present, THEN the rules are formed for each (t) and (t+12) or (t+24) pair of fuzzy sets, respectively, beginning with the earliest observations in the time series. The data is processed only once, making for a one-pass system that results in the creation of a fuzzy forecasting rule set that will serve as the input to the composition module. These rules capture the relation between the historical observations of the time series.

In the composition module, its IF-THEN rule creates composite rules that yield a (t+n) fuzzy forecast, in the form of fuzzy sets, for each fuzzy set that represents a fuzzified historical observation of the training set for the time series of interest. The rule is: IF for a fuzzified historical observation there are one or more fuzzy rules in which that fuzzy set is the current state, or antecedent, of the fuzzy rule, THEN the fuzzy forecast is the set of fuzzy sets that are the future state, or consequent, of the composite fuzzy rule; OR, IF for a fuzzified historical observation there are no fuzzy rules in which that fuzzy set is the current state, or antecedent, of the fuzzy rule, THEN the fuzzy forecast is the fuzzy set that is the current state, or antecedent, of the composite rule.
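Taken together, the inference and composition rules above amount to a one-pass construction of antecedent-consequent pairs at lag p (the periodicity), followed by a lookup with a fallback. A minimal sketch under those assumptions (names are illustrative):

```python
def build_rules(labels, periodicity):
    """One pass over the fuzzified training set: pair the set at t
    (antecedent, the current state) with the set at t + p
    (consequent, the future state)."""
    rules = {}
    for t in range(len(labels) - periodicity):
        rules.setdefault(labels[t], set()).add(labels[t + periodicity])
    return rules

def fuzzy_forecast(current, rules):
    """Composition: the forecast is the set of consequents of every
    rule whose antecedent is `current`; if no rule has `current` as
    its antecedent, the forecast is `current` itself."""
    return rules.get(current, {current})
```

For the fuzzified sequence A1, A2, A2, A3 with periodicity one, this yields the rules A1-A2, A2-A2 and A2-A3, so the fuzzy forecast following A2 is the pair of sets {A2, A3}.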
In the defuzzification module, its IF-THEN rule utilizes a center-of-sets defuzzifier to convert the fuzzy forecasts to scalar forecasts. The rule is: IF there is one and only one set in the fuzzy forecast, THEN the scalar forecast is the center point of that fuzzy set; OR, IF there are two or more fuzzy sets in the fuzzy forecast, THEN the scalar forecast is the average of the center points of all fuzzy sets in the fuzzy forecast.

The next section provides two examples of the DSA method on two time series that were used in this study. The first time series is N0006. This series has twenty observations of yearly microeconomic data collected for the periods 1975-1994. The first fourteen observations are the training data set, and the final six observations are the validation data set. The observations in the validation data set are the values that will be used to establish ex ante forecast accuracy. As such, six ex ante forecasts are required. Series N0006 contains a statistically significant trend; however, there is no indication of seasonality, as the periodicity of the series is one. The second time series is N0671. This series has forty-four observations of quarterly microeconomic data collected for the periods 1984-1994. The first thirty-six observations are the training data set and the last eight observations are the validation data set. The observations in the validation data set are the values that will be used to establish ex ante forecast accuracy. As such, eight ex ante forecasts are required. Series N0671 contains a statistically significant trend and there is an indication of seasonality, as the periodicity is four. Additional descriptive information on these series can be found in Appendix A.

3.2 DSA Example: Non-seasonal Series N0006

Step 1: Create Universe of Discourse. The minimum and maximum values for series N0006 are 1458.1 and 4095.0, respectively. The mean absolute change for successive periods is 245.0.
Therefore the universe of discourse is 1213.0-4339.6, and the range, or interval, of the universe of discourse is 3126.6.

Step 2: Select Membership Function and Fuzzy Set Parameters and Define Fuzzy Sets. A triangular membership function in which nine fuzzy sets are defined on the universe of discourse will be used to model the training set of series N0006. (In this study nine sets were shown to provide forecasts that minimized forecast error.) As the membership function is triangular, the maximum degree of membership occurs at the apex of the function. To allow for graded set membership, each set interval was extended by twenty-five percent beyond that of the crisp set values. The overlap between sets is the same amount over consecutive fuzzy sets for a given DSA model. To provide consistency to the process of assigning a degree of membership to a historical observation of the series, twenty membership intervals have been defined for each fuzzy set. Therefore, the value at the center point of the set has the maximum membership in each set, and as such has a membership value of 1.0. Ten membership intervals were defined on the subintervals to the left and right of the set midpoint for each fuzzy set. The membership intervals are ordered from the maximum membership of 1.0 at the apex of the function to the minimum membership value of 0.1 at the base of the function. For example, consider the fuzzy set (1213.1, 1430.2, 1647.3). The observation 1429.0 would be assigned a membership value of 1.0, while observations 1214 and 1646 would each be assigned a membership value of 0.1.
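The twenty-interval membership scheme just described can be sketched as follows (a hypothetical helper consistent with the worked values above; the exact interval arithmetic used in the study may differ):

```python
def interval_membership(x, fuzzy_set):
    """Discretized triangular membership: ten intervals on each side
    of the apex, graded from 1.0 at the apex down to 0.1 at the base."""
    left, apex, right = fuzzy_set
    if x < left or x > right:
        return 0.0
    half = (apex - left) if x <= apex else (right - apex)
    step = half / 10.0                          # width of one interval
    idx = min(int(abs(x - apex) / step), 9)     # 0 at apex .. 9 at base
    return round(1.0 - 0.1 * idx, 1)
```

Applied to the set (1213.1, 1430.2, 1647.3), this assigns 1.0 to the observation 1429.0 and 0.1 to the observations 1214 and 1646, matching the example above.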
The nine fuzzy sets with their endpoints and midpoints for series N0006 are as follows: A1=(1213.1, 1430.2, 1647.3); A2=(1549.6, 1766.7, 1983.8); A3=(1886.1, 2103.3, 2320.4); A4=(2222.7, 2439.8, 2656.9); A5=(2559.2, 2776.3, 2993.5); A6=(2895.8, 3112.9, 3330.0); A7=(3232.3, 3449.4, 3666.5); A8=(3568.8, 3786.0, 4003.1); A9=(3905.4, 4122.5, 4339.6).

Step 3: Fuzzify Observations in Training Set. If an observation has a degree of membership in one or more fuzzy sets, then that observation's fuzzy set assignment is the single set in which it has the highest degree of membership. For example, in series N0006 the observation for 1975 is 1458.05, which has a membership of 0.9 in set A1. Therefore it is assigned to fuzzy set A1 exclusively. The observation for 1976 is 1931.53, which has a membership of 0.4 in set A2 and 0.3 in set A3. Therefore it is assigned to fuzzy set A2 exclusively.

Step 4: Establish Fuzzy Rules. The periodicity of series N0006 is one. Therefore, fuzzy rules will be established with the sets for each (t) and (t+1) pair of time periods. For example, in series N0006 for the years 1975 and 1976 the fuzzy rule is: IF the demand for 1975 is A1, THEN that for 1976 is A2. The logical relationship A1-A2 is thereby established for the years 1975 and 1976. In this pair the current state is A1 and the future state is A2. The (n-1) pairs for the time series, when taken cumulatively, represent a fuzzy model of the training set for the time series.

Step 5: Produce Fuzzy Forecasts. Given series N0006, for each time period (t) produce a (t+1) forecast, beginning with the first observation in the training set. If the fuzzy set at time (t) is a current state in one or more fuzzy logical relationships, then the fuzzy forecast is the future state of all fuzzy logical relationships for which that fuzzy set is the current state. From the above example, the fuzzified observation for 1975, time period (t), is A1. A1 is the current state in the fuzzy logical relationship A1-A2 only.
Therefore A2, the future state in the A1-A2 fuzzy logical relationship, is the fuzzy forecast for 1976, which is the (t+1) time period. Otherwise, if the fuzzy set at time (t) is not a current state in one or more fuzzy logical relationships, then the fuzzy forecast for (t+1) is that fuzzy set. From the above example, if the fuzzified observation is A3 and the only fuzzy logical relationship is A1-A2, then the fuzzy forecast for A3 is A3. In step 5, fuzzy forecasts are produced for each year 1976-1989. The forecasts for 1976-1988 are in-sample forecasts, while the forecast for 1989 is an out-of-sample, or ex ante, forecast. For series N0006, six ex ante forecasts are required for the period 1989-1994. For this (t+1) model, the five additional ex ante forecasts for 1990-1994 are a replication of the first ex ante forecast produced for 1989.

Step 6: Produce Scalar Forecasts. If there is only one fuzzy set in the fuzzy forecast, then the scalar forecast is the center point of the fuzzy set that is the fuzzy forecast. If there are two or more fuzzy sets, then the scalar forecast is the average of the center points of the fuzzy sets that are the fuzzy forecast. Table 3.1 presents a summary of the output from the steps of the DSA method for series N0006.

Table 3.1 DSA Method Implementation For Series N0006

3.3 DSA Example: Seasonal Series N0671

Step 1: Create Universe of Discourse. The minimum and maximum values for series N0671 are 1264.9 and 4414.0, respectively. The average absolute change for successive periods is 505.0. Therefore the universe of discourse is 759.9-4919.0, and the range, or interval, of the universe of discourse is 4159.1.

Step 2: Select Membership Function and Fuzzy Set Parameters and Define Fuzzy Sets. A triangular membership function in which seventeen fuzzy sets are defined on the universe of discourse will be used to model the training set of series N0671.
(In this study seventeen sets were shown to provide forecasts that minimized forecast error.) To allow for graded set membership, each set interval was extended by twenty-five percent beyond that of the crisp set values. The overlap between sets is the same size over consecutive fuzzy sets for a given DSA model. To provide consistency to the process of assigning a degree of membership to a historical observation of the series, twenty membership intervals have been defined for each fuzzy set. As the membership function is triangular, the maximum degree of membership occurs at the apex of the function. Therefore, the value at the center point of the set has maximum membership in each set, and as such has a membership value of 1.0. Ten membership intervals were defined on the subintervals to the left and right of the set midpoint for each fuzzy set. The membership intervals are ordered from the maximum membership of 1.0 at the apex of the function to the minimum membership value of 0.1 at the base of the function. The seventeen fuzzy sets with their endpoints and midpoints for series N0671 are as follows: A1=(759.9, 912.8, 1065.8); A2=(1000.8, 1153.7, 1306.6); A3=(1241.6, 1394.5, 1547.4); A4=(1482.4, 1635.3, 1788.2); A5=(1723.3, 1876.2, 2029.1); A6=(1964.1, 2117.0, 2269.9); A7=(2204.9, 2357.8, 2510.7); A8=(2445.7, 2598.6, 2751.5); A9=(2686.6, 2839.5, 2992.4); A10=(2927.4, 3080.3, 3233.2); A11=(3168.2, 3321.1, 3474.0); A12=(3409.0, 3562.0, 3714.9); A13=(3649.9, 3802.8, 3955.7); A14=(3890.7, 4043.6, 4196.5); A15=(4131.5, 4284.4, 4437.3); A16=(4372.4, 4525.3, 4678.2); A17=(4613.2, 4766.1, 4919.0).

Step 3: Fuzzify Observations in Training Set.
If an observation has a degree of membership in one or more fuzzy sets, then that observation's fuzzy set assignment is the single set in which it has the highest degree of membership. For example, in series N0671 the observation for 1984-Q1 is 1264.9, which has a membership of 0.3 in set A2 and 0.2 in set A3. Therefore it is assigned to fuzzy set A2 exclusively. The observation for 1984-Q2 is 1386.3, which has a membership of 1.0 in set A3. Therefore it is assigned to fuzzy set A3 exclusively.

Step 4: Establish Fuzzy Logical Relationships. The periodicity of series N0671 is four. Therefore, fuzzy rules will be established between each (t) and (t+4) pair of time periods. For example, in series N0671 for the quarters 1984-Q1 and 1985-Q1 the fuzzy rule is: IF the demand for 1984-Q1 is A2, THEN that for 1985-Q1 is A3. The fuzzy logical relationship A2-A3 is thereby established for the quarters 1984-Q1 and 1985-Q1. In this pair the current state is A2 and the future state is A3. The (n-4) pairs, when taken cumulatively, represent a fuzzy model of the training set for the time series.

Step 5: Produce Fuzzy Forecasts. Given series N0671, for each time period (t) produce a (t+4) forecast, beginning with the first observation in the training set. If the fuzzy set at time (t) is a current state in one or more fuzzy logical relationships, then the fuzzy forecast is the future state of all fuzzy logical relationships for which that fuzzy set is the current state. From the above example, the fuzzified observation for 1984-Q1, time period (t), is A2. A2 is the current state in the fuzzy logical relationship A2-A3 only. Therefore A3, the future state in the A2-A3 fuzzy logical relationship, is the fuzzy forecast for 1985-Q1, which is the (t+4) time period. Otherwise, if the fuzzy set at time (t) is not a current state in one or more fuzzy logical relationships, then the fuzzy forecast for (t+4) is that fuzzy set.
From the above example, if the fuzzified observation is A4 and the only fuzzy logical relationship is A2-A3, then the fuzzy forecast for A4 is A4. In step 5, fuzzy forecasts are produced for each quarter 1985-Q1 through 1993-Q4. The forecasts for 1985-Q1 through 1992-Q4 are in-sample forecasts, while the forecasts for 1993-Q1 through 1993-Q4 are out-of-sample, or ex ante, forecasts. For series N0671, eight ex ante forecasts are required for the periods 1993-Q1 through 1994-Q4. For this (t+4) model, the four additional ex ante forecasts for 1994-Q1 through 1994-Q4 are a replication of the first four ex ante forecasts produced for 1993-Q1 through 1993-Q4.

Step 6: Produce Scalar Forecasts. If there is only one fuzzy set in the fuzzy forecast, then the scalar forecast is the center point of the fuzzy set that is the fuzzy forecast. If there are two or more fuzzy sets, then the scalar forecast is the average of the center points of the fuzzy sets that are the fuzzy forecast. Table 3.2 presents a summary of the output from the steps of the DSA method for series N0671.

Table 3.2 DSA Method Implementation For Series N0671

3.4 Summary

The Direct Set Assignment method has as its primary inspiration the extrapolative forecasting methods of Song and Chissom (1991) and Chen (1996). However, unlike those methods, which were designed to forecast only the level component of the time series, the DSA method was designed to forecast the trend and seasonal components of the time series as well as the level component. In addition, the DSA method was designed to forecast all three components without using externally calculated parameters to adjust the forecast produced by the model, as is the case with methods including Robust Trend, Damped Trend and Theta; nor was this accomplished through decomposition of the time series, as was the case with Holt's and Winter's methods. A new fuzzifier module was developed for this method exclusively for use in time series extrapolation.
The same is essentially true for the defuzzifier, in which a center-of-sets defuzzifier was adapted from its typical application in control systems. The inference module used by Song (1991), in which the antecedent and consequent of the fuzzy rules formed fuzzy logical pairs, was retained. The manner in which the antecedent and consequent for the rules were created, however, was modified to reflect the periodicity of the time series. Using the periodicity of the series in the forecasting model was adopted from Winter's decomposition method. The composition module in the DSA method was adopted from the Chen method (1996); it relied on Mamdani fuzzy logical relationships and was introduced by Chen to overcome several identified problems with the composition process used by Song and Chissom (1991). Two examples have been provided to illustrate the implementation of the four modules of the DSA method on two of the time series used in this study. The next chapter discusses the experimental design used in this study to validate the forecast accuracy of this new method.

CHAPTER 4
METHODOLOGY

This chapter describes the data and experimental design that were used to produce the measures of forecast accuracy required to evaluate the specific hypotheses discussed in section 2.12. The purpose of this chapter is to provide information that will allow for replication of this current study in part or in full. The chapter begins with a comprehensive description of the traditional and fuzzy extrapolative forecasting methods that were included, as required, in this current research to establish the relative accuracy of the DSA method under various forecasting conditions and for different data types. This includes a description of a combination of methods. A subsequent section reviews the seven measures of forecast accuracy that served as the basis for establishing the relative forecast accuracy of the methods compared in this study.
This information is followed by a description of the collection methods used to obtain, and the source and characteristics of, the time series in this study for which ex ante forecasts were produced. This is in turn followed by a description of the specific procedures that were followed in each of the three forecasting competitions that utilized, as required, the methods, accuracy measures and data described above. The methods, measures, data and procedures used in this study were adopted from the M3 Forecasting Competition held in 2000. In particular, nine subcategories of time series were selected that were defined by time interval and data type. These data were selected because they were the series for which statistically simple methods produced more accurate ex ante forecasts than did statistically sophisticated methods.

4.1 Forecasting Methods

The forecasting methods discussed in the following subsections are classified as statistically simple extrapolative forecasting methods. These methods rely on the use of weights, referred to as model parameters, to establish the relationship between the historical observations of the time series. This is with the exception of the DSA method, which uses fuzzy logical relationships to establish the relationship between the observations of the time series. Also, all of the methods in these subsections are linear methods with the exception of the DSA method, which is a non-linear method. All these methods rely on the established knowledge of the relationships between the observations in the training set to produce forecast values. Statistically sophisticated methods, in contrast, rely on more sophisticated statistical theory, including correlation and covariance, to model the series, and this includes their use in procedures for diagnostic testing and training on multiple data sets. Statistically sophisticated methods include automated neural network methods, the family of Box-Jenkins methods and expert systems.
4.1.1 Naïve 2

The simplest of all forecasting methods is the Naïve method, also referred to as a random walk. It is easy to understand and requires no calculation. The assumption in a naïve model is that whatever happened last period will happen in the next period. So the last observation in a time series becomes the forecast for the next period. A slightly more involved version of the Naïve method is the Naïve 2 method. In this method the last observation is adjusted for seasonality and that adjusted value is used as the forecast. The Naïve 2 method lags trends and does not forecast turning points in the time series. The Naïve 2 model reverts to the Naïve model when seasonality is not present in the time series.

4.1.2 Single Exponential Smoothing (SES)

Single Exponential Smoothing was introduced by Brown (1957) and is more complex than Naïve 2. It produces forecasts that are in principle a weighted average of the historical observations of the time series. The weight applied to older observations is exponentially decreased, hence the name exponential smoothing. Single refers to the fact that the model uses only one smoothing parameter. This weight can assume a value in the range 0.1-0.9. This method provides automatic adjustment for past forecast errors and typically does not perform well when a trend or seasonality is present. It is primarily for the extrapolation of the average component of a time series.

4.1.3 Holt's Linear Exponential Smoothing

Holt's Linear Exponential Smoothing, introduced by Holt (1959), is an extension of single exponential smoothing, and it provides for the extrapolation of a linear trend in the historical observations of the time series in addition to extrapolating the average component. This method uses two smoothing parameters, one of which is used to forecast the level component while the other is used to forecast the trend component. The weights each assume a value in the range 0.1-0.9.
This method is also referred to as double exponential smoothing for this reason.

4.1.4 Winter's Exponential Smoothing

Winter's Exponential Smoothing extends Holt's method by including an extra equation that is used to adjust the forecast to reflect the presence of seasonality in the historical observations of the time series. In this way Winter's method can forecast the average, trend and seasonal components of a time series. Thus this model uses three smoothing parameters. Each of the parameters can take a value in the range 0.1-0.9.

4.1.5 Damped-Trend Exponential Smoothing

Damped-Trend Exponential Smoothing was introduced by Gardner and McKenzie (1985). It is an extension of single exponential smoothing, as are the two previously described methods. This method is also used to extrapolate both the level and trend components of a time series. It uses two smoothing parameters: one is used to forecast the average component and the other is used to forecast the trend component. The average component parameter is typically in the range 0.1-0.9 and the trend parameter is in the range 0.7-1.0. In addition, there is a third parameter referred to as the trend modifier. This parameter is used to reduce or "damp" the amount of growth extrapolated into the future. The rationale for this feature lies in the economic principle of diminishing returns, which suggests that growth or decay, the trend, is rarely a sustained feature over the long term.

4.1.6 Robust Trend

The Robust Trend model was introduced by Grambsch and Stahel (1990), and is a non-parametric version of Holt's Linear Exponential Smoothing method described in subsection 4.1.3. As implied by the name, this method was designed to forecast both the average and trend components of a time series. In this method there is no weighting per se of the historical observations.
The average component is simply a Naïve forecast, and the trend component is based on a median estimate of the differenced data. The median estimate of the trend, unlike the mean estimate, is robust to the presence of outliers, and it is for this reason that the method was named Robust Trend.

4.1.7 Theta and Theta sm

The Theta method was introduced by Assimakopoulos and Nikolopoulos (2000) and is a statistically simple extrapolative method. This method competed in, and was one of the top performers in, the M3-Competition. The authors also introduced a derivative of the Theta method called Theta sm, or the Theta Seasonal method, which also competed in the M3-Competition. While the Theta Seasonal method did not rival the performance of the Theta method, it did perform well in some situations, and for that reason it has also been included in this study. The authors' original description of how to implement the Theta method required many pages of mathematical calculations. Hyndman and Billah (2003) examined the Theta model and found that it could be expressed more simply. In fact, they demonstrate that Theta is comparable to Single Exponential Smoothing with drift, that is, with an added trend component plus a constant, where the slope of the trend is half that of the fitted line through the original time series. In any case, Theta has been shown to be a very accurate forecasting method. The Theta sm implementation represents a modification to the implementation of the Theta method.

4.1.8 Direct Set Assignment (DSA)

The Direct Set Assignment method introduced in this study utilizes fuzzy logic. It has been hypothesized that fuzzy logic's capability to model real-life data may provide for a more accurate forecasting method than the traditional and fuzzy extrapolative methods currently available. The DSA method provides a non-linear mapping of the relationship between historical observations of the time series.
This method captures the relationship between historical observations in the form of relationships between fuzzy sets. Knowledge of these relationships is then used to produce a fuzzy forecast in the form of fuzzy sets. The fuzzy sets are then defuzzified to produce scalar forecasts. An implementation of the DSA method has been provided in sections 3.2 and 3.3.

4.1.9 Combination of Methods

A major finding of the original M-Competition, and one that has been reaffirmed in many subsequent studies and forecasting competitions, is that a combination of alternative methods will often produce forecasts that are more accurate than the forecasts produced by each of the alternative methods in their native form. It has been speculated that a combination of methods provides a more accurate forecast because each method being combined in some way offsets the forecast error of the other methods. For example, consider that method A and method B are being combined. Method A provides forecasts that continually overestimate demand, and method B provides forecasts that continually underestimate demand. If the forecasts of the two methods for a common time period are combined, the average of the two forecast values would be the forecast for that time period. In the example, the average would likely be more accurate than either of the two original forecasts. In this study the traditional methods as well as the DSA method have been combined, as required, for each of the forecasting competitions.

4.2 Forecast Accuracy Measures

A number of measures of forecast accuracy have been developed during the past two decades. These measures are used for selecting the forecasting method that will produce the most accurate forecasts for a given situation or data type. The research presented in section 2.4 raises concern about the utility of some of these accuracy measures for selecting the most accurate method.
The measures discussed in the following subsections represent those that have been found to be most appropriate for use in forecasting competitions. In each case the accuracy measure reflects the difference between the observed and forecast value for a given time period. This value is indicated in the calculations for these measures by the symbol (et). Each measure evaluates and combines the (et) values for a given method in such a way that it can be used to make a statement about the method's relative, and in some instances absolute, forecast accuracy.

4.2.1 Symmetric Mean Absolute Percentage Error (sMAPE)

Symmetric MAPE is the average sAPE value for the same forecast horizon, for a selection of time series, for a specific method, and is used to establish the relative accuracy of various methods over a large selection of time series. However, this measure can also be used to establish the accuracy of a forecast method over the forecast horizons of a single time series. Using the sMAPE avoids the problem caused by large errors when the observed values are close to zero, as well as the asymmetry in the absolute percentage error that occurs when the observed value is greater than the forecast value. Finally, sMAPE has the advantage of being easy to interpret, as it expresses forecast error as a percentage of the observed value. The sMAPE value is calculated as the average of the sAPE values for each forecast horizon of the selected series for each method, or for all the forecast horizons of a selected series. The sAPE value is calculated as the ratio of the absolute difference between the forecast and observed values, et, and the average of the sum of the forecast and observed values. This value is multiplied by one hundred to convert it to a percentage.
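As a sketch of the sAPE and sMAPE calculations just described (the function below is an illustrative assumption, not taken from the dissertation):

```python
def smape(observed, forecast):
    """Symmetric MAPE: the mean of the sAPE values, where each sAPE
    is the absolute error divided by the average of the forecast and
    observed values, expressed as a percentage."""
    sapes = [abs(f - a) / ((f + a) / 2) * 100
             for a, f in zip(observed, forecast)]
    return sum(sapes) / len(sapes)
```

For an observed value of 100 and a forecast of 110, the sAPE is 10 / 105 x 100, roughly 9.52 percent.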
4.2.2 Median Absolute Percentage Error (MedAPE)

Although it is not indicated in its name, MedAPE is the median sAPE value for the same forecast horizon, for a selection of time series, for a specific method, or for all forecast horizons for a selected series for a particular method. This measure has the advantage of not being influenced by extreme values, and for this reason is more robust than sMAPE. The measure is also reasonably easy to interpret. The MedAPE value is calculated as the median of the sAPE values for each forecast horizon across all selected series for each method, or for all forecast horizons of a particular series. The sAPE value is calculated as the ratio of the absolute difference between the forecast and observed values, et, to the average of the forecast and observed values. This value is multiplied by one hundred to convert it to a percentage.

4.2.3 Mean Absolute Deviation (MAD)

This accuracy measure reflects the average dispersion of the forecast errors and can be interpreted in a manner similar to a standard deviation. In short, this measure indicates how much more or less the forecast will be than the actual observation, in the units of the time series. In addition, this measure can be used to create confidence intervals for the forecast values. The MAD value is found by taking the average, across all selected series, of the absolute difference between the forecast and observed values, et, for each forecast horizon.

4.2.4 Median Relative Absolute Error (MedRAE)

This measure has been found to be particularly well suited for comparing the accuracy of various methods, as in the case of these forecasting competitions. The Relative Absolute Error is calculated as the absolute error et of a proposed model divided by the absolute error et of the Naïve 2 model for a given series. The median of these ratios across a number of time series is the Median Relative Absolute Error.
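The three measures above can be sketched directly from their definitions. This is an illustrative sketch, not the study's own code; the function names are assumptions, and the Naïve 2 errors are taken as given inputs rather than computed from a Naïve 2 model.

```python
import statistics

def sape(actual, forecast):
    """Symmetric absolute percentage error for one period, in percent."""
    return abs(actual - forecast) / ((actual + forecast) / 2.0) * 100.0

def med_ape(actuals, forecasts):
    """Median of the sAPE values across horizons or series (MedAPE)."""
    return statistics.median(sape(a, f) for a, f in zip(actuals, forecasts))

def mad(actuals, forecasts):
    """Mean absolute deviation of the forecast errors, in series units."""
    errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)

def med_rae(actuals, forecasts, naive2_forecasts):
    """Median ratio of a model's absolute errors to those of Naive 2 (MedRAE)."""
    ratios = [abs(a - f) / abs(a - n)
              for a, f, n in zip(actuals, forecasts, naive2_forecasts)]
    return statistics.median(ratios)
```

A MedRAE below 1.0 means the proposed model typically beats Naïve 2; for instance, a candidate error of 5 against a Naïve 2 error of 10 on the same observation yields a ratio of 0.5.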
This accuracy measure is reasonably easy to interpret and lends itself to summarizing across horizons and series, as it controls for scale and for outliers.

4.2.5 Percentage Better

The Percentage Better measure counts the percentage of times that a given method has a smaller forecast error than another method. In this study, for this accuracy measure, the method to which all of the other methods were compared is Theta. Theta was one of the best performing statistically simple methods in the M3 Competition held in 2000. In this measure each forecast is given equal weight, and for this reason it is a useful measure for forecasting competitions.

4.2.6 Average Ranking

In a forecasting competition the methods under evaluation can be given a rank that indicates their accuracy relative to that of the other methods in the competition. The method with the lowest rank is the method with the highest relative accuracy. The Average Rank can be based on a number of other measures of forecast accuracy, but in most cases is based on one of the numerous MAPE values. In this study it was based on the sAPE values, as was the case in the M3 Competition. To produce the average rank for a method, a rank is assigned for each forecast horizon for each of the selected time series for all methods under evaluation, based on the method's sAPE value for each forecast horizon for each series, or across the forecast horizons of a particular series. Then the ranks for each forecast horizon for each series are averaged for each method, or are averaged across the forecast horizons of a particular series. This measure, while typically used in the aggregate, can also be used to compare the forecast accuracy of various methods for a single series.

4.2.7 Benchmark

The absolute accuracy of the methods in a competition is not as important as how well each method performs relative to a benchmark method. The simplest benchmark, and the one used in this study, is Naïve 2.
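The Percentage Better and Average Ranking computations described above can be sketched as follows. This is an illustrative reading of the definitions with hypothetical inputs, not the procedure actually used in the study; in particular, the sAPE values are assumed to be precomputed.

```python
def percent_better(actuals, candidate, reference):
    """Percentage of forecasts where the candidate's absolute error is
    smaller than the reference method's (Theta in this study)."""
    wins = sum(1 for a, c, r in zip(actuals, candidate, reference)
               if abs(a - c) < abs(a - r))
    return 100.0 * wins / len(actuals)

def average_ranks(sape_by_method):
    """Average rank per method, given each method's sAPE values by horizon.

    sape_by_method maps a method name to its list of sAPE values; at each
    horizon the methods are ranked 1 (best) upward by sAPE, and the ranks
    are then averaged per method.
    """
    names = list(sape_by_method)
    horizons = len(next(iter(sape_by_method.values())))
    totals = {name: 0 for name in names}
    for h in range(horizons):
        ordered = sorted(names, key=lambda n: sape_by_method[n][h])
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return {name: totals[name] / horizons for name in names}
```

Note that two methods that alternate wins across horizons end up with the same average rank, which is why the consensus across several accuracy measures, rather than any single measure, is used later in the chapter to pick winners.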
The benchmark value is the difference in the sMAPE value between Naïve 2 and the alternative method. Positive values indicate that the alternative method produced more accurate forecasts for the selected time series than did Naïve 2. The alternative methods can then be evaluated in terms of how much better or worse they performed than Naïve 2. The use of sMAPE allows for the difference to be interpreted as a percentage.

4.3 Determination of Periodicity

The periodicity, or degree of seasonality, of a particular time series can be determined in several different ways. These include conducting a visual inspection of a plot of the historical observations of the time series itself, examining the autocorrelation function for the time series, or making an algebraic calculation of the seasonal indices for the time series. In practice, and in competitions, the latter approach is generally preferred. The most frequently used approach to calculating seasonal indices is the ratio-to-moving-average method. In this method the ratio of the actual observation to a centered moving average forecast is calculated. This ratio produces a de-trended value for each period, typically a month or quarter. The average of these de-trended values for similar periods (i.e., quarter one for all years covered by the time series) is the seasonal index for that period. If the value of each of the seasonal indices is 1.00, then seasonality is not present in the time series. If the calculated value of the indices is other than 1.00, then seasonality is present and the periodicity is the number of indices with a value other than 1.00.

4.4 Data

Fifteen time series were randomly selected, without replacement, from each of the nine subcategories of data used in the M3 Forecasting Competition held in 2000, for a total of one hundred thirty-five time series. These data were organized as yearly, quarterly and monthly categories of microeconomic, macroeconomic and industry data.
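The ratio-to-moving-average calculation from section 4.3 can be sketched as follows. This is a minimal sketch for an even period (4 for quarterly, 12 for monthly data), assuming the series starts at the first period of a cycle; it is not the exact procedure used in the study.

```python
def seasonal_indices(observations, period):
    """Ratio-to-moving-average seasonal indices.

    observations: historical values in time order, starting at period 0.
    period: observations per cycle (4 for quarterly, 12 for monthly).
    Returns one index per period within the cycle.
    """
    n = len(observations)
    half = period // 2
    ratios = [[] for _ in range(period)]
    for t in range(half, n - half):
        # Centered moving average for an even period: average two adjacent
        # period-length means so the result is aligned with observation t.
        first = sum(observations[t - half:t + half]) / period
        second = sum(observations[t - half + 1:t + half + 1]) / period
        cma = (first + second) / 2.0
        # De-trended value for this observation, bucketed by period of cycle.
        ratios[t % period].append(observations[t] / cma)
    return [sum(r) / len(r) if r else 1.0 for r in ratios]
```

For a series with no seasonal pattern, every ratio is 1.00 and all indices come out at 1.00, matching the rule stated above that indices of 1.00 indicate no seasonality.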
This created the nine subcategories referenced above. The time dimension refers to the time interval between successive observations. Makridakis and Hibon (2000) collected the original data for the M3 Competition on a quota basis. The three thousand three time series collected for the M3 Competition were real-world, heterogeneous, business time series of yearly, quarterly, monthly and other data, containing microeconomic, macroeconomic, industry, demographic and financial data. This created a total of twenty subcategories of time series. Makridakis and Hibon used a variety of means to collect the M3 Competition time series. These included written requests for data sent to companies, industry groups and government agencies, as well as the retrieval of data from the Internet and more traditional sources of business and economic data. The authors labeled the three thousand three time series by creating a unique ID number for each series ranging from N0001 to N3003. They also assigned to each of these time series a brief description of the data type (i.e., SALES, INVENTORIES, COST-OF-GOODS-SOLD, etc.) and included the time period from which the data was generated. Further, the authors partitioned each of the three thousand three time series into a calibration data set and a validation data set. The calibration data set was used to calibrate the forecasting methods, and the validation data set was used to evaluate the accuracy of the ex ante forecasts for each time series. The validation data set contains the last six observations for each of the yearly time series; the last eight observations for the quarterly time series; and the last eighteen observations for the monthly time series. The number of observations in the validation data set represents the forecast horizon, or the number of ex ante forecasts that had to be produced for that particular time series. The entire M3 Competition data set can be retrieved from: www.marketing.wharton.upenn.edu/forecast/data.html.
In the one hundred thirty-five time series selected for this study, the minimum series length for yearly series was twenty and the maximum length was forty-seven; for quarterly data the minimum length was twenty-four and the maximum length was seventy-two, with a mean and median length, respectively, of fifty-two and fifty-four; and for monthly data the minimum length was sixty-nine and the maximum length was one hundred forty-four, with a mean and median length, respectively, of one hundred twenty-two and one hundred thirty-four. These values are consistent with the values reported by Makridakis and Hibon (2000) for all of the time series in the same nine subcategories of data. Table 1 in Appendix A provides complete descriptive statistics, including periodicity, for all of the one hundred thirty-five time series evaluated in this study. To ensure that each of the methods used in the competitions was fairly evaluated, all forecasts for the traditional methods used in the study were obtained from the experts used in the M3 Forecasting Competition via Michelle Hibon, Senior Research Fellow at the INSEAD Business School and coauthor of the M3 Competition. The exception is the DSA method, whose forecasts were produced by the author of this current study.

4.5 The Forecasting Competitions

Three forecasting competitions were conducted in this study to evaluate the hypotheses discussed in section 2.12. These competitions adopted the methodology used in the M3 Forecasting Competition for a newly introduced method, Theta. In this current study the competitions were used to establish the relative forecast accuracy of the extrapolative forecasting methods discussed in section 4.1 of this chapter, as required, as were the accuracy measures described in section 4.2, for the three competitions. The time series used in these competitions are those discussed in section 4.4 of this chapter, also used as required for each of the three competitions.
The measures of forecast accuracy were calculated for the forecast horizons of individual series, for subcategories of series, for categories of series and for all of the series used in a competition, as required. This is referred to as the level of series aggregation. In practice, having knowledge of which method provides the most accurate forecasts for a particular data type or time interval is quite useful, as is the knowledge that a particular method provides the most accurate forecasts over a broad cross-section of time intervals.

4.5.1 Competition #1 Procedures

In Competition #1, the relative forecast accuracy of nineteen models of the DSA method was established. The nineteen DSA models evaluated in this competition differed in the value of their fuzzy set parameter. The parameter values examined were the whole numbers from two through twenty. The fuzzy set parameter is the number of fuzzy sets used to model the training set for the time series. These models were labeled (FS2) through (FS20). Six, eight and eighteen ex ante forecasts were produced for each of the one hundred thirty-five yearly, quarterly and monthly time series, respectively. These forecasts were produced in several automated Microsoft Excel workbooks. The ex ante forecasts were then compared to the observed values in the validation data set for each of the time series. Two measures of forecast accuracy, sMAPE and Average Ranking, were calculated for each method, for each time series across all of its forecast horizons, at the series level of aggregation. All seven accuracy measures, however, were calculated for the time series in each of the nine subcategories and the time series in each of the three categories. The most accurate model at the series level was the one that minimized both sMAPE and Average Ranking. A consensus method was used to identify the most accurate model at the subcategory and category levels of aggregation.
This approach allowed for a statement to be made as to the relative accuracy of a particular method for an individual series, a subcategory and a category. In addition to testing hypotheses H01 and H02, discussed in section 2.12, the goal of this competition was to identify the DSA models that produced the most accurate forecasts for individual series, for subcategories of series and for categories of series, so that the forecasts for those models could be used in Competitions #2 and #3. As such, the forecasts for the models of best fit by series, subcategory and category were combined and relabeled DSA-A, DSA-B and DSA-C, respectively.

4.5.2 Competition #2 Procedures

In Competition #2, the relative forecast accuracy was established for the eight traditional extrapolative methods; a combination of the SES, Holt's and Dampen Trend methods, designated S-H-D; DSA-A, DSA-B and DSA-C; and combinations of the three DSA methods with Winter's Exponential Smoothing, designated DSAA-W, DSAB-W and DSAC-W. Winter's method was selected as it was the most complete of the exponential smoothing methods. Six, eight and eighteen ex ante forecasts were obtained from experts for each of the one hundred thirty-five yearly, quarterly and monthly time series, respectively. These ex ante forecasts were then compared to the observed values in the validation data set for each of the time series. The seven measures of forecast accuracy discussed in section 4.2 were calculated for each method, for the time series in each of the nine subcategories, the time series in each of the three categories and all one hundred thirty-five time series used in this competition. In this competition the three methods with the highest observed accuracy by subcategory, category and overall were selected based on a consensus among the seven accuracy measures.
This approach allowed for a statement to be made as to the relative accuracy of a particular method for a subcategory of series, a category of series and all of the time series in this competition. Competition #2 was conducted specifically for the purpose of testing hypotheses H03, H04, H05 and H06.

4.5.3 Competition #3 Procedures

In Competition #3, the relative forecast accuracy was established for the eight traditional extrapolative methods; a combination of the SES, Holt's and Dampen Trend methods, designated S-H-D; DSA-A, DSA-B and DSA-C; and combinations of the three DSA methods with Winter's Exponential Smoothing, designated DSAA-W, DSAB-W and DSAC-W. Winter's method was selected as it was the most complete of the exponential smoothing methods. Six, eight and eighteen ex ante forecasts were obtained from experts for each of forty-five time series, each containing a statistically significant trend. These forty-five series were comprised of fifteen series randomly selected without replacement from each of the three categories of yearly, quarterly and monthly time series. These ex ante forecasts were then compared to the observed values in the validation data set for each of the time series. The seven measures of forecast accuracy discussed in section 4.2 were calculated for each method, for the time series in each of the three categories and all forty-five time series used in this competition. In this competition the three methods with the highest observed accuracy by category and overall were selected based on a consensus among the seven accuracy measures. This approach allowed for a statement to be made as to the relative accuracy of a particular method for a category of series and for all of the time series in this competition. Competition #3 was conducted specifically for the purpose of testing hypothesis H07.

4.6
Concerns About Forecasting Competitions

A longstanding concern when using the forecasting competition design is that the accuracy measures are averaged across series and over different forecast horizons. The effect, potentially, is to obscure the top performance of a model on specific series or specific forecast horizons, when in fact those series or forecast horizons could be of primary interest to a practitioner. This problem is exacerbated as the number of series and forecasts produced increases. Further, by not bringing to the attention of forecasters a model's top performance on some limited number of series or forecast horizons, the inevitable question as to why the model performed so well in these situations is never asked. So, declaring a particular model the winner of a competition has some real limitations to its importance, and this is particularly true when declaring a model the overall winner of a competition. Another concern is with the nature of the data used in the M-competitions and the other accuracy studies that rely on these data. Tashman (2001) suggests that time series are multi-attributed and that a term such as "microeconomic data," as used in the M-competitions, is a catch-all for series that actually vary greatly with respect to a number of features, including: company, brand, item, product, financial, marketing, operations, country and region. In addition, these data vary by seasonality, level of volatility, presence of outliers, and whether or not a trend is present. The problem is that the time series in the data types used in the M-competitions (microeconomic, macroeconomic, industry, demographic and financial) are actually quite heterogeneous within the data type subcategories and quite homogeneous with respect to the series in the other subcategories.
This makes conclusions about a method's performance on a subcategory or category less meaningful, as these levels of aggregation do not represent the way in which time series are encountered in the real world. Further, in the M-competitions held since 1982 the time origins of the data have been predominately yearly, quarterly and monthly, whereas in business a great deal of data is captured on an hourly, daily or weekly basis. Data with these time origins have not been used in the M-competitions, and it is unlikely that the results on yearly, quarterly or monthly data can be generalized to these different time origins. In section 2.5 of this current study a discussion was provided of the reasons why difference testing is not used to establish forecast accuracy in forecasting competitions. While the reasons enumerated for not using difference testing are justified, there is far less justification for not using prediction intervals in place of the current point estimates. Prediction intervals are constructed from the standard deviation, which in forecasting is the accuracy measure Mean Absolute Deviation (MAD). As such, the debate from section 2.5 on which accuracy measure to use becomes moot. Further, as to the use of the Percentage Better accuracy measure, there is no reason why this measure could not be used in conjunction with prediction intervals. Finally, while the argument provided for methods being reasonable alternatives still holds, the fact remains that the use of prediction intervals would help resolve, with greater justification, issues of relative forecast accuracy. Another topic of concern is the absence of domain knowledge about the series used in the competitions. Authors including Armstrong (2001) suggest that domain knowledge about the series used in a competition should be provided to participants.
This would ensure that any methods that could benefit from domain knowledge would be fairly evaluated in the study, and this approach better represents how practitioners produce forecasts. As such, the caveat to a particular model being declared the winner at some level of series aggregation is that other models may have performed as well if domain knowledge had been made available.

4.7 Summary

Three forecasting competitions were conducted to investigate the relative forecast accuracy of the Direct Set Assignment (DSA) method. This is a newly developed fuzzy logic based extrapolative forecasting method that was investigated due to its potential to provide more accurate ex ante forecasts than currently available statistically simple extrapolative forecasting methods. The data, procedures and alternative forecasting methods used in these competitions, as required, were adopted from the M3 Forecasting Competition held in 2000. The alternative methods included eight traditional methods as well as a combination of three traditional methods. These competitions were conducted to answer several specific questions concerning the impact of the fuzzy set parameter on the relative forecast accuracy of the DSA method, as well as questions about the relative accuracy of the DSA method on different types of data, including series in which a trend was present. An additional question was what the effect on relative accuracy would be of combining the most accurate fuzzy method with a selected traditional method. In the next chapter the results of the three competitions are presented.

CHAPTER 5
RESULTS

This chapter reports the relative forecast accuracy of the Direct Set Assignment (DSA) forecasting method introduced in Chapter 3, and that of the alternative forecasting methods, as analyzed in the three forecasting competitions outlined in section 4.5.
The one hundred thirty-five time series that were randomly drawn from the M3 Competition data set were evaluated in their entirety in Competitions #1 and #2, resulting in the analysis of twenty-seven thousand three hundred sixty forecasts and twenty-one thousand six hundred forecasts, respectively. In Competition #3 a sample of forty-five series, comprised of fifteen series randomly drawn from each of the three categories of time series in the sample drawn for this study, was evaluated, resulting in the analysis of four thousand fifty forecasts. The results generated from the three competitions are too numerous to present in their entirety in this chapter. Therefore, tables containing the accuracy measures for the methods evaluated in Competitions #1, #2 and #3 can be found in Appendices B, C and D, respectively. A description of the information provided in each appendix can be found in the sections describing the results for each competition, accompanied by the appropriate summary tables. The tables and commentary in the remainder of this chapter report the relative forecast accuracy of the best performing methods in each competition for the various levels of series aggregation.

5.1 Competition #1 Results

In Competition #1 the goal was to assess the impact on the forecast accuracy of the DSA method of varying the fuzzy set parameter in the model from two sets (FS2) to twenty sets (FS20), and to determine if there is an optimal or universal number of fuzzy sets. Relative forecast accuracy was assessed at three levels of aggregation: individual series, subcategory and category. The results for Competition #1 indicate that there is no universal or optimal fuzzy set parameter for the DSA method. Rather, the set parameter that will yield the most accurate forecasts is specific to each individual series, each subcategory of series and each category of series.
This finding suggests that the set parameter in the DSA method functions in a manner similar to the parameter weights used in exponential smoothing methods.

5.1.1 Individual Series Competition #1

The values of the sMAPE and Average Ranking accuracy measures for the one hundred thirty-five series, for the various forecast horizons, for the methods evaluated in this competition have been reported, by category, in Table B.1 – Table B.6 in Appendix B. Table 5.1 – Table 5.3 in this chapter report the DSA model that was selected as the model providing the highest observed accuracy for each individual series across its own forecast horizons. The (FS#) designation in these tables indicates the value of the fuzzy set parameter for that DSA model. The forecasts provided by the DSA models with the highest observed accuracy by series, reported in Table 5.1 – Table 5.3, were combined and relabeled the DSA-A model, which was evaluated in Competition #2.

Table 5.1 Model Which Gives Best Results by Series – Yearly Data
Table 5.2 Model Which Gives Best Results by Series – Quarterly Data
Table 5.3 Model Which Gives Best Results by Series – Monthly Data

5.1.2 Subcategory Competition #1

The values of the seven accuracy measures for the nine subcategories of time series, for the various forecast horizons, for the methods evaluated in this competition have been reported in Table B.7 – Table B.69 in Appendix B. Table 5.4 – Table 5.12 in this chapter report, in order, the three DSA models with the highest observed accuracy for each of the seven measures of forecast accuracy, for each of the nine subcategories of series respectively, across the various forecast horizons. Table 5.13 in this chapter reports the DSA model that was selected, based on the average across all forecast horizons, as the model providing the highest observed accuracy for each of the nine subcategories for each of the seven accuracy measures.
Table 5.13A reports the DSA models that were selected on a consensus basis for each subcategory, with sMAPE breaking ties, from Table 5.13, as the models with the highest observed accuracy by subcategory. The forecasts from these models were combined and labeled the DSA-B model. The forecast accuracy of the DSA-B model was established in Competition #2.

Table 5.4 Best Models For Yearly Micro Series
Table 5.5 Best Models For Yearly Industry Series
Table 5.6 Best Models For Yearly Macro Series
Table 5.7 Best Models For Quarterly Micro Series
Table 5.8 Best Models For Quarterly Industry Series
Table 5.9 Best Models For Quarterly Macro Series
Table 5.10 Best Models For Monthly Micro Series
Table 5.11 Best Models For Monthly Industry Series
Table 5.12 Best Models For Monthly Macro Series
Table 5.13 Models Which Give Best Results – Subcategory

5.1.3 Category Competition #1

The values of the seven accuracy measures for the three categories, for the various forecast horizons, for the methods evaluated in this competition have been reported in Table B.70 – Table B.90 in Appendix B. Table 5.14 – Table 5.16 in this chapter report, in order, the three DSA models with the highest observed accuracy, for each of the seven measures of forecast accuracy, for each of the three categories of series respectively, across the various forecast horizons. Table 5.17 in this chapter reports the DSA model that was selected as the model providing the highest observed accuracy for each of the three categories, based on the average across all identical forecast horizons, for each of the seven accuracy measures. In Table 5.17 it can be seen that, generally, the ranking of the models varies according to the error measure being used. Table 5.17A reports the DSA models that were selected on a consensus basis by category, with sMAPE breaking ties, from Table 5.17, as the models with the highest observed accuracy by category.
The forecasts from these models were combined and labeled the DSA-C model. The forecast accuracy of the DSA-C model was established in Competition #2.

Table 5.14 Best Models For Yearly All Data
Table 5.15 Best Models For Quarterly Series
Table 5.16 Best Models For Monthly Series
Table 5.17 Models Which Give Best Results – Category

5.2 Competition #2 Results

In Competition #2 the goal was to establish the relative forecast accuracy of several models of the DSA method that had proven to be the most accurate at the series, subcategory and category levels of aggregation in Competition #1; eight traditional methods; a combination of traditional methods; and combinations of the DSA methods with Winter's Exponential Smoothing. Relative forecast accuracy was assessed at three levels of aggregation for all one hundred thirty-five series used in this competition: the subcategory, category and all series levels of aggregation. In Competition #2 the DSA-A and DSAA-W models provided more accurate forecasts at the subcategory, category and all series levels of aggregation than did the DSA-B, DSA-C, DSAB-W or DSAC-W models. In the subcategory competition the DSA-A model provided the highest observed accuracy in five subcategories: Yearly-Micro, Yearly-Industry, Quarterly-Micro, Quarterly-Industry and Quarterly-Macro data. The DSA-A model was also a top three performer on observed accuracy in the Monthly-Micro subcategory. The DSAA-W model provided the highest observed accuracy in two subcategories: Monthly-Micro and Monthly-Industry data. The DSAA-W model was also a top three performer in five subcategories: Yearly-Industry, Quarterly-Micro, Quarterly-Industry, Quarterly-Macro and Monthly-Macro data. The DSAB-W and DSAC-W models were top three performers in the Quarterly-Industry and Quarterly-Macro subcategories.
Only in the Yearly-Macro subcategory were none of the DSA method derivatives among the three models with the highest observed accuracy. In the category competition the DSA-A model provided the highest observed accuracy in the Quarterly Category and was a top three performer in the Yearly and Monthly Categories. The DSAA-W model provided the highest observed accuracy in the Monthly Category and was a top three performer in the Yearly and Quarterly Categories. The DSAC-W model was a top three performer in the Quarterly Category. In the All Series competition the DSA-A model had the highest observed accuracy of all models, and the DSAA-W model was a top three performer. In addition to the DSA methods, other top performing methods in the subcategory competition included Theta and Robust Trend. In the subcategory competition Robust Trend provided the highest observed accuracy in two subcategories, Yearly-Macro and Monthly-Macro data, and was a top three performer in two additional subcategories, Yearly-Micro and Quarterly-Macro. Theta was a top three performer in three subcategories: Yearly-Industry, Yearly-Macro and Monthly-Industry. In the category competition Robust Trend provided forecasts with the highest observed accuracy in the Yearly Category. Theta was a top three performer in the Monthly Category. In the All Series competition Theta was a top three performer. It is worth noting that Robust Trend and Theta were two of the top performing methods in the M3 Competition.

5.2.1 Subcategory Competition #2

The values of the seven accuracy measures for the nine subcategories, for the various forecast horizons, for the methods evaluated in this competition have been reported in Table C.1 – Table C.63 in Appendix C.
Table 5.18 – Table 5.26 in this chapter report, in order, the three models with the highest observed accuracy for each of the seven measures of forecast accuracy, for each of the nine subcategories of series respectively, across the various forecast horizons. Table 5.27 reports, for the convenience of the reader, the models with the highest observed accuracy for each subcategory for each accuracy measure, based on the average across all forecast horizons. Table 5.27A reports the model that was selected on a consensus basis, with sMAPE breaking ties, by subcategory from Table 5.27, as the model with the highest observed accuracy by subcategory. In Table 5.27 it can be seen that the ranking of the models varies according to the accuracy measure being used.

Table 5.18 Best Models For Yearly Micro Series
Table 5.19 Best Models For Yearly Industry Series
Table 5.20 Best Models For Yearly Macro Series
Table 5.21 Best Models For Quarterly Micro Series
Table 5.22 Best Models For Quarterly Industry Series
Table 5.23 Best Models For Quarterly Macro Series
Table 5.24 Best Models For Monthly Micro Series
Table 5.25 Best Models For Monthly Industry Series
Table 5.26 Best Models For Monthly Macro Series
Table 5.27 Models Which Give Best Results – Subcategory

5.2.2 Category Competition #2

The values of the seven accuracy measures for the three categories, for the various forecast horizons, for the methods evaluated in this competition have been reported in Table C.64 – Table C.84 in Appendix C. Table 5.28 – Table 5.30 in this chapter report, in order, the three models with the highest observed accuracy for each of the seven measures of forecast accuracy, for each of the three categories of series respectively, across the various forecast horizons. Table 5.31 reports, for the convenience of the reader, the model with the highest observed accuracy, based on the average across all forecast horizons, for each category for each of the seven accuracy measures.
Table 5.31A reports the model that was selected on a consensus basis, with sMAPE breaking ties, by category from Table 5.31, as the model with the highest observed accuracy by category. In Table 5.31 it can be seen that the ranking of the models varies according to the accuracy measure being used.

Table 5.28 Best Models For Yearly All Data
Table 5.29 Best Models For Quarterly All Data
Table 5.30 Best Models For Monthly All Data
Table 5.31 Models which give the best results – Category

5.2.3 All Series Competition #2

The values of the seven accuracy measures for all one hundred thirty-five series in competition #2, for the various forecast horizons, for the methods evaluated in this competition, have been reported in Table C.85 – Table C.91. Table 5.32 in this Chapter reports, in order, the three models with the highest observed accuracy for each of the seven measures of forecast accuracy, for all one hundred thirty-five series in competition #2, across the various forecast horizons. Table 5.33 reports, for the convenience of the reader, the model with the highest observed accuracy, based on the average across all forecast horizons, for all one hundred thirty-five series in competition #2, for each of the seven accuracy measures. Table 5.33A reports the model that was selected on a consensus basis for all series, with sMAPE breaking ties, from Table 5.33, as the model with the highest observed accuracy overall. In Table 5.33 it can be seen that the ranking of the models varies according to the accuracy measure being used. Table 5.34 reports the sMAPE values for the combination methods evaluated in competition #2, for the models in both combined and native form. These values indicate that, with the exception of the sMAPE values for the DSAA model for the averages of the 1-4, 1-6 and 1-8 forecast horizons, the combined methods perform at least as well as the methods do in their native form.
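The combined methods compared in Table 5.34 (S-H-D and the DSA–Winters combinations) can be read as horizon-by-horizon combinations of the component methods' forecasts. A minimal sketch, assuming an equal-weight average across components (the weighting scheme and the function name are illustrative assumptions, not the dissertation's specification):

```python
def combine_forecasts(*component_forecasts):
    """Equal-weight combination: average the component methods'
    forecasts horizon by horizon (equal weights assumed for illustration)."""
    horizons = len(component_forecasts[0])
    assert all(len(f) == horizons for f in component_forecasts)
    return [sum(f[h] for f in component_forecasts) / len(component_forecasts)
            for h in range(horizons)]
```

For example, combining three two-horizon forecasts [10, 20], [20, 40] and [30, 60] under this scheme yields the averaged forecast [20.0, 40.0].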
Table 5.32 Best Models For Overall Data
Table 5.33 Model which gives best results – Overall
Table 5.34 Symmetric MAPE of Single, Holt, Dampen, DSAA and their combinations

5.3 Competition #3 Results

In Competition #3 the goal was to establish the relative accuracy of several models of the DSA method that had proven to be the most accurate at the series, subcategory and category levels of aggregation in Competition #1; eight traditional methods; a combination of traditional methods; and a combination of the DSA methods and Winters' Exponential Smoothing. Relative forecast accuracy was assessed at two levels of aggregation, category and all series, for forty-five series containing a statistically significant trend. In competition #3 the DSAA model provided the highest observed accuracy in the Quarterly Category and was one of the three methods with the highest observed accuracy in the Monthly Category. The DSAA-W model provided the highest observed accuracy in the Yearly Category and was one of the three methods with the highest observed accuracy in the Quarterly and Monthly Categories. The DSAC-W model was a top three performer in the Quarterly Category. In the All Series competition, DSAA was one of the three models with the highest observed accuracy for all forty-five time series, and DSAA-W provided the highest observed accuracy in competition #3. The other models that performed well in the category competition were Theta and Holt's Exponential Smoothing. Theta provided the highest observed accuracy in the Monthly Category and was a top three performer in the Yearly Category. Holt's Exponential Smoothing was a top three performer in the Yearly Category. In the All Series competition, Theta was also a top three performer.
5.3.1 Category Competition #3

The values of the seven accuracy measures for the three categories, for the various forecast horizons, for the methods evaluated in this competition, have been reported in Table D.1 – Table D.63 in Appendix D. Table 5.35 – Table 5.37 in this Chapter report, in order, the three models with the highest observed accuracy for each of the seven measures of forecast accuracy, for each of the three categories of series respectively, across the various forecast horizons. Table 5.38 reports, for the convenience of the reader, the model with the highest observed accuracy, based on the average across all forecast horizons, for each category for each of the seven accuracy measures. Table 5.38A reports the model that was selected on a consensus basis, with sMAPE breaking ties, from Table 5.38, as the model with the highest observed accuracy by category. In Table 5.38 it can be seen that the ranking of the models varies according to the accuracy measure being used.

Table 5.35 Best Models For Yearly – Trend Series
Table 5.36 Best Models For Quarterly – Trend Series
Table 5.37 Best Models For Monthly – Trend Series
Table 5.38 Models which give the best results – Category

5.3.2 All Series Competition #3

The values of the seven accuracy measures for all forty-five series used in competition #3, for the various forecast horizons, have been reported in Table D.64 – Table D.90 in Appendix D. Table 5.39 in this Chapter reports, in order, the three models with the highest observed accuracy for each of the seven measures of forecast accuracy, for all forty-five series used in Competition #3, across the various forecast horizons. Table 5.40 reports, for the convenience of the reader, the model with the highest observed accuracy, based on the average across all forecast horizons, for all forty-five time series used in Competition #3, for each of the seven accuracy measures.
Table 5.40A reports the model that was selected on a consensus basis, with sMAPE breaking ties, from Table 5.40, as the model with the highest observed accuracy for all forty-five series overall. In Table 5.40 it can be seen that the ranking of the models does not vary according to the accuracy measure being used.

Table 5.39 Best Models For Overall – Trend Series
Table 5.40 Model which gives the best results overall – Trend Series

5.4 Summary

In this study three forecasting competitions were conducted for the purpose of testing eight specific null hypotheses regarding the relative forecast accuracy of the Direct Set Assignment forecasting method. Several important observations have been made. Firstly, there does not appear to be a universal or single value of the fuzzy set parameter that will yield the most accurate forecasts in all situations and for all types of data. In fact, each of the fuzzy set parameter values provided the most accurate forecast for several different series. Further, in this study it was observed that the parameter value is series specific, as opposed to data type specific. It was also found that the DSAA and DSAA-W models outperformed the DSAB, DSAC, DSAB-W and DSAC-W models, albeit in two instances the DSAB-W and DSAC-W models were ranked among the top three performing models. However, they were not ranked higher than the alternative DSAA and DSAA-W models and did not perform well overall. Secondly, the DSAA and DSAA-W models performed remarkably well relative to the other methods, across subcategories and categories of time series that differed by data type and time interval. In addition, these methods performed extremely well on monthly and quarterly data in which both a trend and a seasonal component were present, and on data where only a trend was present. Further, the DSA method produced accurate forecasts for series with short as well as long forecast horizons, and for series with short as well as long training sets.
Lastly, there were two sets of combination models: one in which only traditional methods were combined, and one in which traditional and fuzzy methods were combined. The traditional combination, S-H-D, performed at least as well as each of the traditional models did in their native form in fifteen of eighteen comparisons. The three instances where S-H-D did not outperform the traditional model were in the comparisons with Dampen on the averages of forecast horizons 1-4, 1-6 and 1-8. The DSAA-W combination performed at least as well as the DSAA method in one of six comparisons and at least as well as the Winters' method in six of six comparisons. The DSAB-W combination performed at least as well as the DSAB and Winters' models in twelve of twelve comparisons. The DSAC-W method also performed at least as well as the DSAC and Winters' methods in twelve of twelve comparisons. In the next chapter the results of the three forecasting competitions will be used to evaluate the eight research hypotheses outlined in section 2.12.

CHAPTER 6
DISCUSSION

The purpose of this study was to introduce and validate the ex ante forecast accuracy of the Direct Set Assignment extrapolative forecasting method. This method was developed in response to a reported need in the forecasting literature for a statistically simple extrapolative forecasting method that would be robust to the fluctuations that exist in real-life business and economic data. Extrapolative forecasting methods are one of a large group of quantitative forecasting methods that produce a future quantitative value of a variable of interest by extrapolating the historical values of that variable. To use these methods the historical values of the quantitative variable must be organized as a time series.
The DSA method differs from traditional extrapolative forecasting methods in that it uses fuzzy logic, or more specifically fuzzy sets, to model the relationships between the historical observations of a time series. Fuzzy logic is a data processing technology that has earned a reputation for being robust to real-life data in a variety of applications, including systems control and signal processing. It has been hypothesized that the DSA method will provide more accurate forecasts than traditional, statistically simple methods due to the DSA method's use of fuzzy logic. To validate the relative forecast accuracy of the DSA method, its performance has been established through the use of a forecasting competition methodology. Specifically, three competitions have been conducted that relied on the standard methods and procedures used in the M3 International Forecasting Competition held in 2000. The next section of this chapter discusses the hypotheses outlined in section 2.12 in the context of the results of the three forecasting competitions. For convenience the hypotheses have been restated in this Chapter. In subsequent sections the evaluation of the hypotheses leads to a discussion of the theoretical implications of this research. The Chapter closes with a discussion of the future directions of the research on the Direct Set Assignment forecasting method and some concluding remarks.

6.1 Evaluation of Research Hypotheses

H01: The ex ante forecast accuracy of the DSA method will not change in response to a change in the number of fuzzy sets, all other model parameters held constant.

In Competition #1 all parameters of the DSA method were held constant with the exception of the fuzzy set parameter, which was allowed to vary between two and twenty sets. This created DSA models (FS2) through (FS20).
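To give a flavor of the fuzzy set machinery underlying the (FS) models: each observation can be assigned degrees of membership in a small number of fuzzy sets spanning the range of the series. The evenly spaced triangular membership functions below are an illustrative assumption, not the dissertation's exact specification of the DSA fuzzifier:

```python
def triangular_memberships(x, lo, hi, n_sets):
    """Degrees of membership of x in n_sets evenly spaced triangular
    fuzzy sets spanning [lo, hi]. Returns a list of values in [0, 1]."""
    if n_sets < 2:
        raise ValueError("need at least two fuzzy sets")
    width = (hi - lo) / (n_sets - 1)  # spacing between adjacent set centers
    degrees = []
    for i in range(n_sets):
        center = lo + i * width
        # membership falls linearly from 1 at the center to 0 one width away
        degrees.append(max(0.0, 1.0 - abs(x - center) / width))
    return degrees
```

With three sets over [0, 10], an observation of 5.0 belongs fully to the middle set, while 2.5 belongs half to the first set and half to the second; raising `n_sets` (FS2 through FS20) simply makes this partition finer.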
The (FS) model that provided the highest observed accuracy for each series, subcategory and category was identified for the purpose of creating the composite models DSAA, DSAB and DSAC. Figure 6.1 – Figure 6.13 in this Chapter present the frequency with which the various fuzzy set parameter values produced the most accurate forecasts for a given series. These frequencies are based on the data from Table 5.1 – Table 5.3. The data in these tables have been aggregated for all series used in the competition, for each of the nine subcategories and for each of the three categories. The analysis of the bar charts for individual series, as well as for subcategories and categories, suggests that the set parameter that produces the most accurate forecast is series specific, very much in the way that the value of the parameter weight in single exponential smoothing is specific to a particular series. For example, in Figure 6.1 it can be seen that to produce the most accurate forecasts for the one hundred thirty-five series in competition #1 it was necessary to use every parameter value in the range of FS2-FS20. As such, hypothesis H01 was rejected and it has been concluded that changing the fuzzy set parameter does affect forecast accuracy. The importance of this finding is that it indicates that general criteria will need to be established for selecting the fuzzy set parameter value in the DSA method.
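Pending such general criteria, the series-specific selection used here amounts to fitting each (FS) model and keeping the one with the lowest error on held-out data. The sketch below assumes sMAPE as the selection criterion; `forecast_fn` is a hypothetical stand-in for a DSA-style forecaster, not the dissertation's implementation:

```python
def select_fuzzy_set_parameter(train, valid, forecast_fn, lo=2, hi=20):
    """Return the fuzzy set count in [lo, hi] whose forecasts of the
    validation observations have the lowest sMAPE.

    forecast_fn(train, n_sets, h) stands in for a DSA-style forecaster
    producing h forecast values from the training series."""
    def smape(actual, forecast):
        return 100.0 * sum(abs(a - f) / ((abs(a) + abs(f)) / 2.0)
                           for a, f in zip(actual, forecast)) / len(actual)

    best_n, best_err = None, float("inf")
    for n_sets in range(lo, hi + 1):  # try FS2 through FS20
        err = smape(valid, forecast_fn(train, n_sets, len(valid)))
        if err < best_err:
            best_n, best_err = n_sets, err
    return best_n
```

This mirrors the way a smoothing weight is tuned per series in exponential smoothing: the parameter is chosen for each series rather than fixed for a data type.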
Figure 6.1 Overall FS Frequency
Figure 6.2 Yearly Micro FS Frequency
Figure 6.3 Yearly Industry FS Frequency
Figure 6.4 Yearly Macro FS Frequency
Figure 6.5 Quarterly Micro FS Frequency
Figure 6.6 Quarterly Industry FS Frequency
Figure 6.7 Quarterly Macro FS Frequency
Figure 6.8 Monthly Micro FS Frequency
Figure 6.9 Monthly Industry FS Frequency
Figure 6.10 Monthly Macro FS Frequency
Figure 6.11 Yearly FS Frequency
Figure 6.12 Quarterly FS Frequency
Figure 6.13 Monthly FS Frequency

H02: A fuzzy set parameter of seven in a DSA model will yield the most accurate ex ante forecasts when compared to DSA models with a fuzzy set parameter other than seven, in the range of set values from two to twenty, all other model parameters held constant.

Song (1991) suggested that a model with a fuzzy set parameter equal to seven would produce the most accurate results. Song, whose background is in systems control, may have observed that in those applications seven fuzzy sets is in fact optimal, and by extension concluded that seven fuzzy sets would be optimal in forecasting applications. This study was the first to investigate the impact on relative accuracy of different values for the set parameter, as discussed above, as well as the first to evaluate the claim that seven sets is the optimal set value. Figure 6.1 – Figure 6.13 illustrate that for individual series, subcategories and categories, seven is not an optimal or universal value for the fuzzy set parameter. For example, Figure 6.1 illustrates that there is no single optimal set parameter value for the DSA method for the one hundred thirty-five individual series used in competition #1. In fact, the parameter value that produced the most accurate forecasts most frequently was FS11, followed by FS20, FS10 and FS8. Further, in Table B.1 in Appendix B, the difference between the sMAPE values of the most accurate model (FS9) and the least accurate model is in excess of fourteen percentage points.
As such, hypothesis H02 was rejected and it has been concluded that a fuzzy set parameter value of seven is neither an optimal nor a universal value. The importance of this finding is that it is unlikely that there is an optimal set parameter value, and modelers should anticipate that it will be necessary to examine a range of set parameter values to identify the (FS) model that will produce the most accurate forecasts. Again, this is similar to the process followed with other extrapolative methods to identify the model that will produce the most accurate forecasts, be they in-sample or ex ante forecasts.

H03: The ranking on forecast accuracy of the DSA method and the traditional methods compared in this study will be the same for all accuracy measures considered.

Table 5.18 – Table 5.33 for Competition #2 and Table 5.35 – Table 5.40 for Competition #3, excluding the summary tables, report the three models that provided the highest observed forecast accuracy, at the subcategory, category and all series levels of aggregation, for the seven accuracy measures, across the various forecast horizons. An examination of these tables reveals that the ranking of the three best performing methods in Competition #2 and Competition #3 differs, within a particular forecast horizon, for the seven measures of forecast accuracy. Similar results can be observed in the tables in Appendix C, containing the accuracy measures for each method. For example, in Table 5.19, for the average of all six forecast horizons, the order of the methods on the basis of their sMAPE values is DAA, DAAW, while the order on the basis of Average Ranking is DAAW, DAA, THET. As such, hypothesis H03 was rejected and it has been concluded that the ranking of the forecast accuracy of various methods will be different for the various accuracy measures being used.
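Of the seven accuracy measures, sMAPE is the one used throughout this study to break ties. As defined for the M3 competition, it scales each absolute error by the mean of the absolute actual and forecast values; a minimal sketch (function and variable names are illustrative):

```python
def smape(actual, forecast):
    """Symmetric MAPE as used in the M3 competition, in percent.

    Each horizon's absolute error is divided by the mean of the absolute
    actual and forecast values, then averaged across all horizons."""
    assert len(actual) == len(forecast)
    total = 0.0
    for a, f in zip(actual, forecast):
        total += abs(a - f) / ((abs(a) + abs(f)) / 2.0)
    return 100.0 * total / len(actual)
```

Because the denominator involves both the actual and the forecast, sMAPE is bounded (at 200%) and penalizes over- and under-forecasts more symmetrically than ordinary MAPE, which is one reason it served as the headline measure in the M3 competition.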
The importance of this finding is that it reaffirms the findings of earlier studies, including the M competitions, and thus adds support to the existing body of knowledge on time series extrapolation.

H04: The ranking on forecast accuracy of a combination of alternative forecasting methods will be lower than that of the specific forecasting methods being combined.

In Table 5.34, the sMAPE values for the three traditional methods, Single, Holt and Dampen, and their combination, as well as for the DSA and Winters' methods and their combinations, have been reported for various forecast horizons. Relative to the traditional methods and their combination, the combination method outperforms Single and Holt in their native form for all six sets of averaged forecast horizons. The combination, however, does not outperform Dampen Trend Exponential Smoothing for these same forecast horizons. For example, the sMAPE for S-H-D for the average of forecast horizons 1-4 is 12.44, while the sMAPE for Dampen is 12.23. The other absolute differences are of approximately the same magnitude. Relative to the combinations of the DSA and Winters' methods, the findings are mixed. For the DSAB-W and DSAC-W methods, the combination outperforms both the corresponding DSA model and the Winters' model in their native form. In the case of the DSAA-W model, the combination method outperforms the Winters' method across all forecast horizon averages; however, the combined method does not outperform the DSAA model on any of the forecast horizon averages. For example, the sMAPE for DSAA-W for the average of forecast horizons 1-4 is 10.63, while the sMAPE for the DSAA model for the same forecast horizons is 10.00. Additionally, an examination of Table 5.38 indicates that the DSAA-W model outperforms the DSAA and Winters' models on yearly data with a trend, while the DSAA model outperforms the DSAA-W model on quarterly data with a trend.
Further, in Table 5.40 DSAA-W outperforms all other models, including the DSAA and Winters' models, on all forty-five series in which a statistically significant trend is present. As such, hypothesis H04 was not rejected, and it has been concluded that a combination of methods does not perform at least as well, in all settings, as the methods that have been combined do in their native form. This finding is disappointing in that this study has not reaffirmed a finding that has been reaffirmed in so many other accuracy studies. It may be that group difference testing could be used to resolve this disparity.

H05: The ranking on forecast accuracy of the DSA method and the traditional methods compared in this study does not depend on the length of the forecast horizon.

Table 5.18 – Table 5.33 for Competition #2 and Table 5.35 – Table 5.40 for Competition #3, excluding the summary tables, report the three models that provided the highest observed forecast accuracy, at the subcategory, category and all series levels of aggregation, for the seven accuracy measures, across the various forecast horizons. An examination of these tables reveals that the ranking of the three best performing methods in Competition #2 and Competition #3 differs, for a particular accuracy measure, across forecast horizons and the averages of those forecast horizons. Similar results can be observed in the tables in Appendix C, containing the accuracy measures for each method. For example, in Table 5.19, for the sMAPE accuracy measure, the three models with the highest observed accuracy for forecast horizon 1 are DAAW, DAA and THES, and the three models for forecast horizon 3 are DAA, DAAW and DABW. Further, for the 1-4 horizon average the models are ranked DAAW, DAA and THES, and for the 1-6 horizon average the models are ranked DAA, DAAW, DABW.
As such, hypothesis H05 was rejected and it has been concluded that the ranking of the forecast accuracy of various methods will be different across different forecast horizons for a given measure of forecast accuracy. The importance of this finding is twofold. Firstly, it reaffirms the findings of earlier studies, including the M competitions, and thus adds support to the existing body of knowledge on time series extrapolation. Secondly, it is important because it affects forecast method selection. The method that produces the most accurate forecasts across forecast horizons 1-4 may well not be the method that produces the most accurate forecasts over forecast horizons 1-18. So, modelers should be certain that they select the model that will perform best for the forecast horizons of interest.

H06: The ranking on forecast accuracy of the time series specific DSA model will be less than or equal to the ranking on forecast accuracy of both the subcategory and category specific DSA models.

H07: The ranking on forecast accuracy of the DSA method will be lower than that of the traditional extrapolative methods to which it is being compared in this study, by time series subcategory, time series category and for all of the time series evaluated in this study.

In competition #2 the goal was to establish the relative accuracy of the DSAA, DSAB and DSAC models; eight traditional methods; a combination of traditional methods; and a combination of the DSA models and Winters' Exponential Smoothing. The DSAA, DSAB and DSAC models were developed in competition #1 to help answer the question: is there a fuzzy set parameter value, for a data type and time interval subcategory, or a time interval category, that will yield more accurate results for those levels of aggregation than it will for the series level of aggregation?
Relative to the combination models, the traditional combination model was designated S-H-D and the fuzzy-traditional combination models were designated DSAA-W, DSAB-W and DSAC-W. Relative forecast accuracy was assessed at the subcategory, category and all series levels of aggregation in Competition #2. The results of this competition indicate that the forecasts represented by the DSAA model, when assessed at the subcategory, category and all series levels of aggregation, were more accurate, in the aggregate, than were those represented by the DSAB and DSAC models. This finding is important but not necessarily surprising. This result suggests that the DSA method will produce the most accurate forecasts when it is used to produce ex ante forecasts for the forecast horizons of an individual series, and that less accurate forecasts will be obtained if the modeler selects a single fuzzy set parameter for all series of a particular data type or time interval. Certainly, from the standpoint of the economy of the DSA method, it would have been preferable to have a single fuzzy set parameter value that would produce the most accurate forecasts for a subcategory or category of data. This would be particularly true for manufacturing environments where forecasts for thousands of product components must be produced on a routine basis. This finding reinforces the findings in competition #1 that were used to test H01 and H02. In Competition #2 the DSAA model and its derivative, the DSAA-W model, dominated the competition. In the subcategory competition the DSAA model provided forecasts with the highest observed accuracy for five of the subcategories, while the DSAA-W model provided forecasts with the highest observed accuracy for two of the subcategories. In total these two models provided forecasts with the highest observed accuracy for seven of nine subcategories.
In the category competition the DSAA model and the DSAA-W model each provided forecasts with the highest observed accuracy for one of the categories. In total, these two models provided the highest observed accuracy for two of three categories. In the All Series competition, the DSAA model provided forecasts with the highest observed accuracy for all one hundred thirty-five time series in this competition. The DSAA-W method provided forecasts with the second highest observed accuracy. The Theta method, which received such wide acclaim in the M3 competition, was ranked third in the All Series competition. As such, hypotheses H06 and H07 were rejected. It was concluded relative to hypothesis H06 that the most accurate forecasts, for a large number of series in the aggregate, will be obtained by first obtaining the most accurate forecasts produced by the DSA method for the forecast horizons of individual series. The importance of this finding is that it indicates to modelers that they should not assume that a particular fuzzy set parameter will produce the most accurate forecasts for a given data type, but that they should first produce forecasts across the forecast horizon of each series in future studies of the DSA method. It was concluded relative to hypothesis H07 that the DSAA and DSAA-W models produce forecasts that are in most cases more accurate than those produced by the traditional extrapolative methods evaluated in this study. Remarkably, this conclusion holds across a broad range of time series that differ by data type (micro, industry, macro); time origin (yearly, quarterly, monthly); forecast horizon (six, eight, eighteen); presence of a mix of time series components (average, trend and seasonal); and, although it was not explicitly tested, training set length (fourteen, seventeen, thirty-six, forty-one, fifty-six, fifty-one, fifty-six, one hundred sixteen and one hundred twenty-six). The exceptions to this list are Yearly-Micro and Monthly-Macro data.
The importance of this finding is that the hypothesis presented in several prior studies, namely that a statistically simple method robust to the fluctuations in real-life time series could advance the search for improvements in the forecast accuracy of extrapolative methods, appears to be supported by the performance of the DSA method in this study. In so doing, this study has demonstrated that the DSA method is a method on which future research can be justified. In addition, these findings provide to those who wish to advance the research on the DSA method some specific facts about its implementation that will help focus the direction of any future research on this method.

H08: The ranking on forecast accuracy of the DSA method will be lower than that of the traditional extrapolative methods to which it is being compared in this study, on those series in which a statistically significant trend is present.

The fuzzifier module of the DSA method was designed to implicitly forecast the trend component in a time series, without the need to explicitly forecast the trend through decomposition or through modification of a forecast with an external parameter, as is the case with Holt's and Winters' methods, and the Damped Trend Exponential Smoothing and Theta methods, respectively. In this way the DSA method can truly be classified as a statistically simple extrapolative forecasting method. In competition #3 the DSAA and DSAA-W models were again top performers. Together these models provided forecasts with the highest observed accuracy in two of the three categories of time series and were among the three models that provided the highest observed accuracy in at least one of the other categories. In the All Series competition DSAA-W provided the forecasts with the highest observed accuracy and DSAA provided forecasts with the second highest observed accuracy.
As such, hypothesis H08 was rejected, and it has been concluded that the ranking of the DSA method on forecast accuracy is at least as high as that of the alternative traditional extrapolative methods evaluated in this study on time series in which a statistically significant trend is present. The importance of this finding is that it demonstrates that, at least for series in which only the average and trend components were present, the new fuzzifier module provides the DSA method with the ability to accurately forecast time series with a trend.

6.2 Limitations

There are a number of limitations to the conclusions that can be drawn from this study's findings, or for that matter from any forecasting competition, and in particular from studies that rely on the procedures and data from the M competitions. In section 4.6 a discussion has been provided that enumerates the concerns with the forecasting competition methodology and the specific limitations that it imposes on the findings of this study. In this study, the decision to rely on the data and procedures from the M3 competition brought with it a limitation. Specifically, the heterogeneous nature of all of the series in the nine subcategories resulted in a sample of time series that were difficult to differentiate other than on the basis of the criteria set forth by the designers of the M3 competition, Makridakis and Hibon (2000). These series each contained, for the most part, a mix of time series components, outliers and high variation. For this reason it was not possible to test the DSA method's performance on sub-samples of series containing a seasonal component, on series that were highly volatile, or on series that had only the average component, with the single exception of a sub-sample of series with a trend component that was evaluated in Competition #3. Another problem specific to this current study was the omission of an All Series competition within Competition #1.
At the outset of the study the plan was to evaluate the accuracy of the DSA method at the individual series, subcategory and category levels of aggregation. Given the performance of the DSAB and DSAC models in competition #2, however, it would have been interesting to assess the relative accuracy of a DSA model representing an all series level of aggregation.

6.3 Contributions to Theory

This study has made several important contributions to the body of knowledge on business forecasting. The first, and most important, relates to the overall performance of the DSA method. Fildes and Makridakis (1998), Makridakis and Hibon (2000), and Fildes (2001) have argued that future research should aim to improve the accuracy of extrapolative methods by taking into account the real-life behavior of time series, that is, by developing methods that are robust to the fluctuations that occur in real-life time series. The DSA method was introduced in response to this prior research, as a statistically simple method that would be robust to the fluctuations in real-life data. The superior performance of the DSA method when compared to traditional methods may be a measure of this method's robustness, and in this way the findings of this study support the hypothesis of those authors. This finding will, at the very least, add weight to the argument that statistically simple methods produce forecasts that are at least as accurate as forecasts produced by statistically sophisticated methods because they are robust. At most it will change the direction of extrapolative forecasting method development toward a focus on methods that are robust to the fluctuations in real-life business data. In addition, these results have reconfirmed two important findings from the M competitions and other accuracy studies.
They are, firstly, that different accuracy measures will rank models differently in terms of relative accuracy, and secondly, that models' relative accuracy will differ across the forecast horizons and forecast horizon averages of a particular series. This study has also made important contributions to the body of knowledge on the development of fuzzy logic based extrapolative methods. Firstly, the results of this study provide justification for additional research on the DSA method in particular, and on fuzzy logic extrapolative methods in general. Secondly, this study demonstrated the use of the Mamdani Framework for the development of a fuzzy logic based extrapolative method. This framework provides the structure and a common platform for the future development of the DSA method or other fuzzy logic based extrapolative methods. The hope is that research will focus on developing the current modules to better match specific forecasting conditions or problems. Thirdly, this study has demonstrated that the value of the fuzzy set parameter can be changed to improve forecast accuracy, much in the same way that a parameter weight can be changed in an exponential smoothing model to improve forecast accuracy. Finally, the results of this study have demonstrated that there does not appear to be an optimal or universal value for the fuzzy set parameter.

6.4 Future Directions

The DSA method has been shown in this study to be capable of producing forecasts that rival in accuracy the forecasts produced by many traditional and routinely used statistically simple extrapolative forecasting methods. As such, additional research on this method appears to be justified. The next step in the development of the DSA method is to establish criteria for the a priori selection of the fuzzy set parameter. Work in this area has already begun, and the focus is on a multi-pass analysis of the training set, in other words, the use of multiple validation sets.
The goal is to establish criteria that will identify in advance the value for the fuzzy set parameter that will produce the most accurate forecasts. The results of this current study are serving as the platform for this new research. Although the DSA method performed remarkably well overall, there were two subcategories and one category, in competition #2, in which it did not provide forecasts that were as accurate as it did for other subcategories and categories of data. The subcategories were yearly and monthly macroeconomic data, and the category results were attributable to the DSA method's performance in the yearly macroeconomic data subcategory. The results in the two subcategories of macroeconomic data are attributable to the nature of the trend in these time series. Unlike the trend that exists in the majority of the time series in the other subcategories, in which the trend component, either growth or decay, is damped in the validation data set, the trend component in the macroeconomic data exhibits constant growth, and in some instances constant decay. It is this constant trend that appears to adversely affect the DSA method's performance. As a first attempt to address this problem, a fuzzy growth factor was developed, based on the end points of the training set. This growth factor was expressed in fuzzy set units and worked remarkably well. In fact, when applied to the yearly macroeconomic subcategory, the DSAA model provided the most accurate forecasts of all models in the competition for this subcategory of data. The problem is that this approach, in which the trend is explicitly accounted for, is less desirable than an approach in which the trend is forecast implicitly. It is recommended, therefore, that the DSA method's defuzzifier and inference modules be modified to allow for output sets that are different from the input sets. The output sets would implicitly forecast the trend.
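The growth-factor idea can be sketched as follows. This is an illustrative reconstruction only; the dissertation's actual fuzzy growth factor may be computed differently. The sketch shows the principle: measure the net set movement, in fuzzy set units, between the end points of the training set, and shift the assigned set by that amount at each step of the horizon.

```python
def fuzzy_growth_forecast(train, n_sets, horizon):
    # Hypothetical sketch of a growth factor in fuzzy-set units, derived
    # from the end points of the training set (illustrative only).
    lo, hi = min(train), max(train)
    width = (hi - lo) / n_sets if hi > lo else 1.0
    to_set = lambda x: min(int((x - lo) / width), n_sets - 1)
    # Growth factor: net set movement between end points, per period
    growth = (to_set(train[-1]) - to_set(train[0])) / (len(train) - 1)
    base = to_set(train[-1])
    # Each horizon step shifts the assigned set by the growth factor
    return [lo + (base + growth * h + 0.5) * width
            for h in range(1, horizon + 1)]

trend = [100, 104, 109, 115, 120, 126, 131]   # constant-growth series
print(fuzzy_growth_forecast(trend, 5, 3))
```

On a constant-growth series like this one, the forecasts continue to rise beyond the last observation instead of flattening at the top fuzzy set, which is the behavior the plain set-assignment approach lacks.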
As product life cycles become shorter, as a result of rapid technology obsolescence and increased competition, there will be a need for forecasting methods that can provide accurate forecasts, for various forecast horizons, based on small training data sets. Although it was not hypothesized in this study that the DSA method would provide forecasts that were at least as accurate as those of traditional methods regardless of the length of the training set, the results of this study, as discussed in section 6.2, suggest that the DSA method is not impacted by training set length. To verify this observation, an empirical investigation should be undertaken to examine the impact of training set length on the relative accuracy of the DSA method. It should be noted that the DSA method is capable of producing forecasts using training sets that have as few as three observations.

6.5 Conclusions

The results of this study demonstrate that the observed forecast accuracy of the DSA method is at least as good as, and in many cases better than, that of the traditional models to which it was compared, across a heterogeneous selection of time series. The DSA method's performance under these various conditions is likely attributable to two factors. Firstly, the method is statistically simple and forecasts the various components of the time series implicitly. Secondly, and equally important, is the role played by Fuzzy Logic in this otherwise traditional extrapolative method. Fuzzy Logic has held a preeminent position in the field of systems control for over two decades. The success of Fuzzy Logic in these applications has been attributed to its robustness to the anomalies that exist in real-life data, a robustness that results from a coarser modeling approach than that of traditional methods and from its nonlinear mapping of inputs to outputs.
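Both the three-observation minimum and the nonlinear input-output mapping can be seen in a minimal sketch of a set-assignment forecast. This is a hypothetical illustration, not the dissertation's implementation: only the observed range and the last observation are needed, so three data points suffice.

```python
def dsa_step(train, n_sets):
    # Minimal sketch of a DSA-style one-step forecast (illustrative only):
    # the last observation is assigned directly to one of n_sets equal-width
    # fuzzy sets over the observed range, and that set's center is the
    # forecast -- a piecewise-constant, hence nonlinear, input-output map.
    lo, hi = min(train), max(train)
    width = (hi - lo) / n_sets if hi > lo else 1.0
    idx = min(int((train[-1] - lo) / width), n_sets - 1)
    return lo + (idx + 0.5) * width

# Three observations are enough to define the range and the last value:
print(dsa_step([120, 115, 118], n_sets=4))   # 118.125
```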
The Direct Set Assignment extrapolative forecasting method was developed within the Mamdani framework and was designed to mimic the data processing approach of a fuzzy logic controller. While the DSA method has performed admirably in this first comparison to other statistically simple extrapolative forecasting methods, there remain many opportunities to further improve the accuracy of the DSA method. Specific suggestions for future research have been provided in section 6.4.

APPENDIX A
DESCRIPTIVE STATISTICS

Table 1 Descriptive Statistics

APPENDIX B
COMPETITION #1

Table B.1 Yearly sMAPE Values
Table B.2 Quarterly sMAPE Values
Table B.3 Monthly sMAPE Values
Table B.4 Yearly Average Rank Values
Table B.5 Quarterly Average Rank Values
Table B.6 Monthly Average Rank Values
Table B.7 Average symmetric MAPE: yearly micro data
Table B.8 MedAPE: yearly micro data
Table B.9 Average Rank: yearly micro data
Table B.10 MAD: yearly micro data
Table B.11 medRAE: yearly micro data
Table B.12 % Better: yearly micro data
Table B.13 Benchmark: yearly micro data
Table B.14 Average symmetric MAPE: yearly industry data
Table B.15 MedAPE: yearly industry data
Table B.16 Average Rank: yearly industry data
Table B.17 MAD: yearly industry data
Table B.18 medRAE: yearly industry data
Table B.19 % Better: yearly industry data
Table B.20 Benchmark: yearly industry data
Table B.21 Average symmetric MAPE: yearly macro data
Table B.22 MedAPE: yearly macro data
Table B.23 Average Rank: yearly macro data
Table B.24 MAD: yearly macro data
Table B.25 medRAE: yearly macro data
Table B.26 % Better: yearly macro data
Table B.27 Benchmark: yearly macro data
Table B.28 Average symmetric MAPE: quarterly micro data
Table B.29 MedAPE: quarterly micro data
Table B.30 Average Rank: quarterly micro data
Table B.31 MAD: quarterly micro data
Table B.32 medRAE: quarterly micro data
Table B.33 % Better: quarterly micro data
Table B.34 Benchmark: quarterly micro data
Table B.35 Average symmetric MAPE: quarterly industry data
Table B.36 MedAPE: quarterly industry data
Table B.37 Average Rank: quarterly industry data
Table B.38 MAD: quarterly industry data
Table B.39 medRAE: quarterly industry data
Table B.40 % Better: quarterly industry data
Table B.41 Benchmark: quarterly industry data
Table B.42 Average symmetric MAPE: quarterly macro data
Table B.43 MedAPE: quarterly macro data
Table B.44 Average Rank: quarterly macro data
Table B.45 MAD: quarterly macro data
Table B.46 medRAE: quarterly macro data
Table B.47 % Better: quarterly macro data
Table B.48 Benchmark: quarterly macro data
Table B.49 Average symmetric MAPE: monthly micro data
Table B.50 MedAPE: monthly micro data
Table B.51 Average Rank: monthly micro data
Table B.52 MAD: monthly micro data
Table B.53 medRAE: monthly micro data
Table B.54 % Better: monthly micro data
Table B.55 Benchmark: monthly micro data
Table B.56 Average symmetric MAPE: monthly industry data
Table B.57 MedAPE: monthly industry data
Table B.58 Average Rank: monthly industry data
Table B.59 MAD: monthly industry data
Table B.60 medRAE: monthly industry data
Table B.61 % Better: monthly industry data
Table B.62 Benchmark: monthly industry data
Table B.63 Average symmetric MAPE: monthly macro data
Table B.64 MedAPE: monthly macro data
Table B.65 Average Rank: monthly macro data
Table B.66 MAD: monthly macro data
Table B.67 medRAE: monthly macro data
Table B.68 % Better: monthly macro data
Table B.69 Benchmark: monthly macro data
Table B.70 Average symmetric MAPE: yearly all data
Table B.71 MedAPE: yearly all data
Table B.72 Average Rank: yearly all data
Table B.73 MAD: yearly all data
Table B.74 medRAE: yearly all data
Table B.75 % Better: yearly all data
Table B.76 Benchmark: yearly all data
Table B.77 Average symmetric MAPE: quarterly all data
Table B.78 MedAPE: quarterly all data
Table B.79 Average Rank: quarterly all data
Table B.80 MAD: quarterly all data
Table B.81 medRAE: quarterly all data
Table B.82 % Better: quarterly all data
Table B.83 Benchmark: quarterly all data
Table B.84 Average symmetric MAPE: monthly all data
Table B.85 MedAPE: monthly all data
Table B.86 Average Rank: monthly all data
Table B.87 MAD: monthly all data
Table B.88 medRAE: monthly all data
Table B.89 % Better: monthly all data
Table B.90 Benchmark: monthly all data

APPENDIX C
COMPETITION #2

Table C.1 Average symmetric MAPE: yearly micro data
Table C.2 MedAPE: yearly micro data
Table C.3 Average Rank: yearly micro data
Table C.4 MAD: yearly micro data
Table C.5 medRAE: yearly micro data
Table C.6 % Better: yearly micro data
Table C.7 Benchmark: yearly micro data
Table C.8 Average symmetric MAPE: yearly industry data
Table C.9 MedAPE: yearly industry data
Table C.10 Average Rank: yearly industry data
Table C.11 MAD: yearly industry data
Table C.12 medRAE: yearly industry data
Table C.13 % Better: yearly industry data
Table C.14 Benchmark: yearly industry data
Table C.15 Average symmetric MAPE: yearly macro data
Table C.16 MedAPE: yearly macro data
Table C.17 Average Rank: yearly macro data
Table C.18 MAD: yearly macro data
Table C.19 medRAE: yearly macro data
Table C.20 % Better: yearly macro data
Table C.21 Benchmark: yearly macro data
Table C.22 Average symmetric MAPE: quarterly micro data
Table C.23 MedAPE: quarterly micro data
Table C.24 Average Rank: quarterly micro data
Table C.25 MAD: quarterly micro data
Table C.26 medRAE: quarterly micro data
Table C.27 % Better: quarterly micro data
Table C.28 Benchmark: quarterly micro data
Table C.29 Average symmetric MAPE: quarterly industry data
Table C.30 MedAPE: quarterly industry data
Table C.31 Average Rank: quarterly industry data
Table C.32 MAD: quarterly industry data
Table C.33 medRAE: quarterly industry data
Table C.34 % Better: quarterly industry data
Table C.35 Benchmark: quarterly industry data
Table C.36 Average symmetric MAPE: quarterly macro data
Table C.37 MedAPE: quarterly macro data
Table C.38 Average Rank: quarterly macro data
Table C.39 MAD: quarterly macro data
Table C.40 medRAE: quarterly macro data
Table C.41 % Better: quarterly macro data
Table C.42 Benchmark: quarterly macro data
Table C.43 Average symmetric MAPE: monthly micro data
Table C.44 MedAPE: monthly micro data
Table C.45 Average Rank: monthly micro data
Table C.46 MAD: monthly micro data
Table C.47 medRAE: monthly micro data
Table C.48 % Better: monthly micro data
Table C.49 Benchmark: monthly micro data
Table C.50 Average symmetric MAPE: monthly industry data
Table C.51 MedAPE: monthly industry data
Table C.52 Average Rank: monthly industry data
Table C.53 MAD: monthly industry data
Table C.54 medRAE: monthly industry data
Table C.55 % Better: monthly industry data
Table C.56 Benchmark: monthly industry data
Table C.57 Average symmetric MAPE: monthly macro data
Table C.58 MedAPE: monthly macro data
Table C.59 Average Rank: monthly macro data
Table C.60 MAD: monthly macro data
Table C.61 medRAE: monthly macro data
Table C.62 % Better: monthly macro data
Table C.63 Benchmark: monthly macro data
Table C.64 Average symmetric MAPE: yearly all data
Table C.65 MedAPE: yearly all data
Table C.66 Average Rank: yearly all data
Table C.67 MAD: yearly all data
Table C.68 medRAE: yearly all data
Table C.69 % Better: yearly all data
Table C.70 Benchmark: yearly all data
Table C.71 Average symmetric MAPE: quarterly all data
Table C.72 MedAPE: quarterly all data
Table C.73 Average Rank: quarterly all data
Table C.74 MAD: quarterly all data
Table C.75 medRAE: quarterly all data
Table C.76 % Better: quarterly all data
Table C.77 Benchmark: quarterly all data
Table C.78 Average symmetric MAPE: monthly all data
Table C.79 MedAPE: monthly all data
Table C.80 Average Rank: monthly all data
Table C.81 MAD: monthly all data
Table C.82 medRAE: monthly all data
Table C.83 % Better: monthly all data
Table C.84 Benchmark: monthly all data
Table C.85 Average symmetric MAPE: overall data
Table C.86 MedAPE: overall data
Table C.87 Average Rank: overall data
Table C.88 MAD: overall data
Table C.89 medRAE: overall data
Table C.90 % Better: overall data
Table C.91 Benchmark: overall data

APPENDIX D
COMPETITION #3

Table D.1 Average symmetric MAPE: yearly micro data
Table D.2 MedAPE: yearly micro data
Table D.3 Average Rank: yearly micro data
Table D.4 MAD: yearly micro data
Table D.5 medRAE: yearly micro data
Table D.6 % Better: yearly micro data
Table D.7 Benchmark: yearly micro data
Table D.8 Average symmetric MAPE: yearly industry data
Table D.9 MedAPE: yearly industry data
Table D.10 Average Rank: yearly industry data
Table D.11 MAD: yearly industry data
Table D.12 medRAE: yearly industry data
Table D.13 % Better: yearly industry data
Table D.14 Benchmark: yearly industry data
Table D.15 Average symmetric MAPE: yearly macro data
Table D.16 MedAPE: yearly macro data
Table D.17 Average Rank: yearly macro data
Table D.18 MAD: yearly macro data
Table D.19 medRAE: yearly macro data
Table D.20 % Better: yearly macro data
Table D.21 Benchmark: yearly macro data
Table D.22 Average symmetric MAPE: quarterly micro data
Table D.23 MedAPE: quarterly micro data
Table D.24 Average Rank: quarterly micro data
Table D.25 MAD: quarterly micro data
Table D.26 medRAE: quarterly micro data
Table D.27 % Better: quarterly micro data
Table D.28 Benchmark: quarterly micro data
Table D.29 Average symmetric MAPE: quarterly industry data
Table D.30 MedAPE: quarterly industry data
Table D.31 Average Rank: quarterly industry data
Table D.32 MAD: quarterly industry data
Table D.33 medRAE: quarterly industry data
Table D.34 % Better: quarterly industry data
Table D.35 Benchmark: quarterly industry data
Table D.36 Average symmetric MAPE: quarterly macro data
Table D.37 MedAPE: quarterly macro data
Table D.38 Average Rank: quarterly macro data
Table D.39 MAD: quarterly macro data
Table D.40 medRAE: quarterly macro data
Table D.41 % Better: quarterly macro data
Table D.42 Benchmark: quarterly macro data
Table D.43 Average symmetric MAPE: monthly micro data
Table D.44 MedAPE: monthly micro data
Table D.45 Average Rank: monthly micro data
Table D.46 MAD: monthly micro data
Table D.47 medRAE: monthly micro data
Table D.48 % Better: monthly micro data
Table D.49 Benchmark: monthly micro data
Table D.50 Average symmetric MAPE: monthly industry data
Table D.51 MedAPE: monthly industry data
Table D.52 Average Rank: monthly industry data
Table D.53 MAD: monthly industry data
Table D.54 medRAE: monthly industry data
Table D.55 % Better: monthly industry data
Table D.56 Benchmark: monthly industry data
Table D.57 Average symmetric MAPE: monthly macro data
Table D.58 MedAPE: monthly macro data
Table D.59 Average Rank: monthly macro data
Table D.60 MAD: monthly macro data
Table D.61 medRAE: monthly macro data
Table D.62 % Better: monthly macro data
Table D.63 Benchmark: monthly macro data
Table D.64 Average symmetric MAPE: yearly trend all data
Table D.65 MedAPE: yearly trend all data
Table D.66 Average Rank: yearly trend all data
Table D.67 MAD: yearly trend all data
Table D.68 medRAE: yearly trend all data
Table D.69 % Better: yearly trend all data
Table D.70 Benchmark: yearly trend all data
Table D.71 Average symmetric MAPE: quarterly trend all data
Table D.72 MedAPE: quarterly trend all data
Table D.73 Average Rank: quarterly trend all data
Table D.74 MAD: quarterly trend all data
Table D.75 medRAE: quarterly trend all data
Table D.76 % Better: quarterly trend all data
Table D.77 Benchmark: quarterly trend all data
Table D.78 Average symmetric MAPE: monthly trend all data
Table D.79 MedAPE: monthly trend all data
Table D.80 Average Rank: monthly trend all data
Table D.81 MAD: monthly trend all data
Table D.82 medRAE: monthly trend all data
Table D.83 % Better: monthly trend all data
Table D.84 Benchmark: monthly trend all data
Table D.85 Average symmetric MAPE: Trend All data
Table D.86 MedAPE: Trend All data
Table D.87 Average Rank: Trend All data
Table D.88 MAD: Trend All data
Table D.89 medRAE: Trend All data
Table D.90 % Better: Trend All data
Table D.91 Benchmark: Trend All data
