                   THE DEVELOPMENT AND VALIDATION OF A FUZZY LOGIC
                        METHOD FOR TIME-SERIES EXTRAPOLATION




                          By


               JEFFREY STEWART PLOUFFE




A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE
           REQUIREMENTS FOR THE DEGREE OF
                DOCTOR OF PHILOSOPHY
                          IN
               BUSINESS ADMINISTRATION




              UNIVERSITY OF RHODE ISLAND
                         2005




                                     ABSTRACT


       It has been established across a large number of studies that statistically simple
forecasting methods for the extrapolation of univariate time series of business data
provide, in most situations, more accurate ex ante forecasts than statistically
sophisticated ones. The problem is that scholars attempting to develop new, more
accurate forecasting methods have all but ignored this knowledge about forecast
accuracy. Fildes and Makridakis (1998), Makridakis and Hibon (2000), Fildes (2001)
and Small and Wong (2002) suggest that what is needed are new, statistically simple
extrapolative forecasting methods that are robust to the fluctuations that occur in
business data.
      This dissertation discusses the development and validation of the Direct Set
Assignment (DSA) extrapolative forecasting method. The DSA method is a new,
statistically simple, non-linear extrapolative forecasting method that was developed
within the Mamdani Development Framework, and was designed to mimic the
architecture of a fuzzy logic control system.
      The relative forecast accuracy of the DSA method was established through the
use of three forecasting competitions. The time series used in these competitions
included, as required, one hundred thirty-five series drawn from the M3 International
Forecasting Competition. These series represent nine subcategories and three
categories of data: yearly, quarterly and monthly series, each containing
microeconomic, macroeconomic and industry data. In the first competition it was
found that the fuzzy set parameter, in the range of two to twenty fuzzy sets, can be
manipulated in the DSA method to improve ex ante forecast accuracy.
      In Competitions #2 and #3, the most accurate DSA methods from Competition
#1 were compared to alternative simple extrapolative methods, including those that
were found to produce the most accurate forecasts in the M3 Forecasting Competition
held in 2000.
      The DSA method, and its combination with Winters' exponential smoothing,
provided the highest observed forecast accuracy in seven of the nine subcategories of
time series, and ranked in the top three in terms of observed accuracy in the other two
subcategories. In addition, these methods provided the highest observed accuracy in
two of the three categories of time series and ranked in the top three in terms of
observed accuracy in the third. They also provided the highest observed forecast
accuracy across all one hundred thirty-five series used in the competition, and the
highest observed accuracy for time series with a trend component.




                             ACKNOWLEDGEMENTS


      There are a number of people who have contributed both directly and indirectly
to this research who deserve my thanks. This includes my parents Robert and Mildred
Plouffe, whose love of discovery and incomparable work ethic have served as a
constant source of inspiration to me.
      Thank you also to the exceptional group of scholars whom I have been so
fortunate to have as program advisors. Professor Jeffrey Jarrett, my major advisor,
mentor and friend for over a decade provided me extraordinary latitude in pursuing
my interests, but was always there to provide a course correction when required. I am
forever grateful to him, and would not have completed this work without his patience
and guidance.
      Professor Shaw Chen in countless meetings and discussions taught me how to
operationalize my research ideas and helped me develop an appreciation for analytical
rigor. His insights regarding this research were invaluable and it has been a pleasure to
work with him. Professor Jerry Cohen has been a continual source of ideas. His sage
advice is reflected throughout this project and his attitude toward research, and his
commitment to quality has set a standard to which I aspire. Professor John Boulmetis
has provided me with endless support and encouragement throughout my program.
John's enthusiasm and curiosity for research and teaching have influenced and
inspired me more than he could possibly know. A special thank you to Professor
Choudary Hanumara and Dean Maling Ebrahimpour for their time, and their interest
in this project. Their support and patience throughout the later stages of my program
have been greatly appreciated, and their suggestions for improvements advanced the
quality of this research. These two gentlemen represent the very best of the academic
profession.
      I also owe a special debt of gratitude to Dean Maling Ebrahimpour and
Professor Paul Mangiamelli for encouraging me to become a member of the Decision
Science Institute, whose conferences served as the platform on which this research
                                         -4-
was developed. I am also especially grateful to Michelle Hibon, Senior Research
Fellow at INSEAD Business School, for providing me with, and helping me sort
through, the extraordinary amount of data generated by the M3-Competition. Her
suggestions led to significant improvements to this project overall.




                                    CHAPTER 1
                                 INTRODUCTION


      This chapter provides an overview of research that was conducted to develop
and validate a new fuzzy logic based method for time-series extrapolation. This new
forecasting method is called the Direct Set Assignment (DSA) method.
      The first section of this chapter describes the problem, debated by scholars and
practitioners working in the field of business forecasting for nearly two decades, that
justifies the need for this research.
Specifically, the problem is that extrapolative forecasting method development during
the past two decades has focused on the development of statistically sophisticated
methods despite the fact that research on forecast accuracy conducted during this
same period has shown that statistically simple methods that are robust to the
fluctuations that occur in real-world data produce forecasts that are at least as
accurate as those produced by statistically sophisticated methods.
      A subsequent section offers a hypothesis as to why statistically simple methods
produce forecasts that are at least as accurate as those produced by statistically
sophisticated methods. Also provided is a discussion of the perceived benefits of
fuzzy logic and an argument for why it should serve as the basis for an extrapolative
forecasting method.
      A brief discussion of the major research hypotheses of this study has been
included, followed by a discussion of the experimental design that was used to
evaluate seven specific null hypotheses. The chapter closes with a few remarks about
significant findings from this research and an outline of the remaining chapters in this
dissertation.


1.1 Problem Specification and Research Justification


      Makridakis and Hibon (1979) were among the first to report that statistically
simple extrapolative forecasting methods provide forecasts that are at least as accurate
as those produced by statistically sophisticated methods. This conclusion was in
conflict with the accepted view at the time and was not well received by the great
majority of scholars.
      In response to these criticisms Makridakis and Hibon held the M-Competition
(1982), the M2-Competition (1993) and the M3-Competition (2000). In each of these
additional studies, the major findings of the Makridakis and Hibon (1979) study were
upheld and this included the finding concerning the relative accuracy of statistically
simple extrapolative methods.
      In addition to the M-Competitions, myriad other studies, described as accuracy
studies, were conducted utilizing new time series as well as time series from the
M-Competitions, and they confirmed the original findings of Makridakis and Hibon
regarding the relative accuracy of extrapolative methods. These studies include
Geurts and Kelly (1986); Clemen (1989); Fildes (1984); Lusk and Neves (1984);
Koehler and Murphree (1988); Armstrong and Collopy (1992); Makridakis et al.
(1993) and Fildes et al. (1998).
      The problem, as reported by Fildes and Makridakis (1998) and Makridakis and
Hibon (2000), is that many scholars have all but ignored the empirical evidence that
has accumulated across these competitions on the relative forecast accuracy of various
extrapolative methods, under various conditions. Instead they have concentrated their
efforts on building more statistically sophisticated forecasting methods, without
regard to the ability of such methods to accurately predict real-life data.
      Makridakis and Hibon (2000) suggest that future research should focus on
exploiting the robustness of simple extrapolative methods, which are less influenced
by the real-life behavior of data, and that new statistically simple methods should be
developed.


1.2. Forecast Accuracy and Simple Extrapolative Methods


      Makridakis and Hibon (2000) suggest that real-life time series are not stationary,
and that many of them also reflect structural changes resulting from the influence of
fads and fashions, and that these events can change established patterns in the time
series. Moreover, the randomness in business time series is high and competitive
actions and reactions cannot be accurately predicted. Also, unforeseen events
affecting the series in question can and do occur. In addition, many series are
influenced by strong cycles of varying duration and lengths whose turning points
cannot be predicted. It is for these reasons that statistically simple methods, which do
not explicitly extrapolate a trend or attempt to model every nuance of the time series,
can and do outperform more statistically sophisticated methods.


1.3 Fuzzy Logic


      Mukaidono (2002) concluded, "It is a big task to exactly define, formalize and
model complicated systems", and it is precisely at this task that fuzzy logic has
excelled. In fact, fuzzy logic has routinely been shown to outperform classical
mathematical and statistical modeling techniques for many applications involving the
modeling of real world data.
      For example, fuzzy logic has found wide acceptance in the field of systems
control. Fuzzy logic has been used in control applications ranging from controlling
the speed of a small electric motor to controlling an entire subway system. In nearly
every one of these applications fuzzy logic control systems have been shown to
outperform more traditional, yet highly advanced, digital control systems.
      Fuzzy logic's success in these applications has been attributed to its ability to
effectively model real world data. Mukaidono (2002) suggests that fuzzy logic's
success lies in the fact that it offers a "rougher modeling approach".
      The process of digital control is actually remarkably similar to time series
extrapolation. In a digital control system, sensors provide a set of quantitative or
qualitative observations as input to the controller. The controller in turn models those
inputs and provides either a qualitative or quantitative output to the system that is
under control. In time series extrapolation, a set of historical observations on a time
series serves as the input data to the forecasting method. The method then produces an
output that in the case of time series extrapolation is the forecast or future value of the
time series of interest.
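      As an illustrative sketch of this structural parallel (hypothetical function names;
neither the DSA method nor any specific controller), both tasks can be expressed as
functions that map a sequence of observations to a single output value:

    from typing import Sequence

    def control_step(sensor_readings: Sequence[float]) -> float:
        """A controller maps current sensor observations to a control output."""
        # e.g., a simple proportional controller driving the last reading toward zero
        return -0.5 * sensor_readings[-1]

    def forecast_step(history: Sequence[float]) -> float:
        """A forecaster maps historical observations to a one-step-ahead forecast."""
        # e.g., the naive method: the next value is predicted to equal the last
        return history[-1]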
      Given the similarities with respect to the task of modeling complex real world
data, and the structure of the two modeling systems, a fuzzy logic based method for
time series extrapolation would appear to be the type of statistically simple method
which Makridakis and Hibon (2000) suggest is needed.


1.4 Major Research Hypotheses


      It is clear from over two decades of research on the relative accuracy of various
extrapolative methods that simple methods will, in most forecasting situations and for
most data types, produce the most accurate ex ante forecasts.
      In this study there are two major hypotheses. The first hypothesis is that the ex
ante forecast accuracy of the DSA method will change in response to changes in the
fuzzy set parameter. The fuzzy set parameter is the number of fuzzy sets used to
model the time series of interest. The second hypothesis is that the DSA method will
provide more accurate ex ante forecasts than the traditional extrapolative forecasting
methods to which it has been compared.


1.5 Research Approach


      Elton and Gruber (1972), Reid (1972), and Newbold and Granger (1974), were
among the first to establish the relative accuracy of different forecasting methods
across a large sample of time series. However, these early studies compared only a
limited number of methods. Makridakis and Hibon (1979) extended this early work
by comparing the accuracy of a large number of methods across a large number of
heterogeneous, real-life business time series.
      In 1982, Makridakis and Hibon conducted a second accuracy study. In this
study the authors invited forecasting experts, each with expertise in a particular
extrapolative method, to participate, thereby creating a forecasting competition.
Since 1982 there have been a number of improvements made to the forecasting
competition methodology, particularly in terms of predictive and construct validity.
      The research, which is the subject of this current study, relied on the data,
methods and procedures of the M3 Forecasting Competition conducted in 2000 as this
competition utilized the most recent advances in the forecasting competition
methodology. In this current study, three competitions were required to evaluate the
research hypotheses and to establish the relative forecast accuracy of the Direct Set
Assignment Method.


1.6 Important Findings


      The results of this research support three of the major findings of the prior
forecasting competitions and accuracy studies conducted during the past two decades.
Most important, however, is that the findings of this study support both major
research hypotheses discussed above. Thus it can be concluded that the DSA method
does produce ex ante forecasts that are as accurate as, and in most instances more
accurate than, the forecasts produced by the alternative extrapolative methods to
which it was compared. These alternative methods are the methods that produced the
most accurate forecasts for the identical time series from the M3 Forecasting
Competition held in 2000.


1.7 Organization of Dissertation


      Chapter 2, Literature Review and Research Hypotheses, is the next chapter, and
it begins with a discussion of the importance of the field of business forecasting as
well as an overview of the methods that are available for producing forecasts of
business data. A section has been devoted to extrapolative forecasting methods as they
are the subject of this study. Background on the use of forecast accuracy as the
primary criterion for extrapolative forecast method selection has also been provided.
Also in this chapter is a comprehensive review of forecasting competitions and other
studies conducted to establish the relative forecast accuracy of extrapolative methods.
      Six sections are devoted to a review of the data processing technology, fuzzy
logic. These sections describe the origin of fuzzy logic, the mechanics of fuzzy logic,
its applications, the Mamdani Framework for fuzzy logic method development and
the background on the use of fuzzy logic in time series extrapolation.
      The chapter also presents the specific research hypotheses that have been
evaluated in this study. The chapter concludes with a brief summary that describes the
linkage between past research and this current research.
      Chapter 3, The Direct Set Assignment Method, opens with a discussion of the
application of the Mamdani Framework to the development of the DSA method. The
next two sections of this chapter contain examples of the DSA method used to
produce forecasts of a non-seasonal as well as a seasonal time series. The chapter
closes with a summary on the development and use of the DSA method.
      Chapter 4, Methodology, opens with a description of the forecasting methods
that have been used in this study, including a brief overview of the DSA method, and
a description of the six forecast accuracy measures that were used to establish the
relative accuracy of the forecasting methods compared in this study. A subsequent
section discusses the data used in this study, including complete descriptive statistics.
The chapter continues with a discussion on the study's experimental design referred to
as a forecasting competition, and includes specific details on each of the three
forecasting competitions that were conducted to test the hypotheses outlined in
Section 2.12. The chapter closes with a summary.
      Chapter 5, Results, provides summary tables that contain the values of the six
accuracy measures, for each method being compared, for multiple forecast horizons.
These tables have been provided for each of the three forecasting competitions
conducted in this study. The chapter closes with a brief summary.
      Chapter 6, Discussion, contains an evaluation of each of the research
hypotheses discussed in Section 2.12 in the context of the results presented in
Sections 5.1-5.3, including statements of major findings. These findings include that
the DSA method, and the DSA method in combination with Winters' exponential
smoothing, were the top-performing methods in this competition. The specific
hypotheses being tested have been restated in this section for convenience. The
chapter continues with an assessment of the contributions of this research specifically
to the investigation of fuzzy logic extrapolative methods and to the theory and
practice of forecasting. The chapter closes with suggestions for future research on the
DSA method and some concluding remarks.




                                     CHAPTER 2
                                LITERATURE REVIEW




      This chapter opens with a discussion of the important role that business
forecasting plays in the operation of many businesses. The chapter continues with a
review of a recently introduced taxonomy of business forecasting methods. Special
attention is given to extrapolative forecasting methods, the methods that are the focus
of this research.
      The subsequent section provides a review of the role of forecast accuracy in
forecasting method selection. A discussion on the measurement of forecast accuracy
has been provided as well. Also included is the background on the use of a
methodology, referred to as a forecasting competition, to establish the relative forecast
accuracy of extrapolative forecasting methods for different forecasting situations. The
research presented in these sections provides the justification for the development of a
new more accurate, statistically simple, extrapolative forecasting method.
      Following the discussion on the need for new extrapolative forecasting methods,
are three sections on the data processing technology, Fuzzy Logic. The first two of
these three sections highlight why fuzzy logic may be uniquely suited for the job of
time series extrapolation. The third of these three sections describes a theoretical
framework that is routinely used to develop fuzzy logic methods for all manner of
data processing applications.
      The second to last section provides statements of the hypotheses that will be
evaluated in this study, as well as a discussion of the relevance of each hypothesis.
This chapter concludes with a summary that integrates the prior research on the
accuracy of extrapolative methods with the justification for this research.


2.1 The Need For Business Forecasting


     The role of management in all organizations is to oversee the functions of
planning, administering and controlling (Daft, 1983; Koontz, 1984; Jarrett, 1991). The
planning function, referred to as the first function of management, focuses on the
development of strategy, allocation of resources and establishment of the policies that
guide the operation of the organization into the future. The future, in the context of the
planning process, is referred to as the planning horizon and it can range from a few
hours for decisions about production schedules, to several years for decisions
concerning capital expenditures and enterprise strategy implementation (Ascher,
1978; Armstrong, 1978; Makridakis, 1998).
     Unfortunately, management must make these crucial planning decisions in an
environment of uncertainty about the outcome of the future events that serve as key
inputs to the firm's planning processes. These events include such things as the future
levels of product demand and market-share, raw material and labor costs, inventories,
personnel requirements as well as the impact of various market, competitive and
economic factors on their organization's performance, to name but a few (Jarrett, 1991;
Makridakis, 1998).
     Business Forecasting is a formal process for managing the uncertainty inherent
in an organization's planning process by providing a numerical prediction or forecast
of the future level of the event or key input of interest (Jarrett, 1991; Baines, 1992;
Altabet, 1998; Li, Ang and Gray, 1999; Winklhofer, 2002).
     The field of Business Forecasting began in earnest in the late 1950's at a time
when many individuals questioned the validity of a discipline aimed at predicting an
uncertain future. However, since that time the results of empirical research presented
in over one thousand articles and books have demonstrated the efficacy of business
forecasting. Today, business forecasting is a discipline with a strong and
comprehensive theoretical framework and one that is widely accepted by scholars,
and routinely applied by practitioners (Chatfield, 1997; Makridakis, 1996; Makridakis,
1998; Ord, 2000; Armstrong, 2001).
     The widespread adoption of databases and data warehouses combined with the
continued decline in the cost of mass storage have allowed businesses to capture and
store data on virtually every aspect of their operation. Giacomini (2003) suggests that
it is for this reason that the literature on, and interest in, business forecasting are
experiencing a renaissance.


2.2 Forecast Method Taxonomy and Selection


        A large number of business forecasting methods have been developed during
the past several decades. Chambers, Mullick and Smith (1971) were among the first to
examine the problem of how to select from among available methods. These authors
created a chart of six forecasting performance criteria by eighteen forecasting
techniques. The performance criteria included accuracy, application, data required,
cost, ease of implementation and robustness. Their rating of each technique on each
criterion was based on these authors' general impressions.
        Reid (1972) advanced the idea of Chambers, Mullick and Smith (1971) by
representing the method selection process in the form of a decision tree in which the
branches reflected the criteria, and the rating of each model was based on empirical
evidence as opposed to general impressions.
        Jenkins (1974) suggested that a better approach than classifying methods on
criteria was to simply use the Box-Jenkins method to identify and estimate a model
from the ARIMA class of time series models in all forecasting situations.
        Armstrong (1982) conducted a survey of academics and practitioners at the
First International Symposium on Forecasting to solicit their opinions on the criteria
they felt were most important for selecting a forecasting method. The survey results
indicated that 70% of practitioners believed that accuracy was the most important
criterion for selecting a method. The notion that accuracy should be the primary
criterion for selecting the appropriate forecast method was reinforced in studies by
Newbold and Granger (1974), Reid (1975), and Makridakis and Hibon (1979, 1982).
        Georgoff and Murdick (1986) were the first to suggest that guidelines, based on
prior research findings, could be used to identify the method that would be most
accurate for a given forecasting situation. This was important research that laid the
groundwork for the current belief that forecasting method accuracy tends to be data
and situation specific.
      Dalrymple (1987) used a mail survey to obtain information about the use of
methods for sales forecasting in one hundred thirty four US companies. These
companies reported that they relied on Expert Opinion (sales force 44.8%, executives
37.3% and industry experts 14.9%); Analogies (leading indicators 18.7%);
Econometric models (12.7%) and Extrapolation (49.6%) to produce their forecasts.
He also cited several other studies on the use of forecasting methods that contained
similar findings.
      Rhyne (1989) conducted a survey of the senior management at forty hospitals.
It was reported that a "jury of executives" was used to produce a forecast by 87% of
respondents with 67% relying on the forecasts produced by experts. Extrapolative
methods were used by 65% of respondents followed by 12.5% of respondents who
used regression analysis.
      Frank and McCollough (1992) conducted a similar survey to that of Dalrymple
that included 290 practitioners of the Finance Officer Association for US state
governments. They found that the most widely used forecasting method by this group
was judgment 82%, followed by trend line 52%, econometric techniques 26%,
moving averages 26% and exponential smoothing 10%.
      Sanders and Manrodt (1994) found that while knowledge of quantitative
methods seemed to be increasing over time, firms still relied heavily on judgmental
methods.
      Yokum and Armstrong (1995) conducted an analysis of previous survey
research and concluded that accuracy was the most important selection criterion.
Further, these authors highlight that the implications of selecting the most accurate
methods are extremely important in practical terms, as even small improvements in
the accuracy of a forecast can provide considerable savings to an organization.
Makridakis (2000) also reported his observation that forecast accuracy was of primary
importance.
      The increasing focus on forecast accuracy as the primary criterion for forecast
method selection, described by Makridakis (2000), resulted primarily from the
increasing presence of digital computing technology. The processing power of
computers made selection criteria such as cost and time all but irrelevant. Further, the
availability of forecasting software provided forecasters with the ability to produce
quantitatively derived forecasts as readily as a judgmental forecast or forecasts based
on expert opinion.
      Makridakis (2000), Meade (2000) and Armstrong (2001) report that a number
of important conclusions about the relative accuracy of alternative forecasting
methods have consistently been reached in prior empirical studies of forecast
accuracy. They are: 1) the accuracy of a structured approach, whether data is available
or not, is greater than the accuracy of an ad hoc approach; 2) the accuracy of
quantitative methods exceeds that of judgmental methods when enough data exists;
and 3) the accuracy of extrapolative methods often exceeds that of explanatory
variable or causal models, depending on the level of change in the variable of interest.
      Armstrong (2001) in "Selecting Forecasting Methods" presents a decision tree
that allows users to identify the forecasting method that should produce the most
accurate forecast given a number of situational factors and conditions. The structure
of the tree
is based on a method taxonomy in which the myriad of forecast models and
techniques that have been developed during the past few decades are classified into
one of ten methods or method categories.
      In this taxonomy the ten methods belong to one of two major categories. The
categories are judgmental methods and quantitative methods. Judgmental methods
rely on the forecaster's judgment, the opinion of experts, leading indicators,
situations and intuition to produce a forecast value of the variable or event of interest.
Quantitative methods in contrast rely on statistical relationships within and among
data collected on the specific variable or event of interest, as well as on related
variables and events, to produce the required forecast.
      The category of Judgmental Methods includes six sub-categories: Expert
Forecasting, Judgmental Bootstrapping, Conjoint Analysis, Intentions, Role Playing
and Analogies. The sub-categories of Quantitative Methods are: Time Series
Extrapolation, Explanatory, Rule-Based Forecasting and Expert Systems. These
sub-categories each contain many alternative models, or alternative specifications of a
model, that are used to actually produce the numeric forecast.
        Armstrong (2001) developed the following rules to select the most accurate
method from among the ten alternative methods. Given that there is enough data
available, quantitative methods will produce more accurate forecasts than judgmental
methods. If quantitative methods are selected, then the forecaster needs to consider
whether the causal influences on the variable of interest are known; the amount of
change that is expected in that variable of interest; the type and amount of data that is
available; the need for policy analysis; and the extent of domain knowledge.
        If judgmental methods are selected, the forecaster needs to consider whether or
not large changes are expected in the value of the variable of interest over the forecast
horizon; whether a large number of forecasts will be required; differing views among
key decision makers; and policy considerations. Figure 2.1 is the decision tree from
Armstrong (2001).


2.3 Traditional Extrapolative Methods


       The Extrapolative Forecasting methods from Figure 2.1 are the focus of this
study. These methods are accurate, reliable and easy to automate, and for these
reasons they are the most popular quantitative methods. These methods are widely
used for producing inventory and production forecasts, demand forecasts, budgeting
forecasts, operational planning and some long-term forecasts, as well as forecasts in
many other areas of a business's operations (Armstrong, 2001).
       Extrapolative Forecasting Methods produce a forecast or future value of a
variable of interest by examining the past behavior of that variable. Unlike
explanatory variable or econometric methods, extrapolative methods do not attempt to
identify the factors that are responsible for the historical levels of the variable of
interest.


                                               (Figure 2.1)


       In an extrapolative method it is the passage of time that acts as a proxy for
whatever is really causing the behavior of the variable.
       The goal with extrapolative methods is to identify the pattern in the values of
the variable of interest, and then extrapolate that pattern into the future. Extrapolative
methods have routinely been found to provide more accurate forecasts than
econometric methods (Makridakis, 2000; Meade, 2000; Armstrong, 2001).
       To use an extrapolative method the variable of interest must be organized as a
time series. A time series is a collection of historical observations on a quantitative
variable that are equidistant with respect to time and are arranged sequentially. For
example, a ten-observation time series of annual sales data would be the sales level
measured on December 31st for each of the years 1995-2004. While any time interval
is possible, in business forecasting most data is captured on a daily, weekly, monthly,
quarterly or yearly basis.
       Although dozens of extrapolative methods have been developed during the past
thirty years, practitioners and academics have adopted fewer than twenty methods for
regular use. In theory, extrapolative methods are particularly good for producing
short-term forecasts where, within reason, the past behavior of the variable of interest
is a good predictor of the variable's future behavior.
       These methods range from simple to complex relative to the statistical
procedures required to model the historical observations of a time series. The general
form of an extrapolative method is

$\hat{Y}_{t+1} = f(Y_t, Y_{t-1}, Y_{t-2}, \ldots, Y_0, t)$

       In this equation $\hat{Y}_{t+1}$ represents the predicted value of the time series
for one time period ahead. The most recent actual historical observation is designated
as $Y_t$. The next most recent actual historical observation is $Y_{t-1}$, which occurs
in period $(t-1)$, and so on. This model implies that a predicted value is a function of
its previous values and time (Jarrett, 1991).
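       As an illustrative sketch of this general form (assuming simple exponential
smoothing, one of the statistically simple methods discussed in this chapter, rather
than the DSA method), f might be instantiated as follows:

    from typing import Sequence

    def ses_one_step(history: Sequence[float], alpha: float = 0.3) -> float:
        """One-step-ahead forecast via simple exponential smoothing."""
        level = history[0]
        for y in history[1:]:
            # fold each observation into the current level estimate
            level = alpha * y + (1 - alpha) * level
        return level  # the smoothed level serves as the forecast for period t+1

    print(ses_one_step([100.0, 102.0, 101.0, 105.0, 107.0]))  # approximately 103.5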


      Time series data has five basic components. The first four are the average,
trend, seasonal and cyclical components; these are referred to as the systematic
components. The fifth component is error, and it is the non-systematic component.
The relationship can be represented as data = pattern + error = (systematic
components) + error. Conventional wisdom suggests that to make accurate forecasts,
the extent to which each component is present in a given time series must be taken
into account. In fact, a considerable amount of research has focused on devising ways
to disaggregate the components of a time series so that forecasts can be produced of
each individual component. The final forecast of the time series overall is produced
by aggregating the component forecasts in a systematic way. This process is generally
referred to as time series decomposition.
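      As a small worked illustration of this additive relationship (with invented
component values):

    # data = (systematic components) + error, with invented values
    level, trend, seasonal, cyclical = 100.0, 2.0, -5.0, 1.5
    error = 0.8  # the non-systematic component
    observation = level + trend + seasonal + cyclical + error
    print(observation)  # 99.3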
      The average component, also referred to as the level component, is the sum of
the values of the time series divided by the number of time periods. Time series that
have only an average component are referred to as stationary series.
      The trend component is represented by the tendency for the values of the time
series to systematically increase or decrease over time. Trends can be linear as well as
curvilinear. Time series in which the trend component is present are referred to as
non-stationary series. Trends can be identified by conducting a visual inspection of a
plot of the data or by fitting a trend line.
      The seasonal component is any repeating pattern in the time series that has a
period of exactly one year for a complete cycle. This component represents a
predictable increase or decrease in demand depending on the week, month or season
of the year. The seasonal component can arise from calendar or climatic influences,
as well as from other influences that repeat at approximately the same time each year.
Seasonality is most frequently associated with monthly, quarterly and bi-annual
series; however, it can exist in a series of any time interval except series with a yearly
time interval. There are several ways to identify
seasonality in a time series. The first is to conduct a visual inspection of a line graph
of the values of the time series of interest. A second graphical approach is to examine
the Autocorrelation Function (ACF) for the series. The pattern in the ACF reveals the
presence or absence of a seasonal component. The most widely used approach,
however, is to calculate seasonal indices for the time series of interest. One method for
doing so is the ratio-to-moving-average method (Jarrett, 1991).
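      The following is a simplified sketch of the ratio-to-moving-average calculation
(an illustration assuming quarterly data whose first observation falls in the first
season; it simplifies, and is not a substitute for, the full procedure described by
Jarrett, 1991):

    def seasonal_indices(series, period=4):
        """Estimate one seasonal index per season via ratio-to-moving-average.

        Assumes the series covers at least two full years (len >= 2 * period).
        """
        n, half = len(series), period // 2
        ratios = [[] for _ in range(period)]
        for t in range(half, n - half):
            window = series[t - half:t + half + 1]
            # centered moving average: half-weight the end points (even period)
            cma = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
            ratios[t % period].append(series[t] / cma)
        return [sum(r) / len(r) for r in ratios]  # average ratio = seasonal index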
      The cyclical component is represented by long term repeating cycles in the
series that are not related to seasonal effects. This component arises from two factors.
The first is the business cycle, which is influenced by a number of economic factors
that cause the economy to go through a repeating pattern of recession and expansion.
The second factor is the product life cycle. The product lifecycle reflects demand for a
product from its introduction through its decline. The magnitude and duration of
cycles are difficult to predict. This difficulty arises from an inability to predict the
effects of national and international events, such as elections, wars or political turmoil
around the globe.
      The error component is represented by any fluctuations in the time series that
are not classified as one of the four systematic components. In essence these
fluctuations are error. This error results from the occurrence of non-periodic,
unpredictable and catastrophic events, including strikes, terrorist attacks and stock
market crashes. In addition, error can also arise from selecting a forecasting model
that is incorrectly specified given the nature of a particular time series, from
measurement error in the historical observations of the time series or from the
randomness inherent in the series itself. It is due to the presence of the error
component that forecasts are always wrong even though they may still be quite useful
for decision-making purposes. The value of the error component can be found as the
difference between the actual and forecast values for each time period.
      There are some general guidelines for the application of extrapolative methods.
Within this method category, there are methods that have been designed to be the
most appropriate method for extrapolating stationary series in which only the average
component is present; there are methods that are most appropriate for extrapolating
series containing a trend and methods that are most appropriate for extrapolating
series containing a seasonal component. In practice, however, it is difficult to know
which specific extrapolative method will produce the most accurate forecast of a
given time series. Thus, a common approach for selecting the most appropriate
extrapolative method for a given situation is to compare several alternative methods
as to their forecast accuracy.


2.4 Measures of Forecast Accuracy


      There are numerous measures available to establish the accuracy of the
forecasts produced by extrapolative forecasting methods. These measures reflect
different approaches to aggregating the individual differences between the observed
and forecast values for the same time period of a given time series. During the past
several decades a number of measures of forecast accuracy have been proposed.
      In some instances the accuracy measures are used to determine the accuracy of
the fit of the model to all of the historical observations of the time series in question.
This is referred to as in-sample forecast accuracy, or model-fit. In other instances
these measures are used to establish the accuracy of forecasts of the values of a time
series that were not used to calibrate the model. This second approach relies on post
sample or as ex ante forecasts, and is considered to be the preferred approach for
assessing the accuracy of an extrapolative forecasting method.
      In this second approach the actual historical observations of a time series are
divided into a training data set and a validation data set. This second set is frequently
referred to as the hold-out set. The training data is used to calibrate the model, and
the validation set contains the observations to which the forecast values will be
compared.
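      A minimal sketch of this split (the function name is hypothetical):

    def split_series(series, horizon):
        """Hold out the last `horizon` observations as the validation set."""
        return series[:-horizon], series[-horizon:]  # (training set, hold-out set)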
      Statisticians who focused primarily on theoretical considerations developed the
first forecast accuracy measures. One of these accuracy measures is the Mean Square
Error (MSE). To calculate this measure the individual difference between each
observed and forecast value for a given time period is squared. The average of these
squared errors is obtained. This average of the squared errors is the MSE. Another
early measure is the Root Mean Square Error (RMSE), which is obtained by taking
the square root of the MSE. Yet another early measure is Mean Absolute Deviation
(MAD), which is the average of the absolute value of the difference between each
observed and forecast value. In practice, MAD provides the most useful interpretation
of these three measures, as it describes on average by how much each forecast will be
wrong in the actual units of the time series in question.
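      These three definitions translate directly into code, as in the following
illustrative sketch:

    import math

    def mse(actuals, forecasts):
        """Mean Square Error: the average of the squared forecast errors."""
        return sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals)

    def rmse(actuals, forecasts):
        """Root Mean Square Error: the square root of the MSE."""
        return math.sqrt(mse(actuals, forecasts))

    def mad(actuals, forecasts):
        """Mean Absolute Deviation: the average absolute error, in the series' units."""
        return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)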
      Makridakis and Hibon (1979) in a large-scale empirical study relied on several
accuracy measures including: Theil's U, Mean Absolute Percentage Error (MAPE),
Percentage Better and Relative Ranking to determine the relative accuracy of the
forecasting methods being evaluated in their study.
      Carbone and Armstrong (1982) in a survey of forecasting experts found that
Root Mean Square Error (RMSE) was the most preferred measure of forecast
accuracy. The use of this measure was in opposition to the conventional wisdom at
the time that error measures such as RMSE, which are not unit-free, are not reliable
measures of relative accuracy. These authors found Mean Absolute Percentage Error
(MAPE) to be the most widely used unit-free accuracy measure, and they concluded
that the choice of error measure used to identify the most accurate
forecasting method appeared to be a question of personal taste.
      Ahlburg (1982) reviewed seventeen papers dealing with the accuracy of
population forecasts. The author found that Mean Absolute Percentage Error (MAPE)
was used in ten papers; Root Mean Square Error (RMSE) was used in four papers;
Root Mean Square Percentage Error (RMSPE) was used in three papers and Theil's U
was used in three papers. Multiple measures were used in four papers.
The author observed that no justification for the use of a particular measure was
provided in any of these papers.
      Armstrong and Collopy (1992) conducted research for the purpose of
establishing guidelines for the selection of the appropriate accuracy measure. In their
study these authors evaluated the relative accuracy of eleven extrapolative forecasting
methods, across one hundred ninety-one time series, with six different forecast
accuracy measures. These authors concluded that the choice of accuracy measure does
indeed make a difference in the identification of the most accurate forecasting method.
They recommended the Geometric Mean of the Relative Absolute Error (GMRAE)
accuracy measure when the need is to assess the accuracy of model fit. Further they
concluded that the error measure that should be used to select the most accurate
method for producing out-of-sample forecasts is Median Relative Absolute Error
(MdRAE), and in those situations when only a few series are being evaluated the
Median Absolute Percentage Error (MdAPE) should be used. They also observed that
the Percent Better error measure performed well when many series are being
evaluated. Finally, they concluded that RMSE is not reliable and should not be used
for comparing the accuracy of alternative methods across series.
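      For concreteness, the following sketch illustrates the RAE family (assuming the
common definition in which a method's absolute error is divided by the random
walk's absolute error for the same period):

    import math
    import statistics

    def rae(actual, forecast, last_observed):
        """Relative Absolute Error versus the random walk (naive) forecast."""
        return abs(actual - forecast) / abs(actual - last_observed)

    def mdrae(raes):
        """Median RAE across a collection of forecasts or series."""
        return statistics.median(raes)

    def gmrae(raes):
        """Geometric mean RAE; undefined when any individual RAE equals zero."""
        return math.exp(sum(math.log(r) for r in raes) / len(raes))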
      Fildes (1992) conducted a similar study to that of Armstrong and Collopy
(1992). In his study he observed that different forecasting conditions (data type,
forecast horizon, component type) affect the ranking on accuracy of alternative
forecasting methods by different accuracy measures. In this study Geometric Root
Mean Squared Error (GRMSE) and Median Absolute Percentage Error performed
well, while RMSE was found to be sensitive to values close to zero and MAPE was
sensitive to outliers.
      Makridakis (1993) investigated the concern among forecasters, at that time,
regarding the selection of the appropriate accuracy measure. In his study the author
reviews the prior research on accuracy measures and their selection under various
forecasting conditions. He suggests that Theil's U2, as well as the RAE measures
(including the Geometric, Mean and Median RAE), are highly problematic because
their divisor is the difference between the actual value and the random walk forecast,
which in some instances can be zero and in other instances can be very large. Further,
he indicates that the various RAE values are meaningless to most decision makers
and that the geometric means possess the additional problem of not being able to be
calculated when working with a large number of series.
      Because accuracy measures based on rankings, and median measures, are not
relative measures calculated as a ratio of a proposed model to a baseline model, they
are not suited to general forecasting use; however, they can be used in large-scale
accuracy studies. He does indicate that the Percentage Better measure is reliable, but
that it too should only be used in large-scale empirical studies, and that MSE and
RMSE are neither relative measures nor do they convey much meaning to decision
makers. An additional baseline error measure useful for establishing the relative
accuracy of alternative methods in large studies is Benchmark. Benchmark is simply
the difference between the SMAPE value of a benchmark method such as Naïve 2
and the SMAPE value for each of the other methods.
      Makridakis suggested further that MAPE is a relative measure that incorporates
the best characteristics of the other accuracy measures, and that it is the only one,
other than Percent Better, that leads to a meaningful interpretation by decision makers.
MAPE can be used in large-scale studies as well as for general use. The author
provides an overview of the problems with the MAPE measure and proposes a
modification to address its shortcomings. This improved measure was originally
referred to as modified MAPE and later came to be known as Symmetric MAPE, or
SMAPE.
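      The following sketch illustrates SMAPE, together with the Benchmark
calculation described above (an illustration of the definitions, not code from any of
the studies cited):

    def smape(actuals, forecasts):
        """Symmetric MAPE, in percent: |A - F| scaled by the mean of |A| and |F|."""
        terms = [abs(a - f) / ((abs(a) + abs(f)) / 2)
                 for a, f in zip(actuals, forecasts)]
        return 100 * sum(terms) / len(terms)

    def benchmark(smape_of_naive2, smape_of_method):
        """Benchmark measure: a method's improvement over the Naive 2 sMAPE."""
        return smape_of_naive2 - smape_of_method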
      Collopy and Armstrong (2000) conducted an empirical study to re-examine the
problems with RAE and to investigate the performance of SMAPE. These authors
concluded that the search for the most effective error measure for making
comparisons across series is still underway. They acknowledge that the SMAPE error
measure introduced by Makridakis (1993) has desirable characteristics, specifically
that it is a relative measure and that it is unbiased, and as such further investigation of
it is warranted. However, pending the results of further research, a relative error
measure such as MdRAE should also be used.


2.5 Forecasting Competitions


      Elton and Gruber (1972), Reid (1972), and Newbold and Granger (1974), were
among the first to establish the relative accuracy of different forecasting methods
across a large sample of time series. However, these early studies compared only a
limited number of methods. In these studies the accuracy of the various methods was
measured as model fit. Elton and Gruber (1972) also used group difference testing to
identify the most accurate method.
      Makridakis and Hibon (1979) compared the relative accuracy of nine
forecasting methods across one hundred eleven time series of strictly business and
economic data. The accuracy measures MAPE, Theil's U and Percentage Better were
used to establish the relative accuracy of the forecast methods using a validation data
set. The accuracy measures were aggregated across series in such a way that the most
accurate method for each data type in the study could be identified, as could the most
accurate method for all series used in the competition.
      Statistical tests for differences between methods as applied by Elton and Gruber
(1972) were abandoned in this study and in most subsequent large-scale accuracy
studies. There are three reasons for this decision. Firstly, in a forecasting competition
the methods are all reasonable alternatives for one another. For example, consider two
forecasting methods A and B, and suppose that Method A is found to have the higher
observed accuracy of the two. If A is the method selected to produce the forecast and
a real difference exists between A and B, then A would produce the more accurate
forecast. If, on the other hand, there is no difference between A and B and A is the
selected method, then method A will produce a forecast that is as accurate as a
forecast produced by method B. For this reason observed accuracy is emphasized in
method selection. It should be noted that in practice confidence intervals are routinely
used.
      Secondly, the accuracy measure Percent Better is typically considered to
provide more useful information about the real differences between the accuracy of
alternative methods and is used in most large-scale studies.
      Thirdly, studies have shown that the ranking of various methods on forecast
accuracy differs according to the accuracy measure used. Therefore, group difference
testing based on rankings derived from different measures of forecast accuracy would
lead to very different conclusions about which method produced the most accurate
forecasts.
      The major finding of this study was that simple extrapolative methods perform
at least as well as more statistically sophisticated ones such as Winters' Method,
which was designed to extrapolate the trend and seasonal components of the time
series in addition to the average component. This conclusion was in conflict with the
accepted view held by most experts in the late 1970's, that statistically sophisticated
forecasting methods would outperform statistically simple methods as the more
sophisticated methods could more precisely model the time series.
      Makridakis and Hibon (1982) introduced the first of what would become three
international forecasting competitions referred to generically as the M-competitions.
The goal of the competitions was to establish the relative accuracy of established, as
well as new, forecasting methods under various business-forecasting conditions and
for varying data types.
      In this study the authors used five measures of forecast accuracy, MAPE, MSE,
Average Ranking, Median Absolute Percentage Error (MdAPE) and Percentage Better,
to establish the relative accuracy of twenty-one extrapolative methods across one
thousand and one time series comprised of macroeconomic, microeconomic, industry
and demographic data captured at yearly, quarterly and monthly intervals.
Group difference testing was not used.
      Each of the time series was divided into a training data set and a validation data
set. The training set for the one thousand and one series was provided to each of nine
contestants who were experts with one of the twenty-one extrapolative methods. The
use of outside experts to produce the forecasts was in response to criticisms of the
authors' 1979 study in which the authors themselves produced all of the forecasts. The
experts produced six, eight and eighteen one-period-ahead forecasts for the yearly,
quarterly and monthly time series respectively.
      The forecasts were then compared to the values in the validation data set in a
post sample fashion. Accuracy measures were aggregated for each data type by time
period subcategory, each data type category and for all of the series in the study.
      The results of the M-competition were similar to those of the Makridakis and
Hibon 1979 study. The four important findings of this study were that: 1) statistically
sophisticated methods do not necessarily provide more accurate forecasts than do
simple models, 2) the ranking of methods on relative accuracy differs for different
accuracy measures, 3) the combination of the forecasts from alternative models
outperforms the accuracy of each of the methods being combined, 4) the ranking of
methods on forecast accuracy differs for different forecast horizons.
      Hill and Fildes (1984), Lusk and Neves (1984), and Koehler and Murphree
(1988) used the M-competition data in what amounted to replications of the
M-competition. They reported similar findings to those of Makridakis and Hibon
(1982).
      Gardner and McKenzie (1985), Geurts and Kelly (1986), and Clemen (1989)
relied on a portion of the M-competition data to develop and test new extrapolative
methods. In particular, the Gardner and McKenzie (1985) model, called Robust Trend
Exponential Smoothing, has been examined in a number of additional accuracy
studies since 1985, and has been found to be very accurate, particularly with yearly
time series and with time series that have a trend component.
      Armstrong and Collopy (1992, 1993) and Makridakis et al. (1993) applied the
notion of a competition established in the M-competition to a set of
telecommunications time series. The results of this study reconfirmed the four major
findings of the 1982 M-competition.
      Makridakis et al. (1993) discussed the findings of the M2 forecasting
competition, held during 1987 and 1988, as a way to advance the study of forecasting
accuracy and to address the major criticism of the M-competition. That criticism was
that forecasters in real situations could utilize domain knowledge about their business
and industry to improve the accuracy of extrapolative methods. The format of the
M2-competition therefore was designed to evaluate this hypothesis as well as to
evaluate hypotheses relating to the four 1982 M-competition findings.
      This competition consisted of distributing twenty-nine actual time series from
four companies to five expert forecasters. The competition was run in real time over
the course of a two-year period. The experts could incorporate any information they
could obtain from either the company or from secondary sources into their monthly
forecasts. The accuracy of the forecasts produced by the experts was measured on a
validation data set comprised of the actual values for the twenty-nine time series that
were obtained after the beginning of the competition. The accuracy measures used in
this study were MAPE, Percent Better, and Benchmark. In this study the Benchmark
calculations were based on the MAPE values for each method.
      The primary finding of the M2-competition is that the additional information
used by the experts did not result in forecasts that were more accurate than those
produced by the quantitative methods alone. Further, the other important findings of
this study reproduced the four major findings of the original M-competition.
      Makridakis and Hibon (2000) introduced the third and, the authors state, final
forecasting competition, the M3-competition. In the words of the authors: "The goal
of this study is to respond to those experts who continue to build more sophisticated
methods without regard to the ability of such methods to more accurately predict
real-life data".
      The M3-competition was designed to extend the M and M2 competitions, as
well as the myriad of other accuracy studies conducted during the past
twenty years. The M3-competition established the relative accuracy of twenty-four
extrapolative forecast methods across three thousand three time series of
macroeconomic, microeconomic, industry, demographic, financial and other data
captured at yearly, quarterly and monthly time intervals.
      The forecasting methods were used to produce forecasts for horizons of six,
eight and eighteen periods ahead, for yearly, quarterly and monthly data respectively,
for each of the data types. The forecasting methods examined in
this study range from the statistically simple random walk method to statistically
sophisticated methods that include neural networks, the Box-Jenkins approach, expert
systems and Rule Based Forecasting.
      Six measures of forecast accuracy were used to establish the relative accuracy
of the twenty-four methods. As with the earlier studies, the time series were divided
into a training data set and a validation data set. The accuracy measures used in this
study were Symmetric Mean Absolute Percentage Error (sMAPE), Average Ranking,
Percentage Better, Median Symmetric Absolute Percentage Error (MdAPE), Median
Relative Absolute Error (MdRAE) and Benchmark. As was the case in the earlier
M-competitions, accuracy measures were aggregated by data type-time interval
subcategory, by data type category and for all of the series overall. The four major
conclusions of the M3-competition reconfirmed the findings of the previous
M-competitions. The decision as to the most accurate method was based on consensus
among the error measures. Further, as difference testing was not used, the authors
chose to report the top three methods for the various subcategories, categories and for
all of the series evaluated in the competition.
      Fildes and Makridakis (1998), Makridakis and Hibon (2000), and Fildes (2001)
concluded, based on the findings of the M-competitions and other accuracy studies,
that simple extrapolative methods that do not explicitly extrapolate the trend or
seasonal components of a time series through decomposition provide forecasts that
are at least as accurate as those produced by sophisticated methods that do explicitly
extrapolate these components.
      These authors conjectured independently that the reason simple methods can
outperform more sophisticated methods is because the former are robust to features in
real-life time series that confound more complex methods. These features include: 1)
structural changes caused by fads and fashions that can result in changes to
established patterns; 2) a high level of randomness or uncertainty that results from
competitive actions and reactions and from unforeseen events that cannot be
accurately predicted; and 3) strong cycles of varying duration whose turning points
cannot be predicted.
      These authors argue that future research to improve the accuracy of
extrapolative methods should focus on the development of statistically simple
methods that can take into account the real-life behavior of time series.
      As an example, Makridakis and Hibon (2000) cite the introduction of a new
method, Theta (Assimakopoulos and Nikolopoulos, 2000). Although this method is not
based on strong statistical theory, it performs remarkably well across different types of
series, forecasting horizons and accuracy measures. Makridakis and Hibon (2000)
conclude: "Hopefully, new extrapolative methods, similar to Theta, can be identified
and brought to the attention of practicing forecasters".


2.6 Is Fuzzy Logic The Solution?
      Fuzzy Logic is a data processing technology that has received wide acclaim for
its ability to more accurately model real world data than traditional mathematical
approaches (Stevens, 1993; McNeil and Freiberger, 1993; Kosko, 1994; Hajek, 2002;
Mukaidono, 2002; Mendel, 2001; Nguyen and Walker, 2000).
      Fuzzy logic is based on both traditional logic and traditional set theory and was
developed in 1965 by Lotfi Zadeh, a professor of electrical engineering at the
University of California at Berkeley (Zadeh, 1965).
      Traditional propositional logic is based on the Laws of Thought, as defined by
Aristotle and other early Greek philosophers. In this system, a proposition is an
ordinary statement that is comprised of a priori defined terms. For example, "It is cold
outside today". One of these laws, The Law of The Excluded Middle, states that every
proposition must be either true or false and is accordingly associated with a
truth-value of 1 or 0 respectively. Meaningful propositions like the one in the above
example can be determined to be either true or false. Logical reasoning is the process
of combining propositions into other propositions forming a logical structure that
allows for the truth or falsity of all propositions in that structure to be determined.
      Propositions can be combined in many ways, all of which are derived from
three fundamental operations: conjunction, disjunction and implication. For two
propositions p and q, conjunction (denoted p ∧ q) asserts their simultaneous truth.
For example, it is snowing today AND it is cold today. Disjunction (denoted p ∨ q)
asserts the truth of either or both propositions. For example, it is snowing today OR
it is cold today. Implication (denoted p → q) asserts a conditional relationship
between two propositions in the form of IF-THEN rules. For example, IF it is cold
outside today THEN I will wear a warm jacket. In implications the proposition
associated with the IF portion of the rule is referred to as the antecedent and the
proposition associated with the THEN portion of the rule is referred to as the
consequent. Conjunction and disjunction can also be used to combine additional
propositions within the antecedent and consequent of the rules. For example, IF it is
cold today AND it is snowing today THEN I will wear a warm jacket AND a warm
hat.
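      These fundamental operations can be expressed compactly in code. The
following Python sketch is a minimal illustration, not drawn from any source in the
logic literature; min and max are used deliberately, as these same operators reappear
when the connectives are generalized to fuzzy logic later in this chapter.

    # Two-valued propositional connectives, with truth-values 1 (TRUE) and 0 (FALSE).
    def conjunction(p: int, q: int) -> int:
        return min(p, q)                  # p AND q

    def disjunction(p: int, q: int) -> int:
        return max(p, q)                  # p OR q

    def implication(p: int, q: int) -> int:
        return max(1 - p, q)              # IF p THEN q (material implication)

    # p = "it is cold today" (TRUE); q = "it is snowing today" (FALSE)
    p, q = 1, 0
    print(conjunction(p, q), disjunction(p, q), implication(p, q))   # 0 1 0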
       The German mathematician Georg Cantor in 1884 introduced traditional set
theory (McNeil and Freiberger, 1993). He proposed a theory of sets that very much
built on the work of the early Greek philosophers. Cantor defined sets as collections
of definite distinguishable objects. Sets can represent people, things, words, or any
creation of the human imagination. In Cantor's theory, sets divide the world into IN
and OUT or TRUE and FALSE with the associated truth-values of 1 or 0, respectively.
Each potential member of a set either belongs or does not belong to a given set. For
example, given two sets Cold Days and Hot Days, the day December 1st 2004 can be
assigned to one and only one of the sets based on the temperature in Fahrenheit on
that day. The similarities between these two bodies of thought are illustrative of their
common origin.
      Consider the similarity between the conjunction operation in logic and the
intersection operation in set theory. In conjunction, a proposition is true overall only
if proposition p AND proposition q are both true. In intersection, an element is in
the intersection only if the element is a member of Set 1 AND Set 2. The same
correspondence exists among many other logic and set operations. Further, in logic if
a proposition is true it is assigned a truth-value of 1 and if it is false it is assigned a
truth-value of 0. In set theory if an element is a member of a set it receives a
membership value of 1 and if it is not a member of a set it receives a membership
value of 0. Zadeh relied on these similarities to meld logic with set theory to form
fuzzy logic.


2.7 Zadeh's Epiphany


      Zadeh, who is also the father of modern Systems Theory, began working in the
area of complex systems in the 1950's. Zadeh (1962) concluded that, "as the
complexity of a system increases, it becomes more difficult and eventually impossible
to make a precise statement about its behavior, eventually arriving at a point of
complexity where the methods for reasoning and decision making born in humans is
the only way to get at the problem". Human beings reason and make decisions based
on human language rules that are organized as IF-THEN rules similar to a logical
implication (McNeil and Freiberger, 1993; Cox, 1994 and 1995). Zadeh observed
however that the use of traditional or two valued logic by computers prevented them
from manipulating data representing subjective or vague human ideas such as IF the
weather is fine today THEN I will wear appropriate clothing. To most readers there is
clearly some vagueness in the meaning of the words fine and appropriate in this rule.
However, these words undoubtedly have a very precise meaning to
the individual who spoke them. Vagueness is the condition that exists in which the
status of an object is a matter of definition. The question becomes one of how to
harness this decision making structure.
      Zadeh (1962) suggested, "We need a radically different kind of mathematics,
the mathematics of fuzzy or cloudy quantities which are not discernable in terms of
probability distributions". This appears to have be Zadeh's first reference to what
would latter become Fuzzy Logic.
      Zadeh however was only one in a long line of philosophers, mathematicians
and scientists who had wrestled with the problem of the excluded middle and its
associated vagueness. Plato was one of the first to raise concerns about the
appropriateness of the Law of the Excluded Middle and in so doing laid the
groundwork for what would become Fuzzy Logic. He observed that there was a third
region beyond TRUE and FALSE where in his words these opposites "tumbled about"
(Aziz, 1996).
      Charles Sanders Peirce, the preeminent nineteenth century philosopher, is
reported to have referred to those who split the world into TRUE and FALSE as the
"sheep and goat separators" (Nadin, 1983). He suggested instead that all that exists is
continuous, and such continuums govern knowledge. For example size is a continuum,
height is a continuum and even behaviors such as anger and sadness are also
continuums. He stated that vagueness "is no more to be done away with in the world
of logic than is friction in mechanics" (Burch, 2001).
      Bertrand Russell, another renowned philosopher, concluded in the early 1900's
that both vagueness and precision were features of language, not reality. Russell even
challenged the notion of TRUE and FALSE. He concluded that without precise
symbols they too are vague; therefore any proposition would have a range of facts
that would make it TRUE (Irvine, 2004). For example, the statement "This is a car"
could refer to a sports car, an economy car, a racecar or even a toy car. Russell (1923)
asserted: "Vagueness is clearly a matter of degree".
      Jan Lukasiewicz in the early 1900's, relying on the work of Russell, Peirce, and
others, introduced what was the first attempt at a formal model of vagueness. Today it
is referred to as three-valued logic, and it laid the foundation for the development of
Fuzzy Logic. In 1920 Lukasiewicz introduced a new logic in which the truth-value of
1 still stood for TRUE and the truth-value of 0 still stood for FALSE; however, he
added the new truth-value of 1/2, which stood for possible (McNeil and Freiberger,
1993). This represented a gigantic leap in the field of logic in that an assertion and its
negation had the same value. For example, it can be asserted that, it is possible that it
will snow today. This assertion has a truth-value of 1/2. The negation is, it is possible
that it will not snow today. The negation also has a truth-value of 1/2. Lukasiewicz
had created partial contradiction and in so doing opened the door to fuzzy logic.
      Zadeh (1965) set forth the mechanics of Fuzzy Logic in which he fused the
classic set theory of Cantor with the three-valued logic of Lukasiewicz. Zadeh
recognized that if it was possible, as Lukasiewicz had described, to have truth-values
of 0, 1/2 and 1, then it was also possible to have truth-values of 1/4 and 3/4 as well. In
fact, if these truth-values were possible, then there are actually an infinite number of
truth-values in the interval [0,1]. He concluded that truth actually exists in degrees. He
extended this notion of degrees of truth to set theory and concluded that membership
functions can assign elements to sets in degrees of truth in the interval [0,1] as well.
This allows elements to belong partly to a set. This is the origin of what Zadeh
described as a fuzzy set.
      Fuzzy sets discriminate much better between and among objects and supply
more information. While it is counterintuitive, fuzzy sets are more precise than
Cantor's bivalent sets. For example, consider the question: if an individual lives in the
state of Rhode Island, USA for half the year and in the state of Florida, USA for half
the year, is that individual a resident of Rhode Island? Cantor's sets are unable to
represent or answer this question. Fuzzy sets on the other hand can answer this
question with ease. The individual has a truth-value of 0.5 in the set Residents of
Rhode Island, USA and of 0.5 in the set Residents of Florida, USA. While these
membership assignments have been mistaken for probabilities they are in fact degrees
of truth. Probability values assert the chance that an entire set element belongs to a set,
whereas a grade of membership asserts the degree to which an element is a member of
a particular set.
      With the advent of fuzzy sets Zadeh had developed a mechanism through which
the vagaries of human thought or of data could be captured. As a final step Zadeh
combined traditional logical implications with fuzzy sets by replacing the propositions,
or portions of propositions, that serve as the antecedents and consequents of the
IF-THEN rules with fuzzy sets, thus creating fuzzy rules or fuzzy logical implications.
The same set theoretic operations that apply to Cantor's sets, including union and
intersection, or analogously the same logic operators, including conjunction and
disjunction, can be used to combine these fuzzy rules into fuzzy rule sets. This is the
essence of Fuzzy Logic.
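      The correspondence between the classical operations and their fuzzy
counterparts can be sketched briefly in code. The Python sketch below assumes the
standard (Zadeh) fuzzy operators, in which intersection takes the minimum and union
the maximum of two membership values; the named individuals are hypothetical and
are used only to echo the residency example above.

    # Standard (Zadeh) fuzzy set operations on degrees of membership in [0, 1].
    ri_residents = {"Alice": 0.5, "Bob": 1.0}   # Residents of Rhode Island, USA
    fl_residents = {"Alice": 0.5, "Bob": 0.0}   # Residents of Florida, USA

    def fuzzy_and(a: float, b: float) -> float:
        return min(a, b)                         # fuzzy intersection / conjunction

    def fuzzy_or(a: float, b: float) -> float:
        return max(a, b)                         # fuzzy union / disjunction

    # Alice belongs partly to both sets; Bob belongs wholly to one.
    print(fuzzy_and(ri_residents["Alice"], fl_residents["Alice"]))   # 0.5
    print(fuzzy_or(ri_residents["Bob"], fl_residents["Bob"]))        # 1.0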


2.8 Mapping a Domain With Fuzzy Logic


      Kosko (1994) introduced the Fuzzy Approximation Theorem (FAT) as an
explanation for how fuzzy logic models data. Fuzzy Logic approximates a function by
defining its surface initially with fuzzy sets and then covering that surface with
regions that the author refers to as patches. The input and output of the fuzzy method
can be associated together using these patches. Figure 2.2 below is an illustration of
how patches are used to map a function. In this way Fuzzy Logic provides a much
more accurate representation of the way systems behave in the real world.
                                        (Figure 2.2)


Cox (1995) states regarding FAT, "Instead of isolating a point on the function surface,
a fuzzy rule localizes a region of space along the function surface." When multiple
rules are executed, multiple regions are combined in the same local space to produce a
composite region or a fuzzy rule set. The final point of the surface is found through
defuzzification.
      As an example of the application of fuzzy logic to modeling a real world
phenomenon consider the decision to adjust the thermostat in a house in response to
the temperature in the house. In rule based fuzzy logic, the general form of the rules
are, IF x is A THEN y is B, where A and B are fuzzy sets. First, fuzzy sets need to be
established for both the antecedent and consequent of the rules. In this example the
sets are established for different domains (ambient temperature in the house and
thermostat setting), however this is not always the case. The number of sets required
to describe each domain is based on a number of factors including the modeler's
judgment about each domain, their prior experience, or trial and error. In most
instances fuzzy sets overlap, and the amount of overlap need not be the same for all
sets in a given model. For example some sets may overlap by 25% while others
overlap by 90%. As with the
number of sets, the decision on the amount of overlap is based on the modeler's
judgment, prior experience or is based on trial and error. The overlap between and
among sets is the feature of fuzzy logic that allows an object to have a degree of
membership in more than one set.
      In this example three sets (TOO COLD, JUST RIGHT, TOO HOT) will be used
to model the room temperature from 40 degrees Fahrenheit to 90 degrees Fahrenheit
and three sets (INCREASE, NOTCHANGE, DECREASE) will model the action to be
taken relative to the setting of the thermostat. These sets translate to the following
fuzzy rules and fuzzy rule set:


1) IF the room is too cold THEN the thermostat setting should increase

2) IF the room is just right THEN the thermostat setting should notchange

3) IF the room is too hot THEN the thermostat setting should decrease



      The above example is a demonstration of a simple control system designed to
regulate the temperature in a home. This is a multi-pass system in that the rules are
fired multiple times in order to regulate temperature. It is also possible to have a
single pass system where the rule set is developed from a single modeling or pass of
the data. Further, in this example, the modeler established the fuzzy relationship
between sets. For example the antecedent set, "too cold", was matched with the
consequent set, "increase". In other applications it is preferable to allow the data to
dictate the antecedent and consequent of the fuzzy rules.
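      The thermostat example can be made concrete with a short sketch. The Python
code below is illustrative only: the set boundaries, the triangular membership function
and the numeric thermostat adjustments are assumptions made for demonstration, not
values taken from an actual controller, and the rule conclusions are combined with a
simple membership-weighted average in the spirit of a center-of-sets defuzzifier.

    def triangle(x, left, center, right):
        """Degree of membership of x in a triangular fuzzy set."""
        if x == center:
            return 1.0
        if x <= left or x >= right:
            return 0.0
        if x < center:
            return (x - left) / (center - left)
        return (right - x) / (right - center)

    # Antecedent sets on room temperature (40-90 degrees Fahrenheit); overlapping.
    temp_sets = {
        "TOO COLD":   (40.0, 40.0, 68.0),   # membership peaks at the domain minimum
        "JUST RIGHT": (60.0, 68.0, 76.0),
        "TOO HOT":    (68.0, 90.0, 90.0),   # membership peaks at the domain maximum
    }
    # Consequent actions, as illustrative changes to the thermostat setting.
    actions = {"TOO COLD": +3.0, "JUST RIGHT": 0.0, "TOO HOT": -3.0}

    def adjust(temperature):
        """Fire the three IF-THEN rules and combine their conclusions."""
        doms = {name: triangle(temperature, *pts) for name, pts in temp_sets.items()}
        total = sum(doms.values())
        return sum(doms[n] * actions[n] for n in doms) / total if total else 0.0

    print(adjust(64.0))   # partly TOO COLD, partly JUST RIGHT: a small increase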


2.9 Fuzzy Logic Applications


      Although Fuzzy Logic was introduced first in the United States, American
scientists and academics generally avoided using it, mainly due to its unconventional
name. The same was generally true of scientists in Europe. It seems that many
scientists refused to be involved with a technology that had a name that sounded so
child-like (Kaehler, 1998). On the other hand many other scientists gave fuzzy logic
more serious consideration but nonetheless discounted fuzzy logic as being nothing
more than probability theory in disguise (Kosko, 1994; McNeil and Freiberger, 1993).
At the same time researchers in many Asian countries including China and Japan
enthusiastically accepted this new technology. Japan is currently positioned at the
leading edge of applied Fuzzy Logic research. The US in contrast, by some estimates,
is ten years behind in the applied use of this technology (Mendel, 2001).
      One of the first significant applications of Fuzzy Logic was in the area of
automated systems or machine control in 1973, (Krantz, 1999). At the University of
London Professor Ebrahim Mamdani and his graduate student Sedrak Assilian were
trying to stabilize the speed of a small steam engine. Although they were using the
most sophisticated digital control equipment available, they were unable to stabilize
the speed of the engine: it would either overshoot the target speed or would be too
sluggish in achieving the target speed. Professor Mamdani, as the story goes, had
recently read about the control method proposed by Professor Zadeh, and decided to
try it. He created a simple fuzzy logic controller that worked better than any of the
other systems that they had tried, (Sowell, n.d.; Krantz, 1999).
      The most well-known large-scale application of Fuzzy Logic to date is its use as
the control system for the subways, constructed in 1987, in Sendai, Japan. It has been
reported many times that the trains start and stop without the jolts and tugs of inertia
common to most subways. It has been estimated that the Fuzzy Logic controller used
on this subway system has resulted in a 10% fuel savings as well. Also in Japan
researchers created a Fuzzy Logic controller that can fly a helicopter that is missing
one of its rotor blades, something that not even a human pilot can do.
      Fuzzy Logic has received widespread acceptance as a technology for automated
systems control and it is gaining acceptance as a technology for many other data
processing applications. In Japan there are several billion dollars of successful Fuzzy
Logic based commercial products including auto-focusing cameras, washing
machines that adjust to how dirty the clothes are, automatic transmission and engine
controllers, anti-lock braking system controllers, color film developing systems and
computer programs that successfully trade in the financial markets (Krantz, 1999).


2.10 The Mamdani Development Framework


      Fuzzy Logic is a rich discipline in which there is more than one way to skin the
proverbial data processing cat. The wide range of fuzzy methods that have evolved for
the same application evidences this fact. With that said, most rule-based fuzzy logic
methods have four major components or modules. This four-module framework is
attributed to the work of Ebrahim Mamdani, (Mendel, 2001). The modules are, in
order of operation: fuzzification, inference, composition and defuzzification. Fuzzy
IF-THEN rules guide the operation of each module. Figure 2.3 provides a graphical
representation of the Mamdani Framework.
                                      (Figure 2.3)


      The fuzzification module establishes the fact base for the fuzzy method. The
input to this module is the scalar values of the data to be processed. In this module the
IF-THEN rules that will be used in all four modules are developed and established.
These rules are used in fuzzification module to associate the input scalar observations
with input fuzzy sets; in the inference module to associate input fuzzy sets with other
input fuzzy sets; in the composition module to create fuzzy rule sets and finally in
defuzzification to associate the output fuzzy sets with output scalar values. In addition,
the number of fuzzy sets that will be used to model the data is established, as are
the characteristics of the membership function for each set.
      The membership function is defined by the method developer and is used to
determine each scalar observation's membership in each fuzzy set. Each fuzzy set can
have a membership function that is unique to that set. The characteristics of a
membership function affect each scalar observation's degree of membership (DOM).
Membership functions have six characteristics: shape, height, width,
shouldering, center points, and overlap. These functions can be depicted graphically
in a Cartesian coordinate system with an x and a y axis. The most common shapes are
triangular, bell shaped, trapezoidal and exponential. The height of the function is
normalized or set at one so that maximum membership in any set is one. The width of
the function is its distance along the x-axis for each set and the width can vary by set.
Shouldering is typically used to lock the height for a given set at the maximum DOM
of one. The overlap between sets in many instances is set at 50%, however overlap
can range from zero to nearly 100% and the overlap between the various functions
does not have to be the same. Figure 2.4 provides a graphical depiction of a three set
fuzzy model.
                                      (Figure 2.4)




      While there are numerous approaches to assigning an observation's DOM, the
most frequently used method for control systems is to first identify the observation's
location on the x-axis and then project vertically to identify its location on the
membership function. The DOM in a given set can be established by judgment or
through a more standardized calculation but must be a value in the interval [0,1]. The
output of the fuzzification module is a fuzzy set or sets for each scalar observation.
       The inference module has as its input the fuzzy sets representing the fuzzified
scalar values established in the fuzzification module. In this step the appropriate
IF-THEN rules are evaluated resulting in inferences being made about the relationship
between the fuzzy sets. These relationships are often referred to as Mamdani fuzzy
relationships. These relationships are captured as fuzzy rules in which the fuzzy sets
serve as the antecedent and consequent of the rules. The relationship can be
determined a priori by the modeler as was the case in the earlier example of a fuzzy
home heating controller in section 2.8, or the relationship can be determined by the
data with each firing of the inference module's IF-THEN rules.
      The composition module has as its input the fuzzy rule set that was the output of
the inference module. In this module firing of its IF-THEN rule results in the creation
of composite fuzzy sets that serve as the fuzzy output. Individual fuzzy rules may
have different conclusions so composition is the process in which all rules are
considered and combined. The outputs are fuzzy sets that summarize the fuzzy
relationship between the observations in the original data.
       The Defuzzification module has as its inputs the composite or combined fuzzy
sets that served as the output of the composition module. In this module an IF-THEN
rule converts the fuzzy output into scalar values that can be used by the physical
system being modeled. There are many approaches to defuzzification including
Maximum, Centroid, Center-of-Sums, Height, and Center-of-Sets.
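      Two of these defuzzifiers can be sketched in a few lines of Python. The sketch
below assumes the composite fuzzy output has already been reduced to (center point,
degree of membership) pairs, and the numeric values are illustrative only; the
center-of-sets style is of particular interest here, as it is the principle adopted later in
the DSA method.

    # Composite fuzzy output as (set center point, degree of membership) pairs.
    fuzzy_output = [(2103.3, 0.4), (2439.8, 0.7)]    # illustrative values

    def center_of_sets(output):
        """Membership-weighted average of the set center points."""
        return sum(c * m for c, m in output) / sum(m for _, m in output)

    def maximum(output):
        """Center point of the set with the greatest degree of membership."""
        return max(output, key=lambda pair: pair[1])[0]

    print(center_of_sets(fuzzy_output))   # approximately 2317.4
    print(maximum(fuzzy_output))          # 2439.8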


2.11   Fuzzy Logic Based Extrapolative Methods


       Song and Chissom (1991) introduced an extrapolative forecasting method based
on Fuzzy Logic. These authors proposed this method to address uncertainty in the
form of, what they referred to as, fuzziness or vagueness in the historical observations
of university enrollment data. These authors drew a distinction between uncertainty in
the form of fuzziness and uncertainty that results from white noise. In the latter case,
they concluded that white noise results from random factors that affect the values of
the time series. In the former case, they concluded that fuzziness results from
non-random factors such as measurement error. They argued that a method based on
fuzzy logic would be required to handle uncertainty resulting from these non-random
factors.
      In their study, the APE for each forecast value as well as MAPE were used to
establish the relative accuracy of their fuzzy logic method and time series linear
regression (TSLR) on a single series of enrollment data. Accuracy was based on
model-fit using one period ahead forecasts, that is a forecast horizon of one period.
      In their experiment they altered the values of the time series to simulate
measurement error and demonstrated that their method was more robust than TSLR,
as it produced forecasts that provided a better fit to the historical observations than did
the forecasts produced by the regression method, in which enrollment was regressed
against time.
      Although their method was not designed within a specific method development
framework, their method nonetheless can be discussed generally within the Mamdani
Framework. Again, the four modules in this framework are fuzzification, inference,
composition and defuzzification.
      While the Song and Chissom (1991) method showed promise as a viable
approach to extrapolation, overall their method was quite cumbersome to use and it
would be difficult to automate their procedures.
      In their fuzzification module they pre-assigned seven fuzzy elements to each of
seven fuzzy sets resulting in an unusual and highly cumbersome approach to
fuzzification. The authors also indicated that they believed that seven sets would yield
optimal results, however they did not provide any empirical evidence to support their
conclusion. This claim contradicts earlier findings and the generally held belief by
most modelers that there is no single parameter value, in any method, that yields
optimal forecasts for all time series.
      In addition, their use of matrices and vectors in the composition module to
capture the relationship between the historical observations of the time series in
question is extraordinarily involved. It would appear that the authors' goal was to
create a final matrix that captured or summarized all knowledge on the relationship
between fuzzy sets, much in the same way that β does in a regression model.

      Chen (1996) introduced a new method in which he replaced the cumbersome
composition module of the Song and Chissom (1991) method. Specifically he replaced the
matrices and the MIN-MAX composition operators in their composition module with
Mamdani style fuzzy logical relationships, and replaced their height defuzzifier with a
simpler maximum defuzzifier.
      In Chen's study, which was in part a replication of the Song and Chissom (1991)
study and relied on the same enrollment data, APE and MAPE were used to establish
the relative accuracy of the Song and Chissom (1991) method and of his method with
its new composition and defuzzifier modules. In his method he retained the Song
and Chissom fuzzifier module. This was done in apparent support of the assertion on
the part of Song and Chissom that a fuzzy set parameter of seven yielded the most
accurate forecasts.
      Chen demonstrated that, relative to model fit using one period ahead forecasts,
his method produced more accurate forecasts, and was more robust to measurement
error, than was the Song and Chissom (1991) method. It should be noted however that
both the Song and Chissom (1991) study and the Chen study were deficient in the
number of series evaluated, methods examined and accuracy measures employed.
      Jarrett and Plouffe (1996) investigated the use of the Song and Chissom (1991)
method to forecast occupancy levels in undergraduate student housing. These authors
believed that the historical occupancy data contained measurement error and that the
Song and Chissom (1991) method would be robust under these conditions, resulting
in more accurate forecasts.


      In this study four measures of forecast accuracy were used to establish the
relative accuracy of the Song and Chissom (1991) method and six alternative
extrapolative methods that included five smoothing methods and TSLR across fifteen
time series of occupancy data. The measures of forecast accuracy used were MAPE,
MAD, MSE and RMSE, and they were used as a measure of model fit using one
period ahead forecasts.
      Jarrett and Plouffe used the procedures and method of the Song and Chissom
(1991) study. The ranking of the methods on forecast accuracy was based on
minimizing both MAPE and MAD with MSE and RMSE reserved for comparison to
other studies.
      The major finding of this study was that the Song and Chissom (1991) method
provided more accurate forecasts based on model fit, using one period ahead forecasts
than did the alternative extrapolative methods. In addition, and as was the case in the
Song and Chissom 1991 study, a visual inspection of the plots of forecasted values
produced by the Song and Chissom method and the observed values of the time series,
revealed how closely these patterns replicated one another. With that said, an
apparent lag of one period was observed when a trend was present in the historical
values of the time series.
      Jarrett and Plouffe (1998) in an extension of their 1996 study used the same
four accuracy measures to establish the relative accuracy of the Chen method in
addition to the methods evaluated in 1996, across twenty time series of occupancy
data. The fuzzy set parameter for both fuzzy methods was seven. The ranking of the
models was again based on minimizing both MAPE and MAD as to model fit using
one period ahead forecasts (a forecast horizon of one period).
      The additional five time series were combinations of fall semester and spring
semester occupancy level for the periods under investigation. These combined series
created a seasonal pattern in the data as the occupancy level for each spring was lower
than for the fall of the same year.
      The major findings of this study were that the Chen method provided the most
accurate forecasts for the original fifteen series, however, the Song and Chissom and
Chen methods were both outperformed by five and three of the traditional methods
respectively on the five series that were simulating seasonality.
      The Chen method was, computationally, much easier to implement than the
Song and Chissom method as a result of the development by Chen of new
composition and defuzzifier modules. The improvement in forecast accuracy of the
Chen method over the Song and Chissom method may well be attributed to the use of
the Mamdani fuzzy logical relationships rather than the extensive matrices and
vectors of the Song and Chissom method. The advantage of the Chen method may
reside in the fact that the Mamdani fuzzy logical relationships provide a rougher
modeling solution than do the matrices and vectors of the Song and Chissom method,
which continually overestimated the actual values of the time series when a trend was
present in the time series.
      These authors argue that future research to improve the accuracy of fuzzy
logic based extrapolative methods should focus on developing methods that will
provide more accurate forecasts than alternative traditional methods when a trend or
seasonal component is present in the time series. In addition, a fuzzifier module must
be developed that is simple to use, easy to automate and allows for alternative values
of the fuzzy set parameter to be considered. Finally, a center-of-sets defuzzifier should
be considered.
      Further, they suggest that future experiments conducted to establish the relative
accuracy of fuzzy based methods should use ex ante forecasts as opposed to model fit
and that the ex ante forecasts should be produced for multiple forecast horizons.
Additionally, they suggest that a broader sample of data types should be examined.


2.12 Research Hypotheses


      Twenty years of research on time series extrapolation has demonstrated that
statistically simple methods provide forecasts that are as accurate, and in many cases
more accurate, than those produced by statistically complex methods. Authors
including Fildes and Makridakis (1998), Makridakis and Hibon (2000), Fildes (2001) and
Small and Wong (2002) suggest that future research to improve the accuracy of
extrapolative methods should focus on the development of statistically simple
methods that have the characteristic of being robust to the fluctuations that exist in
real world data resulting from both random and non-random events.
      Since 1991 four studies have been conducted to extend the initial work of Song
and Chissom, to develop a fuzzy logic method for time series extrapolation. The
results of those studies lend empirical support to the theoretical evidence that a fuzzy
logic extrapolative method can provide more accurate forecasts than traditional
extrapolative methods.
      Jarrett and Plouffe (1998) suggest in their conclusion that improving the
accuracy of these methods requires that a new fuzzy logic extrapolative method be
developed that will have the implicit ability to provide accurate forecasts of time
series in which a trend or seasonal component is present, without the need to
decompose the time series. Additionally, this method should allow for fuzzy set
parameters other than seven.
      In response, a new fuzzy logic method for time series extrapolation has been
developed and introduced in this research. This method builds on the work of Song
and Chissom (1991) and Chen (1996). This new method has a new fuzzifier module
that allows for scalar values to be simply and directly assigned to fuzzy sets. This
module will also capture a trend if one exists in the time series, and further it allows
the modeler to specify the value of the fuzzy set parameter.
      In addition, the inference module from the earlier methods has been modified to
capture the seasonal component, of any duration, and a new defuzzifier module has
been created that uses the center-of-sets principle. Finally, the composition module
based on Mamdani fuzzy logic relationships, which was used successfully in the Chen
method, has been retained in the Direct Set Assignment method.
      Three forecasting competitions have been designed to validate the relative
accuracy of the Direct Set Assignment method. These competitions have used, as
required, for each competition, the data, accuracy measures, procedures and best
performing simple extrapolative methods from the M3-competition (Makridakis and
Hibon, 2000).
      To investigate the effect of changes to the fuzzy set parameter, two null
hypotheses will be tested in this study:


      HO1: The ex ante forecast accuracy of the DSA method will not change in
      response to a change in the number of fuzzy sets, all other model parameters
      held constant


      HO2: A fuzzy set parameter of seven in a DSA model will yield the most
      accurate ex ante forecasts when compared to DSA models with fuzzy set
      parameters other than seven, in the range of set values from two to twenty, all
      other model parameters held constant


      There are three findings regarding the relative accuracy of extrapolative
forecasting methods that have consistently been affirmed in the forecasting
competitions and accuracy studies conducted during the past two decades, including
the M3 forecasting competition held in 2000. As the data, accuracy measures and
procedures are those of the M3-competition, it is expected that these same three
findings will be reaffirmed in this study as well. Therefore in this study the
following three null hypotheses will be tested:


      HO3: The ranking on forecast accuracy of the DSA method and the traditional
      methods compared in this study will be the same for all accuracy measures
      considered


      HO4: The ranking on forecast accuracy of a combination of alternative
      forecasting methods will be lower than that of the specific forecasting methods
      being combined


      HO5: The ranking on forecast accuracy of the DSA method and the traditional
methods compared in this study does not depend on the length of the forecast
      horizon


      Small improvements in forecast accuracy can lead to cost reduction, enhanced
market penetration, and improvement in both operational efficiency and customer
service for many businesses. For this reason, and as indicated above, the goal of this
research is to introduce a new extrapolative forecasting method, based on fuzzy logic
that will provide more accurate ex ante forecasts than alternative simple extrapolative
methods across a varied selection of business data types and forecasting conditions
including those series in which a statistically significant trend is present. Therefore in
this study the following three null hypotheses will be tested:


      HO6: The ranking on forecast accuracy of the time series specific DSA model,
      will be less than or equal to the ranking on forecast accuracy of both the
      subcategory and category specific DSA models


      HO7: The ranking on forecast accuracy, of the DSA method, will be lower than
      that of the traditional extrapolative methods to which it is being compared in
this study, by time series subcategory, by time series category and for all of the
time series evaluated in this study


      HO8: The ranking on forecast accuracy, of the DSA method, will be lower than
      that of the traditional extrapolative methods to which it is being compared in
      this study, on those series in which a statistically significant trend is present


2.13 Summary


      As the 21st century gets underway, the field of business forecasting is
experiencing a renaissance. This rebirth can be attributed in part to the emergence of
mass storage technologies that allow businesses to capture data on all of their essential
activities, and in part to the fact that businesses are compelled to use all the
information in their arsenal to gain competitive advantage. Forecasts of a business's
essential activities are a critical input to their planning processes and are used for
developing competitive responses and improving operational efficiency.
      When sufficient quantitative data are available in the form of a time series,
extrapolative forecasting methods are preferred as they will provide more accurate
forecasts than other available quantitative, as well as qualitative forecasting methods.
Several authors have argued that new statistically simple extrapolative methods are
needed to achieve further improvements in forecast accuracy for this category of
forecasting methods. For this reason, this research has focused on the development
and validation of a new fuzzy logic based method referred to as the Direct Set
Assignment Method. The findings, from several prior studies on fuzzy logic based
extrapolative methods, indicate that these methods can provide more accurate
forecasts than traditional extrapolative methods.
      To validate the forecast accuracy of this new method, three forecasting
competitions have been conducted using the standards and procedures of the M3
forecasting competition. The competitions in this study were designed to evaluate
eight hypotheses relating to this new method's relative accuracy when compared to
the most accurate methods from the M3-competition under different forecast
situations.
      The eight specific hypotheses tested were derived from this study's two major
research hypotheses. The first hypothesis is that the ex ante forecast accuracy of the
DSA method will change in response to changes in the fuzzy set parameter. The
second hypothesis is that the DSA method will provide more accurate ex ante
forecasts than the traditional extrapolative forecasting methods to which it has been
compared.
      In the next chapter, the development of the DSA method within the Mamdani
framework is discussed. Two examples of the implementation of the DSA method on
two of the time series evaluated in this study have been provided.


                                    CHAPTER 3



              THE DIRECT SET ASSIGNMENT METHOD


      This chapter begins with a comprehensive description of the development of the
Direct Set Assignment method within the four-module Mamdani development
framework. This section is followed by two examples demonstrating the
implementation of the DSA forecasting method. The chapter closes with a summary
of the application of fuzzy logic to time series extrapolation.


3.1 DSA Method Development


      The Direct Set Assignment extrapolative forecasting method was developed
within the Mamdani design framework discussed in section 2.10, and has as its
primary inspiration, the fuzzy logic based extrapolation methods introduced by Song
and Chissom in 1991 and Chen in 1996. A discussion of the four design components
of the DSA method follows.
      The inputs to the DSA method are those historical values of the time series of
interest that have been selected by the modeler as the training set for that time series.
      There are four IF-THEN rules that are used in the DSA method, with one rule
used in each of the four modules. A description of each rule appears in the following
sections on each of the four modules that comprise the DSA method.
      Important new features in the DSA fuzzifier include explicitly describing the
membership function as well as the degree of overlap between and among fuzzy sets.
This was not done in either the Song and Chissom or Chen methods. This adds two
additional model parameters to the DSA method that can be manipulated to improve
ex ante forecast accuracy. In the DSA model a triangular membership function was
used for all fuzzy sets, and the degree of overlap for successive sets for a particular
model is identical.
      An additional new feature in the DSA Fuzzifier is a universe of discourse that
reflects an extension of the range of the historical values of the time series of interest.
In the DSA fuzzifier the minimum and maximum values of the range are decreased
and increased respectively, by the average of the absolute differences between the
values of successive periods in the time series of interest. This provides the DSA
method with the implicit ability to produce in-sample as well as out-of-sample
forecasts that reflect either growth or decay in the time series.
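      This construction of the universe of discourse reduces to a few lines of code.
The Python sketch below follows the description above; the series in the usage line is
a hypothetical toy example. Applied to the summary values reported for series N0006
in section 3.2 (minimum 1458.1, maximum 4095.0, mean absolute change 245.0), it
yields endpoints of approximately 1213.1 and 4340.0, agreeing with the reported
universe of discourse up to rounding.

    def universe_of_discourse(series):
        """Widen the observed range by the mean absolute change between periods."""
        diffs = [abs(b - a) for a, b in zip(series, series[1:])]
        mean_abs_change = sum(diffs) / len(diffs)
        return min(series) - mean_abs_change, max(series) + mean_abs_change

    print(universe_of_discourse([1458.1, 1702.0, 1931.5, 2150.8]))   # toy series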
      In the DSA method fuzzy sets are defined on the universe of discourse, which
serves as both the input and output domain for the method. While in most fuzzy
methods sets receive linguistic labels, in the DSA method simple labels with
subscripts suffice. Subscripts with low values are associated with low values of
demand while subscripts with larger values are associated with higher levels of
demand. Therefore fuzzy sets have been labeled Ai (i = 1 to n), where n is the number
of fuzzy sets selected by the modeler. The minimum number of fuzzy sets is two, as
one fuzzy set produces a horizontal-line forecast. While it is possible to evaluate an
infinite number of fuzzy sets, in this study the maximum number of sets evaluated is
twenty. Beyond twenty sets, fuzzy set intervals converge and as a result fuzzy forecast
values converge. In the DSA method, unlike earlier fuzzy methods, the number of
fuzzy sets defined on the universe of discourse is considered to be a model parameter
that can be manipulated to improve ex ante forecast accuracy. Previously it was
believed that seven fuzzy sets were optimal, (Song and Chissom, 1993).
      Also, while an observation's degree of membership in a fuzzy set can be
established by judgment, in this study membership intervals were defined for each
fuzzy set and each interval is associated with a specific degree of membership in the
range [0,1]. The number of intervals defined should be sufficient to differentiate the
degree of membership of observations and may differ by model. This parameter is not
considered to affect forecast accuracy but does ensure that the results of this study can
be reproduced.


      In the final step in the fuzzification module its IF-THEN rule set is used to
assign the historical values of the time series, that is, the values of the training set, to
one of the fuzzy sets that were defined on the universe of discourse for the time series
in question. The rule is, IF an observation occurs within an interval of one and only
one of the candidate fuzzy sets THEN that observation is directly assigned to that
fuzzy set, exclusively OR, IF the observation occurs within an interval of more than
one fuzzy set, THEN it is directly assigned to the fuzzy set in which it has maximum
membership, exclusively. It is from this step in the fuzzifier, in which each historical
observation of the training set is directly assigned to a set without reference to a
linguistic label, that the DSA method derives its name.
      This simplified fuzzifier results in one input fuzzy set per historical observation
being passed to the inference module. While it is possible to have more than one
fuzzy set per observation passed to the inference module, one set was selected, as it is
the simplest approach, and as such it represents the best starting point for developing
the DSA method.
      In the DSA inference module its IF-THEN rule set is used to make inferences
about the relationship between the input fuzzy sets. This results in the creation of
fuzzy rules in which the fuzzy input sets, from the fuzzifier module, serve as the
antecedent and consequent of those rules. The output from this module is a fuzzy rule
set comprised of the individual fuzzy rules. For each antecedent and consequent pair
of sets, the antecedent set is considered to be the current state of demand, while the
consequent set is considered to be the future state of demand. Demand is a generic
reference to the values of the time series. The pairs of sets cumulatively represent a
fuzzy model of the time series.
      To identify the antecedent and consequent pairs, the periodicity, which is a
measure of the seasonal component of the time series, must be known. The periodicity
of the time series can be determined by either a visual inspection of a plot of the
observations in the training set, or from the calculation of seasonal indices. The use of
periodicity in a fuzzy extrapolative method is unique to the DSA method and has as
its inspiration Winter's Seasonal Method.
      The rule is, IF the periodicity is one, no seasonality is present, THEN the rules
are formed for each (t) and (t+1) fuzzy sets beginning with the earliest observations in
the time series, OR, IF the periodicity is four or eight, and the time series is quarterly,
seasonality is present, THEN the rules are formed for each (t) and (t+4) or (t+8) fuzzy
sets respectively beginning with the earliest observations in the time series, OR, IF the
periodicity is twelve or twenty-four, and the time series is monthly, seasonality is
present, THEN the rules are formed for each (t) and (t+12) or (t+24) fuzzy sets
respectively beginning with the earliest observations in the time series. The data is
processed only once making for a one-pass system that results in the creation of a
fuzzy forecasting rule set that will serve as the input to the composition module.
These rules capture the relation between the historical observations of the time series.
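      A sketch of this rule-forming step appears below. It assumes the training
observations have already been fuzzified to set labels, and simply pairs the set at
period (t), the current state, with the set at period (t + p), the future state, where p is
the periodicity; the labels in the usage line are hypothetical.

    def build_rules(fuzzified, periodicity):
        """Pair each period-t set (antecedent) with its period t+p set (consequent)."""
        return [(fuzzified[t], fuzzified[t + periodicity])
                for t in range(len(fuzzified) - periodicity)]

    # Non-seasonal series (periodicity one): rules link consecutive periods.
    print(build_rules(["A1", "A2", "A2", "A3"], 1))
    # [('A1', 'A2'), ('A2', 'A2'), ('A2', 'A3')]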
      In the composition module its IF-THEN rule creates composite rules that yield
a (t + n) fuzzy forecast, in the form of fuzzy sets, for each fuzzy set, that represents a
fuzzified historical observation of the training set, for the time series of interest. The
rule is, IF for each fuzzified historical observation there are one or more fuzzy rules in
which that fuzzy set is the current state or antecedent of the fuzzy rule, THEN the fuzzy
forecast comprises the fuzzy sets that are the future state or consequent of those rules,
OR, IF for each fuzzified historical observation there are no fuzzy rules in which that
fuzzy set is the current state or antecedent of the fuzzy rule, THEN the fuzzy forecast is
the fuzzy set that is the current state or antecedent of the composite rule.
      In the defuzzification module its IF-THEN rule utilizes a center-of-sets
defuzzifier to convert the fuzzy forecasts to scalar forecasts. The rule is, IF there is
one and only one set in the fuzzy forecast, THEN the scalar forecast is the center point
of that fuzzy set, OR IF there are two or more fuzzy sets in the fuzzy forecasts THEN
the scalar forecast is the average of the center points of all fuzzy sets in the fuzzy
forecast.
      The next section provides two examples of the DSA method on two time series
that were used in this study. The first time series is N0006. This series has twenty
observations of yearly microeconomic data collected for the periods 1975-1994. The
first fourteen observations are the training data set, and the final six observations are
the validation data set. The observations in the validation data set are the values that
will be used to establish ex ante forecasts accuracy. As such six ex ante forecasts are
required. Series N0006 contains a statistically significant trend, however there is no
indication of seasonality as the periodicity of the series is one.
      The second time series is N0671. This series has forty-four observations of
quarterly microeconomic data collected for the periods 1984-1994. The first thirty-six
observations are the training data set and the last eight observations are the validation
data set. The observations in the validation data set are the values that will be used to
establish the ex ante forecast accuracy. As such eight ex ante forecasts are required.
Series N0671 contains a statistically significant trend and there is an indication of
seasonality as the periodicity is four. Additional descriptive information on these
series can be found in Appendix A.


3.2 DSA Example: Non-seasonal Series N0006


      Step 1: Create Universe of Discourse. The minimum and maximum values for
series N0006 are 1458.1 and 4095.0, respectively. The mean absolute change for
successive periods is 245.0. Therefore the universe of discourse is 1213.0-4339.6, and
the range or interval for the universe of discourse is 3126.6.
      Step 2: Select Membership Function and Fuzzy Set Parameters and Define
Fuzzy Sets. A triangular membership function in which nine fuzzy sets are defined on
the universe of discourse will be used to model the training set of series N0006. As
the membership function is triangular, the maximum degree of membership occurs at
the apex of the function. (In this study nine sets were shown to provide
forecasts that minimized forecast error). To allow for graded set membership each set
interval was extended by twenty-five percent beyond that of the crisp set values. The
overlap between sets is the same amount over consecutive fuzzy sets for a given DSA
model. To provide consistency to the process of assigning degree of membership to a
historical observation of the series, twenty membership intervals have been defined
for each fuzzy set. Therefore, the value at the center point of the set has the maximum
membership in each set, and as such has a membership value of 1.0. Ten membership
intervals were defined on the sub intervals
to the left and right of the set midpoint for each fuzzy set. The membership intervals
are ordered from the maximum membership of 1.0 at the apex of the function, to the
minimum membership value of 0.1 at the base of the function. For example consider
the fuzzy set (1213.1, 1430.2, 1647.3). The observation 1429.0 would be assigned a
membership value of 1.0, while observations 1214 and 1646 would each be assigned a
membership value of 0.1.
      The nine fuzzy sets with their endpoints and midpoints for series N0006 are as
follows: A1=(1213.1, 1430.2, 1647.3); A2=(1549.6, 1766.7, 1983.8); A3=(1886.1,
2103.3, 2320.4); A4=(2222.7, 2439.8, 2656.9); A5=(2559.2, 2776.3, 2993.5);
A6=(2895.8, 3112.9, 3330.0); A7=(3232.3, 3449.4, 3666.5); A8=(3568.8, 3786.0,
4003.1); A9=(3905.4, 4122.5, 4339.6).
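      One construction that reproduces the endpoints listed above, up to rounding, is
sketched below in Python. It assumes the twenty-five percent extension is applied to
the crisp interval width (the range divided by the number of sets) and that the widened
sets are spaced evenly so that the first and last sets remain anchored to the ends of the
universe of discourse; the exact procedure used in this study may differ in detail.

    def define_sets(u_min, u_max, n, extension=0.25):
        """Define n overlapping triangular sets (left, center, right) on the universe."""
        rng = u_max - u_min
        width = (1.0 + extension) * rng / n      # crisp width extended by 25%
        spacing = (rng - width) / (n - 1)        # even spacing, ends anchored
        sets = {}
        for i in range(n):
            left = u_min + i * spacing
            sets["A%d" % (i + 1)] = (round(left, 1),
                                     round(left + width / 2.0, 1),
                                     round(left + width, 1))
        return sets

    for label, points in define_sets(1213.1, 4339.6, 9).items():
        print(label, points)    # A1 (1213.1, 1430.2, 1647.3), and so on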
      Step 3: Fuzzify observations in training set. If an observation has a degree of
membership in one or more fuzzy sets, then that observation's fuzzy set assignment is
the single set in which it has the highest degree of membership. For example, in series
N0006 the observation for 1975 is 1458.05, which has a membership of 0.9 in set A1.
Therefore it is assigned to fuzzy set A1 exclusively. The observation for 1976 is
1931.53, which has a membership in set A2 of 0.4 and in set A3 of 0.3. Therefore it is
assigned to fuzzy set A2 exclusively.
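      The sketch below illustrates this assignment step. It uses a continuous
projection onto the triangular membership function rather than the twenty discrete
membership intervals described in Step 2, so the computed grades differ slightly from
the 0.9, 0.4 and 0.3 quoted above, but the winning set is the same in each case; only
three of the nine sets are included for brevity.

    def triangle(x, left, center, right):
        """Continuous degree of membership of x in a triangular fuzzy set."""
        if x == center:
            return 1.0
        if x <= left or x >= right:
            return 0.0
        if x < center:
            return (x - left) / (center - left)
        return (right - x) / (right - center)

    sets = {"A1": (1213.1, 1430.2, 1647.3),
            "A2": (1549.6, 1766.7, 1983.8),
            "A3": (1886.1, 2103.3, 2320.4)}

    def fuzzify(x):
        """Assign x exclusively to the set in which its membership is highest."""
        return max(sets, key=lambda label: triangle(x, *sets[label]))

    print(fuzzify(1458.05))   # 'A1' (the 1975 observation)
    print(fuzzify(1931.53))   # 'A2' (the 1976 observation)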
      Step 4: Establish Fuzzy Rules. The periodicity of series N0006 is one. Therefore,
fuzzy rules will be established with the sets for each (t) and (t+1) pair of time periods.
For example, in series N0006 for the years 1975 and 1976 the fuzzy rule is, IF the
demand for 1975 is A1 THEN that for 1976 is A2. The logical relationship A1-A2 is
established for the years 1975 and 1976. In this pair the current state is A1 and the
future state is A2. The (n-1) pairs for the time series, when taken cumulatively
represent a fuzzy model of the training set for the time series.
      Step 5: Produce fuzzy forecasts. Given series N0006, for each time period (t)
produce a (t+1) forecast beginning with the first observation in the training set. If the
fuzzy set at time (t) is a current state in one or more fuzzy logical relationships, then
the fuzzy forecast is the future state of all fuzzy logical relationships for which that
fuzzy set is the current state. From the above example, the fuzzified observation for
1975, time period (t) is A1. A1 is the current state in the fuzzy logical relationship
A1-A2 only. Therefore A2, the future state in the A1-A2 fuzzy logical relationship, is
the fuzzy forecast for 1976, which is the (t+1) time period. If the fuzzy set at time
(t) is not a current state in any fuzzy logical relationship, then the fuzzy
forecast for (t+1) is that fuzzy set itself. From the above example, if the fuzzified
observation is A3 and the only fuzzy logical relationship is A1-A2, then the fuzzy
forecast for A3 is A3. In Step 5, fuzzy forecasts are produced for each year 1976-1989.
The forecasts for 1976-1988 are in-sample forecasts while the forecast for 1989 is an
out-of-sample or ex ante forecast. For series N0006, six ex ante forecasts are required
for the period 1989-1994. For this (t+1) model, the five additional ex ante forecasts
for 1990-1994 are a replication of the first ex ante forecast produced for 1989.
      Step 6: Produce Scalar Forecasts. If there is only one fuzzy set in the fuzzy
forecast, then the scalar forecast is the center point of the fuzzy set that constitutes
the fuzzy forecast. If there are two or more fuzzy sets, then the scalar forecast is the
average of the center points of the fuzzy sets that constitute the fuzzy forecast.
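      Steps 5 and 6 can be sketched in the same style, again assuming the helpers
above; the replication of ex ante forecasts beyond the first p horizons follows the
description given for series N0006 (p = 1) and N0671 (p = 4).

def fuzzy_forecast(state, rules):
    # Step 5: the fuzzy forecast is every future state whose rule fires for
    # this current state; with no matching rule, the state forecasts itself.
    futures = rules.get(state, [])
    return futures if futures else [state]

def defuzzify(futures, fuzzy_sets):
    # Step 6: average of the center points of the sets in the fuzzy forecast.
    return sum(fuzzy_sets[i][1] for i in futures) / len(futures)

def dsa_forecast(series, n_sets, p, horizon):
    # End to end: the first p ex ante forecasts come from the last p
    # fuzzified observations; later horizons replicate them.
    lo, hi = build_universe(series)
    fsets = build_fuzzy_sets(lo, hi, n_sets)
    states = [fuzzify(x, fsets) for x in series]
    rules = build_rules(states, p)
    base = [defuzzify(fuzzy_forecast(s, rules), fsets) for s in states[-p:]]
    return [base[h % p] for h in range(horizon)]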
      Table 3.1 presents a summary of the output from the steps of the DSA method
for series N0006.


                                        Table 3.1
                    DSA Method Implementation For Series N0006




3.3 DSA Example: Seasonal Series N0671
      Step 1: Create Universe of Discourse. The minimum and maximum values for
series N0671 are 1264.9 and 4414.0, respectively. The average absolute change for
successive periods is 505.0. Therefore the universe of discourse is 759.9-4919.0, and
the range or interval for the universe of discourse is 4159.1.
      Step 2: Select Membership Function and Fuzzy Set Parameters and Define
Fuzzy Sets. A triangular membership function in which seventeen fuzzy sets are
defined on the universe of discourse will be used to model the training set of series
N0671. (In this study seventeen sets was shown to provide forecasts that minimized
forecast error). To allow for graded set membership each set interval was extended by
twenty-five percent beyond that of the crisp set values. The overlap between sets is
the same size over consecutive fuzzy sets for a given DSA model. To provide
consistency to the process of assigning degree of membership to a historical
observation of the series, twenty membership intervals have been defined on each
fuzzy set. As the membership function is triangular, the maximum degree of
membership occurs at the apex of the function. Therefore, the value at the center point
of the set has maximum membership in each set, and as such has a membership value
of 1.0. Ten membership intervals were defined on the sub intervals to the left and right
of the set midpoint for each fuzzy set. The membership intervals are ordered from the
maximum membership of 1.0 at the apex of the function to the minimum membership
value of 0.1 at the base of the function. For example, consider the fuzzy set A3 =
(1241.6, 1394.5, 1547.4). The observation 1394.5 would be assigned a membership
value of 1.0, while observations 1242 and 1547 would each be assigned a membership
value of 0.1.
       The seventeen fuzzy sets with their endpoints and midpoints for series N0671
are as follows: A1=(759.9, 912.8, 1065.8); A2=(1000.8, 1153.7, 1306.6); A3=(1241.6,
1394.5, 1547.4); A4=(1482.4, 1635.3, 1788.2); A5=(1723.3, 1876.2, 2029.1);
A6=(1964.1, 2117.0, 2269.9); A7=(2204.9, 2357.8, 2510.7); A8=(2445.7, 2598.6,
2751.5); A9=(2686.6, 2839.5, 2992.4); A10=(2927.4, 3080.3, 3233.2); A11=(3168.2,
3321.1, 3474.0); A12=(3409.0, 3562.0, 3714.9); A13=(3649.9, 3802.8, 3955.7);
A14=(3890.7, 4043.6, 4196.5); A15=(4131.5, 4284.4, 4437.3); A16=(4372.4, 4525.3,
4678.2); A17=(4613.2, 4766.1, 4919.0).
       Step 3: Fuzzify observations in training set. If an observation has a degree of
membership in one or more fuzzy sets, then that observation's fuzzy set assignment is
the single set in which it has the highest degree of membership. For example, in series
N0671 the observation for 1984-Q1 is 1264.9, which has a membership of 0.3 in set
A2 and 0.2 in set A3. Therefore it is assigned to fuzzy set A2 exclusively. The
observation for 1984-Q2 is 1386.3, which has a membership of 1.0 in set A3.
Therefore it is assigned to fuzzy set A3 exclusively.
      Step 4: Establish Fuzzy Logical Relationships. The periodicity of series N0671
is four. Therefore, fuzzy rules will be established between each (t) and (t+4) pair of
time periods. For example, in series N0671 for the quarters 1984-Q1 and 1985-Q1 the
fuzzy rule is, IF the demand for 1984-Q1 is A2 THEN that for 1985-Q1 is A3. The
fuzzy logical relationship A2-A3 is established for the quarters 1984-Q1 and 1985-Q1.
In this pair the current state is A2 and the future state is A3. The (n-4) pairs, when
taken cumulatively, represent a fuzzy model of the training set for the time series.
      Step 5: Produce fuzzy forecasts. Given series N0671, for each time period (t)
produce a (t+4) forecast beginning with the first observation in the training set. If the
fuzzy set at time (t) is a current state in one or more fuzzy logical relationships, then
the fuzzy forecast is the future state of all fuzzy logical relationships for which that
fuzzy set is the current state. From the above example, the fuzzified observation for
1984-Q1, time period (t) is A2. A2 is the current state in the fuzzy logical relationship
A2-A3 only. Therefore A3, the future state in the A2-A3 fuzzy logical relationship, is
the fuzzy forecast for 1985-Q1, which is the (t+4) time period. If the fuzzy set at
time (t) is not a current state in any fuzzy logical relationship, then the
fuzzy forecast for (t+4) is that fuzzy set itself. From the above example, if the fuzzified
observation is A4 and the only fuzzy logical relationship is A2-A3, then the fuzzy
forecast for A4 is A4. In Step 5, fuzzy forecasts are produced for each quarter
1985-Q1 through 1993-Q4. The forecasts for 1985-Q1 through 1992-Q4 are
in-sample forecasts while the forecasts for 1993-Q1 through 1993-Q4 are out-of-sample
or ex ante forecasts. For series N0671, eight ex ante forecasts are required for the
periods 1993-Q1 through 1994-Q4. For this (t+4) model, the four additional ex ante
forecasts for 1994-Q1 through 1994-Q4 are a replication of the first four ex ante
forecasts produced for 1993-Q1 through 1993-Q4.
      Step 6: Produce Scalar Forecasts. If there is only one fuzzy set in the fuzzy
forecast, then the scalar forecast is the center point of the fuzzy set that constitutes
the fuzzy forecast. If there are two or more fuzzy sets, then the scalar forecast is the
average of the center points of the fuzzy sets that constitute the fuzzy forecast.
      Table 3.2 presents a summary of the output from the steps of the DSA method
for series N0671.


                                        Table 3.2
                    DSA Method Implementation For Series N0671




3.4 Summary


      The Direct Set Assignment method has as its primary inspiration the
extrapolative forecasting method of Song and Chissom (1991) and Chen (1996).
However, unlike those methods, which were designed to forecast only the level
component of the time series, the DSA method was designed to forecast the trend and
seasonal components of the time series as well as the level component. In addition,
the DSA method was designed to forecast all three components without using
externally calculated parameters to adjust the forecast produced by the model, as is
the case with methods including Robust Trend, Damped Trend and Theta, nor was this
accomplished through decomposition of the time series, as is the case with Holt's
and Winter's methods.
      A new fuzzifier module was developed for this method exclusively for use in
time series extrapolation. The same is essentially true for the defuzzifier, in which a
center-of-sets defuzzifier was adapted from its typical application in control systems.
      The Inference module used by Song and Chissom (1991), in which the antecedent and
consequent of the fuzzy rules formed fuzzy logical pairs was retained. The manner in
which the antecedent and consequent for the rules were created however was
modified to reflect the periodicity of the time series. Using the periodicity of the
series in the forecasting model was adopted from the Winter's decomposition method.
      The Composition module in the DSA method was adopted from the Chen (1996)
method; it relies on Mamdani fuzzy logical relationships and was introduced by Chen
to overcome several identified problems with the composition process used by
Song and Chissom (1991).
      Two examples have been provided to illustrate the implementation of the four
modules of the DSA method on two of the time series used in this study. The next
chapter discusses the experimental design used in this study to validate the forecast
accuracy of this new method.




                                     CHAPTER 4

                                 METHODOLOGY


      This chapter describes the data and experimental design that were used to
produce the measures of forecast accuracy required to evaluate the specific
hypotheses discussed in section 2.12. The purpose of this chapter is to provide
information that will allow for replication of this current study in part or in full.
      This chapter begins with a comprehensive description of the traditional and
fuzzy extrapolative forecasting methods that were included, as required, in this current research
to establish the relative accuracy of the DSA method under various forecasting
conditions and for different data types. This includes a description of a combination of
methods.
      A subsequent section has been provided that reviews the seven measures of
forecast accuracy that served as the basis for establishing the relative forecast
accuracy of the methods compared in this study. This information is followed by a
description of the collection methods used to obtain, and the source and
characteristics of, the time series in this study for which ex ante forecasts were
produced.
      This information is followed by a description of the specific procedures that
were followed in each of three forecasting competitions that utilized, as required, the

methods, accuracy measures and data described above. The methods, measures, data
and procedures used in this study were adopted from the M3 Forecasting Competition
held in 2000.
      In particular, nine subcategories of time series were selected that were defined
by time interval and data type. These data were selected because they were the series
for which statistically simple methods produced more accurate ex ante forecasts than
did statistically sophisticated methods.


4.1 Forecasting Methods


      The forecasting methods discussed in the following subsections are classified
as statistically simple extrapolative forecasting methods. These methods rely on the
use of weights, referred to as model parameters, to establish the relationship between
the historical observations of the time series. This is with the exception of the DSA
method, which uses fuzzy logical relationships to establish the relationship between
the observations of the time series. Also, all of the methods in these subsections are
linear methods with the exception of the DSA method, which is a non-linear method.
All these methods rely on the established knowledge of the relationships between the
observations in the training set to produce forecast values.
      Statistically sophisticated methods in contrast, rely on more sophisticated
statistical theory, including correlation and covariance, to model the series, and this
includes their use in procedures for diagnostic testing and training on multiple data
sets. Statistically sophisticated methods include automated neural network methods,
the family of Box-Jenkins methods and expert systems.


4.1.1 Naïve 2


      The simplest of all forecasting methods is the Naïve method, also referred to as
a random walk. It is easy to understand and requires no calculation. The assumption in a
naïve model is that whatever happened last period will happen in the next period. So,
the last observation in a time series becomes the forecast for the next periods. A
slightly more involved version of the Naïve method is the Naïve 2 method. In this
method the last observation is adjusted for seasonality and that adjusted value is used
as the forecast. The Naïve 2 method lags trends and does not forecast turning points in
the time series. The Naïve 2 model reverts to the Naïve model when seasonality is not
present in the time series.
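      A minimal sketch of this adjustment follows, assuming seasonal indices of the
kind described in section 4.3 are available; the function name is illustrative.

def naive2(series, seasonal_indices, horizon):
    # Deseasonalize the last observation, then reseasonalize each forecast
    # with the index of its target period. With every index equal to 1.0
    # this collapses to the plain Naive (random walk) forecast.
    p = len(seasonal_indices)
    n = len(series)
    deseasonalized = series[-1] / seasonal_indices[(n - 1) % p]
    return [deseasonalized * seasonal_indices[(n + h) % p] for h in range(horizon)]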


4.1.2 Single Exponential Smoothing (SES)


      Single Exponential Smoothing was introduced by Brown (1957) and is more
complex than Naïve 2. It produces forecasts that are in principle a weighted average
of the historical observations of the time series. In this instance however the weight
applying to older observations is exponentially decreased, hence the name exponential
smoothing. Single refers to the fact that the model uses only one smoothing parameter.
This weight can assume a value in the range 0.1-0.9. This method provides automatic
adjustment for past forecast errors and typically does not perform well when a trend
or seasonality is present. This method is primarily for the extrapolation of the average
component of a time series.
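      A minimal sketch of the recursion follows; the initialization of the level to the
first observation is a common choice assumed here, not specified in the text.

def ses_forecast(series, alpha=0.3, horizon=1):
    # Single Exponential Smoothing: the smoothed level is a weighted average
    # of past observations with exponentially decaying weights; alpha is the
    # single smoothing parameter (0.1 to 0.9 in the text above).
    level = series[0]            # assumed initialization
    for y in series[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return [level] * horizon     # SES produces a flat forecast profile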


4.1.3 Holt's Linear Exponential Smoothing


      Holt's Linear Exponential Smoothing, introduced by Holt (1959), is an extension
of single exponential smoothing and it provides for the extrapolation of a linear trend
in the historical observations of the time series in addition to extrapolating the average
component. This method uses two smoothing parameters, one of which is used to
forecast the level component and the other is used to forecast the trend component.
The weights each assume a value in the range 0.1-0.9. This method is also referred to
as double exponential smoothing for this reason.


4.1.4 Winter's Exponential Smoothing
      Winter's Exponential Smoothing extends Holt's method by including an extra
equation that is used to adjust the forecast to reflect the presence of seasonality in the
historical observations of the time series. In this way Winter's method can forecast the
average, the trend and seasonal components of a time series. Thus this model uses
three smoothing parameters. Each of the parameters can take a value in the range
0.1-0.9.
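      The following sketch covers both this method and Holt's method of subsection
4.1.3: dropping the seasonal equation (gamma and the season array) leaves Holt's
two-parameter recursion. The additive seasonal form and the crude initializations are
assumptions of this sketch; the text does not specify them.

def holt_winters_additive(series, p, alpha=0.3, beta=0.1, gamma=0.1, horizon=1):
    # Level, trend and seasonal smoothing equations (additive form).
    level, trend = series[0], series[1] - series[0]
    season = [0.0] * p                       # assumed initialization
    for t, y in enumerate(series):
        prev_level = level
        level = alpha * (y - season[t % p]) + (1.0 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1.0 - beta) * trend
        season[t % p] = gamma * (y - level) + (1.0 - gamma) * season[t % p]
    n = len(series)
    return [level + (h + 1) * trend + season[(n + h) % p] for h in range(horizon)]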


4.1.5 Damped-Trend Exponential Smoothing


      Damped-Trend Exponential Smoothing was introduced by Gardner and
McKenzie (1985). It is an extension of single exponential smoothing, as are the two
previously described methods. This method is also used to extrapolate both the level
and trend components of a time series. This method uses two smoothing parameters,
one is used to forecast the average component and the other is used to forecast the
trend component. The average component parameter is typically in the range of
0.1-0.9 and the trend parameter is in the range 0.7-1.0. In addition, there is a third
parameter referred to as the trend modifier. This parameter is used to reduce or
"damp" the amount of growth extrapolated into the future. It is from this feature lies
in the economic principle of diminishing return, which suggests that growth or decay,
the trend, are rarely a sustained feature over the long term.
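      The damping effect on the forecast path can be sketched as follows, given a
fitted level and trend; the function name and inputs are illustrative.

def damped_trend_path(level, trend, phi, horizon):
    # The trend modifier phi shrinks each successive increment of growth
    # (phi + phi**2 + ...), so the forecast path flattens over the horizon.
    forecasts, damp = [], 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h
        forecasts.append(level + damp * trend)
    return forecasts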


4.1.6 Robust Trend


      The Robust Trend model was introduced by Grambsch and Stahel (1990) and
is a non-parametric version of Holt's Linear Exponential Smoothing method described
in subsection 4.1.3. As implied by the name this method was designed to forecast both
the average and trend components of a time series. In this method there is no
weighting per se of the historical observations. The average component is simply a
Naïve forecast and the trend component is based on a median estimate of the
differenced data. The median estimate of the trend, unlike the mean estimate, is robust
to the presence of outliers and it is for this reason that the method was named Robust
Trend.
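      A minimal sketch of this description follows; the function name is illustrative.

import statistics

def robust_trend_forecast(series, horizon):
    # Level: a Naive forecast (the last observation). Trend: the median of
    # the first differences, which is robust to outliers in the series.
    trend = statistics.median(b - a for a, b in zip(series, series[1:]))
    return [series[-1] + (h + 1) * trend for h in range(horizon)]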


4.1.7 Theta and Theta sm


      The Theta Method was introduced by Assimakopoulos and Nikolopoulos (2000)
and is a statistically simple extrapolative method. This method competed in, and was one
of the top performers in, the M3-Competition. These authors also introduced a
derivative of the Theta method called Theta sm or Theta Seasonal method. This
method also competed in the M3 competition. While the Theta Seasonal method did
not rival the performance of the Theta method, it did perform well in some situations
and for that reason it has also been included in this study.
      Relative to the Theta method, the authors' description of how to implement the
method required many pages of mathematical calculations. Hyndman and Billah
(2003) examined the Theta model and found that it could be expressed more simply.
In fact, they demonstrate that Theta is comparable to Single Exponential Smoothing
with drift, that is, with an added constant trend component whose slope is half that of
the line fitted through the original time series. In any case
Theta has been shown to be a very accurate forecasting method. The Theta sm
implementation represents a modification to the implementation of the Theta method.
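      A sketch of the Hyndman and Billah (2003) reading follows: an SES level plus a
drift equal to half the ordinary-least-squares slope of a straight line fitted through
the series. The function name and the alpha value are illustrative assumptions.

def theta_like_forecast(series, alpha=0.3, horizon=6):
    # Fit the slope of a straight line through the series (standard OLS).
    n = len(series)
    t_bar = (n - 1) / 2.0
    y_bar = sum(series) / n
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in enumerate(series))
             / sum((t - t_bar) ** 2 for t in range(n)))
    # Plain SES level, then extrapolate with half the fitted slope as drift.
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return [level + (h + 1) * slope / 2.0 for h in range(horizon)]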


4.1.8 Direct Set Assignment (DSA)


      The Direct Set Assignment Method introduced in this study utilizes Fuzzy
Logic. It has been hypothesized that fuzzy logic's capability to model real-life data
may provide for a more accurate forecasting method than the traditional and fuzzy
extrapolative methods currently available.
      The DSA method provides a non-linear mapping of the relationship between
historical observations of the time series. This method captures the relationship
between historical observations in the form of relationships between fuzzy sets.
Knowledge of these relationships is then used to produce a fuzzy forecast in the form
of fuzzy sets. The fuzzy sets are then defuzzified to produce scalar forecasts. An
implementation of the DSA method has been provided in sections 3.2 and 3.3.


4.1.9 Combination of Methods


      A major finding of the original M-competition, and one that has been reaffirmed
in many accuracy studies and forecasting competitions since that time, is that a
combination of alternative methods will often produce forecasts that are more
accurate than the forecasts produced by each of the alternative methods in their native
form. It has been speculated that a combination of methods provides a more accurate
forecast because each method being combined in some way offsets the forecast error
of the other methods. For example, consider that method A and method B are being
combined. Method A provides forecasts that continually overestimate demand, and
Method B provides forecasts that continually underestimate demand. If the forecasts
of the two methods for a common time period are combined, the average of the two
forecast values would be the forecast for that time period. In the example, the average
would likely be more accurate than either of the two original forecasts. In this study
the traditional methods as well as the DSA method have been combined, as required,
for each of the forecasting competitions.
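      The equal-weight combination described above amounts to a simple average at
each horizon; a minimal sketch follows, with an illustrative function name.

def combine_forecasts(*forecast_lists):
    # The combined forecast at each horizon is the simple average of the
    # component methods' forecasts at that horizon.
    return [sum(vals) / len(vals) for vals in zip(*forecast_lists)]

# For example, an S-H-D style combination of three methods' forecasts:
# combined = combine_forecasts(ses_fc, holt_fc, damped_fc)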


4.2 Forecast Accuracy Measures


      A number of measures of forecast accuracy have been developed during the
past two decades. These measures are used for selecting the forecasting method that
will produce the most accurate forecasts for a given situation or data type. The
research presented in section 2.4 raises concern about the utility of some of these
accuracy measures for selecting the most accurate method.
      The measures discussed in the following subsections represent those that have
been found to be most appropriate for use in forecasting competitions. In
each case the accuracy measure reflects the difference between the observed and
forecast value for a given time period. This value is indicated in the calculations for
these measures by the symbol (et). Each measure evaluates and combines the (et)
values for a given method in such a way that it can be used to make a statement about
the method's relative, and in some instances absolute, forecast accuracy.


4.2.1 Symmetric Mean Absolute Percentage Error (sMAPE)


      Symmetric MAPE is the average sAPE value for the same forecast horizon, for
a selection of time series, for a specific method, when used to establish the relative
accuracy of various methods over a large selection of time series. However, this
measure can also be used to establish the accuracy of a forecasting method over the
forecast horizons of a single time series.
      Using the sMAPE avoids the problem caused by large errors when the observed
values are close to zero, as well as the large differences in absolute percentage error
that occur when the observed value is greater than the forecast value. Finally, sMAPE
has the advantage of being easy to interpret, as it expresses forecast error as a
percentage. The sMAPE value is calculated as the average of the sAPE values for
each forecast horizon across a selection of series for each method, or for all the
forecast horizons of a selected series. The sAPE value is calculated as the ratio of the
absolute difference between the forecast and observed values, (et), to the average of
the sum of the forecast and observed values. This value is multiplied by one hundred
to convert it to a percentage.
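      A minimal sketch of this calculation follows; the function name is illustrative.

def smape(actuals, forecasts):
    # Each sAPE is |F - A| divided by the average of A and F, times 100;
    # sMAPE is the mean of the sAPE values over the horizons (or series).
    sapes = [100.0 * abs(f - a) / ((a + f) / 2.0)
             for a, f in zip(actuals, forecasts)]
    return sum(sapes) / len(sapes)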


4.2.2 Median Absolute Percentage Error (MedAPE)


      Although it is not indicated in its name, MedAPE is the median sAPE value for
the same forecast horizon, for a selection of time series, for a specific method, or for
all forecast horizons for a selected series for a particular method. This measure has the
advantage of not being influenced by extreme values and for this reason is more
robust than sMAPE. The measure is reasonably easy to interpret. The MedAPE value
is calculated as the median of the sAPE values for each forecast horizon across all
selected series for each method or for all forecast horizons of a particular series. The
sAPE value is calculated as the ratio of the absolute difference between the forecast
and observed values, (et), to the average of the sum of the forecast and observed
values. This value is multiplied by one hundred to convert it to a percentage.


4.2.3 Mean Absolute Deviation (MAD)


      This accuracy measure reflects the average dispersion about the mean, and for
this reason it can be interpreted in a manner similar to the standard deviation of the
forecast value. In short, this measure indicates how much more or less the forecast
will be than the actual observation, in the units of the time series. In addition, this
measure can be used to create confidence intervals for the forecast value. The MAD
value is found by taking the average, across all selected series, of the absolute
difference between the forecast and observed values, (et), for each forecast horizon.



4.2.4 Median Relative Absolute Error (MedRAE)


      This measure has been found to be particularly well suited for comparing the
accuracy of various methods, as in the case of this forecasting competition. The
Median Relative Absolute Error is calculated as the absolute error (et) of a proposed
model divided by the absolute error (et) of the Naïve 2 model for a given series. The
median value of these ratios across a number of time series can be found, and this
term is the Median Relative Absolute Error. This accuracy measure is reasonably easy
to interpret and lends itself to summarizing across horizons and series, as it controls
for scale and for outliers.
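      The three measures in subsections 4.2.2 through 4.2.4 share simple arithmetic;
a minimal sketch follows. The function names are illustrative, and med_rae assumes
the absolute errors for the method and for Naïve 2 have been precomputed.

import statistics

def med_ape(actuals, forecasts):
    # Median of the sAPE values; robust to extreme errors.
    return statistics.median(100.0 * abs(f - a) / ((a + f) / 2.0)
                             for a, f in zip(actuals, forecasts))

def mad(actuals, forecasts):
    # Mean Absolute Deviation: the average |e_t| in the units of the series.
    return sum(abs(f - a) for a, f in zip(actuals, forecasts)) / len(actuals)

def med_rae(method_abs_errors, naive2_abs_errors):
    # Median of the ratios |e_t|(method) / |e_t|(Naive 2).
    return statistics.median(m / b for m, b in
                             zip(method_abs_errors, naive2_abs_errors))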


4.2.5    Percentage Better


      The Percentage Better measure is used to count the percentage of the time that a
given method has a smaller forecasting error than another method. In this study, for
this accuracy measure, the method to which all of the methods were compared is
Theta. Theta was one of the best performing statistically simple methods in the M3
competition held in 2000. In this measure each forecast is given equal weight and for
this reason it is a useful measure for use in forecasting competitions.


4.2.6 Average Ranking


        In a forecasting competition the methods under evaluation can be given a rank
that indicates their accuracy, relative to that of the other methods in the competition.
that indicates their accuracy, relative to that of the other methods in the competition.
The method with the lowest rank is the method with the highest relative accuracy.
Average Rank can be based on a number of other measures of forecast accuracy, but
in most cases is based on one of the numerous MAPE values. In this study it was
based on the sAPE values, as was the case in the M3 competition.
      To produce the average rank for a method, a rank is assigned for each forecast
horizon, for each of the selected time series, to all methods under evaluation, based on
each method's sAPE value for that forecast horizon for that series, or across the
forecast horizons of a particular series. Then the ranks for each forecast horizon for
each series are averaged for each method, or are averaged across the forecast horizons
of a particular series. This measure, while typically used in the aggregate, can be used
to compare the forecast accuracy of various methods for a single series.


4.2.7 Benchmark


        The absolute accuracy of the methods in a competition is not as important as
how well each method performs relative to a benchmark method. The simplest
benchmark and the one used in this study is Naïve 2. The benchmark value is the
difference in the sMAPE value between Naïve 2 and the alternative method. Positive
values indicate that the alternative method produced more accurate forecasts for the
selected time series than did Naïve 2. The alternative methods can then be evaluated
in terms of how much better or worse they performed than Naïve 2. The use of sMAPE
allows for the difference to be interpreted as a percentage.


4.3 Determination of Periodicity


      The periodicity, or degree of seasonality, of a particular time series can be
determined in several different ways. These include conducting a visual inspection of
a plot of the historical observations of the time series itself, examining the
autocorrelation function for the time series or through an algebraic calculation of the
seasonal indices for the time series. In practice and in competitions the latter approach
is generally preferred. The most frequently used approach to the calculation of seasonal
indices is the ratio-to-moving-average method. In this method the ratio of the actual
observation to a centered moving average forecast is calculated. This ratio produces a
de-trended value for each period, typically a month or quarter. The average of these
de-trended values for similar periods (i.e., quarter one for all years covered by the
time series) yields the seasonal indices. If the value of each of the seasonal indices is 1.00
then seasonality is not present in the time series. If the calculated value of the indices
is other than 1.00 then seasonality is present and the periodicity is the number of
indices with a value other than 1.00.
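      A minimal sketch of the ratio-to-moving-average calculation follows, for an
even periodicity such as quarterly data; the function name is illustrative.

def seasonal_indices(series, p):
    # Divide each observation by a centered moving average (for even p, the
    # two end terms of the window are half-weighted), then average the
    # de-trended ratios for like periods to obtain the seasonal indices.
    half = p // 2
    ratios = [[] for _ in range(p)]
    for t in range(half, len(series) - half):
        window = series[t - half: t + half + 1]
        cma = (window[0] / 2.0 + sum(window[1:-1]) + window[-1] / 2.0) / p
        ratios[t % p].append(series[t] / cma)
    return [sum(r) / len(r) if r else 1.0 for r in ratios]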


4.4 Data


      Fifteen time series were randomly selected, without replacement, from each of
nine subcategories of data used in the M3 forecasting competition held in 2000, for a
total of one hundred thirty-five time series. These data were organized as yearly,
quarterly and monthly categories of microeconomic, macroeconomic and industry
data. This created the nine subcategories referenced above. The time dimension refers
to the time interval between successive observations.
      Makridakis and Hibon (2000) collected the original data for the M3 competition
on a quota basis. The three thousand three time series collected for the M3
competition were real-world, heterogeneous business time series of yearly, quarterly,
monthly and other data, each containing microeconomic, macroeconomic, industry,
demographic and financial data. This created a total of twenty subcategories of time
series.
      Makridakis and Hibon used a variety of means to collect the M3 Competition
time series. These include written requests for data sent to companies, industry groups
and government agencies, as well as the retrieval of data from the Internet and more
traditional sources of business and economic data. The authors labeled the three
thousand three time series by creating a unique ID number for each series ranging
from N0001 to N3003. They also assigned to each of these time series a brief
description of the data type, (i.e., SALES, INVENTORIES,
COST-OF-GOODS-SOLD, etc.) and included the time period from which the data
was generated. Further, the authors partitioned each of the three thousand three time
series into a calibration data set and a validation data set. The calibration data was
used to calibrate the forecasting methods and the validation data set was used to
evaluate the accuracy of the ex-ante forecasts for each time series.
      The validation data set contains the last six observations for each of the yearly
time series; the last eight observations for the quarterly time series; and the last
eighteen observations for monthly time series. The number of observations in the
validation data set represents the forecast horizon, or the number of ex-ante forecasts
that had to be produced for that particular time series. The entire M3 Competition data
set can be retrieved from: www.marketing.wharton.upenn.edu/forecast/data.html.
      In the one hundred thirty-five time series selected for this study, the minimum
series length, for yearly series was twenty and the maximum length was forty-seven;
for quarterly data the minimum length was twenty-four and the maximum length was
seventy-two, with a mean and median length, respectively, of fifty-two and fifty-four;
and for monthly data the minimum length was sixty-nine and the maximum length
was one hundred forty-four, with a mean and median length, respectively, of one
hundred twenty-two and one hundred thirty-four. These values are consistent with the values
reported by Makridakis and Hibon (2000) for all of the time series in the same nine
subcategories of data. Table 1 in Appendix A provides complete descriptive statistics
including, periodicity, for all of the one hundred thirty five time series evaluated in
this study.
      To ensure that each of the methods used in the competition was fairly evaluated
all forecasts for the traditional methods used in the study were obtained from the
experts used in the M3-Forecasting Competition via Michelle Hibon, Senior Research
Fellow, from INSEAD Business School, and coauthor of the M3-Competition. That is
with the exception of the DSA method, whose forecasts were produced by the author
of this current study.


4.5 The Forecasting Competitions


      Three forecasting competitions were conducted in this study to evaluate the
hypotheses discussed in section 2.12. These competitions adopted the methodology
used in the M3 Forecasting Competition for evaluating a newly introduced method, Theta. In this
current study the competitions were used to establish the relative forecast accuracy of
the extrapolative forecasting methods discussed in section 4.1 of this chapter as
required, as were the accuracy measures described in section 4.2, for the three
competitions. The time series used in these competitions are those discussed in
section 4.4 of this chapter, also used as required for each of the three competitions.
      The measures of forecast accuracy were calculated for the forecast horizons of
individual series, for subcategories of series, for categories of series and for all of the
series used in a competition, as required. This is referred to as the level of series
aggregation. In practice having knowledge of which method provides the most
accurate forecasts for a particular data type or time interval is quite useful, as is the
knowledge that a particular method provides the most accurate forecast, over a broad
cross-section of time intervals.


4.5.1 Competition #1 Procedures


      In Competition #1, the relative forecast accuracy of nineteen models of the
DSA method was established. The nineteen DSA models evaluated in this competition
differed in the value of their fuzzy set parameter. The parameter values examined
were the whole numbers from two through twenty. The fuzzy set parameter is the
number of fuzzy sets used to model the training set for the time series. These models
were labeled (FS2) through (FS20).
      Six, eight and eighteen ex ante forecasts were produced for each one of the one
hundred thirty-five yearly, quarterly and monthly time series, respectively. These
forecasts were produced in several automated Microsoft Excel workbooks. These ex
ante forecasts were then compared to the observed values in the validation data set for
each one of the time series.
      The two measures of forecast accuracy, sMAPE and Average Ranking, were
calculated for each method, for each time series, across all its forecast horizons, at the
series level of aggregation. All seven accuracy measures, however, were calculated for
the time series in each of the nine subcategories and the time series in each of the
three categories. The most accurate model was the one that minimized both sMAPE
and Average Ranking at the series level. A consensus method was used to identify
the most accurate model at the subcategory and category levels of aggregation. This
approach allowed for a statement to be made as to the relative accuracy of a particular
method for an individual series, a subcategory, as well as a category.
      In addition to testing hypotheses H01 and H02 discussed in section 2.12, the
goal of this competition was to identify the DSA model that produced the most
accurate forecasts for individual series, for subcategories of series and for categories
of series so that the forecasts for those models could be used in Competitions #2
and #3. As such, the forecasts for the models of best fit by series,
subcategory and category were combined and relabeled DSAA, DSAB and DSAC,
respectively.


4.5.2 Competition #2 Procedures


      In Competition #2, the relative forecast accuracy of the eight traditional
extrapolative methods, a combination of the SES, Holt's and Damped Trend methods,
designated S-H-D, DSA-A, DSA-B, DSA-C and a combination of the three DSA
methods with Winter's Exponential Smoothing, designated DSAA-W, DSAB-W and
DSAC-W, was established. Winter's method was selected as it was the most complete
of the exponential smoothing methods.
      Six, eight and eighteen ex ante forecasts were obtained from experts for each
one of the one hundred thirty-five yearly, quarterly and monthly time series,
respectively. These ex ante forecasts were then compared to the observed values in the
validation data set for each one of the time series.
      The seven measures of forecast accuracy discussed in section 4.2 were
calculated for each method for the time series in each of the nine subcategories, the
time series in each of the three categories and for all one hundred thirty five
time-series used in this competition. In this competition the three methods with the
highest observed accuracy by subcategory, category and overall were selected based
on a consensus among the seven accuracy measures. This approach allowed for a
statement to be made as to the relative accuracy of a particular method for a
subcategory of series, a category of series and for all of the time series in this
competition.
      Competition #2 was conducted specifically for the purpose of testing
hypotheses H03, H04, H05 and H06.


4.5.3 Competition #3 Procedures


      In Competition #3, the relative forecast accuracy of the eight traditional
extrapolative methods, a combination of the SES, Holt's and Damped Trend methods,
designated S-H-D, DSA-A, DSA-B, DSA-C and a combination of the three DSA
methods with Winter's Exponential Smoothing, designated DSAA-W, DSAB-W and
DSAC-W, was established. Winter's method was selected as it was the most complete
of the exponential smoothing methods.
      Six, eight and eighteen ex ante forecasts were obtained from experts for each
one of forty-five time series each containing a statistically significant trend. These
forty-five series were comprised of fifteen series that were randomly selected without
replacement from each of the three categories of yearly, quarterly and monthly time
series. These ex ante forecasts were then compared to the observed values in the
validation data set for each one of the time series.
      The seven measures of forecast accuracy discussed in section 4.2 were
calculated for each method for the time series in each of the three categories and for
all forty-five time series used in this competition. In this competition the three
methods with the highest observed accuracy by category, and overall, were selected
based on a consensus among the seven accuracy measures. This approach allowed for
a statement to be made as to the relative accuracy of a particular method for a
category of series and for all of the time series in this
competition. Competition #3 was conducted specifically for the purpose of testing
hypothesis H07.


4.6. Concerns About Forecasting Competitions


      A longstanding concern, when using the forecasting competition design, is that
the accuracy measures are averaged across series and over different forecasting
horizons. The effect potentially is to obscure the top performance of a model on
specific series, or specific forecast horizons, when in fact these series or forecast
horizons, could be of primary interest to a practitioner. This problem is exacerbated as
the number of series and forecasts produced is increased. Further, by not bringing to
the attention of forecasters, a model's top performance on some limited number of
series or forecast horizons, the inevitable question as to why the model performed so
well in these situations is never asked. So, declaring a particular model as the winner
of a competition has some real limitations to its importance, and this is particularly
true when declaring a model the overall winner of a competition.
      Another concern is with the nature of the data used in the M-competitions and
the other accuracy studies that rely on these data. Tashman (2001) suggests that time
series are multi-attributed and that a term such as "microeconomic data" used in the M
competitions is a catch all for series that actually vary greatly with respect to a
number of features including: company, brand, item, product, financial, marketing,
operations, country and region. In addition, these data vary by seasonality, level of
volatility, presence of outliers, and whether or not a trend is present.
      The problem is that the time series in the data types used in the M competitions
(microeconomic, industry, macroeconomic, demographic and financial) are actually
quite heterogeneous within the data type subcategories and quite homogeneous with
respect to the series in the other subcategories. This makes conclusions about a
method's performance on a subcategory or category less meaningful, as these levels of
aggregation do not represent the way in which time series are encountered in the real
world. Further, in the M-competitions held since 1982 the time origins of the data
have been predominantly yearly, quarterly and monthly, whereas in business a great
deal of data is captured on an hourly, daily or weekly basis. Data with these time
origins has not been used in the M competitions and it is unlikely the results on yearly,
quarterly or monthly data can be generalized to these different time origins.
      In section 2.5 of this current study a discussion was provided on the reasons
why difference testing is not used to establish forecast accuracy in forecasting
competitions. While the reasons enumerated for not using difference testing are
justified, there is far less justification for not using prediction intervals in place of the
current point estimates.
      Prediction intervals are constructed from the standard deviation, which in
forecasting is the accuracy measure, Mean Absolute Deviation (MAD). As such, the
debate from section 2.5, on which accuracy measure to use becomes moot. Further, as
to the use of the Percent Better accuracy measure, there is no reason why this
accuracy measure could not be used in conjunction with prediction intervals. Finally,
while the argument provided for methods being reasonable alternatives still holds, the
fact remains that the use of prediction intervals would help to resolve, with greater
justification, issues of relative forecast accuracy.
      Another topic of concern is the absence of domain knowledge about the series
used in the competitions. Authors including Armstrong (2001), suggest that domain
knowledge about the series used in the competition should be provided to participants.
This would ensure that any methods that could benefit from domain knowledge would
be fairly evaluated in the study, and this approach better represents how practitioners
produce forecasts. As such, the caveat to a particular model being declared the winner
at some level of series aggregation is that other models may have performed as well if
domain knowledge had been made available.


4.7 Summary


      Three forecasting competitions were conducted to investigate the relative
forecast accuracy of the Direct Set Assignment (DSA) method. This is a newly
developed fuzzy logic based extrapolative forecasting method that was investigated
due to its potential to provide more accurate ex ante forecasts than currently available
statistically simple extrapolative forecasting methods.
      The data, procedures and alternative forecasting methods used in these
competitions, as required, were adopted from the M3 forecasting competition held in
2000. The alternative methods included eight traditional methods including a
combination of three traditional methods.
      These competitions were conducted to answer several specific questions
concerning the impact of the fuzzy set parameter on the relative forecast accuracy of
the DSA method, as well as questions about the relative accuracy of the DSA method,
on different types of data, including series in which a trend was present. An additional
question was what would be the effect on relative accuracy of combining the most
accurate fuzzy method with a selected traditional method.
      In the next chapter the results of the three competitions are presented.


                                      CHAPTER 5
                                       RESULTS


      This chapter reports the relative forecast accuracy of the Direct Set Assignment,
(DSA) forecasting method introduced in Chapter 3, and that of the alternative
forecasting methods that were analyzed in three forecasting competitions outlined in
section 4.5.
      The one hundred thirty-five time series that were randomly drawn from the
M3-Competition data set were evaluated in their entirety in Competition #1 and #2,
resulting in the analysis of, twenty-seven thousand three hundred and sixty forecasts,
and twenty one thousand six hundred forecasts, respectively. In Competition #3 a
sample of forty-five series, comprised of fifteen series randomly drawn from each of
the three categories of time series, in the sample drawn for this study, were evaluated,
resulting in the analysis of four thousand fifty forecasts.
      The results generated from the three competitions are too numerous to present
in their entirety in this chapter. Therefore, tables containing the accuracy measures for
the methods evaluated in Competitions #1, #2 and #3 can be found in Appendix B, C,
and D respectively. A description of the information provided in each Appendix can
be found in the sections describing the results for each competition, accompanied by
the appropriate summary tables. The tables and commentary in the remainder of this
chapter report the relative forecast accuracy of the best performing methods in each
competition for the various levels of series aggregation.


5.1 Competition #1 Results


      In Competition #1 the goal was to assess the impact on the forecast accuracy of
the DSA method of varying the fuzzy set parameter in the model from two sets (FS2),
to twenty sets (FS20), and to determine if there is an optimal or universal number of
fuzzy sets. Relative forecast accuracy was assessed at three levels of aggregation.
They are individual series, subcategory and category.
      The results for Competition #1 indicate that there is no universal or optimal
fuzzy set parameter for the DSA method. Rather, the set parameter that will yield the
most accurate forecasts is specific to each individual series, each subcategory of series
and each category of series. This finding suggests that the set parameter in the DSA
method functions in a similar manner to the parameter weights used in exponential
smoothing methods.


5.1.1 Individual Series Competition #1


      The values of sMAPE and Average Ranking accuracy measures for the one
hundred thirty five series for the various forecast horizons, for the methods evaluated in
this competition have been reported, by category, in Table B.1 - Table B.6 in
Appendix B.
      Table 5.1 – Table 5.3, in this chapter, report the DSA model that was selected
as the model providing the highest observed accuracy for each individual series across
its own forecast horizons. The (FS #) designation in these tables indicates the value of
the fuzzy set parameter for that DSA model. The forecasts provided by the DSA
models with the highest observed accuracy by series, reported in Table 5.1 – Table 5.3,
were combined and relabeled the DSA-A model and were evaluated in Competition
#2.


Table 5.1
Model Which Gives Best Results by Series – Yearly Data


Table 5.2
Model Which Gives Best Results by Series – Quarterly Data


Table 5.3
Model Which Gives Best Results by Series – Monthly Data


5.1.2 Subcategory Competition #1


      The values of the seven accuracy measures for the nine subcategories of time
series, for the various forecast horizons, for the methods evaluated in this competition
have been reported in Table B.7 – Table B.69 in Appendix B. Table 5.4 – Table 5.12 in
this chapter report, in order, the three DSA models with the highest observed
accuracy for each of the seven measures of forecast accuracy, for each of the nine
subcategories of series respectively across the various forecast horizons. Table 5.13,
in this chapter, reports the DSA model that was selected, based on the average across
all forecast horizons, as the model providing the highest observed accuracy for each
of the nine subcategories for each of the seven accuracy measures.
      Table 5.13A reports the DSA models that were selected on a consensus basis for
each subcategory, with sMAPE breaking ties, from Table 5.13, as the model with the
highest observed accuracy by subcategory. The forecasts from these models were
combined and labeled the DSA-B model. The forecast accuracy of the DSA-B
model was established in Competition #2.


Table 5.4
Best Models For Yearly Micro Series


Table 5.5
Best Models For Yearly Industry Series


Table 5.6
Best Models For Yearly Macro Series


Table 5.7
Best Models For Quarterly Micro Series


Table 5.8
Best Models For Quarterly Industry Series


Table 5.9
Best Models For Quarterly Macro Series


Table 5.10
Best Models For Monthly Micro Series


Table 5.11
Best Models For Monthly Industry Series


Table 5.12
Best Models For Monthly Macro Series


Table 5.13
Models which give best results – subcategory


5.1.3 Category Competition #1


      The values of the seven accuracy measures for the three categories for the
various forecast horizons, for the methods evaluated in this competition, have been
reported in Table B.70 – Table B.90 in Appendix B. Table 5.14 – Table 5.16 in this
Chapter report, in order, the three DSA models with the highest observed accuracy,
for each of the seven measures of forecast accuracy, for each of the three categories of
series respectively, across the various forecast horizons.
      Table 5.17 in this Chapter, reports the DSA model that was selected as the
model providing the highest observed accuracy for each of the three categories, based
on the average across all identical forecast horizons, for each of the seven accuracy
measures. In Table 5.17, it can be seen that generally, the ranking of the models varies
according to the error measure being used.
      Table 5.17A reports the DSA models that were selected on a consensus basis by
category, with sMAPE breaking ties, from Table 5.17, as the model with the highest
observed accuracy by category. The forecasts from these models were combined, and
labeled the DSA-C model. The forecast accuracy of the DSA-C model was
established in Competition #2.


Table 5.14
Best Models For Yearly All Data


Table 5.15
Best Models For Quarterly Series


Table 5.16
Best Models For Monthly Series


Table 5.17
Models Which Give Best Results – category


5.2 Competition #2 Results


      In Competition #2 the goal was to establish the relative forecast accuracy of
several models of the DSA method that had proven to be the most accurate at the
series, subcategory, and category level of aggregation in competition #1, and eight
traditional methods; a combination of traditional methods and a combination of the
DSA methods and Winter's Exponential Smoothing. Relative forecast accuracy was
assessed at three levels of aggregation for all one hundred thirty-five series used in
this competition. They are the subcategory, category and all series levels of
aggregation.
      In Competition #2 the DSAA and DSAA-W models provided more accurate
forecasts at the subcategory, category and all series levels of aggregation than did the
DSAB, DSAC, DSAB-W or DSAC-W.
      In the subcategory competition the DSAA model provided the highest observed
accuracy in the five subcategories: Yearly-Micro and Yearly-Industry data, and
Quarterly-Micro, Quarterly-Industry and Quarterly-Macro data. The DSAA model
was also a top three performer on observed accuracy in the Monthly-Micro
subcategory.
      The DSAA-W model provided the highest observed accuracy in the two
subcategories: Monthly-Micro and Monthly-Industry data. The DSAA-W
model was also a top three performer in the five subcategories: Yearly-Industry,
Quarterly-Micro, Quarterly-Industry, Quarterly-Macro and Monthly-Macro data. The
DSAB-W and DSAC-W models were top three performers in the Quarterly-Industry
and Quarterly-Macro subcategories. It was in only the subcategory of Yearly-Macro
data where none of the DSA method derivatives were among the three models with
the highest observed accuracy.
      In the category competition the DSAA model provided the highest observed
accuracy in the Quarterly Category and was a top three performer in the Yearly and
Monthly Categories. The DSAA-W model provided the highest observed accuracy in
the Monthly Category and was a top three performer in the Yearly and Quarterly
Categories. The DSAC-W model was a top three performer in the Quarterly Category.
      In the All Series competition the DSA-A model had the highest observed
accuracy of all models and the DSAA-W model was a top three performer.
      In addition to the DSA methods, other top performing methods in the
subcategory competition included Theta and Robust Trend. In the subcategory
competition Robust Trend provided the highest observed accuracy in two
subcategories: Yearly-Macro and Monthly-Macro data and was a top three performer
in the two additional subcategories: Yearly-Micro and Quarterly-Macro. Theta was a
top three performer in the three subcategories: Yearly-Industry, Yearly-Macro and
Monthly-Industry.
      In the category competition Robust Trend provided forecasts with the highest
observed accuracy in the Yearly Category. Theta was a top three performer in the
Monthly Category. In the All Series competition Theta was a top three performer. It is
worth noting that Robust Trend and Theta were two of the top performing methods
in the M3 competition.


5.2.1 Subcategory Competition #2


      The values of the seven accuracy measures for the nine subcategories for the
various forecast horizons, for the methods evaluated in this competition, have been
reported in Table C.1 – Table C.63 in Appendix C. Table 5.18 – Table 5.26 in this
Chapter report, in order, the three models with the highest observed accuracy for
each of the seven measures of forecast accuracy, for each of the nine subcategories of
series respectively, across the various forecast horizons.
      Table 5.27 reports, for the convenience of the reader, the models with the
highest observed accuracy for each subcategory for each accuracy measure based on
the average across all forecast horizons. Table 5.27A reports the model that was
selected on a consensus basis, with sMAPE breaking ties, by subcategory from Table
5.27, as the model with the highest observed accuracy by subcategory. In Table 5.27,
it can be seen that the ranking of the models varies according to the accuracy measure
being used.


Table 5.18
Best Models For Yearly Micro Series


Table 5.19
Best Models For Yearly Industry Series


Table 5.20
Best Models For Yearly Macro Series


Table 5.21
Best Models For Quarterly Micro Series


Table 5.22
Best Models For Quarterly Industry Series


Table 5.23
Best Models For Quarterly Macro Series


Table 5.24
Best Models For Monthly Micro Series


Table 5.25
Best Models For Monthly Industry Series


Table 5.26
Best Models For Monthly Macro Series


Table 5.27
Models which give best results - Subcategory


5.2.2 Category Competition #2


      The values of the seven accuracy measures for the three categories, for the
various forecast horizons, for the methods evaluated in this competition, have been
reported in Table C.64 – Table C.84 in Appendix C. Table 5.28 – Table 5.30 in this
Chapter report, in order, the three models with the highest observed accuracy for
each of the seven measures of forecast accuracy, for each of the three categories of
series respectively, across the various forecast horizons.
      Table 5.31 reports, for the convenience of the reader, the model with the highest
observed accuracy, based on the average across all forecast horizons, for each
category for each of the seven accuracy measures. Table 5.31A reports the model that
was selected on a consensus basis, with sMAPE breaking ties, by category from Table 5.31,
as the model with the highest observed accuracy by category. In Table 5.31 it can be
seen that the ranking of the models varies according to accuracy measure being used.


Table 5.28
Best Models For Yearly All Data


Table 5.29
Best Models For Quarterly All Data


Table 5.30
Best Models For Monthly All Data


Table 5.31
Models which give the best results – category


5.2.3 All Series Competition #2


      The values of the seven accuracy measures for all one hundred thirty five series
in competition #2, for the various forecast horizons, for the methods evaluated in this
competition, have been reported in Table C.85 – Table C.91 in Appendix C. Table 5.32
in this Chapter reports, in order, the three models with the highest observed accuracy
for each of the seven measures of forecast accuracy, for all one hundred thirty five
series in competition #2, across the various forecast horizons.
      Table 5.33 reports, for the convenience of the reader, the model with the highest
observed accuracy, based on the average across all forecast horizons, for all one
hundred thirty five series in competition #2, for each of the seven accuracy measures.
Table 5.33A reports the model that was selected on a consensus basis for all series,
with sMAPE breaking ties, from Table 5.33 as the model with the highest observed
accuracy overall. In Table 5.33 it can be seen that the ranking of the models varies
according to the accuracy measure being used.
      Table 5.34 reports the sMAPE values for the combination methods evaluated in
competition #2, for the models in both combined and native form. These values
indicate that, with the exception of the sMAPE values for the DSAA model for the
averages of the 1-4, 1-6 and 1-8 forecast horizons, the combined methods perform at
least as well as the methods in their native form.
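
      As a concrete illustration of how a combined forecast of this kind can be
produced, the sketch below averages the point forecasts of the component methods
with equal weights. The equal-weight scheme and the series values are assumptions
for illustration; the exact combination weights used in this study are not restated in
this chapter.

    import numpy as np

    def combine_forecasts(*forecasts: np.ndarray) -> np.ndarray:
        # Equal-weight average of the component point forecasts
        # (an assumed scheme, common in combination studies).
        return np.mean(np.vstack(forecasts), axis=0)

    # Hypothetical forecasts for horizons 1-4 from the three methods:
    single = np.array([100.0, 101.0, 102.0, 103.0])
    holt   = np.array([102.0, 104.0, 106.0, 108.0])
    dampen = np.array([101.0, 102.5, 103.5, 104.0])
    shd = combine_forecasts(single, holt, dampen)  # the S-H-D combination
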


Table 5.32
Best Models For Overall Data


Table 5.33
Model which gives the best results – Overall


Table 5.34
Symmetric MAPE of Single, Holt, Dampen, DSA-A and their combinations


5.3 Competition #3 Results


      In Competition #3 the goal was to establish the relative accuracy of several
models of the DSA method that had proven to be the most accurate at the series,
subcategory and category levels of aggregation in Competition #1; eight traditional
methods; a combination of traditional methods; and a combination of the DSA
models and Winters' Exponential Smoothing. Relative forecast accuracy was assessed
at two levels of aggregation, the category and all series levels, for forty-five series
containing a statistically significant trend.
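
      This chapter does not restate how trend significance was established for these
forty-five series; one common screen, sketched below under that assumption, is an
ordinary least squares regression of the series on a time index with a t-test on the
slope.

    import numpy as np
    from scipy.stats import linregress

    def has_significant_trend(y: np.ndarray, alpha: float = 0.05) -> bool:
        # Regress the observations on a time index and test whether
        # the slope differs significantly from zero.
        t = np.arange(len(y))
        result = linregress(t, y)
        return result.pvalue < alpha
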
      In competition #3, the DSAA model provided the highest observed accuracy in
the Quarterly Category and was one of the three methods with the highest observed
accuracy in the Monthly Category. The DSAA-W model provided the highest observed
accuracy in the Yearly Category and was one of the three methods with the highest
observed accuracy in the Quarterly and Monthly Categories. The DSAC-W model
was a top three performer in the Quarterly Category. In the All Series competition,
DSAA was one of the three models with the highest observed accuracy for all
forty-five time series and DSAA-W provided the highest observed accuracy in
competition #3.
      The other models that performed well in the category competition were Theta
and Holt's Exponential Smoothing. Theta provided the highest observed accuracy in
the Monthly Category and was a top three performer in the Yearly Category. Holt's
Exponential Smoothing was a top three performer in the Yearly Category. In the All
Series Competition, Theta was also a top three performer.


5.3.1 Category Competition #3


      The values of the seven accuracy measures for the three categories, for the
various forecast horizons, for the methods evaluated in this competition, have been
reported in Table D.1 – Table D.63 in Appendix D. Table 5.35 – Table 5.37 in this
Chapter report, in order, the three models with the highest observed accuracy for
each of the seven measures of forecast accuracy, for each of the three categories of
series, across the various forecast horizons.
      Table 5.38 reports, for the convenience of the reader, the model with the highest
observed accuracy, based on the average across all forecast horizons, for each
category for each of the seven accuracy measures. Table 5.38A reports the model that
was selected on a consensus basis, with sMAPE breaking ties, from Table 5.38, as the
model with the highest observed accuracy by category. In Table 5.38 it can be seen
that the ranking of the models varies according to the accuracy measure being used.


Table 5.35
Best Models For Yearly – Trend Series


Table 5.36
Best Models For Quarterly – Trend Series


Table 5.37
Best Models For Monthly – Trend Series


Table 5.38
Models which give the best results – Category


5.3.2 All Series Competition #3


      The values of the seven accuracy measures for all forty-five series used in
competition #3, for the various forecast horizons, have been reported in Table D.64 –
Table D.90 in Appendix D. Table 5.39 in this Chapter reports, in order, the three
models with the highest observed accuracy for each of the seven measures of forecast
accuracy, for all forty-five series used in Competition #3, across the various forecast
horizons.
      Table 5.40 reports, for the convenience of the reader, the model with the highest
observed accuracy, based on the average across all forecast horizons, for all forty-five
time series used in Competition #3, for each of the seven accuracy measures. Table
5.40A reports the model that was selected on a consensus basis, with sMAPE breaking
ties, from Table 5.40 as the model with the highest observed accuracy for all forty-five
series overall. In Table 5.40 it can be seen that the ranking of the models does not vary
according to the accuracy measure being used.


Table 5.39
Best Models For Overall – Trend Series


Table 5.40
Model which gives the best results overall – Trend Series


5.4 Summary


      In this study three forecasting competitions were conducted for the purpose of
testing eight specific null hypotheses regarding the relative forecast accuracy of the
Direct Set Assignment forecasting method. Several important observations have been
made. Firstly, there does not appear to be a universal or single value of the fuzzy set
parameter that will yield the most accurate forecasts in all situations and for all types
of data. In fact, each of the fuzzy set parameter values provided the most accurate
forecast for several different series.
      Further, in this study it was observed that the parameter value is series specific,
as opposed to data type specific. It was also found that the DSAA and DSAA-W
models outperformed the DSAB, DSAC, DSAB-W and DSAC-W models, albeit in two
instances the DSAB-W and DSAC-W models were ranked among the top three
performing models. However, they were not ranked higher than the alternative DSAA
and DSAA-W models and did not perform well overall.
      Secondly, the DSAA and the DSAA-W models performed remarkably well
relative to the other methods, across subcategories and categories of time series that
differed by data type and time interval. In addition, these methods performed
extremely well on monthly and quarterly data in which both a trend and a seasonal
component were present, and on data where only a trend was present. Further, the DSA
method produced accurate forecasts for series with short as well as long forecast
horizons and for series with short as well as long training sets.
      Lastly, there were two sets of combination models, one in which traditional
methods only were combined, and one in which traditional and fuzzy methods were
combined. In the case of the traditional combination S-H-D, it performed at least as
well as each of the traditional models did in their native form in fifteen of eighteen
comparisons. The three instances where S-H-D did not outperform the traditional
model were in the comparison with Dampen on the average forecast horizons 1-4, 1-6
and 1-8.
      In the case of the DSAA-W combination, it performed at least as well as the
DSAA method in one of six comparisons and at least as well as the Winters' method
in six of six comparisons. The DSAB-W combination performed at least as well as the
DSAB and Winters' models in twelve of twelve comparisons. The DSAC-W method
also performed at least as well as the DSAC and Winters' methods in twelve of twelve
comparisons.
      In the next chapter the results of the three forecasting competitions will be used
to evaluate the eight research hypotheses outlined in section 2.12.


                                     CHAPTER 6
                                     DISCUSSION


      The purpose of this study was to introduce and validate the ex ante forecast
accuracy of the Direct Set Assignment extrapolative forecasting method. This method
was developed in response to a reported need in the forecasting literature for a
statistically simple extrapolative forecasting method that would be robust to the
fluctuations that exist in real-life business and economic data. Extrapolative
forecasting methods are one of a large group of quantitative forecasting methods that
produce a future quantitative value of a variable of interest by extrapolating the
historical values of that variable. To use these methods, the historical values of the
quantitative variable must be organized as a time series.
      The DSA method differs from traditional extrapolative forecasting methods in
that it uses fuzzy logic, or more specifically fuzzy sets, to model the relationships
between the historical observations of a time series. Fuzzy logic is a data processing
technology that has earned a reputation for being robust to real-life data in a variety
of applications, including systems control and signal processing.
      It has been hypothesized that the DSA method will provide more accurate
forecasts than traditional, statistically simple methods, due to the DSA method's use
of fuzzy logic. To validate the relative forecast accuracy of the DSA method, its
performance has been established through the use of a forecasting competition
methodology. Specifically, three competitions have been conducted that relied on the
standard methods and procedures used in the M3 International Forecasting
Competition held in 2000.
      The next section of this chapter discusses the hypotheses that were outlined in
section 2.12 in the context of the results of the three forecasting competitions. For
convenience the hypotheses have been restated in this Chapter. In subsequent sections
the evaluation of the hypotheses leads to a discussion of the theoretical implications
of this research. The Chapter closes with a discussion of the future directions of the
research on the Direct Set Assignment forecasting method and some concluding
remarks.


6.1 Evaluation of Research Hypotheses


      H01: The ex ante forecast accuracy of the DSA method will not change in
response to a change in the number of fuzzy sets, all other model parameters held
constant.
      In Competition #1 all parameters of the DSA method were held constant with
the exception of the fuzzy set parameter, which was allowed to vary between two and
twenty sets. This created DSA models (FS2) through (FS20). The (FS) model that
provided the highest observed accuracy for each series, subcategory and category was
identified for the purpose of creating the composite models DSAA, DSAB and DSAC.
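
      To illustrate what varying the fuzzy set parameter means in practice, the sketch
below partitions a series' observed range into k evenly spaced triangular fuzzy sets,
with k taking the values two through twenty. The triangular, evenly spaced partition is
an illustrative assumption; the DSA method's actual fuzzifier design is specified
earlier in this dissertation.

    import numpy as np

    def triangular_partition(lo: float, hi: float, k: int):
        # k evenly spaced triangular fuzzy sets spanning [lo, hi];
        # returns a function giving the membership of x in each set.
        centers = np.linspace(lo, hi, k)
        half_width = (hi - lo) / (k - 1)
        def membership(x: float) -> np.ndarray:
            return np.clip(1.0 - np.abs(x - centers) / half_width, 0.0, 1.0)
        return membership

    # Varying the fuzzy set parameter, as in models (FS2) through (FS20):
    for k in range(2, 21):
        mu = triangular_partition(100.0, 200.0, k)
        # mu(x) gives the degree to which x belongs to each of the k sets.
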
      Figure 6.1 – Figure 6.13 in this Chapter present the frequency with which the
various fuzzy set parameter values produced the most accurate forecasts for a given
series. These frequencies are based on the data from Table 5.1 – Table 5.3. The data in
these tables have been aggregated for all series used in the competition, for each of
the nine subcategories and for each of the three categories.
      The analysis of the bar charts for individual series, as well as for subcategories
and categories, suggests that the set parameter that produces the most accurate
forecast is series specific, very much in the way that the value of the parameter weight
in single exponential smoothing is specific to a particular series. For example, in
Figure 6.1 it can be seen that to produce the most accurate forecast for the one
hundred thirty five series in competition #1 it was necessary to use every parameter
value in the range of FS2-FS20.
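
      The analogy can be made concrete: in single exponential smoothing the weight
is tuned per series, typically by minimizing in-sample one-step-ahead error, as in the
sketch below (an illustrative grid search, not this study's code).

    import numpy as np

    def ses_sse(y: np.ndarray, alpha: float) -> float:
        # In-sample sum of squared one-step-ahead errors for
        # single exponential smoothing with weight alpha.
        level, sse = y[0], 0.0
        for obs in y[1:]:
            sse += (obs - level) ** 2
            level = alpha * obs + (1.0 - alpha) * level
        return sse

    def best_weight(y: np.ndarray) -> float:
        # Pick the weight that minimizes in-sample error for this
        # particular series -- the per-series tuning referred to above.
        grid = np.linspace(0.05, 0.95, 19)
        return float(min(grid, key=lambda a: ses_sse(y, a)))
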
      As such, hypothesis H01 was rejected and it has been concluded that changing
the fuzzy set parameter does affect forecast accuracy. The importance of this finding
is that it indicates that general criteria will need to be established for selecting the
fuzzy set parameter value in the DSA method.


Figure 6.1 Overall FS Frequency
Figure 6.2 Yearly Micro FS Frequency
Figure 6.3 Yearly Industry FS Frequency
Figure 6.4 Yearly Macro FS Frequency
Figure 6.5 Quarterly Micro FS Frequency
Figure 6.6 Quarterly Industry FS Frequency
Figure 6.7 Quarterly Macro FS Frequency
Figure 6.8 Monthly Micro FS Frequency
Figure 6.9 Monthly Industry FS Frequency
Figure 6.10 Monthly Macro FS Frequency
Figure 6.11 Yearly FS Frequency
Figure 6.12 Quarterly FS Frequency
Figure 6.13 Monthly FS Frequency


      H02: A fuzzy set parameter of seven in a DSA model will yield the most
accurate ex ante forecasts when compared to DSA models with a fuzzy set parameter
other than seven, in the range of set values from two to twenty, all other model
parameters held constant.


      Song (1991) suggested that a model with a fuzzy set parameter equal to seven
would produce the most accurate results. Song, whose background is in systems
control, may have observed that in those applications seven fuzzy sets is in fact
optimal, and by extension concluded that seven fuzzy sets would be optimal in
forecasting applications.
      This study was the first to investigate the impact on relative accuracy of
different values for the set parameter, as discussed above, as well as the first to
evaluate the claim that seven sets is the optimal set value. Figure 6.1 – Figure 6.13
illustrate that for individual series, subcategories and categories, seven is not an
optimal or universal value for the fuzzy set parameter. For example, Figure 6.1
illustrates that there is no single optimal set parameter value for the DSA method for
the one hundred thirty five individual series used in competition #1. In fact, the
parameter values that produced the most accurate forecasts most frequently were
FS11, followed by FS20, FS10 and FS8. Further, in Table B.1 in Appendix B, the
difference between the sMAPE values of the most accurate model (FS9) and the least
accurate model is in excess of fourteen percentage points.
      As such, hypothesis H02 was rejected and it has been concluded that a fuzzy set
parameter value of seven is neither an optimal nor a universal value. The importance
of this finding is that it is unlikely that there is a single optimal set parameter value,
and modelers should anticipate that it will be necessary to examine a range of set
parameter values to identify the (FS) model that will produce the most accurate
forecasts. Again, this is similar to the process that is followed in other extrapolative
methods to identify the model that will produce the most accurate forecasts, whether
in-sample or ex ante.


      H03: The ranking on forecast accuracy of the DSA method and the traditional
methods compared in this study will be the same for all accuracy measures
considered.


      Table 5.18 – Table 5.33 for Competition #2 and Table 5.35 – Table 5.40 for
Competition #3, excluding the summary tables, report the three models that provided
the highest observed forecast accuracy, at the subcategory, category and all series
levels of aggregation, for the seven accuracy measures, across the various forecast
horizons. An examination of these tables reveals that the ranking of the three best
performing methods in Competition #2 and Competition #3 differs, within a
particular forecast horizon, across the seven measures of forecast accuracy. Similar
results can be observed in the tables in Appendix C, containing the accuracy measures
for each method. For example, in Table 5.19, for the average of all six forecast
horizons, the order of the methods on the basis of their sMAPE values is DAA,
DAAW, and the order on the basis of Average Ranking is DAAW, DAA, THET.
      As such, hypothesis H03 was rejected and it has been concluded that the ranking
of the forecast accuracy of various methods will be different for the various accuracy
measures being used. The importance of this finding is that it reaffirms the findings of
earlier studies, including the M competitions, and thus adds support to the existing
body of knowledge on time series extrapolation.


      H04: The ranking on forecast accuracy of a combination of alternative
forecasting methods will be lower than that of the specific forecasting methods being
combined.


      In Table 5.34, the sMAPE values for three traditional methods, Single, Holt and
Dampen, and their combination, as well as for the DSA and Winters' methods and
their combinations, have been reported for various forecast horizons.
      Relative to the traditional methods, the combination outperforms Single and
Holt in their native form for all six sets of averaged forecast horizons. The
combination, however, does not outperform Dampen Trend Exponential Smoothing
for these same forecast horizons. For example, the sMAPE for S-H-D for the average
of forecast horizons 1-4 is 12.44, while the sMAPE for Dampen is 12.23. The other
absolute differences are of approximately the same magnitude.
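
      The sMAPE figures quoted above follow the symmetric MAPE used in the M3
competition, in which the absolute error at each horizon is scaled by the mean of the
actual and forecast values; a minimal sketch:

    import numpy as np

    def smape(actual: np.ndarray, forecast: np.ndarray) -> float:
        # Symmetric MAPE in percent: mean of 200*|A - F| / (|A| + |F|)
        # over the horizons being compared.
        return float(np.mean(200.0 * np.abs(actual - forecast)
                             / (np.abs(actual) + np.abs(forecast))))

Averaging the per-series values of this measure over, say, forecast horizons 1-4
produces figures of the kind compared above, such as the 12.44 and 12.23 for S-H-D
and Dampen.
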
      Relative to the combination of the DSA and Winters' methods, the findings are
mixed. For the DSAB-W and DSAC-W methods, the combination outperforms both
the DSA models and the Winters' model in their native form. In the case of the
DSAA-W model, the combination outperforms the Winters' method across all
forecast horizon averages; however, the combined method does not outperform the
DSAA model on any of the forecast horizon averages. For example, the sMAPE for
DSAA-W for the average of forecast horizons 1-4 is 10.63, while the sMAPE for the
DSAA model over the same forecast horizons is 10.00.
      Additionally, an examination of Table 5.38 indicates that the DSAA-W model
outperforms the DSAA and Winters' models on yearly data with a trend, while the
DSAA model outperforms the DSAA-W model on quarterly data with a trend. Further,
in Table 5.40, DSAA-W outperforms all other models, including the DSAA and
Winters' models, on all forty-five series in which a statistically significant trend is
present. As such, hypothesis H04 was not rejected, and it has been concluded that a
combination of methods does not perform at least as well, in all settings, as the
methods that have been combined do in their native form. This finding is
disappointing in that this study has not reaffirmed a finding that has been reaffirmed
in so many other accuracy studies. It may be that group difference testing could be
used to resolve this disparity.


      H05: The ranking on forecast accuracy of the DSA method and the traditional
methods compared in this study does not depend on the length of the forecast
horizon.


      Table 5.18 – Table 5.33 for Competition #2 and Table 5.35 – Table 5.40 for
Competition #3, excluding the summary tables, report the three models that provided
the highest observed forecast accuracy, at the subcategory, category and all series
levels of aggregation, for the seven accuracy measures, across the various forecast
horizons. An examination of these tables reveals that the ranking of the three best
performing methods in Competition #2 and Competition #3 differs, for a particular
accuracy measure, across forecast horizons and the averages of those forecast
horizons. Similar results can be observed in the tables in Appendix C, containing the
accuracy measures for each method. For example, in Table 5.19, for the sMAPE
accuracy measure, the three models with the highest observed accuracy for forecast
horizon 1 are DAAW, DAA and THES, and the three models for forecast horizon 3 are
DAA, DAAW and DABW. Further, for the 1-4 horizon average the models are ranked
DAAW, DAA and THES, and for the 1-6 horizon average the models are ranked DAA,
DAAW, DABW.
      As such, hypothesis H05 was rejected and it has been concluded that the ranking
of the forecast accuracy of various methods will be different across different forecast
horizons for a given measure of forecast accuracy. The importance of this finding is
twofold. Firstly, it reaffirms the findings of earlier studies, including the M
competitions, and thus adds support to the existing body of knowledge on time series
extrapolation. Secondly, it is important because it impacts forecast method selection.
The method that produces the most accurate forecast across forecast horizons 1-4
may well not be the method that produces the most accurate forecast over forecast
horizons 1-18. So, modelers should be certain that they select the model that will
perform best for the forecast horizons of interest.


      H06: The ranking on forecast accuracy of the time series specific DSA model
will be less than or equal to the ranking on forecast accuracy of both the subcategory
and category specific DSA models.


      H07: The ranking on forecast accuracy of the DSA method will be lower than
that of the traditional extrapolative methods to which it is being compared in this
study, by time series subcategory, by time series category and for all of the time series
evaluated in this study.


      In competition #2 the goal was to establish the relative accuracy of the DSAA,
DSAB and DSAC models; eight traditional methods; a combination of traditional
methods; and a combination of the DSA models and Winters' Exponential Smoothing.
The DSAA, DSAB and DSAC models were developed in competition #1 to help
answer the question: is there a fuzzy set parameter value, for a data type and time
interval subcategory, or a time interval category, that will yield more accurate results
for those levels of aggregation than it will for the series level of aggregation?
      Relative to the combination models, the traditional combination model was
designated S-H-D and the fuzzy traditional combination models were designated
DSAA-W, DSAB-W and DSAC-W. Relative forecast accuracy was assessed at the
subcategory, category and all series levels of aggregation in Competition #2.
      The results of this competition indicate that the forecasts represented by the
DSAA model, when assessed at the subcategory, category and all series levels of
aggregation, were more accurate in the aggregate than were those represented by the
DSAB and DSAC models. This finding is important but not necessarily surprising.
This result suggests that the DSA method will produce the most accurate forecasts
when it is used to produce ex ante forecasts for the forecast horizons of an individual
series, and that less accurate forecasts will be obtained if the modeler selects a single
fuzzy set parameter for all series of a particular data type or time interval. Certainly,
from the standpoint of the economy of the DSA method, it would have been
preferable to have a single fuzzy set parameter value that would produce the most
accurate forecasts for a subcategory or category of data. This would be particularly
true for manufacturing environments where forecasts for thousands of product
components must be produced on a routine basis. This finding reinforces the findings
in competition #1 that were used to test H01 and H02.
      In Competition #2 the DSAA model and its derivative, the DSAA-W model,
dominated the competition. In the subcategory competition the DSAA model provided
forecasts with the highest observed accuracy for five of the subcategories, while the
DSAA-W model provided forecasts with the highest observed accuracy for two of
the subcategories. In total these two models provided forecasts with the highest
observed accuracy for seven of nine subcategories.
      In the category competition the DSAA model and the DSAA-W model each
provided forecasts with the highest observed accuracy for one of the categories. In
total, these two models provided the highest observed accuracy for two of three
categories.
      In the All Series competition, the DSAA model provided forecasts with the
highest observed accuracy for all one hundred thirty five time series in this
competition. The DSAA-W method provided forecasts with the second highest
observed accuracy. The Theta method, which received such wide acclaim in the M3
competition, was ranked third in the All Series competition.
      As such, hypotheses H06 and H07 were rejected. It was concluded relative to
hypothesis H06 that the most accurate forecasts, for a large number of series in the
aggregate, will be obtained by first obtaining the most accurate forecasts produced by
the DSA method for the forecast horizons of individual series. The importance of this
finding is that it indicates to modelers that they should not assume that a particular
fuzzy set parameter will produce the most accurate forecasts for a given data type, but
that they should first produce forecasts across the forecast horizon of each series in
future studies of the DSA method.
      It was concluded relative to hypothesis H07 that the DSAA and DSAA-W
models produce forecasts that are in most cases more accurate than those produced
by the traditional extrapolative methods evaluated in this study. Remarkably, this
conclusion holds across a broad range of time series that differ by data type (micro,
industry, macro); time origin (yearly, quarterly, monthly); forecast horizon (six, eight,
eighteen); presence of a mix of time series components (average, trend and seasonal);
and, although it was not explicitly tested, training set length (fourteen, seventeen,
thirty-six, forty-one, fifty-six, fifty-one, fifty-six, one hundred sixteen and one
hundred twenty-six). The exceptions to this list are Yearly-Micro and Monthly-Macro
data.
      The importance of this finding is that the hypothesis presented in several prior
studies, that a statistically simple method robust to the fluctuations in real-life time
series could advance the search for improvements in the forecast accuracy of
extrapolative methods, appears to be supported by the performance of the DSA
method in this study. In so doing, the study has demonstrated that the DSA method is
a method on which future research can be justified. In addition, these findings provide
to those who wish to advance the research on the DSA method some specific facts
about its implementation that will help focus the direction of any future research on
this method.


      H08: The ranking on forecast accuracy of the DSA method will be lower than
that of the traditional extrapolative methods to which it is being compared in this
study, on those series in which a statistically significant trend is present.


      The fuzzifier module of the DSA method was designed to implicitly forecast the
trend component in a time series, without the need to explicitly forecast the trend
through decomposition or to modify a forecast with an external parameter, as is the
case with Holt's and Winters' methods, and with Damped Trend Exponential
Smoothing and the Theta method, respectively. In this way the DSA method can truly
be classified as a statistically simple extrapolative forecasting method.
      In competition #3 the DSAA and DSAA-W models were again top performers.
Together these models provided forecasts with the highest observed accuracy in two
of the three categories of time series and were among the three models that provided
the highest observed accuracy in the other category.
      In the All Series competition DSAA-W provided the forecasts with the highest
observed accuracy and DSAA provided forecasts with the second highest observed
accuracy.
      As such, hypothesis H08 was rejected and it has been concluded that the ranking
of the DSA method on forecast accuracy is at least as high as that of the alternative
traditional extrapolative methods evaluated in this study on time series in which a
statistically significant trend is present. The importance of this finding is that it
demonstrates that, at least for series in which only the average and trend components
were present, the new fuzzifier module provides the DSA method with the ability to
accurately forecast time series with a trend.


6.2 Limitations


      There are a number of limitations to the conclusions that can be drawn from this
study's findings, or for that matter from any forecasting competition. This includes in
particular those studies that rely on the procedures and data from the M competitions.
In section 4.6 a discussion has been provided that enumerates the concerns with the
forecasting competition methodology and the specific limitations that it imposes on
the findings of this study.
      In this study, the decision to rely on the data and procedures from the M3
competition brought with it a limitation. Specifically, the heterogeneous nature of all
of the series in the nine subcategories resulted in a sample of time series that were
difficult to differentiate other than on the basis of the criteria set forth by the
designers of the M3 competition, Makridakis and Hibon (2000). These series each
contained, for the most part, a mix of time series components, outliers and high
variation. For this reason it was not possible to test the DSA method's performance on
sub-samples of series containing a seasonal component, on series that were highly
volatile or on series that had only the average component, with the single exception of
a sub-sample of series with a trend component that was evaluated in Competition #3.
      Another problem specific to this current study was the omission of an All Series
competition within Competition #1. At the outset of the study the plan was to evaluate
the accuracy of the DSA method at the individual series, subcategory and category
levels of aggregation. Given the performance of the DSAB and DSAC models in
competition #2, however, it would have been interesting to assess the relative
accuracy of a DSA model representing an all series level of aggregation.


6.3 Contributions to Theory


      This study has made several important contributions to the body of knowledge
on business forecasting. The first, and most important, relates to the overall
performance of the DSA method. Fildes and Makridakis (1998), Makridakis and
Hibon (2000), and Fildes (2001) have argued that future research should focus on
improving the accuracy of extrapolative methods by taking into account the real-life
behavior of time series, that is, by developing methods that are robust to the
fluctuations that occur in real-life time series.
      The DSA method was introduced in response to this prior research, as a
statistically simple method that would be robust to the fluctuations in real-life data.
The superior performance of the DSA method when compared to traditional methods
may be a measure of this method's robustness, and in this way the findings of this
study support the hypothesis of those authors.
      This finding will, at the very least, add weight to the argument that statistically
simple methods produce forecasts that are at least as accurate as forecasts produced
by statistically sophisticated methods because they are robust. At most, it will change
the direction of extrapolative forecasting method development toward a focus on
methods that are robust to the fluctuations in real-life business data.
      In addition, these results have reconfirmed two important findings from the M
competitions and other accuracy studies: that different accuracy measures will rank
models differently in terms of relative accuracy, and that models' relative accuracy
will differ across the forecast horizons and forecast horizon averages of a particular
series.
      This study has also made important contributions to the body of knowledge on
the development of fuzzy logic based extrapolative methods. Firstly, the results of this
study provide justification for additional research on the DSA method in particular,
and on fuzzy logic extrapolative methods in general. Secondly, this study
demonstrated the use of the Mamdani Framework for the development of a fuzzy
logic based extrapolative method. This framework provides the structure and a
common platform for the future development of the DSA method or other fuzzy logic
based extrapolative methods. The hope is that research will focus on developing the
current modules to better match specific forecasting conditions or problems. Thirdly,
this study has demonstrated that the value of the fuzzy set parameter can be changed
to improve forecast accuracy, much in the same way that a parameter weight can be
changed in an exponential smoothing model to improve forecast accuracy. Finally, the
results of this study have demonstrated that there does not appear to be an optimal or
universal value for the fuzzy set parameter.


6.4 Future Directions


      The DSA method has been shown in this study to be capable of producing
forecasts that rival in accuracy the forecasts produced by many traditional and
routinely used statistically simple extrapolative forecasting methods. As such,
additional research on this method appears to be justified.
      The next step in the development of the DSA method is to establish criteria for
the a priori selection of the fuzzy set parameter. Work in this area has already begun,
and the focus is on a multi-pass analysis of the training set, in other words, on the use
of multiple validation sets. The goal is to establish criteria that will identify in
advance the value of the fuzzy set parameter that will produce the most accurate
forecasts. The results of this current study are serving as the platform for this new
research.
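
      A sketch of the multi-pass idea is given below: each candidate fuzzy set
parameter is scored on several validation sets carved from successive origins within
the training data, and the value with the best average error is selected. The function
fit_forecast(train, k, h), standing in for a DSA model fit with k fuzzy sets, and the use
of sMAPE as the selection criterion are both hypothetical placeholders, since this
research is still in progress.

    import numpy as np

    def smape(actual, forecast):
        return np.mean(200.0 * np.abs(actual - forecast)
                       / (np.abs(actual) + np.abs(forecast)))

    def select_fuzzy_sets(y, fit_forecast, h=4, candidates=range(2, 21),
                          passes=3):
        # Multi-pass (rolling-origin) validation: score each candidate
        # set count on `passes` successive validation windows, then
        # keep the candidate with the lowest average error.
        scores = {}
        for k in candidates:
            errors = []
            for p in range(passes):
                cut = len(y) - h * (passes - p)
                train, valid = y[:cut], y[cut:cut + h]
                errors.append(smape(valid, fit_forecast(train, k, h)))
            scores[k] = np.mean(errors)
        return min(scores, key=scores.get)
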
      Although the DSA method performed remarkably well overall, there were two
subcategories and one category, in competition #2, in which it did not provide
forecasts that were as accurate as those it provided for other subcategories and
categories of data. The subcategories were yearly and monthly macroeconomic data,
and the category result was attributable to the DSA method's performance in the
yearly macroeconomic data subcategory. The results in the two subcategories of
macroeconomic data are attributable to the nature of the trend in these time series.
      Unlike the trend in the majority of the time series in the other subcategories, in
which the trend component, either growth or decay, is damped in the validation data
set, the trend component in the macroeconomic data exhibits constant growth, and in
some instances constant decay. It is this constant trend that appears to adversely affect
the DSA method's performance.
      As a first attempt to address this problem, a fuzzy growth factor was developed,
based on the end points of the training set. This growth factor was expressed in fuzzy
set units and worked remarkably well. In fact, when applied to the yearly
macroeconomic subcategory, the DSAA model provided the most accurate forecasts
of all models in the competition for this subcategory of data. The problem is that this
approach, in which the trend is explicitly accounted for, is less desirable than an
approach in which the trend is forecasted implicitly. It is recommended, therefore, that
the DSA method's defuzzifier and inference modules be modified to allow for output
sets that are different from the input sets. The output sets would implicitly forecast the
trend.
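
      The exact construction of the growth factor is not restated here, but the
description, based on the end points of the training set and expressed in fuzzy set
units, suggests something like the following sketch, in which the rise from the first to
the last training observation is converted into set widths per period and used to shift
the forecast's set position. Both function names and the shifting scheme are
illustrative assumptions.

    import numpy as np

    def fuzzy_growth_factor(y: np.ndarray, k: int) -> float:
        # Growth per period, measured in fuzzy set units: the rise from
        # the first to the last training observation, divided by the
        # width of one of the k sets spanning the observed range.
        set_width = (y.max() - y.min()) / (k - 1)
        return (y[-1] - y[0]) / set_width / (len(y) - 1)

    def shift_set_position(position: float, factor: float, h: int) -> float:
        # Shift a defuzzified set position h periods forward at the
        # constant growth rate -- explicit trend handling, which the
        # text notes is less desirable than forecasting the trend
        # implicitly through the output sets.
        return position + factor * h
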
      As product life cycles become shorter, resulting from rapid technology
obsolescence and increased competition, there will be a need for forecasting methods
that can provide accurate forecasts, for various forecast horizons, based on small
training data sets. Although it was not hypothesized in this study that the DSA method
would provide forecasts that were at least as accurate as traditional methods
regardless of the length of the training set, the results of this study, as discussed in
section 6.1, suggest that the DSA method is not impacted by training set length. To
verify this observation, an empirical investigation should be undertaken to examine
the impact of training set length on the relative accuracy of the DSA method. It should
be noted that the DSA method is capable of producing forecasts using training sets
that have as few as three observations.


6.5 Conclusions


      The results of this study demonstrate that the observed forecast accuracy of the
DSA method is at least as good as, and in many cases better than, that of the
traditional models to which it was compared, across a heterogeneous selection of time
series.
      The DSA method's performance under these various conditions is likely
attributable to two factors. Firstly, the method is statistically simple and forecasts the
various components of the time series implicitly. Secondly, and equally important, is
the role played by fuzzy logic in this traditional extrapolative method.
      Fuzzy logic has held a preeminent position in the field of systems control for
over two decades. The success of fuzzy logic in these applications has been attributed
to its robustness to the anomalies that exist in real-life data, resulting from a rougher
modeling approach than that of traditional methods, and to the nonlinear mapping of
inputs to outputs that it provides. The Direct Set Assignment extrapolative forecasting
method was developed within the Mamdani framework and was designed to mimic
the data processing approach of a fuzzy logic controller.
      While the DSA method has performed admirably in this first comparison to
other statistically simple extrapolative forecasting methods, there remain many
opportunities to further improve the accuracy of the DSA method. Specific
suggestions for future research have been provided in section 6.4.




                                   APPENDIX A
                            DESCRIPTIVE STATISTICS


Table 1 Descriptive Statistics


                                   APPENDIX B
                                 COMPETITION #1


Table B.1 Yearly sMAPE Values
Table B.2 Quarterly sMAPE Values
Table B.3 Monthly sMAPE Values
Table B.4 Yearly Average Rank Values
Table B.5 Quarterly Average Rank Values
Table B.6 Monthly Average Rank Values
Table B.7 Average symmetric MAPE: yearly micro data
Table B.8 MedAPE: yearly micro data
Table B.9 Average Rank: yearly micro data
Table B.10 MAD: yearly micro data
Table B.11 medRAE: yearly micro data
Table B.12 % Better: yearly micro data
Table B.13 Benchmark: yearly micro data
Table B.14 Average symmetric MAPE: yearly industry data
Table B.15 MedAPE: yearly industry data
Table B.16 Average Rank: yearly industry data
Table B.17 MAD: yearly industry data
Table B.18 medRAE: yearly industry data
Table B.19 % Better: yearly industry data
Table B.20 Benchmark: yearly industry data
Table B.21 Average symmetric MAPE: yearly macro data
Table B.22 MedAPE: yearly macro data
Table B.23 Average Rank: yearly macro data
Table B.24 MAD: yearly macro data
Table B.25 medRAE: yearly macro data
Table B.26 % Better: yearly macro data
Table B.27 Benchmark: yearly macro data
Table B.28 Average symmetric MAPE: quarterly micro data
Table B.29 MedAPE: quarterly micro data
Table B.30 Average Rank: quarterly micro data
Table B.31 MAD: quarterly micro data
Table B.32 medRAE: quarterly micro data
Table B.33 % Better: quarterly micro data
Table B.34 Benchmark: quarterly micro data
Table B.35 Average symmetric MAPE: quarterly industry data
Table B.36 MedAPE: quarterly industry data
Table B.37 Average Rank: quarterly industry data
Table B.38 MAD: quarterly industry data
Table B.39 medRAE: quarterly industry data
Table B.40 % Better: quarterly industry data
Table B.41 Benchmark: quarterly industry data
Table B.42 Average symmetric MAPE: quarterly macro data
Table B.43 MedAPE: quarterly macro data
Table B.44 Average Rank: quarterly macro data
Table B.45 MAD: quarterly macro data
Table B.46 medRAE: quarterly macro data
Table B.47 % Better: quarterly macro data
Table B.48 Benchmark: quarterly macro data
Table B.49 Average symmetric MAPE: monthly micro data
Table B.50 MedAPE: monthly micro data
Table B.51 Average Rank: monthly micro data
Table B.52 MAD: monthly micro data
Table B.53 medRAE: monthly micro data
Table B.54 % Better: monthly micro data
Table B.55 Benchmark: monthly micro data
Table B.56 Average symmetric MAPE: monthly industry data
Table B.57 MedAPE: monthly industry data
Table B.58 Average Rank: monthly industry data
Table B.59 MAD: monthly industry data
Table B.60 medRAE: monthly industry data
Table B.61 % Better: monthly industry data
Table B.62 Benchmark: monthly industry data
Table B.63 Average symmetric MAPE: monthly macro data
Table B.64 MedAPE: monthly macro data
Table B.65 Average Rank: monthly macro data
Table B.66 MAD: monthly macro data
Table B.67 medRAE: monthly macro data
Table B.68 % Better: monthly macro data
Table B.69 Benchmark: monthly macro data
Table B.70 Average symmetric MAPE: yearly all data
Table B.71 MedAPE: yearly all data
Table B.72 Average Rank: yearly all data
Table B.73 MAD: yearly all data
Table B.74 medRAE: yearly all data
Table B.75 % Better: yearly all data
Table B.76 Benchmark: yearly all data
Table B.77 Average symmetric MAPE: quarterly all data
Table B.78 MedAPE: quarterly all data
Table B.79 Average Rank: quarterly all data
Table B.80 MAD: quarterly all data
Table B.81 medRAE: quarterly all data
Table B.82 % Better: quarterly all data
Table B.83 Benchmark: quarterly all data
Table B.84 Average symmetric MAPE: monthly all data
Table B.85 MedAPE: monthly all data
Table B.86 Average Rank: monthly all data
Table B.87 MAD: monthly all data
Table B.88 medRAE: monthly all data
Table B.89 % Better: monthly all data
Table B.90 Benchmark: monthly all data


                                   APPENDIX C
                                COMPETITION #2


Table C.1 Average symmetric MAPE: yearly micro data
Table C.2 MedAPE: yearly micro data
Table C.3 Average Rank: yearly micro data
Table C.4 MAD: yearly micro data
Table C.5 medRAE: yearly micro data
Table C.6 % Better: yearly micro data
Table C.7 Benchmark: yearly micro data
Table C.8 Average symmetric MAPE: yearly industry data
Table C.9 MedAPE: yearly industry data
Table C.10 Average Rank: yearly industry data
Table C.11 MAD: yearly industry data
Table C.12 medRAE: yearly industry data
Table C.13 % Better: yearly industry data
Table C.14 Benchmark: yearly industry data
Table C.15 Average symmetric MAPE: yearly macro data
Table C.16 MedAPE: yearly macro data
Table C.17 Average Rank: yearly macro data
Table C.18 MAD: yearly macro data
Table C.19 medRAE: yearly macro data
Table C.20 % Better: yearly macro data
Table C.21 Benchmark: yearly macro data
Table C.22 Average symmetric MAPE: quarterly micro data
Table C.23 MedAPE: quarterly micro data
Table C.24 Average Rank: quarterly micro data
Table C.25 MAD: quarterly micro data
Table C.26 medRAE: quarterly micro data
Table C.27 % Better: quarterly micro data
Table C.28 Benchmark: quarterly micro data
Table C.29 Average symmetric MAPE: quarterly industry data
Table C.30 MedAPE: quarterly industry data
Table C.31 Average Rank: quarterly industry data
Table C.32 MAD: quarterly industry data
Table C.33 medRAE: quarterly industry data
Table C.34 % Better: quarterly industry data
Table C.35 Benchmark: quarterly industry data
Table C.36 Average symmetric MAPE: quarterly macro data
Table C.37 MedAPE: quarterly macro data
Table C.38 Average Rank: quarterly macro data
Table C.39 MAD: quarterly macro data
Table C.40 medRAE: quarterly macro data
Table C.41 % Better: quarterly macro data
Table C.42 Benchmark: quarterly macro data
Table C.43 Average symmetric MAPE: monthly micro data
Table C.44 MedAPE: monthly micro data
Table C.45 Average Rank: monthly micro data
Table C.46 MAD: monthly micro data
Table C.47 medRAE: monthly micro data
Table C.48 % Better: monthly micro data
Table C.49 Benchmark: monthly micro data
Table C.50 Average symmetric MAPE: monthly industry data
Table C.51 MedAPE: monthly industry data
Table C.52 Average Rank: monthly industry data
Table C.53 MAD: monthly industry data
Table C.54 medRAE: monthly industry data
Table C.55 % Better: monthly industry data
Table C.56 Benchmark: monthly industry data
Table C.57 Average symmetric MAPE: monthly macro data
Table C.58 MedAPE: monthly macro data
Table C.59 Average Rank: monthly macro data
Table C.60 MAD: monthly macro data
Table C.61 medRAE: monthly macro data
Table C.62 % Better: monthly macro data
Table C.63 Benchmark: monthly macro data
Table C.64 Average symmetric MAPE: yearly all data
Table C.65 MedAPE: yearly all data
Table C.66 Average Rank: yearly all data
Table C.67 MAD: yearly all data
Table C.68 medRAE: yearly all data
Table C.69 % Better: yearly all data
Table C.70 Benchmark: yearly all data
Table C.71 Average symmetric MAPE: quarterly all data
Table C.72 MedAPE: quarterly all data
Table C.73 Average Rank: quarterly all data
Table C.74 MAD: quarterly all data
Table C.75 medRAE: quarterly all data
Table C.76 % Better: quarterly all data
Table C.77 Benchmark: quarterly all data
Table C.78 Average symmetric MAPE: monthly all data
Table C.79 MedAPE: monthly all data
Table C.80 Average Rank: monthly all data
Table C.81 MAD: monthly all data
Table C.82 medRAE: monthly all data
Table C.83 % Better: monthly all data
Table C.84 Benchmark: monthly all data
Table C.85 Average symmetric MAPE: overall data
Table C.86 MedAPE: overall data
Table C.87 Average Rank: overall data
Table C.88 MAD: overall data
Table C.89 medRAE: overall data
Table C.90 % Better: overall data
Table C.91 Benchmark: overall data


                                    APPENDIX D
                                COMPETITION #3


Table D.1 Average symmetric MAPE: yearly micro data
Table D.2 MedAPE: yearly micro data
Table D.3 Average Rank: yearly micro data
Table D.4 MAD: yearly micro data
Table D.5 medRAE: yearly micro data
Table D.6 % Better: yearly micro data
Table D.7 Benchmark: yearly micro data
Table D.8 Average symmetric MAPE: yearly industry data
Table D.9 MedAPE: yearly industry data
Table D.10 Average Rank: yearly industry data
Table D.11 MAD: yearly industry data
Table D.12 medRAE: yearly industry data
Table D.13 % Better: yearly industry data
Table D.14 Benchmark: yearly industry data
Table D.15 Average symmetric MAPE: yearly macro data
Table D.16 MedAPE: yearly macro data
Table D.17 Average Rank: yearly macro data
Table D.18 MAD: yearly macro data
Table D.19 medRAE: yearly macro data
Table D.20 % Better: yearly macro data
Table D.21 Benchmark: yearly macro data
Table D.22 Average symmetric MAPE: quarterly micro data
Table D.23 MedAPE: quarterly micro data
Table D.24 Average Rank: quarterly micro data
Table D.25 MAD: quarterly micro data
Table D.26 medRAE: quarterly micro data
Table D.27 % Better: quarterly micro data
Table D.28 Benchmark: quarterly micro data
Table D.29 Average symmetric MAPE: quarterly industry data
Table D.30 MedAPE: quarterly industry data
Table D.31 Average Rank: quarterly industry data
Table D.32 MAD: quarterly industry data
Table D.33 medRAE: quarterly industry data
Table D.34 % Better: quarterly industry data
Table D.35 Benchmark: quarterly industry data
Table D.36 Average symmetric MAPE: quarterly macro data
Table D.37 MedAPE: quarterly macro data
Table D.38 Average Rank: quarterly macro data
Table D.39 MAD: quarterly macro data
Table D.40 medRAE: quarterly macro data
Table D.41 % Better: quarterly macro data
Table D.42 Benchmark: quarterly macro data
Table D.43 Average symmetric MAPE: monthly micro data
Table D.44 MedAPE: monthly micro data
Table D.45 Average Rank: monthly micro data
Table D.46 MAD: monthly micro data
Table D.47 medRAE: monthly micro data
Table D.48 % Better: monthly micro data
Table D.49 Benchmark: monthly micro data
Table D.50 Average symmetric MAPE: monthly industry data
Table D.51 MedAPE: monthly industry data
Table D.52 Average Rank: monthly industry data
Table D.53 MAD: monthly industry data
Table D.54 medRAE: monthly industry data
Table D.55 % Better: monthly industry data
Table D.56 Benchmark: monthly industry data
Table D.57 Average symmetric MAPE: monthly macro data
Table D.58 MedAPE: monthly macro data
Table D.59 Average Rank: monthly macro data
Table D.60 MAD: monthly macro data
Table D.61 medRAE: monthly macro data
Table D.62 % Better: monthly macro data
Table D.63 Benchmark: monthly macro data
Table D.64 Average symmetric MAPE: yearly trend all data
Table D.65 MedAPE: yearly trend all data
Table D.66 Average Rank: yearly trend all data
Table D.67 MAD: yearly trend all data
Table D.68 medRAE: yearly trend all data
Table D.69 % Better: yearly trend all data
Table D.70 Benchmark: yearly trend all data
Table D.71 Average symmetric MAPE: quarterly trend all data
Table D.72 MedAPE: quarterly trend all data
Table D.73 Average Rank: quarterly trend all data
Table D.74 MAD: quarterly trend all data
Table D.75 medRAE: quarterly trend all data
Table D.76 % Better: quarterly trend all data
Table D.77 Benchmark: quarterly trend all data
Table D.78 Average symmetric MAPE: monthly trend all data
Table D.79 MedAPE: monthly trend all data
Table D.80 Average Rank: monthly trend all data
Table D.81 MAD: monthly trend all data
Table D.82 medRAE: monthly trend all data
Table D.83 % Better: monthly trend all data
Table D.84 Benchmark: monthly trend all data
Table D.85 Average symmetric MAPE: Trend All data
Table D.86 MedAPE: Trend All data
Table D.87 Average Rank: Trend All data
Table D.88 MAD: Trend All data
Table D.89 medRAE: Trend All data
Table D.90 % Better: Trend All data
Table D.91 Benchmark: Trend All data



