FORECASTING:

FURTHER COMMENTS

AND

SOME NOTES ON QUANTIFYING UNCERTAINTY
[Figure: SALES of SINGLES (in Millions), 1970-2000, with three fitted trend lines. The equation of the cubic trend shown on the plot is y = 0.032x^3 - 188.541x^2 + 374302.894x - 247694272.834.]


At the end of this week's lab you plotted the above series, the numbers of singles sold each year from 1973 to 1997, and fitted three trend lines. These are shown: a LINEAR TREND, a QUADRATIC TREND and a CUBIC TREND. The equation of the cubic is also shown on the plot. There were questions after the lab about the use of polynomials of different orders or degrees. The degree of a polynomial is simply the highest power of x in the expression for the polynomial; in the case of the cubic shown, the degree is 3. Polynomials are very smooth curves with the property that as we increase the degree we increase the number of turning points, i.e. maxima and minima, in the curve. The cubic above (degree 3) has two, the quadratic (degree 2) has one. The effect of choosing higher degrees for our fitted polynomials is that we always get a better fit to the data, because each extra degree introduces a little more flexibility, or "bendiness". We can see this happen above: the cubic gives a much better fit than the quadratic. So, why not choose the highest degree available? Surely this gives the best fit to our data?
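Indeed it does, at least in-sample. A minimal Python sketch (using NumPy on a synthetic series, since the yearly lab values are not reproduced here) shows the effect: the sum of squared fitting errors can only fall as the degree rises.

```python
# Sketch only: fit polynomials of increasing degree to a synthetic yearly
# series and watch the in-sample fit improve. Not the lab data.
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(1973, 1998)                                  # 25 "years"
y = 70 - 1.5 * (x - 1985) + rng.normal(0, 5, x.size)       # synthetic sales-like values

for degree in (1, 2, 3, 6):
    coeffs = np.polyfit(x - x.mean(), y, degree)           # centre x for numerical stability
    fitted = np.polyval(coeffs, x - x.mean())
    sse = np.sum((y - fitted) ** 2)                        # sum of squared errors
    print(f"degree {degree}: SSE = {sse:.1f}")             # SSE never rises with the degree
```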
To answer these questions, begin by looking at the data in the plot above and asking yourself what you would guess will happen next. Sales have been rising for the last 5 years, so perhaps they will continue to do so. By how much? That is difficult to say, but we note that over the last 25 years sales of singles always lay between about 50 and 90 millions per year. Thus, unless there were a serious change in the market or the technology, or both, it seems unlikely that sales will rise much above 100 millions.
Now consider the plot below and note the forecasts from our BEST FITTING CUBIC.
[Figure: SALES of SINGLES (in Millions), 1970-2010, with the cubic trend extrapolated beyond 1997 to give forecasts for the following ten years.]


The cubic trend forecasts are simply unbelievable, at least given current knowledge. The 5-year forecast is about 150 millions per year; the 10-year forecast exceeds 300 millions. These are not just a little unlikely, but outrageously so. This illustrates the problem with using polynomial trends to forecast: they can be made to FIT the HISTORICAL DATA very well, but this is no guide to how well they will perform as forecasts. The mathematical reason is that when we forecast from a trend we evaluate the function (here a polynomial) at a future time, i.e. at a larger x-value, and the highest power of x grows fastest as x increases, so it quickly dominates the forecast. For example, if we use the cubic above, the forecast for the year 2000 contains a term of the form 2000*2000*2000. So, how can we forecast sales of singles from these data using a trend curve?
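Before turning to that question, here is a sketch of the mechanism just described. The yearly values are invented purely for illustration (a fall followed by a recovery, roughly the shape of the singles series, but not the real figures); fitting a cubic and evaluating it at 1998, 2003 and 2008 shows how quickly the highest power of x takes over.

```python
# Sketch only: a cubic fitted to invented "fall then recover" data, then
# extrapolated. The x**3 term grows fastest, so it soon dominates the forecasts.
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1973, 1998)
sales = np.where(years < 1992, 90 - 1.8 * (years - 1973), 56 + 6.0 * (years - 1992))
sales = sales + rng.normal(0, 2, years.size)      # invented values, not the real data

t = years - 1990                                  # small x-values, as in "YEAR = 1990 + x"
cubic = np.polyfit(t, sales, 3)                   # best-fitting cubic trend

print("last observed value:", round(sales[-1], 1))
for future_year in (1998, 2003, 2008):
    print(future_year, round(np.polyval(cubic, future_year - 1990), 1))
```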
[Figure: SALES of SINGLES (in Millions) for 1992-1997, plotted against x where YEAR = 1990 + x, with the fitted linear trend y = 7.0x + 36.6 (dashed) and the fitted exponential trend y = 42.21e^(0.10x) (solid).]



Since we wish to use a trend curve, we examine the data for some consistency near the end of the series, i.e. near the current time. Clearly, the last few years have shown a consistent rise, so we might try to fit a suitable trend to those data only. A straight line fitted to the last six years (1992-1997) is shown above as the dashed line, and its equation is given at the top left of the plot. Thus, for each unit of x, i.e. for each year, y increases by 7 units, i.e. sales increase by 7 millions. Straight lines imply a constant increase every year; in this case, it is 7 millions per year. The solid line shows the best-fitting exponential trend. Such trends are often used because they imply change by a constant percentage each year. In this case, we see from the equation that the chosen trend implies that sales are growing at the rate of 10% per year. Given these six years' data, both trends appear plausible, and both appear to be close to the data points in the plot. Statistically, they FIT the data equally well. So, do we need to choose between them and, if so, how? The answer to the first question is YES: they will yield quite different forecasts, as we see below.
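For reference, both trends can be fitted with a few lines of Python: the linear trend by ordinary least squares, and the exponential trend by fitting a straight line to the logarithm of the data. The sales values below are placeholders standing in for the actual 1992-1997 figures.

```python
# Sketch only: fit a linear and an exponential trend to six yearly values.
import numpy as np

x = np.array([2, 3, 4, 5, 6, 7], dtype=float)        # YEAR = 1990 + x, i.e. 1992-1997
y = np.array([50.0, 56.0, 63.0, 71.0, 78.0, 87.0])   # placeholder sales (millions)

# Linear trend y = a*x + b: a constant increase of `a` million per year.
a, b = np.polyfit(x, y, 1)

# Exponential trend y = A*exp(r*x): a constant growth rate of about 100*r % per
# year, fitted as a straight line through log(y).
r, logA = np.polyfit(x, np.log(y), 1)
A = np.exp(logA)

print(f"linear trend:      y = {a:.2f}x + {b:.2f}")
print(f"exponential trend: y = {A:.2f} * exp({r:.3f}x)")
```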
[Figure: SALES of SINGLES (in Millions), with the linear trend y = 7.0x + 36.6 and the exponential trend y = 42.21e^(0.10x) projected forward to x = 18, i.e. YEAR = 2008.]

LINEAR TREND FORECASTS

FORECAST OF 1998'S SINGLES SALES:

    6.98 * 8 + 36.64 = 92.48 Million

FORECAST OF 2008'S SINGLES SALES:

    6.98 * 18 + 36.64 = 162.28 Million


EXPONENTIAL TREND FORECASTS

FORECAST OF 1998'S SINGLES SALES:

    42.21 * e^(0.1*8) = 93.94 Million

FORECAST OF 2008'S SINGLES SALES:

    42.21 * e^(0.1*18) = 255.36 Million
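These calculations are easy to reproduce; a minimal sketch using the two fitted equations:

```python
# Forecasts from the two fitted trend equations, with x = years since 1990.
import math

def linear_trend(x):
    return 6.98 * x + 36.64            # constant rise of about 7 million per year

def exponential_trend(x):
    return 42.21 * math.exp(0.1 * x)   # roughly 10% growth per year

for year, x in ((1998, 8), (2008, 18)):
    print(year, round(linear_trend(x), 2), round(exponential_trend(x), 2))
```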
[Figure: SALES of LPs, CASSETTES, CDs and SINGLES (in Millions), 1973-1997.]




How else might we forecast sales of singles? Interestingly, the plot of all types of records shown above suggests that sales of singles are fairly flat (certainly compared with the meteoric rise of CD sales). But note also that sales of singles were falling from the late '70s and only picked up again in 1992, when sales of CDs accelerated their growth. Perhaps, then, it is worth linking the sales of these two together, reflecting the idea that the rise in singles sales is "on the coat-tails" of the rise in CD sales, again using the last 6 years, 1992-1997.
[Figure: SALES of SINGLES (in Millions) for 1992-1997 plotted against SALES OF CDs in PREVIOUS YEAR (1991-1996), with the fitted line y = 0.34x + 31.51, i.e. SINGLES SOLD NEXT YEAR = 0.34 * CDs SOLD THIS YEAR + 31.5 Millions.]




The above graph shows a plot of singles sales during the years 1992-1997 against CD sales for the years 1991-1996, so that each point pairs singles sales in one year with CD sales in the previous year. Why not use the same year? Simply because it would not be helpful for forecasting: if the relationship we use has both series in the same year, then in order to forecast next year's singles sales we would need to know next year's CD sales, or forecast them first. However, if there is a relationship between CD sales this year and singles sales next year, then we need know only this year's CD sales to forecast next year's singles sales. We cannot say that any "cause and effect" relationship exists here, but we can see that the data lie very close to a straight line during these years, and hope that this relationship continues into the future.
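A sketch of how such a lagged regression can be fitted and used in Python; the sales figures below are placeholders standing in for the real 1991-1997 values.

```python
# Sketch only: regress singles sold in year t on CDs sold in year t-1.
import numpy as np

cd_sales     = np.array([60.0, 72.0, 90.0, 108.0, 128.0, 148.0, 158.8])  # 1991-1997, placeholder
single_sales = np.array([52.0, 56.0, 61.0,  66.0,  73.0,  80.0,  87.0])  # 1991-1997, placeholder

x = cd_sales[:-1]       # CDs sold THIS year (1991-1996)
y = single_sales[1:]    # singles sold NEXT year (1992-1997)

slope, intercept = np.polyfit(x, y, 1)
print(f"singles(next year) = {slope:.2f} * CDs(this year) + {intercept:.2f}")

# Forecast next year's singles from the latest CD figure.
print("forecast:", round(slope * cd_sales[-1] + intercept, 1), "million")
```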
SINGLES SOLD NEXT YEAR = 0.34 * CDs SOLD THIS YEAR + 31.5 Millions

CDs SOLD THIS YEAR = 158.8 Million

FORECAST OF NEXT YEAR'S SINGLES SALES:

0.34 * 158.8 + 31.5 = 54.0 + 31.5 = 85.5 Million



COMPARE THIS FORECAST WITH THE OTHER TWO

                      1998       2008

LINEAR TREND:         92.5      162.3
EXPONENTIAL TREND:    93.9      255.4
WHICH TREND DO WE USE TO FORECAST?

THIS IS A SERIOUSLY DIFFICULT PROBLEM IN PRACTICE ---

IT REQUIRES BOTH SKILL AND EXPERIENCE ----



BUT THERE ARE SOME USEFUL RULES TO BEAR IN MIND
   GOODNESS OF FIT TO PAST HISTORY IS A USEFUL GUIDE

    BUT

   IT MUST NOT BE THE FINAL OR ONLY CRITERION


   POLYNOMIALS OF HIGH DEGREE WILL FIT PAST DATA WELL

    AND SO

   MAY GIVE A GOOD EXPLANATION OF PAST TRENDS

    BUT

   ARE VERY RARELY USED FOR FORECASTING
   THE TRENDS MOST COMMONLY USED FOR FORECASTING ARE

   FLAT TREND --- INDICATING NO CHANGE OF LEVEL OVER TIME

   LINEAR TREND --- INDICATING CONSTANT RATE OF CHANGE

    AND

   EXPONENTIAL TREND
               --- CONSTANT RATE OF PERCENTAGE CHANGE
THE IMPORTANT POINT IS

THAT THE TREND WE CHOOSE TO PROJECT INTO THE FUTURE WILL
REFLECT OUR BELIEF THAT THE FUTURE BEHAVIOUR OF THE DATA
WILL BE THE SAME IN KIND AS IT WAS IN THE PAST, AND WILL NOT
CHANGE OVER THE HORIZON FOR WHICH WE FORECAST.


THIS IS NOT A STATISTICAL CRITERION --- AND SO CANNOT BE
TESTED IN A REALLY OBJECTIVE WAY.


CHOICE OF A SENSIBLE TREND REQUIRES CONSIDERABLE CARE AND
REASONING, AND FINALLY, SOME FAITH.
SUPPOSE WE CAN CHOOSE A TREND AND FEEL CONFIDENT
THAT IT WILL BE CONTINUED IN THE FUTURE AS IT WAS IN THE
PAST …..

OUR FORECASTS ARE THE VALUES OF THE TREND AT THE FUTURE
TIMES ….
BUT ….
HOW CLOSE CAN WE EXPECT THEM TO BE TO THE ACTUAL VALUES
WHICH OCCUR THEN?

CLEARLY, THERE IS UNCERTAINTY --- BUT OUR FORECASTS ARE
REALLY OF USE ONLY IF WE CAN QUANTIFY THIS UNCERTAINTY,
OTHERWISE WE WILL NOT HAVE ANY IDEA HOW ACCURATE OR
RELIABLE THEY MAY BE.
WE CANNOT SOLVE THIS PROBLEM FOR FORECASTING AT THIS
STAGE OF THE CLASS.

HOWEVER, IT IS A GENERAL PROBLEM IN ALL STATISTICAL
INFERENCE AND WE WILL MEET IT SEVERAL TIMES IN THE SECOND
SEMESTER, SO WE BEGIN DISCUSSION HERE.

ALL FORECASTS ARE WRONG. WE CALL THE AMOUNT BY WHICH
THEY DEVIATE FROM THE CORRECT VALUE THE FORECAST ERROR.

WE WISH TO QUANTIFY THE UNCERTAINTY OR VARIABILITY IN
FORECAST ERRORS.

WE BEGIN BY CONSIDERING A MORE GENERAL PROBLEM.
THE AMBULANCE SERVICE IN A RURAL AREA IS PROVIDED FROM A
SINGLE CENTRE AND FOR MANY YEARS HAS BEEN ABLE TO
GUARANTEE AN AVERAGE RESPONSE TIME, i.e. THE TIME TO REACH
SOMEONE MAKING AN EMERGENCY CALL, OF 10 MINUTES.

HOWEVER, IN THE LAST TWO YEARS THE AREA HAS BECOME
POPULAR WITH COMMUTERS AND SO THERE HAS BEEN A LOT OF
NEW BUILDING AND SUBSTANTIAL CHANGES IN TRAFFIC PATTERNS.

THE SERVICE TAKES A SAMPLE OF 200 CALLS FROM ITS RECORDS
FOR 1998 AND WISHES TO USE THEM TO MEASURE ITS
PERFORMANCE AGAINST THE “10 MINUTE AVERAGE” CLAIM.
       RESPONSE TIMES (in MINUTES)
 3.4    5.7    6.6    6.9    7.0    7.0    7.2    7.4    7.4    7.6
 7.6    7.7    8.0    8.1    8.2    8.2    8.2    8.3    8.4    8.5
 8.6    8.7    8.7    8.7    8.8    8.8    8.9    8.9    9.0    9.0
 9.0    9.1    9.2    9.2    9.3    9.3    9.4    9.5    9.6    9.6
 9.6    9.6    9.6    9.7    9.7    9.7    9.7    9.9    9.9    9.9
 9.9    9.9    9.9   10.0   10.0   10.1   10.1   10.1   10.2   10.2
10.2   10.2   10.2   10.2   10.3   10.3   10.4   10.4   10.4   10.4
10.4   10.5   10.5   10.5   10.5   10.5   10.5   10.5   10.5   10.5
10.6   10.6   10.6   10.7   10.8   10.8   10.8   10.9   10.9   10.9
11.0   11.0   11.0   11.0   11.0   11.1   11.1   11.1   11.2   11.2
11.2   11.2   11.2   11.2   11.3   11.3   11.3   11.3   11.4   11.4
11.4   11.4   11.5   11.5   11.5   11.5   11.5   11.6   11.6   11.6
11.6   11.6   11.7   11.7   11.7   11.8   11.8   11.8   11.8   11.8
11.9   11.9   11.9   12.0   12.1   12.1   12.1   12.2   12.2   12.2
12.3   12.3   12.3   12.3   12.3   12.4   12.5   12.5   12.5   12.6
12.6   12.7   12.7   12.7   12.7   12.7   12.8   12.8   12.8   12.8
12.9   12.9   12.9   12.9   13.0   13.0   13.1   13.1   13.1   13.1
13.1   13.1   13.2   13.3   13.3   13.3   13.4   13.5   13.5   13.6
13.6   13.6   13.7   13.7   13.8   13.8   13.9   14.1   14.2   14.3
14.4   14.5   14.7   14.8   15.0   15.0   15.5   15.8   15.8   16.0


                           MEAN = 11.12 Minutes
              STANDARD DEVIATION = 2.031 Minutes
In order to quantify the uncertainty present in a large set of data like these response times we usually find that it is not
enough simply to quote the mean and standard deviation. We need to find out how this uncertainty is distributed across
the range of values, here from 3.4 minutes to 16 minutes. A useful way to summarize this is to divide the range into
short intervals and simply count how many observations fall in each one. These counts are called FREQUENCIES and
the resulting information is a FREQUENCY TABLE. Using intervals of length 1 minute, the frequency table for the
response times is shown below.

Range Frequency

 2- 3               0
 3- 4               1
 4- 5               0
 5- 6               1
 6- 7               4
 7- 8               7
 8- 9              18
 9 - 10            24
10 - 11            40
11 - 12            39
12 - 13            32
13 - 14            21
14 - 15             9
15 - 16             4
16 - 17             0
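A sketch of how such a frequency table can be produced in Python, assuming the 200 response times have been typed into an array (only the first few values are shown here).

```python
# Sketch only: count response times into 1-minute intervals.
import numpy as np

times = np.array([3.4, 5.7, 6.6, 6.9, 7.0])     # ... all 200 values belong here

edges = np.arange(2, 18)                         # interval boundaries 2, 3, ..., 17
freq, _ = np.histogram(times, bins=edges)        # counts per 1-minute interval

# Note: np.histogram puts a value lying exactly on a boundary into the interval
# above it, so counts for such values may differ slightly from the table.
for lo, f in zip(range(2, 17), freq):
    print(f"{lo:2d} - {lo + 1:<2d}  {int(f)}")
```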
From this table we can see how the response times are distributed over the range. Most lie between 8 and 14 minutes, but there are some shorter and some longer times. Interpretation of such frequency tables is greatly aided by a simple display called a HISTOGRAM, in which the frequencies are represented by the lengths of bars, as in a bar chart.


Range Frequency

 2- 3         0
 3- 4         1          X
 4- 5         0
 5- 6         1          X
 6- 7         4          XXXX
 7- 8         7          XXXXXXX
 8- 9        18          XXXXXXXXXXXXXXXXXX
 9 - 10      24          XXXXXXXXXXXXXXXXXXXXXXXX
10 - 11      40          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
11 - 12      39          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
12 - 13      32          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
13 - 14      21          XXXXXXXXXXXXXXXXXXXXX
14 - 15       9          XXXXXXXXX
15 - 16       4          XXXX
16 - 17       0
To aid in interpretation we often compute not simply the frequencies but also the RELATIVE FREQUENCIES, which
are defined thus:

          RELATIVE FREQUENCY = FREQUENCY / NUMBER OF DATA

It is, of course, simply the fraction of the data which falls in each interval, rather than the actual count. Again, we often
simplify the appearance of these values by transforming them into PERCENTAGES, thus:

          PERCENTAGE = 100 * RELATIVE FREQUENCY

These are shown below for the above frequency table of response times.
                      Relative
Range       Frequency Frequency      Percentage

 2- 3               0            0           0.0
 3- 4               1       0.005            0.5
 4- 5               0            0           0.0
 5- 6               1       0.005            0.5
 6- 7               4        0.02            2.0
 7- 8               7       0.035            3.5
 8- 9              18        0.09            9.0
 9 - 10            24        0.12           12.0
10 - 11            40          0.2          20.0
11 - 12            39       0.195           19.5
12 - 13            32        0.16           16.0
13 - 14            21       0.105           10.5
14 - 15             9       0.045            4.5
15 - 16             4        0.02            2.0
16 - 17             0            0           0.0
Range     Percentage

 2- 3           0.0
 3- 4           0.5      X
 4- 5           0.0
 5- 6           0.5      X
 6- 7           2.0      XXXX
 7- 8           3.5      XXXXXXX
 8- 9           9.0      XXXXXXXXXXXXXXXXXX
 9 - 10        12.0      XXXXXXXXXXXXXXXXXXXXXXXX
10 - 11        20.0      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
11 - 12        19.5      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
12 - 13        16.0      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
13 - 14        10.5      XXXXXXXXXXXXXXXXXXXXX
14 - 15         4.5      XXXXXXXXX
15 - 16         2.0      XXXX
16 - 17         0.0
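The arithmetic behind the two extra columns is simple; a short sketch using the frequencies from the table:

```python
# Relative frequencies and percentages from the frequency table above.
freq = [0, 1, 0, 1, 4, 7, 18, 24, 40, 39, 32, 21, 9, 4, 0]   # intervals 2-3, ..., 16-17
n = sum(freq)                                                 # 200 observations

rel_freq = [f / n for f in freq]            # RELATIVE FREQUENCY = FREQUENCY / NUMBER OF DATA
percentage = [100 * rf for rf in rel_freq]  # PERCENTAGE = 100 * RELATIVE FREQUENCY

for lo, f, rf, p in zip(range(2, 17), freq, rel_freq, percentage):
    print(f"{lo:2d} - {lo + 1:<2d}  {f:3d}  {rf:6.3f}  {p:5.1f}")
```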


The importance of relative frequency (or percentage), rather than frequency, lies in the definition of PROBABILITY. If we consider a simple experiment such as the toss of a coin, it is clear that we cannot predict the outcome. We know it will be HEADS or TAILS but cannot say which, no matter how often we perform the experiment. However, if we toss the coin a large number of times, the RELATIVE FREQUENCY of HEADS settles down close to 0.5 or 50%, and it gets closer the more tosses we make. We say that the PROBABILITY that HEADS occurs is 0.5 or 50%. In general, the PROBABILITY that a particular event occurs, or a particular value is observed, is the limiting value of the relative frequency of its occurrence in a long run of identical repetitions of the process.
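A quick simulation makes this limiting behaviour concrete: as the number of tosses grows, the relative frequency of HEADS settles down towards 0.5.

```python
# Toss a fair coin many times and watch the relative frequency of HEADS.
import random

random.seed(1)
heads = 0
for n in range(1, 100_001):
    heads += random.random() < 0.5     # one toss: HEADS with probability 0.5
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(f"{n:>7} tosses: relative frequency of HEADS = {heads / n:.4f}")
```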
Thus, if the sample of response times shown above is representative of all current response times then we can estimate
the probability of a response time falling between 10 and 12 minutes as 20 + 19.5 = 39.5%. Similarly, we would estimate
the probability of an ambulance being late, i.e. a response time in excess of 10 minutes as 72.5%, and the probability of
being more than 5 minutes late is 2%.
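These three estimates come straight from the percentage column of the table; a short sketch of the arithmetic:

```python
# Probability estimates from the percentage of calls in each 1-minute interval.
pct = {(10, 11): 20.0, (11, 12): 19.5, (12, 13): 16.0, (13, 14): 10.5,
       (14, 15): 4.5, (15, 16): 2.0, (16, 17): 0.0}

p_10_to_12  = pct[(10, 11)] + pct[(11, 12)]     # between 10 and 12 minutes
p_late      = sum(pct.values())                 # more than 10 minutes, i.e. late
p_very_late = pct[(15, 16)] + pct[(16, 17)]     # more than 15 minutes, i.e. over 5 minutes late

print(p_10_to_12, p_late, p_very_late)          # 39.5  72.5  2.0
```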
We can construct HISTOGRAMS in Excel, and the one for the response times appears below. Note, however, that although a picture of the distribution emerges, it is oddly structured. It is really a bar chart: the bars are well separated, yet they represent consecutive numerical ranges and so should actually touch. We move towards a more useful representation by considering the curve which envelops the histogram, as in the next graph.

[Figure: HISTOGRAM of RESPONSE TIMES (Minutes), with FREQUENCY on the vertical axis and the 1-minute intervals from 2-3 to 16-17 on the horizontal axis.]

[Figure: HISTOGRAM of RESPONSE TIMES (Minutes), with PERCENTAGE on the vertical axis and the same 1-minute intervals, together with the smooth curve enveloping the histogram.]




This curve is called the PROBABILITY CURVE of the HISTOGRAM or the data. If we imagine having more and more
data we could make the intervals into which we counted the data smaller and smaller and achieve a finer and finer
resolution of the HISTOGRAM in the form of the PROBABILITY CURVE. That is often what we try to do in general
with data, i.e. identify the most suitable form of PROBABILITY CURVE for the data. We would then use mathematical
tools and tables to compute the required probabilities.

In general, that is a difficult problem in practice, but there is one special case where it is relatively simple. Fortunately, this is also the commonest case we meet in practice, being especially useful for data such as these response times, which are aimed at a target value and miss symmetrically around that value. It is the NORMAL PROBABILITY CURVE.
[Figure: NORMAL PROBABILITY CURVE for RESPONSE TIMES, with a vertical line at the mean; horizontal axis RESPONSE TIMES from 0 to 25 minutes.]




This graph displays the NORMAL PROBABILITY CURVE which corresponds to the response time data, i.e. has the same mean and standard deviation. The vertical line is placed at 11.12, the mean, and the curve is symmetric around this value. The width of the curve depends on the standard deviation, here 2.031, and we see that there are virtually no response times less than 5 minutes or greater than about 17 minutes. We can do much better than this, however, for the NORMAL PROBABILITY CURVE has a remarkable property: it is completely determined by the MEAN and the STANDARD DEVIATION. Thus, given these two quantities, and the fact that the data have a NORMAL PROBABILITY CURVE, we can compute the probabilities of response times in any ranges we wish directly.
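For example, using the normal curve implemented in SciPy (scipy.stats.norm) with this mean and standard deviation, we can compute such probabilities directly and compare them with the estimates obtained earlier from the frequency table.

```python
# Probabilities from the fitted NORMAL PROBABILITY CURVE for response times.
from scipy.stats import norm

curve = norm(loc=11.12, scale=2.031)    # mean and standard deviation of the sample

print("P(response > 10 min)      =", round(curve.sf(10), 3))                  # cf. the 72.5% estimate
print("P(10 < response < 12 min) =", round(curve.cdf(12) - curve.cdf(10), 3))
print("P(response > 15 min)      =", round(curve.sf(15), 3))
```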

As an illustration of this, we simply note three basic results relating to all NORMAL PROBABILITY CURVES.
68%    OF DATA LIE WITHIN   1   STANDARD DEVIATION OF THE MEAN

                      [……………I……………]
         MEAN – ST.DEV    MEAN    MEAN + ST.DEV




 95%   OF DATA LIE WITHIN   2   STANDARD DEVIATIONS OF THE MEAN

              […………………………I…………………………]
        MEAN – 2*ST.DEV MEAN     MEAN + 2*ST.DEV




99.7%   OF DATA LIE WITHIN   3   STANDARD DEVIATIONS OF THE MEAN

       [………………………………………I………………………………………]
 MEAN – 3*ST.DEV     MEAN           MEAN + 3*ST.DEV
     AMBULANCE RESPONSE TIMES:                                           MEAN =      11.12   MINUTES
                                                                         ST. DEV =    2.03   MINUTES


 68% OF RESPONSE TIMES WILL LIE IN THE RANGE
     11.12 – 2.03    TO         11.12 + 2.03

     i.e.                                 [ 9.09 ------------13.15 ]

 95% OF RESPONSE TIMES WILL LIE IN THE RANGE
      11.12 – 2 * 2.03       TO          11.12 + 2 * 2.03

     i.e.                      [ 7.06 ------------------------15.18 ]

99.7% OF RESPONSE TIMES WILL LIE IN THE RANGE
       11.12 – 3 * 2.03       TO          11.12 + 3 * 2.03

     i.e.           [ 5.03 ------------------------------------17.21 ]

				