Forecasting by zhouwenjuan


"It's tough to make predictions, especially about the
    future." (Yogi Berra)
This is another area that can have lots of mathematics.
  Perhaps more important ideas are:
a) Garbage in, garbage out.
b) The longer the forecast, the less reliable it is.
c) Instability in the past suggests instability in the future.
d) Think about an analogy rather than an equation.
 (From the Wharton site).
 Judgmental vs. statistical

Judgmental                    Statistical
one person vs. groups;        one or more variables;
Delphi techniques,            data based or theory
role-playing,                 based;
markets, etc.                 linear models, or more
                              complex
Some things can’t be predicted
There is no point thinking about how to predict a roulette
  wheel, or the order of a shuffled deck. If there were,
  casinos wouldn’t exist.
What about the stock market?
“… by this period I thought that the only sure way of
  making money from the stock market was to write a
  book about it. I tried this with Granger and
  Morgenstern (1970), but this was not a financially
  successful strategy.” Clive Granger, Int. J.
  Forecasting, 1992, p. 3-13 “Forecasting stock market
  prices: lessons for forecasters”. (The paper does hold
  out some hope, in fairness).
    What about earthquakes?

This is where they are; you can count them and guess
how many more weak ones will show up.
         Earthquake damage
This is estimated by multiplying the probability of an
earthquake times the expected damage.
Expected damage is estimated by looking at the number of
buildings and the type of construction.
At one level, you would just use gross estimates;
For better planning, you would actually look at specific
FEMA (Federal Emergency Management Agency) has a
system named HAZUS that attempts to predict damages.
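A minimal sketch of the probability-times-damage estimate, in Python. The quake probability, building counts, and damage figures are invented for illustration; this is not HAZUS data and not how HAZUS works.

```python
# Hypothetical inputs: chance of a damaging quake in a given year, and
# building counts by construction type with an assumed damage per building.
ANNUAL_QUAKE_PROBABILITY = 0.02

BUILDING_STOCK = {
    # construction type: (number of buildings, assumed damage per building, $)
    "wood frame": (10_000, 20_000),
    "unreinforced masonry": (500, 150_000),
    "steel frame": (200, 50_000),
}

def damage_if_quake(stock):
    """Total damage if the quake happens: buildings times damage, by type."""
    return sum(count * damage for count, damage in stock.values())

def expected_annual_loss(probability, stock):
    """Probability of the earthquake times the damage if it happens."""
    return probability * damage_if_quake(stock)

total_damage = damage_if_quake(BUILDING_STOCK)
loss = expected_annual_loss(ANNUAL_QUAKE_PROBABILITY, BUILDING_STOCK)
print(f"Damage if the quake happens: ${total_damage:,}")
print(f"Expected annual loss: ${loss:,.0f}")
```

Better planning would replace the gross per-type figures with data on individual buildings.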
             Profit forecasting
Similarly, business forecasts are “probability of success
times expected profit if we do win”.
Thus, ideally one wants to know how many people will buy
something at a range of prices.
This is very difficult: you can try test-marketing, or running
ads and seeing how many replies come back, but either may
interfere with the final sales plan. This works best with
products like books: you can try publishing one at an
oddball price and see what happens.
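A rough sketch of "probability of success times expected profit" over a range of prices. Every number here (demand at each price, unit cost, probability of success) is a made-up assumption standing in for test-market results.

```python
# Hypothetical test-market estimates: price -> units sold at that price.
DEMAND_AT_PRICE = {9.95: 1200, 14.95: 900, 19.95: 500, 24.95: 250}
UNIT_COST = 6.00               # assumed cost of producing one unit
PROBABILITY_OF_SUCCESS = 0.3   # assumed chance the product sells as estimated

def expected_profit(price, units, unit_cost, p_success):
    """Probability of success times the profit if we do win."""
    profit_if_win = (price - unit_cost) * units
    return p_success * profit_if_win

results = {
    price: expected_profit(price, units, UNIT_COST, PROBABILITY_OF_SUCCESS)
    for price, units in DEMAND_AT_PRICE.items()
}
best_price = max(results, key=results.get)

for price, profit in results.items():
    print(f"${price:6.2f}: expected profit ${profit:8.2f}")
print(f"Best price under these assumptions: ${best_price}")
```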
    Simplest: tomorrow like today
 You might think this is obviously inadequate, but…
 Mean percentage error when forecasting from surveys of consumer
 intentions, compared with assuming “same as last six months”:
     Item                    Consumer      Last six
                             plans         months
     New car sales           16.7          12.2
     Used car sales          20.9          8.3
     New home sales          28.1          10.7
     Lived-in home sales     21.4          11.3

Lee, Elango & Schnaars, Int. J. Forecasting 1997, pp 127-135
but see also Armstrong et al., same journal, 2000, pp. 383-397
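A small sketch of the "tomorrow like today" forecast and of how a percentage error can be scored. The sales series is invented, and using the mean absolute percentage error is an assumption; the exact error measure in the cited papers is not given here.

```python
def naive_forecast(history):
    """'Tomorrow like today': predict the most recent observed value."""
    return history[-1]

def mean_absolute_percentage_error(actuals, forecasts):
    """Average of |actual - forecast| / |actual|, expressed in percent."""
    errors = [abs(a - f) / abs(a) * 100 for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)

# Hypothetical sales figures, one per period.
sales = [100, 110, 105, 120, 125]

# Forecast each period from the periods before it, then score the forecasts.
forecasts = [naive_forecast(sales[:i]) for i in range(1, len(sales))]
actuals = sales[1:]
mape = mean_absolute_percentage_error(actuals, forecasts)
print(f"Naive forecasts: {forecasts}")
print(f"Mean absolute percentage error: {mape:.1f}%")
```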
     GIGO: many bad predictions
Club of Rome (1972): Malthusian collapse about now.
Paul Ehrlich: widespread famine in 1975. Ehrlich, in 1980, bet
economist Julian Simon that the prices of chrome, copper,
nickel, tin and tungsten would rise by 1990; they all fell.

Edward Yardeni, Chief Global Economist and Investment
Strategist of Deutsche Bank Securities, … December 16, 1999
on CNBC he stated: "Y2K will cause enough disruption to
cause a fairly intense downturn in the economy in the first six
months of 2000…"

"You ain't goin' nowhere... son. You ought to go back to driving
a truck." (Grand Ole Opry manager, firing Elvis Presley)
                   Simple math
Average: the world stays the same. The last year’s
average is next month’s expected value.
Moving average: weight more recent data in making the forecast.
Linear: the change each month (or year) is the same.
Cyclic: there is some kind of pattern, typically seasonal.
Combined model: add up some combination of the above.

Often you add in randomness.
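The simple models above can be sketched in a few lines of Python. The series, the window of 3, and the period of 4 are arbitrary choices for illustration.

```python
def average_forecast(series):
    """The world stays the same: predict the overall average."""
    return sum(series) / len(series)

def moving_average_forecast(series, window=3):
    """Use only the most recent `window` values."""
    recent = series[-window:]
    return sum(recent) / len(recent)

def linear_forecast(series):
    """Same change each period: last value plus the average step."""
    steps = [b - a for a, b in zip(series, series[1:])]
    return series[-1] + sum(steps) / len(steps)

def cyclic_forecast(series, period=4):
    """The pattern repeats: reuse the value from one full period ago."""
    return series[-period]

# Hypothetical quarterly data.
series = [10, 12, 14, 13, 12, 14, 16, 15]

print("average:       ", average_forecast(series))
print("moving average:", moving_average_forecast(series))
print("linear:        ", round(linear_forecast(series), 2))
print("cyclic:        ", cyclic_forecast(series))
```

A combined model would add or average some of these outputs.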
             Sample linear fit

Semiconductor sales 1976-89 plotted against durable
equipment investments (if companies are making
investments, they will also buy computers).
 Why not plot against year?

Wouldn’t be a good linear fit because of decline in
1986, and it doesn’t look cyclical either.
Excel will fit linear functions
The data are new car sales in Maryland for the last few years.
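For readers without Excel, a least-squares line can be fitted by hand. This is a sketch of ordinary least squares; the yearly figures are invented and are not the Maryland data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: year index and sales in thousands.
years = [1, 2, 3, 4, 5]
sales = [102, 108, 115, 119, 126]

slope, intercept = fit_line(years, sales)
next_year = slope * 6 + intercept   # extrapolate one year ahead
print(f"sales = {slope:.2f} * year + {intercept:.2f}")
print(f"forecast for year 6: {next_year:.1f}")
```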
Linear fit
Cyclical data
Another cyclical series
            More complex math
Multiple regression: fitting data to many variables.
Neural nets: having a computer program which “learns”
the data pattern.
Theories: having some model which lets you use more
complex data.

In general, more complex models don’t help enough; it’s
just not possible to produce really accurate forecasts, and
at least the simple ones can be understood.
               Neural nets

Each node is a linear combination of the nodes pointing
to it. Weights are adjusted so that when you feed in
certain input values, the outputs will be as trained for.
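A toy version of one node, as a sketch only: the output is a linear combination of the inputs, and the weights are nudged after each example until the outputs match the training targets. Real networks have many nodes and usually a nonlinear step at each one; the learning rate, epoch count, and training data here are arbitrary assumptions.

```python
def node_output(weights, bias, inputs):
    """A node as a linear combination of the values feeding into it."""
    return bias + sum(w * x for w, x in zip(weights, inputs))

def train(samples, learning_rate=0.05, epochs=2000):
    """Adjust weights a little after each example to shrink the error."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - node_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Made-up training data generated from: target = 2*x1 - 1*x2 + 0.5
samples = [
    ((0, 0), 0.5),
    ((1, 0), 2.5),
    ((0, 1), -0.5),
    ((1, 1), 1.5),
    ((2, 1), 3.5),
]

weights, bias = train(samples)
print("learned weights:", [round(w, 3) for w in weights])
print("learned bias:   ", round(bias, 3))
print("output for (3, 2):", round(node_output(weights, bias, (3, 2)), 3))
```

Note that the program recovers the relation without being told the formula, which is both the appeal and the lack of insight.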
           Why neural nets?
You don’t have to have any theory: you just feed in the
numbers and the network will give results.
They supposedly represent “artificial intelligence” and
mimic biological learning systems. A great deal of hype
has been devoted to them.
Software is readily available to take input and output
data and try to create a neural net that will implement
the transformation. The standard algorithm is called
“back propagation”.
They derive from Rosenblatt’s “perceptrons” of the late 1950s.
Minsky and Papert’s 1969 book “Perceptrons” analyzed their
limits, and multi-layer training only caught on in the 1980s.
       Why not neural nets?
You don’t get any insight: it’s not possible to look at the
internal nodes and find a “meaning” to them.
They actually have fairly low representational power;
there are better statistical methods out there.
They often don’t work well. E.g. “Forecasting sales data
with neural nets” by Chatfield suggests that they often
go very wrong.
Getting the right training data is difficult: it has to be
quite voluminous and evenly distributed over the
possible input values (otherwise the program learns the
most likely output values rather than the relation of input
to output).
             Expert systems
Another AI spinoff, and sort of the reverse of neural nets:
collecting rules about an organization or process.
Typically these are “production rules” of the form
  if xxxx then yyyy
The original idea was that you could “debrief” the expert
people in a company and encapsulate their knowledge
in a set of rules.
Expert systems were a big fad in the 1980s.
In practice, obtaining the right information proved difficult,
as did knowing which rule to prefer in complex
situations. Nevertheless, there are expert systems for
           Example: MYCIN
MYCIN was an early expert system for the diagnosis of
blood infections. A typical rule might be:
IF the stain of the organism is gram negative
  AND the morphology of the organism is rod
     AND the aerobicity of the organism is anaerobic
        THEN there is strongly suggestive evidence (0.8)
that the class of the organism is Enterobacteriaceae

Collecting this knowledge is and was always the biggest
bottleneck for expert systems. Imagine, for example,
asking an art expert “how do you recognize a Vermeer?”
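A sketch of how such a rule might be stored as data and applied. The dictionary layout and the findings are hypothetical. The certainty arithmetic (rule certainty times the weakest premise) is a simplified version of MYCIN-style certainty factors, not MYCIN's actual code.

```python
# The rule from the slide, written as data (layout is an assumption).
RULE = {
    "if": {
        "stain": "gram negative",
        "morphology": "rod",
        "aerobicity": "anaerobic",
    },
    "then": ("class", "Enterobacteriaceae"),
    "certainty": 0.8,
}

def apply_rule(rule, findings):
    """findings maps attribute -> (value, certainty between 0 and 1).

    Returns (conclusion, certainty), or None if any premise fails.
    """
    premise_certainties = []
    for attribute, required in rule["if"].items():
        value, certainty = findings.get(attribute, (None, 0.0))
        if value != required:
            return None
        premise_certainties.append(certainty)
    # Simplified: the conclusion is as certain as the rule times the
    # weakest premise.
    return rule["then"], rule["certainty"] * min(premise_certainties)

# Hypothetical lab findings; the morphology is only 90% certain.
findings = {
    "stain": ("gram negative", 1.0),
    "morphology": ("rod", 0.9),
    "aerobicity": ("anaerobic", 1.0),
}

result = apply_rule(RULE, findings)
print(result)
```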
           Inference engine
Along with the rules, you need a program which applies
them. It can either work forward (knowing a, b, and c,
what rules can use those facts to deduce something
else?) or backward (if we would like to prove k, and the
rules that imply k require us to know i and j, what rules
would help prove i and j?).
Off-the-shelf products exist for this; whereas nobody can
really tell you what rules you need for your industry.
The products aren’t that hard to build because it is almost
impossible to have an expert system with more than a
few thousand rules: a bigger list becomes
unmanageable (you can’t see what’s going wrong). By
computer standards, this is easy.
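A minimal sketch of both directions, assuming rules are simple "if all of these facts, then this fact" pairs. The rule set and fact names are invented; commercial inference engines do far more (conflict resolution, certainties, explanations).

```python
# Each rule: (set of premises, conclusion). Hypothetical rule base.
RULES = [
    ({"a", "b"}, "d"),
    ({"c", "d"}, "e"),
    ({"e"}, "k"),
]

def forward_chain(facts, rules):
    """Keep firing rules whose premises are known until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts, rules, seen=frozenset()):
    """Try to prove `goal` by proving the premises of a rule that implies it."""
    if goal in facts:
        return True
    if goal in seen:          # avoid looping on circular rules
        return False
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, seen | {goal}) for p in premises
        ):
            return True
    return False

print(sorted(forward_chain({"a", "b", "c"}, RULES)))
print(backward_chain("k", {"a", "b", "c"}, RULES))
print(backward_chain("k", {"a", "b"}, RULES))
```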
   Why use expert systems?
You do get some insight: the system will tell you what
rules it used to get its answer.
Relatively routine tasks can be done automatically and
accurately when the rules can be written down.
Compared with programming in a normal language, you
don’t have to think as much about control flow.
Successful examples are things like filling out tax forms
and configuring computer systems.
Expert systems can also apply rules to make forecasts.
They can combine human rules and time-series
calculations. They are also able to use probabilities to
make “fuzzy” forecasts.
Why not use expert systems?
The lack of control flow may mean that the system does
something stupid and you can’t see how to fix it.
Collecting the expertise is hard, and the major reason
why projects fail.
The whole area has a bad reputation left over from the
period of the 1980s known as “AI winter”.
There are not a lot of success stories to imitate.
Expert systems are extremely narrow in their abilities; it
has proven impossible to take two small expert systems
and combine them to make a single expert system which
can solve two different kinds of problems.
          Theoretical modeling
The “S-curve”: product introduction, growth, leveling off.
Everything goes through such phases. Most of the money
is made in the growth period; the question is how much
advantage you get from having been the introducer.
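One common way to draw an S-curve is the logistic function; the slide does not say which formula is meant, so treat this as an assumption. The ceiling, midpoint, and growth rate below are invented.

```python
import math

def s_curve(t, ceiling, midpoint, growth_rate):
    """Logistic curve: slow introduction, fast growth, then leveling off."""
    return ceiling / (1 + math.exp(-growth_rate * (t - midpoint)))

# Hypothetical product: market ceiling of 1000 units/month,
# half of that reached in year 5.
for year in range(0, 11, 2):
    units = s_curve(year, ceiling=1000, midpoint=5, growth_rate=1.0)
    print(f"year {year:2d}: {units:7.1f} units/month")
```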
                 Moore’s law
Specific to semiconductors: power doubles every 18 months.
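The arithmetic of doubling every 18 months, as a quick sketch. The function name and the default period are just labels for the rule of thumb stated above.

```python
def moores_law_factor(months, doubling_months=18):
    """Growth factor after `months` if power doubles every `doubling_months`."""
    return 2 ** (months / doubling_months)

for years in (1.5, 3, 5, 10):
    factor = moores_law_factor(years * 12)
    print(f"after {years:4} years: x{factor:.1f}")
```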
Moore’s law (game consoles)
Theoretical model: Club of Rome
And the model predicted..
Right about now food and industrial output collapse, followed by
widespread famine and population decline; the vertical and
horizontal scales of this chart are deliberately
Find a similar product and hope your sales track it.
The chart below shows shipments for January in each year.
El Niño (red) and La Niña (blue)

Analogy: in this case, just to a past instance of El Niño.
                 Events matter

From the “Financial Forecast Center”, May 2001 (courtesy of
the Internet Archive). The actual Dow Jones average on
October 31, 2001 was 9075.
Even without unusual events: in Jan. 2004 the forecast was that
the euro would be worth $1.08 in June; the actual value was $1.21.
    You can survey either ordinary people or experts. Below
    is from Wired magazine, Dec. 1995.
                Half of LC   First virtual   Free net access in   Virtual reality
                is digital   large library   public libraries     in libraries
Ken Dowlin      2050         2020            2005                 1997
Hector Garcia   2065         Unlikely        2000                 2010
Cliff Lynch     2020         2005            Unlikely             1997
Ellen Poisson   2050         2030            2005                 2020
Bob Zich        2030         2010            2005                 2000
Summary         2043         2016            2003                 2005
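The Summary row looks like the average of the numeric answers with “Unlikely” left out. That reading is an inference from the numbers, not something the source states; the sketch below checks it to within a year.

```python
# Answers from the table above; None stands for "Unlikely".
PREDICTIONS = {
    "Half of LC is digital": [2050, 2065, 2020, 2050, 2030],
    "First virtual large library": [2020, None, 2005, 2030, 2010],
    "Free net access in public libraries": [2005, 2000, None, 2005, 2005],
    "Virtual reality in libraries": [1997, 2010, 1997, 2020, 2000],
}

SUMMARY_ROW = {
    "Half of LC is digital": 2043,
    "First virtual large library": 2016,
    "Free net access in public libraries": 2003,
    "Virtual reality in libraries": 2005,
}

def mean_year(answers):
    """Average the years, skipping experts who answered 'Unlikely'."""
    years = [y for y in answers if y is not None]
    return sum(years) / len(years)

means = {question: mean_year(a) for question, a in PREDICTIONS.items()}
for question, mean in means.items():
    print(f"{question}: mean {mean:.2f}, table says {SUMMARY_ROW[question]}")
```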
       Surveying the unknown
People are not good at questions like “would you buy a …”
if they have never seen one and don’t know what it is. So
asking about new products is often more about how you
word the question than what might actually happen.
In any case, asking about intentions is always lower quality
data than looking at actual results.
Even experts often blow it. The California Energy
Commission forecast natural gas prices in 1995 as likely to
grow 3.6% per year. In early 2001 they went up 20% in
two days (ok, this wasn’t a blown forecast, it was enemy
      Don’t forecast in groups
Focus groups have their uses, but forecasting surveys
should use individuals working separately. In a group, one
or two louder or more aggressive individuals are likely to
sway the overall opinion.
Should one try to weight different opinions by perceived
amount of expertise? Probably not: ability at forecasting
seems extremely variable.
        “Delphi” methodology
Assemble a group of people who might know something.
Give them a questionnaire asking for predictions.
Average the results. Circulate these to the group again.
See if people change their minds.
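A toy simulation of the rounds, under a strong assumption: each participant moves a fixed fraction (`pull`) of the way toward the circulated average. Real experts do not behave this mechanically; the starting estimates are invented.

```python
def delphi_round(estimates, pull):
    """Circulate the average; each estimate moves `pull` of the way to it."""
    group_average = sum(estimates) / len(estimates)
    return [e + pull * (group_average - e) for e in estimates]

def run_delphi(estimates, pull=0.5, rounds=3):
    """Return the estimates after each round, starting with the originals."""
    history = [list(estimates)]
    for _ in range(rounds):
        estimates = delphi_round(estimates, pull)
        history.append(estimates)
    return history

# Hypothetical first-round predictions (e.g. oil price in dollars).
initial = [20.0, 25.0, 30.0, 45.0]
history = run_delphi(initial)
spreads = [max(r) - min(r) for r in history]

for number, estimates in enumerate(history):
    print(f"round {number}: {[round(e, 2) for e in estimates]}")
print("spread per round:", spreads)
```

In this toy model the spread narrows every round while the average never moves, so agreement by itself says nothing about accuracy.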

Many problems: interactions between trends, bias in the
questionnaire, and the experts may not be good
1996: California forecast oil prices
In case you didn’t notice, oil prices per barrel in 2000 were
$27 and are currently
This was a Delphi forecast which more or less said that things
would stay the same.
• Finding analogies is probably as good as you can do.
• Too much math isn’t that helpful.
• You can combine different forecasts, but it’s best to give
  them as a range.
• Try to understand unusual events.

You can’t beat the market. Or the roulette wheel.
