EPS Output


            A Forecaster’s Approach

8/30/2010           EPS Training Edmonton
   EPS problems for the meteorologist
   A simple conceptual model
   Re-phrasing what we did yesterday
   Ensemble examples
   Uncertainty
   Clustering using Principal Component Analysis
Where does the MT fit?
   Project Phoenix has demonstrated that by
    focusing on meteorology and not on
    models in the first 18 to 24 hours, it is
    very easy to show huge improvements
    over first-guess SCRIBE forecasts.
   The impact on day 2 is uneven.
   How do we determine the point where the
    forecaster’s analysis and diagnosis no
    longer adds value?
Find the ensemble of the day
   Already have trouble marrying reality and model
    outputs from a handful of models after that initial
   What do we do when confronted with output
    from 10, 20, 100 ensembles?
   Kain et al. (2002) showed that forecasters may
    not have a lot of skill at determining the “model of
    the day”.
   How does the forecaster, if this is true, decide on
    which of potentially dozens of ensembles to

Information Bottleneck
   Front end
        Vast amounts of output that must be
         disseminated, visualized, analyzed, …
   Back end
        Once WE know what’s going on, how do we
         express that to the public?
        WeatherOffice
        Public forecast
        SCRIBE

Do Users Want Determinism?
   We assume that users want
    uncertainties spelled out in the forecast.
   What if all they want is to know
    whether it’s going to rain tomorrow?
    Can I go to the beach?

A New Tool
   When you get a new tool, the first place you
    go is the owner’s manual
   There isn’t one for EPS.
        We need to write one.
        That means that the meteorologists MUST get
        This is not just a Services issue
               We are users of these outputs
               Just as public clients are consulted, so should the

    A Thought Experiment
Take a bag and put in ten
pieces of paper, each
numbered 1 through 10.
Ask ten people to draw a
piece of paper from the
bag, but before they do so,
ask them what number
they think they’ll draw.

If 5 out of the 10 say that
they think the number will
be 3, does that mean that
there’s a 50% chance that
the number drawn will be 3?
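
The thought experiment can be checked with a short simulation. This is only a sketch: the function name, the particular list of guesses, and the choice to simulate a single draw per trial are illustrative assumptions, not part of the original slides. The point it tries to show is that the share of people guessing 3 never enters the calculation of the chance of drawing 3.

```python
import random

def draw_matches_guess(num_trials, guess=3, seed=0):
    """Estimate how often one draw from the bag equals `guess`."""
    rng = random.Random(seed)
    bag = list(range(1, 11))  # ten slips of paper numbered 1 through 10
    hits = sum(rng.choice(bag) == guess for _ in range(num_trials))
    return hits / num_trials

# Hypothetical guesses: five of the ten people say 3.
guesses = [3, 3, 3, 3, 3, 1, 5, 7, 8, 10]
share_guessing_3 = guesses.count(3) / len(guesses)

# The guesses are not an input to the draw, so the estimate stays near 1 in 10.
chance_of_3 = draw_matches_guess(100_000)
print(share_guessing_3, round(chance_of_3, 3))
```

Agreement among the guessers describes the guessers, not the bag, which is the same distinction the later slides draw between model space and real space.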
Model Space vs. Real Space

[Figure: Real Space, R, and Model Space, M]

The forecaster’s role: evaluate R ∩ M, then take the necessary steps to
maximize that area.
Model Space vs. Real Space
   Reliability is desired
   It cannot be assumed
   Links between the two spaces have to
    be forged
        Statistical post-processing
               Based on past performance.
               Past performance does not necessarily extend
                to the current situation.
        Analysis and diagnosis
An Example

A Joke…
Did you hear the joke about the lost Swiss mountaineers? Completely confused,
they reach the top of a peak and one of them takes out his map and compass
and triangulates on three nearby peaks. One of his partners anxiously asks
him, "Do you know where we are?"

"Yes," says the triangulator. "See that mountain over there? We're right
on top of it."

If the model and reality disagree, it might be a good idea to go with
reality.

The whole basis for creating EPS in the first place is the notion that when
you perturb the model’s initial conditions, play with its physics and
parameterizations, and alter boundary conditions, if there are any, you get
different solutions from the model.

In deterministic modeling there are no other solutions. You get one
to work with. The distribution of the solutions is a delta function.
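
A toy illustration of that notion, offered as a sketch only: the logistic map below stands in for a forecast model (an assumption made purely for brevity; it is not an atmospheric model), and the member count, perturbation size and function name are hypothetical. Ten runs that differ only by tiny changes to the initial condition start out indistinguishable and end up as clearly different solutions.

```python
import numpy as np

def run_toy_model(x0, steps=60, r=3.9):
    """Iterate the logistic map x -> r*x*(1-x), a stand-in for a chaotic model."""
    path = np.empty(steps + 1)
    path[0] = x0
    for t in range(steps):
        path[t + 1] = r * path[t] * (1.0 - path[t])
    return path

rng = np.random.default_rng(42)
control = 0.4

# Ten members: the control initial condition plus tiny random perturbations.
initial_states = control + 1e-6 * rng.standard_normal(10)
ensemble = np.array([run_toy_model(x0) for x0 in initial_states])

# Spread across members (standard deviation) at each step down the timeline.
spread = ensemble.std(axis=0)
early_spread = spread[:5].mean()
late_spread = spread[-20:].mean()
print(early_spread, late_spread)
```

A single deterministic run corresponds to keeping only one row of `ensemble`, which is the delta-function case described above.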
The solution PDF

In reality, there are an infinite number of solutions that fall into some
unknown distribution. We don’t know its modality, its height and width, or
whether it’s skewed or not. This distribution changes from model run to model
run and at each step down the timeline.

[Figure: the solution PDF, plotted over 0 to 20, with three points marked
“Our Solution?” and one marked “Better solution”]

We don’t know where our one deterministic solution fits within this
distribution. We assume that it’s in a favourable part of the distribution, but
that need not be the case.

There is no reason that reality must appear within this distribution. We hope
that it will because our models are pretty good, but it doesn’t have to!!
Sampling the underlying PDF
   That’s what we’re attempting to do with EPS:
    sample the underlying distribution. If we can
    capture the nuances of the underlying
    distribution by generating multiple solutions,
    we can make some statements about
    probabilities and uncertainties.
   Only about the solutions, though. We can say
    nothing about reality!!

Some Statistics
   Consider a random sample taken from an unknown
    distribution. It turns out that the maximum likelihood
    estimator for the mean is the sample’s mean.
        The sample of the underlying PDF represented by the
         ensembles is not random, yet research has shown that, over
         time, the ensemble mean is the better solution.
   The maximum likelihood estimator for the variance is
    proportional and very nearly equal to the sample
    variance, though it tends to under-forecast the true
    variance.
        The ensemble spread tends to be under-dispersive, behavior
         we expect from the sample variance.
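
The under-estimate can be seen numerically. The sketch below assumes Gaussian samples (an assumption; the slides do not name a distribution) and uses made-up values for the true variance, member count and number of trials. It compares the maximum likelihood estimator, which divides by n, with the usual sample variance, which divides by n - 1.

```python
import numpy as np

rng = np.random.default_rng(0)
true_variance = 4.0   # illustrative value
n_members = 10        # size of each random sample
n_trials = 20_000     # number of independent samples drawn

samples = rng.normal(0.0, np.sqrt(true_variance), size=(n_trials, n_members))

mle_variance = samples.var(axis=1, ddof=0)       # divides by n
unbiased_variance = samples.var(axis=1, ddof=1)  # divides by n - 1

mean_mle = mle_variance.mean()
mean_unbiased = unbiased_variance.mean()

# For Gaussian data the MLE is (n - 1) / n times the unbiased estimator,
# so on average it falls short of the true variance by that factor.
expected_mle = (n_members - 1) / n_members * true_variance
print(mean_mle, mean_unbiased, expected_mle)
```

With ten members the shortfall is about ten percent on average, which is in line with the "very nearly equal" wording above.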

Ensemble Pathways (Modes)
Think of ensemble solutions as pathways down the timeline. When all the
solutions are tightly packed (i.e. they have a low variance) we can say that the
ensembles are favoring a single pathway; the individual ensembles are moving
down the same path but some move down the centre of the path, some down the
right side, some down the left, and some meander along it.

If all the ensemble members follow the same path, we can say that there is a
100% probability that the real solution is following the same path.

The Fork in the Road
What happens when the paths branch? What if 9 members of a 10 member
ensemble go down the right-hand path and only 1 goes down the left?

There’s a 90% chance of the model solution going down the right path, and a
10% chance of it going down the left.

The trap waiting for the forecaster is that he may well take the most simplistic
option, blindly following the right path because more of the ensembles are
taking it, when in fact, the outlier on the left path might be the most interesting
simply because of its extreme nature.

The River Delta

Now imagine the case where each ensemble follows a different path, like a
river delta. Each ensemble, no matter how extreme, has an equal chance of
being the correct one.

This is the rub for the forecaster. On any given day, each ensemble member
has the same probability of occurring as the others. They are all based on the
same rules of physics. It is only by looking at their output in terms of
pathways that we can realistically talk about probabilities.
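
The pathway probabilities in the fork and river delta cases are just member counts. A minimal sketch, assuming each member has already been assigned a pathway label (the labels and the function name are hypothetical; how members are grouped into pathways is a separate problem):

```python
from collections import Counter

def pathway_probabilities(member_paths):
    """Fraction of ensemble members following each pathway."""
    counts = Counter(member_paths)
    total = len(member_paths)
    return {path: n / total for path, n in counts.items()}

# The fork in the road: 9 members go right, 1 goes left.
fork = pathway_probabilities(["right"] * 9 + ["left"])

# The river delta: every member follows its own channel.
delta = pathway_probabilities([f"channel_{i}" for i in range(10)])

print(fork)
print(delta)
```

These are probabilities of the model solution taking a pathway, not of reality doing so.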

From Biswas et al, 2006

Hurricane Katrina
   Costliest and one of the five deadliest hurricanes
   First landfall near the border of Miami-Dade county and Broward county
   Final landfall near Louisiana / Mississippi border
   Around 1400 fatalities

Usefulness of the Ensemble Spread
   If you watch charts of the ensemble spread, a pattern emerges: a lot
    of the spread occurs in areas where we know that models will have
    trouble:
        Strong gradients
        Rapidly moving systems
        Essentially any area with strong spatial or temporal gradients.
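
One way to see why spread collects around gradients is a one-dimensional toy case. The sketch below is built on assumptions that are not in the original material: each member is a tanh-shaped front on a hypothetical cross-section, and members differ only in where they place the front. The spread is then simply the standard deviation across members at each grid point.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-10.0, 10.0, 401)                 # hypothetical 1-D cross-section
front_positions = rng.normal(0.0, 0.5, size=20)   # each member shifts the front a little

# Each row is one member: a smooth front centred on that member's position.
members = np.tanh(x[None, :] - front_positions[:, None])

ensemble_mean = members.mean(axis=0)
spread = members.std(axis=0)                      # ensemble spread at each grid point
gradient = np.abs(np.gradient(ensemble_mean, x))  # strength of the mean gradient

x_max_spread = x[spread.argmax()]
x_max_gradient = x[gradient.argmax()]
print(x_max_spread, x_max_gradient)
```

Away from the front the members agree and the spread is negligible; small position errors only matter where the field changes quickly.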

   Without the assumption of reliability…
        Uncertainty is really the degree of agreement, or
         the lack thereof, among the various ensembles.
        From the pathway POV, the more pathways that
         exist through model space, the more we are
         unsure of what the model is really telling us.
        Uncertainty is then measured by the pathway
         spread and the probability that the pathway will
         be well traveled

10 Member Ensemble

Pathways                                        Uncertainty
All ten members following 1 path. Ensemble      Low.
spread gives information about width of
the path.

2 paths. 9 members following one path, the      Still low. Most of the time the outlier
other a second path. Pathway spread             will be an outlier. Have to check to
becoming important. If the spread is small,     make sure.
not much of a problem. As it increases, so
does our uncertainty.

2 paths. 5 members going down each. Our         Moderate. How do we evaluate the two?
uncertainty grows, especially if the pathway
spread is large.

10 paths, one member going down each.           High. All bets are off.
Uncertainty is maximized if the pathway
spread is large. In this case, the ensemble
mean and the pathway mean are the same.

Where do we add value?

Time          Forecaster                      Ensembles
Short-Term    Meteorology dominates           Can play a role by identifying alternate
              through on-going analysis       pathways that the meteorologist can
              and diagnosis.                  explore or by supporting the analysis
                                              and diagnosis that he has done.

Medium-Term   Application of analysis and     Statistically post-processed forecasts
              diagnosis is becoming           would be driven off the ensemble
              limited.                        mean. Higher probability pathways
                                              would be favoured, but there would
                                              still be opportunities for the
                                              meteorologist to explore lower
                                              probability outliers and intervene when

Long-Term     Very limited intervention by    Ensembles dominate.
              the meteorologist, except to
              quantify uncertainty.

Managing the data stream
   SPC meteorologists have a tremendous workload.
        In PNR, we forecast for 52% of the country
        This area gets more severe weather than almost all the
         other regions combined.
        We start with the worst SCRIBE forecasts in the country
        We do it with 2 people sliding, one in Winnipeg and the
         other in Edmonton.
   How can we successfully integrate EPS output into the
    SPC, given its high maintenance, when workloads are
    already so high?

Reducing Dimensionality
   Many statistical methods for accomplishing this
        Cluster Analysis
        Tubing
        Bayesian Techniques
        Factor Analysis
        Principal Component Analysis
   While they use different approaches, they all attempt
    to identify statistically significant pathways, or modes

Principal Component Analysis
   Definition: a procedure for transforming a set
    of correlated variables into a new set of
    uncorrelated variables. This transformation is
    a rotation of the original axes to new
    orientations that are orthogonal to each other.

[Figure: a scatter of correlated data. The blue lines are the two principal
components. Note that they are orthogonal to each other.]

How do we calculate them?

   To find the principal components in any
    dataset, you need to
        find the Eigenvalues and Eigenvectors of its
         covariance or correlation matrix
        The Eigenvectors and their individual factor
         loadings define how to transform the data from x,
         y to the new coordinate system.
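
The two steps above can be sketched with NumPy. The data here are synthetic and the variable names are illustrative assumptions; the covariance matrix is used, though the same steps apply to a correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated variables, x and y (synthetic data for illustration).
x = rng.normal(size=500)
y = 0.8 * x + 0.3 * rng.normal(size=500)
data = np.column_stack([x, y])

# Step 1: eigenvalues and eigenvectors of the covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # columns of eigenvectors are the new axes

# Step 2: the eigenvectors define the transform from x, y to the new coordinates.
scores = centered @ eigenvectors
score_cov = np.cov(scores, rowvar=False)

print(np.corrcoef(x, y)[0, 1])  # the original variables are strongly correlated
print(score_cov[0, 1])          # the transformed variables are uncorrelated
```

`np.linalg.eigh` is used because a covariance matrix is real and symmetric; it returns the eigenvalues in ascending order.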

Eigenvalues and Eigenvectors
   Consider the square matrix A. We say
    that λ is an eigenvalue of A if there
    exists a non-zero vector x such that Ax
    = λx. In this case, x is called an
    eigenvector (corresponding to λ), and
    the pair (λ, x) is called an eigenpair for
    A.
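
A small numerical check of the definition, using an arbitrary 2 x 2 symmetric matrix chosen only for illustration. Its characteristic equation is (2 - λ)² - 1 = 0, so the eigenvalues are 1 and 3.

```python
import numpy as np

# An arbitrary real, symmetric matrix (illustrative values).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eigh(A)

# Each column of `eigenvectors` pairs with the eigenvalue at the same index.
# Verify the defining relation A x = lambda x for every eigenpair.
residuals = [np.linalg.norm(A @ x - lam * x)
             for lam, x in zip(eigenvalues, eigenvectors.T)]

print(eigenvalues)
print(residuals)
```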

What Kind of Matrix?
   The matrix we use for calculating the eigenvalues and
    eigenvectors can be a number of different things
        A matrix of correlation coefficients
        A matrix of covariances
   I construct a covariance matrix.
   The matrix gives a measure of how interrelated the
    members are.
        The matrix is real and symmetric
               Element (1,2) is equal to element (2,1) and so forth
        The diagonals are variances of each member
        The size of the matrix is the number of ensembles
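
A sketch of such a matrix, assuming the ensemble output for one field is stored as a 2-D array with one row per member and one column per grid point. That layout, the synthetic field, and the member count are assumptions for illustration, not a description of the actual processing.

```python
import numpy as np

rng = np.random.default_rng(3)
n_members, n_points = 10, 500

# Synthetic ensemble: a common signal plus member-specific noise.
signal = np.sin(np.linspace(0.0, 2.0 * np.pi, n_points))
members = signal + 0.3 * rng.standard_normal((n_members, n_points))

# np.cov treats each row as a variable, so this is a member-by-member matrix.
cov = np.cov(members)

print(cov.shape)                   # one row and one column per member
print(np.allclose(cov, cov.T))     # symmetric: element (1,2) equals element (2,1)
print(np.allclose(np.diag(cov), members.var(axis=1, ddof=1)))  # diagonal holds the variances
```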

 Variance and Covariance

The variance is really a special case of the covariance
    and is the covariance of a variable with itself
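
Written out, using the common sample form with an n - 1 divisor (an assumption, since the formulas from the original slide did not survive extraction), for n paired values x_i and y_i with means x-bar and y-bar:

```latex
\operatorname{cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\operatorname{var}(X) = \operatorname{cov}(X, X) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```

Setting Y = X in the first expression gives the second, which is why the diagonal of a covariance matrix holds the variances.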

Once the Eigenvalues and
Eigenvectors are calculated

   The Eigenvectors and their individual factor
    loadings define how to transform the data from
    x, y to the new coordinate system.
   We rank the Eigenvectors in order of decreasing
    Eigenvalue.
   The Eigenvector with the highest Eigenvalue
    gives the first principal component, the next
    highest gives us the second PC, etc.
   The Eigenvalues are also the variances of the
    observations in each of the new coordinate axes.
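
The ranking step and the link between eigenvalues and variances can be sketched as follows. The three-variable synthetic dataset and the mixing matrix are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data: three correlated variables built from independent noise.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.2]])
data = rng.standard_normal((1000, 3)) @ mixing.T
centered = data - data.mean(axis=0)

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))

# Rank the eigenvectors in order of decreasing eigenvalue.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Project onto the new axes; column 0 is the first principal component.
scores = centered @ eigenvectors
score_variances = scores.var(axis=0, ddof=1)

# Proportion of the total variance carried by each component.
explained_fraction = eigenvalues / eigenvalues.sum()
print(eigenvalues, score_variances, explained_fraction)
```

The variances of the projected data match the eigenvalues, so the eigenvalue ranking is also a ranking by variance explained.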

What we end up with …
   We've extracted a set of principal components from our
    ensemble output
   These are orthogonal and are ordered according to the
    proportion of the variance of the original data that each
    explains.
   The goal is to reduce the dimensionality of the problem by
    retaining a (small) subset of factors.
   The remaining factors are considered as either irrelevant
    or nonexistent (i.e. they are assumed to reflect
    measurement error or noise).

PC Retention
   The number of PC's to retain is a non-trivial exercise and there is no
    single method that is entirely successful.
   Retaining too few PC's results in under-factoring and a loss of signal.
   Retain too many and noise creeps back in (under-filtering) and you
    also increase computation times.
   Keeping in mind that the simplest approach is often the best, I use the
    Kaiser/Guttman criterion.
        Each normalized eigenvalue lies between 0 and n (the number of
         members in the ensemble). Since we cannot reduce the dimensionality of
         the problem to anything less than 1, we use this as the criterion: we retain
         only those PC's that have eigenvalues > 1.
   Each PC can be thought of as a pathway through model space.
        The amount of variance explained by each component gives us a measure
         of how well traveled the path is.
        It also provides a measure of when we need to move from a deterministic
         framework to a probabilistic one.
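
A sketch of the retention rule on two synthetic ten-member ensembles. Several things here are assumptions rather than the author's implementation: the member-by-member correlation matrix is used so that the eigenvalues sum to n, the "pathways" are simple sine and cosine patterns, and the function name is hypothetical.

```python
import numpy as np

def retained_components(members):
    """Eigenvalues of the member correlation matrix and the count exceeding 1."""
    corr = np.corrcoef(members)                   # rows are members
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted in decreasing order
    return eigenvalues, int((eigenvalues > 1.0).sum())

rng = np.random.default_rng(11)
grid = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)

def noise():
    return 0.1 * rng.standard_normal((10, grid.size))

# Case 1: all ten members follow the same pattern (one pathway).
one_path = np.sin(grid) + noise()

# Case 2: five members follow one pattern, five follow another (two pathways).
two_paths = np.vstack([np.sin(grid)] * 5 + [np.cos(grid)] * 5) + noise()

eig_one, n_one = retained_components(one_path)
eig_two, n_two = retained_components(two_paths)

print(n_one, eig_one[:3] / eig_one.sum())  # share of variance explained by leading PCs
print(n_two, eig_two[:3] / eig_two.sum())
```

In the first case a single eigenvalue carries nearly all of the variance; in the second, two eigenvalues of similar size pass the threshold, which is the signal to stop thinking deterministically.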

PCA Concerns
   PCA explores the linear relationships in the data.
        Non-linear factors are not considered.
        This shouldn't be a problem since we're running the
         algorithm on specific fields (i.e. we're looking at msl
         pressures, 500 mb heights, QPF's).
        There might be a concern if we were comparing 500 mb
         heights and QPF's (and you can do that with PCA).
   Sometimes higher order components are difficult to
    interpret physically (how do you interpret a negative QPF,
    for example).
   Since noise is shunted into the higher PC's, each
    successive component will be more and more noisy.

Varimax Rotation

   One lingering problem is that it becomes increasingly difficult to put
    successive PC's into physical terms. How do you interpret a QPF value
    that might end up being negative after a coordinate rotation?
   Our principal components do not exist in real space, but in component
    space and we need to describe what we see there in physical terms.
   The solution is to perform yet one more coordinate rotation, this one
    intended to maximize the variance of the squared loadings within each
    PC: a so-called Varimax Rotation
        Developed by Kaiser in 1958
        The goal is to obtain a clear pattern of factor loadings characterized
         by high loadings of some factors and low loadings of others.
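
For two factors the idea can be sketched with a brute-force search over the rotation angle. This is not Kaiser's analytic procedure, only a grid search that maximizes the same quantity (the variance of the squared loadings, summed over factors). The loading values, function names and grid resolution below are hypothetical.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(loadings ** 2, axis=0)))

def rotate(loadings, angle):
    """Apply an orthogonal rotation by `angle` (radians) to a two-factor loading matrix."""
    c, s = np.cos(angle), np.sin(angle)
    return loadings @ np.array([[c, -s], [s, c]])

def varimax_two_factors(loadings, n_angles=1801):
    """Grid search over [0, 90] degrees; the criterion repeats every 90 degrees."""
    angles = np.linspace(0.0, np.pi / 2.0, n_angles)
    values = [varimax_criterion(rotate(loadings, a)) for a in angles]
    best = angles[int(np.argmax(values))]
    return rotate(loadings, best), best

# Hypothetical "simple structure": each variable loads on one factor only.
simple = np.array([[0.90, 0.00], [0.80, 0.00], [0.85, 0.00],
                   [0.00, 0.70], [0.00, 0.75], [0.00, 0.80]])

# Mix the two factors by 30 degrees to mimic hard-to-read unrotated loadings.
unrotated = rotate(simple, np.deg2rad(-30.0))

rotated, angle = varimax_two_factors(unrotated)
print(np.rad2deg(angle))
print(np.round(rotated, 3))
```

The rotation leaves each variable's communality (its row sum of squared loadings) unchanged; only the way the variance is shared between the two factors changes.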

Unrotated and Rotated Factor Loadings

Unrotated
Variable      Factor 1      Factor 2
WORK_1        0.654384      0.564143
WORK_2        0.715256      0.541444
WORK_3        0.741688      0.508212
HOME_1        0.63412      -0.563123
Expl.Var      2.891313      1.791
Prp.Totl      0.481885      0.2985

Rotated
Variable      Factor 1      Factor 2
WORK_1        0.862443      0.051643
WORK_2        0.890267      0.110351
WORK_3        0.886055      0.152603
HOME_1        0.062145      0.845786
HOME_2        0.10723       0.902913
HOME_3        0.140876      0.869995
Expl.Var      2.356684      2.325629
Prp.Totl      0.392781      0.387605

   For the unrotated case, the factor loadings are all approximately the
    same for the first PC.
   For the second, you have a mixture of positive and negative values.
   After rotation, some factors are much closer to zero in one PC and they
    are maximized in the other and vice-versa. All are now positive.
   Since the individual factor loadings are now different, so are the
    Eigenvalues. They are much closer together.
   The rotated PCs may not be orthogonal anymore, so we can no longer say
    that they are uncorrelated, but at least we can interpret them.

Cool Facts About PCA Ensembles
   The principal components map out the relevant ensemble pathways
    through model space.
   If there is only one PC (i.e. one pathway), that PC is little different than
    the ensemble mean.
        This is good behavior since we know that the ensemble mean does produce
         forecasts that are less wrong. We don’t want solutions that show that the
         ensemble mean has no merit.
        Differences between the two are likely due to noise: the mean has it, the
         PC has it stripped out.
   Situations where there is more than one PC have multiple pathways
    and the ensemble mean should not even be considered.
        Careful here … too few ensembles may lead to a single PC when more
         ensembles may produce more PC’s.
        The variance explained by each PC gives a measure of how “well-traveled”
         the pathway is.
        The PCA analysis should tell the forecaster immediately when to and when
         not to use tools like the mean.
