MOS Performance

• MOS significantly improves on the skill of
  model output.
• National Weather Service verification
  statistics have shown a narrowing gap
  between human and MOS forecasts.
Figure: Cool Season Min. Temp – 12 UTC Cycle, average over 80 US stations. Horizontal axis: cool season, '66-67 through '01-02. Vertical axis ticks run from 2 to 7. Curves: 24-h and 48-h guidance (GUID.) and local (LCL) forecasts.
Figure: Prob. of Precip. – Cool Season (0000/1200 UTC cycles combined). Horizontal axis: year, 1966 through 2002. Vertical axis: Brier Score improvement over climate, 0 to 0.7. Curves: Guid POPS 24 hr, Local POPS 24 hr, Guid POPS 48 hr, Local POPS 48 hr.
MOS Won the Department
Forecast Contest in 2003
For the First Time!
   Average or Composite MOS
• There has been some evidence that an average or
  consensus MOS is even more skillful than
  individual MOS output.
• Vislocky and Fritsch (1997), using 1990-1992
  data, found that an average of two or more MOS
  forecasts (CMOS) outperformed the individual MOS
  forecasts and many human forecasters in a
  forecasting competition.
              Some Questions
• How does current MOS performance, driven by far
  superior models, compare with that of NWS
  forecasters around the country?
• How skillful is a composite MOS, particularly if
  one weights the members by past performance?
• How does relative human/MOS performance vary
  by forecast projection, region, large one-day
  variation, or when conditions vary greatly from
  climatology?
• Considering the results, what should be the role of
  human forecasters?
              This Study
• August 1 2003 – August 1 2004 (12 months).
• 29 stations, all at major NWS Weather
  Forecast Office (WFO) sites.
• Evaluated MOS predictions of maximum and
  minimum temperature, and probability of
  precipitation (POP).
National Weather Service locations used in the study.
            Forecasts Evaluated
• NWS Forecast by real, live humans
• EMOS: Eta MOS
• NMOS: NGM MOS
• GMOS: GFS MOS
• CMOS: Average of the above three MOSs
• WMOS: Weighted MOS, each member is weighted
  by its performance during a previous training period
  (ranging from 10-30 days, depending on each
  station).
• CMOS-GE: A simple average of the two best MOS
  forecasts: GMOS and EMOS
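
The combined forecasts in this list can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names are hypothetical, and inverse-MAE weighting is an assumption, since the slide says only that each member is weighted by its performance over a training period.

```python
from statistics import mean


def composite_mos(member_forecasts):
    """CMOS: simple average of the member MOS forecasts for one forecast time."""
    return mean(member_forecasts)


def inverse_mae_weights(training_forecasts, training_obs):
    """Weights from a training window (assumed scheme: inverse MAE, normalized).

    training_forecasts[m][d] is member m's forecast on training day d.
    """
    maes = [
        mean(abs(f - o) for f, o in zip(member, training_obs))
        for member in training_forecasts
    ]
    # Guard against division by zero if a member was perfect in training.
    inverse = [1.0 / max(mae, 1e-6) for mae in maes]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_mos(member_forecasts, weights):
    """WMOS: weighted average of the member forecasts."""
    return sum(w * f for w, f in zip(weights, member_forecasts))


# Hypothetical numbers: three members, a two-day training window.
weights = inverse_mae_weights([[60, 62], [61, 63], [64, 66]], [59, 61])
print(composite_mos([70, 72, 74]), weighted_mos([70, 72, 74], weights))
```

With these made-up numbers the member with the smallest training error receives the largest weight, so WMOS is pulled toward that member while CMOS stays at the plain average.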
  The Approach: Give the NWS the Advantage!
• 08-10Z-issued forecast from NWS matched against
  previous 00Z forecast from models/MOS.
   – NWS has 00Z model data available, and has added
     advantage of watching conditions develop since 00Z.
   – Models of course can’t look at NWS, but NWS looks at
     models.
• NWS forecasts go out 48 hours (model forecasts out 60), so
  the analysis covers:
   – Two maximum temperatures (MAX-T),
   – Two minimum temperatures (MIN-T), and
   – Four 12-hr POP forecasts.
Temperature Comparisons
                   Temperature




MAE (F) for the seven forecast types for all stations,
all time periods, 1 August 2003 – 1 August 2004.
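
For reference, the two temperature scores used in these comparisons, mean absolute error (MAE) and bias, can be computed as below. This is a minimal sketch with hypothetical function names; bias is taken here as forecast minus observation, which is the usual convention but is an assumption about the study.

```python
def mean_absolute_error(forecasts, observations):
    """Average size of the forecast error, ignoring its sign."""
    errors = [abs(f - o) for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


def mean_bias(forecasts, observations):
    """Average signed error (assumed convention: forecast minus observation)."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


# Hypothetical maximum temperatures in degrees F.
forecasts = [70, 65, 58]
observations = [72, 65, 55]
print(mean_absolute_error(forecasts, observations))
print(mean_bias(forecasts, observations))
```

A forecast can have a small bias and still a large MAE, because warm and cold errors cancel in the bias but not in the MAE.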
                 Large one-day temp changes




MAE for each forecast type during periods of large temperature
change (10F over 24-hr), 1 August 2003 – 1 August 2004.
Includes data for all stations.
MAE for each forecast type during periods of large
departure (20F) from daily climatological values,
1 August 2003 – 1 August 2004.
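
The two conditional samples above can be selected with a simple filter before computing MAE. The sketch below assumes that the 10F change is measured between consecutive days of observed temperature and that the 20F departure is observation minus daily climatology; the slides do not say which quantities were used, and the function names are hypothetical.

```python
def large_change_days(observations, threshold=10.0):
    """Indices of days whose observation differs from the previous day's by >= threshold."""
    return [
        i for i in range(1, len(observations))
        if abs(observations[i] - observations[i - 1]) >= threshold
    ]


def large_departure_days(observations, climatology, threshold=20.0):
    """Indices of days whose observation departs from climatology by >= threshold."""
    return [
        i for i, (obs, climo) in enumerate(zip(observations, climatology))
        if abs(obs - climo) >= threshold
    ]


def subset_mae(forecasts, observations, indices):
    """MAE restricted to the selected days."""
    return sum(abs(forecasts[i] - observations[i]) for i in indices) / len(indices)


# Hypothetical four-day record in degrees F.
obs = [50, 62, 60, 45]
climo = [55, 40, 58, 66]
fcst = [51, 58, 61, 50]
days = large_change_days(obs)
print(days, subset_mae(fcst, obs, days))
```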
Number of days each forecast is the most accurate, all stations.
In (a), tie situations are counted only when the most accurate temperatures are exactly equivalent.
In (b), the looser tie definition, tie situations are cases when the most accurate temperatures are within 2F of each other.
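
The tally described in this caption can be sketched as follows. How the study scored ties is not spelled out, so this sketch makes two assumptions: closeness is judged on absolute errors, and any day with more than one forecast within the tolerance of the best is counted once in a separate "tie" category. All names are hypothetical.

```python
from collections import Counter


def most_accurate_counts(forecasts_by_type, observations, tie_tolerance=0.0):
    """Count the days on which each forecast type has the smallest absolute error.

    tie_tolerance=0.0 mimics the strict definition in (a);
    tie_tolerance=2.0 mimics the looser definition in (b).
    """
    counts = Counter()
    for day, obs in enumerate(observations):
        errors = {
            name: abs(values[day] - obs)
            for name, values in forecasts_by_type.items()
        }
        best = min(errors.values())
        winners = [name for name, err in errors.items() if err - best <= tie_tolerance]
        if len(winners) == 1:
            counts[winners[0]] += 1
        else:
            counts["tie"] += 1
    return counts


# Hypothetical three-day sample in degrees F.
sample = {"NWS": [70, 60, 55], "GMOS": [71, 60, 50], "EMOS": [75, 63, 54]}
observed = [70, 61, 53]
print(most_accurate_counts(sample, observed))
print(most_accurate_counts(sample, observed, tie_tolerance=2.0))
```

The same function counts the least accurate forecast if `min` is replaced by `max` and the comparison is reversed.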
Number of days each forecast is the least accurate, all stations.
In (a), tie situations are counted only when the least accurate temperatures are exactly equivalent.
In (b), the looser tie definition, tie situations are cases when the least accurate temperatures are within 2F of each other.
                   Highly correlated time series




Time series of MAE of MAX-T for period one for all stations, 1 August
2003 – 1 August 2004. The mean temperature over all stations is
shown with a dotted line. 3-day smoothing is performed on the data.
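
The 3-day smoothing mentioned in the caption is most simply read as a running mean. The sketch below assumes a centered window that shrinks at the two ends of the series; the slides do not state which variant was used, and the function name is hypothetical.

```python
def running_mean(values, window=3):
    """Centered running mean; the window is truncated at the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical daily MAE values in degrees F.
print(running_mean([3, 6, 9, 3]))
```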
                              Cold spell




Time series of bias in MAX-T for period one for all stations, 1
August 2003 – 1 August 2004. Mean temperature over all
stations is shown with a dotted line. 3-day smoothing is
performed on the data.
  MAE for all stations, 1 August 2003 – 1 August
  2004, sorted by geographic region.
MOS seems to have the most problems at high-elevation stations.
Bias for all stations, 1 August 2003 – 1 August 2004,
sorted by geographic region.
Precipitation Comparisons
Brier Scores for Precipitation for all stations for the entire
study period.
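
The Brier Score is the mean squared difference between the forecast probability and the outcome (1 if precipitation occurred, 0 if not), so lower is better. A minimal sketch follows, together with the improvement over a constant climatological probability, which is the kind of measure plotted in the NWS verification figure near the start of the deck. Function names and sample values are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error of probability forecasts; outcomes are 0 or 1."""
    squared = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)


def improvement_over_climate(probabilities, outcomes, climo_probability):
    """Fractional reduction in Brier Score relative to always forecasting climatology."""
    reference = brier_score([climo_probability] * len(outcomes), outcomes)
    return 1.0 - brier_score(probabilities, outcomes) / reference


# Hypothetical 12-hr POP forecasts and verifying outcomes.
pops = [0.9, 0.1, 0.5, 0.2]
rain = [1, 0, 1, 0]
print(brier_score(pops, rain))
print(improvement_over_climate(pops, rain, climo_probability=0.5))
```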
Brier Score for all stations, 1 August 2003 – 1 August
2004. 3-day smoothing is performed on the data.
                 Precipitation




Brier Score for all stations, 1 August 2003 – 1 August
2004, sorted by geographic region.
Reliability diagrams for period 1 (a), period 2 (b),
period 3 (c) and period 4 (d).
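
A reliability diagram plots, for each bin of forecast probability, how often precipitation actually occurred; a perfectly reliable forecast lies on the diagonal. The points behind such a diagram can be tabulated as sketched below. The equal-width binning and the function name are assumptions, not details taken from the study.

```python
def reliability_table(probabilities, outcomes, n_bins=10):
    """Return (mean forecast probability, observed frequency, count) for each non-empty bin."""
    counts = [0] * n_bins
    hits = [0] * n_bins
    prob_sums = [0.0] * n_bins
    for p, o in zip(probabilities, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # a probability of 1.0 goes in the top bin
        counts[b] += 1
        hits[b] += o
        prob_sums[b] += p
    return [
        (prob_sums[b] / counts[b], hits[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


# Hypothetical POP forecasts and outcomes, tabulated in five bins.
pops = [0.1, 0.1, 0.5, 0.5, 0.9, 0.9, 0.9, 0.9]
rain = [0, 0, 1, 0, 1, 1, 1, 0]
for row in reliability_table(pops, rain, n_bins=5):
    print(row)
```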
NWS Main MOS site:

http://www.nws.noaa.gov/mdl/synop/products.shtml
                   Ensemble MOS

 Ensemble MOS forecasts are based on ensemble runs of the GFS
model included in the 0000 UTC ensemble suite each day. These
runs include the operational GFS, a control version of the GFS (run
at lower resolution), and 10 pairs (positive and negative) of bred
perturbation runs (20 members). The operational GFS MOS
prediction equations are applied to the output from each of the
ensemble runs to produce separate bulletins in the same format as
the operational message.
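
The idea of applying one set of prediction equations to every ensemble run can be sketched as below. MOS equations are linear regressions on model output, but the predictor names, coefficients, and member values shown here are invented purely for illustration and are not the operational GFS MOS equations.

```python
def apply_mos_equation(predictors, coefficients, intercept):
    """Evaluate one linear MOS-style equation for a single model run."""
    return intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )


def ensemble_mos(member_predictors, coefficients, intercept):
    """Apply the same equation to each ensemble member, one forecast per member."""
    return {
        member: apply_mos_equation(predictors, coefficients, intercept)
        for member, predictors in member_predictors.items()
    }


# Hypothetical equation and two of the ensemble runs.
coefficients = {"t2m": 0.9, "wind": -0.2}
members = {
    "operational": {"t2m": 60.0, "wind": 10.0},
    "control": {"t2m": 58.0, "wind": 5.0},
}
print(ensemble_mos(members, coefficients, intercept=5.0))
```

Because the equations are fixed, any spread among the resulting forecasts comes entirely from differences among the model runs.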
            Gridded MOS
•The NWS needs MOS on a grid for many
reasons, including for use in their IFPS
analysis/forecasting system.
•The problem is that MOS is only available at
station locations.
•A recent project is to create Gridded MOS.
•Gridded MOS takes MOS at individual stations and spreads it
out based on proximity and height differences.
It also applies a topographic correction based on a
reasonable lapse rate.
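
One simple way to picture spreading station values by proximity with a lapse-rate correction is inverse-distance weighting of elevation-adjusted temperatures, sketched below. This is not the NWS Gridded MOS algorithm; the weighting scheme, the 6.5 K/km lapse rate, and all names are assumptions chosen only to illustrate the idea.

```python
import math

# Assumed standard-atmosphere lapse rate of 6.5 K per km, expressed in degrees F per metre.
LAPSE_RATE_F_PER_M = 0.0065 * 1.8


def grid_point_temperature(grid_xy_km, grid_elev_m, stations, power=2.0):
    """Inverse-distance-weighted temperature at a grid point.

    stations is a list of (x_km, y_km, elev_m, mos_temp_f) tuples. Each station
    value is first adjusted to the grid point's elevation with the lapse rate.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, elev, temp in stations:
        adjusted = temp - LAPSE_RATE_F_PER_M * (grid_elev_m - elev)
        distance = math.hypot(grid_xy_km[0] - x, grid_xy_km[1] - y)
        if distance == 0.0:
            return adjusted  # grid point coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * adjusted
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: one in a valley, one at the grid point's elevation.
stations = [(0.0, 0.0, 0.0, 60.0), (6.0, 8.0, 1000.0, 50.0)]
print(grid_point_temperature((3.0, 4.0), 1000.0, stations))
```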
     Gridded MOS SITE

http://www.nws.noaa.gov/mdl/synop/gmos.html
Current “Operational” Gridded MOS
Grid-Based Model Bias Removal:
Model biases are a reality. We need to get rid of them.
        Grid-Based Bias Removal
• In the past, the NWS has attempted to remove
  these biases only at observation locations (MOS,
  Perfect Prog); the recent exception is Gridded MOS.
• Removal of systematic model bias on forecast grids
  is needed. Why?
   – All models have significant systematic biases
   – NWS and others want to distribute graphical
     forecasts on a grid (IFPS)
   – People and applications need forecasts
     everywhere…not only at ASOS sites
   – Important post-processing step for ensembles
   A Potential Solution: Obs-Based, Grid-Based Bias Removal
• Based on observations, not analyses.
• Base the bias removal on observation-site
  land use category, elevation, and
  proximity. Land use and elevation are the
  key parameters that control physical biases.
Spatial differences in bias
                          The Method
• Calculate model biases at observation locations by interpolating model
  forecasts to observation sites.
• Identify a land use, elevation, and lat-lon for each observation site.
• Calculate biases at these stations hourly. Thus, one has a data base of
  biases.
• For every forecast hour: At every forecast grid point search for nearby
  stations of similar land use and elevation and for which the previous
  forecast value is close to that being forecast at the grid point in question.
    – E.g., if the forecast temperature was 60, only use biases for nearby stations of
      similar land-use/elevation associated with forecasts of 55-65.
• Collect a sufficient number of these (using closest ones first) to average
  out local effects (roughly a half dozen). Average the biases for these sites
  and apply the bias correction to the forecast.
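
The search-and-average step in the method above can be sketched as follows. The slides give no numeric thresholds other than the 55-65 example and "roughly a half dozen" stations, so the elevation tolerance, the exact land-use match, the flat x/y coordinates, the bias sign (forecast minus observation), and all names are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Station:
    x_km: float
    y_km: float
    land_use: str
    elev_m: float
    history: list  # past (forecast, bias) pairs, bias = forecast - observation


def corrected_forecast(grid_x, grid_y, land_use, elev_m, forecast, stations,
                       n_needed=6, elev_tol_m=250.0, value_tol=5.0):
    """Subtract the mean bias of nearby, similar stations from a grid-point forecast."""
    nearest_first = sorted(
        stations, key=lambda s: math.hypot(s.x_km - grid_x, s.y_km - grid_y)
    )
    station_biases = []
    for s in nearest_first:
        if s.land_use != land_use or abs(s.elev_m - elev_m) > elev_tol_m:
            continue  # wrong land use or too different in elevation
        # Keep only biases from past forecasts close to the current forecast value.
        similar = [bias for past, bias in s.history if abs(past - forecast) <= value_tol]
        if similar:
            station_biases.append(sum(similar) / len(similar))
        if len(station_biases) == n_needed:
            break
    if not station_biases:
        return forecast  # nothing comparable nearby; leave the forecast unchanged
    return forecast - sum(station_biases) / len(station_biases)


# Hypothetical stations around a forested grid point at 520 m forecasting 60F.
stations = [
    Station(1.0, 0.0, "forest", 500.0, [(58.0, 2.0), (75.0, 9.0)]),
    Station(2.0, 0.0, "urban", 500.0, [(60.0, 4.0)]),
    Station(3.0, 0.0, "forest", 550.0, [(62.0, 4.0)]),
    Station(4.0, 0.0, "forest", 2000.0, [(60.0, 10.0)]),
]
print(corrected_forecast(0.0, 0.0, "forest", 520.0, 60.0, stations, n_needed=2))
```

In this example the urban station and the high-elevation station are skipped, and the 75F case at the nearest station is ignored because it is too far from the 60F being forecast.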
Raw 12-h Forecast   Bias-Corrected Forecast
Salt Lake City
Bozeman
                   The End




http://www.atmos.washington.edu/~jbaars/mos_vs_nws.html

				