This paper outlines the model used to forecast the General Assistance caseload.

Description of Program:
Washington State’s General Assistance Program (GA) provides a cash grant of up to $339 a
month to low-income adults (ages 18-64 years) who are unable to work due to a physical
or mental impairment. Adults receiving a cash grant are also eligible for some type of
state-funded medical care.

The General Assistance Program comprises several separate programs to which
clients are assigned based on the type and severity of incapacity:
        General Assistance Unemployable (GA-U) provides benefits to adults who have
an impairment that is expected to last at least 90 days.
        Adults who have been or who are expected to be disabled for 12 months or more
and are likely to qualify for federal disability aid under the Supplemental Security
Income (SSI) program transition from the GA-U program to the General Assistance-
Expedited Medicaid (GA-X) program.
        General Assistance Aged (GA-A) provides benefits for aged adults with few
resources who do not qualify for Social Security or SSI.
        A small number of clients are on similar programs for the blind, disabled, and
institutionalized (GA-B, GA-D, GA-R, or GA-K).

Recipients of the General Assistance monthly grant can be counted either as persons
or as assistance units. Although in the vast majority of cases an assistance unit
is an individual person, in some cases both spouses of a married couple qualify for
benefits and are counted as one assistance unit. Between 1997 and 2004 the number of
people was between 1.5 and 3.5 percent higher than the number of assistance units.
By tradition, the Caseload Forecast Council forecasts assistance units rather than people.

Entry and Exit rates:
An entry-exit model is used to forecast the caseload. First, an entry rate is calculated as
the number of new cases this month as a percent of the eligible population last month.
The eligible population is calculated as all Washington residents between the ages of 18
and 64 less the number of people already on the caseload. The exit rate is calculated as
the number of exits from the program this month as a percentage of the total caseload last
month.

[Figure: Entry rate - historical data]

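The entry and exit rate definitions above can be sketched in a few lines of Python. This is only an illustration: the function names, the population figure, and the case counts are hypothetical, and the one-month projection step (last month's caseload plus entries minus exits) is an assumption about how the two rates are combined, not a description of the Council's actual code.

```python
def entry_rate(new_cases, population_18_64_prev, caseload_prev):
    """New cases this month as a percent of last month's eligible population."""
    # Eligible population: residents aged 18-64 less those already on the caseload.
    eligible_prev = population_18_64_prev - caseload_prev
    return 100.0 * new_cases / eligible_prev


def exit_rate(exits, caseload_prev):
    """Exits this month as a percent of last month's total caseload."""
    return 100.0 * exits / caseload_prev


def project_caseload(caseload_prev, population_18_64_prev,
                     entry_rate_pct, exit_rate_pct):
    """Assumed projection step: roll the caseload forward one month."""
    entries = entry_rate_pct / 100.0 * (population_18_64_prev - caseload_prev)
    exits = exit_rate_pct / 100.0 * caseload_prev
    return caseload_prev + entries - exits


# Hypothetical figures, for illustration only.
population = 4_000_000
caseload = 25_000
e_in = entry_rate(1_590, population, caseload)
e_out = exit_rate(1_500, caseload)
next_caseload = project_caseload(caseload, population, e_in, e_out)
print(round(e_in, 4), round(e_out, 2), round(next_caseload))
```

Because the eligible population is so much larger than the caseload, the entry rate is a very small percentage while the exit rate is not, which is worth keeping in mind when the two series are compared.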
Entry rates have been on an overall upward trend since the decline that took place in the
early part of 2002. In 2004, entry rates have leveled off somewhat after growing strongly
during the first few months. From January to March 2004, entry rates grew by 25.8%,
then rates declined by 10.3% over the next two months. From May 2004 through the
most current data point, October 2004, entry rates have remained fairly steady, averaging

[Figure: Exit rate - historical data]

Exit rates began to decline during the second quarter of 2002. This decline
coincided with the end of the decline, and subsequent rise, in entry rates. Rising entry
rates combined with declining exit rates led to a rise in the caseload. The decline in exit
rates never reversed itself, but it did level off in early 2003. Since that time, exit rates
have shown more variation than entry rates, but less trend. Exit rates, especially over
the past year, have varied around a fairly constant average.

The entry rate, exit rate, and caseload data are compiled over time from
administrative data sources. The counts are adjusted and added to over time as forms
are processed in a later month or clients are reclassified. Revision is greatest for the
most recent months. A process called lag adjustment is used to estimate the
amount by which a data point will ultimately be adjusted. Consider, for example, that
caseload data through December 2004 became available in January 2005. The data was
subsequently updated in February 2005 and March 2005. Each new run of the data
contained more client records for each month.

Let’s take a look at the data for July 2004. By the January 2005 run, this data was six
months old: the first look at the July data had occurred in August, the second in September,
the third in October, the fourth in November, and the fifth in December. From January 2005
to March 2005, the number of recorded cases for July 2004 changed little. This
indicates that for General Assistance, few new records are added after six or seven
months.

The first look at the data for December 2004, however, occurred in January 2005, and
February was only the second look. The data for July had already been pulled in August,
September, October, November, and December before it was pulled again in January, so the
July client count changed little across the January, February, and March data runs. The
December count, by contrast, was revised upward by 607, or 2.5%, in the February run and
by an additional 68, or 0.27%, in the March run.

Caseload for each month as recorded in three successive data runs:

  Month      Jan 2005 run   Feb 2005 run   Mar 2005 run
  Jul-04           24,001         24,006         24,010
  Aug-04           24,283         24,290         24,293
  Sep-04           24,507         24,521         24,527
  Oct-04           24,785         24,817         24,830
  Nov-04           24,769         24,861         24,885
  Dec-04           24,642         25,249         25,317
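
The revisions between runs can be checked directly from the table. The short Python sketch below uses only the counts shown above; the names `runs` and `revisions` are chosen here for illustration.

```python
# Caseload counts from the table: (Jan 2005, Feb 2005, Mar 2005) data runs.
runs = {
    "Jul-04": (24001, 24006, 24010),
    "Aug-04": (24283, 24290, 24293),
    "Sep-04": (24507, 24521, 24527),
    "Oct-04": (24785, 24817, 24830),
    "Nov-04": (24769, 24861, 24885),
    "Dec-04": (24642, 25249, 25317),
}


def revisions(counts):
    """Return (change, percent change) between each pair of successive runs."""
    return [(later - earlier, 100.0 * (later - earlier) / earlier)
            for earlier, later in zip(counts, counts[1:])]


for month, counts in runs.items():
    steps = ", ".join(f"{chg:+d} ({pct:.2f}%)" for chg, pct in revisions(counts))
    print(f"{month}: {steps}")
```

The older the month, the smaller the revisions, which is the pattern the lag adjustment relies on.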

The lag adjustment process added 714 clients to the January run of the December
caseload. This created an estimate for the caseload in December 2004 of 24,642 + 714 =
25,356. In other words, as of January 2005, our best estimate of where the December 2004
caseload will eventually settle is 25,356. The magnitude of the lag adjustment is based
on the average amount by which each month is adjusted between the time the data is first
compiled and the time all data is in and the count no longer changes.
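
A minimal sketch of this calculation follows. The history of first-look and settled counts is hypothetical and was chosen only so that the average adjustment reproduces the 714 quoted above; the actual historical revisions are not given in this paper.

```python
def average_adjustment(history):
    """Average amount added between the first look and the settled count."""
    diffs = [settled - first_look for first_look, settled in history]
    return sum(diffs) / len(diffs)


def lag_adjusted(first_look, history):
    """First-look count plus the average historical adjustment."""
    return first_look + average_adjustment(history)


# Hypothetical (first look, settled) pairs for three earlier months.
history = [(23800, 24500), (24000, 24728), (24200, 24914)]

adjustment = average_adjustment(history)
december_estimate = lag_adjusted(24642, history)
print(adjustment, december_estimate)
```

The same approach extends to the second, third, and later looks by keeping a separate average adjustment for each age of data.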

Time Series Models:
Time series modeling is used to forecast both entry and exit rates. SAS software fits a
variety of time series models to the data, including ARIMA models, smoothing models,
and random walk or constant trend models. It then allows the user to rank the models
based on different best-fit criteria. Usually a holdout sample is identified to test the
fit of the model. For example, the most recent six months of data would be withheld when
fitting the model, and the accuracy of the forecast would then be tested against these
six data points.
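
The holdout idea can be illustrated outside SAS with a short Python sketch. The three simple models here (random walk, constant mean, linear trend) merely stand in for the much richer set SAS fits, the mean absolute percent error is one assumed ranking criterion among many, and the series is made up.

```python
def random_walk(train, horizon):
    """Forecast every future point as the last observed value."""
    return [train[-1]] * horizon


def constant_mean(train, horizon):
    """Forecast every future point as the historical mean."""
    return [sum(train) / len(train)] * horizon


def linear_trend(train, horizon):
    """Least-squares straight line through the training data, extended forward."""
    n = len(train)
    x_mean = (n - 1) / 2
    y_mean = sum(train) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in enumerate(train))
             / sum((x - x_mean) ** 2 for x in range(n)))
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


def mape(actual, predicted):
    """Mean absolute percent error."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)


def rank_models(series, holdout=6):
    """Fit on all but the last `holdout` points and rank models by holdout error."""
    train, test = series[:-holdout], series[-holdout:]
    models = {"random walk": random_walk,
              "constant mean": constant_mean,
              "linear trend": linear_trend}
    scores = {name: mape(test, fit(train, holdout)) for name, fit in models.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical series with a steady upward trend.
series = [100 + 2 * t for t in range(24)]
ranking = rank_models(series)
for name, error in ranking:
    print(f"{name}: {error:.2f}%")
```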

The SAS models also account for seasonality and non-stationarity, so these aspects of
the time series are taken into account in choosing the model with the best fit.

Still, a considerable amount of forecaster judgment goes into the selection of the model.
The major considerations are:
         1 – How much historical data should be included?
         2 – Should an indicator be used to account for an anomalous period?
         3 – Should outlying data points be used?

The first consideration is the choice of historical data, that is, the length of the time
series. Time series models assume that the data has been generated by some given
stochastic process. This assumption is violated if there has been some fundamental
change in the process generating the data. In such a case, the model should be developed
only for the period after the change. A permanent change in the data generating process
is distinct from the problem of an anomalous period. In the case of an anomalous period
or an outlying data point, it is assumed that one process has produced all the data, but
that the process was disrupted for one or more periods. In such a case, the best procedure
is to use one model for the entire time series, but to use a dummy variable or smoothing
technique to account for the odd periods.
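
As a rough illustration of the dummy variable approach, consider the simplest possible model: a constant level plus an indicator for the odd periods. For that model the least-squares estimates reduce to group means, so no regression library is needed. The series and the flagged periods below are hypothetical, and a real application would add the indicator to a fuller time series model.

```python
def fit_level_with_indicator(series, anomalous):
    """Least-squares fit of y = level + effect * D, where D = 1 in odd periods.

    With only an intercept and one dummy variable, the estimates are the mean
    of the normal periods and the gap between the two group means.
    """
    normal = [y for y, flag in zip(series, anomalous) if not flag]
    odd = [y for y, flag in zip(series, anomalous) if flag]
    level = sum(normal) / len(normal)
    effect = sum(odd) / len(odd) - level
    return level, effect


# Hypothetical monthly rates with three disrupted months in the middle.
rates = [0.040, 0.041, 0.039, 0.030, 0.031, 0.029, 0.040, 0.041, 0.039]
flags = [False, False, False, True, True, True, False, False, False]

level, effect = fit_level_with_indicator(rates, flags)
print(round(level, 4), round(effect, 4))
```

The forecast is then based on `level` alone, since the indicator is zero in future periods; the odd months neither drag the level down nor need to be dropped from the series.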

To illustrate the difference between these problems and the most appropriate response to
each, consider the following three scenarios [1]:

        First scenario: The GA program changes its entry criteria in May 2001. Once the
change is implemented, it continues indefinitely into the future. In such a case, we
would not expect a change in exit rates, but entry rates may behave quite differently than
they did prior to May 2001. If this were the case, the past behavior of entry rates would
not be useful in predicting their future behavior. We would want to use only entry rates
from May 2001 on to predict future entry rates.

        Second scenario: Suppose that from January 2003 through June 2003 a major
state hospital that had traditionally served the medical needs of indigent people closed
due to a catastrophic fire. Although patients were transferred to other hospitals, many
homeless or near-homeless people are hard to reach and probably put off care.
Furthermore, the administrative structures in place to process applications for GA from
this hospital did not operate. In June 2003 the hospital reopened. An indicator variable
would probably be the most appropriate here, because the underlying process does not
change; it is just interrupted.

[1] These three scenarios are all purely fictitious and were developed only to illustrate points about model

        Third scenario: All data points are steady and consistent until the most recent
month of data comes in showing a drop of 15%. No reason for the drop can be identified
by the workgroup. In this case, we simply don’t know what happened. The data point is very
odd given the history, and until we get more information the best policy is probably to
ignore it. Alternatively, if we think the data point is giving us some useful information,
we might just dampen its effect on our trend to be safe.
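
One way to picture the choice between ignoring and dampening an odd data point is the sketch below. It is an assumption-laden illustration, not the Council's procedure: the expected value is taken to be the simple mean of the history, and the 10% threshold and the weight of 0.5 are arbitrary hypothetical settings.

```python
def dampen_latest(series, weight=0.5, threshold=0.10):
    """Pull an unexplained outlying latest point back toward its expected value.

    weight = 0 ignores the new point entirely, weight = 1 accepts it as is.
    The expected value is simply the mean of the earlier points (an assumption).
    """
    history, latest = series[:-1], series[-1]
    expected = sum(history) / len(history)
    deviation = (latest - expected) / expected
    if abs(deviation) <= threshold:
        return latest  # within the normal range, so use the point unchanged
    return expected + weight * (latest - expected)


# Hypothetical steady series followed by an unexplained 15% drop.
steady_then_drop = [100.0] * 11 + [85.0]

dampened = dampen_latest(steady_then_drop)
ignored = dampen_latest(steady_then_drop, weight=0.0)
print(dampened, ignored)
```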

Tests of robustness:
Different statistical models may generate quite different forecasts even though their
goodness of fit measures are quite close. In an extreme case, suppose that
forecast A predicted an upward trend while forecast B predicted a downward trend. The
Akaike Information Criterion (AIC), a commonly used measure of goodness of
fit, might be quite similar for the two forecasts. If a slight change in one of the data
points would change which forecast the AIC favors, we would say the forecast is not
very robust. That is, for a robust model, a small change in the data would not produce a
large change in the forecast.
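
A simple robustness check along these lines can be sketched in Python. The sketch compares only two toy models, a constant level and a straight-line trend, and uses the Gaussian least-squares form of the AIC, n ln(RSS/n) + 2k. The choice of models, the data, and the size of the perturbation are all assumptions made for illustration.

```python
import math


def rss_constant(y):
    """Residual sum of squares around the mean."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)


def rss_trend(y):
    """Residual sum of squares around a least-squares straight line."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    slope = (sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y))
             / sum((x - x_mean) ** 2 for x in range(n)))
    intercept = y_mean - slope * x_mean
    return sum((v - (intercept + slope * x)) ** 2 for x, v in enumerate(y))


def aic(rss, n, k):
    """AIC for a least-squares fit with k parameters (requires rss > 0)."""
    return n * math.log(rss / n) + 2 * k


def preferred_model(y):
    """Name of the model with the lower AIC."""
    n = len(y)
    scores = {"constant": aic(rss_constant(y), n, 1),
              "trend": aic(rss_trend(y), n, 2)}
    return min(scores, key=scores.get)


def is_robust(y, index, delta):
    """True if nudging one data point up or down leaves the AIC choice unchanged."""
    baseline = preferred_model(y)
    for bump in (-delta, delta):
        perturbed = list(y)
        perturbed[index] += bump
        if preferred_model(perturbed) != baseline:
            return False
    return True


# Hypothetical series with a clear upward trend and a little noise.
rates = [10.0, 12.1, 13.9, 16.2, 17.8, 20.1, 22.0, 23.9]
print(preferred_model(rates), is_robust(rates, 3, 0.1))
```

Repeating the check for every data point, or for larger perturbations, gives a fuller picture of how fragile the model choice is.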

If the forecast is not robust, a number of other attributes of the time series can be
examined to help the forecaster improve the forecast. It may be that the data has too
much variation relative to its trend to forecast successfully, or that the process
generating the data is shifting fundamentally in the current period. Autocorrelation and
partial autocorrelation functions can be examined to learn more about the nature of the
time series. Beyond technical solutions, discussions with program or legislative staff may
shed more light on why the data is behaving as it is. All such factors need to be examined
together to identify the best possible forecast.
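
For reference, the sample autocorrelation function is straightforward to compute by hand. The sketch below uses the usual estimator (lagged cross-products of deviations from the mean, divided by the total sum of squared deviations); the partial autocorrelation function is omitted for brevity, and the example series is hypothetical.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelations for lags 0 through max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    return [sum((series[t] - mean) * (series[t + k] - mean)
                for t in range(n - k)) / denom
            for k in range(max_lag + 1)]


# Hypothetical series that alternates from month to month.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf = autocorrelation(alternating, 2)
print([round(r, 3) for r in acf])
```

A large negative value at lag 1, as in this example, indicates month-to-month reversals, while values that decay slowly from lag to lag suggest a trend or other non-stationarity.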

The role of the technical workgroup:
Even before the first scheduled meeting of a season, the expertise of the technical
workgroup members is crucial in guiding the forecaster through the forecasting process.
Out-of-meeting communication with both program and legislative staff is necessary. Program
staff provide information on:
        - Reasons behind historical movements in caseloads
        - Current or expected future changes in work practices or internal policies that
          could affect the caseload
        - The pace of implementation of policies
        - Data issues
        - Et cetera
Since the information provided by program staff is so crucial and because some staff
members cannot or should not attend formal technical workgroup meetings, the
forecaster’s contact with program staff can and should extend beyond the technical
workgroup members. For example, a certain change in the data in one specific Community
Services Office (CSO) could be explained quickly and easily by the director of that CSO,
but there would be no reason for this person to regularly attend technical workgroup
meetings or even to be interested in the official forecast.

Legislative staff know a lot about the implementation of policies that affect caseloads and
can keep the forecaster up-to-date on emerging legislative initiatives. Legislative staff
also often have a broader perspective and can view the specific caseload within the
context of other caseload changes. As many of the caseloads are intertwined either
causally or casually such information can be crucial in explaining and predicting caseload

Model Assessment:
Once the forecast is accepted by the technical workgroup, it is submitted to the formal
workgroup and then to the Council for approval. Once the forecast becomes official it is
used in the budget process.

Each month as new caseload data becomes available, the variance [2] between the new data
and the forecast is calculated and published on the Caseload Forecast Council website.
The technical workgroup decides whether to update the forecast based on:

           Outside changes likely to change the level or trend of the caseload
                  - Information about upcoming changes to the program
                  - Legislation expected to impact the forecast
                  - Other changes likely to affect the caseload
           An examination of caseload variances
                  - Two or more variances larger than the average deviation from trend over
                    the past 12 months
                  - Variances that move in one direction, becoming progressively larger
                  - One observation with a large variance is generally not enough to
                    motivate a revision unless it is backed up with additional information
                    that explains the change
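
The variance criteria in this list can be expressed as a small check. The sketch below takes the variance to be the ratio of the new caseload data to the forecast, minus one, and treats the average deviation from trend over the past 12 months as a threshold supplied by the forecaster. The function names and example numbers are hypothetical, and the decision itself remains a judgment of the technical workgroup rather than a mechanical rule.

```python
def variance(actual, forecast):
    """Ratio of the new caseload data to the forecast, minus one."""
    return actual / forecast - 1


def revision_signals(variances, threshold):
    """Flag the two variance patterns that may motivate a forecast revision.

    `threshold` stands for the average deviation from trend over the past
    12 months, which is assumed to be computed elsewhere.
    """
    large = sum(1 for v in variances if abs(v) > threshold)
    same_sign = all(v > 0 for v in variances) or all(v < 0 for v in variances)
    growing = all(abs(later) > abs(earlier)
                  for earlier, later in zip(variances, variances[1:]))
    return {
        "two_or_more_large": large >= 2,
        "widening_one_direction": len(variances) >= 2 and same_sign and growing,
    }


# Hypothetical example: three months of data running increasingly above forecast.
one_month = variance(25250, 25000)
signals = revision_signals([0.004, 0.011, 0.019], threshold=0.01)
print(round(one_month, 4), signals)
```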

Once a decision is made to revise a forecast, most often the entire model is re-run using
the new data. If the adjustment appears to be small, however, only the trend may be

[2] The variance is calculated as the ratio of the new caseload data to the forecast, minus one.
