Prof. Steve DeLurgio

PROCEDURE FOR UNIVARIATE ARIMA ANALYSIS

Before doing any analysis, study the series of interest. What is it? What are some of the cause and effect relationships of the series? What are some of the seasonal cause and effect relationships? How is the series measured? Is this one series, or two or more series? That is, does the graph show one or more series from a statistical standpoint?

Analyze plots of the series. Is there a lot of noise in the series? If so, then this may be a very difficult series to analyze using ARIMA.

Also, remember the principle of parsimony. Complex ARIMA models are not expected; normally there are no more than one or two terms in the model unless there is more complex seasonality. Simpler models forecast out-of-sample observations better than complex, over-fitted models.

Identification

1) Plot the series, scatter diagrams, histograms, and standardized plots. Are summary statistics consistent with an approximately normal distribution? If not, consider alternative transformations.
2) Using a plot, confirm that expected patterns exist. Is there seasonality as expected? Why or why not? Is there variance stationarity? Why or why not? Is there level stationarity? Why or why not?
3) Look for outliers. Be careful to ensure that seasonality is not the cause of apparent outliers. Don't mistakenly adjust seasonal values, thinking they are outliers.
4) Investigate obvious outliers and document them with a full explanation. Under all circumstances, keep a copy of your unadjusted data.
5) Is the series stationary in level and variance, as evidenced by plots of the series, autocorrelations, and partial autocorrelations?
6) If not stationary in variance, then transform the series as needed. Use logs or power transformations as necessary.
7) If not stationary in level, then investigate the appropriate level of differencing. Consider both seasonal and nonseasonal differencing. Be careful not to over-difference.
When in doubt, fit an autoregressive model with the appropriate φ value; φ should be close to 1.0, but not necessarily exactly one. When in doubt about differencing, you can develop two models, one with AR(p) terms and the other with the appropriate differences.
8) When taking differences, ensure that the standard deviation goes down. After taking differences, repeat steps 1) through 7), particularly the investigation of outliers created during the differencing process.
9) After taking differences, study the autocorrelations and partial autocorrelations for significant patterns. These should provide a hint of the underlying stochastic process.
10) Be sure to confirm ARMA ACF and PACF patterns against charts of theoretical profiles. Remember the basic patterns and their behavior with positive and negative φ and θ parameters.
11) Avoid a shotgun approach in which you put several parameters into the model at the same time. Build models iteratively.

Estimation

12) Estimate the model suggested by the autocorrelations or partial autocorrelations.
13) If the estimation procedure has trouble converging on parameter estimates, then increase the number of iterations, with 40 or 50 being the maximum. However, investigate the cause of the convergence problems; typically your model is too complex or has redundant parameters.
14) Confirm that the nonlinear estimation procedure converged and terminated its search as expected.
15) Request a correlation matrix of the parameter estimates. If the correlation between two parameter estimates is very high, then they may be redundant. Try dropping each one in turn, and then both together. If there is no significant change in the Sum of Squared Errors, then they do not belong in the model. Remember the concept of parsimony.

Diagnostics

16) Remember, a good model should have the following characteristics:
- Statistically significant parameter estimates.
- Stationary and invertible parameter estimates.
- Low residual standard error compared to the standard deviation of the original series.
- High coefficient of determination (R-square).
- Residuals which are white noise:
  - No low-order or seasonal-lag ACFs are more than 2 Se (two standard errors) from zero.
  - The Box-Pierce statistic (also known as Q; the Ljung-Box statistic is its modified form) does not indicate that there is a pattern in the ACFs.
  - No low-order or seasonal-lag PACFs are more than 2 Se from zero.
  - Plots of residuals reveal no outliers or nonstationarity. If significant outliers exist, then your model may not be as efficient as it could be, and your analysis may be flawed.
17) Forecasted values (not fitted values) are reasonable, particularly when judged against expert opinion. Forecast error variance profiles are reasonable.
18) Overfit a model to confirm that additional parameters are not necessary.
19) When necessary, generate several different alternative models of the series. Parsimony is the principle for choosing one model over another when everything else is equal. Forecast error variance profiles can differ significantly from one model to another, and this is important in deciding between models. Choose the model that fits the data best, confirms expert opinion about the underlying behavior of the series, and provides relatively accurate forecasts for actual data that were withheld during the fitting process. However, the profile of forecast errors can be revealing about the appropriateness of one model over another. A model with tight 95% confidence intervals on its parameter estimates is better than another model when everything else is equal.

Starting Over With a Fresh Perspective

To make sure that you have not overlooked an important alternative or a more appropriate model, consider alternative modeling strategies. For example, if outliers were not adjusted properly, then an inappropriate model may have been identified. CHECK FOR OUTLIERS. EXPLAIN OUTLIERS. ADJUST FOR OUTLIERS. Try different types of differences when this might yield alternative models.
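One way to try different types of differences, and to apply the step 8 check that the standard deviation goes down, is to tabulate the standard deviation of the series under each candidate scheme. The following is a minimal Python sketch under stated assumptions: the helper names difference and compare_differencing, the seasonal period of 4, and the synthetic series are all hypothetical illustrations, not part of the course materials or of any statistical package.

```python
import math
import statistics


def difference(series, lag=1):
    """Return the lag-`lag` difference z[t] = y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def compare_differencing(series, season):
    """Sample standard deviation under four candidate differencing schemes."""
    seasonal = difference(series, season)
    candidates = {
        "none": series,
        "first": difference(series, 1),
        "seasonal": seasonal,
        "first+seasonal": difference(seasonal, 1),
    }
    return {name: statistics.stdev(values) for name, values in candidates.items()}


# Hypothetical data (an assumption for illustration only): a linear trend,
# a fixed period-4 seasonal pattern, and a small deterministic wiggle
# standing in for noise so that the example is reproducible.
pattern = [5.0, -2.0, -6.0, 3.0]
y = [10.0 + 0.5 * t + pattern[t % 4] + 0.3 * math.sin(1.7 * t) for t in range(48)]

result = compare_differencing(y, season=4)
best = min(result, key=result.get)  # scheme with the smallest standard deviation

for name, sd in result.items():
    print(f"{name:15s} std dev = {sd:.3f}")
print("lowest standard deviation:", best)
```

In this made-up series the seasonal difference alone gives the lowest standard deviation, and adding a first difference on top of it raises the standard deviation again, which is the kind of signal that warns against over-differencing. A low standard deviation is only one piece of evidence; the plots, ACFs, and PACFs still need to be examined.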
The second time you analyze a series goes much more quickly, possibly taking one fifth of the time of the first analysis.

Be sure to complete each of the above steps for the series which you have chosen for analysis. Also, be sure to complete all the other requirements of your minor report! When completing the steps above, you should record the characteristics of important models. I have attached a suggested ARIMA log sheet, which should be attached to your final report. I cannot answer your questions without analyzing your log sheet. Fill this sheet out religiously. It will greatly reduce your analysis time.

GOOD LUCK!!!

BDS 545 FORECASTING
Prof. Steve DeLurgio

ADDITIONAL COMMENTS ON THE STEPS OF ARIMA ANALYSIS

Be sure to study your ARIMA Handout in its entirety with regard to the steps of ARIMA analysis. To reinforce that presentation and to provide a slightly different perspective on ARIMA modeling, please find below an Overview of ARIMA. While this overview addresses ARIMA modeling specifically, it is equally applicable to any type of forecast modeling process.

OVERVIEW OF ARIMA ANALYSIS

IDENTIFICATION

PLOT THE SERIES AND ALL TRANSFORMATIONS
Outlier identification and replacement: don't over-adjust, but be sure to adjust extreme values.
Speculate on the best model. State why the series trends, has seasonality, walks randomly, or has cyclical variations. If you weren't using ARIMA analysis, what type of model would you use?

PLOT ACFs, PACFs, and IACFs
Is the series stationary? Take appropriate differences and transformations to achieve stationarity in mean and variance. Have you created any outliers in the process of achieving stationarity? Have outliers interfered with achieving stationarity or hidden the nonstationarity? Have you over-differenced? You can tell whether differences are needed by looking at plots of the series and at the ACFs.

ESTIMATION

ITERATIVELY IDENTIFY, ESTIMATE, AND DIAGNOSE TENTATIVE MODELS
Add parameters one at a time.
Resist the urge to add several parameters at one time. It is legitimate to do so, but be sure to go back and repeat the analysis step by step, parameter by parameter, to ensure that you have not converged on the incorrect model. Iteratively add and delete components of the model until no further improvements are possible. Pay particular attention to the concept of parsimony. Track the level of R-square as additional parameters are added to the model. Are you modeling the underlying process or its realization? Remember that an infinite-order AR is a first-order MA, and vice versa. Are the parameter estimates statistically significant?

DIAGNOSES

Are parameter estimates statistically significant?
Are the parameters uncorrelated, or are some of them redundant?
Have you overfitted a model by modeling outliers?
Are the parameter estimates stationary and invertible?
Does the adjusted R-square increase when additional parameters are added?
Use ACFs, PACFs, IACFs, plots of residuals, etc. to confirm the following:
- No significant correlations exist at low-order lags.
- No significant correlations exist at or adjacent to seasonal lags.
- The Box-Pierce (chi-square) statistic is insignificant at about 20 to 30 lags. (Those of us using Forecast-Pro and RATS are forced to use the output of RATS at only one lag. Those using SAS have a whole range of lag values to analyze.)
Are there extreme outliers in the residuals? If so, adjust for the causes of these outliers.
Compare the results of competing models. Remember that achieving white noise is not the ultimate objective. Models are defensible even if there are some high correlations, so long as R-square is relatively high with the final models. To better understand this, consider the fact that Winters, Fourier series analysis, or classical decomposition methods yield residuals (i.e., errors) that are extremely autocorrelated.

THE BEST MODEL

The best model will have desirable statistical and intuitive attributes.
It will:
1) Have intuitive appeal. It will make sense from an intuitive and theoretical standpoint. Not all of the parameters may seem logical or intuitive, but the differences and lags of the parameters should make sense.
2) Yield multi-period-ahead forecasts that are logical, theoretically sound, and consistent with the past.
3) Be parsimonious: simple, but effective.
4) Have parameters that are statistically significant and uncorrelated.
5) Yield the best intuitive forecasts and forecast error variance profiles.
6) Be intuitive to managers.
7) Clearly represent the patterns in demand that we expected.
8) Yield out-of-sample forecasts that are defensible.

Complicating the determination of the above is the fact that several different models may yield similar attributes. In such cases, discard all but one of the models if the models are really identical. If the models are telling us slightly different things about the series, then we should use several models if our goal is to achieve the highest forecast accuracy.

IF YOU GET "NO" ANSWERS TO SOME OF THE ABOVE QUESTIONS, THEN IT IS TIME TO REEVALUATE YOUR APPROACH AND TRY AGAIN.

IF YOU REACH A ROADBLOCK IN YOUR ANALYSIS, THEN YOU MAY HAVE TO STOP AND CONSULT ME. DO NOT CONSULT ME UNLESS YOU HAVE A LOG OF YOUR ANALYSES. I WANT TO HELP BUT CANNOT WITHOUT YOUR LOG COMPLETELY FILLED OUT!!! I HATE FILLING OUT LOGS, BUT IT WORKS.

GOOD LUCK. SUCCESSFUL ANALYSIS IS FUN!
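As an illustration of the residual checks listed under DIAGNOSES (no autocorrelations beyond 2 Se, and an insignificant Box-Pierce statistic at about 20 lags), here is a minimal Python sketch. It is not the output of Forecast-Pro, RATS, or SAS. The function names are hypothetical, Se is approximated as 1/sqrt(n), the "residuals" are a deliberately patterned made-up series, and 31.41 is the 5% chi-square critical value for 20 degrees of freedom. For residuals from a fitted ARIMA model, the degrees of freedom would normally be reduced by the number of estimated parameters.

```python
import math


def acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def box_pierce_q(x, max_lag):
    """Box-Pierce Q = n * sum of squared autocorrelations."""
    return len(x) * sum(r * r for r in acf(x, max_lag))


def ljung_box_q(x, max_lag):
    """Ljung-Box Q = n (n + 2) * sum of r_k^2 / (n - k)."""
    n = len(x)
    r = acf(x, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


def lags_beyond_two_se(x, max_lag):
    """Lags whose autocorrelation is more than 2 Se from zero, Se ~ 1/sqrt(n)."""
    se = 1.0 / math.sqrt(len(x))
    return [k for k, rk in enumerate(acf(x, max_lag), start=1) if abs(rk) > 2 * se]


# Hypothetical "residuals" that are clearly NOT white noise (a slow sine wave),
# chosen so that the checks visibly fail.
residuals = [math.sin(0.3 * t) for t in range(100)]

CHI2_CRIT_20DF_5PCT = 31.41  # 5% critical value, chi-square with 20 d.f.

flagged = lags_beyond_two_se(residuals, 20)
bp = box_pierce_q(residuals, 20)
lb = ljung_box_q(residuals, 20)

print("lags beyond 2 Se:", flagged)
print("Box-Pierce Q:", round(bp, 2), "Ljung-Box Q:", round(lb, 2))
print("pattern remains in residuals:", bp > CHI2_CRIT_20DF_5PCT)
```

For this patterned series both statistics exceed the critical value and lag 1 is flagged, so a model leaving such residuals would need to be revisited. With genuine model residuals, the same functions can be applied at 20 to 30 lags, as the checklist suggests.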