# Stata Tutorial 2


## 1. Load the data

```stata
cd "C:\Documents and Settings\Owner\Desktop"
insheet using survey.csv
```
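For readers replicating the steps outside Stata, loading a CSV like this can be sketched with Python's standard library. The column names below follow the tutorial's regressions, but the values are invented for illustration, not the actual survey data:

```python
import csv
import io

# Stand-in for survey.csv; these rows are made up for illustration.
raw = """friendhrs,internet,gamehrs,socialpos,time2school
12,1,3,0.4,20
8,0,1,0.9,5
15,1,4,0.2,30
"""

# csv.DictReader maps each row to a dict keyed by the header line,
# much like insheet assigns variable names from the first row.
rows = [
    {key: float(value) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(raw))
]
print(rows[0]["friendhrs"])  # prints 12.0
```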

## 2. Regression

```stata
reg friendhrs internet socialpos time2school
```

| friendhrs   | Coef.     | Std. Err. | t     | P>\|t\| | [95% Conf. Interval] |
|-------------|-----------|-----------|-------|---------|----------------------|
| internet    | -.3251806 | .8691222  | -0.37 | 0.711   | -2.105497, 1.455136  |
| socialpos   | -.1412255 | 4.701476  | -0.03 | 0.976   | -9.771763, 9.489312  |
| time2school | .3710433  | .2256273  | 1.64  | 0.111   | -.0911332, .8332198  |
| _cons       | 11.19527  | 6.42476   | 1.74  | 0.092   | -1.965257, 24.35579  |

```stata
reg friendhrs gamehrs socialpos time2school
```

| friendhrs   | Coef.     | Std. Err. | t     | P>\|t\| | [95% Conf. Interval] |
|-------------|-----------|-----------|-------|---------|----------------------|
| gamehrs     | 1.838271  | .7095352  | 2.59  | 0.015   | .3824248, 3.294117   |
| socialpos   | -2.493186 | 4.314445  | -0.58 | 0.568   | -11.3457, 6.359324   |
| time2school | .492213   | .2095013  | 2.35  | 0.026   | .0623518, .9220743   |
| _cons       | 3.930471  | 4.580964  | 0.86  | 0.398   | -5.468892, 13.32983  |

P-value: the lower the p-value, the less likely the observed result would be if the null hypothesis were true, and hence the more statistically significant the result.
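As a cross-check, a two-sided p-value can be recovered from a coefficient's t statistic and the residual degrees of freedom (n − k = 27 here, per the F test in section 3). A sketch using scipy, with the gamehrs numbers from the second regression above:

```python
from scipy import stats

# Coefficient and standard error for gamehrs, from the tutorial's
# second regression (reg friendhrs gamehrs socialpos time2school).
coef, se = 1.838271, 0.7095352
df = 27  # residual degrees of freedom, n - k

t = coef / se                    # t statistic, about 2.59
p = 2 * stats.t.sf(abs(t), df)   # two-sided p-value, about 0.015
print(round(t, 2), round(p, 3))
```

This reproduces the t = 2.59 and P>|t| = 0.015 shown in the output table.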

## 3. F test

The F test tests the joint significance of the independent variables. When testing the significance of the goodness of fit, the null hypothesis is that the coefficients on the independent variables are jointly equal to zero.

F = [(RSSr - RSSu) / m] / [RSSu / (n - k)]

where RSSr and RSSu are the residual sums of squares of the restricted and unrestricted models, m is the number of restrictions, and k is the number of parameters in the unrestricted model.

If our F statistic is below the critical value, we fail to reject the null, and we therefore say the goodness of fit is not significant.

```stata
test gamehrs socialpos time2school
```

(1) gamehrs = 0
(2) socialpos = 0
(3) time2school = 0

F(3, 27) = 3.59 (checking the F table, the 5% critical value is 2.96)
Prob > F = 0.026

Here m = 3 (the number of restrictions is 3) and n − k = 27 (the residual degrees of freedom).
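Instead of looking up an F table, the 5% critical value and the p-value for F(3, 27) can be computed directly (a sketch using scipy):

```python
from scipy import stats

# Joint F test from the tutorial: m = 3 restrictions, n - k = 27 df.
F_stat, m, df_resid = 3.59, 3, 27

crit_5pct = stats.f.ppf(0.95, m, df_resid)  # 5% critical value, ~2.96
p_value = stats.f.sf(F_stat, m, df_resid)   # Prob > F, ~0.026

print(round(crit_5pct, 2), round(p_value, 3))
```

Since 3.59 exceeds the critical value (equivalently, 0.026 < 0.05), the null of joint insignificance is rejected.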

## 4. Predicting Y

Obtain predictions:
Given the coefficient estimates and the values of the x (independent) variables, we want to predict the values of y.

```stata
predict friendhrshat
predict yhat
```

(Note: the two commands produce the same results; use the `list` command to check.)

Calculate the standard errors of the predictions:

```stata
predict e, stdp
```
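The mechanics behind these two commands can be sketched in Python with numpy on made-up data: the fitted values are yhat = Xb, and the standard error of the linear prediction for observation i is sqrt(s² · xᵢ'(X'X)⁻¹xᵢ):

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 30, 3  # 30 observations, 3 parameters (made-up sizes)

# Made-up design matrix (constant + two regressors) and outcome.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([4.0, 1.8, 0.5]) + rng.normal(size=n)

# OLS coefficient estimates.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# "predict yhat": fitted values.
yhat = X @ beta

# "predict e, stdp": standard error of the linear prediction,
# sqrt(s^2 * x_i' (X'X)^{-1} x_i) for each observation i.
resid = y - yhat
s2 = resid @ resid / (n - k)            # residual variance estimate
XtX_inv = np.linalg.inv(X.T @ X)
stdp = np.sqrt(s2 * np.einsum("ij,jk,ik->i", X, XtX_inv, X))

print(yhat.shape, stdp.shape)
```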

## 5. Ramsey RESET / Davidson-MacKinnon specification tests

The RESET test is designed to detect omitted variables and incorrect functional form. It proceeds
as follows:
Suppose we have the model

y = b0 + b1*x1 + b2*x2 + b3*x3 + u

After doing OLS, we obtain the coefficient estimates, and by using the prediction command mentioned above, we obtain the fitted values yhat. Consider the artificial model:

y = b0 + b1*x1 + b2*x2 + b3*x3 + d1*yhat^2 + d2*yhat^3 + d3*yhat^4 + u

A test for misspecification is a test of H0: d1 = d2 = d3 = 0 against the alternative that at least one of them is nonzero. Rejection of the null (meaning some d is different from zero) implies the original model is inadequate and can be improved. Failure to reject the null says the test has not been able to detect misspecification.

Ramsey RESET test using powers of the fitted values of friendhrs:

```stata
estat ovtest
```

Ho: model has no omitted variables
F(3, 24) = 2.74
Prob > F = 0.0654

The null is not rejected at the 5% level. If it were rejected, we would try to correct the model by including new independent variables or changing the functional form.
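The RESET F statistic can be computed by hand: fit the original model, add yhat², yhat³, and yhat⁴, and compare residual sums of squares with the F formula from section 3. A sketch on made-up data (scipy assumed for the p-value):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 31  # as in the tutorial: k = 4 parameters, so n - k = 27

# Made-up data standing in for friendhrs and its three regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([4.0, 1.8, -2.5, 0.5]) + rng.normal(size=n)

def rss(design, outcome):
    """Residual sum of squares from an OLS fit."""
    b, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    e = outcome - design @ b
    return e @ e

# Restricted model: the original regressors only.
rss_r = rss(X, y)

# Unrestricted model: add powers 2..4 of the fitted values.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
Xu = np.column_stack([X, yhat**2, yhat**3, yhat**4])
rss_u = rss(Xu, y)

m = 3                   # number of added terms (restrictions)
df_u = n - Xu.shape[1]  # residual df of the unrestricted model (24 here)
F = ((rss_r - rss_u) / m) / (rss_u / df_u)
p = stats.f.sf(F, m, df_u)
print(round(F, 2), round(p, 3))
```

Because the made-up data really are linear, the test should usually fail to reject here; with real misspecified data, F would tend to be large.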

## 6. BP or White test for heteroskedasticity

One of the assumptions of the classical linear regression model is homoskedasticity, i.e. the variance of the error term is constant across observations. If this assumption is violated (heteroskedasticity), the OLS coefficient estimates remain unbiased, but they are no longer efficient and the usual standard errors, and hence the t and F tests, are invalid. We can use the BP test or the White test to check for heteroskedasticity.

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity:

```stata
estat hettest
```

Ho: Constant variance
Variables: fitted values of friendhrs

chi2(1) = 0.67
Prob > chi2 = 0.4135

The null hypothesis is not rejected, so we find no evidence against constant variance.

(Note: a large p-value or a small chi2 value indicates the null is not rejected and the homoskedasticity assumption holds; a small p-value or a large chi2 value indicates that heteroskedasticity is present.)
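The idea behind the BP test can be sketched by hand in its studentized (Koenker) form: regress the squared residuals on the fitted values and use the LM statistic n·R², which is chi²(1) under the null. A sketch on made-up homoskedastic data (scipy assumed for the p-value):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 30

# Made-up data with constant error variance (so H0 is true).
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([4.0, 1.8, 0.5]) + rng.normal(size=n)

# First-stage OLS: fitted values and squared residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
e2 = (y - X @ beta) ** 2

# Auxiliary regression of squared residuals on the fitted values.
Z = np.column_stack([np.ones(n), yhat])
gamma, *_ = np.linalg.lstsq(Z, e2, rcond=None)
resid_aux = e2 - Z @ gamma
r2 = 1 - (resid_aux @ resid_aux) / ((e2 - e2.mean()) @ (e2 - e2.mean()))

lm = n * r2                  # LM statistic, ~ chi2(1) under H0
p = stats.chi2.sf(lm, df=1)  # large p => no evidence of heteroskedasticity
print(round(lm, 2), round(p, 3))
```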

White test for heteroskedasticity:

```stata
imtest, white
```

White's test for Ho: homoskedasticity
against Ha: unrestricted heteroskedasticity

chi2(8) = 3.03
Prob > chi2 = 0.9324

Cameron & Trivedi's decomposition of IM-test:

| Source             | chi2 | df | p      |
|--------------------|------|----|--------|
| Heteroskedasticity | 3.03 | 8  | 0.9324 |
| Skewness           | 3.11 | 3  | 0.3744 |
| Kurtosis           | 1.14 | 1  | 0.2859 |
| Total              | 7.28 | 12 | 0.8384 |

## 7. Robust standard errors

We can use robust standard errors to correct for heteroskedasticity: the coefficient estimates are unchanged, but the standard errors, and hence the t statistics and confidence intervals, are recomputed so that inference remains valid.

```stata
reg friendhrs gamehrs socialpos time2school, robust
```

| friendhrs   | Coef.     | Robust Std. Err. | t     | P>\|t\| | [95% Conf. Interval] |
|-------------|-----------|------------------|-------|---------|----------------------|
| gamehrs     | 1.838271  | .747191          | 2.46  | 0.021   | .3051614, 3.37138    |
| socialpos   | -2.493186 | 3.74317          | -0.67 | 0.511   | -10.17354, 5.187164  |
| time2school | .492213   | .2217612         | 2.22  | 0.035   | .0371966, .9472295   |
| _cons       | 3.930471  | 4.856262         | 0.81  | 0.425   | -6.033755, 13.8947   |
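The robust (Huber-White sandwich) variance can be sketched by hand: Var(b) = (X'X)⁻¹ X'diag(e²)X (X'X)⁻¹, with a small-sample scaling of n/(n − k) (the HC1 variant, which is what Stata's `, robust` option uses by default). A sketch on made-up data:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 30, 3  # made-up sizes

X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([4.0, 1.8, 0.5]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)

# Classical OLS variance: s^2 (X'X)^{-1}.
s2 = e @ e / (n - k)
se_classical = np.sqrt(np.diag(s2 * XtX_inv))

# Sandwich (HC1) variance: (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1},
# scaled by n/(n - k); the coefficients themselves are unchanged.
meat = (X * e[:, None] ** 2).T @ X
V_robust = n / (n - k) * XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(V_robust))

print(se_classical.round(3), se_robust.round(3))
```

With homoskedastic errors, as in this made-up sample, the two sets of standard errors should be similar; under heteroskedasticity only the robust ones are valid.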
