					STATA COMMANDS
Note: Brackets indicate a variable name (do not include the brackets). A vertical bar indicates a mandatory
      choice.


WILDCARDS

        var* refers to all variables starting with "var"
        var? refers to all variables whose names are "var" followed by exactly one additional character
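
        Example (a small sketch using the auto dataset that ships with Stata; the variable
        names come from that dataset, not from this document):

```stata
sysuse auto, clear
* m* matches every variable whose name starts with "m" (here: make and mpg)
describe m*
* rep7? matches "rep7" followed by exactly one more character (here: rep78)
describe rep7?
```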


VARIABLE MANAGEMENT

        CREATE A NEW VARIABLE
              generate [new variable name] = function

        DELETE A VARIABLE
              drop [variable name]

        CREATE A NORMALLY DISTRIBUTED VARIABLE
              generate [new variable name] = rnormal()

        SHOW DATA
              list [variable name]

        CONVERT STRING VARIABLE TO NUMERIC VARIABLE
             destring [string variable name], replace | generate([new variable name])

        CREATE A SEQUENCE OF DUMMIES BASED ON A CATEGORICAL VARIABLE
              tabulate [catvar], generate(dumvar)
              Note: The sequence of dummy variables (in this example) will be called dumvar1, dumvar2,
              dumvar3, etc.
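
              Example (a sketch using the auto dataset shipped with Stata; the stub name
              "rep" is chosen here for illustration):

```stata
sysuse auto, clear
* rep78 takes the values 1-5; one dummy is created per observed category
tabulate rep78, generate(rep)
* rep1 ... rep5 now exist; each is 1 for its own category and 0 otherwise,
* and missing wherever rep78 is missing
summarize rep1-rep5
```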

        CONVERT LABELS TO NUMERIC IDENTIFIERS
             egen [new numeric identifier variable] = group([variable containing labels])
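
             Example (a sketch using the auto dataset; "id" is a hypothetical name for the
             new identifier):

```stata
sysuse auto, clear
* make is a string variable holding car names
egen id = group(make)
* id numbers the distinct labels 1, 2, 3, ... in sorted order of make
list make id in 1/5
```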

        SET NUMBER OF OBSERVATIONS
             set obs [number of observations]

        DECLARE DATA SET TO BE TIME SERIES
             tsset [date variable]

        DECLARE DATA SET TO BE 2-D PANEL
             tsset [cross-section variable] [date variable]

        USE A SUBSET OF THE DATA
              regress ... if [variable] [condition]
              . indicates a missing value and is treated as larger than any number; hence,
              "if [variable] < ." omits observations where [variable] is missing
              & indicates "and"
              == indicates "equality"
              | indicates "or"
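
              Example (a sketch using the auto dataset, where rep78 contains some
              missing values):

```stata
sysuse auto, clear
* "< ." keeps only the observations where rep78 is not missing
count if rep78 < .
* combine conditions with & (and) and | (or); test equality with ==
regress price mpg if rep78 < . & foreign == 1
regress price mpg if rep78 == 1 | rep78 == 5
```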


REGRESSION

     OLS REGRESSION
           regress [dependent variable] [regressor 1] [regressor 2] ... [regressor N]

     OLS REGRESSION WITH ARMA AND UNIT ROOT CORRECTION
            arima [dependent variable] [regressor 1] [regressor 2] ... [regressor N], arima([AR
              order],[integration order],[MA order])
            ar(n/m) specifies AR orders n through m; ar(n m) specifies AR orders n and m.
            Use the option condition (lowercase, after the comma) to specify conditional maximum
              likelihood instead of unconditional maximum likelihood; conditional ML is sometimes
              necessary when performing ARMA correction in a panel model.

     OLS REGRESSION WITH HETEROSKEDASTICITY CORRECTION
           regress [dependent variable] [regressor 1] [regressor 2] ... [regressor N], vce(hc3)

     PANEL REGRESSION (GLS when using random effects)
            xtreg [dependent variable] [regressor 1] [regressor 2] ... [regressor N], [option]
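            For [option], use re for random effects, fe for cross-sectional fixed effects (the within
              estimator), and be for the between-effects estimator (a regression on group means, not
              time-specific fixed effects). Options are case sensitive and must be lowercase.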
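
            Example (a sketch assuming internet access for webuse; nlswork is one of Stata's
            example panels, and the regressors are picked only for illustration):

```stata
webuse nlswork, clear
* declare the panel: idcode identifies individuals, year is the time variable
tsset idcode year
* random effects (GLS)
xtreg ln_wage age tenure, re
* fixed effects (within estimator)
xtreg ln_wage age tenure, fe
* between-effects estimator
xtreg ln_wage age tenure, be
```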

     TWO STAGE LEAST SQUARES
            ivregress 2sls [dependent variable] ( [endogenous variable] = [list of instruments] ) [list of
             exogenous variables]
            regress [dependent variable] [list of endogenous and exogenous regressors] ( [list of
             exogenous regressors and instruments] )
            Both methods perform 2SLS, but the first method allows for the post-estimation endogeneity
             tests (estat endogenous). The second form is legacy syntax and may not be accepted by
             current Stata releases.
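
            Example (a sketch assuming internet access for webuse; hsng2 is a Stata example
            dataset, and treating hsngval as endogenous is an assumption made for illustration):

```stata
webuse hsng2, clear
* hsngval is the endogenous regressor; faminc and the region dummies are instruments;
* pcturban is an exogenous regressor
ivregress 2sls rent pcturban (hsngval = faminc i.region)
* post-estimation endogeneity tests; the null is that hsngval is exogenous
estat endogenous
```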

     NON-LINEAR LEAST SQUARES
            nl ([equation])
              [equation] takes the form (for example) y={alpha=0}+{beta=.5}*x
            Note: nl does not like missing data in the regressors. Eliminate with IF (for example)
              nl (y={alpha}+{beta1}*x1+{beta2}*x2) if x1<. & x2<.
            Note: The missing-value indicator (.) is treated as larger than any number
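
            Example (a sketch using the auto dataset; the starting values are rough guesses
            chosen for illustration):

```stata
sysuse auto, clear
* parameters go in braces; "=value" inside the braces sets a starting value
nl (mpg = {alpha=40} + {beta=-0.005}*weight) if mpg < . & weight < .
```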

     CONSTRAINED REGRESSION
           constraint define [num] [variable] = [variable]
              [num] is the constraint number; [variable] refers to the coefficient attached to variable
           cnsreg [dependent variable] [list of independent variables], constraints([num])
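
           Example (a sketch using the auto dataset; the constraint itself has no economic
           meaning and only illustrates the syntax):

```stata
sysuse auto, clear
* constraint 1: the coefficients on price and weight are forced to be equal
constraint define 1 price = weight
cnsreg mpg price weight, constraints(1)
```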

     LOGIT REGRESSION
            logit [dependent variable name] [independent variable names]

     ORDERED LOGIT REGRESSION
          ologit [dependent variable name] [independent variable names]

     MULTINOMIAL LOGIT REGRESSION
           mlogit [dependent variable name] [independent variable names]

     TRUNCATED REGRESSION
          truncreg [dependent variable name] [independent variable name], [ll([lower truncation limit]) or
          ul([upper truncation limit])]

     CENSORED REGRESSION
           cnreg [dependent variable name] [independent variable name], censored([filter variable
              name])
           Note: the filter variable is a vector where -1 indicates that the observation on the dependent
              variable is left censored; 1 indicates observation is right censored, and 0 indicates observation
              is not censored
           Note: left censored means "true measure is less than or equal to recorded measure", right
              censored means "true measure is greater than or equal to recorded measure"


FITTED MEASURES, RESIDUALS, FORECAST STANDARD ERRORS FROM LAST REGRESSION

        Note: These commands generate residuals and forecasts based on the last run regression.

        OLS:
                    Prediction:
                     predict [new variable name]
                    Forecast Standard Error:
                     predict [new variable name], stdf
                    Residuals:
                     predict [new variable name], residuals
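
                    Example (a sketch using the auto dataset; yhat, res and sef are hypothetical
                    variable names, and stdf is the predict option that returns the standard
                    error of the forecast after regress):

```stata
sysuse auto, clear
regress price mpg weight
* fitted values (the linear prediction is the default)
predict double yhat
* residuals
predict double res, residuals
* standard error of the forecast for each observation
predict double sef, stdf
```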

        Panel Data:
                 Residual plus fixed effects (total residual):
                    predict [new variable name], ue
                 Fixed effects (individual specific residual component):
                    predict [new variable name], u
                 Non-specific residual:
                    predict [new variable name], e

        NLS:
                    Prediction:
                     predictnl [new variable name] = predict()
                    Forecast Standard Error:
                     predictnl [new fitted variable name] = predict(), se([new se variable name])

        GRANGER CAUSALITY TEST

                    var [list of dependent variables], lags(A/B)
                     [A and B are the lower and upper limits on the lags]
                    vargranger
                     NOTE: The null hypothesis is no granger causality.
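
                    Example (a sketch assuming internet access for webuse; lutkepohl2 is a
                    Stata example dataset of quarterly series):

```stata
webuse lutkepohl2, clear
* declare the quarterly time variable
tsset qtr
* VAR with lags 1 through 2 of every dependent variable
var dln_inv dln_inc dln_consump, lags(1/2)
* Wald tests; a small p-value rejects the null of no Granger causality
vargranger
```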

TESTS

        TEST FOR NORMALITY
               sktest [variable name]
               Note: The null hypothesis is normality.

        PORTMANTEAU (Q) TEST FOR SERIAL CORRELATION
             wntestq [variable name], lags(#)

        CORRELOGRAM
             corrgram [variable name]

     BREUSCH-PAGAN TEST FOR HETEROSKEDASTICITY
           estat hettest (run this after running a regression; older releases use hettest)

     TESTS FOR ENDOGENEITY
            estat endogenous (run this after ivregress)

     DICKEY-FULLER TEST FOR NON-STATIONARITY
           dfuller [variable name], [option]
           [option] = {noconstant, trend, drift}
           Note: The null hypothesis is non-stationarity.
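
           Example (a sketch on simulated data; the seed, sample size and variable names are
           arbitrary choices made for illustration):

```stata
clear
set obs 200
set seed 12345
generate t = _n
tsset t
* a random walk: the running sum of white noise, non-stationary by construction
generate e = rnormal()
generate y = sum(e)
* include a time trend in the test regression
dfuller y, trend
```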

     PHILLIPS-PERRON TEST FOR NON-STATIONARITY
            pperron [variable name], [option]
            [option] = {noconstant, trend}
            Note: The null hypothesis is non-stationarity.
            Note: Use this test when testing a residual for non-stationarity.

     HAUSMAN TEST FOR RANDOM EFFECTS
          Note: The null hypothesis is that random effects are consistent and efficient.
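
          Example (a sketch of one common way to run the test with the hausman command,
          assuming internet access for webuse; "fixed" and "random" are hypothetical names
          for the stored estimates):

```stata
webuse nlswork, clear
tsset idcode year
* fit and store the fixed-effects model (consistent under both hypotheses)
xtreg ln_wage age tenure, fe
estimates store fixed
* fit and store the random-effects model (efficient under the null)
xtreg ln_wage age tenure, re
estimates store random
* consistent estimator first, efficient estimator second
hausman fixed random
```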

     SERIAL CORRELATION TEST FOR PANEL DATA
           xtserial
           Note: This is a module that must be installed. See MODULES section.


TRANSFORMATIONS

         D.[variable name]
          First difference in the variable

         L.[variable name]
          Variable lagged one period
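
         Example (a sketch on a tiny made-up series; the operators require the data to be
         declared as time series first):

```stata
clear
set obs 5
generate t = _n
tsset t
generate y = t^2
* first difference: y[t] - y[t-1], missing in the first period
generate dy = D.y
* one-period lag: y[t-1], missing in the first period
generate ly = L.y
list t y dy ly
```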


GRAPHING

     twoway (scatter [y1 variable] [y2 variable] ... [x variable])

     plot [y variable] [x variable]


MODULES

     SSC NEW
           List new modules available.

     SSC INSTALL [module name]
            Install an available module.

     FINDIT [module name]
            Locate a module by name.


OTHER

        SAVE COMMANDS AND OUTPUT
              log using [filename]
                Writes all subsequent commands and output to a file.
              log using [filename], noproc
                Writes all subsequent commands, but no output, to a file. (Newer releases provide
                cmdlog using [filename] for this purpose.)
              log off
                Suspends logging.
              log on
                Resumes logging.
              log close
                Stops logging and closes the log file.

        RESTRICT OPERATION TO A SUBSET OF THE DATA
              [command] in [starting observation]/[ending observation]

        GENERATE CORRELATION MATRIX
             correlate [variable name] [variable name] ...

        FOR NEXT LOOP
              forvalues i = [start]([step])[end] {
                  generate x`i' = rnormal()
              }
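
              Example (a sketch that creates three normally distributed variables; the number
              of observations and the variable stub x are arbitrary):

```stata
clear
set obs 100
* loop over i = 1, 2, 3; `i' expands to the current value of the counter
forvalues i = 1(1)3 {
    generate x`i' = rnormal()
}
describe x1-x3
```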

        GENERATE DATE VARIABLES FROM STRING VARIABLES
              generate [newvar] = date([var],mask)
              examples of mask: 5/12/2008 --> "MDY"
               May 12, 2008 --> "MDY"
               2008-05-12 10:18:00 --> "YMDhms" (use clock() instead of date() when the mask includes a time)
               Wed May 12 --> "#MD"

        DECLARE A VARIABLE TO BE A DATE
              format [var] [display format]
              examples of display format: %td --> date
                %tw --> week
                %tm --> month
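
              Example (a sketch with two hand-typed strings; date() expects a string
              argument, and s and d are hypothetical variable names):

```stata
clear
input str20 s
"5/12/2008"
"May 12, 2008"
end
* convert the strings to Stata daily dates using the month-day-year mask
generate d = date(s, "MDY")
* display the numeric date in human-readable form
format d %td
list
```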

        FINDIT [command]
               Searches help and online databases for information on the command or statement.

        HELP [command]
               Provides help on a specific command.

        SEARCH [terms]
              Searches help text for the specified terms.

				