					44.0 REG Command

   The REG command allows estimation of OLS models where lags of the
variables do not have to be explicitly set. Unlike the REGRESSION
command, the REG command loads data into memory. The size of the
largest problem is limited by the size of memory that can be
allocated. The REG command allows panel data models which are not
rectangular to be estimated by use of an identifier variable that
may be a character variable. The REG command allows saving of the
estimated coefficients, t scores, e'e, DW, number of observations
and R**2 in a DMF file along with an identifier variable. Residuals
can also be saved in an SCA FSAV file. The REG command allows
estimation of models for the complete sample in two important
situations: without (usual case) and with panel data. With panel
data, B34S will automatically handle the deletion of the appropriate
number of observations to handle lags as the estimation moves
across the panel.
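
The deletion of observations across a panel can be pictured with a short sketch. The Python below is not B34S code and does not reproduce the internal algorithm of REG, which this help file does not describe. The function name usable_rows and the sample keys are made up for illustration. It assumes the data are sorted so that each panel is a contiguous block, and that a model whose longest lag is p loses the first p observations of every panel.

```python
from itertools import groupby

def usable_rows(keys, max_lag):
    """Map each panel key to the row numbers that survive lagging.

    keys    : identifier value for every observation, sorted so that
              each panel forms one contiguous block
    max_lag : longest lag in the model (hypothetical input)
    """
    usable = {}
    start = 0
    for key, block in groupby(keys):
        size = len(list(block))
        # The first max_lag rows of a panel have no within-panel lag values.
        usable[key] = list(range(start + max_lag, start + size))
        start += size
    return usable

# Three made-up panels of unequal length (a non-rectangular panel).
keys = ["IBM"] * 4 + ["GM"] * 3 + ["F"] * 5
rows = usable_rows(keys, max_lag=2)
```

With a maximum lag of 2 the four IBM rows leave two usable observations, the three GM rows leave one, and the five F rows leave three.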

   If high accuracy and PC models are desired, use the QR command or
use the call olsq( ) command in the MATRIX command which can do QR
estimation without setting up lags. If very large datasets are run and
specialized diagnostic tests are desired, use REGRESSION. If simple
regressions are desired where recursive residuals are needed, use the RR
command or the RR option in the call olsq( ) matrix command. The RR command
can also run a simple OLS model where all the variables are explicitly
built. The ROBUST command can be used to test models with L1, MINIMAX
and OLS. Similar capability is in call olsq( ). The ROBUST command is
similar to the REG command except that the TEST sentence is not
supported. If just OLS is desired, the REG and REGRESSION commands
should be used unless the matrix command is employed.

Under the MATRIX command, call olsq( ) allows RR models, minimax and L1
models as well as GLS with various options. The advantage is that the
output of the estimation can easily be further processed with the
capability built into the matrix programming language.

The general form of the REG command is:

    B34SEXEC REG options parameters$
         MODEL Yvar = Xvar1 Xvar2 $
         TEST xvar                  $
         BISPEC    options parameters$
         TRISPEC options parameters$
         POLYSPEC options parameters$
         REVERSE options parameters$
         B34SEEND$

REG options:

NOINT    - Suppress constant.

PRINT    - Print panel OLS results.

CPRINT   - Prints panel OLS results without a new page for
           each panel to save space.

RESIDUALP- List residuals with lineprinter plot for complete sample.

PANEL     - Data is in panel form. If data is in rectangular form,
            NREG must be set or SUBKEY must be set.
            It is assumed that the data is in the form
            of observations for subset1 , subset2 ...
            If this is not the case, use the SORT command to
            put the data in the correct form prior to running
            the REG command. If the data is NOT in rectangular
            form, SUBKEY must be used to delineate the
            panels.

SAVERES - Saves the residuals in an SCA FSAV file on unit
          FSAVUNIT. For the complete sample the FSAV dataset name
          is RESIDUAL. For panels, the default is RES0001...
          The residual is saved as RESIDUAL along with OBSNUM
          Y and YHAT. The file is not rewound prior to saving.
          Use SCAINPUT command to rename these files. The keyword
          SPUNCHRES can be used in place of SAVERES. For panel
          data SAVERES takes a great deal of time doing I/O.

SAVECOEF- Saves Panel coefficients and associated statistics
          in a DMF file. The default dataset name is PCOEF.
          The panel regression number is saved in IDENT.
          If a SUBKEY is specified, it is saved. The DMF
          unit is COEFUNIT. The coefficients are saved
          with names BETA0001 BETA0002 BETA0003. The t scores
          are saved with names TSTAT001. Linkages
          between these names, which are needed because
          of the possibility of lags, and the underlying
          variables are listed in variable labels. e'e, R**2,
          N, the variance of Y and the Durbin-Watson statistic are
          saved with names EPE, RSQ, NOOB, VARY and DW.

ONLYSUB - Specifies that only   subsample regressions will
          be calculated. This   option is only used with
          PANEL data and will   save space since the complete
          dataset will not be   loaded.

ONLYFULL- Specifies that OLS models on the complete dataset are
          to be run for panel data but that panel regressions
          are not going to be run.

DMF       - Sets the DMF save format as UNFORMATTED. This is
            the default. This can also be set as FILEF=DMF. Note
            that if DMF is used, the DMF file must be allocated
            as unformatted.

FDMF      - Sets the DMF save format as FORMATTED. This makes
            a more portable file but requires more time and
            makes files that are 3 times bigger. This can also be
            set as FILEF=FDMF.

    ACOV      - Same as the WHITE option.

    WHITE     - If set, uses the White (1980) formula to calculate the
                SE. For further detail see Greene (2003) page 199.
                This option is similar to the SAS ACOV option on
                SAS PROC REG or the RATS ROBUSTERRORS option on the
                RATS command LINREG. The keyword ACOV can be used in
                place of WHITE. Davidson-MacKinnon (2004) pages 199-200
                show alternative formulas. These are implemented in the
                matrix command call olsq as :white1, :white2 and
                :white3. See also Greene (2003) page 220.
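
As a rough illustration of what a White (1980) type correction does, the Python sketch below computes the conventional SE and the heteroskedasticity-consistent SE of the slope in a regression of y on a constant and one x. It uses the basic form of the estimator with no degrees of freedom adjustment. Whether REG applies such an adjustment is not stated in this help file, so the numbers are not claimed to match B34S output. The function name and the data are made up for illustration.

```python
import math

def slope_standard_errors(x, y):
    """Return (conventional SE, White SE) of the slope of y on a constant and x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Conventional: s**2 / Sxx with s**2 = e'e / (n - 2).
    conventional = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # White: each squared residual is weighted by its own squared x deviation.
    white = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid))) / sxx
    return conventional, white

# Made-up data, three observations only to keep the arithmetic checkable.
conventional_se, white_se = slope_standard_errors([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
```

The two SE values differ whenever the squared residuals vary with x, which is the case the WHITE option is meant to handle.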

Note:   The REG command will start writing DMF files at the current
        position of the file. If you wish to add to datasets already on
        the DMF file, use the POSITION( ) parameter which is documented
        in the OPTIONS command. If the desire is to reuse the
        file, the CLEAN( ) parameter of the OPTIONS command should be
        used.

REG parameters:

    IBEGIN=n1      - If the dataset is not panel, sets the first
                     observation to use in the analysis. If the
                     dataset is a panel, sets first observation
                     to use in the panel.

    IEND=n2        - If the dataset is not panel, sets the last
                     observation to use in the analysis. If the
                     dataset is a panel, sets last observation
                     to use in the panel.

    NREG=n3        - Number of observations in each region (sub
                     regression).

    SUBKEY=Vname - Sets the variable, possibly a character variable, that
                   identifies the subregression.

    DMFUNIT=n4     - Sets the DMF coefficient save unit number. The
                     default is 60.

    DMFNAME=k      - Sets the DMF coefficient dataset name.
                     The default is PCOEF. The keyword DMFMEMBER
                     can be used in place of DMFNAME. Up to
                     10 characters can be specified.

    Note: The following parameters set frequency and starting dates
          for DMF files

    SETFREQ(R)      -   Sets base frequency. 1. = annual data. .1 =
                        data once per decade. R can be set as real OR
                        integer. If SETFREQ is passed -1, the Julian
                        internal date is reset to unused.

    SETYEAR(NN)     -   Sets base year for annual data. Frequency assumed
                        =1.

    SETMY(M1,Y1)    -   Sets base year for monthly data. Frequency assumed
                        =12.

    SETQY(Q1,Y1)    -   Sets base year for quarterly data. Frequency
                        assumed = 4.

    SETDMY(D1,M1,Y1) -  Sets base year for daily data. Frequency assumed
                        =365.

    FSAVUNIT=n5    - Sets the SCA FSAV residual save unit. The default
                     is 44. DMFUNIT and FSAVUNIT cannot be set to the
                     same unit.

    FSAVNAME=k     - Sets the SCA FSAV residual dataset name.
                     For the complete sample, the name is RESIDUAL. For
                     panels the default is RES0001. The keyword FSAVMEMBER
                     can be used in place of FSAVNAME.

    CCOMMENTS(' ',' ')    - Sets comments for the DMF file saving
                            coefficients. Any number of 72 col
                            comments can be supplied.

    RCOMMENTS(' ',' ')    - Sets comments for the FSAV file saving
                            residuals. Any number of 72 col
                            comments can be supplied.

The MODEL sentence is required. If PANEL is not in effect, the
Hinich tests which are called by the BISPEC, TRISPEC and POLYSPEC
sentences can be used.

MODEL sentence.

      MODEL Y = X1 X2 X3 X4$

      The MODEL sentence lists the left hand variable and the right hand
side variables. Unless NOINT is supplied, a constant will be
automatically added to the model. In addition to the usual
specification, the MODEL sentence in the REG command allows the lags to
be set in the command. The command

      MODEL Y = Y{1} X{0 to 3} Z{1}$

is the same as

      MODEL Y = LAGY X LAG1X LAG2X LAG3X LAG1Z$

except that in the former case the lag variables do not have to be
built. The advantage of this setup is that the 98 variable limit of
B34S is effectively lifted if the added variables are lags.
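
The lag notation can be read as a simple expansion rule. The Python sketch below is only an illustration of that rule, not the parser used by B34S. The names expand_lags and design_rows and the sample data are made up. It assumes that a term written without braces means lag 0 and that the first observations, up to the longest lag, are dropped from the estimation sample.

```python
import re

# One right-hand-side term: a name, optionally followed by {k} or {k1 to k2}.
_TERM = re.compile(r"(\w+)(?:\{\s*(\d+)(?:\s+to\s+(\d+))?\s*\})?", re.IGNORECASE)

def expand_lags(rhs):
    """Expand 'Y{1} X{0 to 3}' into a list of (variable, lag) pairs."""
    terms = []
    for name, first, last in _TERM.findall(rhs):
        if not first:
            terms.append((name, 0))      # no braces: the variable itself
            continue
        lo = int(first)
        hi = int(last) if last else lo   # {k} is the single lag k
        terms.extend((name, lag) for lag in range(lo, hi + 1))
    return terms

def design_rows(data, terms):
    """Build regressor rows; observations before the longest lag are dropped."""
    max_lag = max(lag for _, lag in terms)
    n = len(next(iter(data.values())))
    return [[data[name][t - lag] for name, lag in terms]
            for t in range(max_lag, n)]

terms = expand_lags("Y{1} X{0 to 3} Z{1}")
data = {"Y": [1, 2, 3, 4, 5], "X": [10, 20, 30, 40, 50], "Z": [7, 8, 9, 10, 11]}
rows = design_rows(data, terms)
```

Here the six expanded terms correspond to LAGY, X, LAG1X, LAG2X, LAG3X and LAG1Z, and the five made-up observations leave two usable rows because the longest lag is 3.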

TEST sentence

     The TEST sentence allows the user to specify coefficients set to
zero so that exclusion restrictions can be tested. There can be up to 99
TEST sentences. Given the setup

      B34SEXEC REG$
      MODEL Y = LAGY X LAG1X LAG2X LAG3X LAG1Z$
      TEST X LAG1X$
      B34SEEND$

the TEST sentence tests the joint exclusion restriction of setting the
coefficients of X and LAG1X to zero. If the sentence TEST X$ were given,
the square root of the F value would equal the absolute value of the t
of the X coefficient. Let u be the residual vector of the original model,
v the residual vector of the restricted model, g the number of
restrictions, n the number of observations and k the number of
coefficients in the original model. Then

F(g,n-k) = ((v'v-u'u)/g) / (u'u/(n-k))
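
The F formula and its link to the t statistic can be checked numerically. The Python sketch below is an illustration on made-up data, not B34S output. It fits y on a constant and x, treats the constant-only model as the restricted model (g = 1, k = 2), and compares the square root of F with the t of the slope. The function names are hypothetical.

```python
import math

def exclusion_f(vv, uu, g, n, k):
    """F(g, n-k) from the restricted (v'v) and original (u'u) sums of squares."""
    return ((vv - uu) / g) / (uu / (n - k))

def simple_ols(x, y):
    """OLS of y on a constant and x; returns slope, its t statistic and u'u."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    uu = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(uu / (n - 2) / sxx)
    return slope, slope / se, uu

# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
slope, t_stat, uu = simple_ols(x, y)

# Restricted model: the coefficient of x is set to zero, leaving the constant.
ybar = sum(y) / len(y)
vv = sum((yi - ybar) ** 2 for yi in y)
f_stat = exclusion_f(vv, uu, g=1, n=len(y), k=2)
```

With one restriction, math.sqrt(f_stat) and abs(t_stat) agree up to rounding error.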

BISPEC sentence.

     The BISPEC sentence performs various nonlinearity, gaussianity and
martingale tests suggested by Hinich. The form of the BISP sentence in
the BTIDEN, BTEST and MARS commands is the same. To save space, detail
for this sentence is only given under the BTIDEN command help file. If
the BISPEC sentence is given with no options or parameters, gaussianity
and nonlinearity tests will be performed using default settings. The
setting

      BISPEC IAUTO ITURNO   $

will perform tests for gaussianity and nonlinearity over a grid of
admissible values for the bandwidth.

TRISPEC sentence

     The TRISPEC sentence performs 4th order nonlinearity tests suggested
by Hinich. Further detail on this sentence is listed under the BTIDEN
command.

POLYSPEC sentence

     The POLYSPEC sentence performs various nonlinearity tests suggested
by Hinich within the sample. Further detail on this sentence is listed
under the BTIDEN command.

REVERSE sentence

     The REVERSE sentence performs various time reversibility tests
suggested by Hinich and Rothman. Further detail on this sentence is
listed under the BTIDEN command.

Examples.

1. User wants to run a regression on the complete sample and do
   nonlinearity tests. Autocorrelations of the residuals are performed
   using the ACF( ) parameter of the BISPEC sentence.

   b34sexec reg$
          model y= x z{1 to 20}$
          bispec iturno iauto acf(24)$
          b34seend$

2. User wants to run regression subsamples that are marked by the
   variable STOCK. Output of the regression is saved in DMF file
   myruns.dmf with name of runone. A formatted dmf file is being used
   and any data in the file is erased prior to the run.
   The saved betas are reread into b34s and the results are sorted
   and the lowest 200 betas listed. Residuals are also saved.

   b34sexec options open('c:myruns.dmf') unit(60) disp=unknown$
            b34seend$
   b34sexec options clean(60)$ b34seend$
   b34sexec options open('c:myres.fsv') unit(44) disp=unknown$
            b34seend$
   b34sexec options clean(44)$ b34seend$
   b34sexec reg dmfunit=60 dmfmember=runone fdmf
                fsavunit=44 fsavname=rone
            panel subkey=stock savecoef saveres$
          model y= x z{1 to 20}$
          b34seend$
   b34sexec data fdmf dmfmember=runone$
          input ident beta0001 beta0002 se000001 se000002
                rsq epe dw noob$
          b34seend$
   b34sexec sort$ by beta0001$ b34seend$
   b34sexec list iend=200$ b34seend$

Using an unformatted dmf file, the above job would be:

   b34sexec options open('c:myruns.dmf') unit(60) disp=unknown
            form=unformatted$
            b34seend$
   b34sexec options clean(60)$ b34seend$
   b34sexec options open('c:myres.fsv') unit(44) disp=unknown$
            b34seend$
   b34sexec options clean(44)$ b34seend$
   b34sexec reg dmfunit=60 dmfmember=runone dmf
                fsavunit=44 fsavname=rone
            panel subkey=stock savecoef saveres$
          model y= x z{1 to 20}$
          b34seend$
   b34sexec data filef=dmf dmfmember=runone$
          input ident beta0001 beta0002 se000001 se000002
                rsq epe dw noob$
          b34seend$
   b34sexec sort$ by beta0001$ b34seend$
   b34sexec list iend=200$ b34seend$


3. User wants to run a regression on the complete sample and test
   whether z{5 to 6}, z{7} and z{1 to 10} are significant using three
   separate tests.

   b34sexec reg$
          model y= x z{1 to 20}$
          test z{5 to 6} $
          test z{7}       $
          test z{1 to 10} $
          b34seend$