# Backward Elimination procedure

```

The scenario:

An investor wishes to track the NASDAQ 100 index (QQQQ) by purchasing up to 11
stocks that he has pre-selected. He is willing to tolerate volatility of at most 2.5%.
What is the minimum number of the 11 stocks that he must purchase in order to meet this
volatility requirement?

Solution:

Using backward elimination, the error and adjusted R-square of different sets of stocks
can be compared. Collinearity matters here, not because the individual effects of the
independent variables are being explored, but because it can leave more variables in the
model than are needed.

Results:

The procedure removed three variables: GM, HAL, and WMI (surprisingly, GM was one of
them). The full output is attached at the end of this analysis. A correlation matrix was
run on the remaining eight variables to check for high collinearity.

Correlation coefficients (first line of each pair) with p-values (second line):

           MMM       ASD      EBAY        GS       IBM       MER       MEL      YHOO

MMM    1.00000   0.24255   0.10982   0.32939   0.30720   0.38582   0.34506   0.19755
                  <.0001    0.0205    <.0001    <.0001    <.0001    <.0001    <.0001

ASD    0.24255   1.00000   0.15376   0.32312   0.20317   0.32785   0.34731   0.23693
        <.0001              0.0011    <.0001    <.0001    <.0001    <.0001    <.0001

EBAY   0.10982   0.15376   1.00000   0.27219   0.17819   0.24400   0.19815   0.42391
        0.0205    0.0011              <.0001    0.0002    <.0001    <.0001    <.0001

GS     0.32939   0.32312   0.27219   1.00000   0.32283   0.70268   0.45924   0.37330
        <.0001    <.0001    <.0001              <.0001    <.0001    <.0001    <.0001

IBM    0.30720   0.20317   0.17819   0.32283   1.00000   0.37625   0.36774   0.19470
        <.0001    <.0001    0.0002    <.0001              <.0001    <.0001    <.0001

MER    0.38582   0.32785   0.24400   0.70268   0.37625   1.00000   0.56383   0.31523
        <.0001    <.0001    <.0001    <.0001    <.0001              <.0001    <.0001

MEL    0.34506   0.34731   0.19815   0.45924   0.36774   0.56383   1.00000   0.27715
        <.0001    <.0001    <.0001    <.0001    <.0001    <.0001              <.0001

YHOO   0.19755   0.23693   0.42391   0.37330   0.19470   0.31523   0.27715   1.00000
        <.0001    <.0001    <.0001    <.0001    <.0001    <.0001    <.0001

MER is highly correlated with both GS (0.70268) and MEL (0.56383). All other
correlations are below 0.46. To address this, a second regression was run with MER
removed. In both models the C(p) statistic equals k+1; however, R-square fell from 0.6809
to 0.6645. Following the parameters of the question strictly, i.e. choosing the minimum
number of stocks that meets the 2.5% volatility requirement, it seems prudent to keep MER.

The SAS System                  15:25 Friday, May 26, 2006   1

The REG Procedure
Model: MODEL1
Dependent Variable: QQQQ QQQQ

Number of Observations Used             445

Backward Elimination: Step 0

All Variables Entered: R-Square = 0.6809 and C(p) = 12.0000

Analysis of Variance

                                         Sum of             Mean
Source                       DF         Squares           Square      F Value       Pr > F

Model                        11         0.02327          0.00212         84.01      <.0001
Error                       433         0.01090       0.00002518
Corrected Total             444         0.03418

              Parameter      Standard
Variable       Estimate         Error    Type II SS   F Value     Pr > F

Intercept   -0.00002310    0.00024104    2.31243E-7        0.01   0.9237
MMM             0.08373       0.02508    0.00028074       11.15   0.0009
ASD             0.04881       0.01608    0.00023202        9.21   0.0025
EBAY            0.07495       0.01093       0.00118       47.02   <.0001
GM              0.00535       0.00961    0.00000781        0.31   0.5780
GS              0.08354       0.02857    0.00021541        8.55   0.0036
IBM             0.20059       0.02677       0.00141       56.15   <.0001
HAL             0.02016       0.01201    0.00007095        2.82   0.0940
MER             0.11100       0.03332    0.00027947       11.10   0.0009
MEL             0.09640       0.02677    0.00032671       12.97   0.0004
WMI             0.05392       0.02518    0.00011545        4.58   0.0328
YHOO            0.09883       0.01450       0.00117       46.48   <.0001
Bounds on condition number: 2.5188, 178.19
------------------------------------------------------------------------------------------------------

Backward Elimination: Step 1

Variable GM Removed: R-Square = 0.6807 and C(p) = 10.3100

Analysis of Variance

                                         Sum of             Mean
Source                       DF         Squares           Square      F Value       Pr > F

Model                        10         0.02326         0.00233           92.52     <.0001
Error                       434         0.01091      0.00002514
Corrected Total             444         0.03418

              Parameter      Standard
Variable       Estimate         Error    Type II SS    F Value    Pr > F

Intercept   -0.00002725    0.00024073    3.220941E-7       0.01   0.9099
MMM             0.08380       0.02506     0.00028122      11.18   0.0009
ASD             0.04951       0.01602     0.00024010       9.55   0.0021
EBAY            0.07507       0.01092        0.00119      47.26   <.0001
GS              0.08246       0.02848     0.00021085       8.39   0.0040
IBM             0.20161       0.02668        0.00144      57.09   <.0001
HAL             0.02006       0.01200     0.00007023       2.79   0.0954
MER             0.11438       0.03274     0.00030687      12.20   0.0005
MEL             0.09657       0.02674     0.00032789      13.04   0.0003
WMI             0.05475       0.02512     0.00011943       4.75   0.0298
YHOO            0.09892       0.01448        0.00117      46.65   <.0001
Bounds on condition number: 2.4354, 149.65
------------------------------------------------------------------------------------------------------

Backward Elimination: Step 2

Variable HAL Removed: R-Square = 0.6786 and C(p) = 11.0986

Analysis of Variance

                                        Sum of             Mean
Source                      DF         Squares           Square      F Value       Pr > F

Model                        9         0.02319          0.00258       102.07       <.0001
Error                      435         0.01098       0.00002525
Corrected Total            444         0.03418

             Parameter      Standard
Variable      Estimate         Error    Type II SS    F Value    Pr > F

Intercept   0.00000999    0.00024019    4.371028E-8       0.00   0.9668
MMM            0.08454       0.02511     0.00028632      11.34   0.0008
ASD            0.05250       0.01595     0.00027337      10.83   0.0011
EBAY           0.07567       0.01094        0.00121      47.87   <.0001
GS             0.08588       0.02846     0.00022989       9.11   0.0027
IBM            0.20004       0.02672        0.00141      56.04   <.0001
MER            0.11849       0.03272     0.00033119      13.12   0.0003
MEL            0.09767       0.02679     0.00033558      13.29   0.0003
WMI            0.05537       0.02517     0.00012219       4.84   0.0283
YHOO           0.09995       0.01450        0.00120      47.52   <.0001
Bounds on condition number: 2.4217, 124.27
------------------------------------------------------------------------------------------------------

Backward Elimination: Step 3

Variable WMI Removed: R-Square = 0.6751 and C(p) = 13.9506

Analysis of Variance

                                        Sum of             Mean
Source                      DF         Squares           Square      F Value       Pr > F

Model                        8         0.02307          0.00288       113.23       <.0001
Error                      436         0.01111       0.00002547
Corrected Total            444         0.03418

             Parameter      Standard
Variable      Estimate         Error    Type II SS    F Value    Pr > F

Intercept   0.00003384    0.00024100    5.020536E-7       0.02   0.8884
MMM            0.08798       0.02517     0.00031124      12.22   0.0005
ASD            0.05546       0.01597     0.00030727      12.06   0.0006
EBAY           0.07768       0.01095        0.00128      50.36   <.0001
GS             0.08991       0.02853     0.00025300       9.93   0.0017
IBM            0.20236       0.02682        0.00145      56.94   <.0001
MER            0.12354       0.03278     0.00036178      14.20   0.0002
MEL            0.10614       0.02663     0.00040466      15.89   <.0001
YHOO           0.09958       0.01456        0.00119      46.76   <.0001
Bounds on condition number: 2.4098, 100.13
------------------------------------------------------------------------------------------------------

All variables left in the model are significant at the 0.0250 level.

Summary of Backward Elimination

        Variable                  Number      Partial         Model
Step    Removed      Label        Vars In     R-Square       R-Square    C(p)     F Value    Pr > F

1     GM           GM              10        0.0002        0.6807     10.3100       0.31   0.5780
2     HAL          HAL              9        0.0021        0.6786     11.0986       2.79   0.0954
3     WMI          WMI              8        0.0036        0.6751     13.9506       4.84   0.0283

```
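
The elimination steps in the SAS listing can be illustrated with a small sketch. This is not SAS's implementation: it assumes a fixed F-to-stay cutoff of about 5.06, which roughly corresponds to the 0.0250 significance level reported in the listing for 1 and about 435 degrees of freedom, and it runs on synthetic data because the original return series are not part of this document. The helper names `ols_sse` and `backward_eliminate` are hypothetical.

```python
import random

def ols_sse(X, y):
    """Least squares via the normal equations; returns (coefficients, SSE)."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [v - factor * w for v, w in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    sse = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))
    return beta, sse

def backward_eliminate(columns, y, f_to_stay=5.06):
    """Drop the predictor with the smallest partial F until all reach f_to_stay.

    f_to_stay=5.06 is an assumed stand-in for a 0.025 p-value cutoff.
    """
    n = len(y)
    kept = list(columns)

    def design(names):
        # Intercept column followed by the selected predictors
        return [[1.0] + [columns[name][i] for name in names] for i in range(n)]

    while kept:
        _, sse = ols_sse(design(kept), y)
        mse = sse / (n - len(kept) - 1)
        partial_f = {}
        for name in kept:
            reduced = [v for v in kept if v != name]
            _, sse_reduced = ols_sse(design(reduced), y)
            # Increase in SSE from dropping the variable (Type II SS) over MSE
            partial_f[name] = (sse_reduced - sse) / mse
        weakest = min(partial_f, key=partial_f.get)
        if partial_f[weakest] >= f_to_stay:
            break
        kept.remove(weakest)
    return kept

# Synthetic illustration: x0 and x1 drive y, x2 and x3 are pure noise.
random.seed(0)
n_obs = 445
columns = {name: [random.gauss(0, 1) for _ in range(n_obs)]
           for name in ("x0", "x1", "x2", "x3")}
y = [0.2 * columns["x0"][i] + 0.1 * columns["x1"][i] + random.gauss(0, 0.05)
     for i in range(n_obs)]
kept = backward_eliminate(columns, y)
print("variables kept:", kept)
```

The partial F computed here is the same quantity as Type II SS divided by the error mean square in the listing; for example, GM at Step 0 gives 0.00000781 / 0.00002518, or about 0.31, matching the reported F value.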
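
The solution refers to comparing the error and adjusted R-square of the candidate models, but the listing reports only the unadjusted R-square and the error mean square. As a rough sketch, the adjusted figures can be recovered from the reported values, assuming the usual formula 1 - (1 - R²)(n - 1)/(n - k - 1) with n = 445 observations and k predictors. The function and variable names are illustrative.

```python
import math

N_OBS = 445  # "Number of Observations Used" in the SAS listing

# (label, predictors in model, R-Square, error mean square) as reported by SAS
steps = [
    ("Step 0 (all 11)", 11, 0.6809, 0.00002518),
    ("Step 1 (GM removed)", 10, 0.6807, 0.00002514),
    ("Step 2 (HAL removed)", 9, 0.6786, 0.00002525),
    ("Step 3 (WMI removed)", 8, 0.6751, 0.00002547),
]

def adjusted_r2(r2, n, k):
    """Adjusted R-square for a model with k predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

for label, k, r2, mse in steps:
    adj = adjusted_r2(r2, N_OBS, k)
    root_mse = math.sqrt(mse)  # residual standard error of the fit
    print(f"{label:22s} adj R2 = {adj:.4f}   root MSE = {root_mse:.5f}")
```

The root MSE is the residual standard error of the regression in the units of the return series. The source does not state how this figure is mapped to the 2.5% volatility tolerance, so no such conversion is attempted here.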
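
The remark that C(p) equals k+1 can be checked against the listing. Assuming the standard definition C(p) = SSE_p / MSE_full - (n - 2p), where p = k + 1 counts the intercept, a model that contains every candidate variable has SSE / MSE = n - p and therefore C(p) = p by construction. The sketch below recomputes C(p) from the rounded sums of squares in the listing; small differences from the SAS values are expected because the printed figures are rounded.

```python
N_OBS = 445
MSE_FULL = 0.00002518  # error mean square of the full 11-variable model (Step 0)

# (step, predictors k, error sum of squares, C(p) reported by SAS)
reported = [
    (0, 11, 0.01090, 12.0000),
    (1, 10, 0.01091, 10.3100),
    (2, 9, 0.01098, 11.0986),
    (3, 8, 0.01111, 13.9506),
]

def mallows_cp(sse, mse_full, n, k):
    """Mallows' C(p) for a subset model with k predictors plus an intercept."""
    p = k + 1
    return sse / mse_full - (n - 2 * p)

for step, k, sse, cp_sas in reported:
    cp = mallows_cp(sse, MSE_FULL, N_OBS, k)
    print(f"step {step}: k+1 = {k + 1:2d}   C(p) recomputed = {cp:6.2f}   SAS = {cp_sas:.4f}")
```

By Step 3 the reported C(p) of 13.9506 is well above k+1 = 9, whereas a model fitted with all of its own candidate variables returns k+1 automatically.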
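
The collinearity check on the eight retained stocks can also be expressed as a short scan of the correlation matrix. The sketch below uses the coefficients reported in the Results section; the 0.5 cutoff is an assumption chosen for illustration, since the source does not state the threshold it used.

```python
# Upper triangle of the reported correlation matrix for the eight retained stocks
corr = {
    ("MMM", "ASD"): 0.24255, ("MMM", "EBAY"): 0.10982, ("MMM", "GS"): 0.32939,
    ("MMM", "IBM"): 0.30720, ("MMM", "MER"): 0.38582, ("MMM", "MEL"): 0.34506,
    ("MMM", "YHOO"): 0.19755,
    ("ASD", "EBAY"): 0.15376, ("ASD", "GS"): 0.32312, ("ASD", "IBM"): 0.20317,
    ("ASD", "MER"): 0.32785, ("ASD", "MEL"): 0.34731, ("ASD", "YHOO"): 0.23693,
    ("EBAY", "GS"): 0.27219, ("EBAY", "IBM"): 0.17819, ("EBAY", "MER"): 0.24400,
    ("EBAY", "MEL"): 0.19815, ("EBAY", "YHOO"): 0.42391,
    ("GS", "IBM"): 0.32283, ("GS", "MER"): 0.70268, ("GS", "MEL"): 0.45924,
    ("GS", "YHOO"): 0.37330,
    ("IBM", "MER"): 0.37625, ("IBM", "MEL"): 0.36774, ("IBM", "YHOO"): 0.19470,
    ("MER", "MEL"): 0.56383, ("MER", "YHOO"): 0.31523,
    ("MEL", "YHOO"): 0.27715,
}

THRESHOLD = 0.5  # hypothetical cutoff for "high" correlation

def flag_pairs(corr, threshold):
    """Return pairs whose absolute correlation exceeds the threshold, strongest first."""
    hits = [pair for pair, r in corr.items() if abs(r) > threshold]
    return sorted(hits, key=lambda pair: -abs(corr[pair]))

flagged = flag_pairs(corr, THRESHOLD)
for a, b in flagged:
    print(f"{a}-{b}: r = {corr[(a, b)]:.5f}")
```

With this cutoff only the two pairs involving MER are flagged, which is consistent with MER being the variable tested for removal in the second regression.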