Hypothesis Testing

Steps in Hypothesis Testing:
1. State the hypotheses
2. Identify the test statistic and its probability distribution
3. Specify the significance level
4. State the decision rule
5. Collect the data and perform the calculations
6. Make the statistical decision
7. Make the economic or investment decision

Two-Tailed Test (Z-test @ 5%)
  Null hypothesis:        H0: μ = μ0
  Alternative hypothesis: Ha: μ ≠ μ0
  where μ0 is the hypothesised mean
  [Figure: standard normal curve centred on μ0, with a 2.5% rejection area
  in each tail beyond ±1.96 standard errors]

One-Tailed Test (Z-test @ 5%)
  Null hypothesis:        H0: μ ≤ μ0
  Alternative hypothesis: Ha: μ > μ0
  [Figure: standard normal curve centred on μ0, with a single 5% rejection
  area in the right tail beyond +1.645 standard errors]
Hypothesis Testing – Test Statistic & Errors

Test Statistic:

  Test statistic = (sample statistic − hypothesised value)
                   / (standard error of the sample statistic)

Test Concerning a Single Mean:

  Test statistic, Z or t = (x̄ − μ0) / sx̄

  where sx̄ = s / √n   (use σx̄ if the population standard deviation is
  available)

Type I and Type II Errors
• Type I error is rejecting the null when it is true.
  Probability = significance level.
• Type II error is failing to reject the null when it is false.
• The power of a test is the probability of correctly rejecting the null
  (i.e. rejecting the null when it is false).

  Decision             | H0 true       | H0 false
  ---------------------|---------------|---------------
  Do not reject null   | Correct       | Type II error
  Reject null          | Type I error  | Correct
Hypothesis about Two Population Means
  Normally distributed populations and independent samples

Examples of hypotheses:
  H0: μ1 − μ2 = 0 versus Ha: μ1 − μ2 ≠ 0
  H0: μ1 − μ2 = 5 versus Ha: μ1 − μ2 ≠ 5
  H0: μ1 − μ2 ≤ 0 versus Ha: μ1 − μ2 > 0
  H0: μ1 − μ2 ≥ 3 versus Ha: μ1 − μ2 < 3
  etc.

  Test statistic, t = [(x̄1 − x̄2) − (μ1 − μ2)] / standard error

Population variances unknown but assumed to be equal:

  Standard error = √( sp²/n1 + sp²/n2 )

  where sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)

  sp² is a pooled estimator of the common variance.
  Degrees of freedom = n1 + n2 − 2

Population variances unknown and cannot be assumed equal:

  Standard error = √( s1²/n1 + s2²/n2 )
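The two standard-error formulas can be sketched as functions; the sample moments in the example are invented for illustration:

```python
import math

def pooled_se(s1, s2, n1, n2):
    # SE when variances are unknown but assumed equal (pooled estimator sp^2)
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 / n1 + sp2 / n2)

def unpooled_se(s1, s2, n1, n2):
    # SE when variances are unknown and cannot be assumed equal
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Illustrative numbers, testing H0: mu1 - mu2 = 0
x1, x2, s1, s2, n1, n2 = 5.0, 4.0, 2.0, 2.0, 25, 25
t = (x1 - x2 - 0.0) / pooled_se(s1, s2, n1, n2)
df = n1 + n2 - 2
```

Note that when the two sample variances happen to be equal, the pooled and unpooled standard errors coincide.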
Hypothesis about Two Population Means
  Normally distributed populations and samples that are not independent –
  “Paired comparisons test”

Possible hypotheses:
  H0: μd = μd0 versus Ha: μd ≠ μd0
  H0: μd ≤ μd0 versus Ha: μd > μd0
  H0: μd ≥ μd0 versus Ha: μd < μd0

  Test statistic, t = (d̄ − μd0) / sd̄

Symbols and other formulae:
  d̄   = sample mean difference = (1/n) Σ di  (summing over i = 1 … n)
  μd0 = hypothesised value of the difference
  sd² = sample variance of the sample differences di
  sd̄  = standard error of the mean difference = sd / √n
  Degrees of freedom = n − 1

Application:
• The data are arranged in paired observations.
• Paired observations are observations that are dependent because they
  have something in common.
• E.g. dividend payout of companies before and after a change in tax law.
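A minimal paired-comparisons sketch; the before/after figures are invented solely to show the arithmetic:

```python
import math

# Hypothetical paired data, e.g. payout ratios before/after a tax change
before = [3.1, 2.8, 3.5, 3.0, 2.9, 3.3, 3.2, 2.7]
after  = [3.4, 3.0, 3.6, 3.3, 3.1, 3.4, 3.5, 2.9]
d = [a - b for a, b in zip(after, before)]   # paired differences d_i
n = len(d)
dbar = sum(d) / n                            # sample mean difference
sd = math.sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))
se = sd / math.sqrt(n)                       # s_dbar = s_d / sqrt(n)
t = (dbar - 0.0) / se                        # H0: mu_d = 0, df = n - 1
```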
Hypothesis about a Single Population Variance

Possible hypotheses:
  H0: σ² = σ0² versus Ha: σ² ≠ σ0²
  H0: σ² ≤ σ0² versus Ha: σ² > σ0²
  H0: σ² ≥ σ0² versus Ha: σ² < σ0²

  Test statistic, χ² = (n − 1)s² / σ0²   (assuming a normal population)

Symbols:
  s²  = variance of the sample data
  σ0² = hypothesised value of the population variance
  n   = sample size
  Degrees of freedom = n − 1

The chi-square distribution is asymmetrical and bounded below by 0. For a
two-tailed test, the lower critical value is obtained from the chi-square
tables at (df, 1 − α/2) and the higher critical value at (df, α/2); reject
H0 if the test statistic falls below the lower or above the higher
critical value.

NB: For a one-tailed test use α or (1 − α) depending on whether it is a
right-tail or left-tail test.
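The chi-square statistic is a one-liner. The sample figures are illustrative, and the 5% two-tailed critical values quoted for df = 24 are standard table values (approximate):

```python
# Chi-square test of a single variance (illustrative numbers)
n = 25
s2 = 16.0         # sample variance
sigma0_2 = 10.0   # hypothesised population variance under H0
chi2 = (n - 1) * s2 / sigma0_2   # test statistic, df = n - 1

# Approximate chi-square critical values for df = 24, 5% two-tailed test
lower, upper = 12.401, 39.364
reject = chi2 < lower or chi2 > upper
```

Here χ² = 38.4 falls between the critical values, so H0 is not rejected.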
Hypothesis about Variances of Two Populations

Possible hypotheses:
  H0: σ1² = σ2² versus Ha: σ1² ≠ σ2²
  H0: σ1² ≤ σ2² versus Ha: σ1² > σ2²
  H0: σ1² ≥ σ2² versus Ha: σ1² < σ2²

  Test statistic, F = s1² / s2²   (assuming normal populations)

The convention is to always put the larger variance on top.

Degrees of freedom: numerator = n1 − 1, denominator = n2 − 1

F distributions are asymmetrical and bounded below by 0. The critical
value is obtained from the F-distribution table for α (one-tailed test)
or α/2 (two-tailed test); reject H0 if the test statistic exceeds the
critical value.
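The F-statistic, with the larger-variance-on-top convention made explicit (the sample variances and sizes are illustrative):

```python
# F-test for equality of two population variances (illustrative numbers)
s1_2, n1 = 25.0, 31   # sample variance and size, sample 1
s2_2, n2 = 16.0, 41   # sample variance and size, sample 2

# Convention: larger sample variance goes in the numerator
F = max(s1_2, s2_2) / min(s1_2, s2_2)   # = 25/16 = 1.5625
df_num, df_den = n1 - 1, n2 - 1          # 30 and 40
# Compare F against the critical value from the F-distribution table
```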
Correlation Analysis

[Figure: scatter plot of paired (x, y) observations]

Sample Covariance and Correlation Coefficient

  Sample covariance = Σ (Xi − X̄)(Yi − Ȳ) / (n − 1)   (summing over i = 1 … n)

The correlation coefficient measures the direction and extent of linear
association between two variables:

  Sample correlation coefficient, rx,y = covariance(x, y) / (sx sy)

  where s = sample standard deviation, and −1.0 ≤ rx,y ≤ +1.0

Testing the Significance of the Correlation Coefficient

  Set H0: ρ = 0, and Ha: ρ ≠ 0

  Test statistic, t = r √(n − 2) / √(1 − r²)

  Degrees of freedom = n − 2. Reject the null if |test statistic| >
  critical t.
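The covariance, correlation, and significance-test formulas can be combined in one sketch; the data are hypothetical:

```python
import math

def corr(xs, ys):
    # Sample correlation: covariance / (s_x * s_y)
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    cov = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
    sx = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - ybar) ** 2 for y in ys) / (n - 1))
    return cov / (sx * sy)

xs = [1, 2, 3, 4, 5, 6]          # illustrative paired observations
ys = [2, 3, 5, 4, 6, 7]
r = corr(xs, ys)
n = len(xs)
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)   # df = n - 2
# For df = 4 at 5% (two-tailed), critical t is about 2.776; here t exceeds
# it, so H0: rho = 0 is rejected
```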
Parametric and Nonparametric Tests

Parametric tests:
• rely on assumptions regarding the distribution of the population, and
• are specific to population parameters.
All tests covered on the previous slides are examples of parametric tests.

Nonparametric tests:
• either do not consider a particular population parameter, or
• make few assumptions about the population that is sampled.
Used primarily in three situations:
• when the data do not meet distributional assumptions
• when the data are given in ranks
• when the hypothesis being addressed does not concern a parameter
  (e.g. is a sample random or not?)
Linear Regression

Basic idea: a linear relationship between two variables, X and Y:

  Yi = b0 + b1 Xi + εi

where εi is the error term (or residual) for observation i, and the mean
of the εi values is 0.

The fitted regression line gives predicted values:

  Ŷi = b̂0 + b̂1 Xi

Least squares regression finds the straight line that minimises Σ ε̂i²
(the sum of the squared errors, SSE).

Note that the standard error of estimate (SEE) is in the same units as Y
and hence should be viewed relative to Y.

[Figure: scatter of observations with Y, the dependent variable, on the
vertical axis and X, the independent variable, on the horizontal axis;
the vertical distance between an observed Yi and the fitted Ŷi is the
residual εi.]
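A least-squares line can be computed directly using the standard closed-form solution for the slope and intercept; the data below are invented for illustration:

```python
def least_squares(xs, ys):
    # Slope and intercept that minimise the sum of squared errors (SSE),
    # via the standard closed-form solution
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    b0 = ybar - b1 * xbar
    return b0, b1

xs = [1, 2, 3, 4, 5]             # illustrative data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares(xs, ys)   # fitted line: Yhat_i = b0 + b1 * X_i
```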
The Components of Total Variation

  Total variation       = Σ (Yi − Ȳ)²   = SST
  Unexplained variation = Σ (Yi − Ŷi)²  = Σ ε̂i²  = SSE
  Explained variation   = Σ (Ŷi − Ȳ)²   = SSR

  (all sums over i = 1 … n)
ANOVA, Standard Error of Estimate & R²

  Sum of squares total (SST) = sum of squares regression (SSR)
                               + sum of squared errors (SSE)

Standard Error of Estimate:

  SEE = √( Σ ε̂i² / (n − 2) ) = √( SSE / (n − 2) )

Coefficient of determination:

R² is the proportion of the total variation in y that is explained by the
variation in x:

  R² = SSR / SST = (SST − SSE) / SST

Interpretation: when correlation is strong (weak, i.e. near to zero):
• R² is high (low)
• the standard error of the estimate is low (high)
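SEE and R² follow directly from the sums of squares; the values below are illustrative:

```python
import math

# Illustrative ANOVA sums for a simple regression with n = 5 observations
sst, sse = 38.90, 0.091
ssr = sst - sse                  # SSR = SST - SSE
n = 5

r2 = ssr / sst                   # R^2 = SSR / SST = (SST - SSE) / SST
see = math.sqrt(sse / (n - 2))   # standard error of estimate
# A low SEE and an R^2 near 1 indicate a strong linear fit
```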
Assumptions & Limitations of Regression Analysis

Assumptions:
1. The relationship between the dependent variable, Y, and the
   independent variable, X, is linear
2. The independent variable, X, is not random
3. The expected value of the error term is 0
4. The variance of the error term is the same for all observations
   (homoskedasticity)
5. The error term is uncorrelated across observations (i.e. no
   autocorrelation)
6. The error term is normally distributed

Limitations:
1. Regression relations change over time (non-stationarity)
2. If the assumptions are not valid, the interpretation and tests of
   hypotheses are not valid
3. When any of the assumptions underlying linear regression are violated,
   we cannot rely on the parameter estimates, test statistics, or point
   and interval forecasts from the regression