# Hypothesis Testing


## Steps in Hypothesis Testing

1. State the hypotheses.
2. Identify the test statistic and its probability distribution.
3. Specify the significance level.
4. State the decision rule.
5. Collect the data and perform the calculations.
6. Make the statistical decision.
7. Make the economic or investment decision.

## Two-Tailed Test (Z-test @ 5%)

- Null hypothesis: $H_0: \mu = \mu_0$
- Alternative hypothesis: $H_a: \mu \neq \mu_0$

where $\mu_0$ is the hypothesised mean. The rejection areas lie in both tails: reject $H_0$ if the sample mean falls more than 1.96 standard errors (SE) on either side of $\mu_0$.
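The two-tailed decision rule can be sketched as a few lines of Python (a minimal illustration; the function name and the sample figures are made up for the example):

```python
import math

def z_test_two_tailed(sample_mean, mu0, sigma, n, z_crit=1.96):
    """Two-tailed Z-test at the 5% level: reject H0 (mu = mu0) if the
    statistic falls more than 1.96 standard errors either side of mu0."""
    se = sigma / math.sqrt(n)         # standard error of the sample mean
    z = (sample_mean - mu0) / se      # standardised test statistic
    return z, abs(z) > z_crit

# Hypothetical sample: n = 36, mean 10.5, known sigma = 1.2, H0: mu = 10
z, reject = z_test_two_tailed(10.5, 10.0, 1.2, 36)
# z = 2.5, which exceeds 1.96, so H0 is rejected
```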

## One-Tailed Test (Z-test @ 5%)

- Null hypothesis: $H_0: \mu \leq \mu_0$
- Alternative hypothesis: $H_a: \mu > \mu_0$

The rejection area lies in the right tail only: reject $H_0$ if the sample mean falls more than 1.645 standard errors above $\mu_0$.
## Hypothesis Testing – Test Statistic & Errors

**Test statistic:**

$$\text{Test statistic} = \frac{\text{sample statistic} - \text{hypothesised value}}{\text{standard error of the sample statistic}}$$

**Test concerning a single mean:**

$$Z \text{ or } t = \frac{\bar{X} - \mu_0}{s_{\bar{X}}}, \qquad s_{\bar{X}} = \frac{s}{\sqrt{n}}$$

(Use $\sigma_{\bar{X}}$ if the population standard deviation is available.)

**Type I and Type II errors:**

- A Type I error is rejecting the null when it is true. Its probability equals the significance level.
- A Type II error is failing to reject the null when it is false.
- The power of a test is the probability of correctly rejecting the null (i.e. rejecting the null when it is false).

| Decision           | $H_0$ true   | $H_0$ false   |
|--------------------|--------------|---------------|
| Do not reject null | Correct      | Type II error |
| Reject null        | Type I error | Correct       |
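A short Python sketch of the single-mean statistic (the numbers are hypothetical; the critical value would then come from the Z or t tables):

```python
import math

def t_stat_single_mean(xbar, mu0, s, n):
    """t (or Z) statistic for a single mean: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Hypothetical sample: n = 25, xbar = 9.8, s = 0.5, H0: mu = 10
t = t_stat_single_mean(9.8, 10.0, 0.5, 25)   # t = -2.0, df = 24
```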
## Test Concerning the Difference Between Two Means

Normally distributed populations and independent samples.

Examples of hypotheses:

- $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 \neq 0$
- $H_0: \mu_1 - \mu_2 \leq 5$ versus $H_a: \mu_1 - \mu_2 > 5$
- $H_0: \mu_1 - \mu_2 \geq 0$ versus $H_a: \mu_1 - \mu_2 < 0$
- $H_0: \mu_1 - \mu_2 = 3$ versus $H_a: \mu_1 - \mu_2 \neq 3$
- etc.

Test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\text{standard error}}$$

**Population variances unknown but assumed to be equal:**

$$\text{Standard error} = \sqrt{\frac{s^2}{n_1} + \frac{s^2}{n_2}}, \qquad s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

where $s^2$ is a pooled estimator of the common variance. Degrees of freedom = $n_1 + n_2 - 2$.

**Population variances unknown and cannot be assumed equal:**

$$\text{Standard error} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
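The pooled (equal-variances) case can be sketched as follows (function name and inputs are hypothetical):

```python
import math

def pooled_t(x1bar, x2bar, s1, s2, n1, n2, d0=0.0):
    """t statistic for a difference in means when the population
    variances are unknown but assumed equal (pooled estimator)."""
    s2p = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(s2p / n1 + s2p / n2)   # standard error of the difference
    return ((x1bar - x2bar) - d0) / se, n1 + n2 - 2   # statistic, df

# Hypothetical samples with equal sizes and equal sample std devs
t, df = pooled_t(5.0, 4.0, 1.0, 1.0, 8, 8)   # t = 2.0, df = 14
```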
## Paired Comparisons Test

Normally distributed populations and samples that are not independent.

Possible hypotheses:

- $H_0: \mu_d = \mu_{d0}$ versus $H_a: \mu_d \neq \mu_{d0}$
- $H_0: \mu_d \leq \mu_{d0}$ versus $H_a: \mu_d > \mu_{d0}$
- $H_0: \mu_d \geq \mu_{d0}$ versus $H_a: \mu_d < \mu_{d0}$

Test statistic:

$$t = \frac{\bar{d} - \mu_{d0}}{s_{\bar{d}}}$$

Symbols and other formulae:

- $\bar{d}$ = sample mean difference $= \frac{1}{n}\sum_{i=1}^{n} d_i$
- $\mu_{d0}$ = hypothesised value of the difference
- $s_d^2$ = sample variance of the sample differences $d_i$
- $s_{\bar{d}}$ = standard error of the mean difference $= \frac{s_d}{\sqrt{n}}$
- Degrees of freedom = $n - 1$

Application:

- The data are arranged in paired observations.
- Paired observations are observations that are dependent because they have something in common, e.g. the dividend payout of companies before and after a change in tax law.
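A minimal Python sketch of the paired test statistic (the paired values are invented purely for illustration):

```python
import math
from statistics import mean, stdev

def paired_t(x_before, x_after, mud0=0.0):
    """Paired comparisons t statistic computed on the differences d_i."""
    d = [b - a for b, a in zip(x_before, x_after)]   # paired differences
    se = stdev(d) / math.sqrt(len(d))   # standard error of the mean difference
    return (mean(d) - mud0) / se, len(d) - 1         # statistic, df = n - 1

# Hypothetical paired data (e.g. payouts before/after a tax change)
t, df = paired_t([2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 2.0, 2.0])
```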
## Hypothesis about a Single Population Variance

Possible hypotheses:

- $H_0: \sigma^2 = \sigma_0^2$ versus $H_a: \sigma^2 \neq \sigma_0^2$
- $H_0: \sigma^2 \leq \sigma_0^2$ versus $H_a: \sigma^2 > \sigma_0^2$
- $H_0: \sigma^2 \geq \sigma_0^2$ versus $H_a: \sigma^2 < \sigma_0^2$

Test statistic, assuming a normal population:

$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$

Symbols:

- $s^2$ = variance of the sample data
- $\sigma_0^2$ = hypothesised value of the population variance
- $n$ = sample size
- Degrees of freedom = $n - 1$

The chi-square distribution is asymmetrical and bounded below by 0. For a two-tailed test, the lower critical value is obtained from the chi-square tables at (df, $1 - \alpha/2$) and the higher critical value at (df, $\alpha/2$); reject $H_0$ if the test statistic falls below the lower critical value or above the higher one, and fail to reject $H_0$ in between. NB: for a one-tailed test use $\alpha$ or $(1 - \alpha)$ depending on whether it is a right-tail or left-tail test.
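The statistic itself is a one-liner; as a sketch with made-up numbers (the critical values would still come from the chi-square tables):

```python
def chi2_stat(s2, sigma0_sq, n):
    """Chi-square statistic for a single variance: (n - 1) s^2 / sigma0^2."""
    return (n - 1) * s2 / sigma0_sq, n - 1   # statistic, degrees of freedom

# Hypothetical: n = 30, sample variance 0.001, hypothesised variance 0.0005
chi2, df = chi2_stat(0.001, 0.0005, 30)   # 58.0 with df = 29
```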
## Hypothesis about Variances of Two Populations

Possible hypotheses:

- $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \neq \sigma_2^2$
- $H_0: \sigma_1^2 \leq \sigma_2^2$ versus $H_a: \sigma_1^2 > \sigma_2^2$
- $H_0: \sigma_1^2 \geq \sigma_2^2$ versus $H_a: \sigma_1^2 < \sigma_2^2$

Test statistic, assuming normal populations:

$$F = \frac{s_1^2}{s_2^2}$$

The convention is to always put the larger variance on top. Degrees of freedom: numerator = $n_1 - 1$, denominator = $n_2 - 1$. F distributions are asymmetrical and bounded below by 0. The critical value is obtained from the F-distribution table at $\alpha$ for a one-tailed test or $\alpha/2$ for a two-tailed test; reject $H_0$ if the test statistic exceeds the critical value, otherwise fail to reject.
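A small sketch that applies the larger-variance-on-top convention (the sample variances and sizes are hypothetical):

```python
def f_stat(s1_sq, n1, s2_sq, n2):
    """F ratio with the larger sample variance in the numerator, so F >= 1.
    Returns the statistic and the (numerator, denominator) degrees of freedom."""
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, (n1 - 1, n2 - 1)
    return s2_sq / s1_sq, (n2 - 1, n1 - 1)

# Hypothetical sample variances: sample 2 has the larger variance
f, (df_num, df_den) = f_stat(2.0, 25, 4.0, 31)   # F = 2.0 with df (30, 24)
```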
## Correlation Analysis

### Sample Covariance and Correlation Coefficient

A scatter plot of the paired observations $(X_i, Y_i)$ gives a visual impression of their association.

$$\text{Sample covariance} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

The correlation coefficient measures the direction and extent of linear association between two variables:

$$r_{x,y} = \frac{\text{covariance}_{x,y}}{s_x s_y}, \qquad -1.0 \leq r_{x,y} \leq +1.0$$

where $s$ is the sample standard deviation.

### Testing the Significance of the Correlation Coefficient

Set $H_0: \rho = 0$ and $H_a: \rho \neq 0$. Test statistic:

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$

Degrees of freedom = $n - 2$. Reject the null if |test statistic| > critical $t$.
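As a quick numerical sketch with a hypothetical $r$ and $n$:

```python
import math

def corr_t_stat(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

t = corr_t_stat(0.6, 27)   # hypothetical values; t = 0.6 * 5 / 0.8 = 3.75
```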
## Parametric and Nonparametric Tests

Parametric tests:

- rely on assumptions regarding the distribution of the population, and
- are specific to population parameters.

All of the tests covered above are examples of parametric tests.

Nonparametric tests:

- either do not consider a particular population parameter, or
- make few assumptions about the population that is sampled.

They are used primarily in three situations:

- when the data do not meet distributional assumptions;
- when the data are given in ranks;
- when the hypothesis being tested does not concern a parameter (e.g. is a sample random or not?).
## Linear Regression

Basic idea: a linear relationship between two variables, X and Y:

$$Y_i = b_0 + b_1 X_i + \varepsilon_i$$

where $Y$ is the dependent variable, $X$ is the independent variable, and $\varepsilon_i$ is the error term (or residual); the mean of the $\varepsilon_i$ values is 0.

The fitted line through the scatter of observations is

$$\hat{Y}_i = \hat{b}_0 + \hat{b}_1 X_i$$

Least squares regression finds the straight line that minimises $\sum \hat{\varepsilon}_i^2$ (the sum of the squared errors, SSE).

Note that the standard error of estimate (SEE) is in the same units as $Y$ and hence should be viewed relative to $Y$.
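The least-squares estimates have closed forms, sketched here in Python on made-up data (the function name and values are illustrative only):

```python
from statistics import mean

def ols_fit(x, y):
    """Least-squares estimates of intercept b0 and slope b1,
    minimising the sum of squared errors (SSE)."""
    xbar, ybar = mean(x), mean(y)
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    b0 = ybar - b1 * xbar
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0]   # hypothetical observations
y = [2.0, 3.0, 5.0, 6.0]
b0, b1 = ols_fit(x, y)     # b0 = 0.5, b1 = 1.4
```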
## The Components of Total Variation

$$\text{Total variation} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \text{SST}$$

$$\text{Unexplained variation} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \text{SSE}$$

$$\text{Explained variation} = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 = \text{SSR}$$
## ANOVA, Standard Error of Estimate & R²

The sum of squares total (SST) splits into the sum of squares regression (SSR) and the sum of squared errors (SSE).

Standard error of estimate:

$$\text{SEE} = \sqrt{\frac{\sum_{i=1}^{n} \hat{\varepsilon}_i^2}{n - 2}} = \sqrt{\frac{\text{SSE}}{n - 2}}$$

Coefficient of determination: $R^2$ is the proportion of the total variation in $y$ that is explained by the variation in $x$:

$$R^2 = \frac{\text{SSR}}{\text{SST}} = \frac{\text{SST} - \text{SSE}}{\text{SST}}$$

Interpretation: when correlation is strong (weak, i.e. near to zero),

- $R^2$ is high (low), and
- the standard error of the estimate is low (high).
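Both quantities follow directly from the sums of squares (the figures below are hypothetical):

```python
import math

def see_and_r2(sst, sse, n):
    """Standard error of estimate and coefficient of determination."""
    see = math.sqrt(sse / (n - 2))   # same units as Y
    r2 = (sst - sse) / sst           # equivalently SSR / SST
    return see, r2

# Hypothetical sums of squares for n = 4 observations
see, r2 = see_and_r2(10.0, 0.2, 4)   # see = sqrt(0.1), r2 = 0.98
```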
## Assumptions & Limitations of Regression Analysis

Assumptions:

1. The relationship between the dependent variable, Y, and the independent variable, X, is linear.
2. The independent variable, X, is not random.
3. The expected value of the error term is 0.
4. The variance of the error term is the same for all observations (homoskedasticity).
5. The error term is uncorrelated across observations (i.e. no autocorrelation).
6. The error term is normally distributed.

Limitations:

1. Regression relations change over time (non-stationarity).
2. If the assumptions are not valid, the interpretation and tests of hypotheses are not valid.
3. When any of the assumptions underlying linear regression are violated, we cannot rely on the parameter estimates, test statistics, or point and interval forecasts from the regression.
