# CHI-SQUARE & F-TEST

• Contents:
– The relationship between chi-square and ANOVA (the F test)

## Chi-square and F Distributions: Children of the Normal
• There are many theoretical distributions, both continuous and discrete.
• We use four of these a lot: z (unit normal), t, chi-square, and F.
• z and t are closely related to the sampling distribution of means; chi-square and F are closely related to the sampling distribution of variances.
## Chi-square Distribution (1)

$$z = \frac{X - \bar{X}}{SD};\qquad z = \frac{X - \mu}{\sigma};\qquad z = \frac{y - \mu}{\sigma}\qquad\text{(z score)}$$

$$z^2 = \frac{(y - \mu)^2}{\sigma^2}\qquad\text{(z score squared)}$$

Make it Greek:

$$z^2 = \chi^2_{(1)}$$

What would its sampling distribution look like?

• Minimum value is zero.
• Maximum value is infinite.
• Most values are between zero and 1; most around zero.
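A quick simulation (a sketch using NumPy; the code and variable names are illustrative, not from the slides) shows these properties of squared standard-normal deviates:

```python
import numpy as np

# z^2, where z ~ N(0,1), follows a chi-square distribution with 1 df.
rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)
z_sq = z ** 2

print(z_sq.min())         # near zero: squared values cannot be negative
print(z_sq.mean())        # near 1, the mean of chi-square(1)
print((z_sq < 1).mean())  # most values fall between 0 and 1 (about 0.68)
```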
## Chi-square (2)

What if we took 2 values of z² at random and added them?

$$z_1^2 = \frac{(y_1 - \mu)^2}{\sigma^2};\qquad z_2^2 = \frac{(y_2 - \mu)^2}{\sigma^2};\qquad \chi^2_{(2)} = z_1^2 + z_2^2 = \frac{(y_1 - \mu)^2}{\sigma^2} + \frac{(y_2 - \mu)^2}{\sigma^2}$$

Same minimum and maximum as before, but now the average should be a bit bigger.

Chi-square is the distribution of a sum of squares. Each squared deviation is taken from the unit normal: N(0,1). The shape of the chi-square distribution depends on the number of squared deviates that are summed.

## Chi-square (3)

The distribution of chi-square depends on one parameter, its degrees of freedom (df or v). As df gets large, the curve becomes less skewed, more normal.
## Chi-square (4)

• The expected value of chi-square is df.
– The mean of the chi-square distribution is its degrees of freedom.
• The expected variance of the distribution is 2df.
– If the variance is 2df, the standard deviation must be sqrt(2df).
• There are tables of chi-square, so you can find the top 5 or 1 percent of the distribution.
• Chi-square is additive:

$$\chi^2_{(v_1 + v_2)} = \chi^2_{(v_1)} + \chi^2_{(v_2)}$$
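These facts are easy to check numerically. The sketch below (assuming SciPy is available; not part of the original slides) confirms that the mean is df, the variance is 2df, and that independent chi-squares add:

```python
import numpy as np
from scipy import stats

df = 7
# Mean of chi-square equals its df; variance equals 2*df.
print(stats.chi2.mean(df))   # 7.0
print(stats.chi2.var(df))    # 14.0

# Additivity: chi-square(3) + chi-square(4) behaves like chi-square(7).
rng = np.random.default_rng(1)
total = (stats.chi2.rvs(3, size=100_000, random_state=rng)
         + stats.chi2.rvs(4, size=100_000, random_state=rng))
print(total.mean())          # near 7
print(total.var())           # near 14
```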
## Distribution of the Sample Variance

$$s^2 = \frac{\sum (y - \bar{y})^2}{N - 1}\qquad\text{Sample estimate of population variance (unbiased).}$$

Multiply the variance estimate by N-1 to get the sum of squares; divide by the population variance to normalize. The result is a random variable distributed as chi-square with (N-1) df:

$$\frac{(N-1)s^2}{\sigma^2} \sim \chi^2_{(N-1)}$$

We can use info about the sampling distribution of the variance estimate to find confidence intervals and conduct statistical tests.
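As a sanity check, a simulation (a sketch, not from the original slides) shows that (N-1)s²/σ² computed from normal samples behaves like chi-square with N-1 df:

```python
import numpy as np

# Draw many samples of size N from a normal population and compute
# (N-1)s^2 / sigma^2 for each; the result should follow chi-square(N-1).
rng = np.random.default_rng(2)
N, sigma = 10, 3.0
samples = rng.normal(loc=5.0, scale=sigma, size=(50_000, N))
s2 = samples.var(axis=1, ddof=1)   # unbiased sample variances
stat = (N - 1) * s2 / sigma**2

print(stat.mean())                 # near N-1 = 9 (the df)
print(stat.var())                  # near 2*(N-1) = 18
```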
## Testing Exact Hypotheses

$$H_0: \sigma^2 = \sigma_0^2$$

Test the null that the population variance has some specific value. Pick alpha and a rejection region. Then:

$$\chi^2_{(N-1)} = \frac{(N-1)s^2}{\sigma_0^2}$$

Plug the hypothesized population variance and the sample variance into the equation, along with the sample size we used to estimate the variance. Compare to the chi-square distribution.
## Example of Exact Test

Test about the variance of height of people in inches. Grab 30 people at random and measure height.

$$H_0: \sigma^2 = 6.25;\qquad H_1: \sigma^2 < 6.25$$

$$N = 30;\qquad s^2 = 4.55$$

Note: 1-tailed test on the small side. Set alpha = .01.

$$\chi^2_{(29)} = \frac{(29)(4.55)}{6.25} = 21.11$$

The mean is 29, so the statistic is on the small side. But for Q = .99, the critical value of chi-square is 14.257. Cannot reject the null.

$$H_0: \sigma^2 = 6.25;\qquad H_1: \sigma^2 \neq 6.25$$

$$N = 30;\qquad s^2 = 4.55$$

Note: 2-tailed with alpha = .01. Now chi-square with v = 29 and Q = .995 is 13.121, and with Q = .005 the result is 52.336. N.S. either way.
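The one-tailed exact test above can be reproduced with SciPy rather than a printed table (a sketch; variable names are illustrative). Note that the table's Q is the upper-tail area, so Q = .99 is the lower 1% point:

```python
from scipy import stats

# One-tailed test of H0: sigma^2 = 6.25 vs H1: sigma^2 < 6.25, alpha = .01.
N, s2, sigma0_sq = 30, 4.55, 6.25
chi_sq = (N - 1) * s2 / sigma0_sq     # test statistic
crit = stats.chi2.ppf(0.01, N - 1)    # lower 1% point (the table's Q = .99)

print(round(chi_sq, 2))               # 21.11
print(round(crit, 2))                 # 14.26
print(chi_sq < crit)                  # False: cannot reject the null
```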
## Confidence Intervals for the Variance

We use s² to estimate σ². It can be shown that:

$$P\left[\frac{(N-1)s^2}{\chi^2_{(N-1;\,.025)}} \le \sigma^2 \le \frac{(N-1)s^2}{\chi^2_{(N-1;\,.975)}}\right] = .95$$

Suppose N = 15 and s² is 10. Then df = 14, and for Q = .025 the value is 26.12; for Q = .975 the value is 5.63.

$$P\left[\frac{(14)(10)}{26.12} \le \sigma^2 \le \frac{(14)(10)}{5.63}\right] = .95$$

$$P\left[5.36 \le \sigma^2 \le 24.87\right] = .95$$
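The same interval can be computed directly (a sketch assuming SciPy; the table's Q is the upper-tail area, so Q = .025 corresponds to the .975 quantile):

```python
from scipy import stats

# 95% CI for the population variance with N = 15, s^2 = 10.
N, s2 = 15, 10.0
df = N - 1
chi_upper = stats.chi2.ppf(0.975, df)   # table Q = .025, about 26.12
chi_lower = stats.chi2.ppf(0.025, df)   # table Q = .975, about 5.63

lo = df * s2 / chi_upper
hi = df * s2 / chi_lower
print(round(lo, 2), round(hi, 2))       # 5.36 24.87
```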
## Normality Assumption

• We assume normal distributions to figure sampling distributions and thus p levels.
• Violations of normality have minor implications for testing means, especially as N gets large.
• Violations of normality are more serious for testing variances. Look at your data before conducting this test. You can test for normality.
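One common way to check normality before testing variances is the Shapiro-Wilk test; the snippet below is a sketch on hypothetical data (the slides do not name a specific test):

```python
import numpy as np
from scipy import stats

# Hypothetical height data (not from the slides), checked with Shapiro-Wilk.
rng = np.random.default_rng(3)
heights = rng.normal(loc=68.0, scale=2.5, size=30)

stat, p = stats.shapiro(heights)
print(round(stat, 3), round(p, 3))  # a small p would cast doubt on normality
```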
## The F Distribution (1)

• The F distribution is the ratio of two variance estimates:

$$F = \frac{s_1^2}{s_2^2} = \frac{\text{est. }\sigma_1^2}{\text{est. }\sigma_2^2}$$

• It is also the ratio of two chi-squares, each divided by its degrees of freedom:

$$F = \frac{\chi^2_{(v_1)}/v_1}{\chi^2_{(v_2)}/v_2}$$

In our applications, v2 will be larger than v1, and v2 will be larger than 2. In such a case, the mean of the F distribution (expected value) is v2/(v2 - 2).
## The F Distribution (2)

• F depends on two parameters: v1 and v2 (df1 and df2). The shape of F changes with these. The range is 0 to infinity. It is shaped a bit like chi-square.
• F tables show critical values for df in the numerator and df in the denominator.
• F tables are 1-tailed; you can figure 2-tailed if you need to (but you usually don't).
## Variances

• Suppose

$$H_0: \sigma_1^2 = \sigma_2^2;\qquad H_1: \sigma_1^2 > \sigma_2^2$$

– Note: 1-tailed.

• We find

$$N_1 = 16;\quad s_1^2 = 5.8;\qquad N_2 = 16;\quad s_2^2 = 1.7$$

• Then df1 = df2 = 15, and

$$F = \frac{s_1^2}{s_2^2} = \frac{5.8}{1.7} = 3.41$$

Going to the F table with 15 and 15 df, we find that for alpha = .05 (1-tailed) the critical value is 2.40. Therefore the result is significant.
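The same F test in code (a sketch assuming SciPy; variable names are illustrative):

```python
from scipy import stats

# One-tailed F test: H0: var1 = var2 vs H1: var1 > var2, alpha = .05.
N1, s1_sq = 16, 5.8
N2, s2_sq = 16, 1.7
df1, df2 = N1 - 1, N2 - 1

F = s1_sq / s2_sq
crit = stats.f.ppf(0.95, df1, df2)   # critical value, about 2.40

print(round(F, 2))                   # 3.41
print(F > crit)                      # True: the result is significant
```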
• The F distribution is used in many statistical tests:
– Test for equality of variances.
– Tests for differences in means in ANOVA.
– Tests for regression models (slopes relating one continuous variable to another, like SAT and GPA).
## Relations among Distributions – the Children of the Normal

• Chi-square is drawn from the normal: N(0,1) deviates squared and summed.
• F is the ratio of two chi-squares, each divided by its df. A chi-square divided by its df is a variance estimate, that is, a sum of squares divided by degrees of freedom.
• F = t². If you square t, you get an F with 1 df in the numerator:

$$t^2_{(v)} = F_{(1,\,v)}$$

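The t² = F relation can be verified numerically (a sketch assuming SciPy): the squared two-tailed t critical value equals the one-tailed F critical value with 1 numerator df.

```python
from scipy import stats

# Square a two-tailed t critical value and compare it with the one-tailed
# F critical value with (1, v) df: they should match.
v = 20
t_crit = stats.t.ppf(0.975, v)      # two-tailed .05 critical value
f_crit = stats.f.ppf(0.95, 1, v)    # one-tailed .05 critical value

print(abs(t_crit**2 - f_crit) < 1e-6)   # True
```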