# Chi Square Test

```

Dealing with a categorical dependent variable
So far:

DV            IV            Test
Continuous    Categorical   •T-test  •ANOVA
Continuous    Continuous    •Correlation  •Regression
Categorical   Categorical   •Chi-square
Pearson Chi-Square:

•Works on frequencies: no mean and SD
•Test statistic: \chi^2
•No assumption of normality
•Non-parametric test
Chi-Square test for goodness of fit
-Is the frequency of balls with different colors equal in our bag?

Observed frequencies:               50     30     30     10
Expected frequencies (proportions): 25%    25%    25%    25%
Chi-Square test for goodness of fit

Observed frequencies:               50     30     30     10     Total: 120
Expected proportions:               25%    25%    25%    25%    (each x 120)
Expected frequencies under H0:      30     30     30     30
Chi-Square test for goodness of fit

Observed frequencies:   50     30     30     10
Expected frequencies:   30     30     30     30

\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}

The numerator is the squared difference; dividing by f_e normalizes it.

\chi^2 = \frac{(50 - 30)^2}{30} + \frac{(30 - 30)^2}{30} + \frac{(30 - 30)^2}{30} + \frac{(10 - 30)^2}{30} = 26.67

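Chi-Square test for goodness of fit

\chi^2 = 26.67

Density of the chi-square distribution with k degrees of freedom:
f(x; k) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2}

df = C - 1 = 4 - 1 = 3

With the total fixed, only three of the four categories are free to vary:
25%    25%    25%    ?     Total: 100%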
Chi-Square test for goodness of fit
Critical value (df = 3, alpha = 0.05) = 7.81
\chi^2 = 26.67, which exceeds the critical value
df = C - 1 = 4 - 1 = 3

\chi^2(3, n = 120) = 26.67, p < 0.001
Chi-Square test for Goodness of fit

•The chi-square test for goodness of fit is like a one-sample t-test
•You can test your sample against any set of expected values, for example:

H0:   25%    25%    25%    25%
H0:   10%    10%    10%    70%
Chi-Square test for independence
•When we have two or more sets of categorical data (IV and DV both categorical)

FO        None   Obama   McCain   Total
Male       10     50      35       95
Female     15     60      40      115
Total      25    110      75      210
Chi-Square test for independence
•Also called contingency table analysis
•H0: There is no relation between gender and voting
preference (like correlation)
OR
•H0: There is no difference between the voting
preference of males and females (like t-test)

•The logic is the same as in the goodness of fit test:
compare the observed frequencies with the frequencies expected
if the two variables were independent
Chi-Square test for independence

FO        None   Obama   McCain   Total
Male       10     50      35       95
Female     15     60      40      115
Total      25    110      75      210

FE        None   Obama   McCain   Total
Male
Female
Total      12%    52%     36%     100%
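Chi-Square test for independence
In case of independence, each row follows the overall column percentages:

FE        None   Obama   McCain   Total
Male       12%    52%     36%      95
Female     12%    52%     36%     115
Total      12%    52%     36%     100%

Finally, multiplying each row total by the (rounded) percentages:

FE        None   Obama   McCain
Male      11.4   49.4    34.2
Female    13.8   59.8    41.4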
Chi-Square test for independence
•Another way:

f_e = \frac{f_c \times f_r}{n} = \frac{\text{column total} \times \text{row total}}{\text{total}}

FE        None            Obama   McCain   Total
Male      95 x 25 / 210                      95
Female                                      115
Total     25                                210
Chi-Square test for independence
•Now we can calculate the chi-square value:

FO        None   Obama   McCain
Male       10     50      35
Female     15     60      40

FE        None   Obama   McCain
Male      11.4   49.4    34.2
Female    13.8   59.8    41.4

\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}

\chi^2 = \frac{(10 - 11.4)^2}{11.4} + \frac{(15 - 13.8)^2}{13.8} + \dots = 0.35

df = (C - 1)(R - 1) = (3 - 1)(2 - 1) = 2

Chi-Square test for independence

df  (C 1)(R1) (31)(21) 2

FE       None     Obama
                             McCain
Male      11.4    49.4    Fixed    95

Female    Fixed   Fixed   Fixed    115

25       110      75      210
Chi-Square test for independence

\chi^2(2, n = 210) = 0.35, p = 0.84
There is no significant effect of
gender on vote preference

Or

We cannot reject the null
hypothesis that gender and vote
preference are independent
Effect size in Chi square

•For a 2 x 2 table -> Phi coefficient

\phi = \sqrt{\frac{\chi^2}{n}}

Correlation between two categorical variables
Phi of 0.1 small, 0.3 medium, 0.5 large

•For larger tables -> Cramer’s V coefficient

V = \sqrt{\frac{\chi^2}{n \cdot df^*}}

df* is the smaller of C - 1 and R - 1
Assumptions of Chi Square

•Independence of observations
each subject in only one category

•Size of expected frequencies:
be cautious with small cell frequencies

•No assumption of Normality:
Nonparametric test
Likelihood ratio test: an alternative

•Instead of using chi-square, when dealing with
categorical data we can calculate the log-likelihood ratio:

G = 2 \sum f_o \ln\left(\frac{f_o}{f_e}\right)

•Based on the ratio of observed to expected frequencies

Likelihood ratio test: an alternative

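FO        None   Obama   McCain
Male       10     50      35
Female     15     60      40

FE        None   Obama   McCain
Male      11.4   49.4    34.2
Female    13.8   59.8    41.4

G = 2 \sum f_o \ln\left(\frac{f_o}{f_e}\right)

G = 2 \left( 10 \ln\frac{10}{11.4} + 15 \ln\frac{15}{13.8} + \dots \right) = 0.355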

•Follows a Chi-square distribution with df of (R-1)(C-1)
Chi Square test with rank ordered data

          Anxiety Level
SAG       1      2      3
1        10     50     35
2        15     60     40

•Rank order your data for the two variables
•Get the correlation of the two variables: Spearman r
•Calculate chi-square as follows:

M^2 = (N - 1) r^2

Data layout, one row per subject (subject number, then the value on each variable):
1   1   1
2   3   2
3   2   2
4   2   1
5   1   1
6   2   1
7   1   2

```
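The goodness-of-fit slides above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `chi2_sf` and `goodness_of_fit` are illustrative and not from the slides, and the p-value helper assumes an integer number of degrees of freedom.

```python
import math

def chi2_sf(x, k):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df k."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if k % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{k/2-1} (x/2)^i / i!
        return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k // 2))
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(k-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    tail = sum(half**(i - 0.5) / math.gamma(i + 0.5) for i in range(1, (k - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail

def goodness_of_fit(observed, proportions):
    """Return (chi2, df, p) for observed counts against hypothesized proportions."""
    n = sum(observed)
    expected = [n * p for p in proportions]          # e.g. 25% of 120 = 30
    chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1                           # df = C - 1
    return chi2, df, chi2_sf(chi2, df)

# Ball colors from the slides: observed counts, H0 of equal proportions.
observed = [50, 30, 30, 10]
chi2, df, p = goodness_of_fit(observed, [0.25, 0.25, 0.25, 0.25])
print(f"chi2({df}, n={sum(observed)}) = {chi2:.2f}, p = {p:.2g}")
```

Passing a different list of proportions, such as `[0.10, 0.10, 0.10, 0.70]`, tests the same counts against the second null hypothesis shown on the slides.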
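The test of independence, the likelihood ratio G, and Cramer's V from the slides above can be reproduced in the same way. This is a sketch under stated assumptions: the helper names are hypothetical, and `slide_fe` holds the expected frequencies exactly as printed on the slides, which were obtained from rounded column percentages (12%, 52%, 36%). The unrounded expected frequencies from row total x column total / n differ slightly, so the statistic computed from them is a little smaller than the slide value.

```python
import math

# Observed frequencies from the slides (columns: None, Obama, McCain).
observed = [
    [10, 50, 35],   # Male
    [15, 60, 40],   # Female
]

def expected_counts(table):
    """Expected frequency per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi2(obs, exp):
    """Sum of (fo - fe)^2 / fe over all cells."""
    return sum((fo - fe) ** 2 / fe
               for row_o, row_e in zip(obs, exp)
               for fo, fe in zip(row_o, row_e))

def g_statistic(obs, exp):
    """Likelihood ratio statistic G = 2 * sum(fo * ln(fo / fe)); empty cells add 0."""
    return 2 * sum(fo * math.log(fo / fe)
                   for row_o, row_e in zip(obs, exp)
                   for fo, fe in zip(row_o, row_e) if fo > 0)

n = sum(map(sum, observed))
rows, cols = len(observed), len(observed[0])
df = (rows - 1) * (cols - 1)

exact_fe = expected_counts(observed)
slide_fe = [[11.4, 49.4, 34.2], [13.8, 59.8, 41.4]]   # rounded values from the slides

chi2_slide = pearson_chi2(observed, slide_fe)
g_slide = g_statistic(observed, slide_fe)
chi2_exact = pearson_chi2(observed, exact_fe)

# For df = 2 the chi-square upper tail has the closed form exp(-x / 2).
assert df == 2
p_exact = math.exp(-chi2_exact / 2)

# Cramer's V with df* = min(R - 1, C - 1).
cramers_v = math.sqrt(chi2_exact / (n * min(rows - 1, cols - 1)))

print(f"slide FE: chi2 = {chi2_slide:.2f}, G = {g_slide:.3f}")
print(f"exact FE: chi2 = {chi2_exact:.2f}, p = {p_exact:.2f}, V = {cramers_v:.3f}")
```

Either set of expected frequencies leads to the same conclusion: the null hypothesis of independence is not rejected.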
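The rank-order procedure on the last slide (rank both variables, take the Spearman correlation, compute M² = (N - 1) r²) can be sketched as follows. Two assumptions are made here that the slides do not state: the seven subject rows shown on the slide are treated as if they were a complete data set, purely for illustration, and M² is referred to a chi-square distribution with 1 degree of freedom, as is usual for this linear-by-linear association statistic. The function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values 1..N, giving tied values the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1    # positions are 0-based, ranks are 1-based
        for i in order[start:end + 1]:
            ranks[i] = mean_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation; applied to ranks it gives Spearman's r."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# The two value columns of the seven subject rows on the slide (illustration only).
var_a = [1, 3, 2, 2, 1, 2, 1]
var_b = [1, 2, 2, 1, 1, 1, 2]

r = pearson_r(average_ranks(var_a), average_ranks(var_b))
m2 = (len(var_a) - 1) * r ** 2

# Assumed df = 1: the chi-square upper tail is then erfc(sqrt(x / 2)).
p = math.erfc(math.sqrt(m2 / 2))
print(f"Spearman r = {r:.3f}, M2 = {m2:.3f}, p = {p:.2f}")
```

Because M² uses the ordering of the categories, it has a single degree of freedom regardless of the table size, unlike the (R-1)(C-1) of the Pearson chi-square.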