Chi Square Test

Dealing with a categorical dependent variable
So Far:

DV            IV            Tests
Continuous    Categorical   •T-test  •ANOVA
Continuous    Continuous    •Correlation  •Regression
Categorical   Categorical   •Chi-Square
Pearson Chi-Square:

•Frequencies
  No mean and SD
•χ² statistic
•No assumption of normality
  Non-parametric test
Chi-Square test for goodness of fit
         -Is the frequency of balls with
         different colors equal in our bag?



          Observed Frequencies

    50         30         30         10

           Expected Frequencies

   25%        25%        25%        25%
 Chi-Square test for goodness of fit

      Observed Frequencies             Total
50        30       30        10        120

      Expected Frequencies under H0
25%      25%       25%       25%       of 120

30        30       30        30        120
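     Chi-Square test for goodness of fit

                    Observed Frequencies
             50          30           30          10
                    Expected Frequencies
             30          30           30          30

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

The numerator is the difference between observed and expected frequencies; dividing by f_e normalizes it.

$$\chi^2 = \frac{(50-30)^2}{30} + \frac{(30-30)^2}{30} + \frac{(30-30)^2}{30} + \frac{(10-30)^2}{30} = 26.67$$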
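The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the variable names are assumptions made for the sketch, not part of the original slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [50, 30, 30, 10]             # ball counts per color, from the slide
n = sum(observed)                       # 120 balls in total
expected = [0.25 * n] * len(observed)   # H0: each color is 25% of the bag

chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))  # 26.67
```

Only the first and last colors contribute, since the two middle colors match their expected count exactly.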

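      Chi-Square test for goodness of fit

χ² = 26.67

Chi-square distribution with k degrees of freedom:

$$f(x; k) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} \, x^{k/2 - 1} e^{-x/2}$$

$$df = C - 1 = 4 - 1 = 3$$

With the total fixed, only three of the four categories are free to vary:

 25%        25%      25%        ? (fixed = 25%)      Total 100%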
   Chi-Square test for goodness of fit
Critical value (df = 3, α = 0.05) = 7.81
χ² = 26.67, which is larger than the critical value

$$df = C - 1 = 4 - 1 = 3$$

χ²(3, n=120) = 26.67,
p < 0.001
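To see where the critical value and the p-value come from, here is a minimal sketch that integrates the chi-square density numerically with Simpson's rule. The function names, the integration width and the step count are assumptions chosen for the sketch; a statistics library would normally be used instead.

```python
import math


def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom, for x > 0."""
    return (0.5 ** (k / 2) / math.gamma(k / 2)) * x ** (k / 2 - 1) * math.exp(-x / 2)


def chi2_upper_tail(x, k, width=200.0, steps=20000):
    """Approximate P(X > x) by Simpson's rule on [x, x + width].

    Assumes x > 0; the area beyond x + width is negligible and ignored.
    steps must be even for Simpson's rule.
    """
    h = width / steps
    total = chi2_pdf(x, k) + chi2_pdf(x + width, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(x + i * h, k)
    return total * h / 3


p_at_critical = chi2_upper_tail(7.81, 3)   # area beyond the critical value, close to 0.05
p_observed = chi2_upper_tail(26.67, 3)     # area beyond the observed statistic

print(p_at_critical, p_observed)
```

The area beyond 7.81 is about 5%, which is why 7.81 serves as the cutoff at df = 3, while the area beyond 26.67 is far below 0.001.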
  Chi-Square test for goodness of fit

•The chi-square test for goodness of fit is like a
one-sample t-test
•You can test your sample against any
possible expected values

  H0:   25%      25%      25%       25%

  H0:   10%      10%      10%       70%
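As a sketch of testing the same sample against different null hypotheses, the code below applies both sets of expected proportions to the ball counts used earlier (50, 30, 30, 10). Pairing those counts with the 10/10/10/70 hypothesis is an assumption made for illustration, as are the function and dictionary names.

```python
def expected_counts(proportions, n):
    """Turn hypothesized proportions into expected frequencies for n observations."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [p * n for p in proportions]


def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [50, 30, 30, 10]
n = sum(observed)

hypotheses = {
    "equal": [0.25, 0.25, 0.25, 0.25],
    "unequal": [0.10, 0.10, 0.10, 0.70],
}

# One chi-square value per null hypothesis, same observed data.
results = {
    name: chi_square_statistic(observed, expected_counts(props, n))
    for name, props in hypotheses.items()
}
print(results)
```

The observed counts stay the same; only the expected frequencies change with the hypothesis, and so does the statistic.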
     Chi-Square test for independence
•When we have two or more sets of
categorical data (IV, DV both categorical)

FO       None    Obama   McCain   Total
Male      10      50       35       95
Female    15      60       40      115
Total     25     110       75      210
    Chi-Square test for independence
•Also called contingency table analysis
•H0: There is no relation between gender and voting
preference (like correlation)
   OR
•H0: There is no difference between the voting
preference of males and females (like t-test)


•The logic is the same as in the goodness of fit test:
compare the observed frequencies with the frequencies
expected if the two variables were independent
     Chi-Square test for independence
FO       None   Obama   McCain   Total
Male      10     50      35       95
Female    15     60      40      115
Total     25    110      75      210

FE       None   Obama   McCain   Total
Male
Female
Total    12%    52%     36%      100%
          Chi-Square test for independence
In case of independence, every row has the same column percentages:
FE        None    Obama   McCain   Total
Male      12%     52%     36%       95
Female    12%     52%     36%      115
Total     12%     52%     36%      100%

Finally:
FE        None    Obama   McCain
Male      11.4    49.4    34.2
Female    13.8    59.8    41.4
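       Chi-Square test for independence
•Another way:

$$f_e = \frac{f_c \times f_r}{n} = \frac{\text{column total} \times \text{row total}}{\text{total}}$$

FE       None            Obama   McCain   Total
Male     95 x 25 / 210                     95
Female
Total    25                               210

(95 x 25 / 210 = 11.3; the 11.4 in the expected table comes from the rounded 12%.)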
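The row-total times column-total rule can be sketched as follows, using the observed voting table from the slides. The function name `expected_frequencies` is an assumption for this sketch.

```python
observed = [
    [10, 50, 35],   # Male:   None, Obama, McCain
    [15, 60, 40],   # Female: None, Obama, McCain
]


def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 1) for value in row])
# [11.3, 49.8, 33.9]
# [13.7, 60.2, 41.1]
```

These values differ slightly from the 11.4, 49.4, 34.2 / 13.8, 59.8, 41.4 on the slides because the slides round the column percentages to 12%, 52% and 36% before multiplying.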
       Chi-Square test for independence
•Now we can calculate the chi-square value:

FO     10            50         35
       15            60         40

FE     11.4         49.4       34.2
       13.8         59.8       41.4

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

$$\chi^2 = \frac{(10 - 11.4)^2}{11.4} + \frac{(15 - 13.8)^2}{13.8} + \dots = 0.35$$

$$df = (C - 1)(R - 1) = (3 - 1)(2 - 1) = 2$$
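A minimal sketch of this calculation, using the observed and expected tables exactly as they appear on the slide (the expected values are the slide's rounded ones). The function name is an assumption for the sketch.

```python
fo = [[10, 50, 35],
      [15, 60, 40]]
fe = [[11.4, 49.4, 34.2],     # expected values as printed on the slide
      [13.8, 59.8, 41.4]]


def chi_square_independence(observed, expected):
    """Return (chi-square, df) for a contingency table and its expected counts."""
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    rows, cols = len(observed), len(observed[0])
    df = (cols - 1) * (rows - 1)
    return chi2, df


chi2, df = chi_square_independence(fo, fe)
print(round(chi2, 2), df)  # 0.35 2
```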
  
          Chi-Square test for independence

$$df = (C - 1)(R - 1) = (3 - 1)(2 - 1) = 2$$

With the row and column totals fixed, only two cells are free to vary:

FE       None    Obama   McCain   Total
Male     11.4    49.4    Fixed     95
Female   Fixed   Fixed   Fixed    115
Total    25      110     75       210
       Chi-Square test for independence

χ²(2, n=210) = 0.35, p = 0.84
There is no significant effect of
gender on vote preference

Or

We cannot reject the null
hypothesis that gender and vote
preference are independent
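For exactly 2 degrees of freedom the chi-square upper tail has a simple closed form, p = exp(-χ²/2), so the reported p-value can be checked by hand. The sketch below assumes the conventional α = 0.05, which the slides do not state for this test; the function name is also an assumption.

```python
import math


def p_value_df2(chi2):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-chi2 / 2)


chi2 = 0.35
alpha = 0.05                          # assumed significance level
p = p_value_df2(chi2)
critical = -2 * math.log(alpha)       # value whose upper tail equals alpha at df = 2
reject_null = p < alpha

print(round(p, 2), round(critical, 2), reject_null)  # 0.84 5.99 False
```

The statistic is far below the critical value, so the null hypothesis of independence is not rejected.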
              Effect size in Chi square

     •For a 2 x 2 table -> Phi coefficient

$$\phi = \sqrt{\frac{\chi^2}{n}}$$

       Correlation between two categorical variables
       Phi of 0.1 small, 0.3 medium, 0.5 large

     •For larger tables -> Cramér's V coefficient

$$V = \sqrt{\frac{\chi^2}{n \cdot df^*}}$$

       df* is the smaller of C-1 and R-1
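Both effect sizes are one-line formulas. The sketch below applies Cramér's V to the voting example (χ² = 0.35, n = 210, a 2 x 3 table); the function names are assumptions for illustration.

```python
import math


def phi_coefficient(chi2, n):
    """Phi for a 2 x 2 table: sqrt(chi2 / n)."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V: sqrt(chi2 / (n * df*)), df* = smaller of rows-1 and cols-1."""
    df_star = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * df_star))


v = cramers_v(0.35, 210, n_rows=2, n_cols=3)
print(round(v, 3))  # 0.041
```

For a 2 x 2 table df* is 1, so Cramér's V reduces to phi.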
         Assumptions of Chi Square


•Independence of observations
      each subject in only one category

•Size of expected frequencies:
      be cautious with small cell frequencies


•No assumption of Normality:
     Nonparametric test
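One way to act on the caution about small cell frequencies is to flag expected counts below a cutoff. The cutoff of 5 used here is a commonly cited rule of thumb and an assumption of this sketch; the slides do not give a number. The function name is hypothetical.

```python
def small_expected_cells(expected, threshold=5.0):
    """Return (row, column, value) for each expected frequency below threshold."""
    return [
        (i, j, value)
        for i, row in enumerate(expected)
        for j, value in enumerate(row)
        if value < threshold
    ]


# Expected frequencies from the voting example on the slides.
expected = [[11.4, 49.4, 34.2],
            [13.8, 59.8, 41.4]]

flagged = small_expected_cells(expected)
print(flagged)  # []
```

An empty list means no cell falls below the assumed cutoff, so the voting table raises no concern on this point.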
   Likelihood ratio test: an alternative

•Instead of using chi-square, when dealing with
categorical data we can calculate the log-likelihood ratio:

$$G = 2 \sum f_o \ln\left(\frac{f_o}{f_e}\right)$$

•A ratio of observed and expected frequencies
     Likelihood ratio test: an alternative

FO      10          50          35
        15          60          40

FE      11.4       49.4        34.2
        13.8       59.8        41.4

$$G = 2 \sum f_o \ln\left(\frac{f_o}{f_e}\right)$$

$$G = 2 \left( 10 \ln\frac{10}{11.4} + 15 \ln\frac{15}{13.8} + \dots \right) = 0.355$$

•Follows a chi-square distribution with df of (R-1)(C-1)
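The G statistic can be sketched the same way as the chi-square statistic, again using the slide's observed and (rounded) expected tables. The function name is an assumption, and treating a cell with an observed count of zero as contributing nothing is the usual convention rather than something the slides state.

```python
import math

fo = [[10, 50, 35],
      [15, 60, 40]]
fe = [[11.4, 49.4, 34.2],     # expected values as printed on the slide
      [13.8, 59.8, 41.4]]


def g_statistic(observed, expected):
    """Log-likelihood ratio statistic: 2 * sum of f_o * ln(f_o / f_e)."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            if o > 0:               # a zero observed count contributes nothing
                total += o * math.log(o / e)
    return 2 * total


g = g_statistic(fo, fe)
print(round(g, 3))  # 0.355
```

Here G (0.355) and the Pearson chi-square (0.35) are nearly identical, as expected when observed and expected frequencies are close.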
  Chi Square test with rank ordered data

         Anxiety Level
        1       2       3
  1     10      50      35
  2     15      60      40

•Rank order your data for the two variables
•Get the correlation of the two variables:
  Spearman r
•Calculate chi-square as follows:

$$M^2 = (N - 1) r^2$$

  S   A   G
  1   1   1
  2   3   2
  3   2   2
  4   2   1
  5   1   1
  6   2   1
  7   1   2
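The three steps can be sketched on the seven-row example. This assumes that column S is the subject number and that A and G are the two ranked variables, which the slide does not spell out. Spearman r is computed here as the Pearson correlation of average ranks so that ties are handled; all function names are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


a = [1, 3, 2, 2, 1, 2, 1]    # column A for subjects 1..7
g = [1, 2, 2, 1, 1, 1, 2]    # column G for subjects 1..7

r = spearman_r(a, g)
m2 = (len(a) - 1) * r ** 2   # M^2 = (N - 1) r^2
print(round(r, 3), round(m2, 3))
```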

				