The Chi-Square

					              The Chi-Square

            Tests for Goodness of Fit
                       And
                 Independence




              The Chi-Square
I.     Introduction
II.    Expected versus Observed Values
III.   Distribution of $\chi^2$
IV.    Interpreting SPSS printouts of Chi-Square
V.     Reporting the Results of Chi-Square
VI.    Assumptions of Chi-Square




                Introduction
Often when we are testing hypotheses, we only
  have frequency data. Our hypotheses concern
  the distribution of the frequencies across
  various categories.

Examples:
 Are there an equal number of males and
  females in a group?
 Are Republicans more likely to be
  Fundamentalist Christians than Democrats?




             Introduction
With these data we have the number of
 people of a certain type in a category.
 These are qualitative, not quantitative, data.
 The scale of measurement is nominal.

Compare this to age as a variable. Age is
 a quantitative variable, measured on a
 ratio scale.




             Introduction
If one were to ask whether Republicans are older
  than Democrats, then one could
  measure the age of a sample of people
  in each group, calculate the mean of
  each sample, and test whether the difference
  in the sample means is statistically
  significant (i.e., whether the sample means
  reflect a difference in the population
  means).




             Introduction
Compare this to the question: “Are
 Republicans more likely to be males
 than Democrats?” Our sample would
 contain a number of males and females.
 We would not want to calculate a mean
 gender.




             Introduction
Age and Party Affiliation
     Republican      Democrat
     M = 51.2        M = 47.5
Appropriate statistical test:
     Independent samples t test.

  $t = \dfrac{M_1 - M_2}{s_{M_1 - M_2}}$,    $df = (n_1 - 1) + (n_2 - 1)$
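
As a rough illustration of the t formula above, the sketch below computes the statistic using a pooled standard error for the difference between means. The two age lists are hypothetical values invented for this example (chosen only so that their means equal the 51.2 and 47.5 shown on the slide), and the function name `independent_t` is an assumption, not part of the original material.

```python
import math
import statistics

def independent_t(sample1, sample2):
    """Return (t, df) for an independent-samples t test with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / df
    # Standard error of the difference between the two means (s_{M1-M2}).
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff, df

# Hypothetical ages, NOT data from the original slides.
republican_ages = [48, 55, 51, 60, 42]
democrat_ages = [45, 50, 41, 54, 47, 48]

t, df = independent_t(republican_ages, democrat_ages)
print(f"t = {t:.2f}, df = {df}")
```

The resulting t would then be compared against the t distribution with the computed degrees of freedom.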




             Introduction
Gender and Party Affiliation
                Males     Females
  Republicans     58         42
  Democrats       70         80



Appropriate statistical test:
    Chi-Square




   Expected versus Observed
            Values
With the Chi-Square, you test the
 distribution of counts across the groups
 against a hypothetical distribution (the
 H0, or null hypothesis).

For example, the null hypothesis might be
  that males and females are equally
  likely to be Republicans or Democrats.




   Expected versus Observed
            Values
For example, in a sample of 100
  Republicans, the null hypothesis might
  be that there would be 50 males and 50
  females.
Expected values:
                Males     Females
Republicans:     50         50




   Expected versus Observed
            Values
However, if you know that the
 population is 60 percent female, then
 the expected values should be as
 follows:
               Males     Females
Republicans:    40          60
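
The two sets of expected values above follow the same rule: multiply the sample size by the proportion the null hypothesis assigns to each category. A minimal sketch of that rule, where the function name `expected_counts` is an illustrative assumption:

```python
def expected_counts(n, proportions):
    """Expected frequency per category: sample size times hypothesized proportion."""
    return {category: n * p for category, p in proportions.items()}

# Null hypothesis 1: males and females are equally likely.
equal_split = expected_counts(100, {"Males": 0.5, "Females": 0.5})

# Null hypothesis 2: the sample mirrors a population that is 60 percent female.
population_split = expected_counts(100, {"Males": 0.4, "Females": 0.6})

print(equal_split)
print(population_split)
```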




   Expected versus Observed
            Values
In any random sample of 100 people, I will not
  necessarily observe exactly 60 females and 40 males,
  any more than I will get exactly 50 heads in
  100 coin tosses.
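
To give a sense of how rarely a sample lands exactly on its expected count, the sketch below evaluates the binomial probability of the two outcomes just mentioned. It assumes independent draws with a fixed probability, which is a simplification added for illustration; the helper name `binomial_pmf` is likewise an assumption.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p_50_heads = binomial_pmf(50, 100, 0.5)
p_60_females = binomial_pmf(60, 100, 0.6)

print(f"P(exactly 50 heads in 100 tosses)   = {p_50_heads:.3f}")
print(f"P(exactly 60 females in 100 people) = {p_60_females:.3f}")
```

Both probabilities come out near 0.08, so some departure from the expected values is the norm rather than the exception.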

Chi Square measures the difference between
  the observed values and the expected values,
  and compares that difference to what one
  might expect by chance.

Chi-square = $\chi^2 = \sum \dfrac{(f_o - f_e)^2}{f_e}$




        Expected versus Observed
                 Values
                          Males          Females
 Republicans:
     Observed             58             42
     Expected             40             60

 $\chi^2 = \dfrac{(58-40)^2}{40} + \dfrac{(42-60)^2}{60} = 8.1 + 5.4 = 13.5$
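
The worked example above can be checked with a few lines of code. This is only a sketch of the arithmetic; the function name `chi_square` is an assumption made for illustration.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [58, 42]   # males, females among the 100 Republicans
expected = [40, 60]   # expected counts if the population is 60 percent female

statistic = chi_square(observed, expected)
print(round(statistic, 1))  # 13.5
```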




                Distribution of $\chi^2$




Large values of $\chi^2$ are unlikely to be observed by chance alone
 if the null hypothesis is true.




                Distribution of $\chi^2$




  Shape of the distribution depends on the degrees of freedom.




           Distribution of $\chi^2$


The degrees of freedom are determined by the
   number of rows and columns in the table.
If there is only one row,
   df = C-1
With more than one row,
   df = (R-1)(C-1)
   R = number of rows.
   C = number of columns.

In our example (one row, two columns), df = 2 - 1 = 1
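
The rules above for degrees of freedom can be sketched in code, along with the probability of seeing a Chi-Square value at least as large as the one observed. The tail-probability helper works only for df = 1, where the Chi-Square variable is the square of a standard normal variable; other degrees of freedom would need a statistics library. Both function names are assumptions made for illustration.

```python
import math

def degrees_of_freedom(rows, columns):
    """df = C - 1 for a single row, otherwise (R - 1)(C - 1)."""
    if rows == 1:
        return columns - 1
    return (rows - 1) * (columns - 1)

def p_value_df1(chi_square_value):
    """Upper-tail probability of a chi-square value with 1 df (valid for df = 1 only)."""
    # With 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square_value / 2))

print(degrees_of_freedom(1, 2))   # one row, two columns
print(degrees_of_freedom(2, 2))   # 2 x 2 table
print(p_value_df1(13.5))          # the Republicans example, far below .05
```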




           Distribution of $\chi^2$




With two dimensions: 2 X 2 Chi-Square


Gender and Party Affiliation (observed values)
                 Males     Females     Totals
  Republicans      58         42         100
  Democrats        70         80         150
  Totals          128        122         250



       Null hypothesis: gender and party affiliation are independent,
       so counts are distributed across the cells in proportion to
       the row and column totals.




  With two dimensions: 2 X 2 Chi-Square


  Gender and Party Affiliation (expected values)
                 Males                  Females                Totals
  Republicans    100*128/250 = 51.2     100*122/250 = 48.8     100
  Democrats      150*128/250 = 76.8     150*122/250 = 73.2     150
  Totals         128                    122                    250



                Use these values to calculate Chi Square:
                $\chi^2 = \sum \dfrac{(f_o - f_e)^2}{f_e}$
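
The expected values in the table above, and the Chi-Square computed from them, can be reproduced with the short sketch below. Each expected count is the row total times the column total divided by the grand total; the function names are assumptions made for illustration.

```python
def expected_from_totals(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]

def chi_square_table(observed):
    """Sum of (f_o - f_e)^2 / f_e over every cell of the table."""
    expected = expected_from_totals(observed)
    return sum((fo - fe) ** 2 / fe
               for obs_row, exp_row in zip(observed, expected)
               for fo, fe in zip(obs_row, exp_row))

# Rows: Republicans, Democrats. Columns: males, females.
observed = [[58, 42], [70, 80]]

print(expected_from_totals(observed))
print(round(chi_square_table(observed), 3))  # 3.084
```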




   Interpreting SPSS printouts of
             Chi-Square
  Data Structure:




 Interpreting SPSS printouts
 of Chi-Square
Case Processing Summary
                                     Cases
                      Valid            Missing            Total
                    N    Percent     N    Percent      N    Percent
 Party * Gender    250   100.0%      0      .0%       250   100.0%




 Party * Gender Crosstabulation
                                              Gender
                                          male    female    Total
 Party   Republican   Count                58       42       100
                      Expected Count       51.2     48.8     100.0
         Democrat     Count                70       80       150
                      Expected Count       76.8     73.2     150.0
 Total                Count               128      122       250
                      Expected Count      128.0    122.0     250.0




    Interpreting SPSS printouts
    of Chi-Square
                   Chi-Square Tests
                               Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                              (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             3.084 a   1    .079
Continuity Correction          2.648     1    .104
Likelihood Ratio               3.094     1    .079
Fisher's Exact Test                                         .093         .052
Linear-by-Linear Association   3.072     1    .080
N of Valid Cases               250

Compare the Asymp. Sig. (2-sided) value for the Pearson Chi-Square (.079) to alpha (.05).



a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 48.80.
b. Computed only for a 2x2 table
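
As a cross-check on the printout, the sketch below recomputes the Pearson Chi-Square and the Continuity Correction rows, together with their significance values. It assumes that the continuity correction is the usual Yates form (subtracting 0.5 from each absolute difference) and uses a tail formula that holds only for df = 1; neither detail is stated on the slide, and the function names are illustrative.

```python
import math

# Observed and expected counts taken from the crosstabulation.
observed = [[58, 42], [70, 80]]
expected = [[51.2, 48.8], [76.8, 73.2]]

def chi_square_2x2(observed, expected, correction=0.0):
    """Pearson chi-square; correction=0.5 gives the Yates-corrected value (assumed form)."""
    return sum((abs(fo - fe) - correction) ** 2 / fe
               for obs_row, exp_row in zip(observed, expected)
               for fo, fe in zip(obs_row, exp_row))

def p_value_df1(chi_square_value):
    """Upper-tail probability for 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi_square_value / 2))

pearson = chi_square_2x2(observed, expected)
corrected = chi_square_2x2(observed, expected, correction=0.5)

print(f"Pearson Chi-Square:    {pearson:.3f}, p = {p_value_df1(pearson):.3f}")
print(f"Continuity Correction: {corrected:.3f}, p = {p_value_df1(corrected):.3f}")
```

Since .079 is greater than .05, the null hypothesis is not rejected.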




                 Reporting the Results
    “A Chi-Square test was performed to
      determine if males and females were
      distributed differently across the
      political parties. The test failed to
      indicate a significant difference, $\chi^2(1)
      = 3.08$, p = .079 (an alpha level of .05
      was adopted for this and all subsequent
      statistical tests).”
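
The numeric part of a report like the one above can be produced from the printout values with a small formatter. This is only a sketch: the function name is an assumption, and the rounding (two decimals for the statistic, three for p with no leading zero) simply mirrors the sample sentence.

```python
def format_chi_square_report(statistic, df, p):
    """Format a chi-square result in the style 'χ²(1) = 3.08, p = .079'."""
    p_text = f"{p:.3f}".lstrip("0")  # drop the leading zero, e.g. 0.079 -> .079
    return f"χ²({df}) = {statistic:.2f}, p = {p_text}"

report = format_chi_square_report(3.084, 1, 0.079)
print(report)
```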




      Assumptions of Chi-Square
    1. Independence of Observations
         Each person is counted in only one cell.


    2. Size of Expected Frequencies
         Fewer than 20% of the cells should have
            expected frequencies less than 5.
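
The second assumption can be checked directly from the expected counts, which is what footnote a of the SPSS printout reports. A minimal sketch, with a hypothetical function name and the threshold of 5 taken from the rule above:

```python
def small_expected_cells(expected, threshold=5):
    """Return (number of small cells, their share of all cells, minimum expected count)."""
    cells = [fe for row in expected for fe in row]
    n_small = sum(1 for fe in cells if fe < threshold)
    return n_small, n_small / len(cells), min(cells)

# Expected counts from the Party * Gender crosstabulation.
expected = [[51.2, 48.8], [76.8, 73.2]]
n_small, share, minimum = small_expected_cells(expected)

print(f"{n_small} cells ({share:.1%}) have expected count less than 5.")
print(f"The minimum expected count is {minimum:.2f}.")
```

A share of 20 percent or more would signal that the size assumption is not met.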



