# Chi Square Test Example by tutorvistateam



Chi-square is a statistical test commonly used to compare observed data with data we would
expect to obtain according to a specific hypothesis. For example, if, according to Mendel's
laws, you expected 10 of 20 offspring from a cross to be male and the actual observed
number was 8 males, then you might want to know about the "goodness of fit" between the
observed and expected.

Were the deviations (differences between observed and expected) the result of chance, or
were they due to other factors? How much deviation can occur before you, the investigator,
must conclude that something other than chance is at work, causing the observed to differ
from the expected?

The chi-square test is always testing what scientists call the null hypothesis, which states that
there is no significant difference between the expected and observed result.

The formula for calculating chi-square (χ²) is: χ² = Σ (o − e)² / e

That is, chi-square is the sum, over all possible categories, of the squared difference between
the observed (o) and expected (e) data (the deviation, d) divided by the expected value.
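The formula above can be sketched as a small Python function; the counts below come from the Mendel example earlier (8 males observed where 10 were expected, out of 20 offspring, so 12 females observed where 10 were expected).

```python
# Chi-square statistic: sum of (observed - expected)^2 / expected
# over all categories.

def chi_square(observed, expected):
    """Return the chi-square statistic for paired category counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Mendel example from the text: 8 males vs 10 expected,
# 12 females vs 10 expected.
print(chi_square([8, 12], [10, 10]))  # → 0.8
```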

Math.Tutorvista.com                                                     Page No. :- 1/5
For example, suppose that a cross between two pea plants yields a population of 880 plants,
639 with green seeds and 241 with yellow seeds. You are asked to propose the genotypes of
the parents.

If your hypothesis is true, then the predicted ratio of offspring from this cross would be 3:1
(based on Mendel's laws) as predicted from the results of the Punnett square. Then calculate
χ² using this formula, as shown in Table B.1. Note that we get a value of about 2.67 for χ². But
what does this number mean? Here's how to interpret the χ² value:

1. Determine degrees of freedom (df). Degrees of freedom can be calculated as the number of
categories in the problem minus 1. In our example, there are two categories (green and
yellow); therefore, there is 1 degree of freedom.

2. Determine a relative standard to serve as the basis for accepting or rejecting the
hypothesis. The relative standard commonly used in biological research is p = 0.05. The p
value is the probability that the deviation of the observed from the expected is due to chance
alone (no other forces acting). Using this standard, you reject the null hypothesis only when
such a deviation would arise by chance alone 5% of the time or less.

3. Refer to a chi-square distribution table (Table B.2). Using the appropriate degrees of
freedom, locate the value closest to your calculated chi-square in the table. Determine the
closest p (probability) value associated with your chi-square and degrees of freedom. In this
case (χ² ≈ 2.67), the p value is about 0.10, which means that there is a 10% probability that
any deviation from expected results is due to chance only.
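The whole pea-plant calculation can be checked numerically. The statistic follows from the counts above; the p-value for 1 degree of freedom uses the identity P(χ² > x) = erfc(√(x/2)), a standard result not stated in the text.

```python
import math

observed = [639, 241]                    # green, yellow
expected = [880 * 3 / 4, 880 * 1 / 4]    # 660, 220 from the 3:1 ratio

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 1 the chi-square tail probability reduces to erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 3))     # ≈ 2.673
print(round(p_value, 2))  # ≈ 0.10
```

With p ≈ 0.10 > 0.05, the deviation from the 3:1 expectation is consistent with chance, so the null hypothesis stands.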

Step-by-Step Procedure for Testing Your Hypothesis and Calculating Chi-Square

1. State the hypothesis being tested and the predicted results. Gather the data by conducting
the proper experiment (or, if working genetics problems, use the data provided in the
problem).

2. Determine the expected numbers for each observational class. Remember to use numbers,
not percentages.
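Step 2 can be sketched as a small helper that turns a predicted ratio into expected counts (not percentages); `expected_counts` is a name invented here for illustration.

```python
def expected_counts(total, ratio):
    """Split a total count into expected counts according to a predicted ratio."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]

# The 3:1 pea cross from earlier, with 880 offspring in total:
print(expected_counts(880, (3, 1)))  # → [660.0, 220.0]
```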

## Binomial Distribution Formula

In probability theory and statistics, the binomial distribution is the discrete probability
distribution of the number of successes in a sequence of n independent yes/no experiments,
each of which yields success with probability p.
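The definition above translates directly into a probability mass function; a minimal sketch using only the standard library:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# With n = 1 this reduces to a Bernoulli distribution,
# and the probabilities over all outcomes sum to 1.
print(binom_pmf(1, 1, 0.3))                           # → 0.3
print(sum(binom_pmf(k, 10, 0.3) for k in range(11)))
```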

Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial; when
n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis
for the popular binomial test of statistical significance.

The binomial distribution is frequently used to model the number of successes in a sample of
size n drawn with replacement from a population of size N.

If the sampling is carried out without replacement, the draws are not independent and so the
resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much
larger than n, the binomial distribution is a good approximation, and widely used.
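The contrast can be checked numerically. The hypergeometric pmf below is the standard one; the population sizes (N = 10,000 items, K = 3,000 successes, n = 10 draws) are illustrative assumptions chosen so that N is much larger than n.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """k successes in n draws without replacement from N items, K of them successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

N, K, n = 10_000, 3_000, 10          # N >> n, success fraction p = K/N = 0.3
max_diff = max(abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, K / N))
               for k in range(n + 1))
print(max_diff)  # small: the binomial approximation is close when N >> n
```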

The following is an example of applying a continuity correction: Suppose one wishes to
calculate Pr(X ≤ 8) for a binomial random variable X.

If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is approximated by
Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction; the uncorrected normal
approximation gives considerably less accurate results.
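The text leaves n and p unspecified, so the sketch below assumes n = 20 trials with p = 0.5 to make the comparison concrete. The exact Pr(X ≤ 8) is summed from the binomial pmf; the normal CDF is written with math.erf.

```python
from math import comb, erf, sqrt

n, p = 20, 0.5                      # assumed values; the text gives none
mu, sigma = n * p, sqrt(n * p * (1 - p))

def norm_cdf(x):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(9))  # Pr(X <= 8)
with_correction = norm_cdf(8.5)     # Pr(Y <= 8.5), continuity-corrected
without_correction = norm_cdf(8.0)  # uncorrected

print(round(exact, 4), round(with_correction, 4), round(without_correction, 4))
```

Running this shows the corrected value landing much closer to the exact binomial probability than the uncorrected one.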

This approximation, known as the de Moivre–Laplace theorem, is a huge time-saver (exact
calculations with large n are very onerous); historically, it was the first use of the normal
distribution, introduced in Abraham de Moivre's book The Doctrine of Chances in 1738.
Nowadays, it can be seen as a consequence of the central limit theorem since B(n, p) is a
sum of n independent, identically distributed Bernoulli variables with parameter p. This fact is
the basis of a hypothesis test, a "proportion z-test," for the value of p using x/n, the sample
proportion and estimator of p, in a common test statistic.

For example, suppose you randomly sample n people out of a large population and ask them
whether they agree with a certain statement. The proportion of people who agree will of
course depend on the sample.

If you sampled groups of n people repeatedly and truly randomly, the proportions would follow
an approximate normal distribution with mean equal to the true proportion p of agreement in
the population and with standard deviation σ = √(p(1 − p)/n). Large sample sizes n are good
because the standard deviation, as a proportion of the expected value, gets smaller, which
allows a more precise estimate of the unknown parameter p.
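The shrinking standard deviation is easy to see numerically; p = 0.3 below is an assumed true proportion.

```python
from math import sqrt

def proportion_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1 - p)/n)."""
    return sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    print(n, proportion_sd(0.3, n))
# Each tenfold increase in n shrinks the standard deviation by a factor of sqrt(10).
```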

## Poisson approximation

The binomial distribution converges towards the Poisson distribution as the number of trials
goes to infinity while the product np remains fixed. Therefore the Poisson distribution with
parameter λ = np can be used as an approximation to B(n, p) of the binomial distribution if n is
sufficiently large and p is sufficiently small. According to two rules of thumb, this
approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.[8]
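The second rule of thumb can be checked directly with n = 100 and p = 0.05 (so np = 5):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return lam ** k * exp(-lam) / factorial(k)

n, p = 100, 0.05
lam = n * p                          # λ = np = 5
max_diff = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(n + 1))
print(max_diff)  # small — the Poisson approximation is close here
```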