Binomial Distribution

In probability theory and statistics, the binomial distribution is the discrete probability
distribution of the number of successes in a sequence of n independent yes/no
experiments, each of which yields success with probability p.

Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli
trial; when n = 1, the binomial distribution is a Bernoulli distribution.
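
As a minimal sketch of this definition, the probability of exactly k successes in n trials is C(n, k) p^k (1 − p)^(n − k). The helper name binomial_pmf and the value p = 0.3 below are illustrative assumptions, not taken from the text.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.3  # illustrative success probability (assumption)

# With n = 1 the distribution reduces to a Bernoulli distribution:
# P(X = 0) = 1 - p and P(X = 1) = p.
bernoulli = [binomial_pmf(k, 1, p) for k in (0, 1)]

# The probabilities over k = 0..n sum to 1 (checked here for n = 10).
total = sum(binomial_pmf(k, 10, p) for k in range(11))
```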

The binomial distribution is the basis for the popular binomial test of statistical
significance.

The binomial distribution is frequently used to model the number of successes in a sample of
size n drawn with replacement from a population of size N.

If the sampling is carried out without replacement, the draws are not independent and so the
resulting distribution is a hypergeometric distribution, not a binomial one.

However, for N much larger than n, the binomial distribution is a good approximation and is
widely used.
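
The quality of this approximation can be checked numerically. The sketch below compares the hypergeometric and binomial probabilities for a fixed sample size n = 10 drawn from a small and a large population; the population sizes, the 40% success fraction, and the function names are hypothetical choices for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(k successes) in n draws with replacement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeometric_pmf(k, N, K, n):
    """P(k successes) in n draws without replacement from N items, K of them successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def max_abs_gap(N, K, n):
    """Largest pointwise difference between the two distributions."""
    p = K / N
    return max(
        abs(hypergeometric_pmf(k, N, K, n) - binomial_pmf(k, n, p))
        for k in range(n + 1)
    )

# Same sample size and success fraction, different population sizes (assumed values).
gap_small_population = max_abs_gap(N=50, K=20, n=10)
gap_large_population = max_abs_gap(N=50_000, K=20_000, n=10)
```

With N only five times n the two distributions differ noticeably; with N several thousand times n the gap becomes negligible.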
The following is an example of applying a continuity correction. Suppose one wishes
to calculate Pr(X ≤ 8) for a binomial random variable X.

If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is
approximated by Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction; the
uncorrected normal approximation gives considerably less accurate results.
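
The effect of the correction can be seen in a small numerical sketch. The text does not give n and p for this example, so n = 20 and p = 0.5 are assumed here; the approximating normal variable Y is taken to have mean np and variance np(1 − p).

```python
from math import comb, erf, sqrt

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal variable."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

n, p = 20, 0.5  # hypothetical parameters, not from the text
mean, sd = n * p, sqrt(n * p * (1 - p))

# Exact binomial probability Pr(X <= 8).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(9))

# Normal approximation with and without the continuity correction.
corrected = normal_cdf(8.5, mean, sd)
uncorrected = normal_cdf(8.0, mean, sd)

print(f"exact={exact:.4f} corrected={corrected:.4f} uncorrected={uncorrected:.4f}")
```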

This approximation, known as the de Moivre–Laplace theorem, is a huge time-saver
when undertaking calculations by hand (exact calculations with large n are very
onerous); historically, it was the first use of the normal distribution, introduced in
Abraham de Moivre's book The Doctrine of Chances in 1738.

Nowadays, it can be seen as a consequence of the central limit theorem since B(n, p)
is a sum of n independent, identically distributed Bernoulli variables with parameter p.

This fact is the basis of a hypothesis test, a "proportion z-test," for the value of p using
x/n, the sample proportion and estimator of p, in a common test statistic.

For example, suppose one randomly samples n people out of a large population and
asks them whether they agree with a certain statement.

The proportion of people who agree will of course depend on the sample. If groups of
n people were sampled repeatedly and truly randomly, the proportions would follow
an approximate normal distribution with mean equal to the true proportion p of
agreement in the population and with standard deviation σ = (p(1 − p)/n)^(1/2).
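
The standard deviation formula can be verified directly from the binomial probabilities. The sketch below computes the standard deviation of the sample proportion X/n exactly and compares it with (p(1 − p)/n)^(1/2); the values n = 50 and p = 0.4 and the function name are assumptions made for illustration.

```python
from math import comb, sqrt

def proportion_sd_exact(n, p):
    """Standard deviation of X/n computed from the B(n, p) probabilities."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * q for k, q in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * q for k, q in enumerate(pmf))
    return sqrt(var)

n, p = 50, 0.4  # illustrative values (assumption)
exact_sd = proportion_sd_exact(n, p)
formula_sd = sqrt(p * (1 - p) / n)
```

The two values agree up to floating-point rounding, since the formula is exact for the sample proportion; only the normal shape of the distribution is approximate.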

Large sample sizes n are good because the standard deviation, as a proportion of
the expected value, gets smaller, which allows a more precise estimate of the
unknown parameter p.
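
The proportion z-test mentioned above is commonly written with the test statistic z = (x/n − p0) / (p0(1 − p0)/n)^(1/2), where p0 is the hypothesized value of p. A minimal sketch under that assumption follows; the function name and the survey numbers (560 of 1000 agreeing, p0 = 0.5) are hypothetical.

```python
from math import erf, sqrt

def proportion_z_test(x, n, p0):
    """Return (z, two-sided p-value) for H0: p = p0 via the normal approximation."""
    p_hat = x / n                      # sample proportion, the estimator of p
    se = sqrt(p0 * (1 - p0) / n)       # standard deviation of p_hat under H0
    z = (p_hat - p0) / se
    upper_tail = 1 - 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * upper_tail

# Hypothetical survey: 560 of 1000 respondents agree; test H0: p = 0.5.
z, p_value = proportion_z_test(x=560, n=1000, p0=0.5)
```

Because the standard error shrinks like 1/n^(1/2), the same observed proportion yields a larger |z| as n grows, which is one way to see why large samples give more precise conclusions about p.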


Suppose individuals with a certain gene have a 0.70 probability of eventually
contracting a certain disease. If 100 individuals with the gene participate in a lifetime
study, then the number of individuals who will contract the disease is a random
variable with distribution B(100, 0.7).
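
For a concrete feel, the B(100, 0.7) distribution has mean np = 70 and standard deviation (np(1 − p))^(1/2). The sketch below computes these along with one tail probability; the threshold of 75 cases is an arbitrary illustrative choice, not from the text.

```python
from math import comb, sqrt

n, p = 100, 0.7  # parameters from the gene example

mean = n * p                 # expected number of cases
sd = sqrt(n * p * (1 - p))   # standard deviation of the count

def pmf(k):
    """P(exactly k of the n individuals contract the disease)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability that at least 75 individuals contract the disease (illustrative threshold).
prob_at_least_75 = sum(pmf(k) for k in range(75, n + 1))
```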

Note: The sampling distribution of a count variable is only well-described by the
binomial distribution in cases where the population size is significantly larger than the
sample size. As a general rule, the binomial distribution should not be applied to
observations from a simple random sample (SRS) unless the population size is at
least 10 times larger than the sample size.

To find probabilities from a binomial distribution, one may either calculate them
directly, use a binomial table, or use a computer. The number of sixes rolled by a
single die in 20 rolls has a B(20, 1/6) distribution. The probability of rolling more than 2
sixes in 20 rolls, P(X > 2), is equal to 1 - P(X ≤ 2) = 1 - (P(X = 0) + P(X = 1) + P(X = 2)).
Using the MINITAB command "cdf" with subcommand "binomial n=20 p=0.166667"
gives the cumulative distribution function as follows:
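
The same cumulative probabilities can also be computed without MINITAB. The following is a minimal Python sketch using only the standard library; the function names are illustrative and are not part of any statistics package.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X distributed B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """P(X <= k) for X distributed B(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

n, p = 20, 1 / 6  # number of sixes in 20 rolls of a single die

# Cumulative distribution function for k = 0..n.
cdf_table = [binomial_cdf(k, n, p) for k in range(n + 1)]

# P(X > 2) = 1 - P(X <= 2)
prob_more_than_two = 1 - binomial_cdf(2, n, p)
print(f"P(X > 2) = {prob_more_than_two:.4f}")
```

Under these parameters the probability of more than 2 sixes comes out at roughly 0.67.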