VIEWS: 3 PAGES: 4 POSTED ON: 5/11/2012
Binomial Distribution

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial; when n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent, and the resulting distribution is hypergeometric, not binomial. However, for N much larger than n, the binomial distribution is a good approximation and is widely used.

In general, if the random variable K follows the binomial distribution with parameters n and p, we write K ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function:

$$f(k, n, p) = \Pr(K = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$

for k = 0, 1, 2, ..., n, where

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$

is the binomial coefficient (hence the name of the distribution) "n choose k", also denoted C(n, k), ₙCₖ, or ⁿCₖ. The formula can be understood as follows: we want k successes, which contributes p^k, and n − k failures, which contributes (1 − p)^(n − k). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.

In creating reference tables for binomial distribution probabilities, the table is usually filled in only up to n/2 values. This is because for k > n/2, the probability can be calculated by its complement as

$$f(k, n, p) = f(n - k, n, 1 - p).$$

Looking at the expression ƒ(k, n, p) as a function of k, there is a k value that maximizes it.
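To make the probability mass function and the maximizing k concrete, here is a minimal Python sketch. The function name binom_pmf and the values n = 10 and p = 0.3 are assumptions chosen for illustration; they do not come from the text. The sketch evaluates the formula directly, checks the complement identity ƒ(k, n, p) = ƒ(n − k, n, 1 − p) used for reference tables, and finds the most probable k by brute force.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative parameters (assumed for this example, not from the text).
n, p = 10, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# The probabilities over k = 0..n should sum to 1, up to rounding error.
total = sum(pmf)

# Complement identity: f(k, n, p) equals f(n - k, n, 1 - p) for every k.
symmetric = all(
    abs(binom_pmf(k, n, p) - binom_pmf(n - k, n, 1 - p)) < 1e-12
    for k in range(n + 1)
)

# Brute-force search for the k with the largest probability.
most_likely_k = max(range(n + 1), key=lambda k: pmf[k])

print(f"sum of pmf = {total:.12f}, symmetric = {symmetric}")
print(f"most likely k = {most_likely_k}, probability = {pmf[most_likely_k]:.4f}")
```

For these assumed values the search returns k = 3, with a probability of roughly 0.27, so even the most likely outcome is far from certain.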
This k value can be found by calculating the ratio

$$\frac{f(k + 1, n, p)}{f(k, n, p)} = \frac{(n - k)\,p}{(k + 1)(1 - p)}$$

and comparing it to 1. There is always an integer M that satisfies

$$(n + 1)p - 1 \le M < (n + 1)p.$$

ƒ(k, n, p) is monotone increasing for k < M and monotone decreasing for k > M, with the exception of the case where (n + 1)p is an integer. In this case, there are two values for which ƒ is maximal: (n + 1)p and (n + 1)p − 1. M is the most probable (most likely) outcome of the Bernoulli trials and is called the mode. Note that the probability of it occurring can be fairly small.

Cumulative distribution function

The cumulative distribution function can be expressed as:

$$F(x, n, p) = \Pr(K \le x) = \sum_{i=0}^{\lfloor x \rfloor} \binom{n}{i} p^i (1 - p)^{n - i}$$

where ⌊x⌋ is the "floor" under x, i.e. the greatest integer less than or equal to x. It can also be represented in terms of the regularized incomplete beta function, as follows:

$$F(k, n, p) = \Pr(K \le k) = I_{1 - p}(n - k, k + 1) = (n - k) \binom{n}{k} \int_0^{1 - p} t^{n - k - 1} (1 - t)^k \, dt.$$

For k ≤ np, upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound

$$F(k, n, p) \le \exp\left(-2 \frac{(np - k)^2}{n}\right),$$

and Chernoff's inequality can be used to derive the bound

$$F(k, n, p) \le \exp\left(-\frac{1}{2p} \frac{(np - k)^2}{n}\right).$$

Moreover, these bounds are reasonably tight when p = 1/2, since the following expression holds for all k ≥ 3n/8:

$$F\left(k, n, \tfrac{1}{2}\right) \ge \frac{1}{15} \exp\left(-\frac{16 (n/2 - k)^2}{n}\right).$$

If n is large enough, B(n, p) is well approximated by the normal distribution with mean np and variance np(1 − p). This approximation, known as the de Moivre–Laplace theorem, is a huge time-saver when undertaking calculations by hand (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, introduced in Abraham de Moivre's book The Doctrine of Chances in 1738. Nowadays, it can be seen as a consequence of the central limit theorem, since B(n, p) is a sum of n independent, identically distributed Bernoulli variables with parameter p. This fact is the basis of a hypothesis test, a "proportion z-test", for the value of p using x/n, the sample proportion and estimator of p, in a common test statistic.

For example, suppose one randomly samples n people out of a large population and asks them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample.
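That sample-to-sample variation can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: the true proportion p = 0.6, the sample size n = 400, the number of repeated samples and the fixed seed are all values invented for the example, and sample_proportion is a hypothetical helper, not something defined in the text.

```python
import random
from statistics import mean, pstdev


def sample_proportion(n: int, p: float, rng: random.Random) -> float:
    """Fraction of n simulated respondents who agree, each with probability p."""
    agree = sum(1 for _ in range(n) if rng.random() < p)
    return agree / n


rng = random.Random(0)   # fixed seed so the run is repeatable (assumed choice)
n, p = 400, 0.6          # assumed sample size and true proportion
repeats = 2000           # assumed number of repeated samples

proportions = [sample_proportion(n, p, rng) for _ in range(repeats)]

avg = mean(proportions)       # centre of the simulated proportions
spread = pstdev(proportions)  # how much the proportions vary between samples

print(f"mean of sample proportions:  {avg:.4f}")
print(f"spread of sample proportions: {spread:.4f}")
print(f"smallest and largest: {min(proportions):.3f}, {max(proportions):.3f}")
```

With these assumed values the average of the simulated proportions should land close to the true p, while individual samples scatter around it.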
If groups of n people were sampled repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation σ = (p(1 − p)/n)^(1/2). Large sample sizes n are good because the standard deviation, as a proportion of the expected value, gets smaller, which allows a more precise estimate of the unknown parameter p.
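The proportion z-test mentioned above can be sketched in Python as follows. This is one common form of the statistic, not necessarily the exact one the original text had in mind: it assumes the standard deviation is computed under the hypothesized value p0, and the function name proportion_z_test and the survey numbers (230 of 400 agreeing, tested against p0 = 0.5) are invented for illustration.

```python
from math import erf, sqrt


def proportion_z_test(x: int, n: int, p0: float) -> tuple[float, float]:
    """Return the z statistic and two-sided p-value for H0: p = p0.

    Relies on the normal approximation to the binomial, so it is only
    reasonable when n is large.
    """
    p_hat = x / n                     # sample proportion, the estimator of p
    se = sqrt(p0 * (1 - p0) / n)      # standard deviation of p_hat under H0
    z = (p_hat - p0) / se
    # Standard normal CDF at |z|, written with the error function.
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - phi)


# Hypothetical survey: 230 of 400 respondents agree; test against p0 = 0.5.
z, p_value = proportion_z_test(x=230, n=400, p0=0.5)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")

# Quadrupling n halves the standard deviation of the sample proportion.
se_100 = sqrt(0.5 * 0.5 / 100)
se_400 = sqrt(0.5 * 0.5 / 400)
print(f"standard deviation at n=100: {se_100:.3f}, at n=400: {se_400:.3f}")
```

In this made-up example the sample proportion is 0.575 and the standard deviation under the null hypothesis is 0.025, so z is about 3 and the two-sided p-value is small.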