# Part 2 – The Binomial Distribution

Posted: 3/5/2010

```
The conditional probability of A given B is

P(A | B) = P(A ∩ B) / P(B)

The probability of the intersection of A and B is

P(A ∩ B) = P(A) P(B | A)
         = P(B) P(A | B)

Independent events

A and B are independent events if
P(A | B) = P(A) or P(B | A) = P(B).

If A and B are independent, then P(A ∩ B) = P(A) P(B).

EX: Target population = 35 people

Variables = gender and eye color

FEMALES                          |  MALES
blue      blue      blue         |  blue      blue      blue      blue
non-blue  non-blue  non-blue     |  non-blue  non-blue  non-blue  non-blue
non-blue  non-blue  non-blue     |  non-blue  non-blue  non-blue  non-blue
non-blue  non-blue  non-blue     |  non-blue  non-blue  non-blue  non-blue
non-blue  non-blue  non-blue     |  non-blue  non-blue  non-blue  non-blue

Gender * Eye Color Crosstabulation

                                        Eye Color
Gender                       Blue eyed   Not blue eyed    Total
Female   Count                    3            12           15
         % within Gender       20.0%        80.0%        100.0%
Male     Count                    4            16           20
         % within Gender       20.0%        80.0%        100.0%
Total    Count                    7            28           35
         % within Gender       20.0%        80.0%        100.0%

EX: Target population = 35 people
Variables: gender and color blindness

A = the randomly selected person is color blind (CB)
B = the randomly selected person is a female
NV = normal vision

FEMALES           |  MALES
CB    NV    NV    |  CB    CB    CB    NV
NV    NV    NV    |  NV    NV    NV    NV
NV    NV    NV    |  NV    NV    NV    NV
NV    NV    NV    |  NV    NV    NV    NV
NV    NV    NV    |  NV    NV    NV    NV

Gender * Color Blind Crosstabulation

                                        Color Blind
Gender                       Color Blind   Not Color Blind    Total
Female   Count                     1              14            15
         % within Gender         6.7%          93.3%         100.0%
Male     Count                     3              17            20
         % within Gender        15.0%          85.0%         100.0%
Total    Count                     4              31            35
         % within Gender        11.4%          88.6%         100.0%

Bayes’ Formula

Bayes’ formula is of particular use in medicine. It can be used to
update conditional probabilities by using sample data when
available.

Bayes’ Formula:

P(A | B) = P(B | A) P(A) / [ P(B | A) P(A) + P(B | A') P(A') ]

where A' denotes the complement of A (the event that A does not occur).

EX: Albinism is inherited as an autosomal recessive trait, and it is
determined by a single gene. Use A for the normal gene and a for the albino
gene. If the genotype is AA or Aa, then the person is normal; if the genotype
is aa, then the person is albino.

Suppose two normal parents have a son and a daughter. The son is albino
and the daughter is normal.

What are the genotypes of the parents?

Given that the daughter is normal, what is the chance that she is
homozygous?

Part 2 – The Binomial Distribution

Experiments with only two possible outcomes:

Each experiment is referred to as a trial.

Any experiment that has all the following properties is called a
binomial experiment:

1) The experiment consists of n identical trials.

2) Each trial results in one of two outcomes. We will label one
outcome a success and the other a failure.

3) The probability of success on a single trial is equal to π, and π
remains the same from trial to trial.

4) The trials are independent: that is, the outcome of one trial
does not influence the outcome of any other trial.

5) The variable Y (sometimes called the binomial random
variable) is the number of successes observed during the n
trials.

EX: Legionnaires' disease is spread by air. If an ill individual
spends a lot of time in a building with a closed ventilation system,
the person can expose others in the building to the disease. This
was the case at a Chicago convention hotel several years ago,
when everyone who was at the convention attended by the ill
individual was exposed to Legionnaires' disease. All attendees
were tested for the disease. The number of individuals testing
positive is the variable of interest. Does this constitute a binomial
experiment? Assume that each person has the same chance of
testing positive.

Binomial Distribution

The distribution of the variable Y in a binomial experiment is called
the binomial distribution.

The probability of observing exactly k successes in a binomial
experiment with n trials, where the probability of success on a
single trial is equal to π, is given by

P(Y = k) = [ n! / (k! (n - k)!) ] π^k (1 - π)^(n - k)

n = number of trials

π = probability of success on a single trial

1 - π = probability of failure on a single trial

Y = number of successes in n trials

n! = n(n-1)(n-2)…(3)(2)(1)

For n = 1, n! = 1.

For n = 2, n! = (2)(1) = 2.

For n = 3, n! = (3)(2)(1) = 6.

NOTE: 0! = 1.

EX: Suppose a coin is tossed 4 times.

Tossing a head is a success so Y = number of heads.

All possible outcomes
HHHH HHHT HHTH HTHH THHH HHTT HTHT
HTTH THTH TTHH THHT TTTH TTHT THTT HTTT
TTTT

What is P(Y=3), the probability that we will have 3 successes in
our four trials?
Looking at the above illustration, you might be tempted to say that
four out of the 16 possible outcomes result in the event that Y = 3,
so P(Y = 3) = 4/16 = 0.25.
This is the correct probability, but only because the probability of
success, π, is 50%, making all of the 16 outcomes equally likely.
This wouldn’t be a correct assumption if the probability of success
were something else, say π = 25% or π = 30%.

P(Y = 0) = [4! / (0! (4-0)!)] (0.50)^0 (1 - 0.50)^(4-0) = [(4×3×2×1) / ((1)(4×3×2×1))] (0.50)^4 = 0.0625

P(Y = 1) = [4! / (1! (4-1)!)] (0.50)^1 (1 - 0.50)^(4-1) = [(4×3×2×1) / ((1)(3×2×1))] (0.50)^4 = 0.25

P(Y = 2) = [4! / (2! (4-2)!)] (0.50)^2 (1 - 0.50)^(4-2) = [(4×3×2×1) / ((2×1)(2×1))] (0.50)^4 = 0.375

P(Y = 3) = [4! / (3! (4-3)!)] (0.50)^3 (1 - 0.50)^(4-3) = [(4×3×2×1) / ((3×2×1)(1))] (0.50)^4 = 0.25

P(Y = 4) = [4! / (4! (4-4)!)] (0.50)^4 (1 - 0.50)^(4-4) = [(4×3×2×1) / ((4×3×2×1)(1))] (0.50)^4 = 0.0625

What is P(Y ≥ 3)?

For mutually exclusive events A and B, P(A ∪ B) = P(A) + P(B), so

P(Y ≥ 3) = P(Y = 3 ∪ Y = 4)
         = P(Y = 3) + P(Y = 4)
         = 0.25 + 0.0625
         = 0.3125

P(Y < 3) = P(Y = 0 ∪ Y = 1 ∪ Y = 2)
         = P(Y = 0) + P(Y = 1) + P(Y = 2)
         = 0.0625 + 0.25 + 0.375
         = 0.6875

Alternatively, use the complement rule P(X < a) = 1 - P(X ≥ a):

P(Y < 3) = 1 - P(Y ≥ 3)
         = 1 - 0.3125
         = 0.6875

EX: Suppose that 10% of the people in a certain country are
suffering from malnutrition. We plan to take a simple random sample
(SRS) of 30 people from this country. What is the probability that
fewer than 3 of the people we select will be suffering from malnutrition?

1. The experiment consists of n identical trials.

A trial is randomly selecting a person. Now once we select a
person, this person can’t be chosen again. But when there is a
very, very large population, removing one person – or even 30
people, does not change the population in any measurable way.
Therefore, we think of the population as unchanging. Hence,
each trial is identical because we are randomly selecting from
the same population.

2. Each trial results in one of two outcomes.

For each trial – the random selection of a person – there are in
fact two possible outcomes. Either we select someone who is
malnourished, or we don’t. Since we are interested in those that
are malnourished, we’ll just call malnourishment a success (not
to be morbid…)

3. The probability of success on any trial is always π. This
is constant from trial to trial.

We know that on the first trial, the probability that we select
someone who is malnourished is 10%. But is that also true for
the second trial? Is there still a 10% chance of success?

To see, let’s assume there are a lot of people in the country,
say about 1,000,000. With a 10% malnourishment rate, there
should then be 100,000 malnourished people. The probability
of success (selecting a malnourished person) on the first trial is
100,000/1,000,000 = 10%. On the second trial, if we don’t put
the first selected person back into the pool, there are just
999,999 possible outcomes, and either 99,999 of them or
100,000 of them are malnourished (depending on the outcome
of the first trial). Thus, the probability of success is now either
99,999/999,999 = 0.0999991 or 100,000/999,999 = 0.1000001.
Although neither of these is exactly 10%, they’re both pretty
darn close. When we have situations where the population is
very large, π is close enough to constant to make the
calculations accurate. So condition 3 is satisfied.

4. The trials are independent.

We need to know whether P(malnourished on this trial |
malnourished on any previous trial) = P(malnourished on this
trial). Is there any reason to think that by getting a particular
result in one trial, we will change the probabilities for the
subsequent trials? Not if we use an SRS. By giving all members of
the population an equal chance of being selected on each trial,
the results of the previous trials are irrelevant. In fact, it can
be shown (with methods more advanced than the ones covered in this
text) that as long as we are selecting people with an SRS, these two
probabilities are the same. So condition 4 is satisfied.

Therefore, this IS a binomial experiment, and we can calculate
whatever probabilities we want.

In this example

n = 30 trials
π = 0.10 chance of success (selecting a malnourished person) on any given trial
1 - π = 0.90 chance of failure (selecting a person who is not malnourished) on any given trial

Using the binomial formula, we can calculate the probabilities of
Y = 0 to Y = 30 successes (malnourished people) and get the following:

P(Y = 0)  = 0.042      P(Y = 11) = 0.00007     P(Y = 22) = 2.519E-16
P(Y = 1)  = 0.141      P(Y = 12) = 0.00001     P(Y = 23) = 9.737E-18
P(Y = 2)  = 0.228      P(Y = 13) = 0.000002    P(Y = 24) = 3.156E-19
P(Y = 3)  = 0.236      P(Y = 14) = 0.0000003   P(Y = 25) = 8.415E-21
P(Y = 4)  = 0.177      P(Y = 15) = 3.19E-08    P(Y = 26) = 1.798E-22
P(Y = 5)  = 0.102      P(Y = 16) = 3.33E-09    P(Y = 27) = 2.960E-24
P(Y = 6)  = 0.047      P(Y = 17) = 3.04E-10    P(Y = 28) = 3.524E-26
P(Y = 7)  = 0.018      P(Y = 18) = 2.44E-11    P(Y = 29) = 2.700E-28
P(Y = 8)  = 0.00576    P(Y = 19) = 1.71E-12    P(Y = 30) = 1.000E-30
P(Y = 9)  = 0.00157    P(Y = 20) = 1.05E-13
P(Y = 10) = 0.00037    P(Y = 21) = 5.54E-15

Finally, we can calculate the probability that fewer than 3 of the
people we select will be suffering from malnutrition, P(Y < 3):

P(Y < 3) = P(Y = 0 ∪ Y = 1 ∪ Y = 2)
         = P(Y = 0) + P(Y = 1) + P(Y = 2)
         = 0.0423912 + 0.1413039 + 0.2276562
         = 0.4113513

EX:

Suppose you have 20 close friends, and 4 of them are left-handed.
If you take an SRS of 6 friends, can you calculate the
probability that fewer than 3 of the people will be left-handed?


```
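The binomial probabilities worked out by hand in the notes can also be checked numerically. The following is a minimal Python sketch, not part of the original notes; the helper names `binomial_pmf` and `prob_fewer_than` are illustrative assumptions, and `pi` stands for the single-trial success probability π. It evaluates the binomial formula P(Y = k) = [n! / (k! (n - k)!)] π^k (1 - π)^(n - k) for the coin-toss example (n = 4, π = 0.50) and the malnutrition example (n = 30, π = 0.10).

```python
from math import comb


def binomial_pmf(k: int, n: int, pi: float) -> float:
    """P(Y = k) for n independent trials, each with success probability pi."""
    # comb(n, k) equals n! / (k! (n - k)!)
    return comb(n, k) * pi**k * (1 - pi) ** (n - k)


def prob_fewer_than(a: int, n: int, pi: float) -> float:
    """P(Y < a), found by adding P(Y = 0), ..., P(Y = a - 1)."""
    return sum(binomial_pmf(k, n, pi) for k in range(a))


# Coin example: 4 tosses, success = heads, pi = 0.50
coin = [binomial_pmf(k, 4, 0.50) for k in range(5)]
print(coin)

# P(Y < 3) for the coin, computed directly and with the complement rule
direct = prob_fewer_than(3, 4, 0.50)
complement = 1 - (binomial_pmf(3, 4, 0.50) + binomial_pmf(4, 4, 0.50))
print(direct, complement)

# Malnutrition example: n = 30 people, pi = 0.10
malnutrition = prob_fewer_than(3, 30, 0.10)
print(round(malnutrition, 4))
```

For the coin, the list holds P(Y = 0) through P(Y = 4), and both routes to P(Y < 3) give 0.6875. For the malnutrition example the result is about 0.4114; it can differ from a hand calculation in the last digit or two because the hand calculation adds terms that were already rounded.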