# Binomial Distribution


```
Normal distribution
Overview of Data and Decisions so far
Standardising data
Normal distribution
Overview of Data and Decisions

Where are we now?
Standardising data
If you have a group of data, and calculate
what its mean and standard deviation are
then you can characterise any item in that
sample by a single “standardised” value that
describes how many standard deviations it
is away from the mean.
Calculating standardised values

z = (x - μ) / σ

The z value is the number of standard deviations
(σ) that an item in a sample (x) is either below (-)
or above (+) the mean (μ).

i.e. z is the standardised value.

Like converting a different set of units
(Celsius ↔ Fahrenheit or AUS\$ ↔ US\$)
Two ways of expressing the same reality
There’s a formula to convert backwards and
forwards between the original data and the
standardised data
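Worked example (illustrative numbers, not from the
original slides): suppose the mean is 20 and the
standard deviation is 6.
Original to standardised:
z = (x - mean) / standard deviation
x = 16 gives z = (16 - 20) / 6 = -0.67 (to 2 d.p.)
Standardised back to original:
x = mean + z × standard deviation
z = 2 gives x = 20 + 2 × 6 = 32
In Excel, =standardize(16, 20, 6) should return
the same z value.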
Why standardise?
Useful for knowing “where” a piece of data
lies in a distribution
Useful for comparing different samples
Use for calculating probabilities if you
know that the sample comes from a
particular distribution like the normal
distribution
Discrete vs continuous
distributions
So far we have looked at a discrete
distribution (Binomial)
Now we will talk about a continuous
distribution (Normal)
What is the difference?
Normal distribution
The normal distribution is of special interest
for two main reasons.
1) it is an approximation to many probability
distributions, especially when the number of
observations is high.
2) it plays a key role in market research and
quality management.
Features of Normal probability
distributions
“Bell-shaped” curve
Each normal distribution is completely specified
by its mean and standard deviation.
The highest point on the curve is at the mean,
which is also the median and the mode.
Can theoretically have values from negative
infinity to positive infinity.
The Normal curve
The probability density function which defines
the normal curve is:

f x  
1      x    / 2 2
e
 2
where  is the mean,  is the standard
deviation,  = 3.14159, and e = 2.71828.
We can use calculus to calculate the area
under the curve.
“Standard normal” distribution
If you “standardise” any normally distributed data
set then you get exactly the same distribution:
standard normal (mean 0, standard deviation 1).
Lots of work has been done to figure out as much
information as possible about the standard normal
distribution.
So if you know the “standardised” value of any
piece of data in a normal distribution, you know
exactly the probability of being above or below it.
e.g. z = 1 has about a 16% probability of being above
and an 84% probability of being below.
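Quick check in Excel (a sketch, assuming the built-in
function =normsdist(z), which returns the area below z
in the standard normal distribution):
=normsdist(0) returns 0.5 (half the area is below the mean)
=normsdist(2) returns about 0.9772 (area below z = 2)
=1 - normsdist(2) returns about 0.0228 (area above z = 2)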
Cheese
Sales of cheese at a roadside stall are known
to have a mean of 20 kg per day and a
standard deviation of 6 kg per day.
Since cheese is relatively expensive and it
is important to sell fresh cheese each day,
the owner of the roadside stall wants to
stock the right amount. He wants the chance
of running out to be small but also doesn’t
want there to be much left over.
Cheese
To get a feel for how much to stock, he
wants to know:
What is the probability of selling less than
16 kg of cheese?
What is the probability of selling between 8
and 32 kg of cheese?
How much cheese would you need to have
on hand at the start of the day to have less
than 10% chance of running out?
Graphing normal probabilities
for cheese

[Figure: four normal curves with shaded areas for
P(Sales < 34), P(6 < Sales < 34), P(Sales > 11.5)
and P(25 < Sales < 30)]
Excel - Normal distribution
The function
=normdist(x, mean,
stddev, 1) gives you the
probability for the
region up to x in a
normal distribution with
the mean and stddev
specified.
Normal distribution
Excel template
Using Excel for normal dist

[Figure: normal curve with mean 20 kg; the shaded
area to the left of 16 is P(Sales < 16)]
Use Excel formula =normdist(16,20,6,1) to
get P(Sales < 16) = 0.2525
(a z table, with z rounded to -0.67, gives 0.2514)
Determining the probability
of sales between 8 and 32
P(8 < Sales < 32) = P(-2 < z < 2)
Area up to a z-score of -2 is 0.0228
Area up to a z-score of 2 is 0.9772
[Figure: standard normal curve shaded between
z = -2 and z = 2]
P(8 < Sales < 32) = 0.9772 - 0.0228 = 0.9544
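The same probability can be found directly in the
original units (a sketch using the =normdist
function with the cheese mean and stddev):
=normdist(32, 20, 6, 1) - normdist(8, 20, 6, 1)
This should return about 0.9545. The small
difference from 0.9544 comes from subtracting
table values that were rounded to four decimals.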
Excel - Normal distribution
converting from probabilities

The function =norminv(p,
mean, stddev) gives you the
point on the distribution that
has a probability of p below
it (in a normal distribution
with the mean and stddev
specified).
Stocking policy
Area below x = 0.9 (area in upper tail = 0.1)

What is x?
P(Sales < x) = 0.9

Use =norminv(0.9,20,6)
Result you get is 27.69, so you need to stock
about 28 kg of cheese each day.
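Checking the stocking level (a sketch; 28 kg is
the rounded-up result above):
=1 - normdist(28, 20, 6, 1)
gives the chance of selling more than 28 kg,
roughly 0.09, which is under the 10% target.
By hand, using the rounded table value z = 1.28
for p = 0.9: x = 20 + 1.28 × 6 = 27.68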
Necessary assumptions for
using the Normal distribution
Bell-shaped frequency histogram
Mean of sample data must be good
estimate of population mean.
Standard deviation of sample data must be
good estimate of population standard
deviation.
Model selection

Does the situation meet the necessary
assumptions for a particular model?
Use collected data to evaluate the
validity of the model.
QQ plot
What is a QQ plot?
Informal method of detecting whether a
data set is normally distributed
Use Statpro
The closer the points lie to the 45-degree line,
the closer the data set is to normally distributed
Heights
The height of Australian women is
normally distributed with a mean of 164 cm
and a standard deviation of 8 cm.
If we chose a woman at random in this
class, what is the probability that she would
be taller than 174 cm? (Are you completely
If I’m interested in recruiting the tallest 1%
of women for a basketball team, at what
height will I accept people?
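One way to set these up in Excel (a sketch,
assuming the woman really is drawn at random from
the stated population of Australian women):
P(height > 174): =1 - normdist(174, 164, 8, 1)
z = (174 - 164) / 8 = 1.25, so the result is
about 0.106
Cut-off for the tallest 1%: =norminv(0.99, 164, 8)
By hand with z = 2.33: 164 + 2.33 × 8, about 182.6 cm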
Dyslexia
Estimates of rates of learning disabilities
vary but according to some estimates, about
4% of children are thought to have dyslexia.
A school with 130 children is considering
whether to run a special program for these
children. It decides that it will do it if more
than 5 children in their school qualify.
What is the probability that the school will
run a special program?
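One way to set this up (a sketch, assuming each
child independently has a 4% chance of dyslexia, so
the count is binomial with n = 130 and p = 0.04):
Exact: =1 - binomdist(5, 130, 0.04, 1)
Normal approximation with continuity correction:
mean = 130 × 0.04 = 5.2
std dev = sqrt(130 × 0.04 × 0.96) = 2.23 (approx.)
=1 - normdist(5.5, 5.2, 2.23, 1) gives roughly 0.45
With a mean this small the approximation is rough,
so the exact binomial formula is the safer choice.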
In a 1992 study, 415 children were tested on their
reading ability. Their scores were then compared
against a “norm-referenced group” (a large
random sample with normally distributed reading
abilities). Since 63 (or 15%) of the children
performed below the 25th percentile from the
norm-referenced group, the study concluded that
about 15% of all children are learning disabled.
(Study conducted by US National Institute for
Child Health and Human Development)
NCAA tournament
Discuss in small groups
Solve using Excel
Discuss as a class – applications to other
problems/situations?
Quick review: Binomial
assumptions
Only two outcomes, success or failure, on a
single trial.
Probability of outcome remains constant
from trial to trial -- statistical independence.
Quick review: Binomial
examples
The number of students getting financial aid out
of a sample of 10. On average, 10% of students
The number of Australian-made cars in a
company car-park with 40 spaces. On average,
80% of cars at this company are Australian-made
The number of defective items in a sample of 50.
Process produces about 5% defectives usually.
Normal distribution examples
The number of hours to complete a project.
The weight of a Mars bar in grams.
The number of kilos lost during the first
month of a weight loss program.
What did we do?
Calculated probabilities for the normal
distribution.
Discussed the assumptions needed to use the
normal distribution.
Talked about the normal approximation to the
binomial distribution
Excel functions covered today
=normdist()
=norminv()
QQ plot in Statpro for determining whether
a data set is normally distributed
Managerial applications
What did you learn today that makes a
difference to the way you manage?
What are the three most important things to
remember from today’s lecture?
Next class