Measures of Central Tendency
“to be or not to be
• Normal Distributions
• Skewness & Kurtosis
• Normal Curves and Probability
• Z-scores
• Confidence Intervals
• Hypothesis Testing
• The t-distribution
Is this normal?
[Histogram of the six observations: Mean = 178.3, Std. Dev. = 160.68, N = 6; horizontal axis from 100 to 500]
Value     Frequency   Percent   Valid Percent   Cumulative Percent
70.00         1         16.7        16.7               16.7
100.00        2         33.3        33.3               50.0
150.00        2         33.3        33.3               83.3
500.00        1         16.7        16.7              100.0
Total         6        100.0       100.0
N (valid): 6
Std. Error of Skewness: .845
Std. Error of Kurtosis: 1.741
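The figures in this output can be checked by hand. The sketch below recomputes them from the six observations in the frequency table. It assumes the usual large-package formulas for the standard errors of skewness and kurtosis, which depend only on N; the variable names are illustrative.

```python
import math
import statistics

# The six observations from the frequency table above.
data = [70, 100, 100, 150, 150, 500]
n = len(data)

mean = statistics.mean(data)   # arithmetic mean
sd = statistics.stdev(data)    # sample standard deviation (n - 1 denominator)

# Standard errors of skewness and kurtosis depend only on the sample size.
se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
se_kurt = math.sqrt(4 * (n**2 - 1) * se_skew**2 / ((n - 3) * (n + 5)))

print(f"mean={mean:.1f} sd={sd:.2f} se_skew={se_skew:.3f} se_kurt={se_kurt:.3f}")
```

The single value of 500 pulls the mean (178.3) well above the typical case (100 to 150), which is the first hint that this distribution is not normal.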
• Are your curves normal?
• Why do we care about normal curves?
• What do normal curves tell us?
The curves tell us something about the distribution
of the population
The curves allow us to make statistical inferences
regarding the probability of some outcomes
within some margin of error
The normal distribution
• A distribution is easily depicted in a graph
where the height of the line is determined by the
frequency of cases for the values beneath it.
• Most cases cluster near the middle of a
distribution if it is close to normal.
The Normal Curve
• Bell-shaped distribution or curve
• Perfectly symmetrical about the mean.
Mean = median = mode
• Tails are asymptotic: they get closer and closer to
the horizontal axis but never reach it.
Skewness and Sample Distributions
Not all curves are normal, even if still bell-shaped
• Formula for skewness
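The slide's own formula did not survive, so the sketch below uses one common definition as an assumption: the moment-based coefficient, the third central moment divided by the second central moment raised to the power 1.5. Statistical packages often report a small-sample adjusted version, so their numbers may differ slightly. The function name `skewness` is illustrative.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (assumed definition, see lead-in)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3]                      # mirror image around its mean
right_tailed = [70, 100, 100, 150, 150, 500]  # one large value stretches the right tail

print(skewness(symmetric))      # zero for a symmetric sample
print(skewness(right_tailed))   # positive: the long tail is on the right
```

A positive value signals a long right tail and a negative value a long left tail; a value near zero is consistent with a symmetric curve.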
Kurtosis (It’s not a disease)
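• Beyond skewness, kurtosis tells us whether
our distribution is more peaked or flatter than
a normal curve, even if it is symmetrical.
• The kurtosis value for a normal distribution
will equal 3. Anything above this is
leptokurtic (peaked, with heavier tails) and anything
below is platykurtic (flat, with lighter tails).
• Many packages, including SPSS, report excess
kurtosis (kurtosis minus 3), so a normal curve scores 0.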
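To see the benchmark of 3 in practice, here is a rough simulation sketch. It assumes the moment-based definition of kurtosis (fourth central moment divided by the squared second central moment) and uses three simulated samples chosen only for illustration: a normal one, a flat (uniform) one, and a sharply peaked one.

```python
import random

def kurtosis(values):
    """Moment-based kurtosis m4 / m2**2; a normal curve scores about 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2

rng = random.Random(0)  # fixed seed so the sketch is repeatable
size = 100_000

normal_sample = [rng.gauss(0, 1) for _ in range(size)]
flat_sample = [rng.uniform(-1, 1) for _ in range(size)]
# The difference of two exponentials has a sharp peak and heavy tails.
peaked_sample = [rng.expovariate(1) - rng.expovariate(1) for _ in range(size)]

k_normal = kurtosis(normal_sample)
k_flat = kurtosis(flat_sample)
k_peaked = kurtosis(peaked_sample)
print(f"normal={k_normal:.2f} flat={k_flat:.2f} peaked={k_peaked:.2f}")
```

The normal sample should land near 3, the flat sample below it (platykurtic), and the peaked sample above it (leptokurtic).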
Back to normal distributions
• The power of normal distributions, or those
close to it, is that we can predict where
cases will fall within a distribution
• For example, what are the odds, given the
population parameter of human height, that
someone will grow to more than eight feet?
• Answer: likely a probability of less than .025
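A quick way to check an answer like this is to ask a normal curve directly. The sketch below is illustrative only: the mean of 67 inches and standard deviation of 4 inches are assumed values chosen for the example, not parameters given in these slides.

```python
from statistics import NormalDist

# ASSUMPTION: illustrative population parameters, not values from the slides.
mean_height_in = 67.0
sd_height_in = 4.0
heights = NormalDist(mu=mean_height_in, sigma=sd_height_in)

eight_feet_in = 8 * 12  # eight feet expressed in inches

# Distance of eight feet from the mean, in standard deviations.
z = (eight_feet_in - mean_height_in) / sd_height_in

# Area under the curve to the right of eight feet.
p_taller = 1 - heights.cdf(eight_feet_in)

print(f"z = {z:.2f}, P(height > 8 ft) = {p_taller:.2e}")
```

Under these assumed parameters eight feet sits more than seven standard deviations above the mean, so the probability is far below .025.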
• What does Andre the
Giant do to the sample
• What is the probability
of finding someone
like Andre in the
• Are you ready for
• Answer: Oh boy, yes!!
Normal Curves and probability
• We have answered the question of what
Andre and the Sumo wrestler would do to
• But what about the probability of finding
someone the same height as Andre in the
• What is the probability of finding someone
the same height as Dr. Peña or Dr.
More on normal curves and probability
[Figure: normal curve with two markers: "Dr. Boehmer would be here" and "Andre would be here"]
Z-Scores (no sleeping!!)
• We can standardize distance from the mean
across different samples with z-scores.
• The basic unit of the z-score is the standard
deviation.
$z_i = \frac{X_i - \bar{X}}{s}$
We can use the z-score to score each
observation as a distance from the mean,
measured in standard deviations.
How far is a given observation from the
mean when its z-score = 2?
Answer: 2 standard deviations.
Approximately what percentage of cases
is a given case higher than if its z-score
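As a minimal sketch of both questions, the code below computes z-scores for the six observations from the earlier frequency table and then asks a standard normal curve what share of cases falls below a z-score of 2. Reusing that data set is a choice made for this example; the percentage only applies to the extent that the distribution is normal.

```python
from statistics import NormalDist, mean, stdev

data = [70, 100, 100, 150, 150, 500]
x_bar = mean(data)
s = stdev(data)  # sample standard deviation

# z-score: how many standard deviations each observation sits from the mean.
z_scores = [(x - x_bar) / s for x in data]
z_500 = z_scores[-1]

# Share of a standard normal distribution lying below z = 2.
share_below_2 = NormalDist().cdf(2.0)

print([round(z, 2) for z in z_scores])
print(f"share of cases below z = 2: {share_below_2:.3f}")
```

The observation of 500 has a z-score of about 2, and under a normal curve a case at z = 2 is higher than roughly 97.7% of cases.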
Random Sampling Error
• Ever hear a poll report a margin of error? What
Random Sampling Error = standard deviation / square
root of the sample size
$\text{Random Sampling Error} = \sigma / \sqrt{N}$
As the variance of the
population increases, so
does the chance that a
sample might not reflect the population.
• We use random sampling error to refer both
to the chance of erring when sampling
and to the error of a specific
sample statistic, the mean. For the latter we typically
use the term Standard Error.
• A sample statistic's standard error measures
how far the mean of a sample is expected to fall
from the mean of the population from which
it is drawn.
Example: What if most humans were 200
pounds and only 1 million globally were 250
pounds?
The random sampling error would be low
since collecting a sample
consisting heavily of those heavier humans
would be unlikely. There would not be
much error in general from sampling
because of the low variance.
• Example continued. Now, when we take a
sample, each sample has a mean. If a
population has low variance, so should the
samples. We should see this reflected in
low standard error in the mean of the
sample, the sample statistic.
• Of course, higher variance in the
population also causes higher error in
samples taken from it.
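The rule stated earlier (standard deviation divided by the square root of the sample size) makes this easy to check numerically. The sketch below uses made-up standard deviations and sample sizes purely for illustration; `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(std_dev, sample_size):
    """Standard error of the mean: standard deviation / sqrt(sample size)."""
    return std_dev / math.sqrt(sample_size)

# ASSUMPTION: illustrative values, not data from the slides.
low_variance = standard_error(std_dev=10, sample_size=100)
high_variance = standard_error(std_dev=40, sample_size=100)
bigger_sample = standard_error(std_dev=10, sample_size=400)

print(low_variance, high_variance, bigger_sample)
```

With the sample size held fixed, quadrupling the standard deviation quadruples the standard error, while quadrupling the sample size only halves it.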
Some more notation
Distribution        Mean        Standard Dev.
Observed data       $\bar{X}$   $s$
Population          $\mu$       $\sigma$
Repeated samples    $\mu$       $\sigma / \sqrt{N}$
Random Sampling Error = $\sigma / \sqrt{N}$
Error in a sample's mean is the Standard Error = $s / \sqrt{n}$
Central Limit Theorem
Remember that if we took an infinite number
of samples from a population, the means
of these samples would be normally
distributed.
Hence, the larger the sample, the more likely the sample
mean will fall close to the population mean.
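We cannot take an infinite number of samples, but a simulation gives the flavor. The sketch below draws 5,000 samples of 50 cases each from a deliberately skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here only as an assumption for illustration) and looks at how the sample means behave.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample_size = 50
num_samples = 5_000

# Each sample comes from a skewed population with mean 1 and std. dev. 1.
sample_means = []
for _ in range(num_samples):
    sample = [rng.expovariate(1.0) for _ in range(sample_size)]
    sample_means.append(sum(sample) / sample_size)

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_se = 1.0 / math.sqrt(sample_size)  # sigma / sqrt(N)

print(f"mean of sample means = {mean_of_means:.3f}")
print(f"std. dev. of sample means = {sd_of_means:.3f} (predicted {predicted_se:.3f})")
```

Even though the population itself is skewed, the sample means cluster around the population mean, and their spread is close to the standard error predicted by the formula.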
• We can use what we know about the standard
deviation and the mean to calculate the range of values
within which a sample mean would fall if it were
close to the mean of the population.
• This range is based on the sample mean falling
close to the population mean with a probability of .95,
or 5% error.
How Confident Are You?
• Are you 100% sure?
• Social scientists use 95% as a threshold
to test whether or not the results are
a product of chance.
• That is, we take 1 out of 20 chances to be
wrong.
• What do you MEAN?
We build a 95% confidence interval to make
sure that the mean will be within that
Confidence Interval (CI)
$\bar{Y} \pm Z_{\alpha/2} \, \sigma_{\bar{y}}$
$\bar{Y}$ = sample mean
$Z_{\alpha/2}$ = Z-score associated with a 95% CI (1.96)
$\sigma_{\bar{y}}$ = standard error
sample mean ± 1.96 (or 2) × standard error
Building a CI
• Assume the following
y 15 15
N 400 400
Why do we use 1.96?
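One way to answer the 1.96 question is to ask a standard normal curve which z-score leaves 2.5% in each tail. The sketch below does that and then builds an interval. It reads the slide's values as a standard deviation of 15 and N = 400, which is an assumption about the garbled figures, and the sample mean of 100 is a hypothetical value added only so the interval has a center.

```python
import math
from statistics import NormalDist

# z-score that leaves 2.5% in the upper tail (so 95% lies between -z and +z).
z_95 = NormalDist().inv_cdf(0.975)

# ASSUMPTION: standard deviation 15 and N = 400 as read from the slide;
# the sample mean of 100 is hypothetical.
std_dev = 15
n = 400
sample_mean = 100

standard_error = std_dev / math.sqrt(n)
margin = z_95 * standard_error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"z = {z_95:.2f}, standard error = {standard_error}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Rounding 1.96 up to 2 widens the interval only slightly, which is why the shortcut "mean plus or minus 2 standard errors" is common.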
Calculating a 95% CI
1. Let’s look at the class population
distribution of height
2. Is it a normal or skewed distribution?
3. Let’s build a 95% CI around the mean
height of the class
Why do we care about CI?
• We use confidence intervals for hypothesis testing
• For instance, we want to know if there is
an income difference between El Paso
• We want to know whether or not taking
class at Kaplan makes a difference in our
Mean Difference testing
El Paso Las Cruces Boston