Chapter Four
Describing Data: Displaying and Exploring Data
GOALS
When you have completed this chapter, you will be able to:
ONE
Develop and interpret a dot plot.
TWO
Develop and interpret a stem-and-leaf display.
THREE
Compute and interpret quartiles, deciles, and percentiles.
FOUR
Construct and interpret box plots.
FIVE
Compute and understand the coefficient of variation and the
coefficient of skewness.
SIX
Draw and interpret a scatter diagram.
SEVEN
Set up and interpret a contingency table.
Dot Plot
Dot plots:
Report the details of each observation
Are useful for comparing two or more data sets
This example gives the percentages of men and
women participating in the workforce in a recent
year for the fifty states of the United States.
Compare the dispersions of labor force
participation by gender.
Example 1
[Figure: dot plot of the percentage of women participating in the labor force for the 50 states.]
[Figure: dot plot of the percentage of men participating in the labor force for the 50 states.]
Example 1 (continued)
Stem-and-leaf Displays
Stem-and-leaf display: A statistical technique for displaying a set
of data. Each numerical value is divided into two parts: the leading
digits become the stem and the trailing digits the leaf.
Note: an advantage of the stem-and-leaf display over a frequency
distribution is that we do not lose the identity of each observation.
Stock prices on twelve consecutive days for a major
publicly traded company:
86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.
[Figure: chart of stock price (50 to 100) against day (1 to 12)]
Example 2
Stem and leaf display of stock prices
stem leaf
6 9
7 89
8 234568
9 126
Example 2 (continued)
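The display above can be reproduced with a short Python sketch. It assumes two-digit whole-number values, so that the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is purely illustrative.

```python
from collections import defaultdict

# The twelve stock prices from Example 2.
prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]

def stem_and_leaf(values):
    """Group whole numbers by leading digits (stem), keeping sorted trailing digits (leaves)."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 86 -> stem 8, leaf 6
        display[stem].append(leaf)
    return dict(display)

display = stem_and_leaf(prices)
for stem in sorted(display):
    # Print one row per stem, with the leaves run together as on the slide.
    print(stem, "".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is kept, each original price can be read back from the display, which is the advantage over a frequency distribution noted earlier.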
Quartiles
Divide a set of
observations
into four
equal parts.
Quartiles
Locate the median (50th percentile),
the first quartile (25th percentile),
and the third quartile (75th percentile).
Quartiles (continued)
Quartiles
Location of a percentile:
$$L_p = (n + 1)\frac{P}{100}$$
where
P is the desired percentile and
n is the number of observations.
Quartiles (continued)
Using the twelve stock prices, we can find the
median, 25th, and 75th percentiles as follows:
Quartile 3: $L_{75} = (12 + 1)\frac{75}{100} = 9.75$, the 9.75th observation
Median: $L_{50} = (12 + 1)\frac{50}{100} = 6.50$, the 6.50th observation
Quartile 1: $L_{25} = (12 + 1)\frac{25}{100} = 3.25$, the 3.25th observation
Example 2 (continued)
Ordered stock prices, split into four groups of three:
Group  Position  Price
Q4     12        96
Q4     11        92
Q4     10        91
Q3     9         88
Q3     8         86
Q3     7         85
Q2     6         84
Q2     5         83
Q2     4         82
Q1     3         79
Q1     2         78
Q1     1         69
75th percentile
Price at the 9.75th observation = 88 + .75(91 - 88) = 90.25
50th percentile: Median
Price at the 6.50th observation = 84 + .5(85 - 84) = 84.50
25th percentile
Price at the 3.25th observation = 79 + .25(82 - 79) = 79.75
Example 2 (continued)
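The location formula and the interpolation step can be sketched in Python as follows. This is a minimal illustration of the (n + 1)P/100 rule used in this chapter, not the method every software package uses; the function names are hypothetical, and the sketch assumes the computed location falls between 1 and n.

```python
# The twelve stock prices from Example 2.
prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]

def percentile_location(n, p):
    """Location L_p = (n + 1) * P / 100, counted from 1 in the sorted data."""
    return (n + 1) * p / 100

def percentile_value(values, p):
    """Interpolate between the two sorted observations that bracket L_p."""
    ordered = sorted(values)
    location = percentile_location(len(ordered), p)
    if not 1 <= location <= len(ordered):
        # Assumption of this sketch: no extrapolation beyond the data.
        raise ValueError("location falls outside the observations")
    whole = int(location)        # 1-based position of the lower observation
    fraction = location - whole  # how far to move toward the next observation
    lower = ordered[whole - 1]
    if fraction == 0:
        return lower
    upper = ordered[whole]
    return lower + fraction * (upper - lower)

for p in (25, 50, 75):
    print(p, percentile_location(len(prices), p), percentile_value(prices, p))
```

For the median, the lower bracketing observation is the 6th price (84) and the upper is the 7th (85), so the interpolation starts from 84.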
The interquartile range is the distance between the third
quartile Q3 and the first quartile Q1.
This distance will include the middle 50 percent of the
observations.
Interquartile range = Q3 - Q1
Interquartile Range
For a set of observations the third quartile is 24 and the
first quartile is 10. What is the interquartile range?
The interquartile range is 24 - 10 = 14. Fifty percent of the
observations will occur between 10 and 24.
Example 3
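As a quick check of the arithmetic, the calculation can be written as a one-line Python function. The function name is illustrative only. The second call applies the same rule to the stock price quartiles found in Example 2 (79.75 and 90.25); that result is derived here and does not appear on the slides.

```python
def interquartile_range(q1, q3):
    """Distance between the third and first quartiles."""
    return q3 - q1

# Example 3: Q1 = 10 and Q3 = 24.
example_3_iqr = interquartile_range(q1=10, q3=24)

# Stock prices from Example 2: Q1 = 79.75 and Q3 = 90.25.
stock_price_iqr = interquartile_range(q1=79.75, q3=90.25)

print(example_3_iqr, stock_price_iqr)
```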
A box plot is a graphical
display, based on quartiles,
that helps to picture a set of
data.
Five pieces of data
are needed to
construct a box
plot: the Minimum
Value, the First
Quartile, the
Median, the Third
Quartile, and the
Maximum Value.
Box Plots
Based on a sample of 20
deliveries,
Buddy’s Pizza determined the
following information. The
minimum delivery time was 13
minutes and the maximum 30
minutes. The first quartile was
15 minutes, the median 18
minutes, and the third quartile
22 minutes. Develop a box plot
for the delivery times.
Example 4
[Figure: box plot of the delivery times on a scale from 12 to 32 minutes, with Min, Q1, Median, Q3, and Max marked]
Example 4 continued
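Since the plot itself is a picture, a rough text rendering may help show how the five values are laid out. The sketch below is only an illustration: the function name and the choice of characters are assumptions, and it draws one character per whole minute, so it only works for whole-number inputs.

```python
def text_box_plot(minimum, q1, median, q3, maximum, lo, hi):
    """Draw a one-line box plot, one character per whole unit from lo to hi."""
    chars = []
    for value in range(lo, hi + 1):
        if value in (minimum, maximum):
            chars.append("|")   # ends of the whiskers
        elif value == median:
            chars.append("M")   # median inside the box
        elif value == q1:
            chars.append("[")   # left edge of the box
        elif value == q3:
            chars.append("]")   # right edge of the box
        elif q1 < value < q3:
            chars.append("=")   # interior of the box
        elif minimum < value < maximum:
            chars.append("-")   # whisker
        else:
            chars.append(" ")   # outside the data
    return "".join(chars)

# Buddy's Pizza delivery times, on the 12 to 32 minute scale used on the slide.
plot = text_box_plot(13, 15, 18, 22, 30, lo=12, hi=32)
print(plot)
```

The box spans the first to the third quartile (15 to 22 minutes), and the whiskers run out to the minimum and maximum.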
Relative dispersion
The coefficient of variation is the ratio of the standard
deviation to the arithmetic mean, expressed as a percentage:
$$CV = \frac{s}{\bar{X}}(100\%)$$
where s is the standard deviation and $\bar{X}$ is the mean.
Coefficient of Variation
Skewness is the measurement of the lack of symmetry of
the distribution.
The coefficient of skewness can range from -3.00 up to 3.00
when using the following formula:
$$sk = \frac{3(\bar{X} - \text{Median})}{s}$$
A value of 0 indicates a symmetric distribution.
Some software packages use a different formula which results
in a wider range for the coefficient.
Using the twelve stock prices, we find the mean to be
84.42, standard deviation, 7.18, median, 84.5.
Coefficient of variation
$$CV = \frac{s}{\bar{X}}(100\%) = 8.5\%$$
Coefficient of skewness
$$sk = \frac{3(\bar{X} - \text{Median})}{s} = -.035$$
Example 2 revisited
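These figures can be checked with Python's standard `statistics` module. The sketch assumes the slide's 7.18 is the sample standard deviation (n - 1 divisor), which is what `statistics.stdev` computes.

```python
import statistics

# The twelve stock prices from Example 2.
prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]

mean = statistics.mean(prices)
s = statistics.stdev(prices)        # sample standard deviation (n - 1 divisor)
median = statistics.median(prices)

cv = s / mean * 100                 # coefficient of variation, in percent
sk = 3 * (mean - median) / s        # coefficient of skewness

print(round(mean, 2), round(s, 2), median)
print(round(cv, 1), round(sk, 3))
```

The skewness value matches the slide when the unrounded mean is used; substituting the rounded 84.42 gives a slightly different third decimal.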
Scatter diagram: A technique used to show the relationship
between variables.
Variables must be at least interval scaled.
Relationship can be positive (direct) or negative (inverse).
Example
The twelve days of stock prices and the overall market
index on each day are given as follows:
Scatter diagram
Index (000s)  Price
8.0           96
7.5           92
7.5           91
7.3           88
7.2           86
7.2           85
7.1           84
7.1           83
7.0           82
6.2           79
6.2           78
5.1           69
[Figure: scatter diagram titled "Relationship between Market Index and Stock Price", with Index (5 to 10) on the horizontal axis and Price (50 to 100) on the vertical axis]
Example 2 revisited
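One rough way to read the direction of the relationship without drawing the diagram is to compare every pair of days and count how often the higher index goes with the higher price. This pairwise count is a swapped-in check for illustration, not a technique presented in the chapter, and the function name is hypothetical.

```python
from itertools import combinations

# (market index in thousands, stock price) for the twelve days listed above.
pairs = [
    (8.0, 96), (7.5, 92), (7.5, 91), (7.3, 88), (7.2, 86), (7.2, 85),
    (7.1, 84), (7.1, 83), (7.0, 82), (6.2, 79), (6.2, 78), (5.1, 69),
]

def count_directions(points):
    """Count pairs of points that move in the same or in opposite directions."""
    same = opposite = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            same += 1        # higher index goes with higher price
        elif product < 0:
            opposite += 1    # higher index goes with lower price
        # product == 0 means a tie in index or price; it counts for neither
    return same, opposite

same, opposite = count_directions(pairs)
direction = "positive (direct)" if same > opposite else "negative (inverse)"
print(same, opposite, direction)
```

Here no pair moves in opposite directions, which agrees with the upward-sloping pattern in the scatter diagram.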
A contingency table is
used to classify
observations according to
two identifiable
characteristics.
Contingency tables are used
when one or both variables are
nominally scaled.
A contingency table is a
cross tabulation that
simultaneously
summarizes two variables
of interest.
Contingency table
Weight Loss
45 adults, all 60 pounds
overweight, are randomly
assigned to three weight loss
programs. Twenty weeks
into the program, a
researcher gathers data on
weight loss and divides the
loss into three categories:
less than 20 pounds, 20 up
to 40 pounds, 40 or more
pounds. Here are the
results.
Example 5
Weight Loss Plan   Less than 20 pounds   20 up to 40 pounds   40 pounds or more
Plan 1             4                     8                    3
Plan 2             2                     12                   1
Plan 3             12                    2                    1
Compare the weight loss under the three plans.
Example 5 continued
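One way to make the comparison is to convert each plan's counts to row percentages, since each plan has the same number of participants. The sketch below is a minimal illustration; the variable names are assumptions, and the counts are taken directly from the table.

```python
categories = ["Less than 20 pounds", "20 up to 40 pounds", "40 pounds or more"]

# Counts from the contingency table: one row per plan, one column per category.
table = {
    "Plan 1": [4, 8, 3],
    "Plan 2": [2, 12, 1],
    "Plan 3": [12, 2, 1],
}

row_totals = {plan: sum(counts) for plan, counts in table.items()}
column_totals = [sum(column) for column in zip(*table.values())]

# Share of each plan's participants falling in each weight loss category.
row_percents = {
    plan: [round(100 * count / row_totals[plan], 1) for count in counts]
    for plan, counts in table.items()
}

for plan, percents in row_percents.items():
    print(plan, dict(zip(categories, percents)))
```

Read this way, 80 percent of Plan 3 participants lost less than 20 pounds, while 80 percent of Plan 2 participants lost 20 up to 40 pounds.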