# Chapter 5 - Sample Surveys and Experiments - PowerPoint

```
Section 3.1 - Scatterplots

Objectives:

1.   Make a scatterplot and describe its basic shape in
terms of linearity, curvature, clusters, and outliers.
2.   Describe whether the trend in a scatterplot is positive or
negative.
3.   Describe whether the strength of the relationship is
strong, moderate, or weak and whether the strength is
constant across all values of x.
4.   Decide whether the pattern in a scatterplot can be
generalized to other cases, and propose possible
explanations for the pattern.
Section 3.1 - Scatterplots

Interpreting Scatterplots

A scatterplot shows the relationship between two
quantitative variables.

Example: Number of seats versus Flight length (miles) for various
passenger airplanes.
Section 3.1 - Scatterplots

Interpreting Scatterplots

Martin v Westvaco, revisited. In Display 3.1, each employee is
represented by a dot that shows the year of birth (vertical y-axis) plotted
against the year of hire (horizontal x-axis).

Moderate positive
association: employees
hired earlier were generally
born earlier, and employees
hired later were generally
born later.
Section 3.1 - Scatterplots

Interpreting Scatterplots

Martin v Westvaco, revisited. In Display 3.1, each employee is
represented by a dot that shows the year of birth (vertical y-axis) plotted
against the year of hire (horizontal x-axis).

Linear trend: visualize a
line going through the
center of the data from
lower left to upper right. As
you move to the right, the
points fan out or cluster less
closely around the line.
Section 3.1 - Scatterplots

Interpreting Scatterplots

Martin v Westvaco, revisited. Display 3.2 shows the ages of the
employees (vertical y-axis) plotted against the year of hire (horizontal
x-axis).

Moderate negative
association: employees
hired later were generally
younger at the time of
layoffs.
Section 3.1 - Scatterplots

Describing the Pattern in a Scatterplot

Bivariate quantitative data: shape, trend, strength

•   Identify cases and variables. Each dot (x, y) represents one case.
The x and y coordinates correspond to the values of the two
variables. Describe the plot as “y versus x” and give the scale (units)
and range (min and max) of each variable.

•   Describe the shape of the relationship.
    • Linearity: Are the dots scattered around a line or a curve?
    • Clusters: Is there one cluster, or more?
    • Outliers: Are there exceptions to the overall pattern?
Section 3.1 - Scatterplots

Describing the Pattern in a Scatterplot

Bivariate quantitative data: shape, trend, strength

•   Describe the trend.
    • Positive: As x gets larger, y tends to get larger.
    • Negative: As x gets larger, y tends to get smaller.

•   Describe the strength of the relationship. Strong, moderate, or weak,
depending on how closely the dots cluster around an imaginary line or
curve.
Section 3.1 - Scatterplots

Describing the Pattern in a Scatterplot

Bivariate quantitative data: shape, trend, strength

•   Does the pattern generalize to other cases?

•   Are there plausible explanations for the pattern? Is it reasonable to
conclude that one variable causes the other, or is there a lurking
variable that might be causing both?
Section 3.1 - Scatterplots

Describing the Pattern in a Scatterplot

Example: Dormitory Populations. The plot in Display 3.3 shows, for the 50
states, the number of people living in college dormitories versus the
number of people living in cities. Describe the pattern in the plot.
Section 3.1 - Scatterplots

Example: Dormitory Populations. Describe the pattern in the plot.

1. Identify the variables and cases.
Variables: Dormitory population versus urban population.
Dormitory population ranges from 0 to a high of around 175,000.
Urban population ranges from 0 to 32 million. Both are measured in
thousands.
Cases: the 50 U.S. states.
Section 3.1 - Scatterplots

Example: Dormitory Populations. Describe the pattern in the plot.

2. Describe the overall shape.
A generally linear trend becomes curved for the larger states. As the
urban population increases, the number of people living in dormitories
does not grow as much as one would expect. California - the point
farthest to the right - is an outlier both in urban population and with
respect to the overall trend.
Section 3.1 - Scatterplots

Example: Dormitory Populations. Describe the pattern in the plot.

3. Describe the trend.
The trend is positive - states with larger urban populations tend to have
larger dormitory populations.
Section 3.1 - Scatterplots

Example: Dormitory Populations. Describe the pattern in the plot.

4. Describe the strength of the relationship.
The relationship varies in strength. Dots corresponding to smaller states
cluster around the line, while dots corresponding to larger states tend to
fan out. The overall strength of the relationship is moderate.
Section 3.1 - Scatterplots

Example: Dormitory Populations. Describe the pattern in the plot.

5. Does the pattern generalize to other cases?
The cases are not a sample from a larger group - there are only 50
states - so the relationship cannot be generalized to a larger group.
However, it is likely that the relationship will be the same in other years.
Section 3.1 - Scatterplots

Example: Dormitory Populations. Describe the pattern in the plot.

6. Are there plausible explanations for the pattern?
Do cities attract colleges? Not likely. The most likely explanation is a
lurking variable, the state’s population - as the population of the state
increases, the number of people living in cities increases, as does the
number of people living in dormitories.
Section 3.1 - Scatterplots

Example:
E4. SAT I scores. In 2005, the average SAT I math score across the U.S.
was 520. North Dakota students averaged 605, Illinois students averaged
606, and students from Iowa averaged 608. Why do students from the
Midwest do so well? It is easy to jump to a false conclusion, but the
Section 3.1 - Scatterplots

E4a. SAT I scores. Estimate the percentage of students in Iowa and North
Dakota who took the SAT I. New York had the highest percentage of
students who took the SAT I. Estimate that percentage and the average
SAT I math score for students in New York.
SAT I Math Data
       State             Percent   Average
1      Iowa                    5       608
2      Illinois               10       606
3      North Dakota            4       605
4      Wisconsin               6       599
5      Minnesota              11       597
6      South Dakota            5       589
7      Kansas                  9       588
8      Missouri                7       588
9      Michigan               10       579
11     Oklahoma                7       563
12     Tennessee              16       563
13     Louisiana               8       562
15     Alabama                10       559
SAT I Math Data (continued)
       State             Percent   Average
37     Virginia               73       514
39     New York               92       511
40     North Carolina         74       511
41     West Virginia          20       511
42     Indiana                66       508
43     Maine                  75       505
44     Rhode Island           72       505
45     Pennsylvania           75       503
46     Delaware               74       502
47     Texas                  54       502
48     South Carolina         64       499
49     Florida                65       498
50     Georgia                75       496
Section 3.1 - Scatterplots

E4b. SAT I scores. Describe the shape of the plot. Do you see any
clusters? Are there any outliers? Is the relationship linear or curved? Is the
overall trend positive or negative? What is the strength of the relationship?

The overall trend is negative, the relationship is moderate in strength,
and the shape is curved. There are two clusters, separated by a gap
(40 - 50%), corresponding to two groups of states - low percentages/high
scores and high percentages/low scores. There is an outlier: West Virginia,
with 20% and an average score of 511.
Section 3.1 - Scatterplots

E4c. SAT I scores. Is the distribution of the percentage of students taking
the SAT I bimodal? Explain how the scatterplot shows this. Is the
distribution of SAT I math scores bimodal?
The percentage is bimodal - there is a cluster around 10% and another
around 65 - 75%.
The distribution of SAT I math scores may be slightly bimodal, with a
cluster around 560 - 580 and another around 510 - 520.
Section 3.1 - Scatterplots

E4d. SAT I scores. The cases used in this plot are the 50 U.S. states in
2005. Would you expect the pattern to generalize to some other set of
cases? Why or why not?
The data are from all 50 states, so the pattern can’t be generalized to a
larger group.
Since SAT scores do not change much from year to year, it is likely that the
pattern would hold up if we used data from another year.
Section 3.1 - Scatterplots

E4e. SAT I scores. Suggest an explanation for this trend. Is there anything
correct?
Students in Midwest states tend to take the ACT. Only a small percentage
of students in these states take the SAT. Those who do tend to be stronger
students applying to selective colleges outside the Midwest, which raises
the state average.

```
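
The E4b answer describes two clusters separated by a gap at 40 - 50% and names West Virginia as an outlier. As a rough check, the sketch below splits the states listed in the two SAT I tables at that gap and compares the average math scores of the two groups. It assumes only the 26 rows shown in the slides (the middle rows are not listed there), and the variable names and the gap constants are illustrative choices, not part of the original exercise.

```python
from statistics import mean

# (state, percent taking the SAT I, average SAT I math score), copied from
# the two partial tables in the slides; only 26 of the 50 states appear there.
sat_rows = [
    ("Iowa", 5, 608), ("Illinois", 10, 606), ("North Dakota", 4, 605),
    ("Wisconsin", 6, 599), ("Minnesota", 11, 597), ("South Dakota", 5, 589),
    ("Kansas", 9, 588), ("Missouri", 7, 588), ("Michigan", 10, 579),
    ("Oklahoma", 7, 563), ("Tennessee", 16, 563), ("Louisiana", 8, 562),
    ("Alabama", 10, 559), ("Virginia", 73, 514), ("New York", 92, 511),
    ("North Carolina", 74, 511), ("West Virginia", 20, 511),
    ("Indiana", 66, 508), ("Maine", 75, 505), ("Rhode Island", 72, 505),
    ("Pennsylvania", 75, 503), ("Delaware", 74, 502), ("Texas", 54, 502),
    ("South Carolina", 64, 499), ("Florida", 65, 498), ("Georgia", 75, 496),
]

# Assumption: use the 40 - 50% gap mentioned in the E4b answer as the split.
GAP_LOW, GAP_HIGH = 40, 50

low_pct = [row for row in sat_rows if row[1] < GAP_LOW]
high_pct = [row for row in sat_rows if row[1] > GAP_HIGH]

low_mean = mean(score for _, _, score in low_pct)
high_mean = mean(score for _, _, score in high_pct)

# Candidate outlier: the lowest-scoring state among the low-percentage states.
outlier = min(low_pct, key=lambda row: row[2])

print(f"low-percentage states:  n={len(low_pct)}, mean score={low_mean:.1f}")
print(f"high-percentage states: n={len(high_pct)}, mean score={high_mean:.1f}")
print(f"lowest score in the low-percentage group: {outlier[0]} ({outlier[2]})")
```

On these rows, the low-percentage group has the higher mean score, which matches the negative trend described in the slides, and the lowest-scoring state in that group is West Virginia.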