
# AP Statistics

```									AP Statistics

Chapter 3 and 4 Test Review
Scatterplots
Display bivariate data: explanatory
variable on the x axis, response on the y axis.
Describing associations: direction, form,
strength in context
Outlier: individual data point that falls
far outside overall pattern of graph
Influential Point: data point that
markedly changes the LSRL if removed
Least Squares Regression Line
ŷ = a + bx
a = ȳ − b·x̄
Use 2-var stats to get these numbers
Correlation Coefficient
−1 ≤ r ≤ 1
r tells you two things
a) slope: +r = + slope
−r = − slope
b) dispersion of data around line of best fit.
The closer to 1 or −1, the less
variation from the LSRL
Interpreting the LSRL
ŷ=a+bx
The slope of a regression line is usually
important for interpreting the data.
The regression equation relating number of
cavities to dental office visits gives
ŷ = 6.7 + 0.5x

Interpretation: For each additional trip to the
dentist (x), the LSRL predicts an increase of
0.5 cavities.
Regression output relating range of motion to age for 12 patients recovering from knee surgery

Predictor   Coef      Stdev    T-ratio   P-value
Constant    107.58    11.12     9.67     0.000
Age         -0.8710   0.4146   -2.10     0.062

N = 12   S = 10.42   R-Sq = 30.6%   R-Sq(adj) = 23.7%

1. What is the equation of the LSRL?
2. What is the predicted range of motion for a 25-year-old
patient?
3. What is the value of the correlation coefficient?
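Worked answers (a sketch, reading the coefficients
and R-Sq directly from the output above; x = age,
ŷ = predicted range of motion):
1. ŷ = 107.58 − 0.8710x
2. ŷ = 107.58 − 0.8710(25) = 107.58 − 21.775 ≈ 85.8
3. r² = 0.306, so r = −√0.306 ≈ −0.553
(r is negative because the slope is negative)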
Correlation Tidbits
Every least squares regression line passes
through the point (x̄, ȳ)
Correlation (r) is a unitless measurement
r measures the direction and strength of only a
linear relationship between two variables.
Like the mean and standard deviation, r is
strongly affected by influential points. It is
not resistant.
Coefficient of Determination r²
The percent of variation in the response
variable (y) that is explained by the least
squares regression of y on x

Remember to interpret in context:
68% of the variation in airfare can be
explained by the least squares regression of
airfare on the length of a flight (distance).
Residuals
The difference between an observed value of
the response variable and the value predicted
by the regression line

More simply: actual − predicted = residual
if the point is above the line: positive residual
if the point is below the line: negative residual
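Example (a hypothetical data point, using the
dental LSRL ŷ = 6.7 + 0.5x from earlier): a patient
with x = 4 visits and 10 observed cavities has
predicted = 6.7 + 0.5(4) = 8.7
residual = 10 − 8.7 = +1.3 (the point lies above the line)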
Modeling Nonlinear Data
A variable grows exponentially if an (x, log y)
transformation linearizes the data: y = a·b^x
If a (log x, log y) transformation is linear,
then the data are best modeled by a power
model: y = a·x^b

A residual plot of the transformed data should
show a random scatter of points.
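
Why the transformations work (a sketch using
base-10 logs; c and d are placeholder names for the
intercept and slope of the transformed line):
Exponential: log y = c + dx
gives y = 10^(c + dx) = (10^c)(10^d)^x, so a = 10^c and b = 10^d
Power: log y = c + d·log x
gives y = (10^c)·x^d, so a = 10^c and b = d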
Make sure you understand the
response question from Quiz
4.1.
You will probably see another
question like it tomorrow.
Interpreting Correlation and
Regression
Extrapolation: using a regression line
for prediction outside the range of
values of the explanatory variable
used to fit the line

Lurking variable: a variable that has an
important effect on the relationship
between the two variables, but is not
directly included in the study
Interpreting Association
1. Causation: the association exists because
one variable causes changes in a
second variable

Ex: a drop in outdoor temperature
causes an increase in natural gas
consumption
2. Common response: both variables
respond to changes in a third,
lurking variable.

Ex: Both SAT verbal and SAT math
scores respond to a student's ability
and level of knowledge.
3. Confounding: The effect on y of the
explanatory variable is hopelessly mixed up
with the effects on y of other variables.

Ex: Minority students have lower average
test scores on college exams than white
students, but minority students (on average) grew
up in poorer neighborhoods and attended
poorer schools than the average white student. The
effects of social and economic differences
are mixed up with race, so we cannot say that
race explains test scores.
Final Tips for success
Review past quizzes and worksheets