Chapters 26-27 Packet Due: A day 3/31, B day 4/1
Chapter 26 Chi-Square Templates
1. Births are not evenly distributed across the days of the week. Fewer babies are born on Saturday
and Sunday than on other days, probably because doctors find weekend births inconvenient. A
random sample of 700 births from local records shows this distribution across the days of the week:
Day      Sun.  Mon.  Tue.  Wed.  Thu.  Fri.  Sat.
Births     84   110   124   104    94   112    72
Do 700 births give significant evidence that births are not equally probable on all days of the week?
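The arithmetic behind a goodness-of-fit test for this table can be sketched as follows. This is a minimal illustration, not the required template write-up: it assumes the null hypothesis of equal probability for each day, and the helper chi2_sf_even_df is a hypothetical name introduced here. The helper uses a closed-form upper-tail formula that holds only for even degrees of freedom (df = 6 in this problem).

```python
import math

# Observed births by day (Sun. through Sat.), copied from the table above.
observed = [84, 110, 124, 104, 94, 112, 72]
n = sum(observed)                                # 700 births in the sample
expected = [n / len(observed)] * len(observed)   # 100 per day under H0

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even df")
    half = x / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


p_value = chi2_sf_even_df(chi_sq, df)
print(f"X^2 = {chi_sq:.2f}, df = {df}, P-value = {p_value:.4f}")
```

Every expected count is 100, so the usual large-count condition for the chi-square approximation is easy to check by eye.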
2. A sample survey by the Pew Internet and American Life Project asked a random sample of adults
about use of the Internet and about the type of community they lived in. Here are the results:
                 Rural  Suburban  Urban
Internet users     433      1072    536
Nonusers           463       627    388
Is there a relationship between Internet use and community type? Give statistical evidence to
support your findings.
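One way to check the hand calculation for a two-way table like this is sketched below. It is an illustration under the usual null hypothesis of no association. Expected counts are row total × column total / table total. The one-line P-value formula is exact only because this table has 2 degrees of freedom.

```python
import math

# Observed counts from the table above (columns: Rural, Suburban, Urban).
table = [
    [433, 1072, 536],   # Internet users
    [463, 627, 388],    # Nonusers
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected count for each cell under "no association".
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)

# For df = 2 the chi-square upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(f"X^2 = {chi_sq:.2f}, df = {df}, P-value = {p_value:.2e}")
```

Comparing each observed count with its expected count also shows which cells contribute most to the statistic, which is useful when describing the nature of the relationship.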
Multiple Choice: Questions 1 through 10 relate to the following setting.
The National Survey of Adolescent Health interviewed several thousand teens (grades 7 to 12). One question asked
was “What do you think are the chances you will be married in the next ten years?” Here is a two-way table of the
responses by sex:
Opinion                          Female   Male
Almost no chance                    119    103
Some chance, but probably not       150    171
A 50-50 chance                      447    512
A good chance                       735    710
Almost certain                     1174    756
1. The number of female teenagers in the sample is
2. The percent of the females in the sample who responded “almost certain” is about
3. The percent of the females in the sample who responded “almost certain” is
(a) higher than the percent of males who felt this way.
(b) about the same as the percent of males who felt this way.
(c) lower than the percent of males who felt this way.
(d) higher than the percent of males who responded “a good chance” or “almost certain.”
(e) higher than the other categories combined.
4. The expected count of females who respond “almost certain” is about
5. The term in the chi-square statistic for the cell of females who respond “almost certain” is about
6. The degrees of freedom for the chi-square test for this two-way table are
(a) 2. (b) 4. (c) 8. (d) 9. (e) 20.
7. The null hypothesis for the chi-square test for this two-way table is
(a) Equal proportions of female and male teenagers are almost certain they will be married in ten years.
(b) There is no difference between female and male teenagers in their opinions about their chances of being
married in ten years.
(c) There are equal numbers of female and male teenagers.
(d) There is a difference between female and male teenagers in their opinions about their chances of being
married in ten years.
(e) There is no association between the number of female teens who responded “almost certain” and the
number of male teens who feel this way.
8. The alternative hypothesis for the chi-square test for this two-way table is
(a) Female and male teenagers do not have the same opinions about their chances of being married in ten years.
(b) Female teenagers are more likely than male teenagers to think it is almost certain they will be married in ten years.
(c) Female teenagers are less likely than male teenagers to think it is almost certain they will be married in ten years.
(d) There is no association between the number of female teens who responded “almost certain” and the
number of male teens who feel this way.
(e) There is an association between the number of female teens who responded “almost certain” and the
number of male teens who feel this way.
9. Software gives the chi-square statistic as X² = 69.8 for this table. The P-value is
(a) between 0.0025 and 0.001.
(b) between 0.001 and 0.0005.
(c) less than 0.0005.
(d) between 0.005 and 0.001.
(e) between 0.01 and 0.001.
10. The most important fact that allows us to trust the results of the chi-square test is
(a) the sample is large, 4877 teenagers in all.
(b) the sample is close to an SRS of all teenagers.
(c) all of the cell counts are greater than 100.
(d) the P-value is very small.
(e) the X² test statistic is very large.
Chapter 27 Linear regression T
Multiple Choice Practice: Questions 1 to 9 are based on the following information.
Florida reappraises real estate every year, so the county appraiser’s Web site lists the current
“fair market value” of each piece of property. Property usually sells for somewhat more than the
appraised market value. Here are the appraised market values and actual selling prices (in
thousands of dollars) of condominium units sold in a beachfront building over a 19-month period:
Selling  Appraised           Selling  Appraised
price    value      Month    price    value      Month
  850      758.0      0        790      605.9     13
  900      812.7      1        700      483.8     14
  625      504.0      2        715      585.8     14
 1075      956.7      2        825      707.6     14
  890      747.9      8        675      493.9     17
  810      717.7      8       1050      802.6     17
  650      576.6      9       1325     1031.8     18
  845      648.3     12        845      586.7     19
Here is part of the Minitab output for regressing selling price on appraised value.
Predictor      Coef   SE Coef      T       P
Constant     127.27     79.49   1.60   0.132
appraisal    1.0466    0.1126   9.29   0.000
S = 69.7299   R-Sq = 86.1%   R-Sq(adj) = 85.1%
Predicted Values for New Observations
Obs     Fit   SE Fit        95% CI              95% PI
1     967.3     21.6   (920.9, 1013.7)   (810.7, 1123.9)
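The coefficient table in the Minitab output can be reproduced from the 16 sales listed above with the standard least-squares formulas. The sketch below is only an illustration of where each number comes from. The variable names are chosen here and are not part of the Minitab output.

```python
import math

# (selling price, appraised value) in thousands of dollars, from the table above.
sales = [
    (850, 758.0), (900, 812.7), (625, 504.0), (1075, 956.7),
    (890, 747.9), (810, 717.7), (650, 576.6), (845, 648.3),
    (790, 605.9), (700, 483.8), (715, 585.8), (825, 707.6),
    (675, 493.9), (1050, 802.6), (1325, 1031.8), (845, 586.7),
]
price = [p for p, _ in sales]
appraised = [a for _, a in sales]
n = len(sales)

mean_x = sum(appraised) / n
mean_y = sum(price) / n
sxx = sum((x - mean_x) ** 2 for x in appraised)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(appraised, price))

slope = sxy / sxx                        # "appraisal" Coef
intercept = mean_y - slope * mean_x      # "Constant" Coef

# Regression standard error s uses n - 2 degrees of freedom.
residuals = [y - (intercept + slope * x) for x, y in zip(appraised, price)]
s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

se_slope = s / math.sqrt(sxx)            # "appraisal" SE Coef
t_stat = slope / se_slope                # "appraisal" T

print(f"price = {intercept:.2f} + {slope:.4f} * appraised")
print(f"s = {s:.2f}, SE(slope) = {se_slope:.4f}, t = {t_stat:.2f}, df = {n - 2}")
```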
1. The equation of the least-squares regression line for predicting selling price from appraised value is
(a) price = 79.49 + 0.1126 appraised value.
(b) price = 127.27 + 1.0466 appraised value.
(c) price = 1.0466 + 127.27 appraised value.
(d) price = 127.27 + 69.7299 appraised value.
(e) price = 79.49 + 1.0466 appraised value.
2. What is the correlation between selling price and appraised value?
3. The slope of the population regression line describes
(a) the exact increase in the selling price of an individual unit when its appraised value increases by $1000.
(b) the average increase in selling price in a population of units when appraised value increases by $1000.
(c) the average selling price in a population of units when a unit’s appraised value is 0.
(d) the predicted increase in appraised value when the selling price increases by $1000.
(e) the exact increase in appraised value when the selling price increases by $1000.
4. Is there significant evidence that selling price increases as appraised value increases? To answer this question,
test the hypotheses
(a) H0: = 0 versus Ha: > 0.
(b) H0: = 0 versus Ha: ≠ 0.
(c) H0: = 0 versus Ha: > 0.
(d) H0: = 0 versus Ha: < 0.
(e) H0: > 0 versus Ha: = 0.
5. The P-value for this test is
(a) less than 0.001.
(b) between 0.001 and 0.005.
(c) between 0.005 and 0.05.
(d) between 0.05 and 0.10.
(e) greater than 0.10.
6. The regression standard error for these data is
(a) 0.1126. (b) 69.7299. (c) 79.49. (d) 21.6. (e) 967.3.
7. Confidence intervals and tests for these data use the t distribution with degrees of freedom
(a) 14. (b) 15. (c) 16. (d) 30. (e) 17.
8. A valid conclusion to our analyses is that
(a) there is insufficient evidence to conclude that the slope of the true regression line is not zero.
(b) there is insufficient evidence to conclude that the slope of the true regression line for these 19
condominium units is $10,466.
(c) there is strong evidence to conclude that selling price increases as the appraised value increases.
(d) there is weak evidence that β is positive.
(e) there is evidence to conclude that the slope of the true regression line is significantly greater than zero.
9. A 95% confidence interval for the population slope is
(a) $1.0466 ± 0.2415.
(b) $1.0466 ± 149.5706.
(c) $1.0466 ± 0.2387.
(d) 967.3 ± 46.4.
(e) 967.3 ± 156.6.
Questions 1 to 3 refer to the following situation:
The effects of a toxic pollutant upon fish were examined by placing fish in a two-liter solution of
water with various concentrations of the pollutant. The time (in minutes) until the fish showed
distress was recorded, at which time the fish were removed from the container. A total of 18
different experiments were performed. Note that the pollutant is measured on a logarithmic scale
where a change of one unit represents a tenfold increase in the pollution concentration. A
preliminary plot of the data showed that the relationship of time versus log(pollution) was
approximately linear. The output appears below:
SOURCE        DF   SUM OF SQUARES   MEAN SQUARE   F VALUE   PR > F
MODEL          1       2.21459712    2.21459712      5.49   0.0324
ERROR         16       6.45556062    0.40347254
CORR. TOTAL   17       8.67015774

                          T FOR H0:                  STD ERROR OF
PARAMETER     ESTIMATE    PARAMETER=0    PR > |T|    ESTIMATE
INTERCEPT       7.5641           3.82      0.0015    1.978
LOGPOLLUT      -1.0269          -2.34      0.0324    0.438
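The entries in this output are tied together by a few identities, which the sketch below checks using only the numbers printed above. It illustrates how the pieces relate and makes no claim about how the software computed them. Small mismatches come from the rounding in the printed estimate and standard error.

```python
import math

# Values copied from the output above.
ss_model, df_model = 2.21459712, 1
ss_error, df_error = 6.45556062, 16
ss_total = 8.67015774
slope, se_slope = -1.0269, 0.438

ms_model = ss_model / df_model
ms_error = ss_error / df_error            # MEAN SQUARE for ERROR
f_stat = ms_model / ms_error              # F VALUE

t_stat = slope / se_slope                 # T FOR H0: PARAMETER=0
r_squared = ss_model / ss_total           # fraction of variation explained
s = math.sqrt(ms_error)                   # regression standard error

print(f"F = {f_stat:.2f}, t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}")
print(f"r^2 = {r_squared:.3f}, s = {s:.3f}, df for t = {df_error}")
```

With one explanatory variable, the F statistic equals the square of the slope's t statistic, which is why both rows of the output report the same P-value, 0.0324.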
1. The fitted regression line is
(a) y = –1.03 + 7.56 x
(b) y = 7.56 – 1.03 x
(c) y = 3.28 – 2.34 x
(d) y = 7.56 – 10.27 x
(e) y = –1.03 + 75.64 x
2. A 95% confidence interval for the slope is
(a) 7.56 ± 1.96 (1.978)
(b) –1.03 ± 1.96 (0.438)
(c) 7.56 ± 2.1098 (1.978)
(d) –1.03 ± 2.1098 (0.438)
(e) –1.03 ± 2.1199 (0.438)
3. The appropriate null and alternative hypotheses to test the slope, the test statistic, and the P-value are
(a) H0: = 0, Ha: ≠ 0, t = –2.34, and P-value = 0.0324
(b) H0: = 0, Ha: ≠ 0, t = 3.82, and P-value = 0.0007
(c) H0: = 0, Ha: < 0, t = –2.34, and P-value = 0.0324
(d) H0: = 0, Ha: ≠ 0, t = 3.82, and P-value = 0.0015
(e) H0: = 0, Ha: < 0, t = –2.34, and P-value = 0.0162
Questions 4 and 5 are based upon the following:
Fitness can be measured by the rate of oxygen consumption during exercise, with more fit people having
higher rates. Unfortunately, this measurement is quite costly to obtain, and so an experiment was done to
see if this measurement could be predicted from the time it takes (in minutes) to run 1500 m. The
following output from JMP was obtained (the M and F refer to males and females respectively).
4. Which of the following is not correct?
(a) We are about 95% confident that the slope for these data is between –4.0 and –2.5.
(b) The fitted regression line is approximately y = 82.42 – 3.31 (Runtime).
(c) There is good evidence that there is a relationship between oxygen consumption and the run time.
(d) A person who runs 1500 m in 10 minutes would have an estimated oxygen consumption rate of
(e) The standard error = 0.36 measures how much the estimated slope would vary if another sample
of people were measured.
5. Which of the following is correct?
(a) The most relevant null hypothesis is that the estimated change in oxygen consumption for people
who take an additional minute to run 1500 m is 0.
(b) The most relevant null hypothesis is H0: = –3.31.
(c) The most relevant null hypothesis is that there is no relationship between the oxygen consumption
rate and the time to run 1500 m among all people.
(d) The most relevant null hypothesis is that we are 95% confident that the slope is between –4.04
(e) The most relevant null hypothesis is that we haven’t a clue what this question is about.
Part 2: Free Response
Answer completely, but be concise. Write sequentially and show all steps.
6. The standard procedure to reduce abnormally rapid heartbeats in humans is called the “diving reflex.”
This entails briefly submerging the patient’s face in cold water. The reflex, triggered by cold water
temperatures, is an involuntary neural response that shuts off circulation to the skin, muscles, and
internal organs to divert extra oxygen-carrying blood to the heart, lungs, and brain. A research
physician conducted an experiment to investigate the effects of various cold water temperatures on
the pulse rate of seven 6-year-old children. The temperature of the water (°F) is the explanatory
variable, and the decrease in pulse rate (beats per minute) is the response variable. Here is computer
output for a regression analysis:
Predictor       Coef     Stdev   t-ratio
Constant      57.912     5.674     10.21
H2Otemp     -0.81138   0.09038     -8.98
s = 1.198   R-sq = 94.2%   R-sq(adj) = 93.0%
(a) What is the equation of the least-squares regression line?
(b) The model for regression inference has three parameters: α, β, and σ. Estimate these parameters
from the output.
(c) What is the slope of the least-squares regression line? Interpret the slope in the context of this
(d) What is the value of the standard error of the slope? Interpret this value in context.
(e) What is the value of the standard error about the line? Interpret this value in context.
(f) What is the estimated heart rate when the water temperature is 60°F? What is the estimated heart
rate when the water temperature is 70°F? Explain why these two predictions make sense.
Chapter 27 Template:
1. Is the temperature of the water (°F) useful for predicting the decrease in heart rate? Use a
significance level of 0.01.
2. Find the 95% confidence interval for the slope of the true regression line of decrease in heart rate
on water temperature. Interpret this interval.
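The arithmetic for both template questions can be sketched from the computer output alone. This is a rough check under stated assumptions, not a complete template answer. The two critical values are assumptions looked up in a standard t table for df = 5 (2.571 for 95% confidence, 4.032 for a two-sided test at the 0.01 level), and the variable names are chosen here.

```python
# Values copied from the regression output for the diving-reflex data.
slope, se_slope = -0.81138, 0.09038
n = 7                        # seven children
df = n - 2

# Assumed t-table critical values for df = 5 (not computed here).
T_STAR_95 = 2.571            # 95% confidence
T_CRIT_01 = 4.032            # two-sided test, alpha = 0.01

t_ratio = slope / se_slope
reject_h0 = abs(t_ratio) > T_CRIT_01

margin = T_STAR_95 * se_slope
lower, upper = slope - margin, slope + margin

print(f"t = {t_ratio:.2f} with df = {df}; reject H0 at 0.01: {reject_h0}")
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

A full answer still needs the hypotheses, a check of the conditions for regression inference, and a conclusion stated in context.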