Final Exam: Practice problems
Read each question carefully. Circle your final answer whenever possible. Show all
work to get credit!!!
1. Bags of fertilizer are filled such that the expected weight per bag is 40 lbs. with a
standard deviation of 2 lbs. The weights of the bags are normally distributed.
i.) What is the probability that the weight of a single, randomly selected bag of
fertilizer exceeds 41 lbs?
ii.) Now suppose a random sample of 16 bags is selected and the sample average
weight is determined. What is the probability that the sample average weight
of the bags is between 40 and 41 lbs?
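Both parts can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard library; the variable names are illustrative only. Part ii relies on the fact that the average of 16 independent normal weights is itself normal, with standard deviation 2/sqrt(16) = 0.5.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 40.0, 2.0, 16  # values stated in the problem

# Part i: a single bag, X ~ N(40, 2)
single_bag = NormalDist(mu, sigma)
p_single = 1 - single_bag.cdf(41)

# Part ii: the sample mean of 16 bags, X-bar ~ N(40, 2 / sqrt(16))
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_mean = sample_mean.cdf(41) - sample_mean.cdf(40)

print(f"P(X > 41)          = {p_single:.4f}")
print(f"P(40 < X-bar < 41) = {p_mean:.4f}")
```

The same numbers can be found by hand by standardizing to z-scores (0.5 for part i, 0 and 2 for part ii) and using a standard normal table.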
2. (3 pts) All the values in a data set are multiplied by two. Circle the most appropriate
description of the effect of this change on the data.
(a) The mean and median are both doubled, the variance stays the same.
(b) The mean, median and variance are all doubled.
(c) The mean and median are both doubled, the variance is multiplied by 4.
(d) There is no effect.
(e) None of the above.
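One way to build intuition for this question is to try the scaling on a small data set. The sketch below uses made-up numbers (the list `data` is hypothetical, not taken from the exam) and compares the mean, median and variance before and after doubling.

```python
from statistics import mean, median, pvariance

data = [3, 7, 8, 12, 20]            # hypothetical data, not from the exam
doubled = [2 * x for x in data]

# Ratio of each summary statistic after doubling to its value before
ratio_mean = mean(doubled) / mean(data)
ratio_median = median(doubled) / median(data)
ratio_var = pvariance(doubled) / pvariance(data)

print(ratio_mean, ratio_median, ratio_var)
```

The ratios follow from the general rules E(cX) = c E(X) and Var(cX) = c^2 Var(X), so the result does not depend on the particular numbers chosen.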
3. The time to repair breakdowns of a certain type of lawn mower engine is normally
distributed with a mean time of 93 minutes. The manufacturer has made a new engine
and claims that breakdowns of the new improved model are easier to fix (takes less time).
You have been asked, as a consultant, to test the manufacturer’s claim of their new
i.) Circle the most appropriate null and alternative hypotheses. (3 points)
a.) H o : X 93 against H A : X 93
b.) H o : X 93 against H A : X 93
c.) H o : 93 against H A : 93
d.) H o : 93 against H A : 93
ii.) You randomly sample 20 engines and find a sample mean repair time of 90
minutes with a sample standard deviation of 8 minutes. What is the value of the test
statistic?
iii.) Determine the critical (or rejection) region of a test of the hypothesis if $\alpha = 0.01$.
iv.) Calculate the p-value of this hypothesis test.
v.) Using $\alpha = 0.01$, circle the most appropriate conclusion to the test.
a.) At $\alpha = 0.01$, there IS NOT sufficient evidence to say the true mean repair time
for this new type of lawn mower engine is less than 93 minutes.
b.) At $\alpha = 0.01$, there IS sufficient evidence to say the true mean repair time for this
new type of lawn mower engine is less than 93 minutes.
c.) At $\alpha = 0.01$, there IS sufficient evidence to say the true mean repair time for this
new type of lawn mower engine is 93 minutes.
d.) At $\alpha = 0.01$, there IS NOT sufficient evidence to say the true mean repair time for
this new type of lawn mower engine is different from 93 minutes.
e.) At $\alpha = 0.01$, there IS sufficient evidence to say the true mean repair time for this
new type of lawn mower engine is different from 93 minutes.
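The calculations in parts ii through v can be sketched with the standard library alone. This sketch assumes a lower-tailed one-sample t test (the claim is that repairs take less time) with 19 degrees of freedom. The helper functions `t_pdf` and `t_cdf` are illustrative and integrate the t density numerically; the critical value -2.539 is quoted from a standard t table.

```python
from math import gamma, pi, sqrt

mu0, xbar, s, n = 93.0, 90.0, 8.0, 20   # values stated in the problem
df = n - 1

# One-sample t statistic
t_stat = (xbar - mu0) / (s / sqrt(n))

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """CDF via Simpson's rule on [0, |t|], using symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                 # P(0 < T < |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

p_value = t_cdf(t_stat, df)              # lower-tailed: P(T <= t_stat)
t_crit = -2.539                          # table value for alpha = 0.01, df = 19

print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
print("reject H0" if t_stat < t_crit else "do not reject H0")
```

Comparing the statistic with the critical value and comparing the p-value with 0.01 are two views of the same decision, so they always agree.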
4. The following is simple linear regression output from StatCrunch for the regression of
life expectancy (the variable y, or var2) on per capita GDP (x, or var1, in thousands of
dollars) for 10 European countries in 2000.
Simple linear regression results:
Dependent Variable: var2
Independent Variable: var1
var2 = 68.71511 + 0.42002273 var1
Sample size: 10
R (correlation coefficient) = 0.8094
R-sq = 0.65515995
Estimate of error standard deviation: 0.4950725
Parameter   Estimate     Std. Err.     DF   T-Stat      P-Value
Intercept   68.71511     2.3237698     8    29.570532   <0.0001
Slope       0.42002273   0.107736535   8    3.89861     0.0046
Analysis of variance table for regression model:
Source   DF   SS          MS          F-stat      P-value
Model    1    3.7252655   3.7252655   15.199161   0.0046
Error    8    1.9607744   0.2450968
Total    9    5.68604
i. What percentage of raw variation in life expectancy (var2) about its mean is explained by the
fitted model? (Answer here) ______________
ii. Use the linear regression equation above to predict the life expectancy for a country for
which the per capita GDP is 21.5 thousand dollars.
iii. Italy, which is one of the countries whose data was used to fit the regression line, has
a per capita GDP of 21.5 thousand dollars and a life expectancy of 78.51 years. Compute
the residual (or error) for Italy.
iv. Give the specific interpretation of the slope, in the context of this problem.
v. Is the slope of this regression equation statistically significant at level $\alpha = 0.01$? Why
or why not?
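The arithmetic behind parts i, ii, iii and v can be sketched directly from the regression output. All numbers below are copied from that output or from the problem text; the variable names are illustrative.

```python
# Values copied from the regression output and the problem statement
intercept, slope = 68.71511, 0.42002273
slope_se = 0.107736535
ss_model, ss_total = 3.7252655, 5.68604
slope_p_value = 0.0046

# i. share of variation explained: R-sq = SS(Model) / SS(Total)
r_squared = ss_model / ss_total

# ii. predicted life expectancy at a per capita GDP of 21.5 (thousand dollars)
gdp = 21.5
predicted = intercept + slope * gdp

# iii. residual = observed - predicted, using Italy's observed 78.51 years
residual = 78.51 - predicted

# v. t statistic for the slope, and its reported p-value compared with alpha
t_slope = slope / slope_se
alpha = 0.01
significant = slope_p_value < alpha

print(f"R-sq = {r_squared:.4f}")
print(f"predicted = {predicted:.3f}, residual = {residual:.3f}")
print(f"t = {t_slope:.4f}, significant at alpha = {alpha}: {significant}")
```

Recomputing R-sq and the slope's t statistic this way is also a useful consistency check against the values printed in the output.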
5. A 95% confidence interval for some unknown population mean was found to be (143, 175).
i.) The interpretation of the confidence interval is:
a.) In repeated applications of the process which was used to arrive at the interval,
95% of all resulting intervals would capture the true population mean.
b.) In repeated applications of the process which was used to arrive at the interval,
95% of all the population means are captured in the interval.
c.) Confidence intervals are the only type of statistical inference.
d.) We would need to know if the interval was based on the population standard
deviation or sample standard deviation to give an interpretation.
ii.) Suppose we were to conduct a hypothesis test using the same sample that the
confidence interval (143, 175) above is based on, testing $H_0: \mu = 150$ against
$H_a: \mu \neq 150$, using a significance level ($\alpha$) of 0.05. What must the conclusion of
such a test be?
a.) We cannot say because we can only compare confidence intervals with one-tailed tests
b.) To reject the null hypothesis
c.) To not reject the null hypothesis
d.) We cannot say because there is not enough information given
6. Consider a discrete random variable, X, which is the number of classes taken during
the first semester for a freshman at some university. The associated probability mass
function is as follows:
x 3 4 5 6
f (x) .10 .50 .30 ?
i.) Find the missing probability, f(6), from the table. (3 points)
ii.) Find the expected number of classes that a randomly selected freshman takes
their first semester. (5 points)
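Both parts follow from two facts: the probabilities must sum to 1, and the expected value is the sum of x times f(x). A minimal sketch, using exact fractions to avoid floating point rounding (the dictionary name `pmf` is illustrative):

```python
from fractions import Fraction

# Known probabilities from the table; f(6) is the missing entry
pmf = {3: Fraction("0.10"), 4: Fraction("0.50"), 5: Fraction("0.30")}

# i. probabilities must sum to 1
pmf[6] = 1 - sum(pmf.values())

# ii. expected value: sum of x * f(x)
expected = sum(x * p for x, p in pmf.items())

print(f"f(6) = {float(pmf[6]):.2f}, E(X) = {float(expected):.1f}")
```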
7. Which of the following is/are population parameters? (Circle your answer)
d.) both a.) and c.) are correct
e.) none of the above
8. An interaction plot of this data shows lines which are piece-wise parallel. This
indicates which of the following (Circle your answer) :
a. There is no interaction between factors
b. There is interaction between factors
c. The interaction between the factors model is statistically significant
d. Both b. and c. are correct
e. None of the above are correct
9. Consider a continuous random variable X which has the following probability density
function:
$$ f(x) = \begin{cases} 4x^3, & 0 \le x \le 1, \\ 0, & \text{otherwise.} \end{cases} $$
i) Compute the mean $E(X)$ of X. (Write your answer as a fraction.)
ii) Compute the standard deviation of X (Write your answer in terms of a
iii) Compute the cumulative distribution function of X .
iv) Compute $P(1/4 \le X \le 3/4)$. Write your answer as a fraction.
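The four parts can be checked with exact fractions. This sketch assumes the density is f(x) = 4x^3 on the interval from 0 to 1 and zero elsewhere; the helper names `moment` and `cdf` are illustrative and simply encode the integrals, which were worked out by hand.

```python
from fractions import Fraction
from math import sqrt

def moment(k):
    """E(X^k) = integral of x^k * 4x^3 over [0, 1] = 4 / (k + 4)."""
    return Fraction(4, k + 4)

def cdf(x):
    """F(x) = integral of 4t^3 from 0 to x = x^4, for 0 <= x <= 1."""
    return Fraction(x) ** 4

mean = moment(1)
variance = moment(2) - mean ** 2
std_dev = sqrt(variance)                 # square root of the exact variance
prob = cdf(Fraction(3, 4)) - cdf(Fraction(1, 4))

print(mean, variance, prob)
```

Because X is continuous, the probability in part iv is the same whether the endpoints are included or not.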
10. What is the third quartile for the data set 30, 38, 244, 26, 14, 24? (Circle your answer.)
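Quartile conventions differ between textbooks and software, so the sketch below commits to one of them as an assumption: the third quartile is taken as the median of the upper half of the sorted data. Interpolation-based methods can give a different value for a data set this small.

```python
from statistics import median

data = [30, 38, 244, 26, 14, 24]
ordered = sorted(data)                   # [14, 24, 26, 30, 38, 244]

# Convention assumed here: Q3 is the median of the upper half of the sorted data
upper_half = ordered[len(ordered) // 2:]
q3 = median(upper_half)

print(q3)
```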
11. In a study to determine the effect of eating tomatoes and age on cholesterol level, a
researcher selects 200 males in each of three age groups: 30-39 years old, 40-49 years
old, 50-59 years old, for a total of 600 males. Each of the three age groups is further
divided so that half of it (100 males) receives one serving of tomatoes per week and the
other half receives 3 servings of tomatoes per week. The change in cholesterol level after
6 months in the program is recorded.
i. How many treatments are there in this experiment?
ii. Identify the experimental units of this experiment.
iii. In the context of this experiment, what is the interaction effect?
(a) The effect of where the males live.
(b) The effect of how many children the males have.
(c) The effect of whether or not the males are married.
(d) The combined effects of levels of age and levels of number of servings of tomatoes,
over and above their individual effects.
(e) None of the above
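The layout of the study can be enumerated as a two-factor design. This sketch assumes that each pairing of an age group with a number of servings counts as one treatment, which is the usual convention for factorial designs; the list names are illustrative.

```python
from itertools import product

age_groups = ["30-39", "40-49", "50-59"]
servings_per_week = [1, 3]

# Each age-group / servings pairing is one factor-level combination
combinations = list(product(age_groups, servings_per_week))
males_per_combination = 600 // len(combinations)

for age, servings in combinations:
    print(f"age {age}, {servings} serving(s) per week")
print(len(combinations), males_per_combination)
```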
12. Suppose a random variable X has the following cumulative distribution function:
$$ F(x) = \begin{cases}
0,   & x < 1, \\
0.1, & 1 \le x < 2, \\
0.4, & 2 \le x < 3, \\
0.9, & 3 \le x < 4, \\
1,   & x \ge 4.
\end{cases} $$
(a) (4 pts) Find Pr X 3 .
(b) (4 pts) Find Pr X 2.7 .
(c) (4 pts) Find $E(X)$, the expected value of X.
(d) (4 pts) Compute Pr 1 X 3.9 .
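A step-function CDF like this one describes a discrete random variable, and the probability at each point is the size of the jump there. The sketch below recovers the probability mass function and the expected value under two assumptions: the jumps occur at x = 1, 2, 3 and 4, and F reaches 1 at x = 4.

```python
from fractions import Fraction

# CDF values at the jump points; F(4) = 1 is an assumption
cdf_at = {1: Fraction("0.1"), 2: Fraction("0.4"), 3: Fraction("0.9"), 4: Fraction(1)}

# P(X = x) is the size of the jump in F at x
pmf = {}
previous = Fraction(0)
for x, F in cdf_at.items():
    pmf[x] = F - previous
    previous = F

# Expected value: sum of x * P(X = x)
expected = sum(x * p for x, p in pmf.items())

print({x: float(p) for x, p in pmf.items()}, float(expected))
```

With the probability mass function in hand, any interval probability is the sum of the masses at the points inside the interval, taking care over whether each endpoint is included.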