# Random Variables


```
Cumulative Review

Review Problems
Problem 1
Which of the following statements are true?
I.      All variables can be classified as quantitative or
categorical variables.
II.     Categorical variables can be continuous
variables.
III.    Quantitative variables can be discrete variables.
A.   I only
B.   II only
C.   III only
D.   I and II
E.   I and III
The correct answer is (E). All variables can be classified as
quantitative or categorical variables. Discrete variables are indeed a
category of quantitative variables. Categorical variables, however, are
not numeric. Therefore, they cannot be classified as continuous variables.
Problem 2
Mr. Kim says he might give two extra credit (EC)
assignments this week. The probability that
he gives an EC assignment today is 0.4. If he
gives an EC assignment today, the probability
that he will give one tomorrow is 0.8. If he
doesn’t give one today, the probability that he
gives one tomorrow is 0.3.
Let the random variable X be the number of EC
assignments that Mr. Kim gives this week. Find
the expected value of X.
Problem 2 - Solution
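Let’s create a table to represent the situation:
Outcome              Assignments (X)    P(X = x)
No EC assignments    0                  0.42
1 EC assignment      1                  0.26
2 EC assignments     2                  0.32

Use a tree diagram to find the probabilities:
Today         Tomorrow      X    Probability
EC (0.4)      EC (0.8)      2    (0.4)(0.8) = 0.32
EC (0.4)      None (0.2)    1    (0.4)(0.2) = 0.08
None (0.6)    EC (0.3)      1    (0.6)(0.3) = 0.18
None (0.6)    None (0.7)    0    (0.6)(0.7) = 0.42
The two branches with exactly one assignment give
P(X = 1) = 0.08 + 0.18 = 0.26.

E(X) = 0(0.42) + 1(0.26) + 2(0.32)
E(X) = 0.9
You expect Mr. Kim to give an average of 0.9 EC assignments
per week.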
Problem 3
Which of the following statements are true?
I.      The mean of a population is denoted by x̄.
II.     Sample size is never bigger than population size.
III.    The population mean is a statistic.
A.   I only
B.   II only
C.   III only
D.   All of the above
E.   None of the above.
The correct answer is (E), none of the above.
The mean of a population is denoted by μ, not x̄. When sampling with
replacement, sample size can be greater than population size. And the
population mean is a parameter; the sample mean is a statistic.
Problem 4
A researcher uses a regression equation to predict
home heating bills (dollar cost), based on home size
(square feet). The correlation between predicted bills
and home size is 0.70. What is the correct
interpretation of this finding?
A.   70% of the variability in home heating bills can be explained
by home size.
B.   49% of the variability in home heating bills can be explained
by home size.
C.   For each added square foot of home size, heating bills
increased by 70 cents.
D.   For each added square foot of home size, heating bills
increased by 49 cents.
E.   None of the above.
Problem 4 - Solution
The correct answer is (B). The coefficient of
determination measures the proportion of
variation in the dependent variable that is
predictable from the independent variable.
The coefficient of determination is equal to
R²; in this case, (0.70)² or 0.49. Therefore, 49%
of the variability in heating bills can be
explained by home size.
Problem 5
In the context of regression analysis, which
of the following statements are true?
I.      When the sum of the residuals is greater than
zero, the model is nonlinear.
II.     Outliers reduce the coefficient of determination.
III.    Influential points reduce the correlation
coefficient.
A.   I only
B.   II only
C.   III only
D.   I and II only
E.   I, II, and III
The correct answer is (B). Outliers reduce the ability of a
regression model to fit the data, and thus reduce the coefficient of
determination (r²). The sum of the residuals is always zero, whether
the regression model is linear or nonlinear. And influential points
often increase the correlation coefficient (r).
Problem 6
In the context of regression analysis, which
of the following statements is true?
I.      A linear transformation increases the linear
relationship between variables.
II.     A logarithmic model is the most effective
transformation method.
III.    A residual plot reveals departures from
linearity.
A.   I only
B.   II only
C.   III only
D.   I and II only
E.   I, II, and III
Problem 6 - Solution
The correct answer is (C). A linear
transformation neither increases nor decreases
the linear relationship between variables; it
preserves the relationship. A nonlinear
transformation is used to increase the
relationship between variables. The most
effective transformation method depends on
the data being transformed. In some cases, a
logarithmic model may be more effective than
other methods; but in other cases it may be less
effective. Non-random patterns in a residual
plot suggest a departure from linearity in the
data being plotted.
Problem 7
Let the random variable X represent the profit
made on a randomly selected day by a certain
store. Assume that X is normal with mean
$360 and standard deviation $50. We know
that on a randomly selected day the
probability is about 0.5 that the store will
make less than $360. The probability is
approximately 0.6 that on a randomly selected
day the store will make less than ______. Solve
for the missing amount of profit.
Problem 7 - Solution
Given: E(X) = μX = $360 and SD(X) = σ = $50
In order to determine the unknown value, we need to find
the z-score that corresponds to a cumulative probability of 0.6.
• Use the calculator: invNorm(0.6) ≈ 0.2533
Now use the z-score formula to find our missing value.
z = (x − μ) / σ
0.2533 = (x − 360) / 50
12.67 ≈ x − 360
372.67 ≈ x

On a randomly selected day the store will make less than
$372.67 approximately 60% of the time.
Problem 8
Which of the following statements are true?
I.      A sample survey is an example of an
experimental study.
II.     An observational study requires fewer resources
than an experiment.
III.    The best method for investigating causal
relationships is an observational study.
A.   I only
B.   II only
C.   III only
D.   All of the above.
E.   None of the above.
Problem 8 - Solution
The correct answer is (E). In a sample survey,
the researcher does not assign treatments to
survey respondents. Therefore, a sample
survey is not an experimental study; rather, it
is an observational study. An observational
study may or may not require fewer resources
(time, money, manpower) than an
experiment. The best method for investigating
causal relationships is an experiment - not an
observational study - because an experiment
features randomized assignment of subjects to
treatment groups.
Problem 9
Which of the following statements are true?
I.      Blinding controls for the effects of confounding.
II.     Randomization controls for effects of lurking
variables.
III.    Each factor has one treatment level.
A.   I only
B.   II only
C.   III only
D.   All of the above.
E.   None of the above.
Problem 9 - Solution
The correct answer is (B). By randomly assigning
experimental units to treatment levels,
randomization spreads potential effects of lurking
variables roughly evenly across treatment levels.
Blinding ensures that participants in control and
treatment conditions experience the placebo effect
equally, but it does not guard against confounding.
And finally, each factor has two or more treatment
levels. If a factor had only one treatment level, each
participant in the experiment would get the same
treatment on that factor. As a result, that factor
would be confounded with every other factor in the
experiment.
Problem 10
Which of the following statements are true?
I.      A completely randomized design offers no
control for lurking variables.
II.     A randomized block design controls for the
placebo effect.
III.    In a matched pairs design, participants within
each pair receive the same treatment.
A.   I only
B.   II only
C.   III only
D.   All of the above.
E.   None of the above.
Problem 10 - Solution
The correct answer is (E). In a completely
randomized design, experimental units are
randomly assigned to treatment conditions.
Randomization provides some control for
lurking variables. By itself, a randomized
block design does not control for the placebo
effect. To control for the placebo effect, the
experimenter must include a placebo in one of
the treatment levels. In a matched pairs
design, experimental units within each pair
are assigned to different treatment levels.
Problem 11
A coin is tossed three times. What is the
probability that it lands on heads exactly one
time?
A.   0.125
B.   0.250
C.   0.333
D.   0.375
E.   0.500
The correct answer is (D). If you toss a coin three times, there are
a total of eight possible outcomes. They are: HHH, HHT, HTH, THH, HTT,
THT, TTH, and TTT. Of the eight possible outcomes, three have exactly
one head. They are: HTT, THT, and TTH. Therefore, the probability that
three flips of a coin will produce exactly one head is 3/8 or 0.375.
Problem 12
Which of the following is a discrete random
variable?
I.      The average height of a randomly selected
group of boys.
II.     The annual number of sweepstakes winners
from New York City.
III.    The number of presidential elections in the 20th
century.
A.   I only
B.   II only
C.   III only
D.   I and II
E.   II and III
Problem 12 - Solution
The correct answer is B. The annual number
of sweepstakes winners is an integer value
and it results from a random process; so it is
a discrete random variable. The average
height of a group of boys could be a non-integer,
so it is not a discrete variable. And
the number of presidential elections in the
20th century is an integer, but it does not
vary and it does not result from a random
process; so it is not a random variable.
Problem 13
Suppose X and Y are independent random
variables. The variance of X is equal to 16; and
the variance of Y is equal to 9. Let Z = X - Y.
What is the standard deviation of Z?
(A) 2.65
(B) 5.00
(C) 7.00
(D) 25.0
(E) It is not possible to answer this question,
based on the information given.
Problem 13 - Solution
Suppose X and Y are independent random variables.
The variance of X is equal to 16; and the variance of Y
is equal to 9. Let Z = X - Y.
The solution requires us to recognize that Variable Z
is a combination of two independent random
variables. As such, the variances ADD, even though Z is a
difference!!!
Var(Z) = Var(X) + Var(Y) = 16 + 9 = 25
SD(Z)² = Var(Z). Therefore, the standard deviation of Z is
equal to the square root of 25, which is 5. The correct answer is (B).
Problem 14
Which of the following pairs of events is most likely
to be independent?
a)   having a flat tire and being late for school
b)   getting an A in math and getting an A in science
c)   having a driver’s license and having blue eyes
d)   having a car accident and having 3 inches of
snow
e)   being a senior and leaving campus for lunch

The correct answer is C since having blue eyes will
have little effect on having a driver’s license.
Problem 15
Political analysts estimate the probability
that a female will run for the next
presidential election is 45% and the
probability of the governor of NY running is
20%. If their decisions are independent,
what is the probability that only the female
will run?
a)   9%     b) 11%     c) 25%      d) 36%      e) 45%

The correct answer is D. Since the decisions are independent,
P(both running) = (.45)(.20) = .09, and using a Venn diagram, we see
that P(only female) = .45 − .09 = .36
Problem 16
The city council has 6 men and 3 women. If
we randomly choose two to co-chair a
committee, what is the probability that they
will be the same gender?
a)   4/9    b) 1/2      c) 5/9    d) 5/8     e) 7/8

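Create a tree diagram and determine the probabilities:
P(both men) = (6/9)(5/8) = 30/72
P(both women) = (3/9)(2/8) = 6/72
P(same gender) = 30/72 + 6/72 = 36/72 = 1/2
The correct answer is B.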
Problem 17
Which of the following has a geometric
model?
a)   The number of cards of each suit in a 10-card hand
b)   The number of people we check until we find
someone with green eyes
c)   The number of cars inspected until we find three with
d)   The number of Democrats among a group of 20
randomly chosen registered voters
e)   The number of aces among the top 10 cards in a well
shuffled deck
The correct answer is B since we are trying to find the
first person with green eyes
Problem 18
Which of the following has a binomial
model?
a)   The number of cards of each suit in a 10-card hand
b)   The number of people we check until we find
someone with green eyes
c)   The number of cars inspected until we find three with
d)   The number of Democrats among a group of 20
randomly chosen registered voters
e)   The number of aces among the top 10 cards in a well
shuffled deck
The correct answer is D since we are counting the number of
Democrats in a fixed number of trials (20 voters)
Problem 19
An ice cream stand reports that 12% of the
cones they sell are “jumbo” size. You want
to see what a jumbo cone looks like, so you
watch the stand for a while. What is the
probability that the first jumbo cone is the
fourth cone that you see sold?
a)   8%     b) 33%      c) 40%      d) 60%         e) 93%

The correct answer is A since this is a geometric
distribution. P(X = 4) ≈ .0817766
Problem 20
An ice cream stand reports that 12% of the
cones they sell are “jumbo” size. You want
to see what a jumbo cone looks like, so you
watch the stand for a while. What is the
probability that exactly one of the first six
cones sold is a jumbo?
a)   6%     b) 12%      c) 38%      d) 54%        e) 84%

The correct answer is C since this is a binomial
distribution. P(X = 1) ≈ .37997
Problem 21
A friend plans to toss a fair coin 200 times.
You watch the first 20 tosses and are
surprised to see 15 heads, but become bored
and leave. How many heads should you
expect when she is done with her 200 tosses?
a)   80    b) 100    c) 105     d) 110     e) 115

The correct answer is C. After the first 20 tosses, she already
has 15 heads. We expect her to get 90 heads out of the
next 180 tosses, so we should expect her to get 105 heads in total.
Problem 22
On a physical fitness test, middle school boys are
awarded one point for each push-up and one point for
each sit-up. National results showed that boys average
18 push-ups with a standard deviation of 4 and 34 sit-ups
with a standard deviation of 11. The mean combined
score of each boy is 18 + 34 = 52. What is the standard
deviation of their combined scores?
a)   5.3 b) 11.7 c) 15 d) 137 e) can’t be determined

The correct answer is B, assuming that push-up and sit-up scores
are independent so that their variances add. Let P = push-ups
and S = sit-ups. Var(P) = 4² = 16 and Var(S) = 11² = 121
and Var(P + S) = Var(P) + Var(S) = 16 + 121 = 137, so
SD(P + S) = √137 ≈ 11.7
Problem 23 - a
Police reports about traffic accidents last year indicated
that 70% of the accidents involve speeding, 20% involve
alcohol, and 14% involve both.
a)   What is the probability that an accident involved
neither?
Solution:
Use a Venn diagram to determine the probabilities:
Speeding only:          0.70 − 0.14 = 0.56
Speeding and alcohol:   0.14
Alcohol only:           0.20 − 0.14 = 0.06
Neither:                1 − (0.56 + 0.14 + 0.06) = 0.24
So, the probability that the accident involved neither is .24
Problem 23 - b
Police reports about traffic accidents last year indicated
that 70% of the accidents involve speeding, 20% involve
alcohol, and 14% involve both.
b)   Are the risk factors independent?

Solution:
Use a Venn diagram to determine the probabilities:
Speeding only:          0.56
Speeding and alcohol:   0.14
Alcohol only:           0.06
Neither:                0.24
If S and A are independent, then P(S ∩ A) = P(S) * P(A).
P(S ∩ A) = .14
P(S)*P(A) = (.7)(.2) = .14
Since the two values are equal, the risk factors are independent.
Problem 24
In a class of 100 students, the grades on a statistics
test are summarized in the following frequency
table. What is the median?
Grade        Frequency
91–100       11
81–90        31
71–80        42
61–70        16
a) 80 b) 71 c) 74 d) 75 e) can’t be determined
The correct answer is E. Although we know that the
median is in the interval 71 – 80, we do not know the
actual value of the median.
Problem 25
For this density curve, what percentage of
the observations lies above 1.5?

[Figure: density curve shaped as a rectangle of height 0.5; image not reproduced]
a) 25% b) 50% c) 85% d) 80% e) can’t be determined
The correct answer is A. Since the height of the
rectangle is 0.5, the base must be 2 in order to have an
area of 1. Therefore, the area to the right of 1.5 must be
25%
Problem 26
When creating a scatterplot, one should:
I.       use the horizontal axis for the response variable
II.      use the horizontal axis for the explanatory variable.
III.     use a different plotting symbol depending on whether the
explanatory variable is categorical or the response variable is
categorical.
IV.      use a plotting scale that makes the overall trend roughly linear.

a) I only b) II only c) III only d) IV only e) None of these
The correct answer is B. We always put the explanatory
variable on the x-axis and the response variable on the
y-axis.
Problem 27
A business has two types of employees, managers and
workers. Managers earn either $100,000 or $200,000 per year.
Workers earn either $10,000 or $20,000 per year. The number of
male and female managers at each salary level and the number
of male and female workers at each salary level are given in the
tables below.
Managers
Salary       Male     Female
$100,000     80       20
$200,000     20       30

Workers
Salary       Male     Female
$10,000      30       20
$20,000      20       80

From these data, we may conclude:
a) that the mean salary of female managers is greater than that of male
managers.
b) that the mean salary of males in this business is greater than the mean salary
of females.
c) that the mean salary of female workers is greater than that of male workers.
d) all of the above
e) None of the above

Problem 28
Twelve people who suffer from chronic fatigue syndrome
volunteer to take part in an experiment to see if shark fin
extract will increase one's energy level. Eight of the volunteers
are men and four are women. Half of the volunteers are to be
given shark fin extract twice a day and the other half a placebo
twice a day. We wish to make sure that four men and two
women are assigned to each of the treatments, so we decide to
use a block design with the men forming one block and the
women the other.
Suppose one of the researchers is responsible for determining if
a subject displays an increase in energy level. In this case, we
should probably
a) use two placebos.
b) use stratified sampling to assign subjects to treatments.
c) use fewer subjects but observe them more frequently.
d) conduct the study as a double-blind experiment.
e) None of the above
Problem 29
Suppose that for a group of consumers, the
probability of eating pretzels is .75 and that the
probability of drinking Coke is .65. Further suppose
that the probability of eating pretzels and drinking
Coke is .55. Determine if these two events are
independent.
If they are independent, then…
P(eating a pretzel) = P(eating a pretzel | drinking a Coke)
P(eating a pretzel | drinking a Coke) = .55/.65 ≈ .85
However, .75 ≠ .85
Therefore, the events are NOT independent.

Alternatively, if they are independent, then…
P(drinking a Coke) = P(drinking a Coke | eating a pretzel)
P(drinking a Coke | eating a pretzel) = .55/.75 ≈ .73
However, .65 ≠ .73
Therefore, the events are NOT independent.
Problem 30
Students at University X must be in one of the class
ranks—freshman, sophomore, junior, or senior. At
University X, 35% of the students are freshmen and
30% are sophomores. If a student is selected at
random, the probability that he or she is either a
junior or a senior is
a) 30%
b) 35%
c) 65%
d) 70%
e) None of the above
Problem 31 - 34
Given that the probability of A is ½, the probability for B is 3/5,
and the probability of both A and B is 1/5.

31.   Are the events disjoint?       NO!!!
32.   Are the events independent?    NO!!!
33.   P(A ∩ B^C) = 3/10 = .3
34.   P(A^C ∩ B) = 4/10 = .4
Problem 35 - 38
Given that the probability of A is ½, the probability for B is 3/5,
and the probability of both A and B is 1/5.

35.   P(A^C ∩ B^C) = 1/10 = .1
36.   P(B|A) = 2/5 = .4
37.   P(B^C|A) = 3/5 = .6
38.   P(B^C|A^C) = 1/5 = .2
Problem 39
Consider the following probability histogram for a random variable X.

This probability histogram corresponds to which of the following distributions
for X?

a)   X         0          1        2         3         4
P(X)      0.06       0.25     0.38      0.25      0.06

b)   X         0          1        2         3         4
P(X)      0.10       0.25     0.30      0.20      0.15

c)   X         0          1        2         3         4
P(X)      0.10       0.25     0.30      0.25      0.10

d)   X         0          1        2         3         4
P(X)      0.06       0.25     0.30      0.29      0.10

e)   None of the above
Problem 40
Suppose we select an SRS of size n = 100 from a
large population having proportion p of successes.
Let X be the number of successes in the sample. For
which of the following values of p would it be safe
to assume the distribution of X is approximately
normal?

a) 0.01
b) 0.11
c) 0.975
d) 0.999
e) None of the above
Problem 41
A teacher asked her 8 introductory statistics students to record
the total amount of time they spent studying for a particular
test. The amounts of study time x (in hours) and the resulting
test grades y are given below.

X:   2       1       1.5      0.5     1       3        0       2
Y:   92      81      84       68      72      96       58     84

Use your calculator to find all the residuals. Report the sum of
the residuals and the sum of the squares of the residuals.

The sum of the residuals is 0; the sum of squares of the
residuals is 121.088
Problem 42
A concert hall has 2000 seats. There are 1200 seats
on the main floor and 800 in the balcony. 50% of
those on the main floor buy a souvenir program and
40% of those on the balcony buy a souvenir
program. Assuming that all the seats are occupied,
what is the probability that a program was
purchased if an audience member is selected at
random?
A.   22.5%
B.   44%
C.   45%
D.   46%
E.   92%
Solution:
E(X) = (.4)(800) + (.5)(1200) = 320 + 600 = 920
p = 920/2000 = 0.46 = 46%
The correct answer is (D).

```
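
Several of the numeric answers above (Problems 2, 19, 20, 22, and 41) can be re-checked with a few lines of code. The following is a minimal sketch in plain Python rather than the calculator steps the review refers to. All variable and function names are illustrative choices, not part of the original problems. The Problem 22 check assumes the two scores are independent, and the Problem 41 check assumes an ordinary least-squares line with an intercept.

```python
from math import comb, sqrt

# Problem 2: multiply along each branch of the tree, then take E(X).
p_today = 0.4                # P(EC today)
p_tomorrow_if_ec = 0.8       # P(EC tomorrow | EC today)
p_tomorrow_if_none = 0.3     # P(EC tomorrow | no EC today)
pmf = {
    2: p_today * p_tomorrow_if_ec,
    1: p_today * (1 - p_tomorrow_if_ec) + (1 - p_today) * p_tomorrow_if_none,
    0: (1 - p_today) * (1 - p_tomorrow_if_none),
}
expected_x = sum(x * p for x, p in pmf.items())

# Problem 19: geometric model, first jumbo cone on the 4th sale.
p_jumbo = 0.12
geometric_p4 = (1 - p_jumbo) ** 3 * p_jumbo

# Problem 20: binomial model, exactly 1 jumbo among 6 cones.
binomial_p1 = comb(6, 1) * p_jumbo * (1 - p_jumbo) ** 5

# Problem 22: variances add only under the independence assumption.
sd_combined = sqrt(4 ** 2 + 11 ** 2)


def least_squares_residuals(xs, ys):
    """Residuals y - (a + b*x) for the least-squares line with intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Problem 41: study time (hours) and test grades from the problem statement.
study_hours = [2, 1, 1.5, 0.5, 1, 3, 0, 2]
grades = [92, 81, 84, 68, 72, 96, 58, 84]
residuals = least_squares_residuals(study_hours, grades)
sum_residuals = sum(residuals)
sum_squared_residuals = sum(r ** 2 for r in residuals)

print(f"Problem 2:  E(X) = {expected_x:.2f}")
print(f"Problem 19: P(X = 4) = {geometric_p4:.5f}")
print(f"Problem 20: P(X = 1) = {binomial_p1:.5f}")
print(f"Problem 22: SD(P + S) = {sd_combined:.1f}")
print(f"Problem 41: sum = {sum_residuals:.6f}, sum of squares = {sum_squared_residuals:.3f}")
```

Because of floating-point rounding, the sum of the residuals prints as a value extremely close to zero rather than an exact 0.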