Random Variables

					Cumulative Review


  Review Problems
Problem 1
       Which of the following statements are true?
 I.      All variables can be classified as quantitative or
         categorical variables.
 II.     Categorical variables can be continuous
         variables.
 III.    Quantitative variables can be discrete variables.
        A.   I only
        B.   II only
        C.   III only
        D.   I and II
        E.   I and III
 The correct answer is (E). All variables can be classified as
 quantitative or categorical variables. Discrete variables are indeed a
 category of quantitative variables. Categorical variables, however, are
 not numeric. Therefore, they cannot be classified as continuous
 variables.
Problem 2
 Mr. Kim says he might give two extra credit
 assignments this week. The probability that
 he gives an EC assignment today is 0.4. If he
 gives an EC assignment today, the probability
 that he will give one tomorrow is 0.8. If he
 doesn’t give one today, the probability that he
 gives one tomorrow is 0.3.
   Let the random variable X be the number of EC
   assignments that Mr. Kim gives this week. Find
   the expected value of X.
Problem 2 - Solution
 Let’s create a table to represent the situation:
         Outcome              Assignments (X)                P(X = x)
    No EC assignments                      0                   0.42
      1 EC assignment                      1                   0.26
     2 EC assignments                      2                   0.32

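 Use a tree diagram to find the probabilities:
    Today         Tomorrow       Joint probability      X
    EC (0.4)      EC (0.8)       (0.4)(0.8) = 0.32      2
    EC (0.4)      None (0.2)     (0.4)(0.2) = 0.08      1
    None (0.6)    EC (0.3)       (0.6)(0.3) = 0.18      1
    None (0.6)    None (0.7)     (0.6)(0.7) = 0.42      0
 The two one-assignment paths combine: P(X = 1) = 0.08 + 0.18 = 0.26.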

 E(X) = 0(0.42)+1(0.26)+2(0.32)
 E(X) = 0.9
 You expect Mr. Kim to give an average of 0.9 EC assignments
 per week.
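 The tree-diagram calculation can be sketched in Python as a check. This
 is only an illustration: the variable names are made up for the example,
 and the branch probabilities are the ones stated in the problem.

```python
# Branch probabilities taken from the problem statement.
p_today = 0.4
p_tomorrow_given_today = 0.8
p_tomorrow_given_not_today = 0.3

# Joint probability of each path through the tree, keyed by (today, tomorrow),
# where 1 means an EC assignment was given and 0 means it was not.
paths = {
    (1, 1): p_today * p_tomorrow_given_today,
    (1, 0): p_today * (1 - p_tomorrow_given_today),
    (0, 1): (1 - p_today) * p_tomorrow_given_not_today,
    (0, 0): (1 - p_today) * (1 - p_tomorrow_given_not_today),
}

# Collapse the four paths into the distribution of X = number of assignments.
pmf = {}
for (today, tomorrow), prob in paths.items():
    x = today + tomorrow
    pmf[x] = pmf.get(x, 0.0) + prob

# E(X) is the probability-weighted sum of the values of X.
expected_value = sum(x * prob for x, prob in pmf.items())
print(round(expected_value, 2))
```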
Problem 3
         Which of the following statements are true?
   I.      The mean of a population is denoted by x̄.
   II.     Sample size is never bigger than population size.
   III.    The population mean is a statistic.
          A.   I only
          B.   II only
          C.   III only
          D.   All of the above
          E.   None of the above.
The correct answer is (E), none of the above.
The mean of a population is denoted by μ, not x̄. When sampling with
replacement, sample size can be greater than population size. And the
population mean is a parameter; the sample mean is a statistic.
Problem 4
 A researcher uses a regression equation to predict
 home heating bills (dollar cost), based on home size
 (square feet). The correlation between predicted bills
 and home size is 0.70. What is the correct
 interpretation of this finding?
 A.   70% of the variability in home heating bills can be explained
      by home size.
 B.   49% of the variability in home heating bills can be explained
      by home size.
 C.   For each added square foot of home size, heating bills
      increased by 70 cents.
 D.   For each added square foot of home size, heating bills
      increased by 49 cents.
 E.   None of the above.
Problem 4 - Solution
 The correct answer is (B). The coefficient of
 determination measures the proportion of
 variation in the dependent variable that is
 predictable from the independent variable.
 The coefficient of determination is equal to
 R²; in this case, (0.70)² = 0.49. Therefore, 49%
 of the variability in heating bills can be
 explained by home size.
Problem 5
       In the context of regression analysis, which
       of the following statements are true?
 I.      When the sum of the residuals is greater than
         zero, the model is nonlinear.
 II.     Outliers reduce the coefficient of determination.
 III.    Influential points reduce the correlation
         coefficient.
        A.   I only
        B.   II only
        C.   III only
        D.   I and II only
        E.   I, II, and III
 The correct answer is (B). Outliers reduce the ability of a regression
 model to fit the data, and thus reduce the coefficient of determination
 (r²). The sum of the residuals is always zero, whether the regression
 model is linear or nonlinear. And influential points often increase the
 correlation coefficient (r).
Problem 6
       In the context of regression analysis, which
       of the following statements is true?
 I.      A linear transformation increases the linear
         relationship between variables.
 II.     A logarithmic model is the most effective
         transformation method.
 III.    A residual plot reveals departures from
         linearity.
        A.   I only
        B.   II only
        C.   III only
        D.   I and II only
        E.   I, II, and III
Problem 6 - Solution
 The correct answer is (C). A linear
 transformation neither increases nor decreases
 the linear relationship between variables; it
 preserves the relationship. A nonlinear
 transformation is used to increase the
 relationship between variables. The most
 effective transformation method depends on
 the data being transformed. In some cases, a
 logarithmic model may be more effective than
  other methods; but in other cases it may be less
 effective. Non-random patterns in a residual
 plot suggest a departure from linearity in the
 data being plotted.
Problem 7
 Let the random variable X represent the profit
 made on a randomly selected day by a certain
 store. Assume that X is normal with mean
 $360 and standard deviation $50. We know
 that on a randomly selected day the
 probability is about 0.5 that the store will
 make less than $360. The probability is
 approximately 0.6 that on a randomly selected
  day the store will make less than ______. Solve
 for the missing amount of profit.
Problem 7 - Solution
   Given: E(X) = μ_X = $360 and SD(X) = σ_X = $50
   In order to determine the unknown value, we need to find
   the z-score that corresponds to 0.6
    • Use the calculator: invNorm(0.6) ≈ 0.2533
   Now use the z-score formula to find our missing value.
                          z = (x − μ_X) / σ_X
                     0.2533 = (x − 360) / 50
                      12.67 = x − 360
                     372.67 = x

   On a randomly selected day the store will make less than
   $372.67 approximately 60% of the time.
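   For readers without a graphing calculator, the same lookup can be
   sketched with Python's standard library. `NormalDist.inv_cdf` plays
   the role of invNorm here; the variable names are illustrative only.

```python
from statistics import NormalDist

# Daily profit modeled as normal with mean $360 and SD $50 (from the problem).
profit = NormalDist(mu=360, sigma=50)

# z-score with 60% of the standard normal distribution below it.
z = NormalDist().inv_cdf(0.6)

# Profit level with 60% of days below it; equivalent to 360 + z * 50.
cutoff = profit.inv_cdf(0.6)

print(round(z, 4))
print(round(cutoff, 2))
```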
Problem 8
       Which of the following statements are true?
 I.      A sample survey is an example of an
         experimental study.
 II.     An observational study requires fewer resources
         than an experiment.
 III.    The best method for investigating causal
         relationships is an observational study.
        A.   I only
        B.   II only
        C.   III only
        D.   All of the above.
        E.   None of the above.
Problem 8 - Solution
 The correct answer is (E). In a sample survey,
 the researcher does not assign treatments to
 survey respondents. Therefore, a sample
 survey is not an experimental study; rather, it
 is an observational study. An observational
 study may or may not require fewer resources
 (time, money, manpower) than an
 experiment. The best method for investigating
 causal relationships is an experiment - not an
 observational study - because an experiment
 features randomized assignment of subjects to
 treatment groups.
Problem 9
       Which of the following statements are true?
 I.      Blinding controls for the effects of confounding.
 II.     Randomization controls for effects of lurking
         variables.
 III.    Each factor has one treatment level.
        A.   I only
        B.   II only
        C.   III only
        D.   All of the above.
        E.   None of the above.
Problem 9 - Solution
 The correct answer is (B). By randomly assigning
 experimental units to treatment levels,
 randomization spreads potential effects of lurking
 variables roughly evenly across treatment levels.
 Blinding ensures that participants in control and
 treatment conditions experience the placebo effect
 equally, but it does not guard against confounding.
 And finally, each factor has two or more treatment
 levels. If a factor had only one treatment level, each
 participant in the experiment would get the same
 treatment on that factor. As a result, that factor
 would be confounded with every other factor in the
 experiment.
Problem 10
       Which of the following statements are true?
 I.      A completely randomized design offers no
         control for lurking variables.
 II.     A randomized block design controls for the
         placebo effect.
 III.    In a matched pairs design, participants within
         each pair receive the same treatment.
        A.   I only
        B.   II only
        C.   III only
        D.   All of the above.
        E.   None of the above.
Problem 10 - Solution
 The correct answer is (E). In a completely
 randomized design, experimental units are
 randomly assigned to treatment conditions.
 Randomization provides some control for
 lurking variables. By itself, a randomized
 block design does not control for the placebo
 effect. To control for the placebo effect, the
 experimenter must include a placebo in one of
 the treatment levels. In a matched pairs
 design, experimental units within each pair
 are assigned to different treatment levels.
Problem 11
  A coin is tossed three times. What is the
  probability that it lands on heads exactly one
  time?
 A.   0.125
 B.   0.250
 C.   0.333
 D.   0.375
 E.   0.500
 The correct answer is (D). If you toss a coin three times, there are a
 total of eight possible outcomes. They are: HHH, HHT, HTH, THH, HTT,
 THT, TTH, and TTT. Of the eight possible outcomes, three have exactly
 one head. They are: HTT, THT, and TTH. Therefore, the probability that
 three flips of a coin will produce exactly one head is 3/8 or 0.375.
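 The counting argument can be checked by listing the sample space in
 Python. This is a small sketch that assumes a fair coin, so that all
 eight outcomes are equally likely; the names are illustrative.

```python
from itertools import product

# All sequences of three tosses, e.g. "HHT".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Keep the sequences with exactly one head.
one_head = [outcome for outcome in outcomes if outcome.count("H") == 1]

# Equally likely outcomes, so probability = favorable / total.
probability = len(one_head) / len(outcomes)
print(one_head, probability)
```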
Problem 12
       Which of the following is a discrete random
       variable?
 I.      The average height of a randomly selected
         group of boys.
 II.     The annual number of sweepstakes winners
         from New York City.
 III.    The number of presidential elections in the 20th
         century.
        A.   I only
        B.   II only
        C.   III only
        D.   I and II
        E.   II and III
Problem 12 - Solution
 The correct answer is B. The annual number
 of sweepstakes winners is an integer value
 and it results from a random process; so it is
 a discrete random variable. The average
 height of a group of boys could be a non-
 integer, so it is not a discrete variable. And
 the number of presidential elections in the
 20th century is an integer, but it does not
 vary and it does not result from a random
 process; so it is not a random variable.
Problem 13
 Suppose X and Y are independent random
 variables. The variance of X is equal to 16; and
 the variance of Y is equal to 9. Let Z = X - Y.
 What is the standard deviation of Z?
 (A) 2.65
 (B) 5.00
 (C) 7.00
 (D) 25.0
 (E) It is not possible to answer this question,
 based on the information given.
Problem 13 - Solution
 Suppose X and Y are independent random variables.
 The variance of X is equal to 16; and the variance of Y
 is equal to 9. Let Z = X - Y.
 The solution requires us to recognize that Z is a
 combination of two independent random variables. As
 such, the variances ADD, even though Z is a difference!
 Var(Z) = Var(X) + Var(Y) = 16 + 9 = 25
 SD²(Z) = Var(Z). Therefore, the standard deviation of Z
 is equal to the square root of 25, which is 5. The
 correct answer is (B).
Problem 14
  Which pair of events is most likely to be
  independent?
 a)   having a flat tire and being late for school
 b)   getting an A in math and getting an A in science
 c)   having a driver’s license and having blue eyes
 d)   having a car accident and having 3 inches of
      snow
 e)   being a senior and leaving campus for lunch

The correct answer is C since having blue eyes will
have little effect on having a driver’s license.
Problem 15
  Political analysts estimate the probability
  that a female will run for the next
  presidential election is 45% and the
  probability of the governor of NY running is
  20%. If their decisions are independent,
  what is the probability that only the female
  will run?
 a)   9%     b) 11%     c) 25%      d) 36%      e) 45%

The correct answer is D. Since the decisions are independent,
P(both running) = (.45)(.20) = .09. Using a Venn diagram, we see that
P(only female) = .45 − .09 = .36
Problem 16
  The city council has 6 men and 3 women. If
  we randomly choose two to co-chair a
  committee, what is the probability that they
  will be the same gender?
  a)   4/9    b) 1/2      c) 5/9    d) 5/8     e) 7/8

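The correct answer is B.
Create a tree diagram and determine the probabilities:
P(both men) = (6/9)(5/8) = 30/72 and P(both women) = (3/9)(2/8) = 6/72,
so P(same gender) = 36/72 = 1/2.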
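As a cross-check on the tree diagram, the same probability can be found
by counting pairs. The sketch below uses combinations rather than the
tree, which is a different route to the same answer; the names are
illustrative.

```python
from fractions import Fraction
from math import comb

men, women = 6, 3

# Number of ways to choose any two council members.
total_pairs = comb(men + women, 2)

# Pairs made up of two men or of two women.
same_gender_pairs = comb(men, 2) + comb(women, 2)

p_same = Fraction(same_gender_pairs, total_pairs)
print(p_same)
```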
Problem 17
  Which of the following has a geometric
  model?
 a)   The number of cards of each suit in a 10-card hand
 b)   The number of people we check until we find
      someone with green eyes
 c)   The number of cars inspected until we find three with
      bad mufflers
 d)   The number of Democrats among a group of 20
      randomly chosen registered voters
 e)   The number of aces among the top 10 cards in a well
      shuffled deck
The correct answer is B since we are trying to find the
first person with green eyes
Problem 18
  Which of the following has a binomial
  model?
 a)   The number of cards of each suit in a 10-card hand
 b)   The number of people we check until we find
      someone with green eyes
 c)   The number of cars inspected until we find three with
      bad mufflers
 d)   The number of Democrats among a group of 20
      randomly chosen registered voters
 e)   The number of aces among the top 10 cards in a well
      shuffled deck
The correct answer is D since we are trying to find the
number of Democrats only within a fixed number of 20
Problem 19
  An ice cream stand reports that 12% of the
  cones they sell are “jumbo” size. You want
  to see what a jumbo cone looks like, so you
  watch the stand for a while. What is the
  probability that the first jumbo cone is the
  fourth cone that you see sold?
 a)   8%     b) 33%      c) 40%      d) 60%         e) 93%

The correct answer is A since this is a geometric
distribution. P(X = 4) = (.88)³(.12) ≈ .0817766
Problem 20
  An ice cream stand reports that 12% of the
  cones they sell are “jumbo” size. You want
  to see what a jumbo cone looks like, so you
  watch the stand for a while. What is the
  probability that exactly one of the first six
  cones sold is a jumbo?
 a)   6%     b) 12%      c) 38%      d) 54%        e) 84%

The correct answer is C since this is a binomial
distribution. P(X = 1) = 6(.12)(.88)⁵ ≈ .37997
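The geometric and binomial probabilities in Problems 19 and 20 can be
reproduced with a short Python sketch. The helper names `geometric_pmf`
and `binomial_pmf` are made up for this example, and each sale is
assumed to be independent with a 12% chance of being a jumbo cone.

```python
from math import comb

def geometric_pmf(k, p):
    """P(first success occurs on trial k): k - 1 failures, then a success."""
    return (1 - p) ** (k - 1) * p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_jumbo = 0.12

# Problem 19: the first jumbo cone is the fourth cone sold.
first_jumbo_is_fourth = geometric_pmf(4, p_jumbo)

# Problem 20: exactly one jumbo cone among the first six sold.
exactly_one_in_six = binomial_pmf(1, 6, p_jumbo)

print(round(first_jumbo_is_fourth, 4), round(exactly_one_in_six, 4))
```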
Problem 21
  A friend plans to toss a fair coin 200 times.
  You watch the first 20 tosses and are
  surprised to see 15 heads, but become bored
  and leave. How many heads should you
  expect when she is done with her 200 tosses?
 a)   80    b) 100    c) 105     d) 110     e) 115

The correct answer is C since we know that she already
has 15 heads. We expect her to get 90 heads out of the
next 180 tosses, so we should expect her to get 105
heads.
Problem 22
  On a physical fitness test, middle school boys are
  awarded one point for each push-up and one point for
  each sit-up. National results showed that boys average
  18 push-ups with a standard deviation of 4 and 34 sit-ups
  with a standard deviation of 11. The mean combined
  score of each boy is 18 + 34 = 52. What is the standard
  deviation of their combined scores?
 a)   5.3 b) 11.7 c) 15 d) 137 e) can’t be determined

The correct answer is B since we know the standard
deviations of push-ups and sit-ups. Let P = push-ups
and S = sit-ups. Assuming P and S are independent,
Var(P) = 4² = 16 and Var(S) = 11² = 121
and Var(P + S) = Var(P) + Var(S) = 16 + 121 = 137, so
SD(P + S) = √137 ≈ 11.7
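The rule that variances add can be sketched as a small Python helper.
The function name `sd_of_sum` is hypothetical, and the rule it applies
is only valid when the two variables are independent, which is an
assumption in this problem rather than something the data confirm.

```python
import math

def sd_of_sum(sd_a, sd_b):
    """SD of A + B (or A - B) for independent A and B: variances add."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2)

# Problem 22: push-ups (SD 4) plus sit-ups (SD 11).
combined_sd = sd_of_sum(4, 11)

# Problem 13: Var(X) = 16 and Var(Y) = 9, so the SDs are 4 and 3.
difference_sd = sd_of_sum(4, 3)

print(round(combined_sd, 1), difference_sd)
```

Standard deviations themselves do not add; only the variances do, which
is why the result is about 11.7 rather than 4 + 11 = 15.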
Problem 23 - a
     Police reports about traffic accidents last year indicated
     that 70% of the accidents involve speeding, 20% involve
     alcohol, and 14% involve both.
     a)   What is the probability that an accident involved
          neither?
Solution:
Use a Venn diagram to determine the probabilities:
     Speeding only:          0.70 − 0.14 = 0.56
     Speeding and alcohol:   0.14
     Alcohol only:           0.20 − 0.14 = 0.06
     Neither:                1 − (0.56 + 0.14 + 0.06) = 0.24
So, the probability that the accident involved neither is .24
Problem 23 - b
     Police reports about traffic accidents last year indicated
     that 70% of the accidents involve speeding, 20% involve
     alcohol, and 14% involve both.
     b)   Are the risk factors independent?

Solution:
Use a Venn diagram to determine the probabilities:
     Speeding only:          0.56
     Speeding and alcohol:   0.14
     Alcohol only:           0.06
     Neither:                0.24
If S and A are independent, then P(S ∩ A) = P(S) * P(A).
P(S ∩ A) = .14
P(S) * P(A) = (.7)(.2) = .14
The two values are equal, so the reported figures are consistent
with speeding and alcohol being independent risk factors.
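The arithmetic for both parts of Problem 23 can be checked with exact
fractions, which avoids floating-point rounding when comparing
P(S ∩ A) with P(S) * P(A). This is only a sketch; the variable names
are illustrative.

```python
from fractions import Fraction

# Percentages reported in the problem.
p_speeding = Fraction(70, 100)
p_alcohol = Fraction(20, 100)
p_both = Fraction(14, 100)

# Regions of the Venn diagram.
p_speeding_only = p_speeding - p_both
p_alcohol_only = p_alcohol - p_both
p_neither = 1 - (p_speeding_only + p_both + p_alcohol_only)

# Product rule comparison used in part (b).
product_rule_holds = p_both == p_speeding * p_alcohol

print(float(p_neither), product_rule_holds)
```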
Problem 24
  In a class of 100 students, the grades on a statistics
  test are summarized in the following frequency
  table. What is the median?
               Grade    Frequency
               91–100       11
               81–90        31
               71–80        42
               61–70        16
  a) 80 b) 71 c) 74 d) 75 e) can’t be determined
The correct answer is E. Although we know that the
median is in the interval 71 – 80, we do not know the
actual value of the median.
Problem 25
   For this density curve, what percentage of
   the observations lies above 1.5?

   [Figure: density curve in the shape of a rectangle with height 0.5]

  a) 25% b) 50% c) 85% d) 80% e) can’t be determined
The correct answer is A. Since the height of the
rectangle is 0.5, the base must be 2 in order to have an
area of 1. Therefore, the area to the right of 1.5 must be
(2 − 1.5)(0.5) = 0.25, or 25%
Problem 26
         When creating a scatterplot, one should:
  I.       use the horizontal axis for the response variable.
  II.      use the horizontal axis for the explanatory variable.
  III.     use a different plotting symbol depending on whether the
           explanatory variable is categorical or the response variable is
           categorical.
  IV.      use a plotting scale that makes the overall trend roughly linear.


  a) I only b) II only c) III only d) IV only e) None of these
 The correct answer is B. We always put the explanatory
 variable on the x-axis and the response variable on the
 y-axis.
Problem 27
   A business has two types of employees, managers and
   workers. Managers earn either $100,000 or $200,000 per year.
   Workers earn either $10,000 or $20,000 per year. The number of
   male and female managers at each salary level and the number
   of male and female workers at each salary level are given in the
   tables below.
                 Managers                       Workers
                 Male     Female                           Male      Female
   $100,000      80       20               $10,000         30        20
   $200,000      20       30               $20,000         20        80

   From these data, we may conclude:
  a) that the mean salary of female managers is greater than that of male
        managers.
  b) that the mean salary of males in this business is greater than the mean salary
        of females.
  c) that the mean salary of female workers is greater than that of male workers.
  d) all of the above
  e) None of the above

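The correct answer is D. Female managers average $160,000 versus
$120,000 for male managers, and female workers average $18,000 versus
$14,000 for male workers. Yet males overall average about $84,667
versus about $65,333 for females, because most of the men are managers
and most of the women are workers.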
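The three comparisons in Problem 27 can be verified from the tables
with a short Python sketch. The dictionary layout and the helper name
`mean_salary` are made up for this example; the counts come from the
tables in the problem.

```python
# counts[job][gender] maps each salary to the number of employees earning it.
counts = {
    "managers": {
        "male": {100_000: 80, 200_000: 20},
        "female": {100_000: 20, 200_000: 30},
    },
    "workers": {
        "male": {10_000: 30, 20_000: 20},
        "female": {10_000: 20, 20_000: 80},
    },
}

def mean_salary(tables):
    """Mean salary over one or more salary -> head-count tables."""
    total_pay = sum(salary * n for table in tables for salary, n in table.items())
    people = sum(n for table in tables for n in table.values())
    return total_pay / people

for gender in ("male", "female"):
    managers = mean_salary([counts["managers"][gender]])
    workers = mean_salary([counts["workers"][gender]])
    overall = mean_salary([counts["managers"][gender], counts["workers"][gender]])
    print(gender, managers, workers, round(overall))
```

Women have the higher mean within each job type, but men have the
higher mean overall. This reversal is an instance of Simpson's paradox.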
Problem 28
   Twelve people who suffer from chronic fatigue syndrome
   volunteer to take part in an experiment to see if shark fin
   extract will increase one's energy level. Eight of the volunteers
   are men and four are women. Half of the volunteers are to be
   given shark fin extract twice a day and the other half a placebo
   twice a day. We wish to make sure that four men and two
   women are assigned to each of the treatments, so we decide to
   use a block design with the men forming one block and the
   women the other.
   Suppose one of the researchers is responsible for determining if
   a subject displays an increase in energy level. In this case, we
   should probably
   a) use two placebos.
   b) use stratified sampling to assign subjects to treatments.
   c) use fewer subjects but observe them more frequently.
   d) conduct the study as a double-blind experiment.
   e) None of the above
The correct answer is D.
Problem 29
  Suppose that for a group of consumers, the
  probability of eating pretzels is .75 and that the
  probability of drinking Coke is .65. Further suppose
  that the probability of eating pretzels and drinking
  Coke is .55. Determine if these two events are
  independent.
 If they are independent, then…
 P(eating a pretzel) = P(eating a pretzel | drinking a Coke)
 P(eating a pretzel | drinking a Coke) = .55/.65 ≈ .85
 However, .75 ≠ .85
 Therefore, the events are NOT independent.

 Alternatively, if they are independent, then…
 P(drinking a Coke) = P(drinking a Coke | eating a pretzel)
 P(drinking a Coke | eating a pretzel) = .55/.75 ≈ .73
 However, .65 ≠ .73
 Therefore, the events are NOT independent.
Problem 30
   Students at University X must be in one of the class
   ranks—freshman, sophomore, junior, or senior. At
   University X, 35% of the students are freshmen and
   30% are sophomores. If a student is selected at
   random, the probability that he or she is either a
   junior or a senior is
   a) 30%
   b) 35%
   c) 65%
   d) 70%
   e) None of the above
The correct answer is B, since 100% − 35% − 30% = 35%.
Problem 31 - 34
 Given that the probability of A is ½, the probability for B is 3/5,
 and the probability of both A and B is 1/5.

 Answer the following:

31.   Are the events disjoint?       NO, since P(A ∩ B) = 1/5 ≠ 0.
32.   Are the events independent?    NO, since P(A) * P(B) = 3/10 ≠ 1/5.
33.   P(A ∩ B^C) = 3/10 = .3
34.   P(A^C ∩ B) = 4/10 = .4
Problem 35 - 38
 Given that the probability of A is ½, the probability for B is 3/5,
 and the probability of both A and B is 1/5.

 Answer the following:

35.   P(A^C ∩ B^C) = 1/10 = .1
36.   P(B|A) = 2/5 = .4
37.   P(B^C|A) = 3/5 = .6
38.   P(B^C|A^C) = 1/5 = .2
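The answers to Problems 31 through 38 all follow from the three given
probabilities. A sketch with exact fractions is shown below; the
variable names are illustrative, and each conditional probability uses
the definition P(E | F) = P(E ∩ F) / P(F).

```python
from fractions import Fraction

p_a = Fraction(1, 2)
p_b = Fraction(3, 5)
p_a_and_b = Fraction(1, 5)

# Problems 31 and 32: disjoint means P(A and B) = 0; independent means
# P(A and B) = P(A) * P(B).
disjoint = p_a_and_b == 0
independent = p_a_and_b == p_a * p_b

# Problems 33 to 35: regions of the Venn diagram.
p_a_not_b = p_a - p_a_and_b
p_b_not_a = p_b - p_a_and_b
p_neither = 1 - (p_a + p_b - p_a_and_b)

# Problems 36 to 38: conditional probabilities.
p_b_given_a = p_a_and_b / p_a
p_not_b_given_a = p_a_not_b / p_a
p_not_b_given_not_a = p_neither / (1 - p_a)

print(p_a_not_b, p_b_not_a, p_neither)
print(p_b_given_a, p_not_b_given_a, p_not_b_given_not_a)
```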
Problem 39
  Consider the following probability histogram for a random variable X.




  This probability histogram corresponds to which of the following distributions
  for X?

 a)   X         0          1        2         3         4
      P(X)      0.06       0.25     0.38      0.25      0.06

 b)   X         0          1        2         3         4
      P(X)      0.10       0.25     0.30      0.20      0.15

 c)   X         0          1        2         3         4
      P(X)      0.10       0.25     0.30      0.25      0.10

 d)   X         0          1        2         3         4
      P(X)      0.06       0.25     0.30      0.29      0.10

 e)   None of the above
The correct answer is B.
Problem 40
 Suppose we select an SRS of size n = 100 from a
 large population having proportion p of successes.
 Let X be the number of successes in the sample. For
 which of the following values of p would it be safe
 to assume the distribution of X is approximately
 normal?

 a) 0.01
 b) 0.11
 c) 0.975
 d) 0.999
 e) None of the above
 The correct answer is B. With n = 100 and p = 0.11, both
 np = 11 and n(1 − p) = 89 are at least 10.
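 The check for each answer choice can be sketched in Python. The
 threshold of 10 is the common rule of thumb that np and n(1 − p)
 should both be at least 10; that cutoff is an assumption here, since
 some textbooks use 5 or 15 instead.

```python
n = 100
threshold = 10  # assumed rule-of-thumb cutoff for np and n(1 - p)

candidates = [0.01, 0.11, 0.975, 0.999]

# Keep the values of p where both expected counts reach the threshold.
safe = [p for p in candidates
        if n * p >= threshold and n * (1 - p) >= threshold]

print(safe)
```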
Problem 41
 A teacher asked her 8 introductory statistics students to record
 the total amount of time they spent studying for a particular
 test. The amounts of study time x (in hours) and the resulting
 test grades y are given below.

 X:   2       1       1.5      0.5     1       3        0       2
 Y:   92      81      84       68      72      96       58     84

 Use your calculator to find all the residuals. Report the sum of
 the residuals and the sum of the squares of the residuals.

 The sum of the residuals is 0; the sum of squares of the
 residuals is 121.088
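 Without a calculator, the residuals can be computed from the
 least-squares formulas. The sketch below assumes the intended model is
 the ordinary least-squares line of test grade on study time; the
 variable names are illustrative.

```python
# Study time (hours) and test grades from the problem.
x = [2, 1, 1.5, 0.5, 1, 3, 0, 2]
y = [92, 81, 84, 68, 72, 96, 58, 84]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed grade minus predicted grade.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

sum_of_residuals = sum(residuals)
sum_of_squared_residuals = sum(r ** 2 for r in residuals)

print(round(sum_of_residuals, 6), round(sum_of_squared_residuals, 3))
```

 In floating-point arithmetic the sum of the residuals may come out as
 a tiny nonzero number rather than exactly 0.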
Problem 42
  A concert hall has 2000 seats. There are 1200 seats
  on the main floor and 800 in the balcony. 50% of
  those on the main floor buy a souvenir program and
  40% of those on the balcony buy a souvenir
  program. Assuming that all the seats are occupied,
  what is the probability that a program was
  purchased if an audience member is selected at
  random?
 A.   22.5%
 B.   44%
 C.   45%
 D.   46%
 E.   92%
 Solution:
 Programs sold = (.4)(800) + (.5)(1200) = 320 + 600 = 920
 p = 920/2000 = 0.46 = 46%
 The correct answer is D.

				