					  Chapter 16 Random Variables
A random variable is a variable whose
value depends on the outcome of a random event

Notation: Capital letters such as X, Y, etc.

The values X assumes are denoted by lowercase x
    Examples: Random Variables

• The amount of time you wait at a light

• The lifespan of a light bulb

• The amount of money an insurance
  company will pay on a policy
            Random Variables

• Like the distribution of any quantitative
  variable, the distribution of a random variable
  has shape, center and spread.

• In fact, we can calculate the mean, or
  expected value of a random variable and its
  standard deviation.
• Example: An insurance company offers a “death and
  disability” policy that pays $10000 when you die or
  $5000 if you are permanently disabled, within five
  years. It charges its customers $50 to enroll in this
  plan.

• Here: X is a random variable that represents how
  much the insurance company pays.
• Note: X = $10000 or X = $5000 or X = $0
• Problem: Suppose the death rate over 5 years is 1 out of
  1000 people, and that 2 out of 1000 suffer some kind of
  disability. Is the company making a profit selling such
  plans?

• The answer is obtained by forming a probability model.

  Policyholder outcome   Payout x   P(X=x)
  Death                  10000      0.001
  Disability             5000       0.002
  Neither                0          0.997
       Probability Distribution Table for
             Insurance Example
Insurance Payment   X = $10000   X = $5000   X = $0
Probability         0.001        0.002       0.997
• Suppose the insurance company sells the policy for
  $50 each to 1000 customers. According to the model
  we expect 1 to die, 2 to be disabled, and the rest 997
  will be ok.
• Hence the company
   –   Gets $50000
   –   pays $10000 to the family of the deceased
   –   pays $10000 in total to the two disabled persons ($5000 each)
   –   and profits $30000 ($30 per person profit)
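
The same result can be reached per policy by computing the expected payout directly from the probability model. The following is a minimal sketch; the names `payout_model`, `premium`, `expected_payout` and `expected_profit` are illustrative and not part of the original slides.

```python
# Probability model from the insurance example: payout in dollars -> probability.
payout_model = {10000: 0.001, 5000: 0.002, 0: 0.997}

premium = 50  # dollars charged per policy

# Expected payout per policy: sum of (value * probability) over all outcomes.
expected_payout = sum(x * p for x, p in payout_model.items())

# Expected profit per policy is the premium minus the expected payout.
expected_profit = premium - expected_payout

print(f"Expected payout per policy: ${expected_payout:.2f}")
print(f"Expected profit per policy: ${expected_profit:.2f}")
```

Multiplying the per-policy profit by 1000 customers gives the total profit quoted above.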
• Example: Suppose you have a computer sales
  business. If your client gets two new computers,
  things are fine. If the client gets one refurbished
  computer, it will be sent back at your expense of $100
  and you can replace it. If both computers are
  refurbished, the client will cancel the order and you
  lose $1000. What is the expected value (mean) and
  standard deviation of your loss?
• Assumptions: In your inventory you found out that someone
  has stocked 4 refurbished computers with 11 new computers.
  The shipped computers were randomly selected from this
  group.

• Probability Model:

  Outcome      x      P(X=x)
  2 new        0      0.524
  One Refurb   100    0.419
  Two Refurb   1000   0.057
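
The probabilities in this model follow from counting the ways to pick 2 computers out of the 15 in stock. Here is a small sketch of that counting argument, assuming the two shipped computers are drawn without replacement; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

NEW, REFURB = 11, 4  # inventory counts from the example

# Number of equally likely ways to ship 2 of the 15 computers.
total_pairs = comb(NEW + REFURB, 2)

p_two_new = Fraction(comb(NEW, 2), total_pairs)        # both new
p_one_refurb = Fraction(NEW * REFURB, total_pairs)     # one new, one refurbished
p_two_refurb = Fraction(comb(REFURB, 2), total_pairs)  # both refurbished

for label, prob in [("2 new", p_two_new),
                    ("One Refurb", p_one_refurb),
                    ("Two Refurb", p_two_refurb)]:
    print(f"{label:<12}{float(prob):.3f}")
```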
• Expected value of loss

  – E(X) = 0(0.524) + 100(0.419) + 1000(0.057) = $98.90
  – Var(X) = (0-98.90)^2(0.524) + (100-98.90)^2(0.419)
    + (1000-98.90)^2(0.057) = 51,408.79 (in squared dollars)
  – Std Dev = Sqrt(51,408.79) = $226.74
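
The expected value, variance and standard deviation of any discrete probability model can be computed the same way. The helper below is a sketch; `summarize` and `loss_model` are hypothetical names chosen for illustration.

```python
import math

def summarize(model):
    """Return (mean, variance, std dev) of a discrete model {value: probability}."""
    mean = sum(x * p for x, p in model.items())
    variance = sum((x - mean) ** 2 * p for x, p in model.items())
    return mean, variance, math.sqrt(variance)

# Loss in dollars -> probability, from the computer sales example.
loss_model = {0: 0.524, 100: 0.419, 1000: 0.057}

mean, variance, std_dev = summarize(loss_model)
print(f"E(X) = {mean:.2f}, Var(X) = {variance:.2f}, SD(X) = {std_dev:.2f}")
```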
• Interpretation:
  – In the long run it will cost the company an average
    of $98.90 per order

  – The large standard deviation reflects the fact that
    there is a pretty large range of possible losses.
• Some properties:

  –   E(X+c) = E(X) +c
  –   E(X-c) = E(X) – c
  –   E(X+Y)=E(X)+E(Y)
  –   Var(X+c) = Var(X)
  –   Var(X-c) = Var(X)
  –   E(aX) = a E(X)
  –   Var(aX) = a^2Var(X)
  –   Var(X+Y) = Var(X) + Var(Y), provided X, Y are independent
• Example: You plan to sell your used Isuzu
  Trooper and spend your vacation in Kyrgyzstan
  where you plan to buy a Honda motor scooter.
  Used Isuzus sell for a mean price of $6940
  with a standard deviation of $250. In
  Kyrgyzstan, the scooter sells for a mean price
  of 65,000 Kyrgyzstan soms with a standard
  deviation of 500 soms. 1 USD = 43 Kyrgyzstan
  soms. What is your expected profit, if any?
• A = sale price of Isuzu (in dollars)
• B = price of scooter (in soms)
• D = profit (in soms) = 43A-B
• Prices are assumed to be independent
• E(D) = E(43A-B) = 43E(A)-E(B) = 43(6940)-(65000) = 233,420
  soms (expected profit)
• Var(D) = Var(43A-B) = 43^2 Var(A) + Var(B) = 43^2(250^2) + 500^2 = 115,812,500
• Std Dev(D) = sqrt(115,812,500) = 10,762 soms.
• In dollars, the expected profit is about $5428 with a std dev of about $250.
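
The arithmetic for this example can be checked with a few lines of code. This is a sketch that applies the rules E(aX - Y) = aE(X) - E(Y) and, for independent X and Y, Var(aX - Y) = a^2 Var(X) + Var(Y); all variable names are illustrative.

```python
import math

RATE = 43                    # soms per US dollar, as given in the example
mean_a, sd_a = 6940, 250     # Isuzu sale price, in dollars
mean_b, sd_b = 65000, 500    # scooter price, in soms

# Profit D = 43A - B, measured in soms.
mean_d = RATE * mean_a - mean_b
var_d = RATE**2 * sd_a**2 + sd_b**2   # variances add because A and B are independent
sd_d = math.sqrt(var_d)

print(f"E(D) = {mean_d} soms, SD(D) = {sd_d:.0f} soms")
print(f"In dollars: E = {mean_d / RATE:.0f}, SD = {sd_d / RATE:.0f}")
```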
   Chapter 17 Probability Models
• 1. Bernoulli Trials (properties)
  – Two possible outcomes (success and failure)
  – The probability of success, denoted by p, is the
    same on every trial
  – The trials are independent
• Example: Suppose 75% of all drivers wear
  their seatbelts. What is the probability that
  four of the drivers in five cars waiting at a
  traffic light are belted?
• 2. The Geometric Model: How many trials are
  needed to observe the first success?

• Let p = probability of success, q = 1-p = probability
  of failure, and X = the number of trials until the first
  success occurs.

• Then P(X=x) = q^{x-1}p
• Expected value E(X) = 1/p
• Std deviation of X = Sqrt(q/p^2)
• Example: People with O-negative blood type are
  called universal donors. Only about 6% of people
  have O-negative blood. If donors line up at random
  for a blood drive,
   – a) how many do you expect to examine before you find
     someone who has O-negative blood?
   – b) what is the probability that the first O-negative donor
     found is one of the first four people in line?
• a) E(X) = 1/0.06 = 16.7. On average a universal
  donor is found by examining 16.7 people.

• b) P(X <= 4) = P(X=1)+ P(X=2)+ P(X=3)+ P(X=4)
  = 0.2193
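
Both parts can be reproduced from the geometric model formulas. The sketch below uses a hypothetical helper `geometric_pmf` that implements P(X=x) = q^{x-1}p.

```python
def geometric_pmf(x, p):
    """P(first success occurs on trial x) for success probability p."""
    return (1 - p) ** (x - 1) * p

p = 0.06  # probability that a donor is O-negative

# Part a: expected number of donors examined until the first success.
expected_trials = 1 / p

# Part b: probability the first O-negative donor is among the first four.
prob_within_four = sum(geometric_pmf(x, p) for x in range(1, 5))

print(f"E(X) = {expected_trials:.1f}")
print(f"P(X <= 4) = {prob_within_four:.4f}")
```

The sum in part b equals 1 - q^4, the complement of four failures in a row.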
• 3. The Binomial Model: The binomial model counts the
  number of successes in a fixed number n of Bernoulli
  trials, where n is greater than or equal to 2.

• We look for the probability of k successes in n
  trials. Here k is less than or equal to n.
• Let X = number of successes in n trials.
• P(X=x) = C(n,x)*p^x*q^{n-x}
• Here p = probability of success, q = 1-p =
  probability of failure.

• E(X) = np

• Std dev of X = Sqrt{npq}
• Example: Suppose 20 people come to the
  blood drive. What is the probability that there
  are 2 or 3 universal donors?
                    Solution:
• P(X=2) = C(20,2)(.06)^{2}(.94)^{18}= 0.2246
• P(X=3) = C(20,3)(.06)^{3}(.94)^{17}= 0.0860
• Answer: 0.3106
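
As a cross-check, the binomial formula P(X=x) = C(n,x)*p^x*q^{n-x} can be evaluated in code. This is a minimal sketch; `binom_pmf` is an illustrative helper name, and `math.comb` plays the role of C(n,x).

```python
from math import comb

def binom_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 20, 0.06  # 20 donors, 6% are O-negative

p_two = binom_pmf(2, n, p)
p_three = binom_pmf(3, n, p)
p_two_or_three = p_two + p_three

print(f"P(X=2) = {p_two:.4f}")
print(f"P(X=3) = {p_three:.4f}")
print(f"P(X=2 or X=3) = {p_two_or_three:.4f}")
```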
• To compute the C(n,x) number from the TI-83
  do the following

•   1. Type your number
•   2. Press MATH
•   3. Select PRB
•   4. Select nCr
•   5. Type your second number
                  Examples
1. A basketball player has made 80% of his foul
   shots during the season. Assuming the shots
   are independent, find the probability that in
   tonight’s game he

a) Misses for the first time on his fifth attempt
b) Makes his first basket on his fourth shot
                   Examples
• a) success probability = p = 0.8, failure
  probability = q = 0.2
• Answer: 0.8^4*0.2=0.08192



• b) Answer: 0.2^3*0.8 = 0.0064
                  Examples
2. A certain tennis player makes a successful first
    serve 70% of the time. Assume that each
    serve is independent of the others. If she
    serves 6 times, what’s the probability she
    gets
a) all 6 serves in
b) exactly 4 serves in?
c) at least four serves in?
d) no more than 4 serves in?
                     Examples
2. Binomial model: p = 0.7, q = 0.3, n = 6

a) C(6,6)*.7^6*.3^0 = 0.118
b) C(6,4)*.7^4*.3^2 = 15*.7^4*.3^2 = .324
c) C(6,4)*.7^4*.3^2 = 0.324
   C(6,5)*.7^5*.3^1 = 0.303
   C(6,6)*.7^6*.3^0 = 0.118
Answer: 0.324 + 0.303 + 0.118 = 0.745

d) 1 – (.303+.118) = .579
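
The four tennis answers can be recomputed from the binomial model without intermediate rounding. The following is a sketch with an illustrative helper `binom_pmf`; small differences in the third decimal place compared with hand calculations come from rounding each term before adding.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 6, 0.7  # six serves, 70% success rate

all_six = binom_pmf(6, n, p)                                     # part a
exactly_four = binom_pmf(4, n, p)                                # part b
at_least_four = sum(binom_pmf(k, n, p) for k in range(4, n + 1)) # part c
at_most_four = sum(binom_pmf(k, n, p) for k in range(0, 5))      # part d

for label, value in [("a", all_six), ("b", exactly_four),
                     ("c", at_least_four), ("d", at_most_four)]:
    print(f"{label}) {value:.3f}")
```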
            Probability Models
•   Discrete Models
    – Bernoulli
    – Geometric
    – Binomial
    – Poisson Model


•   Continuous
    – Normal Model
      Why Continuous Models?
• Example: About 6% of people have O-negative
  blood. Suppose the Red Cross anticipates the
  need for at least 1850 units of O-negative
  blood for this year. It is estimated that 32000
  donors will give blood this year. How great is
  the risk that the Red Cross will fall short of
  meeting its need?
                  Answer:
• C(32000,1850)*.06^(1850)*.94^(30150)??

• But that is only P(X = 1850). The problem asks for at
  least 1850, so the risk of falling short is P(X < 1850).

• Many more such terms need to be computed and added!
• We use the Normal Model for approximation.
  CHAPTER 6: STANDARD DEVIATION & THE NORMAL
                    MODEL
              Chapter 6. What is a normal distribution?
• The normal distribution is a pattern for the distribution of a set
  of data which follows a bell shaped curve.

• This distribution is sometimes called the Gaussian distribution
  in honor of Carl Friedrich Gauss, a famous mathematician.

• The bell shaped curve has several properties:

• The curve is concentrated in the center and decreases on either
  side. This means that the data has less of a tendency to
  produce unusually extreme values, compared to some other
  distributions.

• The bell shaped curve is symmetric. This tells you that the
  probability of deviations from the mean is comparable in
  either direction.
• When you want to describe probability for a
  continuous variable, you do so by describing a
  certain area.

• A large area implies a large probability and a small
  area implies a small probability. Some people don't
  like this, because it forces them to remember a bit of
  geometry (or in more complex situations, calculus).
  But the relationship between probability and area is
  also useful, because it provides a visual
  interpretation for probability.

• Here's an example of a bell shaped curve. This
  represents a normal distribution with a mean of 50
  and a standard deviation of 10.
                   Formula
• Standardizing normal variables:

• Formula: Z = (Y - μ)/σ, where μ is the mean and σ is the
  standard deviation of Y
              68-95-99.7 Rule
• 68% of the observations are within 1 standard
  deviation of the mean

• 95% of the observations are within 2 standard
  deviations of the mean

• 99.7% of the observations are within 3 standard
  deviations of the mean

• http://davidmlane.com/hyperstat/normal_distribution.html
   Using a normal model to solve
             problems
• An example using height data and U.S. Marine
  Corps and Army height requirements

• Global Question: Are the height restrictions
  set up by the U.S. Army and U.S. Marine Corps
  more restrictive for men or women or are they
  roughly the same?


Data from a National Health Survey
• Heights of adult women are normally
  distributed with a
  – mean of 63.6 in.
  – standard deviation of 2.5 in.
• Heights of adult men are normally distributed
  with a
  – mean of 69.0 in.
  – standard deviation of 2.8 in.

                  Height Restrictions
                    Men Minimum   Men Maximum   Women Minimum   Women Maximum
U.S. Army           60 in         80 in         58 in           80 in
U.S. Marine Corps   64 in         78 in         58 in           73 in
       Which is more unusual?

• A 58in tall woman?

• A 60in tall man?

       Which is more unusual?

A. A 58in tall woman?

B. A 64in tall man?

       Which is more unusual?

A. A 73in tall woman?

B. A 78in tall man?

        Which is more unusual?

A. An 80 in tall woman?

B. An 80 in tall man?
 Approximately what percent of U.S. women do
      you expect to be under 66.1 in tall?

A. 68%
B. 95%
C. 16%
D. 84%
E. I have no idea how to do this

  Approximately what percent of U.S. women
     do you expect to be over 68.6in tall?

A. 5%
B. 2.5%
C. 1%
D. 0.3%
E. I have no idea how to do this

     Approximately what percent of U.S. women
      do you expect to be between 66.1in and
                    68.6in tall?

A. 5%
B. 10%
C. 13.5%
D. 16%
E. I have no idea how to do this

     Approximately what percent of U.S. women
     do you expect to be under 5 feet tall (60in)?

A. 3%
B. 5%
C. 7.5%
D. 10%
E. I have no idea how to do this
                               Example
Some IQ tests are standardized to a Normal model with a mean of 100 and a
  standard deviation of 16.

a) Describe the 68-95-99.7 rule for this problem

b) About what percent of people should have IQ scores above 116?

c) About what percent of people should have IQ scores between 68 and 84?

d) About what percent of people should have IQ scores above 132?

e) About what percent of people should have IQ scores above 120?

f) About what percent of people should have IQ scores below 90?

g) About what percent of people should have IQ scores between 95 and 130?

h) A person is a genius if his/her IQ belongs to the top 10% of all IQ scores.
    What minimum IQ score qualifies you to be a genius?
                   Answers to the Example
Some IQ tests are standardized to a Normal model with a mean of 100 and a standard deviation of 16.

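a) About 68% of IQ scores fall between 84 and 116, about 95% between 68 and 132,
   and about 99.7% between 52 and 148.

b) 16%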

c) 13.5%

d) 2.5%

e) About what percent of people should have IQ scores above 120?
      Z = (120 -100)/16 = 1.25
      Find P(Z > 1.25) from standard normal chart or your TI calculator.
      Answer: 1-.8944 = .1056
f) About what percent of people should have IQ scores below 90?
      Z = (90 -100)/16 = -0.625
      Find P(Z < -0.625) from standard normal chart or your TI calculator.
      Answer: .26
g) About what percent of people should have IQ scores between 95 and 130?
      Z = (95 -100)/16 = -0.3125
      Z = (130 -100)/16 = 1.875

      Find P(-0.3125< Z < 1.875) = .9699-.3783 = .5916

h) A person is a genius if his/her IQ belongs to the top 10% of all IQ scores. What
minimum IQ score qualifies you to be a genius?

The top 10% corresponds to the 90th percentile. For the standard normal the 90th
percentile is 1.28. Hence solve 1.28 = (Y-100)/16. The value of Y is 120.48.
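
Parts e through h can also be checked in software instead of a normal chart or calculator. The sketch below assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative. Results differ slightly from chart-based answers because the chart rounds z to two decimal places.

```python
from statistics import NormalDist

# Normal model for IQ scores: mean 100, standard deviation 16.
iq = NormalDist(mu=100, sigma=16)

above_120 = 1 - iq.cdf(120)                # part e: P(Y > 120)
below_90 = iq.cdf(90)                      # part f: P(Y < 90)
between_95_130 = iq.cdf(130) - iq.cdf(95)  # part g: P(95 < Y < 130)
genius_cutoff = iq.inv_cdf(0.90)           # part h: 90th percentile

print(f"e) {above_120:.4f}")
print(f"f) {below_90:.4f}")
print(f"g) {between_95_130:.4f}")
print(f"h) {genius_cutoff:.1f}")
```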
                    Example
• In 2006 combined verbal and math SAT scores
  followed a normal distribution with mean 1020 and
  standard deviation 240.


• Suppose you know that Peter scored in the top 3% of
  SAT scores. What was Peter’s approximate SAT score?

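• Answer: The top 3% corresponds to the 97th percentile,
  where z is about 1.88. Solving 1.88 = (Y-1020)/240
  gives Y = 1471.2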
            Some TI-83/84 Commands
              To enter data on your TI calculator:

•       Press STAT, Enter (for EDIT)
    –     If there are old data under L1:

    –     Press the up arrow, then CLEAR, ENTER


•       Enter data values in L1 one at a time, pressing
        ENTER after each
    –     If you make an error, use the up or down arrows to highlight the
          error, then enter the correct value. Use the arrows to get to the
          bottom of the list for the next value, if necessary.

    –     Be sure to press ENTER after the last data value.
            Some TI-83/84 Commands
     To Find One Variable Statistics (mean, median, standard
                           deviation, etc)

• Press STAT, Right Arrow (for CALC), ENTER

• Press ENTER (for 1-Var Stats)

• Press ENTER again

• Read results
   – The Standard Deviation is labeled Sx


           Using the TI-83 to Find a Normal Percentage
• Always draw a picture!
• The TI-83 provides a function named normalcdf
   – Press 2nd, DISTR (found above VARS)
   – Scroll to normalcdf( and press ENTER, or press 2.
• If z has a standard normal distribution:
   – Percent(a < z < b) = normalcdf( a , b )
   – Example: to find P( -1.2 < z < .8 ),
     press 2nd, DISTR, 2, then -1.2 , .8 )
   – Note that the comma between -1.2 and .8 must be entered
   – Read .6731
• To find Percent( z < a ), enter normalcdf( -5 , a )
   – Example: normalcdf( -5 , 1.96 ) gives .9750
• To find Percent( z > a ), enter normalcdf( a , 5 )
   – Example: normalcdf( -1.645 , 5 ) gives .9500
        Using the TI-83/84 for Normal Percentages Without
                        Computing z-Scores
We can let the TI find its own z-scores:
 – Find Percent(90 < x < 105) if x follows the normal model with mean 100 and
   standard deviation 15:
     • Percent(90 < x < 105) = normalcdf( 90 , 105 , 100 , 15)
                             = .378




Notice that this is a time-saver for this type of problem, but that you may still need
to be able to compute z-scores for other types of problems!
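
For readers without a TI calculator, the same percentages can be obtained in software. The function below is a sketch that mimics the calculator's normalcdf argument order (lower, upper, mean, standard deviation) using Python's `statistics.NormalDist`; it is an illustrative stand-in, not the calculator's own implementation.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Standard normal examples from the calculator instructions.
print(f"{normalcdf(-1.2, 0.8):.4f}")
print(f"{normalcdf(-5, 1.96):.4f}")
print(f"{normalcdf(-1.645, 5):.4f}")

# Without computing z-scores: mean 100, standard deviation 15.
print(f"{normalcdf(90, 105, 100, 15):.3f}")
```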

        Suppose We’re Given a normal
       Percentage and Need A z-score?
• IQ scores are distributed normally with a
  mean of 100 and a standard deviation of 15.
  What score do you need to capture the
  bottom 2%?
  – That is, we must find a so that Percent(x < a) = 2% when x
    has a normal distribution with a mean of 100 and a
    standard deviation of 15.
  – With the TI 83/84:
       a = invNorm( .02, 100 , 15) = 69.2
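
The inverse lookup can be sketched in software as well. The helper `inv_norm` below is a hypothetical name that follows the calculator's argument order (area to the left, mean, standard deviation) and relies on `statistics.NormalDist.inv_cdf`.

```python
from statistics import NormalDist

def inv_norm(area, mean=0.0, sd=1.0):
    """Value a such that Percent(x < a) equals the given area."""
    return NormalDist(mean, sd).inv_cdf(area)

# Score that marks off the bottom 2% of IQ scores (mean 100, sd 15).
cutoff = inv_norm(0.02, 100, 15)
print(f"a = {cutoff:.1f}")
```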


      Why Continuous Models?
• Example: About 6% of people have O-negative
  blood. Suppose the Red Cross anticipates the
  need for at least 1850 units of O-negative
  blood for this year. It is estimated that 32000
  donors will give blood this year. How great is
  the risk that the Red Cross will fall short of
  meeting its need?
• The Normal Model comes to the rescue!
• Idea: Approximate the binomial model with a
  normal model!

• The Binomial Model has
  – mean = np = 1920
  – std dev. = sqrt(np(1-p))= 42.48
• Let X = # of O-negative donors
• The problem now is to find

  P(X < 1850) = P(Z < (1850-1920)/42.48) ~

  P(Z < -1.65) ~ 0.05
• Hence there is a 5% chance that the Red Cross
  will not meet its goal.
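
The normal approximation for the Red Cross example can be reproduced step by step. This is a sketch under the same assumptions as the slides (no continuity correction); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 32000, 0.06   # expected donors and proportion with O-negative blood
need = 1850          # units of O-negative blood required

mean = n * p                 # binomial mean np
sd = sqrt(n * p * (1 - p))   # binomial standard deviation sqrt(npq)

# Standardize the cutoff and look up the area to its left.
z = (need - mean) / sd
risk = NormalDist().cdf(z)

print(f"mean = {mean:.0f}, sd = {sd:.2f}, z = {z:.2f}")
print(f"P(X < {need}) is about {risk:.3f}")
```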
• Question: Can we always approximate the
  binomial model by a normal model?

• Answer: NO!
• Example: Suppose there is a 20% chance that cereal
  boxes contain pictures of Tiger Woods. You buy 5
  boxes. Let X = # of Tiger Woods pictures you get.
  What is the distribution of X?


        X           0     1     2     3     4     5

     P(X=x)        .33   .41   .20   .05   .01   .00
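
The distribution of X here is binomial with n = 5 and p = 0.2, so each entry can be computed from the binomial formula. The sketch below uses an illustrative helper `binom_pmf`; the strongly right-skewed shape it prints is the reason a normal model fits poorly in this case.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.2  # five boxes, 20% chance of a picture in each

distribution = [binom_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(distribution):
    print(f"P(X={x}) = {prob:.2f}")
```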
• Simulation at

   – http://www.stat.wvu.edu/SRS/Modules/NormalApprox/normalapprox.html

• A Normal Model is a close approximation of the binomial
  for a large number of trials
   – “Large” is explained by the
      • Success/failure Condition: A Binomial Model is approximately
        normal if we expect at least 10 successes and 10 failures,
        i.e. np is at least 10 and nq = n(1-p) is at least 10
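
The Success/failure Condition is easy to check in code. The function below is a sketch; `normal_approx_ok` and its `threshold` parameter are hypothetical names introduced for illustration.

```python
def normal_approx_ok(n, p, threshold=10):
    """Check the Success/failure Condition: np >= 10 and n(1-p) >= 10."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

# Cereal boxes: only 1 expected success, so the condition fails.
print(normal_approx_ok(5, 0.2))

# Blood donors: 1920 expected successes and 30080 expected failures.
print(normal_approx_ok(32000, 0.06))
```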
• Example: A communication monitoring
  company reports that 91% of e-mail messages
  are spam. Recently you installed a spam filter.
  You observe that over the past week it okayed
  only 151 of 1422 e-mails you received,
  classifying the rest as junk. What is the
  probability that no more than 151 of 1422
  e-mails are real messages?
                              Solution:

Let X = # of real messages
p = .09, q = 1-p = .91, n = 1422, x = 151
np = 127.98, sqrt(npq) = 10.79

P(X <= 151) = P( Z < (151-127.98)/10.79)
            = P( Z < 2.13) = .98

There is a 98% chance that no more than 151 messages
among the 1422 received are real messages. The filter is
working properly.
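
The spam-filter calculation follows the same normal-approximation recipe. Here is a sketch of it, again without a continuity correction to match the hand calculation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1422, 0.09   # e-mails received and proportion that are real (1 - 0.91)
observed = 151      # messages the filter let through

mean = n * p
sd = sqrt(n * p * (1 - p))

# Both np and nq are far above 10, so the normal model is reasonable here.
z = (observed - mean) / sd
prob = NormalDist().cdf(z)

print(f"np = {mean:.2f}, sqrt(npq) = {sd:.2f}, z = {z:.2f}")
print(f"P(X <= {observed}) is about {prob:.2f}")
```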

				