POSTED ON: 12/23/2010
Chapter 16 Random Variables
• A random variable is a variable whose outcome depends on a random event.
• Notation: capital letters such as X, Y, etc. The values X assumes are denoted by lowercase x.

Examples: Random Variables
• The amount of time you wait at a light
• The lifespan of a light bulb
• The amount of money an insurance company will pay on a policy

Random Variables
• Like the distribution of any quantitative variable, the distribution of a random variable has shape, center and spread.
• In fact, we can calculate the mean, or expected value, of a random variable and its standard deviation.

• Example: An insurance company offers a “death and disability” policy that pays $10000 when you die or $5000 if you are permanently disabled, within five years. It charges its customers $50 to enroll in this plan.
• Here: X is a random variable that represents how much the insurance company pays.
• Note: X = $10000 or X = $5000 or X = $0

• Problem: Suppose the death rate in 5 years is 1 out of 1000 people, and that 2 out of 1000 suffer some kind of disability. Is the company making a profit selling such plans?
• The answer is obtained by forming a probability model.

Policyholder outcome   Payout x   P(X=x)
Death                  10000      0.001
Disability             5000       0.002
Neither                0          0.997

Probability Distribution Table for Insurance Example
Insurance Payment   X = $10000   X = $5000   X = $0
Probability         0.001        0.002       0.997

• Suppose the insurance company sells the policy for $50 each to 1000 customers. According to the model we expect 1 to die, 2 to be disabled, and the remaining 997 to be OK.
• Hence the company
– Gets $50000
– pays $10000 to the family of the deceased
– pays $10000 in total to the two disabled persons ($5000 each)
– and profits $30000 ($30 per person profit)

• Example: Suppose you have a computer sales business.
If your client gets two new computers, things are fine. If the client gets one refurbished computer, it will be sent back at your expense of $100 and you can replace it. If both computers are refurbished, the client will cancel the order and you lose $1000. What are the expected value (mean) and standard deviation of your loss?

• Assumptions: In your inventory you found out that someone has stocked 4 refurbished computers with 11 new computers. The shipped computers were randomly selected from this group. For example, P(2 new) = (11/15)(10/14) = 0.524.
• Probability Model:

Outcome      x      P(X=x)
2 new        0      0.524
One Refurb   100    0.419
Two Refurb   1000   0.057

• Expected value of loss
– E(X) = 0(0.524) + 100(0.419) + 1000(0.057) = $98.90
– Var(X) = (0-98.90)^2(0.524) + (100-98.90)^2(0.419) + (1000-98.90)^2(0.057) = 51,408.79 (in squared dollars)
– Std Dev = Sqrt(51,408.79) = $226.735

• Interpretation:
– In the long run it will cost the company an average of $98.90 per order.
– The large standard deviation reflects the fact that there is a pretty large range of possible losses.

• Some properties:
– E(X+c) = E(X) + c
– E(X-c) = E(X) - c
– E(X+Y) = E(X) + E(Y)
– Var(X+c) = Var(X)
– Var(X-c) = Var(X)
– E(aX) = a E(X)
– Var(aX) = a^2 Var(X)
– Var(X+Y) = Var(X) + Var(Y), provided X, Y are independent

• Example: You plan to sell your used Isuzu Trooper and spend your vacation in Kyrgyzstan, where you plan to buy a Honda motor scooter. Used Isuzus sell for a mean price of $6940 with a standard deviation of $250. In Kyrgyzstan, the scooter sells for a mean price of 65,000 Kyrgyzstan soms with a standard deviation of 500 soms. 1 USD = 43 Kyrgyzstan soms. What is your expected profit, if any?
• A = sale price of Isuzu (in dollars)
• B = price of scooter (in soms)
• D = profit (in soms) = 43A-B
• Prices are assumed to be independent
• E(D) = E(43A-B) = 43E(A)-E(B) = 43(6940)-(65000) = 233,420 soms (expected profit)
• Var(D) = Var(43A-B) = 43^2 Var(A) + Var(B) = 43^2(250^2) + 500^2 = 115,812,500
• Std Dev(D) = sqrt(115,812,500) = 10,762 soms.
• In dollars, the expected profit is about $5428 with a std dev of about $250.

Chapter 17 Probability Models
• 1. Bernoulli Trials (properties)
– Two possible outcomes (success and failure)
– The probability of success, denoted by p, is the same on every trial
– The trials are independent
• Example: Suppose 75% of all drivers wear their seatbelts. Find the probability that four of the drivers in five cars waiting at a traffic light are belted.

• 2. The Geometric Model: How many trials are needed to observe the first success?
• Let p = probability of success, q = 1-p = probability of failure, X = the number of trials until the first success occurs.
• Then P(X=x) = q^{x-1}p
• Expected value E(X) = 1/p
• Std deviation of X = Sqrt(q/p^2)

• Example: People with O-negative blood type are called universal donors. Only about 6% of people have O-negative blood. If donors line up at random for a blood drive,
– a) how many do you expect to examine before you find someone who has O-negative blood?
– b) What’s the probability that the first O-negative donor found is one of the first four people in line?

• a) E(X) = 1/0.06 = 16.7. On average a universal donor is found by examining 16.7 people.
• b) P(X <= 4) = P(X=1) + P(X=2) + P(X=3) + P(X=4) = 0.2193

• 3. The Binomial Model: The binomial model counts the successes in a fixed number n of Bernoulli trials, where n is greater than or equal to 2.
• We look for the probability of k successes in n trials. Here k is less than or equal to n.
• Let X = number of successes in n trials.
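The geometric and binomial models introduced above can be sketched in a few lines of Python. This is only an illustrative check, not part of the original slides: the function names `geometric_pmf` and `binomial_pmf` are made up for this sketch, the binomial probability uses the standard formula C(n,k) p^k q^(n-k), and the seatbelt question is read as "exactly four of five drivers are belted", which is an assumption about what the slide intends.

```python
import math

def geometric_pmf(x, p):
    """P(X = x): the first success occurs on trial x (x = 1, 2, ...)."""
    return (1 - p) ** (x - 1) * p

def binomial_pmf(k, n, p):
    """P(X = k): exactly k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# O-negative example: p = 0.06 is taken from the slide.
p_donor = 0.06
expected_trials = 1 / p_donor                      # E(X) = 1/p
std_dev_trials = math.sqrt((1 - p_donor) / p_donor ** 2)
prob_within_four = sum(geometric_pmf(x, p_donor) for x in range(1, 5))

# Seatbelt example, ASSUMING "exactly four of five" is what is asked.
prob_four_belted = binomial_pmf(4, 5, 0.75)

print(f"Expected donors examined: {expected_trials:.1f}")
print(f"Std dev of donors examined: {std_dev_trials:.2f}")
print(f"P(first O-negative donor within 4 people): {prob_within_four:.4f}")
print(f"P(exactly 4 of 5 drivers belted): {prob_four_belted:.4f}")
```

Summing the geometric probabilities for x = 1 to 4 gives the same value as 1 - q^4, which is a quick way to do part b) by hand.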
• P(X=x) = C(n,x)*p^x*q^{n-x}
• Here p = probability of success, q = 1-p = probability of failure.
• E(X) = np
• Std dev of X = Sqrt{npq}

• Example: Suppose 20 people come to the blood drive. What is the probability that there are 2 or 3 universal donors?
Solution:
• P(X=2) = C(20,2)(.06)^{2}(.94)^{18} = 0.2246
• P(X=3) = C(20,3)(.06)^{3}(.94)^{17} = 0.0860
• Answer: 0.2246 + 0.0860 = 0.3106

• To compute C(n,x) on the TI-83, do the following:
• 1. Type your first number (n)
• 2. Press MATH
• 3. Select PRB
• 4. Select nCr
• 5. Type your second number (x) and press ENTER

Examples
1. A basketball player has made 80% of his foul shots during the season. Assuming the shots are independent, find the probability that in tonight’s game he
a) Misses for the first time on his fifth attempt
b) Makes his first basket on his fourth shot

• a) Probability of making a shot = p = 0.8, probability of missing = q = 0.2
• Answer: 0.8^4*0.2 = 0.08192
• b) Answer: 0.2^3*0.8 = 0.0064

2. A certain tennis player makes a successful first serve 70% of the time. Assume that each serve is independent of the others. If she serves 6 times, what’s the probability she gets
a) all 6 serves in?
b) exactly 4 serves in?
c) at least four serves in?
d) no more than 4 serves in?

Binomial model: p = 0.7, q = .3, n = 6
a) C(6,6)*.7^6*.3^0 = 0.118
b) C(6,4)*.7^4*.3^2 = 15*.7^4*.3^2 = .324
c) P(X=4) + P(X=5) + P(X=6):
   C(6,4)*.7^4*.3^2 = 0.324
   C(6,5)*.7^5*.3^1 = 0.303
   C(6,6)*.7^6*.3^0 = 0.118
   Answer: 0.745
d) 1 - (.303+.118) = .579

Probability Models
• Discrete Models
– Bernoulli
– Geometric
– Binomial
– Poisson Model
• Continuous
– Normal Model

Why Continuous Models?
• Example: About 6% of people have O-negative blood. Suppose the Red Cross anticipates the need for at least 1850 units of O-negative blood for this year. It is estimated that 32000 donors will give blood this year. How great is the risk that the Red Cross will fall short of meeting its need?
Answer:
• C(32000,1850)*.06^(1850)*.94^(30150)??
• But that is only the probability of exactly 1850 donors. The Red Cross falls short whenever there are fewer than 1850.
• Many more such terms need to be computed!!!
• We use the Normal Model for approximation.

CHAPTER 6: STANDARD DEVIATION & THE NORMAL MODEL
What is a normal distribution?
• The normal distribution is a pattern for the distribution of a set of data which follows a bell shaped curve.
• This distribution is sometimes called the Gaussian distribution in honor of Carl Friedrich Gauss, a famous mathematician.
• The bell shaped curve has several properties:
• The curve is concentrated in the center and decreases on either side. This means that the data has less of a tendency to produce unusually extreme values, compared to some other distributions.
• The bell shaped curve is symmetric. This tells you that the probability of deviations from the mean is comparable in either direction.

• When you want to describe probability for a continuous variable, you do so by describing a certain area.
• A large area implies a large probability and a small area implies a small probability. Some people don't like this, because it forces them to remember a bit of geometry (or in more complex situations, calculus). But the relationship between probability and area is also useful, because it provides a visual interpretation for probability.
• Here's an example of a bell shaped curve. This represents a normal distribution with a mean of 50 and a standard deviation of 10.
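To make the link between probability and area concrete, here is a minimal Python sketch that computes areas under the bell curve with mean 50 and standard deviation 10. It is an illustration only: the helper names `normal_cdf` and `normal_area` are hypothetical, and the cumulative area is computed from the standard error function `math.erf` rather than from a table or calculator.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def normal_area(lo, hi, mean, sd):
    """Area under the normal curve between lo and hi, i.e. P(lo < X < hi)."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)

mean, sd = 50, 10  # the curve described on the slide

within_one_sd = normal_area(40, 60, mean, sd)   # large area near the center
within_two_sd = normal_area(30, 70, mean, sd)
upper_tail = 1.0 - normal_cdf(70, mean, sd)     # small area far from the center
lower_tail = normal_cdf(30, mean, sd)           # equal to upper_tail by symmetry

print(f"P(40 < X < 60) = {within_one_sd:.4f}")
print(f"P(30 < X < 70) = {within_two_sd:.4f}")
print(f"P(X > 70) = {upper_tail:.4f}, P(X < 30) = {lower_tail:.4f}")
```

The two tail areas come out equal, which is the symmetry property of the bell curve expressed as probabilities.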
Formula
• Standardizing normal variables: convert a value Y to a z-score Z.
• Formula: Z = (Y - μ)/σ, where μ is the mean and σ is the standard deviation of Y.

68-95-99.7 Rule
• About 68% of the observations are within 1 standard deviation of the mean
• About 95% of the observations are within 2 standard deviations of the mean
• About 99.7% of the observations are within 3 standard deviations of the mean
• http://davidmlane.com/hyperstat/normal_distribution.html

Using a normal model to solve problems
• An example using height data and U.S. Marine Corps and Army height requirements
• Global Question: Are the height restrictions set up by the U.S. Army and U.S. Marine Corps more restrictive for men or women, or are they roughly the same?

Data from a National Health Survey
• Heights of adult women are normally distributed with a
– mean of 63.6 in.
– standard deviation of 2.5 in.
• Heights of adult men are normally distributed with a
– mean of 69.0 in.
– standard deviation of 2.8 in.

Height Restrictions
                    Men Minimum   Men Maximum   Women Minimum   Women Maximum
U.S. Army           60 in         80 in         58 in           80 in
U.S. Marine Corps   64 in         78 in         58 in           73 in

Which is more unusual?
A. A 58 in tall woman?
B. A 60 in tall man?

Which is more unusual?
A. A 58 in tall woman?
B. A 64 in tall man?

Which is more unusual?
A. A 73 in tall woman?
B. A 78 in tall man?

Which is more unusual?
A. An 80 in tall woman?
B. An 80 in tall man?

Approximately what percent of U.S. women do you expect to be under 66.1 in tall?
A. 68% B. 95% C. 16% D. 84% E. I have no idea how to do this

Approximately what percent of U.S. women do you expect to be over 68.6 in tall?
A. 5% B. 2.5% C. 1% D. 0.3% E. I have no idea how to do this

Approximately what percent of U.S. women do you expect to be between 66.1 in and 68.6 in tall?
A. 5% B. 10% C. 13.5% D. 16% E. I have no idea how to do this

Approximately what percent of U.S. women do you expect to be under 5 feet tall (60 in)?
A. 3% B. 5% C. 7.5% D. 10% E.
I have no idea how to do this

Example
Some IQ tests are standardized to a Normal model with a mean of 100 and a standard deviation of 16.
a) Describe the 68-95-99.7 rule for this problem
b) About what percent of people should have IQ scores above 116?
c) About what percent of people should have IQ scores between 68 and 84?
d) About what percent of people should have IQ scores above 132?
e) About what percent of people should have IQ scores above 120?
f) About what percent of people should have IQ scores below 90?
g) About what percent of people should have IQ scores between 95 and 130?
h) A person is a genius if his/her IQ belongs to the top 10% of all IQ scores. What minimum IQ score qualifies you to be a genius?

Answers to the Example
Some IQ tests are standardized to a Normal model with a mean of 100 and a standard deviation of 16.
b) 16%
c) 13.5%
d) 2.5%
e) About what percent of people should have IQ scores above 120?
Z = (120-100)/16 = 1.25
Find P(Z > 1.25) from the standard normal chart or your TI calculator.
Answer: 1-.8944 = .1056
f) About what percent of people should have IQ scores below 90?
Z = (90-100)/16 = -0.625
Find P(Z < -0.625) from the standard normal chart or your TI calculator.
Answer: about .27
g) About what percent of people should have IQ scores between 95 and 130?
Z = (95-100)/16 = -0.3125
Z = (130-100)/16 = 1.875
Find P(-0.3125 < Z < 1.875) = .9699-.3783 = .5916
h) A person is a genius if his/her IQ belongs to the top 10% of all IQ scores. What minimum IQ score qualifies you to be a genius?
The top 10% corresponds to the 90th percentile. For the standard normal the 90th percentile is 1.28. Hence solve 1.28 = (Y-100)/16. The value of Y is 120.48.

Example
• In 2006 combined verbal and math SAT scores followed a normal distribution with mean 1020 and standard deviation 240.
• Suppose you know that Peter scored in the top 3% of SAT scores. What was Peter’s approximate SAT score?
• Answer: 1471.2

Some TI-83/84 Commands
To enter data on your TI calculator:
• Press STAT, ENTER (for EDIT)
– If there are old data under L1:
– Press the up arrow, then CLEAR, ENTER
• Enter data values in L1 one at a time, pressing ENTER after each
– If you make an error, use the up or down arrows to highlight the error, then enter the correct value. Use the arrows to get to the bottom of the list for the next value, if necessary.
– Be sure to press ENTER after the last data value.

Some TI-83/84 Commands
To Find One Variable Statistics (mean, median, standard deviation, etc.)
• Press STAT, Right Arrow (for CALC), ENTER
• Press ENTER (for 1-Var Stats)
• Press ENTER again
• Read results
– The Standard Deviation is labeled Sx

Using the TI-83 to Find a Normal Percentage
Always draw a picture!
• The TI-83 provides a function named normalcdf
– Press 2nd, DISTR (found above VARS)
– Scroll to normalcdf( and press ENTER, or press 2.
• If z has a standard normal distribution:
– Percent(a < z < b) = normalcdf( a , b )
– Example: to find P( -1.2 < z < .8 ), press 2nd, DISTR, 2, then -1.2 , .8 )
– Note that the comma between -1.2 and .8 must be entered
– Read .6731
• To find Percent( z < a ), enter normalcdf( -5 , a )
– Example: normalcdf( -5 , 1.96 ) gives .9750
• To find Percent( z > a ), enter normalcdf( a , 5 )
– Example: normalcdf( -1.645 , 5 ) gives .9500

Using the TI-83/84 for Normal Percentages Without Computing z-Scores
We can let the TI find its own z-scores:
– Find Percent(90 < x < 105) if x follows the normal model with mean 100 and standard deviation 15:
• Percent(90 < x < 105) = normalcdf( 90 , 105 , 100 , 15 ) = .378
Notice that this is a time-saver for this type of problem, but that you may still need to be able to compute z-scores for other types of problems!

Suppose We’re Given a Normal Percentage and Need a z-Score?
• IQ scores are distributed normally with a mean of 100 and a standard deviation of 15.
What score do you need to capture the bottom 2%?
– That is, we must find a so that Percent(x < a) = 2% when x has a normal distribution with a mean of 100 and a standard deviation of 15.
– With the TI-83/84: a = invNorm( .02 , 100 , 15 ) = 69.2

Why Continuous Models?
• Example: About 6% of people have O-negative blood. Suppose the Red Cross anticipates the need for at least 1850 units of O-negative blood for this year. It is estimated that 32000 donors will give blood this year. How great is the risk that the Red Cross will fall short of meeting its need?

• The Normal Model comes to the rescue!
• Idea: Approximate the binomial model with a normal model!
• The Binomial Model has
– mean = np = 1920
– std dev. = sqrt(np(1-p)) = 42.48
• Let X = # of O-negative donors

• The problem now is to find P(X < 1850) = P(Z < (1850-1920)/42.48) ~ P(Z < -1.65) ~ 0.05
• Hence there is about a 5% chance that the Red Cross will not meet its goal.

• Question: Can we always approximate the binomial model by a normal model?
• Answer: NO!

• Example: Suppose there is a 20% chance that cereal boxes contain pictures of Tiger Woods. You buy 5 boxes. Let X = # of Tiger Woods pictures you get. What is the distribution of X?

x        0     1     2     3     4     5
P(X=x)   .33   .41   .20   .05   .01   .0003

• Simulation at
– http://www.stat.wvu.edu/SRS/Modules/NormalApprox/normalapprox.html
• A Normal Model is a close approximation of the binomial for a large number of trials
– “Large” is explained by the
• Success/failure Condition: A Binomial Model is approximately normal if we expect at least 10 successes and 10 failures, i.e. np is at least 10 and nq = n(1-p) is at least 10

• Example: A communication monitoring company reports that 91% of e-mail messages are spam. Recently you installed a spam filter.
You observe that over the past week it okayed only 151 of 1422 e-mails you received, classifying the rest as junk. What is the probability that no more than 151 of 1422 e-mails are real messages?

Solution: Let X = # of real messages
p = .09, q = 1-p = .91, n = 1422, x = 151
np = 127.98, sqrt(npq) = 10.79
P(X <= 151) = P(Z < (151-127.98)/10.79) = P(Z < 2.13) = .98
There is a 98% chance that no more than 151 messages among the 1422 received are real messages. The filter appears to be working properly.
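The normal approximation used in the Red Cross and spam-filter examples can be compared with the exact binomial sum in Python. The sketch below is an illustration under stated assumptions, not the method from the slides: the function names are hypothetical, the exact sum is computed in log space with `math.lgamma` so that large n does not overflow, and the approximation uses no continuity correction, matching the hand calculations above.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), summing terms in log space."""
    q = 1.0 - p
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log(q))
        total += math.exp(log_term)
    return total

def normal_approx_cdf(x, n, p):
    """Normal approximation to P(X < x), with mean np and sd sqrt(npq)."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    return NormalDist(mean, sd).cdf(x)

def success_failure_ok(n, p):
    """Success/failure condition: np >= 10 and nq >= 10."""
    return n * p >= 10 and n * (1.0 - p) >= 10

# Spam filter example: n = 1422 e-mails, p = 0.09 real messages.
spam_approx = normal_approx_cdf(151, 1422, 0.09)
spam_exact = binomial_cdf(151, 1422, 0.09)

# Red Cross example: n = 32000 donors, p = 0.06 O-negative.
red_cross_approx = normal_approx_cdf(1850, 32000, 0.06)

print(f"Spam: normal approx = {spam_approx:.4f}, exact = {spam_exact:.4f}")
print(f"Red Cross shortfall risk (normal approx) = {red_cross_approx:.4f}")
print("Condition holds for spam example:", success_failure_ok(1422, 0.09))
print("Condition holds for 5 cereal boxes:", success_failure_ok(5, 0.2))
```

For the cereal-box example the condition fails (np = 1), which is why the binomial table there looks nothing like a bell curve.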