					Introduction to Probability
                  and Risk
   Theoretical, or a priori probability – based on
    a model in which all outcomes are equally
    likely. Probability of a die landing on a 2 = 1/6
   Empirical probability – base the probability on
    the results of observations or experiments. If
    it rains an average of 100 days a year, we
    might say the probability of rain on any one
    day is 100/365.
   Subjective (personal) probability – use
    personal judgment or intuition. If you go to
    college today, you will be more successful in
    the future.
   Suppose there are M possible outcomes for
    one process and N possible outcomes for a
    second process. The total number of
    possible outcomes for the two processes
    combined is M x N.
   How many outcomes are possible when you
    roll two dice?
   A restaurant menu offers two choices for an
    appetizer, five choices for a main course, and
    three choices for a dessert. How many
    different three-course meals?
   A college offers 12 natural science classes,
    15 social science classes, 10 English classes,
    and 8 fine arts classes. How many choices?
   Let’s try to solve these:
    ◦ A license plate has 7 digits, each digit
      being 0-9. How many possible outcomes?

    ◦ What if the license plate allows digits 0-9
      and letters A-Z?

    ◦ How many zip codes in the US? In
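The counting questions above can be checked with a short Python sketch of the M x N rule. The helper name `count_outcomes` is made up for illustration, and the license-plate counts assume that characters may repeat and that both plates have 7 positions.

```python
from math import prod

def count_outcomes(*choices_per_step):
    """Multiply the number of choices at each step (the M x N rule)."""
    return prod(choices_per_step)

two_dice = count_outcomes(6, 6)              # two dice, 6 faces each
meals = count_outcomes(2, 5, 3)              # appetizer x main course x dessert
digit_plates = count_outcomes(*[10] * 7)     # 7 positions, digits 0-9 (assumes repeats allowed)
alnum_plates = count_outcomes(*[36] * 7)     # 7 positions, digits 0-9 plus letters A-Z

print(two_dice, meals, digit_plates, alnum_plates)
```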
   P(A) = (number of ways A can occur) / (total
    number of outcomes)
   Probability of a coin toss landing heads: 1/2
   Probability of rolling a 7 using two dice: 6/36
   Probability that a family with 3 children will have
    two boys and one girl: 3/8 (of the 8 equally likely
    outcomes BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG,
    three have exactly two boys: BBG, BGB, GBB)
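One way to check theoretical probabilities like these is to list every equally likely outcome and count the ones that match. This is a minimal sketch; the helper name `probability` is an illustrative choice, and boys and girls are assumed equally likely, as in the example above.

```python
from fractions import Fraction
from itertools import product

def probability(outcomes, event):
    """P(A) = (number of ways A can occur) / (total number of outcomes)."""
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

# All 36 rolls of two dice; count those that sum to 7.
p_seven = probability(product(range(1, 7), repeat=2), lambda roll: sum(roll) == 7)

# All 8 birth orders for three children; count those with exactly two boys.
p_two_boys = probability(product("BG", repeat=3), lambda kids: kids.count("B") == 2)

print(p_seven, p_two_boys)
```

`Fraction` reduces 6/36 to 1/6 automatically.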
   Probability based on observations or experiments
   Records indicate that a river has crested
    above flood level just four times in the past
    2000 years. What is the empirical probability
    that the river will crest above flood level next year?
    4/2000 = 1/500 = 0.002
   What if we were to toss 2 coins? What are the
    theoretical probabilities of a two-coin toss?
    ◦ HH, HT, TH, TT – 4 possibilities, so each is 1/4

   Now let’s toss 2 coins 10 times and observe
    the results (empirical results)

   Compare the theoretical results to the empirical results
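If no coins are handy, the two-coin experiment can be simulated. This is only a sketch: the function name `toss_two_coins` and the seed value are arbitrary choices, and a different seed will give different empirical results.

```python
import random
from collections import Counter

def toss_two_coins(trials, seed=None):
    """Toss two fair coins `trials` times and count each outcome (HH, HT, TH, TT)."""
    rng = random.Random(seed)
    results = Counter()
    for _ in range(trials):
        results[rng.choice("HT") + rng.choice("HT")] += 1
    return results

counts = toss_two_coins(10, seed=1)  # seed chosen arbitrarily so the run is repeatable
for outcome in ("HH", "HT", "TH", "TT"):
    print(outcome, "empirical:", counts[outcome] / 10, "theoretical:", 0.25)
```

With only 10 tosses, the empirical fractions will usually differ noticeably from 1/4.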
   P(not A) = 1 - P(A)

   If the probability of rolling a 7 with two dice
    is 6/36, then the probability of not rolling a 7
    with two dice is 30/36
   Two events are independent if the outcome of
    one does not affect the outcome of the other

   For independent events, the probability of A and B
    occurring together, P(A and B), = P(A) x P(B)

   When you say “this occurring AND this
    occurring” you multiply the probabilities
   For example, suppose you toss three coins.
    What is the probability of getting three tails
    (getting a tail and a tail and a tail)?
    1/2 x 1/2 x 1/2 = 1/8
     (8 combinations of H and T, so each is 1/8)

   Find the probability that a 100-year flood will
    strike a city in two consecutive years
    1 in 100 x 1 in 100 = 0.01 x 0.01 = 0.0001
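The AND rule for independent events is just repeated multiplication. A minimal sketch, assuming the events really are independent; the helper name `p_all` is illustrative.

```python
from fractions import Fraction
from math import prod

def p_all(*probabilities):
    """P(A and B and ...) for independent events: multiply the probabilities."""
    return prod(probabilities)

p_three_tails = p_all(Fraction(1, 2), Fraction(1, 2), Fraction(1, 2))
p_two_floods = p_all(Fraction(1, 100), Fraction(1, 100))

print(p_three_tails, float(p_two_floods))
```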
   You are playing craps in Vegas. You have had
    a string of bad luck. But you figure since
    your luck has been so bad, it has to balance
    out and turn good
   Bad assumption! Each event is independent
    of the others and has nothing to do with
    the previous run, especially in the short run (as
    we will see in a few slides)
   This is called the Gambler’s Fallacy
   Is this the same for playing Blackjack?
   If you ask what is the probability of either this
    happening OR that happening, and the two
    events don’t overlap:
    P(A or B) = P(A) + P(B)

   Suppose you roll a single die. What is the
    probability of rolling either a 2 or a 3?
    P(2 or 3) = P(2) + P(3) = 1/6 + 1/6 = 2/6

    When you say “this occurring OR that occurring”, you
     ADD the two probabilities
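The OR rule only works as simple addition when the two events do not overlap. The sketch below makes that condition explicit; the helper name `p_either` is illustrative, and all outcomes in the sample space are assumed equally likely.

```python
from fractions import Fraction

def p_either(event_a, event_b, sample_space):
    """P(A or B) = P(A) + P(B), valid only when A and B do not overlap."""
    if event_a & event_b:
        raise ValueError("events overlap; simple addition does not apply")
    total = len(sample_space)
    return Fraction(len(event_a), total) + Fraction(len(event_b), total)

die = {1, 2, 3, 4, 5, 6}
p_2_or_3 = p_either({2}, {3}, die)
print(p_2_or_3)  # 2/6, shown in lowest terms
```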
   What is the probability of something
    happening at least once?

   P(at least one event A in n trials) = 1 - [P(A
    not happening in one trial)]^n
   What is the probability that a region will
    experience at least one 100-year flood during
    the next 100 years?
   Probability of a flood is 1/100. Probability of
    no flood is 99/100.
   P(at least one flood in 100 years) = 1 -
    0.99^100 = 0.634
   You purchase 10 lottery tickets, for which the
    probability of winning some prize on a single
    ticket is 1 in 10. What is the probability that
    you will have at least one winning ticket?

   P(at least one winner in 10 tickets) = 1 -
    0.9^10 = 0.651
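The "at least once" rule can be written as a one-line function. A minimal sketch, assuming the trials are independent and each has the same probability; the name `p_at_least_once` is illustrative.

```python
def p_at_least_once(p_single, n_trials):
    """1 - (probability of the event NOT happening in one trial) ** n_trials."""
    return 1 - (1 - p_single) ** n_trials

flood = p_at_least_once(1 / 100, 100)   # 100-year flood over 100 years
lottery = p_at_least_once(1 / 10, 10)   # 10 tickets, each a 1-in-10 winner

print(round(flood, 3), round(lottery, 3))
```

The same function handles any similar question by changing the two arguments.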
   McDonalds has a new promotion. If you buy
    a large drink, your cup has a scratch off label
    on it. One in 20 cups wins a free Quarter
    Pounder. If you purchase 5 large drinks, what
    is the probability that you will win a Quarter Pounder?
   The probability of tossing a coin and landing
    tails is 0.5. But what if you toss it 5 times
    and you get HHHHH?
   The law of large numbers tells you that if you
    toss it 100 / 1000 / 1,000,000 times, the
    fraction of tails should get closer and closer to 0.5.
   But this may not be the case if you only toss
    it 5 times.
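A quick simulation shows the law of large numbers at work. This is a sketch under the assumption of a fair coin; the function name `tails_fraction` and the seed are arbitrary choices, so exact values will vary with the seed.

```python
import random

def tails_fraction(n_tosses, seed=None):
    """Toss a fair coin n_tosses times and return the fraction that land tails."""
    rng = random.Random(seed)
    tails = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return tails / n_tosses

for n in (5, 100, 1000, 1_000_000):
    print(n, tails_fraction(n, seed=42))
```

Small runs can land far from 0.5, while the million-toss run should be very close to it.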
   Expected value is what you expect to gain or
    lose over the long run.
   What if you have multiple related events?
    What is the expected value from the set of events?

   Expected value = event 1 value x event 1
    probability + event 2 value x event 2
    probability + …
   Suppose that $1 lottery tickets have the
    following probabilities: 1 in 5 win a free $1
    ticket; 1 in 100 win $5; 1 in 100,000 to win
    $1000; and 1 in 10 million to win $1 million.
    What is the expected value of a lottery ticket?
   Ticket purchase: value -$1, prob 1
   Win free ticket: value $1, prob 1/5
   Win $5: value $5, prob 1/100
   Win $1000: value $1000, prob 1/100,000
   Win $1 million: value $1,000,000, prob 1/10,000,000
   -$1 x 1= -1; $1 x 1/5 = $0.20; $5 x 1/100
    = $0.05; $1000 x 1/100,000 = $0.01;
    $1,000,000 x 1/10,000,000 = $0.10
   Now sum all the products:

-$1 + 0.20 + 0.05 + 0.01 + 0.10 = -$0.64
Thus, averaged over many tickets, you should
  expect to lose $0.64 for each lottery ticket
  that you buy. If you buy, say, 1000 tickets,
  you should lose $640.
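The lottery calculation can be reproduced as a sum of value x probability pairs. A minimal sketch using exact fractions; the helper name `expected_value` is illustrative, and the free ticket is valued at $1 as in the slides.

```python
from fractions import Fraction

def expected_value(events):
    """Sum of (event value x event probability) over all events."""
    return sum(value * prob for value, prob in events)

lottery_ticket = [
    (-1, Fraction(1)),                        # ticket purchase
    (1, Fraction(1, 5)),                      # free $1 ticket
    (5, Fraction(1, 100)),                    # $5 prize
    (1000, Fraction(1, 100_000)),             # $1000 prize
    (1_000_000, Fraction(1, 10_000_000)),     # $1 million prize
]

ev = expected_value(lottery_ticket)
print(float(ev))          # expected value per ticket, in dollars
print(float(ev * 1000))   # expected result of buying 1000 tickets
```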
   Suppose an insurance company sells policies
    for $500 each.
   The company knows that about 10% will
    submit a claim that year and that claims
    average to $1500 each.
   How much can the company expect to make
    per customer?
   Company makes $500 100% of the time
    (when a policy is sold)
   Company loses $1500 10% of the time
   $500 x 1.0 - $1500 x 0.1 = 500 – 150 = 350
   Company gains $350 from each customer
   The company needs to have a lot of
    customers to ensure this works
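The insurance numbers, and the reason the company needs many customers, can be sketched as follows. The function names are illustrative, and the simulation makes a simplifying assumption that every claim is exactly the $1500 average.

```python
import random

def expected_profit(premium, claim_rate, average_claim):
    """Premium collected every time, minus the average claim weighted by its probability."""
    return premium * 1.0 - average_claim * claim_rate

def simulated_profit_per_customer(n_customers, premium=500, claim_rate=0.10,
                                  claim=1500, seed=None):
    """Average profit per customer when each customer claims with probability claim_rate."""
    rng = random.Random(seed)
    n_claims = sum(rng.random() < claim_rate for _ in range(n_customers))
    return (n_customers * premium - n_claims * claim) / n_customers

print(expected_profit(500, 0.10, 1500))
for n in (10, 1000, 100_000):
    print(n, simulated_profit_per_customer(n, seed=7))  # seed is arbitrary
```

With only a handful of customers the average can swing widely; with many customers it settles near the expected value.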

   Let’s stop here for today.
   With terrorism, homicides, and traffic
    accidents, is it safer to stay home and
    take a college course online rather than
    head downtown to class?
   Are you safer in a small car or a sport utility vehicle?
   Are cars today safer than those 30 years ago?
   If you need to travel across country, are you
    safer flying or driving?
   In 1966, there were 51,000 deaths related to
    driving, and people drove 9 x 10^11 miles
   In 2000, there were 42,000 deaths related to
    driving, and people drove 2.75 x 10^12 miles
   Was driving safer in 2000?
   51,000 deaths / 9 x 10^11 miles = 5.7 x 10^-8
    deaths per mile
   42,000 deaths / 2.75 x 10^12 miles = 1.5 x
    10^-8 deaths per mile
   Driving has gotten safer! Why?
   Over the last 20 years, airline travel has
    averaged 100 deaths per year
   Airlines have averaged 7 billion (7 x 10^9)
    miles in the air per year
   100 deaths / 7 x 10^9 miles = 1.4 x 10^-8
    deaths per mile
   How does this compare to driving (1.5 x 10^-8
    deaths per mile)?
   Is it fair to compare miles driven to miles
    flown? Instead compare deaths per trip?
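The deaths-per-mile figures for driving and flying come from the same division, so they are easy to line up side by side. A minimal sketch using only the numbers quoted above; the helper name `deaths_per_mile` is illustrative.

```python
def deaths_per_mile(deaths, miles):
    """Risk per mile traveled: total deaths divided by total miles."""
    return deaths / miles

rates = {
    "driving, 1966": deaths_per_mile(51_000, 9e11),
    "driving, 2000": deaths_per_mile(42_000, 2.75e12),
    "flying": deaths_per_mile(100, 7e9),
}

for label, rate in rates.items():
    print(f"{label}: {rate:.1e} deaths per mile")
```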
   Suppose you are buying a new car. For an
    additional $200 you can add a device that will
    reduce your chances of death in a highway
    accident from 50% to 45%. Interested?
   What if the salesman told you it could reduce
    your chances of death from 5% to 0%?
    Interested now? Why?
   Suppose you can purchase an extended
    warranty plan for a new auto which covers
    100% of the engine and drive train (roughly
    33% of the car) but no other items at all
   Or you can purchase an extended warranty
    plan which covers the entire auto but only at
    33% coverage
   Which would you choose?
   Which do you think caused more deaths in
    the US in 2000, homicide or diabetes?
   Homicide: 6.0 deaths per 100,000
   Diabetes: 24.6 deaths per 100,000
   Which is safer – staying home for the day or
    going to school/work?
   In 2003, one in 37 people was disabled for a day
    or more by an injury at home – more than in the
    workplace and car crashes combined
   Shave with razor – 33,532 injuries
   Hot water – 42,077 injuries
   Slice a grapefruit with a knife – 441,250 injuries
   What if you run down two flights of stairs to
    fetch the morning paper?
   28% of the 30,000 accidental home deaths
    each year are caused by falls (poisoning and
    fires are the other top killers)
   Ratio of people killed every year by
    lightning strikes versus number of
    people killed in shark attacks: 4000:1
   Average number of people killed
    worldwide each year by sharks: 6
   Average number of Americans who die
    every year from the flu: 36,000
   Hide in a cave?
   Know the data – be aware!

   Now, let’s start our first med school lecture
   Welcome to the DePaul School of Medicine!
   Most people associate tumors with cancers,
    but not all tumors are cancerous
   Tumors caused by cancer are malignant
   Non-cancerous tumors are benign
   We can calculate the chances of getting a
    tumor and/or cancer – this is based on
    empirical research studies and probabilities
   If you don’t know how to calculate simple
    probabilities, you will misinform your patient
    and cause undue stress
   Suppose your patient has a breast tumor.
    Is it cancerous?
   Probably not
   Studies have shown that only about 1 in
    100 breast tumors turn out to be malignant
   Nonetheless, you order a mammogram
   Suppose the mammogram comes back
    positive. Now does the patient have cancer?
   Earlier mammogram screening was 85% accurate
   85% would lead you to think that if you tested
    positive, there is a pretty good chance that
    you have cancer.
   But this is not true. Do the math!
   Consider a study in which mammograms are
    given to 10,000 women with breast tumors
   Assume that 1% (1 in 100) of the tumors are
    malignant (100 women actually have cancer,
    9900 have benign tumors)
                 Tumor is    Tumor is   Totals
                 Malignant   Benign



Total            100         9900       10,000

 The malignant total (100) is 1/100th of the total 10,000.
   Mammogram screening correctly identifies
    85% of the 100 malignant tumors as malignant
   These are called true positives
   The other 15% had negative results even
    though they actually have cancer
   These are called false negatives
            Tumor is    Tumor is   Totals
            Malignant   Benign

Positive    85 True
Mammogram   Positives

Negative    15 False
Mammogram   Negatives

Total       100         9900       10,000
   Mammogram screening correctly identifies
    85% of the 9900 benign tumors as benign
   Thus it gives negative (benign) results for
    85% of 9900, or 8415
   These are called true negatives
   The other 15% of the 9900 (1485) get
    positive results in which the mammogram
    incorrectly suggests their tumors are
    malignant. These are called false positives
                            Tumor is    Tumor is     Totals
                            Malignant   Benign

           Positive         85 True     1485 False
           Mammogram        Positives   Positives

           Negative         15 False    8415 True
           Mammogram        Negatives   Negatives

           Total            100         9900         10,000

This is what a mammogram should show: True Positives and True Negatives
               Tumor is    Tumor is     Totals
               Malignant   Benign

Positive       85 True     1485 False   1570
Mammogram      Positives   Positives

Negative       15 False    8415 True    8430
Mammogram      Negatives   Negatives

Total          100         9900         10,000

        Now compute the row totals.
   Overall, the mammogram screening gives
    positive results to 85 women who actually
    have cancer and to 1485 women who do not
    have cancer
   The total number of positive results is 1570
   Only 85 of these are true positives:
    85/1570, or 0.054, or 5.4%
   Thus, the chance that a positive result really
    means cancer is only 5.4%
   Therefore, when your patient’s mammogram
    comes back positive, you should reassure her
    that there’s still only a small chance that she
    has cancer
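The whole mammogram table can be rebuilt from three numbers: the number of patients, the fraction of tumors that are malignant, and the accuracy of the test. This is a minimal sketch; the function and key names are illustrative, and it assumes (as the slides do) that the same 85% accuracy applies to both malignant and benign tumors.

```python
def screening_table(n_patients, prevalence, accuracy):
    """Split n_patients into the four cells of the mammogram table."""
    malignant = round(n_patients * prevalence)
    benign = n_patients - malignant
    true_pos = round(malignant * accuracy)   # malignant, positive result
    true_neg = round(benign * accuracy)      # benign, negative result
    return {
        "true_pos": true_pos,
        "false_neg": malignant - true_pos,   # malignant, negative result
        "true_neg": true_neg,
        "false_pos": benign - true_neg,      # benign, positive result
    }

table = screening_table(10_000, prevalence=0.01, accuracy=0.85)
positives = table["true_pos"] + table["false_pos"]
p_cancer_given_positive = table["true_pos"] / positives

print(table)
print(positives, f"{p_cancer_given_positive:.1%}")
```

Changing `prevalence` shows how strongly the answer depends on how rare cancer is among the patients tested.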
   Suppose you are a doctor seeing a patient
    with a breast tumor. Her mammogram
    comes back negative. Based on the numbers
    above, what is the chance that she has cancer?
                  Tumor is    Tumor is     Totals
                  Malignant   Benign

Positive          85 True     1485 False   1570
Mammogram         Positives   Positives

Negative          15 False    8415 True    8430
Mammogram         Negatives   Negatives

Total             100         9900         10,000

15/8430, or 0.0018, or slightly less than 2 in 1000.
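The same answer can be reached without the table, working directly from the rates. The sketch below applies Bayes' rule, which the slides do not name but which is what the table calculation amounts to; the function name is illustrative, and 85% accuracy is again assumed for both malignant and benign tumors.

```python
from fractions import Fraction

def p_cancer_given_negative(prevalence, accuracy):
    """P(cancer | negative) = false negatives / all negatives, using rates instead of counts."""
    false_neg = (1 - accuracy) * prevalence      # has cancer, test says negative
    true_neg = accuracy * (1 - prevalence)       # no cancer, test says negative
    return false_neg / (false_neg + true_neg)

p = p_cancer_given_negative(Fraction(1, 100), Fraction(85, 100))
print(p, round(float(p), 4))
```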

This is a dangerous position. Now what do you do?

That’s the end of the med school lecture for today.