Probability
Lecture 2

Probability
• Why did we spend last class talking about probability?
• How do we use this?

You’re the FDA
• A company wants you to approve a new drug
• They run an experimental trial
  – 40 people have the disease
  – 20 get drug, 20 get placebo
  – Random assignment
  – Conducted perfectly

You’re the FDA
• Results:
  – Placebo group: 10 of 20 live
  – Drug group: 11 of 20 live
• Does the drug work?
  – Would you approve it?
  – Why or why not?

You’re the FDA
• Different study, same design
• Results:
  – Placebo: 2 of 20 live
  – Drug: 18 of 20 live
• Does the drug work?
  – Would you approve it?
  – Why or why not?

You’re the FDA
• Different study, same design
• Results:
  – Placebo: 8 of 20 live
  – Drug: 12 of 20 live
• Does the drug work?
  – Would you approve it?
  – Why or why not?
• How big of a difference do we need?

Why probability
• Probability provides the answer
  – Set of agreed-on rules
  – All based on mathematical formulas

Example
• How many of you would accept the following wager:
  – If no two people in the class have the same birthday (month and day), you get an automatic A.
  – If two or more people in the class have the same birthday, you get an automatic F.
• Not ethical for me to accept the wager

Example
• Would you have won?
• What is the probability?
  – Not 60/365
  – Think of the complement
  – How many possible pairs are there in the class?
    • Me and each student = N
    • First student and every other student = N-1
    • Second student and every remaining student = N-2
    • …
    • Last two students = 1
    • Total = N + (N-1) + … + 1 = N(N+1)/2 = 1770 (N = 59 students plus me, 60 people)

Example
• P of any pair matches is 1/365 = 0.00274
• P any pair doesn’t match is 1 - 0.00274 = 0.99726
• We have 1770 pairs.
  – Remember the rule
  – Joint probability of all not matching is:
  – P(first pair not match) * P(second pair not match) * … * P(last pair not match)

What is random?
• What are the odds that the first flip is a heads?
  – ½
  – Each outcome is equally likely
• The second flip?
  – ½
• So what are the odds that both are?
  – Four outcomes: HH, HT, TH, TT
  – So ¼ (each equally likely)

What is random?
• Odds the third flip is a heads?
  – ½
• Odds that all three are heads?
  – 8 outcomes
  – HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
  – So, 1/8
• Odds the fourth flip is a heads?
  – ½
• All four?
  – 1/16

What is random?
• Odds that five in a row are heads?
  – 1/32
• Odds that six in a row?
  – 1/64
• Expressed as probabilities, the runs of one through six heads are:
  – 0.5
  – 0.25
  – 0.125
  – 0.0625
  – 0.03125
  – 0.015625
• Each is the previous probability multiplied by 0.5

Example
• P of any pair matches is 1/365 = 0.00274
• P any pair doesn’t match is 1 - 0.00274 = 0.99726
• We have 1770 pairs.
  – Remember the rule
  – Joint probability of all not matching is:
  – P(first pair not match) * P(second pair not match) * … * P(last pair not match)
  – = 0.99726^1770
  – = 0.008
• Seems likely that at least one would match
• Rules of probability and math let us determine how likely an event is.
• Want to be able to determine “statistical significance”
  – Can we conclude that the pattern we see didn’t happen by chance?

What is “statistical significance?”
• First, let’s be clear about what statistical significance is NOT.
• A finding that a relationship between some X and some Y is “statistically significant” does NOT mean that the relationship is “strong.” (It might be strong, but not because it’s statistically significant.)

This is a common mistake
• Many people think that a “statistically significant” relationship is by definition a “strong” one. In fact, many people think that “statistical significance” IS ITSELF a test of the strength of the relationship. It’s not.

Then what is statistical significance?
• It is a probabilistic statement (typically, 95% confidence) that the relationship we observe in the sample, no matter how strong or weak, exists in the population.

But, as always…
• There is a 5% chance we could be wrong; that is, despite what we observe in the sample, there really is no relationship in the population.
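
The birthday wager worked through earlier can be checked with a few lines of Python. This is a minimal sketch, assuming a class of 60 people and a 365-day year; the function names are made up for illustration. It counts the pairs and applies the slides' shortcut of treating every pair as independent. The exact calculation is not on the slides and is included only for comparison.

```python
from math import comb, prod

def count_pairs(n_people):
    """Number of distinct pairs among n_people (n choose 2)."""
    return comb(n_people, 2)

def approx_no_match(n_people, days=365):
    """Slide shortcut: treat each pair as independent and multiply
    the per-pair probability of NOT matching across all pairs."""
    p_pair_differs = 1 - 1 / days
    return p_pair_differs ** count_pairs(n_people)

def exact_no_match(n_people, days=365):
    """Exact probability that all birthdays differ (assumption: every
    day is equally likely). Person k must avoid the k days already taken."""
    return prod((days - k) / days for k in range(n_people))

n = 60  # assumed class size: 59 students plus the instructor
print("pairs:", count_pairs(n))
print("P(no match), independent-pairs shortcut:", round(approx_no_match(n), 4))
print("P(no match), exact:", round(exact_no_match(n), 4))
```

Both versions put the chance of winning the wager below 1 in 100, which is the point of the example: the shortcut is rough, but the conclusion does not depend on it.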
How do we demonstrate statistical significance?
• We perform something called “hypothesis testing.”
• We actually begin with a statement called the “null hypothesis.” It is always a statement that there is no relationship between two variables.

Why a Null Hypothesis?
• We want to know if there is a relationship
• Our theory is not strong enough to tell us how large the effect is
  – Theory: Gender helps determine vote choice
  – Hypothesis: Women were more likely to vote for Obama than men were
  – Problem: How much more? We don’t know.
• How large a difference would be big enough?

Null Hypothesis
• Big enough to not happen by chance
• Ok, but how much is enough to be “not by chance?”
• This is where probability comes in
• Anything is possible; the normal distribution is unbounded.

Null Hypothesis
• Everything may be possible, but not everything is probable
• We want to know the probability that a relationship could exist in the data by chance

Example: Gender Gap
            Female       Male
  McCain    281 (31%)    230 (37%)
  Obama     625 (69%)    391 (63%)

Probability
• If we make some assumptions we can calculate how probable any outcome is.
• What do we assume?
  – There is no difference between treatments
  – What the probability distribution is (this is technical and I will tell you what matters).
• With these, we can calculate P(data occurred by chance).

Probability
• But that isn’t exactly what we want to know.
• We want to know the probability that there is a difference; what we can calculate is the probability of seeing these data if there is no difference.
• Unfortunately that is as good as we can do

Probability
• So, what is the null hypothesis (since that is where this started)?
• It is the hypothesis that there is no relationship (thus, “null”). This is what we can test.
• It is the inverse of what we want to know.
• So, if our theory is right the null hypothesis is wrong and we will reject the null hypothesis.
• If our theory is wrong, we will fail to reject the null hypothesis

            Female       Male
  McCain    281 (31%)    230 (37%)
  Obama     625 (69%)    391 (63%)
• What does this mean?
• How likely are we to see this by chance?
• If there were no difference between genders, the probability of seeing this difference is 0.01
• 0.01: that is pretty unlikely, but what does that mean?
• One of three things occurred
  1. The data are wrong
  2. We were really unlucky
  3. The assumption of no relationship is wrong
  – Conclusion is the last one. We have a relationship.

Probability
• How unlikely does the null have to be for us to reject it?
• 1 out of 20 (5%)
• Why?
• Vestige of pre-computer days
  – Norm
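
The lecture does not say which test produced the 0.01 for the gender gap table. One common choice for a 2x2 table of counts is a Pearson chi-square test of independence, so the sketch below assumes that test (without a continuity correction). The function name and the dictionary layout are illustrative. For a 2x2 table there is one degree of freedom, and in that case the p-value can be computed with the standard library's `erfc`.

```python
from math import erfc, sqrt

# Observed counts from the gender gap table (rows: candidate, columns: gender).
observed = {
    "McCain": {"Female": 281, "Male": 230},
    "Obama": {"Female": 625, "Male": 391},
}

def chi_square_2x2(table):
    """Return (chi-square statistic, p-value) for a 2x2 table of counts.

    Assumption: Pearson chi-square test of independence, no continuity
    correction. The null hypothesis is that candidate and gender are unrelated.
    """
    rows = list(table)
    cols = list(table[rows[0]])
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())

    chi2 = 0.0
    for r in rows:
        for c in cols:
            # Count we would expect in this cell if there were no relationship.
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (table[r][c] - expected) ** 2 / expected

    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out a little above 0.01, consistent with the figure on the slide and below the 1-in-20 (5%) threshold, so the null hypothesis of no relationship would be rejected.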
