How to Lie, Cheat, and Steal with Statistics

Shared by: smythesteven
posted:
6/12/2009
Abuse of tests, deceptive graphs, beware!

Use and Abuse of Tests
Hypothesis tests are more subject to abuse than confidence intervals. Many deceptive people will attempt to convince you of the importance of some discovery by telling you the hypothesis test has a small p-value, and therefore it is "significant". Reference: section 6.3 in the M&M text.

Statistical vs Practical Significance
Just because a hypothesis test gets a small p-value does not necessarily mean the finding is important in any practical sense. For example, let's test whether a $10,000 LSAT prep course improves scores, and suppose we get a p-value of 0.02, evidence that it does improve scores. However, suppose the average improvement is only 0.27 points better than with no prep course. Was this really an important increase? (Nope.)

Big P-Values Need Love Too!
Just because a hypothesis test gets a big p-value, it does not necessarily mean the finding is unimportant. A test of two cancer therapies giving a big p-value might be very important medically. The therapies evidently don't differ much in duration of life, but other considerations (cost, pain, etc.) might suggest one over the other.

Beware of Data Dredging
If many tests and comparisons are performed at the 0.05 level on data from populations that have NO differences, approximately 5 out of 100 will show evidence of differences just by random chance variation. So in a large number of tests or comparisons, you should expect some to have p-values below 0.05 just from random chance variation.

Bible Code
A rearrangement of the Hebrew letters of the Torah seems to reveal interesting patterns in the text that proponents claim are very unusual. However, skeptics have claimed to have found similar patterns in other large texts like Moby Dick, War and Peace, etc. It is easy to search and find data patterns, but careful analysis is needed to determine their true meaning and likelihood.
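The data dredging claim above can be checked with a quick simulation. The sketch below is only an illustration, not part of the original slides: it assumes two groups of 30 values drawn from the same normal population with known standard deviation 1, and a two-sided z-test. The function names, the group size, and the number of tests are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean


def null_p_value(rng, n=30):
    """Two-sided z-test p-value for two groups drawn from the SAME N(0, 1) population."""
    group_a = [rng.gauss(0, 1) for _ in range(n)]
    group_b = [rng.gauss(0, 1) for _ in range(n)]
    # With known sigma = 1 in both groups, the difference in means has SD sqrt(2/n).
    z = (mean(group_a) - mean(group_b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(num_tests=1000, alpha=0.05, seed=0):
    """Fraction of tests with p < alpha when there is NO real difference."""
    rng = random.Random(seed)
    hits = sum(null_p_value(rng) < alpha for _ in range(num_tests))
    return hits / num_tests


rate = false_positive_rate()
print(f"{rate:.1%} of tests looked 'significant' with no real difference")
```

Under these assumptions the printed fraction should land near 5%, which is why a handful of small p-values in a long list of comparisons is weak evidence on its own.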
Power of a Test
Power is the probability of getting a small p-value in a hypothesis test when Ha is in fact true. In most research projects we are trying to reject a Ho and support a Ha, so power is a measure of how likely we are to dump a Ho. As scientists we would like this probability to be as big as possible. It is a disaster to perform an experiment when we know Ha is true, but not be able to prove it to the world. Barnacles!

It is extremely important to plan your research studies so that they have high power and thus a good chance of showing the result you desire. What can you do to make a study have high power?

Power Exercises
How would you design a study to show that two groups differ, even if they are only a teensie-weensie bit different? How would you design a study to show no evidence of a difference when in fact there is a substantial difference between the two groups?

Heartworm Treatments
I worked for an animal health company that had developed a new treatment for heartworm disease in dogs that cured the dogs about 90% of the time. A competing company had an old product that cured the dogs 80% of the time.

The other company put out an ad for vets that said there was no significant difference between the two treatments. The company I consulted for quickly called me in my office and wanted to know how the competitor could claim no difference in an experiment. Had they cheated or lied?

No, they hadn't cheated, but they were very clever. The study had 10 dogs on our treatment (9 were cured) and 20 dogs on the other, inferior treatment. The hypothesis test had a big p-value and thus showed no evidence of a difference. They were VERY clever. They did an experiment that had almost no chance of ever showing a difference. They could have printed the ad brochure before the results were in.
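To see how little chance the heartworm study had, its power can be computed directly. This is a rough sketch under stated assumptions, not the analysis the companies used: the slides do not name the test, so a two-sided Fisher exact test at the 0.05 level is assumed here, with true cure rates of 90% and 80% and group sizes of 10 and 20. All function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k cures among n dogs with cure rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fisher_two_sided(x1, n1, x2, n2):
    """Two-sided Fisher exact p-value for x1/n1 cured vs x2/n2 cured."""
    cured = x1 + x2
    denom = comb(n1 + n2, cured)
    lo, hi = max(0, cured - n2), min(n1, cured)
    # Hypergeometric probability of each possible table with the same margins.
    probs = {k: comb(n1, k) * comb(n2, cured - k) / denom for k in range(lo, hi + 1)}
    p_obs = probs[x1]
    # Sum the tables that are no more likely than the observed one.
    return sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))


def power(n1, p1, n2, p2, alpha=0.05):
    """Exact chance that the study rejects 'no difference' given true rates p1, p2."""
    return sum(
        binom_pmf(x1, n1, p1) * binom_pmf(x2, n2, p2)
        for x1 in range(n1 + 1)
        for x2 in range(n2 + 1)
        if fisher_two_sided(x1, n1, x2, n2) < alpha
    )


study_power = power(10, 0.90, 20, 0.80)
print(f"Chance the 10-vs-20 dog study detects the difference: {study_power:.1%}")
```

Calling `power` with larger group sizes (for example `power(60, 0.90, 60, 0.80)`) shows how the chance of detecting the same 10-point difference grows with sample size.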
Heartworm Treatment
Use the U Iowa simulation site to approximate the chance this experiment could have shown evidence that the treatments differ. See it now? Their experiment had very low power and could hide a difference that actually existed.

2003 Deceptive Graph Award Winner

Area Confusion with Height

Vertical Axis Distorts Impression

Mean vs Median
Recall the dangers of using a mean to describe the typical value in a list when the data distribution is skewed or has outliers. In data with these features, the median is a better description of the typical value.

Deceptive Graphs: Cut Vertical Axis

Graph Deception
Laid out in this manner, the graph suggests that the gap between the two groups is overwhelming, rather than only 8 percentage points, within the poll's margin of error of +/- 7 percentage points. Also, this presentation obscures the poll's finding that majorities of all the groups sampled approved of the removal of Schiavo's feeding tube. A more accurate presentation of the poll's findings would have looked like this:

Appropriate Graph

Lurking Variables
Consider the results of a survey of Cornell graduates (in the 1950s). The survey showed that 93% of the middle-aged male graduates were married but only 65% of the women were. One popular magazine writer quickly concluded that going to college seriously reduced a woman's chances of marriage. Or did it?

The correlation is real - the women did indeed marry at a lower rate. But implying causation is risky. Remember - correlation does not necessarily mean causation. Consider the following alternative explanation: the young women who go to Cornell are those who are more likely to delay marriage in favor of a career. A career-oriented woman would be more likely to attend a university and then head into a career than a marriage-oriented woman. The apparent correlation is the result of a single factor that is producing BOTH results.
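Returning to the mean vs median point above, a tiny example shows how a single outlier pulls the mean away from the typical value. The salary figures below are made up purely for illustration and do not come from the slides.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars; the last value is an outlier.
salaries = [30, 32, 35, 38, 40, 42, 45, 1000]

typical_by_mean = mean(salaries)
typical_by_median = median(salaries)

print(f"mean   = {typical_by_mean}")
print(f"median = {typical_by_median}")
```

Here seven of the eight values sit below 50, yet the mean is far above every one of them, while the median stays in the middle of the bulk of the data.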
Distortion
Variation in Graph vs Data

Selective Reporting
Another example of poor use of statistics and polls is "Selective Reporting." In January 1995, the U.S. Agency for International Development (AID) reported (to great media response) that Americans support foreign aid. A Reader's Digest poll showed that 86% of the people don't know what AID does. It also showed that 67% want foreign aid cut. When the same question was asked in the context of cutting the deficit, 83% said to cut foreign aid.

So where did AID get their information? A poll asked if the US should "share at least a small portion of its wealth with those in the world who are in great need." Over 80% said yes. Respondents were also asked if foreign aid should be cut a little, somewhat, a lot, or "eliminate it entirely." 8% chose the last option. AID took this to mean 92% "support" foreign aid. They ignored that 75% said there is "too much" foreign aid.

Peter Donnelly
TED presentation, "How Juries are Fooled by Statistics", www.ted.com
