									Hypothesis Testing

                         Where Am I?
• Wake up after a rough night in unfamiliar surroundings
• Still in Boulder?

• Expected if in Boulder (large likelihood)
• Surprising but not impossible (moderate likelihood)
• Couldn’t happen IF in Boulder (likelihood near zero) → Can’t be in Boulder
         Steps of Hypothesis Testing
1. State clearly the two hypotheses
2. Determine which is the null hypothesis (H0) and which is the
   alternative hypothesis (H1)
3. Compute a relevant test statistic from the sample
4. Find the likelihood function of the test statistic according to the
   null hypothesis
5. Choose alpha level (α): how willing you are to abandon null (usually .05)
6. Find the critical value: cutoff with probability α of being exceeded
   under H0
7. Compare the actual result to the critical value
   • Less than critical value → retain null hypothesis
   • Greater than critical value → reject null hypothesis;
      accept alternative hypothesis
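
The seven steps above can be sketched for a binomial test. The numbers here (20 trials, 15 correct, a chance level of 50%) are hypothetical and chosen only for illustration. Because the number correct is a discrete statistic, the sketch treats the critical value as the smallest count that falls in the rejection region; that convention is an assumption, not something the slides specify.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Steps 1-2: H0: p = 0.5 (guessing); H1: p > 0.5 (hypothetical example)
n, p_null = 20, 0.5
# Step 3: test statistic = number correct in the sample (made-up data)
observed = 15
# Step 5: alpha level
alpha = 0.05
# Steps 4 and 6: use the binomial distribution under H0 to find the
# smallest count whose upper-tail probability is at most alpha
critical = next(k for k in range(n + 2) if upper_tail(k, n, p_null) <= alpha)
# Step 7: compare the observed result to the critical value
decision = "reject H0" if observed >= critical else "retain H0"
print(f"critical value = {critical}, observed = {observed}: {decision}")
```

With these made-up numbers the observed count lands exactly on the cutoff, so the null hypothesis is rejected; one fewer correct answer would have led to retaining it.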
         Specifying Hypotheses
• Both hypotheses are statements about
  population parameters
• Null Hypothesis (H0)
  – Always more specific, e.g. 50% chance, mean of 100
  – Usually the less interesting, "default" explanation
• Alternative Hypothesis (H1)
  – More interesting – researcher’s goal is usually to
    support the alternative hypothesis
  – Less precise, e.g. > 50% chance, μ > 100
                      Test Statistic
• Statistic computed from sample to decide between hypotheses
• Relevant to hypotheses being tested
   – Based on mean if hypotheses are about means
   – Based on number correct (frequency) if hypotheses are
     about probability correct
• Sampling distribution according to null hypothesis
  must be fully determined
   – Can only depend on data and on values assumed by H0
• Often a complex formula with little intuitive meaning
   – Inferential statistic: Only used in testing reliability
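
As an illustration of a test statistic based on the mean, here is a minimal sketch of the standard one-sample t statistic, t = (M - μ0) / (s / √n). The sample values and the null value of 100 are hypothetical. Note that the result depends only on the data and on the value assumed by H0.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of scores; H0 says the population mean is 100
sample = [104, 98, 110, 105, 99, 108, 102, 106]
mu_0 = 100

m = mean(sample)                          # sample mean M
s = stdev(sample)                         # sample SD (n - 1 denominator)
standard_error = s / sqrt(len(sample))    # estimated SD of the sample mean
t_stat = (m - mu_0) / standard_error
print(f"M = {m}, t = {t_stat:.2f}")
```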
              Likelihood Function
• Probability distribution of a statistic according to a hypothesis
   – Gives probability of obtaining any possible result
• Usually interested in distribution of test statistic
  according to null hypothesis
• Same as sampling distribution, assuming the
  population is accurately described by the hypothesis
• Test statistic chosen because we know its likelihood
   – Binomial test: Binomial distribution
   – t-test: t distribution
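
For the binomial case, the likelihood function can be tabulated directly. This sketch assumes a hypothetical experiment of 10 trials with H0 stating a 50% chance of a correct response; it lists the probability of every possible number correct.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p_null = 10, 0.5   # hypothetical: 10 trials, H0 says 50% chance correct
likelihood = {k: binomial_pmf(k, n, p_null) for k in range(n + 1)}
for k, prob in likelihood.items():
    print(f"{k:2d} correct: {prob:.4f}")
```

The probabilities sum to one, and results near 5 correct are the most likely under H0, while results near 0 or 10 have likelihood near zero.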
                                              Critical Value
              •   Cutoff for test statistic between retaining and rejecting null hypothesis
                   – If test statistic is beyond critical value, null will be rejected
                   – Otherwise, null will be retained
              •   Before collecting data: What strength of evidence will you require to reject null?
                   – How many correct outcomes?
                   – How big a difference between M and μ0?
              •   Critical region
                   – Range of values that will lead to rejecting null hypothesis
                   – All values beyond critical value

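[Figure: critical regions on two sampling distributions, one with Frequency on the horizontal axis and one with t]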
                      Types of Errors
• Goal: Reject null hypothesis when it’s false; retain it when it’s true
• Two ways to be wrong
   – Type I Error: Null is correct but you reject it
   – Type II Error: Null is false but you retain it
• Type I Error rate
   – IF H0 is true, probability of mistakenly rejecting H0
   – Proportion of false theories we conclude are true
       • Proportion of useless drugs that are deemed effective
• Logic of hypothesis testing is founded on controlling Type I
  Error rate
   – Set critical value to give desired Type I Error rate
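
The Type I error rate can be checked by simulation: generate many experiments in which H0 really is true and count how often the test rejects it anyway. The test used here (20 trials, reject when 15 or more are correct) is a hypothetical example; its exact Type I error rate under a binomial null is about .021.

```python
import random

random.seed(0)                      # fixed seed so the run is repeatable
n, p_null, critical = 20, 0.5, 15   # hypothetical test: reject H0 if 15+ correct
n_experiments = 20000

rejections = 0
for _ in range(n_experiments):
    # Generate data for which H0 is true: each trial is correct with p = 0.5
    correct = sum(random.random() < p_null for _ in range(n))
    if correct >= critical:
        rejections += 1             # H0 is true but rejected: a Type I error

type_1_rate = rejections / n_experiments
print(f"simulated Type I error rate: {type_1_rate:.3f}")
```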
                          Alpha Level
• Choice of acceptable Type I Error rate
   – Usually .05 in psychology
   – Higher → more willing to abandon null hypothesis
   – Lower → require stronger evidence before abandoning null hypothesis
• Determines critical value
   – Under the sampling distribution of the test statistic according to the
     null hypothesis, the probability of a result beyond the critical value is α
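
To see how the alpha level determines the critical value, the sketch below finds the cutoff for three alpha levels in a hypothetical binomial test (20 trials, H0: 50% chance). The critical value is taken to be the smallest count whose upper-tail probability under H0 does not exceed alpha, which is an assumed convention for a discrete statistic.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, p_null = 20, 0.5   # hypothetical experiment

def critical_value(alpha):
    """Smallest count k with P(X >= k | H0) <= alpha."""
    return next(k for k in range(n + 2) if upper_tail(k, n, p_null) <= alpha)

for alpha in (0.10, 0.05, 0.01):
    k = critical_value(alpha)
    print(f"alpha = {alpha:.2f}: reject H0 if {k} or more correct")
```

Lowering alpha pushes the cutoff further into the tail, so stronger evidence is needed before the null is abandoned.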
[Figure: sampling distribution of the test statistic under H0, with the critical value marked]
                       Doping Analogy
• Measure athletes' blood for signs of doping
    – Cheaters have high RBCs, but even honest people vary
• What rule to use?
    – Must set some cutoff, and punish anyone above it
    – Will inevitably punish some innocent people
• H0 likelihood function is like distribution of innocent athletes’ RBCs
• Cutoff determines fraction of innocent people that get unfairly punished
    – This fraction is alpha
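
The analogy can be made concrete with a small sketch. It assumes, purely for illustration, that innocent athletes' RBC measures follow a normal distribution with a mean of 45 and a standard deviation of 3 (hypothetical numbers in arbitrary units), and picks the cutoff that unfairly punishes 5% of them.

```python
from statistics import NormalDist

# Hypothetical distribution of RBC measures among innocent athletes
innocent = NormalDist(mu=45.0, sigma=3.0)
alpha = 0.05   # fraction of innocent athletes we accept punishing

# Cutoff that only a fraction alpha of innocent athletes exceed
cutoff = innocent.inv_cdf(1 - alpha)
fraction_punished = 1 - innocent.cdf(cutoff)
print(f"cutoff = {cutoff:.2f}, innocent athletes punished = {fraction_punished:.1%}")
```

The cutoff plays the role of the critical value, and the distribution of innocent athletes plays the role of the sampling distribution under H0.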

[Figure: distribution of innocent athletes’ RBCs, with the region below the cutoff labeled “Don’t Punish” and the region above it labeled “Punish”]

•   Type II Error rate
     – IF H0 is false, probability of failing to reject it
     – E.g., fraction of cheaters that don’t get caught
•   Power
     – IF H0 is false, probability of correctly rejecting it
     – Equal to one minus Type II Error rate
     – E.g., fraction of cheaters that get caught
•   Power depends on sample size
     – Choose sample size to give adequate power
     – Researchers must make a guess at effect size to compute power
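
A minimal sketch of a power calculation for a one-tailed binomial test follows. The guessed effect size (a true probability of .7 against a null of .5) and the sample sizes are hypothetical; the critical value is again taken as the smallest count whose upper-tail probability under H0 is at most alpha.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power(n, p_true, p_null=0.5, alpha=0.05):
    """Probability of rejecting H0 when the true probability is p_true."""
    # The critical value is set under H0, to control the Type I error rate
    critical = next(k for k in range(n + 2) if upper_tail(k, n, p_null) <= alpha)
    # Power is the chance of reaching that cutoff when H0 is false
    return upper_tail(critical, n, p_true)

p_guess = 0.7   # researcher's guess at the effect size (hypothetical)
for n in (20, 50, 100):
    pw = power(n, p_guess)
    print(f"n = {n:3d}: power = {pw:.2f}, Type II error rate = {1 - pw:.2f}")
```

For these three sample sizes, power rises as n grows, which is why sample size is chosen with a target power in mind.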

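[Figure: sampling distributions under H0 and H1; the area beyond the critical value under H0 is the Type I error rate (α), and under H1 the area short of the critical value is the Type II error rate while the area beyond it is the power]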
                             Two-Tailed Tests
   • Sometimes want to detect effects in either direction
        – Drugs that help or drugs that hurt
   • Formalized in alternative hypothesis
        – μ < μ0 or μ > μ0
   • Two critical values, one in each tail
   • Type I error rate is sum from both critical regions
        – Need to divide errors between both tails
        – Each gets α/2 (2.5%)
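
The split of alpha between the two tails can be sketched numerically. The slides work with the t distribution, which is not in the Python standard library, so this example swaps in the standard normal distribution as a large-sample stand-in; the critical values for an actual t-test would be somewhat larger.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()   # standard normal, a stand-in for the t distribution

one_tailed_crit = z.inv_cdf(1 - alpha)       # all of alpha in the upper tail
two_tailed_crit = z.inv_cdf(1 - alpha / 2)   # alpha/2 in each tail

# The two critical regions together have probability alpha
total = z.cdf(-two_tailed_crit) + (1 - z.cdf(two_tailed_crit))
print(f"one-tailed cutoff: {one_tailed_crit:.3f}")
print(f"two-tailed cutoffs: -{two_tailed_crit:.3f} and {two_tailed_crit:.3f}")
print(f"total Type I error rate: {total:.3f}")
```

The two-tailed cutoff sits further from zero than the one-tailed cutoff, because each tail is allowed only half of the error rate.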

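[Figure: two-tailed critical regions. On the M axis, H0 is rejected on either side of μ0; on the t axis, an area of α/2 lies below -tcrit and an area of α/2 lies above tcrit]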
  One-Tailed vs. Two-Tailed Tests

[Figure: one-tailed test with area α above tcrit; two-tailed test with area α/2 below -tcrit and area α/2 above tcrit]

        An Alternative View: p-values
• Reversed approach to hypothesis testing
    – After you collect sample and compute test statistic
    – How big must α be to reject H0
• p-value
    –   Measure of how consistent data are with H0
    –   Probability of a value equal to or more extreme than what you actually got
    –   Large p-value → H0 is a good explanation of the data
    –   Small p-value → H0 is a poor explanation of the data
• p > α: Retain null hypothesis
• p < α: Reject null hypothesis; accept alternative hypothesis
• Researchers generally report p-values, because then reader can choose
  own alpha level
    – E.g. “p = .03”
    – If willing to allow 5% error rate, then accept result as reliable
    – If more stringent, say 1% (α = .01), then remain skeptical
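
The decision rule can be sketched with the slide's example value of t = 2.15. The slides do not give the degrees of freedom or say whether the test is two-tailed, so this sketch assumes a two-tailed test with a sample large enough that the statistic is approximately standard normal. Under those assumptions the p-value comes out near .03.

```python
from statistics import NormalDist

t_observed = 2.15   # value from the slide's example

# Assumptions: two-tailed test, large sample (normal approximation to t)
p_value = 2 * (1 - NormalDist().cdf(abs(t_observed)))
print(f"p = {p_value:.3f}")

# Each reader applies their own alpha level to the same reported p-value
for alpha in (0.05, 0.01):
    decision = "reject H0" if p_value < alpha else "retain H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

Reporting the p-value rather than only the decision lets a reader with a 5% alpha level accept the result while a reader with a 1% alpha level remains skeptical.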
[Figure: t distribution marking tcrit for α = .05, α = .03, and α = .01; the observed result is t = 2.15, p = .03]
