# Hypothesis testing by yantingting

```									Hypothesis testing

Behavioural Science II
Week 1, Semester 2, 2002
Hypothesis testing
• Null hypothesis is that there is no
systematic relationship between
independent variables (IVs) and
dependent variables (DVs).
• Research hypothesis is that any
relationship observed in the data is
real.

Hypothesis testing
• Whereas research hypothesis tends to be
imprecise about numerical differences
between groups (e.g., difference in
reaction times), null hypothesis states
very specifically that difference should be
zero.

Null hypothesis versus
alternative hypothesis
• The null hypothesis assumes that
scores for different levels of the IV
are random samples from the same
population.
• The alternative hypothesis is that
samples come from different
populations.

Null hypothesis versus
alternative hypothesis
• For any single experiment, we are bound
to see a difference, just as we see a
difference between the means of two
random samples in a distribution of
sample means.
• If the null hypothesis is true, then
the difference in mean scores simply
reflects two random samples drawn
from the same population.

Testing the null hypothesis
• A statistical test assesses the
probability of obtaining a given
sample or samples of scores,
assuming the null hypothesis is
correct.

Testing the null hypothesis
• If the probability is low enough (e.g.,
p<.05), then the null hypothesis is rejected
in favour of the alternative (research)
hypothesis, and the IV is deemed to have
a systematic effect.
• If the probability is not sufficiently low
(e.g., p>.05), then the null hypothesis is
not rejected but retained, and the IV is
not deemed to have a systematic effect
(i.e., the observed differences could be
due to chance).

Statistical significance
• Statistical significance refers to the
probability of the data obtained, given that
the null hypothesis is true.
• A statistically significant result does not
mean that the null hypothesis is
improbable.
• There is an ongoing gap between
statistical significance and substantive
significance.

Hypothesis testing and
sampling distributions
• The decision to reject or not reject
the null hypothesis is usually made
with reference to the sampling
distribution of a statistic of some
kind (e.g., z-distribution,
t-distribution).

Example of hypothesis
testing using z-distribution
• Null hypothesis population
parameters:
μ = 100
σ = 15
• Random sample statistics
Mean = 110
N = 9

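Applying formulae
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}} = \frac{15}{\sqrt{9}} = \frac{15}{3} = 5$

$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{110 - 100}{5} = \frac{10}{5} = 2$
• Given that a z-score beyond ±1.96
corresponds to p<.05 (two-tailed), the
obtained z of 2 means we would reject
the null hypothesis.

Example of hypothesis
testing using t-distribution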
• Null hypothesis population
parameters:
=100
• Random sample statistics
Mean = 110
N=9
∑x2 = 960

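Applying formulae
$\tilde{\sigma} = \sqrt{\frac{\sum x^2}{N - 1}} = \sqrt{\frac{960}{9 - 1}} = \sqrt{\frac{960}{8}} = 10.95$

$\tilde{\sigma}_{\bar{X}} = \frac{\tilde{\sigma}}{\sqrt{N}} = \frac{10.95}{\sqrt{9}} = \frac{10.95}{3} = 3.65$

$t = \frac{\bar{X} - \mu}{\tilde{\sigma}_{\bar{X}}} = \frac{110 - 100}{3.65} = \frac{10}{3.65} = 2.74$
• Given that t-scores beyond ±2.306 (df=8)
correspond to p<.05 (two-tailed), we
would reject the null hypothesis.

Hypothesis testing using
confidence intervals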
• We reject null hypothesis when null
population mean lies outside the
confidence interval.
• We infer that the alternative population
mean is higher than the null population
mean if the lower limit of the confidence
interval lies above the null population
mean, and lower if the upper limit lies
below it.

Errors in hypothesis testing
• Given the gap between statistical and
substantive significance, a decision
based on probability to retain or
reject the null hypothesis can be
wrong.

When null hypothesis is
true (Type I error)
• When null hypothesis is true, and it
is rejected, this decision is called a
Type I error.
• The probability of making such an
error is designated alpha (α) and is
equivalent to the significance level
(e.g., .05).

When null hypothesis is
true (Type I error)
• If null hypothesis is true and alpha level is
set at .05, then the null hypothesis will be
rejected 5% of time even though it is true.
• One way to safeguard against a Type I
error is to set a more stringent alpha level
(e.g., p<.01).

When null hypothesis is
false (Type II or III errors)
• When alternative hypothesis is true,
and the statistic (mean) from
alternative distribution falls within
cut-off points (i.e., p>.05), then null
hypothesis would be retained.

Type II error
• Retaining null hypothesis when alternative
hypothesis is true is called a Type II error.
• The probability of making a Type II error
is usually symbolized as beta (β).
• The size of beta depends on how
much the alternative hypothesis sampling
distribution overlaps the retention region
of the null hypothesis sampling
distribution.

Type III error
• It is also possible to make a Type III error,
by rejecting a null hypothesis but inferring
the incorrect alternative hypothesis.
• The probability of making a Type III error
is usually symbolized as gamma (γ) and is
equivalent to whatever percentage of
scores in the alternative distribution falls
in the far end of the null hypothesis
distribution. The probability of making a
Type III error is usually quite small.

The power of a test
• Power is the probability of rejecting a
false null hypothesis and correctly
inferring the position or direction of
the alternative hypothesis with
respect to the null hypothesis.
• Factors affecting power and error
rates

Power is affected by
significance (alpha) level
• Setting a less stringent significance
level (e.g., .05 rather than .01) widens
the rejection region and so increases
power, as long as the alternative
hypothesis is true.

Power is affected by magnitude of
difference between sample means
• Increasing the difference between the
means at differing levels of the IV
increases the power of the test.

Power is affected by sample size

• An increase in sample size increases
the power of the test, if the
alternative hypothesis is true.
• This is because as sample size
increases, the standard error of the
mean decreases, thus reducing the
overlap between the null and
alternative sampling distributions.

Effect size
• In order to gauge the effect of the IV,
it makes sense to contrast the
difference between the population
mean for the null hypothesis and the
population mean for the alternative
hypothesis.

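Effect size formula
$\text{Effect size} = \frac{\mu_0 - \mu_1}{\sigma}$

• where $\mu_0$ and $\mu_1$ are the
population means for the null and
alternative hypotheses, and
• $\sigma$ is the standard deviation of the
population of dependent measure scores.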

Judging effect sizes
• According to Cohen (1988)
.20 = small effect size
.50 = medium effect size
.80 = large effect size

Do we really need the null
hypothesis?
• A significant test of the null
hypothesis does not mean the data
are not a product of chance.
• The significant result may simply be
a Type I error (falsely rejecting null
hypothesis).

Do we really need the null
hypothesis?
• It is better to test the research
hypothesis directly, if the size and
direction of the effect are known.
• Even better, report a combination of
outcome values (e.g., effect sizes,
confidence intervals, strength of
relationship).

One-tailed versus two-tailed
tests
• Conventionally, we reject the null
hypothesis if the obtained z-score or
t-score falls beyond certain values in
either tail of the relevant sampling
distribution (i.e., a two-tailed test).
• In specific contexts, a one-tailed test
might seem appropriate (e.g., reject the
null hypothesis only if the test statistic
falls in the 5% left-hand tail of the
distribution).

One-tailed versus two-tailed
tests
• Generally, two-tailed tests are preferred to
one-tailed tests.
• The IV may have an effect in opposite
direction to the one predicted.

```
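
The z- and t-distribution examples in the slides can be retraced numerically. The following is a minimal Python sketch, not part of the original slides. The function names (`z_statistic`, `t_statistic`, `confidence_interval`) are hypothetical helpers introduced here for illustration, the critical values 1.96 and 2.306 are the ones quoted in the slides, and the effect-size line assumes the sample mean stands in for the alternative population mean.

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """z for a sample mean when the population SD (sigma) is known."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


def t_statistic(sample_mean, mu0, sum_sq_dev, n):
    """t for a sample mean when sigma is estimated from the sample.

    sum_sq_dev is the sum of squared deviations from the sample mean.
    """
    sigma_hat = math.sqrt(sum_sq_dev / (n - 1))
    standard_error = sigma_hat / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


def confidence_interval(sample_mean, standard_error, critical_value):
    """Two-sided interval: sample mean plus or minus critical value * SE."""
    margin = critical_value * standard_error
    return sample_mean - margin, sample_mean + margin


# Critical values quoted in the slides (two-tailed, p < .05).
Z_CRIT = 1.96
T_CRIT_DF8 = 2.306

# Slide values: null mean 100, sigma 15, sample mean 110, N = 9, sum x^2 = 960.
z = z_statistic(110, 100, 15, 9)
t = t_statistic(110, 100, 960, 9)
ci_low, ci_high = confidence_interval(110, 15 / math.sqrt(9), Z_CRIT)

# Assumption: the sample mean is used in place of the alternative mean.
effect_size = abs(100 - 110) / 15

print(f"z = {z:.2f}, reject null: {abs(z) > Z_CRIT}")
print(f"t = {t:.2f}, reject null: {abs(t) > T_CRIT_DF8}")
print(f"95% CI = ({ci_low:.1f}, {ci_high:.1f}), "
      f"null mean outside: {not ci_low <= 100 <= ci_high}")
print(f"effect size = {effect_size:.2f}")
```

Both statistics exceed their critical values (z = 2, t ≈ 2.74), and the null mean of 100 falls just below the lower confidence limit, so all three routes lead to the same decision to reject the null hypothesis.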