# Hypothesis testing

```
Lecture 8

Hypothesis testing: SIDS

Hypothesis testing   1
Testing alternative hypotheses
•   Suppose we want to compare how a given piece of evidence
(e) bears on the probabilities of H1 and H2.
p(H_1 \mid e) = \frac{p(H_1) \times p(e \mid H_1)}{p(e)}

p(H_2 \mid e) = \frac{p(H_2) \times p(e \mid H_2)}{p(e)}

\frac{p(H_1 \mid e)}{p(H_2 \mid e)} = \frac{p(H_1) \times p(e \mid H_1) / p(e)}{p(H_2) \times p(e \mid H_2) / p(e)} = \frac{p(H_1) \times p(e \mid H_1)}{p(H_2) \times p(e \mid H_2)}
• The last formula is very useful and worth examining.

The ratio of probabilities
•   Let us look at the ratio of posterior probabilities of H1 and H2:
\frac{p(H_1 \mid e)}{p(H_2 \mid e)} = \frac{p(H_1)}{p(H_2)} \times \frac{p(e \mid H_1)}{p(e \mid H_2)}
• The posterior probability of H1 is higher than the posterior
probability of H2 if and only if the product of prior
probability and likelihood is higher for H1 than for H2.
• The formula gives the ratio of probabilities of the two
theories. If these two theories are the only possibilities (one
of them must be true), then we can immediately obtain the
probabilities themselves.
• For example, if p(H1│e) is four times higher than p(H2│e),
then p(H1│e) must be 0.8, and p(H2│e) must be 0.2 (under
the assumption that either H1 or H2 is true).
The Sally Clark case

•   December 1996: Christopher Clark (11 weeks old) dies in
the presence of his mother.
•   January 1998: Harry Clark (8 weeks old) dies, and again
only the mother is present.
•   1999: Sally Clark convicted of double murder.
•   2000: First appeal against the conviction (unsuccessful).
•   2003: Second appeal (successful).
•   An expert witness for the prosecution relied on a
probabilistic argument that caused an uproar among
statisticians.
•   Those who intervened included the Royal Statistical Society,
the President of the RSS, two professors of statistics (Oxford
and UCL), the President of the Mathematical Society, the
Governor of the Bank of England...


•   Sudden infant death syndrome (SIDS) is “the sudden death
of an infant under 1 year of age, which remains
unexplained after a thorough case investigation, including
performance of a complete autopsy, examination of the
death scene, and review of the clinical history.”
•   SIDS is an infant death, due to unknown natural causes.
•   SIDS happens very rarely, so the repetition of SIDS in the
same family must happen even more rarely.
•   Meadow’s law: “One case of SIDS in a family is a tragedy,
two cases is suspicious, and three cases is a murder until
proven otherwise.” (Goldfinger’s rule!)
•   Applied to the Sally Clark case: the chance of one SIDS is
about 1 in 8,500. Squaring that figure gives the chance of two
SIDS: about 1 in 73 million.

Bayes’s theorem (the odds form)
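(1)     \frac{p(2S \mid E)}{p(2M \mid E)} = \frac{p(2S)}{p(2M)} \times \frac{p(E \mid 2S)}{p(E \mid 2M)}

(2)     p(2S) = p(S) \times p(S_2 \mid S)

(3)     p(2M) = p(M) \times p(M_2 \mid M)

(4)     \frac{p(2S \mid E)}{p(2M \mid E)} = \frac{p(S)}{p(M)} \times \frac{p(S_2 \mid S)}{p(M_2 \mid M)} \times \frac{p(E \mid 2S)}{p(E \mid 2M)}
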
The ratio of posterior probabilities of 2S and 2M depends on:
1. The ratio of prior probabilities of S and M.
2. The ratio of repetition probabilities of S and M.
3. The ratio of likelihoods of 2S and 2M.
Prior probabilities of a single S and a single M

•   SIDS happens more frequently than infant murder, and so
p(S)/p(M) is significantly greater than 1.
•   But the difference is sometimes exaggerated.
•   An unspecified proportion of officially declared SIDS are
not really SIDS but murders.
•   True, some officially declared murders are also not really
murders but SIDS, but for two reasons the mistakes are far
more frequent in the former direction.
•   First, the crucial witness usually declares the case to be
SIDS and denies the murder hypothesis.
•   Second, the case is classified as murder only if it is proved
beyond reasonable doubt, whereas the SIDS classification
is based largely on ignorance.

To square or not to square, that is the question
•    Meadow claimed that p(S & S2) = p(S) × p(S).
•    In general, however, p(S & S2) = p(S) × p(S2│S).
•    So, Meadow’s claim entails that p(S2│S) = p(S), which can
be called the independence hypothesis (IH).
•    There are two possibilities:
1.   Meadow was not aware that his claim entails IH, and he
committed an elementary probability mistake.
2.   Meadow was aware that his claim entails IH, but he did
not see it as a problem because he believed that IH is true.
•    The RSS did not consider option 2 at all, but immediately
embraced option 1, the fallacy scenario.
•    Two reasons: (a) Meadow gave no justification for IH, and
(b) the RSS thought that there are strong a priori reasons
against IH.

The probability of a second SIDS

•   “This approach [the squaring of the single SIDS
probability] is, in general, statistically invalid. It would
only be valid if SIDS cases arose independently within
families, an assumption that would need to be justified
empirically. Not only was no such empirical justification
provided in the case, but there are very strong a priori
reasons for supposing that the assumption will be false.
There may well be unknown genetic or environmental
factors that predispose families to SIDS, so that a second
case within the family becomes much more likely.” (RSS
2001).
•   …or perhaps less likely?
•   Isn’t this an empirical issue, not one to be decided by “strong
a priori reasons,” i.e. speculation?
Should p(M) be squared too?

•   Dawid’s “equivalence argument”: if p(S) is squared, then
the same thing could be done with p(M) with equal
legitimacy.
•   The final result: p(2M) >> p(2S).
•   The equivalence argument is wrong.
•   Even if squaring of p(S) were dubious, the same procedure
with p(M) could be, and would be, much more dubious.
•   There is a strong reason to believe that the probability of a
second infanticide in the family would be substantially
higher than the probability of the first infanticide — if there
is no knowledge that the first case was infanticide.
•   In “Beyond Reasonable Doubt,” Helen Joyce works with
the assumption that p(M2│M) = 0.1.

Figure 1a: Prior probabilities of S and M (SIDS independence)
[Plot not reproduced. Vertical axis: probability (0.00 to 1.00). Horizontal axis: proportion of misdiagnosed SIDS (0% to 20%). Curves: M, 2M, S, 2S.]

Figure 1b: Prior probabilities of S and M (SIDS dependence)
[Plot not reproduced. Vertical axis: probability (0.00 to 1.00). Horizontal axis: proportion of misdiagnosed SIDS (0% to 20%). Curves: M, 2M, S, 2S.]

Likelihoods of double SIDS and double murder

•   What are p(E│2S) and p(E│2M)?
•   What is E in the Sally Clark case?
•   Is E merely the fact that both children died?
•   But then, p(E│2S) = p(E│2M)= 1.
•   Was Clark really convicted just on the basis of prior
probabilities, as Helen Joyce suggests? (“The lightning
does not strike twice.”)
•   The judge’s explicit instruction to the jury: “I should, I
think, members of the jury, just sound a note of caution
about the statistics. However compelling you may find
those statistics to be, we do not convict people in these
courts on statistics. It would be a terrible day if that were
so.”

Why is this evidence “worrying”?

•   The judge in the first appeal: “Young, immobile infants do
not sustain injury without the carer having a credible
history as to how the injury was caused.”
•   “We and others have gone through the movements of
resuscitation on cadavers and have found that it is
extremely difficult to fracture ribs in an infant by pressing
on the chest or by any of the usual methods of artificial
respiration. Fractures of the ribs, however, can be
relatively easily produced by abnormal grasping of the
child’s thorax. The presence of fractures in any site in a
child younger than 1 year should be considered as caused
by abuse unless proven otherwise.” (John Emery)

\frac{p(2S \mid E)}{p(2M \mid E)} = \frac{p(S)}{p(M)} \times \frac{p(S_2 \mid S)}{p(M_2 \mid M)} \times \frac{p(E \mid 2S)}{p(E \mid 2M)} = 5.6 \times 0.0069 \times 0.2 \approx 0.008

Figure 2A: Double SIDS or double murder: posterior probabilities (I)
[Plot not reproduced. Vertical axis: probability (0.00 to 1.00). Horizontal axis: proportion of misdiagnosed SIDS (0% to 20%). Curves: 2M, 2 SIDS.]

Figure 2B: Double SIDS or double murder: posterior probabilities (D)
[Plot not reproduced. Vertical axis: probability (0.00 to 1.00). Horizontal axis: proportion of misdiagnosed SIDS (0% to 20%). Curves: 2M, 2 SIDS.]

```
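The slide "The ratio of probabilities" moves from a ratio of posteriors to the posteriors themselves. A minimal Python sketch of that step follows. It assumes, as the slide does, that H1 and H2 are the only possibilities; the function name `posteriors_from_odds` is invented for this illustration and is not part of the lecture.

```python
def posteriors_from_odds(odds):
    """Convert the ratio p(H1|e) / p(H2|e) into the two posteriors.

    Assumption: H1 and H2 are mutually exclusive and exhaustive,
    so the two posteriors must sum to 1.
    """
    if odds < 0:
        raise ValueError("odds must be non-negative")
    p_h1 = odds / (1.0 + odds)
    p_h2 = 1.0 / (1.0 + odds)
    return p_h1, p_h2


# The slide's example: p(H1|e) is four times as high as p(H2|e).
p_h1, p_h2 = posteriors_from_odds(4.0)
print(f"p(H1|e) = {p_h1:.1f}, p(H2|e) = {p_h2:.1f}")
```

If a third hypothesis were possible, the ratio alone would no longer fix the two probabilities, which is why the exhaustiveness assumption matters.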
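The slide "To square or not to square" contrasts Meadow's squaring with the general product rule p(S & S2) = p(S) × p(S2│S). The sketch below works through the arithmetic in Python. The rate of 1 in 8,500 comes from the slides. The unrounded rate of 1 in 8,543 and the tenfold dependence factor are not in the slides: the first is the figure usually quoted as the source of "73 million", and the second is a purely hypothetical value chosen to show how the result moves when independence fails.

```python
def p_double(p_first, p_second_given_first):
    """General product rule: p(S & S2) = p(S) * p(S2|S)."""
    return p_first * p_second_given_first


# Rate from the slides: one SIDS in about 8,500 births.
p_s = 1 / 8500

# Independence hypothesis (IH): p(S2|S) = p(S), i.e. plain squaring.
p_2s_independent = p_double(p_s, p_s)
print(f"IH, rounded rate:   1 in {1 / p_2s_independent:,.0f}")

# Assumption (not in the slides): unrounded rate of 1 in 8,543.
p_s_unrounded = 1 / 8543
p_2s_unrounded = p_double(p_s_unrounded, p_s_unrounded)
print(f"IH, unrounded rate: 1 in {1 / p_2s_unrounded:,.0f}")

# Hypothetical dependence: a second SIDS is ten times as likely
# after a first one. The factor 10 is illustrative only.
p_2s_dependent = p_double(p_s, 10 * p_s)
print(f"Dependence (x10):   1 in {1 / p_2s_dependent:,.0f}")
```

Whatever the true value of p(S2│S) turns out to be, the double-SIDS prior scales linearly with it, so the empirical question raised on the slide directly controls the headline figure.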
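The odds form of Bayes's theorem, formula (4), can be checked numerically with the three ratios given on the second "Why is this evidence worrying?" slide (5.6, 0.0069 and 0.2). The Python sketch below multiplies them and converts the resulting odds into probabilities. It assumes that double SIDS and double murder are the only two hypotheses in play; the function name `posterior_odds` is invented for this example, and the extracted slides do not state which inputs stand behind the three ratios.

```python
def posterior_odds(prior_ratio, repetition_ratio, likelihood_ratio):
    """Formula (4): p(2S|E) / p(2M|E) as a product of three ratios.

    prior_ratio      = p(S) / p(M)
    repetition_ratio = p(S2|S) / p(M2|M)
    likelihood_ratio = p(E|2S) / p(E|2M)
    """
    return prior_ratio * repetition_ratio * likelihood_ratio


# Ratios as given on the slide.
odds = posterior_odds(5.6, 0.0069, 0.2)

# Assumption: 2S and 2M are exhaustive, so the posteriors sum to 1.
p_2s = odds / (1 + odds)
p_2m = 1 / (1 + odds)

print(f"posterior odds 2S : 2M = {odds:.4f}")
print(f"p(2S|E) = {p_2s:.4f}, p(2M|E) = {p_2m:.4f}")
```

Because the three ratios enter as a simple product, changing any one of them by some factor changes the posterior odds by the same factor, which makes it easy to test how sensitive the conclusion is to each input.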
 views: 3 posted: 10/25/2011 language: English pages: 17