# Hypothesis testing and parameter estimation

```
Hypothesis testing and parameter estimation
Bhuvan Urgaonkar
“Empirical Methods for Artificial Intelligence” by P. Cohen
System behavior in
unknown situations

Self-tuning systems ought to behave properly in
situations not previously encountered

How to quantify the goodness of a system in
dealing with unknown situations?

Statistical inference is one way

Statistical inference

Process of drawing inference about an unseen
population given a relatively small sample

Populations and samples

Statistics: Functions on samples

Parameters: Functions on populations

Examples

Example 1: Toss a fair coin
– Parameter: expected number of heads in 10 tosses
– Can be determined analytically

Example 2: Two chess programs A and B play 15
games, A wins 10, draws 2, loses 3.
– Parameter: probability that A wins
– Population of all possible chess games too large to
enumerate => we cannot know the exact value
• Can estimate pwin as p = 10/15 ≈ 0.67
• p is a statistic derived from the above sample

Two kinds of statistical inference

Hypothesis testing: Answer a yes-or-no question
about a population and assess the probability
that the answer is wrong
– Assume pwin=0.5 and assess the probability of the
sample result p=0.67 under that assumption
– If this probability is very small, conclude that A and B
are probably not equally strong

Parameter estimation: Estimate the true value of
a parameter given a statistic
– If p=0.67, what is the “best” estimate of pwin?
– How wide an interval to draw around p to be
confident that pwin falls within it?

Hypothesis testing example

Two programs A and B that summarize news
stories
– Performance measured as recall, the proportion of
the important parts of a story that make it into the
summary
Suppose you run A every day for 120 days and
record each day’s mean recall score over 10 stories
Then you run B and want to answer:
– Is B better than A?

Hypothesis testing steps

1. Formulate a null hypothesis
– mean(A) = mean(B)
2. Gather a sample of 10 news stories and run them
through B. Call the sample mean Emean(B)
3. Assuming the null hypothesis is right, estimate the
distribution of mean recall scores for all possible
samples of size 10 run through B
4. Calculate the probability of obtaining a value at least
as extreme as Emean(B) given this distribution
5. If this probability is low, reject the null hypothesis

Sampling distributions

Distribution of a statistic calculated from all possible
samples of a given size, drawn from a given
population

Example: Two tosses of a fair coin; the sample statistic
is the number of heads
– Sampling distribution is discrete
– Elements are 0, 1, 2 with probabilities 0.25, 0.5, 0.25

How to get sampling distributions?

Exact sampling distributions

Coin tossed 20 times, num. heads=16
– Is the coin fair?
Sampling distribution of the proportion ph under
the null hypothesis that the coin is fair
Easy to calculate exact probabilities of all the
values of ph for N coin tosses
– Possible values: 0/N, 1/N, …, N/N
– Pr(ph=i/N) = [N! / (i! * (N-i)!)] * 0.5^N
– Pr(ph=16/20) = 0.0046, so this result is very unlikely
if the coin is fair

Estimated sampling distributions

Unlike the sampling distribution of the
proportion, that of the mean generally cannot be
calculated exactly, because the population distribution
is unknown.
– Recall the news story example

It can, however, be estimated due to a
remarkable theorem

Central limit theorem

The sampling distribution of the mean of
samples of size N approaches a normal
distribution as N increases.
– If samples are drawn from a population with mean M
and std. dev. SD, then the mean of the sampling
distribution is M and its std. dev. is SD/sqrt(N)
– This holds irrespective of the shape of the population
distribution, provided its variance is finite

The missing piece in
hypothesis testing

Null hypothesis
– mean(A) = mean(B)

We don’t know the sampling distribution of Emean(B), but
we can estimate that of Emean(A) from the 120 days of data!
– CLT: it is approximately normal with mean mean(A), and
under the null hypothesis mean(A) = mean(B)

Computer-aided methods for
estimating sampling distributions

Use simulation to estimate the sampling
distribution
Monte Carlo tests
– Used when the population distribution is known but the
sampling distribution of the test statistic is not
– Draw many samples from the known distribution and
compute the test statistic on each
Bootstrap methods
– Used when the population distribution is unknown
– Idea: Resample with replacement from the sample (treat
the sample as the population!)

Other related concepts/techniques

Hypothesis tests that work under different conditions
– Z-test, t-test (small values of N)
– Ref: Paul Cohen

Parameter estimation
– Confidence intervals
– Analysis of variance: interaction among variables
– Contingency tables
– Ref: Paul Cohen

Expectation maximization
– X: observed data, Z: unobserved data; let Y = X ∪ Z
– Searches for the hypothesis h that maximizes E[ln P(Y | h)]
– Ref: “Machine Learning” by Tom Mitchell

```
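The coin example from the "Exact sampling distributions" slide can be checked numerically. The sketch below is illustrative only and is not part of the original slides: the function names (`binomial_pmf`, `upper_tail`, `monte_carlo_upper_tail`), the trial count, and the seed are assumptions chosen for the example. It computes the exact probability of 16 heads in 20 tosses under the null hypothesis of a fair coin, the probability of a result at least that extreme, and a Monte Carlo estimate of the same tail probability obtained by drawing samples from the known (fair-coin) population.

```python
import math
import random


def binomial_pmf(i, n, p=0.5):
    """Exact probability of exactly i heads in n independent tosses."""
    return math.comb(n, i) * p**i * (1 - p) ** (n - i)


def upper_tail(i, n, p=0.5):
    """Exact probability of at least i heads in n tosses."""
    return sum(binomial_pmf(k, n, p) for k in range(i, n + 1))


def monte_carlo_upper_tail(i, n, p=0.5, trials=100_000, seed=0):
    """Estimate the same tail probability by simulating many samples.

    trials and seed are arbitrary choices made for this illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated sample: count heads in n tosses of the coin.
        heads = sum(rng.random() < p for _ in range(n))
        if heads >= i:
            hits += 1
    return hits / trials


exact_point = binomial_pmf(16, 20)
exact_tail = upper_tail(16, 20)
simulated_tail = monte_carlo_upper_tail(16, 20)

print(f"Pr(exactly 16 heads in 20) = {exact_point:.4f}")  # 0.0046
print(f"Pr(16 or more heads in 20) = {exact_tail:.4f}")   # 0.0059
print(f"Monte Carlo estimate of the tail = {simulated_tail:.4f}")
```

The first value matches the 0.0046 quoted on the slide. A hypothesis test would normally use the tail probability (16 or more heads) instead of the single-point probability. The simulated value should land close to the exact tail, which illustrates why simulation is a reasonable fallback when no closed-form sampling distribution is available.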