Basic principles of probability theory

							                   Elementary hypothesis testing

•   Purpose of hypothesis testing
•   Types of hypotheses
•   Types of errors
•   Critical regions
•   Significance levels
•   Hypothesis testing vs intervals
                      Purpose of hypothesis testing

Statistical hypotheses are in general different from scientific ones. Scientific
       hypotheses deal with the behavior of scientific subjects such as interactions
       between all particles in the universe. These hypotheses in general cannot be
       tested statistically. Statistical hypotheses deal with the behavior of observable
       random variables. These are hypotheses that are testable by observing some set
       of random variables. They are usually related to the distribution(s) of observed
       random variables.
For example, if we have observed two sets of random variables x=(x1,x2,...,xn) and
       y=(y1,y2,...,ym), one natural question arises: are the means of these two sets
       different? This is a statistically testable hypothesis. Other questions may arise: do
       these two sets of random variables come from populations with the same
       variance? Or do they come from populations with the same
       distribution? These questions can be tested using randomly observed samples.
                           Types of hypotheses

Hypotheses can in general be divided into two categories: a) parametric and b) non-parametric.
     Parametric hypotheses concern situations where the form of the population
     distribution is known and the hypothesis is about the
     value of one or several parameters of this distribution. Non-parametric
     hypotheses concern situations where none of the parameters of the
     distribution is specified in the statement of the hypothesis. For example, the
     hypothesis that two sets of random variables come from the same distribution is
     a non-parametric one.
Parametric hypotheses can also be divided into two families: 1) Simple hypotheses
     are those in which all parameters of the distribution are specified. For example, the
     hypothesis that a set of random variables comes from a normal distribution with
     known variance and known mean is a simple hypothesis. 2) Composite
     hypotheses are those in which some parameters of the distribution are specified
     and others remain unspecified. For example, the hypothesis that a set of random
     variables comes from a normal distribution with a given mean value but
     unknown variance is a composite hypothesis.
                       Errors in hypothesis testing

A hypothesis is usually not tested alone; it is tested against some alternative.
     The hypothesis being tested is called the null hypothesis and is denoted by H0; the
     alternative hypothesis is denoted by H1. Subscripts may be different and reflect
     the nature of the alternative hypothesis. The null hypothesis gets the “benefit of the doubt”.
     There are two possible conclusions: reject the null hypothesis or do not reject the
     null hypothesis. H0 is rejected only if the sample data contain sufficiently strong
     evidence that it is not true. Usually hypothesis testing comes down to evaluating
     some test statistic (a function of the sample points). If its value belongs to
     some region w, the hypothesis is rejected. This region is called the critical region. The
     region complementary to the critical region, W-w, where W is the set of all
     possible values, is called the acceptance region. By rejecting or accepting
     a hypothesis we can make two types of errors:
Type I error: reject H0 when it is true.
Type II error: accept H0 when it is false.
Type I errors are usually considered to be more serious than Type II errors.
The probability of a Type I error defines the significance level, and the probability of a
     Type II error defines the power of the test. In an ideal world we would like to
     minimize both of these errors.
                               Power of the test

The probability of a Type I error is equal to the size of the critical region, α. The
     probability of a Type II error is a function of the alternative hypothesis (say
     H1). This probability is usually denoted by β. Using the notation of probability we
     can write:
                  $$P(x \in w \mid H_0) = \alpha$$
                  $$P(x \in W - w \mid H_1) = \beta \quad\text{or}\quad P(x \in w \mid H_1) = 1 - \beta$$
where x is the sample point, w is the critical region and W-w is the acceptance
      region. If the sample point belongs to the critical region then we reject the
      null hypothesis. The above equations are nothing more than the Type I and Type II
      error probabilities written in probabilistic language.
The complementary probability of the Type II error, 1-β, is also called the power of the test of
      the null hypothesis against the alternative hypothesis. β is the probability of
      accepting the null hypothesis if the alternative hypothesis is true, and 1-β is the
      probability of rejecting H0 if H1 is true.
Since the power of the test is a function of the alternative hypothesis, the specification
      of H1 is an important step in hypothesis testing. It is usual to use test statistics
      instead of sample points to define the critical region and significance levels.
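The two probabilities above can be estimated by simulation. The following is a minimal Python sketch, not part of the original notes. It assumes a normal population with known standard deviation, a left one-sided test on the sample mean, and illustrative values (mean 130 under H0, mean 127 under H1, standard deviation 5.4, sample size 25, α = 0.05) chosen only for the demonstration. The helper name `rejection_rate` is hypothetical.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu0=130.0, sigma=5.4, n=25, alpha=0.05,
                   trials=20000, seed=0):
    """Fraction of simulated samples that fall in the critical region.

    Tests H0: mu = mu0 against H1: mu < mu0 with known sigma.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Critical region: sample mean <= c_u
    c_u = mu0 - z_alpha * sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if fmean(sample) <= c_u:
            rejections += 1
    return rejections / trials


# H0 is true: every rejection is a Type I error, so this estimates alpha.
type_1_rate = rejection_rate(true_mean=130.0)
# H1 is true (mean 127): every non-rejection is a Type II error (beta).
type_2_rate = 1 - rejection_rate(true_mean=127.0)
power = 1 - type_2_rate

print(f"estimated alpha = {type_1_rate:.3f}")
print(f"estimated beta  = {type_2_rate:.3f}, power = {power:.3f}")
```

The estimated Type I error rate should come out close to the chosen α, while the Type II error rate depends on how far the assumed alternative mean is from the null value.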
                                   Critical region
Let us assume that we want to test whether some parameter of the population is equal to a
     given value against an alternative hypothesis. Then we can write:
                                  $$H_0: \theta = \theta_0 \quad\text{against}\quad H_1: \theta < \theta_0$$
The test statistic is usually a point estimate of θ or is somehow related to it. If the critical
      region defined by this hypothesis is an interval (-∞;cu] then cu is called the critical
      value. It defines the upper limit of the critical interval. All values of the statistic to the
      left of cu lead to rejection of the null hypothesis. If the value of the test statistic is
      to the right of cu, the hypothesis is not rejected. This type of hypothesis
      is called a left one-sided hypothesis. The problem of hypothesis testing is either,
      for a given significance level, to find cu or, for a given sample statistic, to find the observed
      significance level (p-value).
                                 Significance level
It is common in hypothesis testing to set the probability of a Type I error, α, to one of a few values
       called significance levels. These levels are usually set to 0.1, 0.05 or 0.01. If, assuming the null
       hypothesis is true, the probability of observing a test statistic at least as extreme as the current one is
       lower than the significance level, then the hypothesis is rejected.
Consider an example. Let us say we have a sample from a population with normal
       distribution N(μ,σ²), where σ² is known. We want to test the following null hypothesis against the alternative
       hypothesis:
                                  $$H_0: \mu = \mu_0 \quad\text{and}\quad H_1: \mu < \mu_0$$
This is a left one-sided hypothesis. Because all parameters of the distribution
       (the mean and variance of the normal distribution) have been specified, it is a simple
       hypothesis. A natural test statistic for this case is the sample mean. We know that the
       sample mean has a normal distribution. Under the null hypothesis the mean of this
       distribution is μ0 and the variance is σ²/n. Then we can write:
                  $$P_{\mu_0}(\bar{X} \le c_u) = P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \le \frac{c_u - \mu_0}{\sigma/\sqrt{n}}\right) = P\left(Z \le \frac{c_u - \mu_0}{\sigma/\sqrt{n}}\right)$$
If we require this probability to be equal to α and use the fact that Z has the standard normal
      distribution (mean 0 and variance 1), then using the tables of the standard normal
      distribution we can solve this equation for cu.
                                  Significance level: Cont.
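Let us define:
                 $$z_\alpha = \frac{\mu_0 - c_u}{\sigma/\sqrt{n}}$$

Then we need to solve the equation (using standard tables or programs):
                 $$P(Z \ge z_\alpha) = \alpha$$

Having found zα we can solve the equation w.r.t. cu:
                 $$\frac{\mu_0 - c_u}{\sigma/\sqrt{n}} = z_\alpha \quad\text{and}\quad c_u = \mu_0 - z_\alpha \frac{\sigma}{\sqrt{n}}$$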
If the sample mean is less than this value of cu we reject the null hypothesis at significance level
       α. If the sample mean is greater than this value then we do not reject the null
       hypothesis. If we reject (the sample mean is smaller than cu), we are saying that if
       the population mean were equal to μ0, the probability of observing a
       sample mean this small or smaller would be at most α.
To find the power of the test we need to find the probability of rejection under the condition that the alternative
       hypothesis is true.
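As a hedged illustration of that last step, the sketch below computes the power of the left one-sided test analytically for a chosen alternative mean. It assumes a normal population with known standard deviation; the function name `power_left_sided` and the numerical values are illustrative assumptions, not taken from the original notes.

```python
import math
from statistics import NormalDist


def power_left_sided(mu0, mu1, sigma, n, alpha):
    """Power of the test of H0: mu = mu0 against H1: mu < mu0
    when the true population mean is mu1 (sigma assumed known)."""
    std_error = sigma / math.sqrt(n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    c_u = mu0 - z_alpha * std_error          # critical value under H0
    # Probability that the sample mean falls at or below c_u when mu = mu1
    return NormalDist().cdf((c_u - mu1) / std_error)


# Illustrative values (assumptions): mu0 = 130, sigma = 5.4, n = 25, alpha = 0.05
powers = {mu1: power_left_sided(130.0, mu1, 5.4, 25, 0.05)
          for mu1 in (130.0, 129.0, 128.0, 127.0)}
for mu1, p in powers.items():
    print(f"true mean {mu1}: power = {p:.3f}")
```

When the true mean equals the null value the power reduces to α, and it grows as the alternative mean moves further below the null value.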
                       Significance level: An example.
Let us assume that we have a sample of size 25 and the sample mean is 128. We know that this
       sample comes from a population with normal distribution with standard deviation 5.4. We do not
       know the population mean. We want to test the following hypothesis:
                                  $$H_0: \mu = 130 \quad\text{against}\quad H_1: \mu < 130$$
We have μ0 = 130. Let us set the significance level to 0.05. Then from the table we find that
       z0.05=1.645 and we can find cu:
                         $$c_u = \mu_0 - z_{0.05}\,\frac{5.4}{\sqrt{25}} = 130 - 1.645 \cdot \frac{5.4}{5} = 128.22$$
Since the value of the sample mean (128) belongs to the critical region (i.e. it is less than 128.22)
       we would reject the null hypothesis at significance level 0.05.
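The arithmetic of this example can be checked with a short Python sketch. It uses only the standard library; the value 5.4 is treated as the population standard deviation, which is how it enters the calculation of cu, and the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu0 = 130.0          # mean under the null hypothesis
sigma = 5.4          # population standard deviation (assumed known)
n = 25               # sample size
sample_mean = 128.0
alpha = 0.05

# Upper-tail quantile of the standard normal distribution, about 1.645
z_alpha = NormalDist().inv_cdf(1 - alpha)
# Critical value of the left one-sided test
c_u = mu0 - z_alpha * sigma / math.sqrt(n)
reject = sample_mean <= c_u

print(f"z_alpha = {z_alpha:.3f}, c_u = {c_u:.2f}, reject H0: {reject}")
```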

The test we performed was a left one-sided test, i.e. we wanted to know whether the sample mean is
      significantly less than the assumed value (130). Similarly we can build right one-sided tests, and combine
      these two tests to build two-sided tests. A right-sided test would look like
                                    $$H_0: \mu = \mu_0 \quad\text{against}\quad H_1: \mu > \mu_0$$
Then the critical region would consist of the interval [cl;∞), where cl is the lower bound of the critical
      region.

And a two-sided test would look like
                                    $$H_0: \mu = \mu_0 \quad\text{against}\quad H_1: \mu \ne \mu_0$$
Then the critical region would consist of the union of two intervals (-∞;cu]∪[cl;∞).
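A minimal sketch of the two-sided case follows. The notes do not say how the significance level is divided between the two tails; the code assumes the common convention of putting α/2 in each tail, and reuses the numbers of the worked example (μ0 = 130, standard deviation 5.4, n = 25) purely for illustration. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_sided_critical_values(mu0, sigma, n, alpha):
    """Return (c_u, c_l) for H0: mu = mu0 against H1: mu != mu0,
    assuming known sigma and alpha/2 in each tail."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    c_u = mu0 - half_width   # upper end of the left interval (-inf; c_u]
    c_l = mu0 + half_width   # lower end of the right interval [c_l; inf)
    return c_u, c_l


c_u, c_l = two_sided_critical_values(130.0, 5.4, 25, 0.05)
sample_mean = 128.0
reject = sample_mean <= c_u or sample_mean >= c_l
print(f"critical region: (-inf; {c_u:.2f}] U [{c_l:.2f}; inf), reject H0: {reject}")
```

Under these assumptions a sample mean of 128 falls between the two critical values, so the two-sided test at level 0.05 does not reject, although the left one-sided test at the same level does.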
                            Composite hypothesis
In the above example we assumed that the population variance is known. It was a simple
      hypothesis (all parameters of the normal distribution were specified). But in
      real life it is unusual to know the population variance. If the population variance is
      not known, the hypothesis becomes composite (the hypothesis defines the population
      mean but the population variance is not known). In this case the variance is calculated
      from the sample and it replaces the population variance. Then, instead of the normal
      distribution, the t distribution with n-1 degrees of freedom is used. The value of z is found from the
      table of the t distribution with n-1 degrees of freedom. If n is large (>100) then, as can be expected, the normal
      distribution approximates the t distribution very well.
The above example can easily be extended to testing differences between the means of two
      samples. If we have two samples from populations with equal but unknown
      variances, then the test of the difference between the two means uses the t distribution with
      (n1+n2-2) degrees of freedom, where n1 is the size of the first sample and n2 is the
      size of the second sample.
If the variances of both populations were known, then testing the difference
      between the two means would use the normal distribution.
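For the equal-variance two-sample case, a hedged sketch of the test statistic is given below. It assumes the standard pooled-variance form of the statistic, which the notes do not write out; the function name `pooled_t_statistic` and the two small data sets are made up for illustration only.

```python
import math
from statistics import fmean, variance


def pooled_t_statistic(x, y):
    """Two-sample t statistic assuming equal but unknown variances.

    Returns the statistic and its degrees of freedom (n1 + n2 - 2).
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance from both samples
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (fmean(x) - fmean(y)) / std_error, df


# Made-up illustrative data, not from the original notes
x = [5.1, 4.9, 5.6, 5.8, 6.0]
y = [4.2, 4.8, 5.0, 4.4, 4.6]
t_stat, df = pooled_t_statistic(x, y)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The statistic is then compared with a critical value of the t distribution with the returned degrees of freedom, taken from a table or a statistics package.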
                                 P-value of the test
Sometimes, instead of setting a pre-defined significance level, the p-value is reported. It is also
    called the observed significance level. Let us analyse it. Consider the above
    example, where we had a sample of size 25 with sample mean 128. We assumed
    that we knew the population standard deviation, 5.4. The p-value is calculated as follows:
                     $$P_{\mu_0}(\bar{X} \le 128) = P\left(Z \le \frac{128 - \mu_0}{\sigma/\sqrt{n}}\right) = P(Z \le -1.852) \approx 0.032$$

We would reject the null hypothesis at significance level 0.05 but we would not reject it
    at significance level 0.01. If the population mean were
    130, the probability of observing a sample mean of 128 or less would be about 0.032. In other words, if we drew
    a sample of size 25 100 times, we would observe around 3 times that the sample mean is less than
    or equal to 128.
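The p-value of this example can be reproduced with the standard library. This is a small sketch under the same assumptions as the example (normal population, known standard deviation 5.4); the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu0, sigma, n = 130.0, 5.4, 25
sample_mean = 128.0

# Standardised test statistic under H0
z = (sample_mean - mu0) / (sigma / math.sqrt(n))
# Left one-sided p-value: probability of a sample mean this small or smaller
p_value = NormalDist().cdf(z)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
for alpha in (0.1, 0.05, 0.01):
    print(f"alpha = {alpha}: reject H0 = {p_value < alpha}")
```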
                    Hypothesis testing vs intervals
Some modern authors in statistics think that significance testing is an overworked
    procedure. It does not make much sense once we have observed the sample; then
    it is much better to work with confidence intervals. Since we can calculate
    statistics related to the parameter we want to estimate, we can make
    inferences about where the “true” value of the parameter may lie. As we saw in the
    above example, we would reject a particular hypothesis at significance level 0.05
    but would not reject it at significance level 0.01. Testing the hypothesis did not say
    anything about the parameter, in spite of the fact that we had the sample mean. By
    contrast, a confidence interval at least says where the “true” value may be and whether we
    need more experiments to increase our confidence and reduce the interval size.
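To make the comparison concrete, here is a minimal sketch of a confidence interval for the mean in the earlier example. The notes do not give a formula for the interval; the code assumes the standard two-sided interval for a normal mean with known standard deviation (sample mean plus or minus z times σ/√n), and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Two-sided confidence interval for the population mean,
    assuming a normal population with known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Values from the worked example: sample mean 128, sigma 5.4, n = 25
low, high = mean_confidence_interval(128.0, 5.4, 25)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

Unlike the reject or not-reject decision, the interval shows the whole range of mean values compatible with the sample, and its width shrinks as the sample size grows.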
                                 Further reading
A full exposition of hypothesis testing and other statistical tests can be found in:

Stuart, A., Ord, J.K. and Arnold, S. (1991) Kendall’s Advanced Theory of Statistics.
      Volume 2A. Classical Inference and the Linear Model. Arnold,
      London, Sydney, Auckland.
Box, G.E.P., Hunter, W.G. and Hunter, J.S. (1978) Statistics for Experimenters.
                               Exercise 1
Two species (A and B) of trees were planted randomly. Each species had 10
   plots. The average height for each plot was measured after 6 years. Analyze
   the differences in means.
A: 3.2 2.7 3.0 2.7 1.7 3.3 2.7 2.6 2.9 3.3
B: 2.8 2.7 2.0 3.0 2.1 4.0 1.5 2.2 2.7 2.5

Write a report.
Hint: Use t.test for differences in means and var.test for differences in
   variances.

						