POSTED ON: 7/20/2010
Confidence Interval or Interval Estimation

• From the statistical tables for a Standard Normal distribution, N(0,1), we note that:

Area under density function    From     To
0.90                           -1.64    1.64
0.95                           -1.96    1.96
0.99                           -2.58    2.58

[Figure: N(0,1) density with central area 0.95 between -1.96 and +1.96]

• From the central limit theorem, if $\bar{x}$ and $s^2$ are the mean and variance of a random sample of size n, with n large (from a large parent population), then a 90% C.I. for the population mean µ follows from

$$P\left\{-1.64 \le \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \le +1.64\right\} \approx 0.90$$

where s replaces σ if σ is unknown

• or, equivalently,

$$P\left\{\bar{x} - 1.64\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.64\,\frac{\sigma}{\sqrt{n}}\right\} \approx 0.90$$

Confidence Interval for the mean

1. If the sample size is less than 30 (n < 30) and the population variance is unknown:

$$\bar{X} \pm t_{(n-1,\,\alpha/2)}\,\frac{s}{\sqrt{n}}$$

Thus, for a 95% confidence interval for a sample with 20 observations,

$$\bar{X} \pm 2.093\,\frac{s}{\sqrt{n}}$$

– 2.093 is the t(19, 0.025) percentile point.

Confidence Interval for the mean - contd.

2. A normal distribution with known population variance, or a large sample size (n > 30):

$$\bar{X} \pm Z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

(with σ in place of s when the population variance is known)

Thus, for a 95% confidence interval for a sample with 50 observations,

$$\bar{X} \pm 1.96\,\frac{s}{\sqrt{n}}$$

Confidence Interval for Variance

• A normal distribution
– Depends on the chi-square distribution, which is the distribution of squared normal variables
– So, if the underlying distribution is not normal, the estimation will be poor
– With $\chi^2_{n-1,\,\alpha/2}$ denoting the upper α/2 point, the 100(1-α) percent interval for σ² is

$$\left(\frac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}\right)$$

– Not symmetric
• Rarely used in practice as such, but equality of variances is tested (using the F-distribution) in a number of procedures

CI for Difference between Two Population Means

Two random samples are drawn from two populations.

1. If the population variances are known, the C.I. is given by

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

• Use the sample variances in place of $\sigma_1^2$ and $\sigma_2^2$ when these are estimated from the samples
• When n1 + n2 - 2 < 30, use the t distribution

CI for Difference between Two Population Means - contd.

• Population Variances are Equal and Unknown
– Proceed as on the previous slide, except that a pooled estimate of the sample variance is used:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

• Population Variances are Unequal and Unknown

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

CI for Population Proportions

• Same procedures as with means, with $\hat{p}$ for $\bar{x}$ and $\hat{p}(1-\hat{p})$ for $s^2$
• For example,

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$$

• is the 100(1-α) percent confidence interval for p
• Similarly for the difference between two population proportions

Examples: confidence intervals

• Attribute Sampling
A random sample of size n = 25 has $\bar{x}$ = 15 and s = 2. Then a 95% confidence interval for µ is

$$P\left\{\bar{x} - 1.96\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\,\frac{s}{\sqrt{n}}\right\} \approx 0.95$$

i.e. 15 ± 1.96 (2/5), so the C.I. is 14.22 to 15.78.

• A random sample of size n = 10, drawn from a large parent population, has a mean of 12 and a standard deviation s = 2. Then a 99% confidence interval for the parent mean is

$$\bar{x} \pm 3.25\,\frac{s}{\sqrt{n}}$$

i.e. $12 \pm 3.25\,(2)/\sqrt{10}$, that is an interval 9.94 to 14.06 (3.25 is the t(9, 0.005) percentile point),
• and the 95% confidence limits for the parent mean are

$$\bar{x} \pm 2.262\,\frac{s}{\sqrt{n}}$$

i.e. $12 \pm 2.262\,(2)/\sqrt{10}$, that is an interval 10.57 to 13.43.

• Proportionate Sampling
A random sample of size n = 1000 has p = 0.40.

$$P\left\{p - 1.96\sqrt{\frac{p(1-p)}{n}} \le P \le p + 1.96\sqrt{\frac{p(1-p)}{n}}\right\} \approx 0.95$$

Note that for n = 1000, $1.96\sqrt{p(1-p)/n} \approx 0.03$ for values of p between 0.3 and 0.7. A 95% confidence interval for P is 0.40 ± 0.03, i.e. 0.37 to 0.43.
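The attribute and proportionate sampling examples can be checked numerically. The following is a minimal sketch, assuming only the Python standard library; the function names `mean_ci` and `proportion_ci` are hypothetical helpers introduced here for illustration, and the critical value for the mean is passed in as a table value rather than computed.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(xbar, s, n, crit):
    """Interval xbar ± crit * s / sqrt(n); crit is a z or t table value."""
    half_width = crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


def proportion_ci(p_hat, n, conf=0.95):
    """Large-sample interval p_hat ± z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    # Two-sided z value from the standard normal, about 1.96 for 95%
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Attribute sampling example: n = 25, mean 15, s = 2, z = 1.96
lo, hi = mean_ci(15, 2, 25, 1.96)
print(f"mean: {lo:.2f} to {hi:.2f}")              # mean: 14.22 to 15.78

# Proportionate sampling example: n = 1000, p = 0.40
p_lo, p_hi = proportion_ci(0.40, 1000)
print(f"proportion: {p_lo:.2f} to {p_hi:.2f}")    # proportion: 0.37 to 0.43
```

Passing the critical value explicitly keeps the sketch free of third-party dependencies; for small samples a t table value such as 2.093 or 2.262 would be supplied in place of 1.96.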
In the proportionate sampling example, the ±0.03 margin is referred to as the 3% “swing” or “inherent error”.

Hypothesis Testing

Example: Hypothesis testing

• Suppose that it is claimed that the average survival time of patients with cancer at a specific site is 60 months. A random sample of n = 49 patients gives a mean of 55 with a standard deviation of 2. Is the sample finding consistent with the claim?

We regard the original claim as a null hypothesis (H0) which is tentatively accepted as true:

$$H_0: \mu = 60, \quad \text{with } H_1: \mu \ne 60$$

If H0 is true, the test statistic is, as above,

$$t_{n-1} = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

Here $t = (55 - 60)/(2/\sqrt{49}) = -17.5$, which lies far outside the ±1.96 limits shown, so the sample is not consistent with the claim.

[Figure: N(0,1) density with central area 0.95 between -1.96 and 1.96; rejection regions in the two tails]

Testing a Single Mean – One Sided

• Test to compare the mean of a normal distribution against a pre-specified value, such as a population mean
• Test statistic is

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

• for H0: µ = µ0 vs. H1: µ < µ0 or µ > µ0
– with σ unknown
• Reject if t < -t(n-1, α) (for H1: µ < µ0) or t > t(n-1, α) (for H1: µ > µ0), accept otherwise
– t is called a test statistic
– t(n-1, α) is called a critical value
• Alternatively, use p-values directly

Testing a Single Mean – Two Sided

• In some cases, we may not be sure in which direction the difference goes
• In this case, we are testing
– H0: µ = µ0 vs. H1: µ ≠ µ0
– Test statistic is the same

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

– Reject if |t| > t(n-1, α/2), accept otherwise
– As with confidence intervals, two sided tests have higher critical values

Paired T-Test

• Test to deal with two observations on the same individuals
– Before vs. after treatment
– Before vs. after some biological milestone
• Approach is to calculate the differences between the two measurements for each individual and then test the mean difference against zero
– $d_i = X_{i1} - X_{i2}$
– Test statistic $t = \bar{d}/(s_d/\sqrt{n})$
– And test against the t-distribution with d.f. = n-1 and associated p-value

Testing a Single Variance

• Test statistic (Chi-Square) is:

$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$

• One sided test, H0: σ² = σ0² vs. H1: σ² < σ0² or σ² > σ0²
• Reject if χ² < χ²(n-1, 1-α) (for H1: σ² < σ0²) or χ² > χ²(n-1, α) (for H1: σ² > σ0²), accept otherwise, where χ²(n-1, α) is the upper α point
• Two sided test, H0: σ² = σ0² vs. H1: σ² ≠ σ0²
• Reject if χ² < χ²(n-1, 1-α/2) or χ² > χ²(n-1, α/2), accept otherwise

Testing a Population Proportion

• Extension of the approaches for the mean. Test statistic is given by

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0/n}}, \quad q_0 = 1 - p_0$$

1. One sided, H0: P = P0 and H1: P < P0 or P > P0: compare Z with $Z_\alpha$
2. Two sided, H0: P = P0 and H1: P ≠ P0: compare Z with $Z_{\alpha/2}$

Comparison of Means of Two Independent Populations

• Assume now that the samples are independent
• We test H0: µ1 = µ2 vs. H1: µ1 ≠ µ2

1. If the population variances are Unknown and Equal
• Test statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where $s_p$ is called a pooled estimate of the standard deviation and is given as

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Contd.

2. If the population variances are Unknown and Unequal
• Test statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

• Reject if |t| > t(n-1, α/2), accept otherwise

Testing for Equality of Two Variances

• How do we test for equal variances for the t-test?
• We calculate the ratio of the variances

$$F = s_1^2/s_2^2$$

• Test that against the F distribution with n1 - 1 numerator and n2 - 1 denominator degrees of freedom
– Also called the F test, a major part of regression analysis
– Two-sided test, so we reject for small and large values of the F statistic
• Because it is two-sided, it does not matter which variance is in the numerator vs. the denominator

Continued

• TWO-SAMPLE

$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \ne \sigma_2^2$$

$$F_{\alpha/2} < \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} < F_{1-(\alpha/2)}$$

which, after manipulation, gives

$$\frac{s_1^2/s_2^2}{F_{1-(\alpha/2)}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2/s_2^2}{F_{\alpha/2}}$$

and where, conveniently:

$$F_{1-\alpha/2,\,v_1,\,v_2} = \frac{1}{F_{\alpha/2,\,v_2,\,v_1}}$$

Examples: H.T. for a single proportion

1. In a survey of injecting drug users (IDU), 18 out of 423 were HIV positive. Can we claim that fewer than five percent of the IDU population are HIV positive?

The hypothesised proportion of 0.05 gives

$$H_0: p \ge 0.05 \quad \text{while} \quad H_1: p < 0.05$$

$$\hat{p} = 18/423 = 0.0426, \qquad \sigma_{\hat{p}} = \sqrt{\frac{(0.05)(0.95)}{423}}$$

The 1-sided test at α = 0.01 has critical value Z = -2.33, while from the data:

$$Z = \frac{0.0426 - 0.05}{\sqrt{\dfrac{(0.05)(0.95)}{423}}} = -0.70$$

Since -0.70 > -2.33, we accept H0 at α = 0.01. Clearly, the test is inconclusive at α = 0.01.

Examples: H.T. for two proportions

2. Two groups of patients: 55 with hypertension, of whom 24 are on a special diet, and 149 without, of whom 36 are on a special diet. Writing $p_H$ and $p_{\bar{H}}$ for the proportions on the special diet with and without hypertension, can we say

$$H_0: p_H \le p_{\bar{H}}, \quad \text{i.e. } p_H - p_{\bar{H}} \le 0\,?$$

Test at the 5% level of significance.

From the data, we have

$$\hat{p}_H = 0.4364, \qquad \hat{p}_{\bar{H}} = 0.2416, \qquad \bar{p} = (24 + 36)/(55 + 149) = 0.2941$$

and

$$Z = \frac{0.4364 - 0.2416}{\sqrt{\dfrac{(0.2941)(0.7059)}{55} + \dfrac{(0.2941)(0.7059)}{149}}} = 2.71$$

As $Z_{0.05}$ = 1.65, we reject H0 at α = 0.05.

Example: H.T. for a single variance

• Given a simple random sample, size 12, of animals studied to examine release of mediators in response to allergen inhalation. The known S.E. of the sample mean is 0.4 from subject measurement. Can we claim on the basis of the data that the population variance is not 4?

$$H_0: \sigma^2 = 4 \quad \text{vs} \quad H_1: \sigma^2 \ne 4$$

From χ² tables, the critical value $\chi^2_{11,\,0.025}$ is 21.920 (the lower critical value $\chi^2_{11,\,0.975}$ is 3.816), whereas the data give

$$s^2 = 12(0.4)^2 = 1.92 \quad \text{and} \quad \chi^2_c = \frac{(n-1)s^2}{\sigma_0^2} = \frac{(11)(1.92)}{4} = 5.28$$

So we cannot reject H0 at α = 0.05.

Example: H.T. for two variances

• Two different microscopic methods are available. Repeated observations on a standard object give estimates of variance:

A: n1 = 11, $s_1^2$ = 1.232
B: n2 = 20, $s_2^2$ = 0.304

$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \ne \sigma_2^2$$

Test statistic

$$F = \frac{s_1^2}{s_2^2} = \frac{1.232}{0.304} = 4.05$$

where the critical value for dof 10 and 19 is 2.817 for α = 0.025. Reject H0.
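The worked hypothesis-testing examples can be reproduced with a few lines of code. What follows is a sketch under stated assumptions rather than a definitive implementation: it uses only the Python standard library, the function names are hypothetical, normal critical values come from `statistics.NormalDist`, and the chi-square and F critical values (3.816, 21.920, 2.817) are entered as table values, not computed.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)


def t_statistic(xbar, mu0, s, n):
    """Single-mean statistic (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))


def z_one_proportion(successes, n, p0):
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), with q0 = 1 - p0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def z_two_proportions(x1, n1, x2, n2):
    """Two-sample z using the pooled proportion under H0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi2_statistic(s_sq, n, sigma0_sq):
    """Single-variance statistic (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * s_sq / sigma0_sq


# Survival-time example: n = 49, mean 55, s = 2, claimed mean 60
t_surv = t_statistic(55, 60, 2, 49)
print(f"t = {t_surv:.2f}")                            # t = -17.50

# Single proportion: 18 of 423, H1: p < 0.05, one-sided at alpha = 0.01
z_hiv = z_one_proportion(18, 423, 0.05)
reject_hiv = z_hiv < STD_NORMAL.inv_cdf(0.01)         # critical value about -2.33
print(f"z = {z_hiv:.2f}, reject H0: {reject_hiv}")    # z = -0.70, reject H0: False

# Two proportions: 24 of 55 vs. 36 of 149, one-sided at alpha = 0.05
z_diet = z_two_proportions(24, 55, 36, 149)
reject_diet = z_diet > STD_NORMAL.inv_cdf(0.95)       # critical value about 1.65
print(f"z = {z_diet:.2f}, reject H0: {reject_diet}")  # z = 2.71, reject H0: True

# Single variance: n = 12, S.E. of mean 0.4, so s^2 = n * SE^2; H0: sigma^2 = 4
s_sq = 12 * 0.4 ** 2
chi2 = chi2_statistic(s_sq, 12, 4)
reject_var = chi2 < 3.816 or chi2 > 21.920            # chi-square table values, 11 d.f.
print(f"chi2 = {chi2:.2f}, reject H0: {reject_var}")  # chi2 = 5.28, reject H0: False

# Two variances: s1^2 = 1.232 (n1 = 11), s2^2 = 0.304 (n2 = 20)
f_ratio = 1.232 / 0.304
reject_f = f_ratio > 2.817                            # F table value, d.f. 10 and 19
print(f"F = {f_ratio:.2f}, reject H0: {reject_f}")    # F = 4.05, reject H0: True
```

Each decision compares the statistic with the critical value quoted in the corresponding example, so the output mirrors the conclusions drawn there.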