
Inference
Confidence intervals for means
– Margin of error
– Small populations
Confidence intervals for proportions
Sample size
Introduction to hypothesis testing

Overview of Inference
Select Simple Random Sample
Compute Sample Statistics and Verify Assumptions
Construct a Confidence Interval that Includes a Margin of Error
Draw Conclusion about a Population Parameter

Metropolitan buses
A simple random sample of 36 buses shows a sample mean of 225 passengers carried per day per bus. The sample standard deviation is 60 passengers. What’s a 99% confidence interval estimate of the mean number of passengers carried per bus during a 1-day period?

Metropolitan buses
Assumptions: Assumption one is valid because we used simple random sampling to select the sample. Assumption two is valid because the sample size is > 30. So we can construct this confidence interval.

Metropolitan buses: Statpro
Use the Statpro function in Excel. The data file is metrobus.xls from the website.
The population mean number of passengers carried per day is between 198 and 253 at the 99% confidence level.

Metropolitan buses: by hand
Compute standard error: 60 / 6 = 10 (standard deviation divided by the square root of the sample size, 36)
Compute margin of error: 2.58 x 10 = 25.8
The 99% confidence interval is (roughly) between 199.2 and 250.8.
Why is this different from Statpro?
– Rounding of mean and standard deviation
– Rule of thumb z value not exact
Does it matter?
– Not usually

Metropolitan buses: template
This problem is also solvable using the Excel template for Confidence Intervals and Hypothesis tests. Enter data for mean, standard deviation, number in sample and level of confidence desired.

Interpretation of Confidence Interval
We are 99% confident that the interval 225 ± 25.8 contains the unknown population mean number of passengers.
This means: if we selected 100 samples of size n = 36 and constructed 100 confidence intervals, about 99 would contain the unknown population mean and 1 would not.
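The by-hand calculation for the Metropolitan buses example can be checked with a short sketch. It assumes the large-sample z interval used on the by-hand slide and works only from the summary figures (mean 225, standard deviation 60, n = 36), so it reproduces the by-hand result rather than the Statpro output, which is computed from the raw data. The function name `mean_confidence_interval` is hypothetical and chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, stddev, n, confidence):
    """Large-sample z interval for a population mean (illustrative helper)."""
    # Standard error of the sample mean.
    standard_error = stddev / sqrt(n)
    # Two-sided critical value: (1 - confidence) / 2 is left in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z * standard_error
    return mean - margin_of_error, mean + margin_of_error, margin_of_error


low, high, moe = mean_confidence_interval(mean=225, stddev=60, n=36, confidence=0.99)
print(f"margin of error = {moe:.1f}")         # 25.8
print(f"99% CI = ({low:.1f}, {high:.1f})")    # (199.2, 250.8)
```

The exact z value for 99% confidence is slightly below the 2.58 rule of thumb, but the interval agrees with the by-hand answer to one decimal place.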
CI overview
A sample mean is a point estimate of the population mean.
A confidence interval is an interval estimate of the population mean.
A confidence interval gives information, not only about where the population mean is, but also about how accurate the information is.

CI overview - procedure
1) Assumptions?
2) \(\alpha\)/confidence level?
3) Use Statpro on data to get interval, or use Excel template, or
– compute standard error
– use z-value rule of thumb to compute margin of error
– write down confidence interval

Sample Means vs. Proportions
Sample Means
– Are computed from quantitative data.
– Can be all possible values, positive or negative.
– Estimate population means.
Sample Proportions
– Are computed from yes/no data (binomial).
– Are numbers between 0 and 1 (inclusive).
– Estimate population proportions.

Procedure for drawing conclusions from sample proportions
Select Simple Random Sample.
Compute Sample Proportion.
Check for Normality.
Construct the interval:
\[ \hat{p} \pm z \sqrt{\hat{p}(1 - \hat{p}) / n} \]
Draw Conclusion about Population Proportion, p.

Procedure for computing CIs for proportions
Check \( n\hat{p} \ge 5 \) and \( n(1 - \hat{p}) \ge 5 \).
\( \alpha = 1 - \) confidence level.
Confidence interval is:
\[ \hat{p} - z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le p \le \hat{p} + z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

Mortgage Lending
Last year, of a total of 58,000 customers, 2.7% defaulted on their mortgage. If this year is like last year, what are optimistic and pessimistic projections of the percentage who will default?

Sample size vs. study cost and width of confidence interval
Sample Size    Study Cost    Width of CI
Large          High          Small and Meaningful
Small          Low           Wide and Less Meaningful

Sample size calculations
For a mean:
\[ n = \frac{z^2 s^2}{(MOE)^2} \]
For a proportion:
\[ n = \frac{z^2 \, p(1 - p)}{(MOE)^2} \]

Secondhand cars again
Suppose the dealer wants to know how many people to survey to be able to get within 1 year of the average age of the population of secondhand car buyers with 95% confidence.

Product popularity
A manager has done a small initial study that shows that 40% of shoppers will buy a new product.
He wants to be able to estimate the population percentage with a margin of error of no more than 1% with 95% confidence and asks your help to calculate the necessary sample size.

Three Key Ideas
Sample statistics estimate population parameters.
As sample size increases, sample statistics tend to better approximate their population parameters.
Confidence intervals provide a probable range within which the population parameter will fall.

What if...
A bus drivers union claimed that on average more than 280 people per day travel on buses in the Metropolitan Bus example. Would you be inclined to believe them? Why or why not?
So confidence intervals can be used to test out claims made by people about the population parameters. Which leads us to….

Hypothesis Testing
State Hypotheses
Determine Type I, II errors
Set Significance Level
Run study and collect data
Make decision, compute p-value

What is a hypothesis test doing?
Usually there is some default assumption or common wisdom about the true mean or true proportion of a population.
Any sample you take from the population will probably not conform exactly to this default or common wisdom.
What we want to know is: how unusual does your sample data have to be before you are willing to say the common wisdom is wrong?

Advantages of Hypothesis Testing Approach
Hypothesis testing is decision oriented.
– Is a population parameter less than, equal to, or greater than a specific value (has decision-making implications)?
Highlights that two different decision making errors are possible.
– TYPE I OR \(\alpha\) ERROR
– TYPE II OR \(\beta\) ERROR
p-value (prob. value) aids in interpreting results.

Hypothesis Testing Process
Assume the population mean income is $35K (Null Hypothesis).
Take a sample from the population: the sample mean is $65K.
Does \( \bar{X} = 65 \) come from a population with \( \mu = 35 \)?
No, not likely! REJECT the Null Hypothesis.

Example: Customer ages
A random sample of 28 customers were asked their age.
The sample mean and sample standard deviation are, respectively, 51.2 and 22.8 years. Is the company’s claim that the mean age of customers is 40 years credible? What is the p-value? (If we take issue with the company, what is the chance, based on this sample, that their claims are actually correct?)

Hypotheses
The null hypothesis, \(H_0\), is usually the “default” or current situation. Rejecting the null hypothesis will cause us to take a new course of action.
The alternative hypothesis, \(H_1\), is almost always a claim made, or a challenge to the current situation/wisdom.
Together, the null and alternative hypotheses take into account all possible values.

Hypothesis Testing: State Hypotheses

Step 1: State Hypotheses
CORRECT: \( H_0: \mu = 40 \), \( H_1: \mu \neq 40 \)
INCORRECT versions break one of these rules:
– The company’s claim that the average age is 40 is the null hypothesis. (The null is the “default” or current situation.)
– The null and alternative hypotheses must be mutually exclusive and exhaustive.
– Hypotheses are about the unknown population mean \( \mu \), not the known sample mean \( \bar{x} \). For example, \( H_0: \bar{x} = 40 \), \( H_1: \bar{x} \neq 40 \) is incorrect.

One or two-tailed
What if you care about customers being “young” or not: i.e. the null hypothesis is that the average age of customers is 40 or less. The alternative hypothesis is that the average age of customers is more than 40.
This is a one-tailed hypothesis test and the hypotheses would look like:
\( H_0: \mu \le 40 \), \( H_1: \mu > 40 \)

Hypothesis Testing: State Hypotheses; Determine Type I, II errors

Type I and II errors
Type I error:
– reject the null hypothesis when it should have been accepted
Type II error:
– accept the null hypothesis when it should have been rejected
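A Type I error can be made concrete with a small simulation. The sketch below is illustrative only and rests on several assumptions that are not in the slides: customer ages are drawn from a normal population whose mean really is 40 (so the null is true) with standard deviation 22.8, each sample has 28 customers, the test is a two-sided z test, and the cut-off for rejecting is set at 0.05. The function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def rejects_null(sample, null_mean, alpha):
    """Two-sided z test: True if this sample leads us to reject the null."""
    standard_error = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - null_mean) / standard_error
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def type_i_error_rate(trials=2000, n=28, alpha=0.05, seed=1):
    """Share of samples that wrongly reject a null that is actually true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # Assumed population: normal, mean 40 (null is true), stddev 22.8.
        sample = [rng.gauss(40, 22.8) for _ in range(n)]
        if rejects_null(sample, null_mean=40, alpha=alpha):
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"share of true nulls rejected: {rate:.3f}")
```

Every rejection counted here is a Type I error, because the population mean was set to the null value. The share should land near the chosen cut-off, a little above it because a z rule of thumb is applied to a sample of only 28.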
Type I and II Errors
                        H0 is true in population    H0 is false in population
Decision: Accept H0     Correct decision            Type II error
Decision: Reject H0     Type I error                Correct decision

Customer ages: Possible Errors
Type I Error: Reject Null Hypothesis When Null is True (\(\alpha\) Error)
Type II Error: Do Not Reject Null When Null is False (\(\beta\) Error)
Type I Error: Reject Null; Null is True. Customer mean age is 40 years but change marketing strategy.
Type II Error: Don’t Reject Null; Null is False. Customer mean age is not 40 years but don’t change strategy.

Customer age: Possible Costs of a Type II Error
Aiming products at the wrong market
Lost opportunities
Marketing spending that is not effective

Customer age: Possible Costs of a Type I Error
Expensive rethink of marketing strategy, advertising and collateral which is unnecessary.
Loss of profit.
Question: Do we factor the cost of the study into this decision?

Hypothesis Testing: State Hypotheses; Determine Type I, II errors; Set Significance Level

Customer age: Step 3: Set the Significance Level, \(\alpha\)
The Significance Level, \(\alpha\), is the maximum risk of making a type I error that the decision maker can “live with.”
The decision maker sets the significance level prior to data collection.
For a costly type I error, set \(\alpha\) at 0.05 or less.

Guidelines to Selecting a Value for Alpha
Type I Error Cost    Type II Error Cost    Set Significance Level
High                 Low                   .01 or less
Low                  High                  .2 or above
High                 High                  .05 or .01

Hypothesis Testing: State Hypotheses; Determine Type I, II errors; Set Significance Level; Run study and collect data

Customer age: Step 4: Run Study and Collect Data
Put the data for each variable in columns in an Excel spreadsheet with labels at the top of each column.
In this case it is already done for you and the resulting spreadsheet is customerages.xls

Hypothesis Testing: State Hypotheses; Determine Type I, II errors; Set Significance Level; Run study and collect data; Make decision, compute p-value

Step 5: The p-value
We want to know how unusual the sample data is by finding the p-value.
We ask: if the null hypothesis were true, what proportion of samples would be further away from the hypothesised value (more unusual) than this sample we have taken? This is the p-value.

Finding the p-value
Use Statpro:
– Statistical Inference > One sample analysis…
Enter data range.
Choose “Hypothesis test for mean”.
Specify the null value (in this case 40) and the exact form of the alternative hypothesis (in this case “not equal”).
Statpro will give you the p-value for the test.

The p-value for Customer age example
p-value is 0.015; \(\alpha\) level = .01
(Figure: sampling distribution centred on the null value 40, with the sample mean 51.2 marked.)
Found using Statpro in Excel.
Risk of making a type I error is 1.5%, more than the level the company decided it was prepared to accept (\(\alpha\)).

Interpreting the p-value
If you chose to reject the null hypothesis (that mean customer age is 40) then you would have a 1.5% chance of being wrong.
P-value is the actual chance of making a type I error if you reject the null hypothesis.

Comparing the p-value and \(\alpha\)
                  p-value                                          Significance Level \(\alpha\)
Range             0 < p < 1                                        0 < \(\alpha\) < 1
Definition        Actual probability of making \(\alpha\) error    Maximum probability of making \(\alpha\) error
How Determined    From sample data                                 Manager determines
When Known        After study                                      Before study

Decision-making using the p-value
If \( p < \alpha \) then reject the null hypothesis.
If \( p \ge \alpha \) then accept the null hypothesis.
For the customer age example: p > 0.01 and \(\alpha\) = 0.01, so accept the null.

Hypothesis Testing Summary
Set Null and Alternative Hypothesis.
– Alternative is generally the challenge to the default.
Consider Type I and II Errors.
Set \(\alpha\).
Run Study, Collect Data, Make Decision, Compute p-value.

Key Ideas
Develop the alternative hypothesis first. Base it on the claim (challenge to the default situation) that is to be tested.
Develop the null hypothesis.
Together, the null and the alternative must be mutually exclusive and exhaustive.
Determine the correct rejection region.

Confidence Intervals and Hypothesis Tests: what’s the difference?
Confidence intervals are centred around the sample mean.
Hypothesis tests are centred around the null value and use the sample to test whether the null is likely to be true.
Otherwise same concept.
So to do a two-sided hypothesis test “by hand” you can do a confidence interval as usual around the null value and then check to see whether the sample mean falls in the interval you’ve constructed. If yes – accept the null. If no – reject the null.

Hypothesis tests for proportions
Check normality assumptions.
Use z-value rules of thumb.
Remember to use the hypothesised proportion value to calculate the standard error and to compute the interval.
Otherwise same “by hand” procedure as for a hypothesis test for a mean.
There is a separate worksheet in the Excel template for hypothesis tests for proportions.

Comparing two sample means
When might this be important?
What kinds of questions might you ask?
Important distinction:
– Unpaired or independent samples
– Paired data

Confidence intervals for difference between means

Procedure for difference of means for two independent samples
2 Independent Random Samples.
Compute Sample Means and Stddevs.
Construct the interval:
\[ (\bar{x}_1 - \bar{x}_2) \pm z \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
Draw Conclusion about Population Difference.

Car repairs
RACV has “approved repairers” where the repair shop fixes the car and sends the bill directly to the insurance company.
Suppose RACV wanted to check approved repairers were charging fair prices.
Collected data for shop A: 36 cars, average cost = $1840 and stddev = $370
Data for shop B: 40 cars, average cost = $1630 and stddev = $280
What is a 95% confidence interval for the difference in costs between the two repair shops?

But then…
But it would make more sense, in a situation like that, to compare the prices shop A and shop B were quoting for the same cars.
This is called a Paired Sample.

Procedure for difference of means for paired samples
A Paired Random Sample.
Compute Mean Difference, d, and Stddev of Differences, sd.
Construct the interval:
\[ \bar{d} \pm z \frac{s_d}{\sqrt{n}} \]
Draw Conclusion about Population Difference, D.

Car repairs with same cars
Send the same 36 cars to both Shops A and B.
Mean difference in cost (A – B) is $170 with a standard deviation of difference in cost of $39.
What is a 95% confidence interval for the difference in cost?

What did we do?
Constructed confidence intervals for means and proportions
Calculated sample sizes required for particular levels of confidence and margins of error
Looked at the steps required to set up a hypothesis test

Managerial applications
What did you learn today that makes a difference to the way you manage?
What are the three most important things to remember from today’s lecture?

Next class
Read supplementary material on Correlation + Hedging and More Correlation.
Read Colmar Brunton Opinion Poll and “Family car is killing us”.
Prepare Home Education case. Assessed for syndicate groups who chose option 1.
Download data file funding.xls and bring on laptop.
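The two car-repair questions from the comparing-means slides can be worked through with a short sketch. It assumes the large-sample z intervals described in the two procedures (independent samples and paired samples) and uses only the summary figures quoted in the slides. The function names are hypothetical, and the z value is taken from the normal distribution rather than a rule of thumb.

```python
from math import sqrt
from statistics import NormalDist

# Two-sided 95% critical value, about 1.96.
Z_95 = NormalDist().inv_cdf(0.975)


def independent_means_ci(mean1, sd1, n1, mean2, sd2, n2, z=Z_95):
    """z interval for the difference of means of two independent samples."""
    difference = mean1 - mean2
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return difference - z * standard_error, difference + z * standard_error


def paired_means_ci(mean_diff, sd_diff, n, z=Z_95):
    """z interval for the mean difference of a paired sample."""
    standard_error = sd_diff / sqrt(n)
    return mean_diff - z * standard_error, mean_diff + z * standard_error


# Shop A: 36 cars, mean $1840, stddev $370. Shop B: 40 cars, mean $1630, stddev $280.
unpaired = independent_means_ci(1840, 370, 36, 1630, 280, 40)
# Same 36 cars sent to both shops: mean difference $170, stddev of differences $39.
paired = paired_means_ci(170, 39, 36)

print(f"independent samples: ${unpaired[0]:.0f} to ${unpaired[1]:.0f}")
print(f"paired samples:      ${paired[0]:.2f} to ${paired[1]:.2f}")
```

Under these assumptions the paired interval comes out much narrower than the independent-samples one, because comparing quotes for the same cars removes the car-to-car variation in repair cost.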
