POSTED ON: 12/6/2011
Chapter 4

12. According to a Gallup Poll, the extent to which employees are engaged with their workplace varies from country to country. Gallup reports that the percentage of U.S. workers engaged with their workplace is more than twice as high as the percentage of German workers. The study also shows that having more engaged workers leads to increased innovation, productivity, and profitability, as well as reduced employee turnover. The results of the poll are summarized in the following table:

Engagement     United States   Germany   Total
Engaged                  550       246     796
Not Engaged            1,345     1,649   2,994
Total                  1,895     1,895   3,790

If an employee is selected at random, what is the probability that he or she

a. Is engaged with his or her workplace?
Probability = 796/3790 = 0.210026

b. Is a U.S. worker?
Probability = 1895/3790 = 0.5

24. Use the Gallup Poll engagement table given in problem 12 above.

a. Given that a worker is from the United States, what is the probability that the worker is engaged?
Probability = 550/1895 = 0.290237

b. Given that a worker is from the United States, what is the probability that the worker is not engaged?
Probability = 1345/1895 = 0.709763

c. Given that a worker is from Germany, what is the probability that the worker is engaged?
Probability = 246/1895 = 0.129815

d. Given that a worker is from Germany, what is the probability that the worker is not engaged?
Probability = 1649/1895 = 0.870185

51.
Enzyme-linked immunosorbent assay (ELISA) is the most common type of screening test for detecting the HIV virus. A positive result from an ELISA indicates that the HIV virus is present. For most populations, ELISA has a high degree of sensitivity (to detect infection) and specificity (to detect noninfection). Suppose the probability that a person is infected with the HIV virus for a certain population is 0.015. If the HIV virus is actually present, the probability that the ELISA test will give a positive result is 0.995. If the HIV virus is not actually present, the probability of a positive result from ELISA is 0.01. If the ELISA has given a positive result, use Bayes' theorem to find the probability that the HIV virus is actually present.

Let's denote the events as follows:
H: HIV present
nH: HIV not present
P: tested positive
N: tested negative

P(H|P) = P(H)*P(P|H) / [P(H)*P(P|H) + P(nH)*P(P|nH)]
= 0.015*0.995 / (0.015*0.995 + 0.985*0.01)
= 0.602422

Chapter 5

12. You are trying to develop a strategy for investing in two different stocks. The anticipated annual return for a 1,000 dollar investment in each stock under four different economic conditions has the following probability distribution:

                                      Returns
Probability   Economic Condition   Stock X   Stock Y
0.1           Recession               -100        50
0.3           Slow growth                0       150
0.3           Moderate growth           80       -20
0.3           Fast growth              150      -100

A. Compute the expected return for stock X and for stock Y.
E(X) = 0.1*(-100) + 0.3*(0+80+150) = 59
E(Y) = 0.1*50 + 0.3*(150-20-100) = 14

B. Compute the standard deviation for stock X and for stock Y.
σx = sqrt(0.1*(-100-59)^2 + 0.3*((0-59)^2 + (80-59)^2 + (150-59)^2)) = 78.67
σy = sqrt(0.1*(50-14)^2 + 0.3*((150-14)^2 + (-20-14)^2 + (-100-14)^2)) = 99.62

C. Compute the covariance of stock X and stock Y.
σxy = E(XY) - E(X)*E(Y) = 0.1*(-100)*50 + 0.3*(0*150 - 80*20 - 150*100) - 59*14 = -6306

D. Would you invest in stock X or stock Y? Explain.
Stock X, because it has a higher expected return and lower risk (standard deviation).

13.
Suppose that in problem 12 you wanted to create a portfolio that consists of stock X and stock Y. Compute the portfolio expected return and portfolio risk for each of the following percentages invested in stock X. (The covariance from problem 12 is σxy = -6306, so the covariance term 2*w*(1-w)*σxy enters each risk formula with a minus sign.)

a. 30%
Expected return = 0.3*59 + 0.7*14 = 27.5
Risk = sqrt((0.3*78.67)^2 + (0.7*99.62)^2 - 2*0.3*0.7*6306) = 52.64

b. 50%
Expected return = 0.5*59 + 0.5*14 = 36.5
Risk = sqrt((0.5*78.67)^2 + (0.5*99.62)^2 - 2*0.5*0.5*6306) = 29.58

c. 70%
Expected return = 0.7*59 + 0.3*14 = 45.5
Risk = sqrt((0.7*78.67)^2 + (0.3*99.62)^2 - 2*0.7*0.3*6306) = 35.74

d. On the basis of the results of a through c, which portfolio would you recommend and why?
Portfolio c is recommended because it has the largest expected return, and even though its risk is slightly higher than that of portfolio b, the relatively larger expected return makes up for it.

25. When a customer places an order with Rudy's online office supplies, a computerized accounting information system (AIS) automatically checks to see if the customer has exceeded his or her credit limit. Past records indicate that the probability of customers exceeding their credit limit is 0.05. Suppose that, on a given day, 20 customers place orders. Assume that the number of customers that the AIS detects as having exceeded their credit limit is distributed as a binomial random variable.

a. What are the mean and standard deviation of the number of customers exceeding their credit limits?
Mean = 0.05*20 = 1
Standard deviation = sqrt(0.05*0.95*20) = 0.974679

b. What is the probability that zero customers will exceed their limits?
Probability = (1-0.05)^20 = 0.358486

c. What is the probability that one customer will exceed his or her limit?
Probability = 20*0.05*(1-0.05)^19 = 0.377354

d. What is the probability that two or more customers will exceed their limits?
Probability = 1 - (0.358486+0.377354) = 0.26416

37. J.D. Power and Associates calculates and publishes various statistics concerning car quality.
The initial quality score measures the number of problems per new car sold. For 2009 model cars, Ford had 1.02 problems per car and Dodge had 1.34 problems per car. Let the random variable X be equal to the number of problems with a newly purchased 2009 Ford.

a. What assumptions must be made in order for X to be distributed as a Poisson random variable? Are these assumptions reasonable?
The assumptions are that problems occur randomly and independently of each other. These assumptions might not be reasonable, because one problem in a car may lead to another, so independence does not always hold.

Making the assumptions as in (a), if you purchased a 2009 Ford, what is the probability that the new car will have

B. Zero problems?
Probability = (1.02)^0 * e^(-1.02) / 0! = 0.360595

C. Two or fewer problems?
Probability = (1.02)^0*e^(-1.02)/0! + (1.02)^1*e^(-1.02)/1! + (1.02)^2*e^(-1.02)/2! = 0.91598

D. Give an operational definition for problem. Why is the operational definition important in interpreting the initial quality score?
A problem in the car is defined as a fault i) which interferes with the normal working of the car and ii) which has not been caused by a previously existing problem. The operational definition is important because it applies the same concept to all models, so the quality scores become comparable on the same scale.

Chapter 6

11. A statistical analysis of 1,000 long-distance telephone calls made from the headquarters of the Bricks and Clicks computer corporation indicates that the length of these calls is normally distributed, with µ = 240 and σ = 40 seconds.

a. What is the probability that a call lasted less than 180 seconds?
P(x<180) = P(z < (180-240)/40) = P(z < -1.5) = 0.0668

b. What is the probability that a call lasted between 180 and 300 seconds?
P(180<x<300) = P((180-240)/40 < z < (300-240)/40) = P(-1.5 < z < 1.5) = 0.8664

c.
What is the probability that a call lasted between 110 and 180 seconds?
P(110<x<180) = P((110-240)/40 < z < (180-240)/40) = P(-3.25 < z < -1.5) = 0.0662

d. 1% of all calls will last less than how many seconds?
z for a lower-tail area of 0.01 = -2.3263
So, x = 240 - 2.3263*40 = 146.95
So, 1% of calls will last less than 146.95 seconds.

13. Many manufacturing problems involve the matching of machine parts, such as shafts that fit into a valve hole. A particular design requires a shaft with a diameter of 22.000 mm, but shafts with diameters between 21.990 mm and 22.010 mm are acceptable. Suppose that the manufacturing process yields shafts with diameters normally distributed, with a mean of 22.002 mm and a standard deviation of 0.005 mm. For this process, what is

a. The proportion of shafts with a diameter between 21.99 mm and 22.00 mm?
P(21.99<x<22.00) = P((21.99-22.002)/0.005 < z < (22.00-22.002)/0.005) = P(-2.4 < z < -0.4) = 0.3364

B. The probability that a shaft is acceptable?
P(21.990<x<22.010) = P((21.990-22.002)/0.005 < z < (22.010-22.002)/0.005) = P(-2.4 < z < 1.6) = 0.937

c. The diameter that will be exceeded by only 2% of the shafts?
For the top 2%, z = 2.05375
x = 22.002 + 2.05375*0.005 = 22.01227
So, only 2% of the shafts will have a diameter greater than 22.01227 mm.

d. What would be your answers in (a) through (c) if the standard deviation of the shaft diameters were 0.004 mm?
Answer to a) P(21.99<x<22.00) = P((21.99-22.002)/0.004 < z < (22.00-22.002)/0.004) = P(-3 < z < -0.5) = 0.3072
Answer to b) P(21.990<x<22.010) = P((21.990-22.002)/0.004 < z < (22.010-22.002)/0.004) = P(-3 < z < 2) = 0.9759
Answer to c) x = 22.002 + 2.05375*0.004 = 22.01022

Chapter 7

21. Time spent using email per session is normally distributed, with µ = 8 and σ = 2 minutes. If you select a random sample of 25 sessions,

a. What is the probability that the sample mean is between 7.8 and 8.2 minutes?
Standard error = 2/sqrt(25) = 0.4
P(7.8<xbar<8.2) = P((7.8-8)/0.4 < z < (8.2-8)/0.4) = P(-0.5 < z < 0.5) = 0.3829

B. What is the probability that the sample mean is between 7.5 and 8 minutes?
P(7.5<xbar<8) = P((7.5-8)/0.4 < z < (8-8)/0.4) = P(-1.25 < z < 0) = 0.3944

C. If you select a random sample of 100 sessions, what is the probability that the sample mean is between 7.8 and 8.2 minutes?
Standard error = 2/sqrt(100) = 0.2
P(7.8<xbar<8.2) = P((7.8-8)/0.2 < z < (8.2-8)/0.2) = P(-1 < z < 1) = 0.6827

D. Explain the difference in the results of (a) and (c).
In (a) the sample size is only 25, but in (c) the sample size is increased to 100. The standard error therefore decreases from 0.4 to 0.2, so the probability that the sample mean lies within a given distance of the population mean increases.

Chapter 8

9. The manager of a paint supply store wants to estimate the actual amount of paint contained in 1-gallon cans purchased from a nationally known manufacturer. The manufacturer's specifications state that the standard deviation of the amount of paint is equal to 0.02 gallon. A random sample of 50 cans is selected, and the sample mean amount of paint per 1-gallon can is 0.995 gallon.

a. Construct a 99% confidence interval estimate for the population mean amount of paint included in a 1-gallon can.
For 99% confidence, critical z = ±2.5758
Lower limit = 0.995 - 2.5758*0.02/sqrt(50) = 0.987715
Upper limit = 0.995 + 2.5758*0.02/sqrt(50) = 1.002285
So, the confidence interval is (0.987715, 1.002285).

B. On the basis of these results, do you think that the manager has a right to complain to the manufacturer? Why?
No, because the 99% confidence interval includes the specified 1-gallon mark.

C. Must you assume that the population amount of paint per can is normally distributed here? Explain.
No. Because σ is known and the sample size is n = 50, the central limit theorem implies that the sampling distribution of the sample mean is approximately normal even if the population amount of paint per can is not normally distributed.

D. Construct a 95% confidence interval estimate. How does this change your answer to B?
For 95% confidence, z = 1.96
Lower limit = 0.995 - 1.96*0.02/sqrt(50) = 0.98946
Upper limit = 0.995 + 1.96*0.02/sqrt(50) = 1.00054
So, the 95% confidence interval is (0.98946, 1.00054). This interval is narrower than the 99% interval in (a), but it still includes 1 gallon, so the answer to B does not change.

36. If you want to be 99% confident of estimating the population proportion to within a sampling error of ±0.04, what sample size is needed?
E = z*sqrt(p*(1-p)/n)
Or, n = p*(1-p)*(z/E)^2 = 0.5*(1-0.5)*(2.5758/0.04)^2 = 1037 (using p = 0.5, the most conservative value, and rounding up)

Chapter 9

3. If you use a 0.10 level of significance in a (two-tailed) hypothesis test, what is your decision rule for rejecting the null hypothesis that the population mean is 500 if you use the Z test?
Critical values at the 0.10 significance level = ±1.64485
So, reject the null hypothesis if z < -1.64485 or z > 1.64485.

4. If you use a 0.01 level of significance in a (two-tailed) hypothesis test, what is your decision rule for rejecting Ho: µ = 12.5 if you use the Z test?
Critical values at the 0.01 level of significance = ±2.5758
So, reject the null hypothesis if z < -2.5758 or z > 2.5758.

5. What is your decision in problem 4 if Zstat = -2.61?
Reject the null hypothesis, because -2.61 < -2.5758.

6. What is the p-value if, in a two-tailed hypothesis test, Zstat = +2.00?
Two-tailed p-value = 0.0455

13. Do students at your school study more than, less than, or about the same as students at other business schools? Business Week reported that at the top 50 business schools, students studied an average of 14.6 hours per week. (Data extracted from "Cracking the Books," Special Report/Online Extra, www.businessweek.com, March 19, 2007.) Set up a hypothesis test to try to prove that the mean number of hours studied at your school is different from the 14.6 hours per week benchmark reported by Business Week.

a. State the null and alternative hypotheses.
Ho: µ = 14.6
Ha: µ ≠ 14.6

B. What is the Type I error for your test?
The Type I error is to conclude that the mean is different from 14.6 hours when the mean is actually equal to 14.6 hours.

c. What is the type II error for your test?
The Type II error is to fail to conclude that the mean is different from 14.6 hours when it actually is different.

23. A manufacturer of chocolate candies uses machines to package candies as they move along the filling line. Although the packages are labeled as 8 ounces, the company wants the packages to contain a mean of 8.17 ounces so that virtually none of the packages contain less than 8 ounces. A sample of 50 packages is selected periodically, and the packaging process is stopped if there is evidence that the mean amount packaged is different from 8.17 ounces. Suppose that in a particular sample of 50 packages, the mean amount dispensed is 8.159 ounces, with a sample standard deviation of 0.051 ounce.

a. Is there evidence that the population mean amount is different from 8.17 ounces? (Use a 0.05 level of significance.)
Ho: µ = 8.17
Ha: µ ≠ 8.17
t-statistic = (8.159-8.17)/(0.051/sqrt(50)) = -1.52513
Degrees of freedom = 50-1 = 49
Critical values = ±2.0096
Since the test statistic does not lie in the critical region, the null hypothesis cannot be rejected. There is not enough evidence to say that the population mean is different from 8.17 ounces.

B. Determine the p-value and interpret its meaning.
P-value = 0.1337
This means that if the population mean were actually 8.17 ounces, the probability of obtaining a sample mean at least as far from 8.17 as 8.159 would be 0.1337.

Chapter 10

12. A bank with a branch located in a commercial district of a city has developed an improved process for serving customers during the noon-to-1 lunch period. The waiting time (operationally defined as the time elapsed from when a customer enters the line until he or she reaches the teller window) needs to be shortened to increase customer satisfaction. A random sample of 15 customers is selected, and the results (in minutes) are as follows (and stored in bank 1):

4.21  5.55  3.02  5.13  4.77
2.34  3.54  3.20  4.50  6.10
0.38  5.12  6.46  6.19  3.79

Suppose that another branch, located in a residential area, is also concerned with the noon-to-1 lunch period.
A random sample of 15 customers is selected, and the results are as follows (and stored in bank 2):

9.66   5.90  8.02  5.79  8.73
3.82   8.01  8.35  10.49 6.68
5.64   4.08  6.17  9.91  5.47

a. Assuming that the population variances from both banks are equal, is there evidence of a difference in the mean waiting time between the two branches? (Use α = 0.05.)
Ho: µ1 = µ2
Ha: µ1 ≠ µ2
Mean1 = 4.2867, Mean2 = 7.1147, s1 = 1.638, s2 = 2.0822
Because both samples have n = 15, the pooled-variance standard error reduces to sqrt((s1^2+s2^2)/15).
t-statistic = (4.2867-7.1147)/sqrt((1.638^2+2.0822^2)/15) = -4.1343
Degrees of freedom = 15+15-2 = 28
Critical values = ±2.0484
Since the test statistic lies in the critical region, the null hypothesis is rejected. There is evidence of a difference in mean waiting times.

B. Determine the p-value in (a) and interpret its meaning.
P-value = 0.0003
This means that if there were actually no difference in the population mean waiting times, the probability of observing a difference in sample means at least as large as this one would be 0.0003.

c. In addition to equal variances, what other assumption is necessary in (a)?
The two populations are normally distributed, and the samples are independent and random.

d. Construct and interpret a 95% confidence interval estimate of the difference between the population means in the two branches.
For 95% confidence, with 28 degrees of freedom, critical t = 2.0484
Lower limit = (4.2867-7.1147) - 2.0484 * sqrt((1.638^2+2.0822^2)/15) = -4.22918
Upper limit = (4.2867-7.1147) + 2.0484 * sqrt((1.638^2+2.0822^2)/15) = -1.42682
So, the 95% confidence interval for the difference of means is (-4.22918, -1.42682). Because the interval does not include 0, it is consistent with the test in (a): the mean waiting time at the first branch is estimated to be between about 1.43 and 4.23 minutes shorter than at the second branch.

45. A professor in the accounting department of a business school claims that there is more variability in the final exam scores of students taking the introductory accounting course who are not majoring in accounting than for students who are taking the course who are majoring in accounting.
Random samples of 13 non-accounting majors and 10 accounting majors are selected from the professor's class roster in his large lecture, and the following results are computed based on the final exam scores:

Non-accounting: n1 = 13, S1² = 210.2
Accounting: n2 = 10, S2² = 36.5

A. At the 0.05 level of significance, is there evidence to support the professor's claim?
Ho: σ²1 ≤ σ²2
Ha: σ²1 > σ²2
F-statistic = 210.2/36.5 = 5.7589
Numerator d.f. = 12, denominator d.f. = 9
Critical value of F = 3.0729
Since the F-statistic lies in the critical region, the null hypothesis is rejected. The non-accounting variance is greater than that of accounting, and the professor's claim is supported by the evidence.

B. Interpret the p-value.
P-value = 0.0066
This means that if the non-accounting variance were actually not greater than the accounting variance, the probability of obtaining an F statistic of 5.7589 or larger would be 0.0066.

C. What assumption do you need to make in (a) about the two populations in order to justify your use of the F test?
Both populations are normally distributed, and the samples are independent.

Chapter 11

13. A pet food company has a business objective of expanding its product line beyond its current kidney- and shrimp-based cat foods. The company developed two new products, one based on chicken livers and the other based on salmon. The company conducted an experiment to compare the two new products with its two existing ones, as well as with a generic beef-based product sold in a supermarket chain. For the experiment, a sample of 50 cats from the population at a local animal shelter was selected. Ten cats were randomly assigned to each of the five products being tested. Each of the cats was then presented with three ounces of the selected food in a dish at feeding time; the researchers defined the variable to be measured as the number of ounces of food that the cat consumed within a 10-minute interval that began when the filled dish was presented.
The results of this experiment are summarized in the following table and stored in Catfood.

Kidney   Shrimp   Chicken Liver   Salmon   Beef
2.37     2.26     2.29            1.79     2.09
2.62     2.69     2.23            2.33     1.87
2.31     2.25     2.41            1.96     1.67
2.47     2.45     2.68            2.05     1.64
2.59     2.34     2.25            2.26     2.16
2.62     2.37     2.17            2.24     1.75
2.34     2.22     2.37            1.96     1.18
2.47     2.56     2.26            1.58     1.92
2.45     2.36     2.45            2.18     1.32
2.32     2.59     2.57            1.93     1.94

a. At the 0.05 level of significance, is there evidence of a difference in the mean amount of food eaten among the various products?
Ho: All means are equal
Ha: At least one mean is different

ANOVA table
Tests of Between-Subjects Effects
Dependent Variable: Food

Source            Type III Sum of Squares   df   Mean Square   F        Sig.
Type                                3.659    4          .915   20.805   .000
Error                               1.978   45          .044
Total                             248.298   50
Corrected Total                     5.637   49
a. R Squared = .649 (Adjusted R Squared = .618)

Since the p-value is almost 0, the null hypothesis is rejected. There is evidence of a difference in the mean amount of food eaten among the various products.

B. If appropriate, which products appear to differ significantly in the mean amount of food eaten?
Tukey's HSD post hoc test output is

Multiple Comparisons
Food
Tukey HSD
                                Mean Difference                        95% Confidence Interval
(I) Type       (J) Type         (I-J)             Std. Error   Sig.   Lower Bound   Upper Bound
Beef           ChickenLiver     -.6140*           .09377       .000   -.8804        -.3476
               Kidney           -.7020*           .09377       .000   -.9684        -.4356
               Salmon           -.2740*           .09377       .041   -.5404        -.0076
               Shrimp           -.6550*           .09377       .000   -.9214        -.3886
ChickenLiver   Beef              .6140*           .09377       .000    .3476         .8804
               Kidney           -.0880            .09377       .880   -.3544         .1784
               Salmon            .3400*           .09377       .006    .0736         .6064
               Shrimp           -.0410            .09377       .992   -.3074         .2254
Kidney         Beef              .7020*           .09377       .000    .4356         .9684
               ChickenLiver      .0880            .09377       .880   -.1784         .3544
               Salmon            .4280*           .09377       .000    .1616         .6944
               Shrimp            .0470            .09377       .987   -.2194         .3134
Salmon         Beef              .2740*           .09377       .041    .0076         .5404
               ChickenLiver     -.3400*           .09377       .006   -.6064        -.0736
               Kidney           -.4280*           .09377       .000   -.6944        -.1616
               Shrimp           -.3810*           .09377       .002   -.6474        -.1146
Shrimp         Beef              .6550*           .09377       .000    .3886         .9214
               ChickenLiver      .0410            .09377       .992   -.2254         .3074
               Kidney           -.0470            .09377       .987   -.3134         .2194
               Salmon            .3810*           .09377       .002    .1146         .6474
Based on observed means. The error term is Mean Square(Error) = .044.
*. The mean difference is significant at the .05 level.

If the p-value for a pair is less than 0.05, then the two products in that pair differ in mean. Thus, Beef and Salmon each differ from all other products, while Chicken Liver, Shrimp, and Kidney do not differ significantly from one another.

C. At the 0.05 level of significance, is there evidence of a significant difference in the variation in the amount of food eaten among the various products?
Ho: All variances are equal
Ha: At least one variance is different

Levene's Test of Equality of Error Variances (a)
Dependent Variable: Food

F       df1   df2   Sig.
2.339   4     45    .069
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + Type

Since the p-value is .069, which is greater than 0.05, the null hypothesis cannot be rejected. There is no evidence of a significant difference in the variation in the amount of food eaten.

D.
What should the pet food company conclude? Fully describe the pet food company's options with respect to its products.
Chicken liver, though it seems better than Beef and Salmon, is not significantly different from (and may be slightly inferior to) the already existing Kidney and Shrimp products. So, this product may be cancelled. Salmon seems better than only Beef, and looks inferior to all other products. So the Salmon-based product can be cancelled too.

Chapter 13

36. A mail-order catalog business that sells personal computer supplies, software, and hardware maintains a centralized warehouse for the distribution of products ordered. Management is currently examining the process of distribution from the warehouse and is interested in studying the factors that affect warehouse distribution costs. Currently, a small handling fee is added to the order, regardless of the amount of the order. Data that indicate the warehouse distribution costs and the number of orders received have been collected over the past 24 months. These are the results:

Month   Distribution Cost (Thousands of Dollars)   Number of Orders
1       52.95                                      4,015
2       71.66                                      3,806
3       85.58                                      5,309
4       63.69                                      4,262
5       72.81                                      4,296
6       68.44                                      4,097
7       52.46                                      3,213
8       70.77                                      4,809
9       82.03                                      5,237
10      74.39                                      4,732
11      70.84                                      4,413
12      54.08                                      2,921
13      62.98                                      3,977
14      72.30                                      4,428
15      58.99                                      3,964
16      79.38                                      4,582
17      94.44                                      5,582
18      59.74                                      3,450
19      90.50                                      5,079
20      93.24                                      5,735
21      69.33                                      4,269
22      53.71                                      3,708
23      89.18                                      5,387
24      66.80                                      4,161

a. Assuming a linear relationship, use the least squares method to find the regression coefficients b0 and b1.
b0 = 0.4576
b1 = 0.0161

b. Predict the monthly warehouse distribution costs when the number of orders is 4,500.
Distribution cost = 0.4576 + 0.0161*4500 = 72.9076 (thousands of dollars)

c. Plot the residuals versus the time period.

D. Compute the Durbin-Watson statistic. At the 0.05 level of significance, is there evidence of positive autocorrelation among the residuals?
Durbin-Watson statistic = 2.07526
Lower critical value: 1.27276, upper critical value: 1.44575
Since the test statistic is greater than the upper critical value, there is no evidence of positive autocorrelation.

E. Based on the results of c and d, is there reason to question the validity of the model?
No, there is no reason to question the validity of the model: the errors seem randomly distributed and are uncorrelated.
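
The regression coefficients in part a of problem 36 and the Durbin-Watson statistic in part D can be checked directly from the 24 monthly observations. The following is a minimal sketch in plain Python, not the software that was used for the answers above; the function names least_squares and durbin_watson and the variable names are illustrative assumptions. It uses the standard formulas b1 = Sxy/Sxx, b0 = ybar - b1*xbar, and D = sum of squared successive residual differences divided by the sum of squared residuals.

```python
# Data from problem 36: monthly distribution cost (thousands of dollars)
# and number of orders, in time order (months 1 through 24).
cost = [52.95, 71.66, 85.58, 63.69, 72.81, 68.44, 52.46, 70.77,
        82.03, 74.39, 70.84, 54.08, 62.98, 72.30, 58.99, 79.38,
        94.44, 59.74, 90.50, 93.24, 69.33, 53.71, 89.18, 66.80]
orders = [4015, 3806, 5309, 4262, 4296, 4097, 3213, 4809,
          5237, 4732, 4413, 2921, 3977, 4428, 3964, 4582,
          5582, 3450, 5079, 5735, 4269, 3708, 5387, 4161]


def least_squares(x, y):
    """Return (b0, b1) for the simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def durbin_watson(residuals):
    """Return the Durbin-Watson statistic for time-ordered residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


b0, b1 = least_squares(orders, cost)

# Residual = observed cost minus fitted cost, kept in month order.
residuals = [y - (b0 + b1 * x) for x, y in zip(orders, cost)]
dw = durbin_watson(residuals)

# Prediction for 4,500 orders using the unrounded coefficients.
predicted_4500 = b0 + b1 * 4500

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"Predicted cost at 4,500 orders = {predicted_4500:.4f}")
print(f"Durbin-Watson statistic = {dw:.5f}")
```

The printed coefficients and Durbin-Watson statistic can be compared with the values reported in parts a and D. The prediction here uses unrounded coefficients, so it can differ slightly from the 72.9076 obtained in part b with b1 rounded to 0.0161.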