Business Statistics Final Assignment
By Benjamin W. Kratz
Professor Bruce Busbee
BUSN 5760 Applied Statistics
Webster University at Fort Jackson
October 13, 2008

Abstract

When running a business it is imperative to perform four types of statistical analysis: ANOVA testing, linear regression, correlation analysis, and price indexes. By looking at each of these analyses, business owners and employees can determine how prices and wages will change and which variables are the main influences on that change. By determining this, they are able to create models that predict future outcomes at a confidence level of 90-99%.

Introduction:

Businesses use statistical data to answer the "so what?" Their goal is to be able to predict how the economy will change, which variables will have the greatest influence, and to what extent those variables influence the end cost. ANOVA testing, linear regression, correlation analysis, and price indexes are four key areas of statistical analysis they can use to determine how the economy will change.

ANOVA Testing:

Businesses use ANOVA testing to see whether the means of several populations are the same (the null hypothesis) or whether they differ between populations (the research hypothesis) by looking at the variances. An ANOVA will tell you if there is a statistically significant difference between group means (averages) based on group variances and sample sizes. When conducting the ANOVA, they look for the total variation by obtaining the sum of the squared differences between each observation and the overall mean. When calculating the total variation they break the computation down into two separate components. The first component is the treatment variation (TV), computed by taking the sum of the squared differences between each treatment mean and the overall mean, with each difference weighted by the number of observations in that treatment.
The second component is the random variation (RV), computed by taking the sum of the squared differences between each observation and its treatment mean. The RV is also the error component. The ANOVA test procedure produces an F-statistic, which is used to calculate the p-value. To determine F they use the following equation:

$F = \dfrac{TV / df_{\text{treatment}}}{RV / df_{\text{error}}}$

If the null hypothesis is correct, we expect F to be about one, whereas a "large" F indicates a location effect. How big should F be before we reject the null hypothesis? In statistical hypothesis testing, we use a p-value (probability value) to decide whether we have enough evidence to reject the null hypothesis and say our research hypothesis is supported by the data. To find the critical value of F at the 1 percent level of significance, they can use the table in Appendix B.4 of the textbook "Statistical Techniques in Business and Economics" (Lind, Marchal, and Wathen, 2008).

Chapter 12 of the same textbook provides a great example of the ANOVA by surveying passengers from four different airlines. The intent is to find out whether there is a difference in the mean satisfaction level among the four airlines.

The survey included questions on ticketing, boarding, in-flight service, baggage handling, pilot communication, and so forth. Twenty-five questions offered a range of possible answers: excellent, good, fair, or poor. A response of excellent was given a score of 4, good a 3, fair a 2, and poor a 1. These responses were then totaled, so the total score was an indication of the satisfaction with the flight. The greater the score, the higher the level of satisfaction with the service. The highest possible score was 100. (Lind, Marchal, and Wathen, 2008).

Table 1: Results from Surveys: (Lind, Marchal, and Wathen, 2008).
Eastern   TWA   Allegheny   Ozark
94        75    70          68
90        68    73          70
85        77    76          72
80        83    78          65
          88    80          74
                68          65
                65

The null hypothesis and the alternate hypothesis are as follows:

H0: µ1 = µ2 = µ3 = µ4
H1: The mean scores are not all equal.

Acceptance of the null hypothesis means that there is no difference in the mean scores for all four airlines. Rejection of the null hypothesis means that there is a difference in at least one pair of mean scores. However, the initial computation will not denote which data group differs or how many data groupings differ.

The F distribution is used to test the statistical data using a significance level of .01. To formulate the decision rule we look for the critical value (cv) by using Appendix B.4 (Lind, Marchal, and Wathen, 2008). In order to find the cv, the degrees of freedom (df) need to be identified: for the numerator, by taking the number of treatments (k) and subtracting 1, and for the denominator, by taking the total number of observations (n) and subtracting the number of treatments. Therefore, df for the numerator = k - 1 = 4 - 1 = 3 and df for the denominator = n - k = 22 - 4 = 18. Using 3 and 18 degrees of freedom, the cv is 5.09; if the computed value of F exceeds 5.09, they will reject H0. The final step is to select the sample, perform the calculations, and make a decision as shown in Table 2.

Table 2: ANOVA Computation Layout: (Lind, Marchal, and Wathen, 2008).

Source of Variation   Sum of Squares   df      Mean Square         F
Treatments            SST              k - 1   SST/(k - 1) = MST   MST/MSE
Error                 SSE              n - k   SSE/(n - k) = MSE
Total                 SS total         n - 1

Excel commands for the one-way ANOVA were used to run the data and the results are shown in Table 3.
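As a cross-check on the spreadsheet run, the quantities in the Table 2 layout can be computed directly from the Table 1 scores. The following is a minimal Python sketch, not the textbook's or Excel's own procedure. The function name `one_way_anova` and the dictionary layout are assumptions made for illustration, and the 5.09 critical value is copied from the text rather than computed.

```python
# Illustrative sketch: one-way ANOVA for the Table 1 survey scores.
# Function and variable names are assumptions made for this example.

scores = {
    "Eastern":   [94, 90, 85, 80],
    "TWA":       [75, 68, 77, 83, 88],
    "Allegheny": [70, 73, 76, 78, 80, 68, 65],
    "Ozark":     [68, 70, 72, 65, 74, 65],
}


def one_way_anova(groups):
    """Return (SST, SSE, MST, MSE, F) for a dict of group name -> scores."""
    all_values = [x for values in groups.values() for x in values]
    n = len(all_values)              # total number of observations
    k = len(groups)                  # number of treatments
    grand_mean = sum(all_values) / n

    sst = 0.0  # treatment variation (between groups)
    sse = 0.0  # random variation (within groups)
    for values in groups.values():
        group_mean = sum(values) / len(values)
        sst += len(values) * (group_mean - grand_mean) ** 2
        sse += sum((x - group_mean) ** 2 for x in values)

    mst = sst / (k - 1)  # mean square for treatments
    mse = sse / (n - k)  # mean square for error
    return sst, sse, mst, mse, mst / mse


sst, sse, mst, mse, f_stat = one_way_anova(scores)
critical_value = 5.09  # Appendix B.4 value at the .01 level with 3 and 18 df

print(f"SST={sst:.2f}  SSE={sse:.2f}  MST={mst:.2f}  MSE={mse:.2f}  F={f_stat:.2f}")
print("Reject H0" if f_stat > critical_value else "Do not reject H0")
```

The sums of squares and the F value printed by this sketch should agree with the Excel output to rounding.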
Table 3: Excel One-way ANOVA Computation Results:

SUMMARY
Groups      Count   Sum    Average   Variance
Eastern     4       349    87.25     36.92
TWA         5       391    78.20     58.70
Allegheny   7       510    72.86     30.14
Ozark       6       414    69.00     13.60
Total       22      1664   75.64

ANOVA
Source of Variation   SS         df   MS       F      P-value   F crit
Between Groups        890.68     3    296.89   8.99   0.0007    3.16
Within Groups         594.41     18   33.02
Total                 1,485.09   21

The results tell us that the total variation is 1,485.09 and the variation within the groups is 594.41. The information needed to accept or reject the null hypothesis is the variation between groups, which is 890.68. By taking the variation between groups and dividing it by its df (3), the mean square (MST) of 296.89 is obtained. Doing the same with the variation within groups and its df (18), the mean square (MSE) of 33.02 is obtained. The large difference between the two mean squares provides an early indication that the null hypothesis may be rejected, since the between-group mean square is much larger than the within-group mean square. To confirm this notion the formula MST/MSE is used to find the F value: 296.89 / 33.02 = 8.99. The calculated F value is then compared to the critical value obtained from Appendix B.4, pg. 789 (Lind, Marchal, and Wathen, 2008). By taking the df for the denominator (18) and the df for the numerator (3), the critical value of 5.09 is derived. (The F crit of 3.16 that Excel reports in Table 3 corresponds to a .05 significance level rather than the .01 level used here.) The final step is to compare the F value and the F critical value. Since the F value is greater than the F critical value, H0 is rejected, which indicates that the mean satisfaction scores are not all the same: at least one airline differs from the others.

So what? By looking at the p-value of .0007, it is determined that the probability of finding an F value this large or larger when the null hypothesis is true is very small. Since this is the case, the likelihood of committing a Type I error is very small. With this data, a customer wanting to travel would be able to know that not all airlines provide the same level of service with the same satisfaction. By knowing this, the customer would then begin to look more closely at the survey and try to see what services were mentioned.
Were the services that they prefer to use included in the survey? If so, what were the results for those services? By answering these questions, the customer is adding additional weighted value to the variables, which would result in a new analysis of the data.

To narrow down the influencing variable, more data would be needed to perform a correlation analysis to see which service or combination of services has the greatest influence on flight satisfaction. The drawback of performing statistical analysis on such a streamlined data set is that one person's satisfaction is not the same as another person's. This creates a degree of bias in the data set that can potentially distort the statistical findings. To reduce some of the bias in the survey, there should be a control factor provided as a means to reflect the biased data within the statistical analysis.

Linear Regression and Correlation Analysis:

While the ANOVA allows businesses to compare the means of several populations, it still leaves questions to be answered, as noted in the problem with the airlines. So how do businesses answer the question of how the variables relate to each other? To answer this type of question, a correlation analysis needs to be conducted and a model of the data created so that, given a known value of one variable, the resulting value of the other can be estimated. To know the range of certainty the business would also perform a linear regression and compute a confidence interval. Douglas Lind and associates provide a great example of this in Chapter 13 as they talk about Copier Sales of America (2008). The example looks at the number of sales calls and copiers sold for 10 salespeople (as seen in Table 4) to see if there is a direct or inverse relationship between the number of calls made and the number of copiers sold. They look to create a model using correlation analysis to measure the association between two variables.
Table 4: Number of Sales Calls and Copiers Sold for 10 Salespeople:

Sales Representative   Number of Sales Calls   Number of Copiers Sold
Tom Keller             20                      30
Jeff Hall              40                      60
Brian Virost           20                      40
Greg Fish              30                      60
Susan Walch            10                      30
Carlos Ramirez         10                      40
Rich Niles             20                      40
Mike Kiel              20                      50
Mark Reynolds          20                      30
Soni Jones             30                      70
Total                  220                     450

(Table 13-1: Lind, Marchal, and Wathen, 2008)

In correlation analysis it is important to identify the dependent variable and the independent variable. The dependent variable is the variable that is being predicted and the independent variable is the variable that provides the basis for estimation. If they were to construct a scatter diagram, as seen in the chart below, the dependent variable would be on the y-axis and the independent variable on the x-axis. Once plotted, the graph clearly shows that there is some type of correlation between the number of calls made by a salesperson and the number of sales.

Chart: Sales Calls and Copiers Sold (x-axis: Sales Calls; y-axis: Copiers Sold) (Chart 13-1: Lind, Marchal, and Wathen, 2008)

So how does this help the business manager? In graphic form, they can quickly show it to the sales reps as a means to motivate them to increase their calls in an effort to increase sales. For most managers this is not enough to go on when they want to know what the amount of sales will be if calls are increased. To properly answer this they need to gain a better understanding of how the two values relate to each other by computing the coefficient of correlation (r), which requires determining how far the values deviate from their means and the products of those deviations.
Table 5: Deviations from the Mean and Their Products: (Table 13-3: Lind, Marchal, and Wathen, 2008)

The coefficient of correlation (r) is computed from the deviations and the standard deviations of the samples of sales calls and copiers sold using the following formula:

$r = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{(n - 1)\, s_x s_y}$

The resulting value can range from -1.00 to 1.00. The closer the value is to -1.00 or 1.00 the stronger the correlation, and the closer to 0.00 the weaker the correlation. A negative value indicates an inverse relationship and a positive value indicates a direct relationship. Using Excel's descriptive statistics function we can obtain the standard deviation (s) as seen in Table 6.

Table 6: Descriptive Statistics of Sales Calls and Copiers Sold for 10 Salespeople:

Statistic                  Number of Sales Calls   Number of Copiers Sold
Mean                       22.000                  45.000
Standard Error             2.906                   4.534
Median                     20.000                  40.000
Mode                       20.000                  30.000
Standard Deviation         9.189 (sx)              14.337 (sy)
Sample Variance            84.444                  205.556
Kurtosis                   0.396                   -1.001
Skewness                   0.601                   0.566
Range                      30.000                  40.000
Minimum                    10.000                  30.000
Maximum                    40.000                  70.000
Sum                        220.000                 450.000
Count                      10.000                  10.000
Confidence Level (95.0%)   6.574                   10.256

The computation of

$r = \dfrac{900}{(10 - 1)(9.189)(14.337)} = 0.759$

where 900 is the sum of the products of the deviations, indicates a strong positive correlation. This result does not by itself tell the manager that increasing the number of calls will cause the number of sales to increase, only that the two variables have some type of relationship.
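The correlation computation can be reproduced from the Table 4 data. The sketch below is illustrative only: the variable names are assumptions, and it uses `statistics.stdev` from the Python standard library (the sample standard deviation) in place of Excel's descriptive statistics function.

```python
from statistics import mean, stdev

# Data from Table 4: sales calls and copiers sold for 10 salespeople.
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
copiers = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

n = len(calls)
x_bar = mean(calls)
y_bar = mean(copiers)

# Sum of the products of the deviations from the two means.
sum_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(calls, copiers))

s_x = stdev(calls)    # sample standard deviation of sales calls
s_y = stdev(copiers)  # sample standard deviation of copiers sold

r = sum_products / ((n - 1) * s_x * s_y)

print(f"sum of products = {sum_products}")
print(f"s_x = {s_x:.3f}, s_y = {s_y:.3f}, r = {r:.3f}")
```

Working from the raw data rather than the rounded standard deviations avoids small rounding differences, although for this data set the result agrees with the hand computation to three decimal places.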
The data does not yet tell the manager how many additional sales one additional call will create. To determine this, a relationship needs to be established using the linear regression equation:

Ŷ = a + bX

"Ŷ" is the estimated value of the Y variable for a selected X value
"a" is the Y-intercept (value of Y when X = 0)
"b" is the slope of the line (mean change in Ŷ for each change of one unit in the X variable)
"X" is the selected value of the independent variable.

The first step is to find the slope of the regression line by multiplying the correlation coefficient by the ratio of the standard deviations: b = r (sy / sx) = 0.759 (14.337 / 9.189) = 1.1842. The second step is to determine the Y-intercept: a = Ȳ - bX̄ = 45 - 1.1842 (22) = 18.9476. With these two values the manager can now calculate how many sales are expected from 20 calls (X) by calculating Ŷ = 18.9476 + 1.1842 (20) = 42.6316 copiers. Simply put, for every additional call the sales representative can expect an increase of about 1.2 copiers sold. However, the equation is only reliable within the range of the data: the sales calls ranged from 10 to 40, which limits the use of the equation to this range. If you use fewer than 10 or more than 40 calls, the accuracy lessens. The equation is only a prediction statement and is not perfect.

To gauge the accuracy of the prediction equation they calculate the standard error of estimate, which measures the dispersion of the observed values around the line of regression. This is done using the following equation:

$s_{y \cdot x} = \sqrt{\dfrac{\sum (Y - \hat{Y})^2}{n - 2}}$

To ease the process the data has been computed using Excel's Data Analysis program to calculate the regression. The results are shown in Table 7.
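The slope, intercept, prediction, and standard error of estimate described above can also be checked against the raw Table 4 data. This is a sketch under stated assumptions: the names are illustrative, and the slope is computed from the sums of deviations, which is algebraically the same as b = r (sy / sx). Because it does not round intermediate values, the intercept can differ from the hand calculation in the fourth decimal place.

```python
from math import sqrt

# Data from Table 4: sales calls (X) and copiers sold (Y).
calls = [20, 40, 20, 30, 10, 10, 20, 20, 20, 30]
copiers = [30, 60, 40, 60, 30, 40, 40, 50, 30, 70]

n = len(calls)
x_bar = sum(calls) / n
y_bar = sum(copiers) / n

# Sums of deviation products and squared X deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(calls, copiers))
s_xx = sum((x - x_bar) ** 2 for x in calls)

b = s_xy / s_xx        # slope
a = y_bar - b * x_bar  # Y-intercept


def predict(x):
    """Estimated copiers sold (Y-hat) for x sales calls."""
    return a + b * x


# Standard error of estimate: dispersion of observations around the line.
residual_ss = sum((y - predict(x)) ** 2 for x, y in zip(calls, copiers))
std_error = sqrt(residual_ss / (n - 2))

print(f"b = {b:.4f}, a = {a:.4f}")
print(f"predicted copiers for 20 calls = {predict(20):.4f}")
print(f"standard error of estimate = {std_error:.3f}")
```

As the text notes, `predict` is only meaningful for values of x between 10 and 40 calls, the range covered by the sample.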
Table 7: Regression Calculation of Sales Calls and Copiers Sold for 10 Salespeople:

Regression Statistics
Multiple R          0.759
R Square            0.576
Adjusted R Square   0.523
Standard Error      9.901
Observations        10

            Coefficients
Intercept   18.95
Calls       1.18

The standard error computes to 9.901, depicting how far from the regression line the data points deviate. With this knowledge, the manager can place a confidence interval around a prediction; at a 90% confidence level, there is a 10% chance that the actual value falls outside the interval.

Index Numbers:

Drawing correlations between variables is not the only thing that is important to businesses and managers. Profits are important, and to see the true profits they use Consumer Price Index (CPI) numbers. The CPI expresses the relative change in the sample value compared to the established base period. Two basic types of data are needed to construct the CPI: price data and weighting data. The percent change in the CPI is a measure of inflation. The CPI can be used to adjust for the effects of inflation in wages, salaries, pensions, or regulated or contracted prices. One weighted index is the Laspeyres Price Index (LPI), which uses base-period quantities as weights:

$P = \dfrac{\sum p_t q_0}{\sum p_0 q_0} \times 100$

Here p_t is the current price, p_0 is the base-period price, and q_0 is the base-period quantity. Douglas Lind and associates provide a great example of this in Chapter 15, as they talk about the prices for the six food items shown in Table 8 (2008).

Table 8: Price and Quantity of Food Items in 1995 and 2005:

Item           Price-95   Qty-95   Price-95*Qty-95   Price-05   Price-05*Qty-95
Bread          $0.77      50       $38.50            $0.89      $44.50
Eggs           $1.85      26       $48.10            $1.84      $47.84
Milk           $0.88      102      $89.76            $1.01      $103.02
Apples         $1.46      30       $43.80            $1.56      $46.80
Orange Juice   $1.58      40       $63.20            $1.70      $68.00
Coffee         $4.40      12       $52.80            $4.62      $55.44
Total                              $336.16                      $365.60

(Data from Table 15-3: Lind, Marchal, and Wathen, 2008)

To calculate the LPI they determine the total amount spent for the six items in the base period, which equals $336.16.
Then they take the 2005 prices and multiply them by the 1995 quantities to establish a weighted value of $365.60. Now that the two values are calculated, the weighted price index can be computed. The final computed value is 108.8, indicating that there is an 8.8 percent increase in the cost over the ten-year period.

$P = \dfrac{\sum p_t q_0}{\sum p_0 q_0} \times 100 = \dfrac{\$365.60}{\$336.16} \times 100 = 108.8$

The LPI does not reflect changes in any buying patterns that may have occurred over time. To compensate for this they can use the Paasche Price Index, which uses current-year quantities to reflect current buying habits. The problem with using this price index is that it tends to give greater weight to goods whose prices have gone down. Therefore, Fisher's Ideal Index (FII) tries to balance the effects of the two price indexes by taking the geometric mean of the two indexes. However, the FII has the same issue as the Paasche Price Index in that it requires current quantity data for each period being used.

Another use of the CPI is when employees determine the true value of their current income after inflation. They can calculate real income (RI) from money income (MI) by using the equation:

$RI = \dfrac{MI}{CPI} \times 100$

To see the work, they take the annual income of $20,000 from 1982-84 and set it as the base period (CPI equal to 100). Then they take the present-year income of $40,000 and divide it by the CPI for that year, which is 200. Placing these into the equation gives RI = (MI/CPI) × 100 = (40,000 / 200) × 100 = 20,000. The employee will realize that their income has the same purchasing power as it did in 1982-84 and that the employer has properly adjusted their income to reflect the current CPI. Now this is not always the case; for example, if the CPI were 250 then their RI would be $16,000, indicating that inflation has weakened their purchasing power by $4,000.
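Both index calculations above are short enough to sketch in code. The following Python example is a hedged illustration using the Table 8 figures; the function names `laspeyres_index` and `real_income` and the tuple layout are assumptions for this sketch, not part of the textbook.

```python
# Illustrative sketch: Laspeyres Price Index and real (deflated) income.
# Names and data layout are assumptions made for this example.

# (item, 1995 price, 1995 quantity, 2005 price), taken from Table 8.
food_items = [
    ("Bread", 0.77, 50, 0.89),
    ("Eggs", 1.85, 26, 1.84),
    ("Milk", 0.88, 102, 1.01),
    ("Apples", 1.46, 30, 1.56),
    ("Orange Juice", 1.58, 40, 1.70),
    ("Coffee", 4.40, 12, 4.62),
]


def laspeyres_index(items):
    """Weighted price index using base-period quantities as weights."""
    base_cost = sum(p0 * q0 for _, p0, q0, _ in items)     # sum of p0 * q0
    current_cost = sum(pt * q0 for _, _, q0, pt in items)  # sum of pt * q0
    return current_cost / base_cost * 100


def real_income(money_income, cpi):
    """Money income deflated by the CPI (base period = 100)."""
    return money_income / cpi * 100


print(f"Laspeyres index: {laspeyres_index(food_items):.1f}")
print(f"Real income at CPI 200: {real_income(40_000, 200):,.0f}")
print(f"Real income at CPI 250: {real_income(40_000, 250):,.0f}")
```

A Paasche or Fisher index is not sketched because Table 8 does not include the 2005 quantities those indexes require.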
This concept is also called deflated income and is brought to light when labor unions negotiate new contracts for employees.

Table 9: OCOLA for Army Major Living in Kaiserslautern, Germany: (Retrieved from: http://perdiem.hqda.pentagon.mil/cgi-bin/cola-oha/o_cola.pl)

Businesses also use the CPI to determine cost-of-living allowance (COLA) increases within management-union contracts. For instance, the military pays Soldiers overseas a supplement (O-COLA) to offset the cost difference between the local economy and the US economy. Looking at Table 9, you will see that a Major in the Army living off post in Kaiserslautern, Germany with three dependents will receive an additional $39.378 daily to offset the local economy's 0.32 inflationary index, reflecting the difference between the US CPI and the EURO CPI. The OCOLA ensures that service members do not lose purchasing power because they are serving in another country.

The Producer Price Index (PPI), a companion index to the CPI, is a vital tool for business owners when they need to provide daily budget analysis. The PPI reflects the prices charged to the manufacturer for the materials purchased to produce the end product and is used to calculate if and where they need to adjust their budget in the future. If the company were a bakery, they would want to know what the PPI is for crude goods to determine if the current allotted budget will be enough for the next month's production requirement. This is also a good indicator of whether the cost of their product needs to increase. The business owners will also be able to determine the ratio of growth between raw material cost and sales as a means to determine at what point they will need to increase or decrease the product price and by how much.

Conclusion:

Studying statistics is important for any business owner to establish a baseline for becoming successful.
Statistical analysis of their company's costs for goods and services, and of how those costs relate to periodic fluctuations in the economy, helps to establish new goals and benchmarks for the company. Without constant statistical review, an owner may never realize that they need to change the price of their goods or services in order to stay in business or that they are losing employees because their wages do not support their current cost of living.

References:

Lind, D., Marchal, W., & Wathen, S. (2008). Statistical Techniques in Business & Economics (3rd ed.). New Delhi: Tata McGraw-Hill Publishing Company Limited. Pgs. 409-593.

"Overseas Cost of Living". (2008). Department of Defense Per Diem, Travel and Transportation Allowance Committee. Retrieved October 6, 2008, from: http://perdiem.hqda.pentagon.mil/cgi-bin/cola-oha/o_cola.pl
