
10/29/2009

Variance and Standard Deviation (7.5)

We discussed one statistical quantity, which was the mean. Quantities like this are supposed to extract some understanding of our data. You probably have some intuition for what the mean is. The standard deviation measures how much, on average, our data differs from the mean.

Here is the typical example. Consider the following distributions for students' scores on a quiz. In the first one, everyone received 40%, that is, a score of 4 out of 10.

[Figure: relative frequency (0 to 1.0) against Total Score (out of 10) for the first distribution.]

In this case, the mean is

\[ \mu = 4 \cdot 1.0 = 4. \]

Here is a different distribution, where there are 5 different scores, and 20% of the students have each of the scores 2, 3, 4, 5, and 6.

[Figure: relative frequency (0 to 1.0) against Total Score (out of 10) for the second distribution.]

Here we can also compute the mean score, and it is

\[ \mu = 0.2\,(2 + 3 + 4 + 5 + 6) = 4. \]

So, in both cases, the mean is the same, but the distributions are quite different. There is a quantity that distinguishes between these two, which measures the average "spread" of the data about the mean. It is called the standard deviation.

Continuing to denote the mean by μ, you would probably guess that the best way to define the average difference from the mean is

\[ E(X - \mu) = \sum_i (x_i - \mu)\, p_i. \]

However, notice that in the second example from before, the positive and negative contributions will cancel, and so we will take the square to eliminate this problem.

So we define the variance as

\[ \operatorname{Var}(X) = E\big((X - \mu)^2\big) = \sum_i (x_i - \mu)^2\, p_i. \]

Just note that in the case of having some sample from which we understand something about our larger population, there is a slightly different, better estimator for the sample variance. I don't want to go into this here so as not to confuse you, but I'll say some words about this in class.

So, if you'd like to see some algebra performed on random variables, we have that, for a probability distribution,

\[ \operatorname{Var}(X) = E\big((X - \mu)^2\big) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2 = E(X^2) - E(X)^2. \]

Here we use linearity of expected value and the fact that E(X) = μ is a constant.
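To see the two formulas for the variance agree on the quiz distributions, here is a minimal Python sketch. The function names and the dictionary layout (score mapped to relative frequency) are choices made for this illustration, not something from the lecture, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def mean(dist):
    """Mean of a finite distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance_definition(dist):
    """Variance as the expected squared difference from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def variance_shortcut(dist):
    """Variance as E(X^2) - E(X)^2."""
    mu = mean(dist)
    return sum(x * x * p for x, p in dist.items()) - mu ** 2

# First quiz: everyone scored 4 out of 10.
quiz_one = {4: Fraction(1)}
# Second quiz: 20% of students at each of the scores 2 through 6.
quiz_two = {score: Fraction(1, 5) for score in (2, 3, 4, 5, 6)}

for name, dist in (("first", quiz_one), ("second", quiz_two)):
    print(name, mean(dist), variance_definition(dist), variance_shortcut(dist))
```

Both distributions have mean 4, but the first has variance 0 and the second has variance 2, which is exactly the difference in spread that the mean alone cannot see.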
Example (7.5.7): This table has the relative frequency distribution for the weekly sales of two businesses, A and B.
a.) Compute the population mean and the variance for each business.
b.) Which business has the better sales record?
c.) Which business has the more consistent sales record?

Sales    Relative freq. A    Relative freq. B
100      0.1                 0.0
101      0.2                 0.2
102      0.3                 0.0
103      0.0                 0.2
104      0.0                 0.1
105      0.2                 0.2
106      0.2                 0.3

Part (a) is the only one with actual work. Let's first find the mean in each case. Putting these numbers into a calculator, we have that

\[ \mu_A = 100(0.1) + 101(0.2) + 102(0.3) + 105(0.2) + 106(0.2) = 103, \]
\[ \mu_B = 101(0.2) + 103(0.2) + 104(0.1) + 105(0.2) + 106(0.3) = 104. \]

One way to find the variance now is to compute E(X^2) in each case. Let's recall the computation we just did for E(X) first: we multiplied each sales value by its relative frequency and added. For E(X^2) we do the same thing with the squares of the sales values:

\[ E(X_A^2) = 100^2(0.1) + 101^2(0.2) + 102^2(0.3) + 105^2(0.2) + 106^2(0.2) = 10613.6, \]
\[ E(X_B^2) = 101^2(0.2) + 103^2(0.2) + 104^2(0.1) + 105^2(0.2) + 106^2(0.3) = 10819.4. \]

Now we just apply our formula for variance, Var(X) = E(X^2) - E(X)^2:

\[ \operatorname{Var}(X_A) = 10613.6 - 103^2 = 4.6, \qquad \operatorname{Var}(X_B) = 10819.4 - 104^2 = 3.4. \]

For part (b), we see that Business B has the higher mean number of sales, so they have the better sales record, and for part (c), Business B also has a smaller variance, so they are more consistent as well.

Definition: The standard deviation is the square root of the variance. It is usually denoted by the letter σ.

You are probably more used to hearing about the standard deviation than the variance.

Theorem (Chebychev's Inequality): Let μ be the mean and let σ be the standard deviation of a probability distribution. Then, for any c > 0,

\[ P(\mu - c \le X \le \mu + c) \ge 1 - \frac{\sigma^2}{c^2}. \]

The bounds that the theorem yields are in practice not so great…

Example (7.5.12): An electronics firm determines that the number of defective transistors in each batch averages 15 with standard deviation 10. Suppose that 100 batches are produced. Estimate the number of batches having between 0 and 30 defective transistors.
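We'll use Chebychev's Inequality. Here we're trying to find the probability of being within 15 of the mean. That is, we're trying to find

\[ P(0 \le X \le 30) = P(15 - 15 \le X \le 15 + 15). \]

Our theorem, with μ = 15, σ = 10, and c = 15, says that this will be

\[ P(15 - 15 \le X \le 15 + 15) \ge 1 - \frac{\sigma^2}{c^2} = 1 - \frac{100}{225} = \frac{5}{9} \approx 0.56. \]

So at least 5/9 of the batches will have between 0 and 30 defective transistors, which is approximately 56 of the 100 batches.

Recall that last time we had a simple formula for the expected value of a binomial random variable with n trials and probability of success p, which was

\[ E(X) = np. \]

There is also a simple formula for its variance, which is

\[ \operatorname{Var}(X) = npq, \qquad q = 1 - p. \]

Example: What is the probability of success for a binomial random variable with 20 trials whose variance is 5?

We just need to plug the numbers n = 20 and Var(X) = 5 into the formula Var(X) = npq, remembering that q = 1 - p. Then we get the equation

\[ 5 = 20\,p\,(1 - p). \]

This has solution p = ½, but we can solve it explicitly…

\[ 20p^2 - 20p + 5 = 0 \;\Longrightarrow\; 4p^2 - 4p + 1 = 0 \;\Longrightarrow\; (2p - 1)^2 = 0 \;\Longrightarrow\; p = \tfrac{1}{2}. \]

Example (7.5.15): The probability distribution for the sum of numbers obtained from tossing a pair of dice is given in the table.
a.) Compute the mean and the variance of this probability distribution.
b.) Using the table, calculate the probability that the number is between 4 and 10, inclusive.
c.) Use the Chebychev inequality to estimate the probability that the number is between 4 and 10, inclusive.

Number        2     3     4     5     6     7     8     9     10    11    12
Probability   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Hopefully you can compute the mean easily now. Just to make things clear, I'll compute the variance in two different ways.

(a). The mean is given by

\[ \mu = \frac{2(1) + 3(2) + 4(3) + 5(4) + 6(5) + 7(6) + 8(5) + 9(4) + 10(3) + 11(2) + 12(1)}{36} = \frac{252}{36} = 7. \]

This is what you should have expected. Now let's start computing the variance. First,

\[ E(X^2) = \frac{4(1) + 9(2) + 16(3) + 25(4) + 36(5) + 49(6) + 64(5) + 81(4) + 100(3) + 121(2) + 144(1)}{36} = \frac{1974}{36} = \frac{329}{6}. \]

The variance is then

\[ \operatorname{Var}(X) = E(X^2) - \mu^2 = \frac{329}{6} - 49 = \frac{35}{6} \approx 5.83. \]

If we wanted to, we could have computed the variance using our original formula

\[ \operatorname{Var}(X) = \sum_i (x_i - \mu)^2\, p_i. \]

So let's do that now, in case you like that way better.

\[ \operatorname{Var}(X) = \frac{25(1) + 16(2) + 9(3) + 4(4) + 1(5) + 0(6) + 1(5) + 4(4) + 9(3) + 16(2) + 25(1)}{36} = \frac{210}{36} = \frac{35}{6}. \]

As expected, we get the same answer. Notice that the standard deviation is

\[ \sigma = \sqrt{35/6} \approx 2.42. \]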
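The two variance computations for the pair of dice can be checked mechanically. The following is a small Python sketch; the variable names are made up for this illustration, and the distribution is built by enumerating the 36 equally likely outcomes, which is an assumption about fair dice rather than something read off the table.

```python
from fractions import Fraction
from math import sqrt

# Build the distribution of the sum by listing all 36 equally likely rolls.
dice_sum = {}
for first in range(1, 7):
    for second in range(1, 7):
        total = first + second
        dice_sum[total] = dice_sum.get(total, 0) + Fraction(1, 36)

# Mean: sum of value times probability.
mu = sum(x * p for x, p in dice_sum.items())

# Way 1: Var(X) = E(X^2) - mu^2.
second_moment = sum(x * x * p for x, p in dice_sum.items())
var_shortcut = second_moment - mu ** 2

# Way 2: Var(X) = sum of (x - mu)^2 times probability.
var_definition = sum((x - mu) ** 2 * p for x, p in dice_sum.items())

sigma = sqrt(var_definition)  # standard deviation as a float
print(mu, second_moment, var_shortcut, var_definition, round(sigma, 2))
```

Working in exact fractions makes it easy to confirm that the two ways give the identical value 35/6 rather than two decimals that merely look close.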
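For part (b), we have to calculate the probability that the sum rolled is between 4 and 10 inclusive, using the table. This is just adding a few numbers together:

\[ P(4 \le X \le 10) = \frac{3 + 4 + 5 + 6 + 5 + 4 + 3}{36} = \frac{30}{36} = \frac{5}{6} \approx 0.83. \]

For part (c), we will use Chebychev's inequality to estimate what this should be. Remember that the inequality states

\[ P(\mu - c \le X \le \mu + c) \ge 1 - \frac{\sigma^2}{c^2}. \]

In our case, μ = 7 and σ = 2.42 (so σ² = 35/6 ≈ 5.83). We want to estimate the probability that the number is between 4 and 10 inclusive, and so we should take c to be 3. This is because we want to find

\[ P(4 \le X \le 10) = P(7 - 3 \le X \le 7 + 3). \]

Following the theorem with c = 3,

\[ P(4 \le X \le 10) \ge 1 - \frac{35/6}{3^2} = 1 - \frac{35}{54} = \frac{19}{54} \approx 0.35. \]

So the bounds that we get are not that great: Chebychev guarantees only about 0.35, while the true probability from part (b) is about 0.83.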
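To make the gap between the Chebychev bound and the true probability concrete, here is a short Python sketch. The helper name chebychev_lower_bound is hypothetical, and it assumes the form of the inequality used in these notes, P(μ - c ≤ X ≤ μ + c) ≥ 1 - σ²/c².

```python
from fractions import Fraction

def chebychev_lower_bound(variance, c):
    """Lower bound 1 - variance / c^2 on P(mu - c <= X <= mu + c)."""
    return 1 - Fraction(variance) / Fraction(c) ** 2

# Distribution of the sum of two fair dice, as {sum: probability}.
dice_sum = {}
for first in range(1, 7):
    for second in range(1, 7):
        total = first + second
        dice_sum[total] = dice_sum.get(total, 0) + Fraction(1, 36)

mu = sum(x * p for x, p in dice_sum.items())
variance = sum((x - mu) ** 2 * p for x, p in dice_sum.items())

# Exact probability from the table versus the Chebychev guarantee with c = 3.
exact = sum(p for x, p in dice_sum.items() if 4 <= x <= 10)
dice_bound = chebychev_lower_bound(variance, 3)

# Transistor example: standard deviation 10 (variance 100), c = 15.
transistor_bound = chebychev_lower_bound(10 ** 2, 15)

print(exact, dice_bound, transistor_bound)
```

The bound is only a guarantee that holds for every distribution with the given mean and variance, which is why it can sit so far below the exact value for a specific distribution like the dice sum.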
