Posted on: 9/27/2010. Public Domain.
Mean
Definition and Use:
· The average of all data entries.
· Measure of central tendency for normally distributed data.
· DO NOT calculate a mean from values that are already averages.
· DO NOT calculate a mean of ratios or percentages for groups of several different sizes; go back to the raw data and recalculate.
· DO NOT calculate a mean when the measurement scale is not linear (e.g. pH units are not measured on a linear scale).
Method of Calculation:
· The sum of all the results divided by the number of results.

Median
Definition and Use:
· The middle value of a range of results.
· A good measure of central tendency for skewed distributions.
Method of Calculation:
· Arrange the data in increasing rank order.
· Identify the middle value.
· If there is an even number of data points, the median is calculated by adding the middle two values and dividing by two.

Mode
Definition and Use:
· The value that appears the greatest number of times.
· Suitable for bimodal distributions and qualitative data.
Method of Calculation:
· Identify the category with the highest number of data entries using a tally chart or bar graph.

Range
Definition and Use:
· The difference between the smallest and largest data values.
· Provides a crude indication of data spread.
Method of Calculation:
· Identify the smallest and largest values and find the difference between them.

Standard Deviation

Measuring the spread of the data

Averages do not tell us everything about a sample. Samples can be very uniform, with the data all bunched around the mean, or they can be spread out a long way from the mean. The statistic that measures this spread is called the standard deviation. The standard deviation is a measure of the variation of the results, or the degree to which each data point in the set varies (or deviates) from the mean. The wider the spread of scores, the larger the standard deviation. For data that has a normal distribution, about 68% of the data lies within one standard deviation of the mean.
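The statistics defined above, along with the standard deviation just introduced, can be computed with Python's standard statistics module. The following is only a minimal sketch: the data set is hypothetical, the variable names are illustrative, and it assumes the sample form of the standard deviation, which divides by n - 1.

```python
import math
import statistics

# Hypothetical data set, chosen only for illustration.
data = [4, 7, 7, 9, 13]

mean = statistics.mean(data)        # sum of results / number of results
median = statistics.median(data)    # middle value after sorting
mode = statistics.mode(data)        # value that appears most often
data_range = max(data) - min(data)  # largest value minus smallest value

# Sample standard deviation built up by hand with the n - 1 divisor:
# square each deviation from the mean, sum, divide, take the square root.
squared_devs = [(x - mean) ** 2 for x in data]
sample_sd = math.sqrt(sum(squared_devs) / (len(data) - 1))

# The library function should agree with the by-hand result.
assert math.isclose(sample_sd, statistics.stdev(data))

print(mean, median, mode, data_range, round(sample_sd, 2))
```

Building the standard deviation by hand next to statistics.stdev is a deliberate choice here: it makes the n - 1 divisor visible while still checking the result against the library.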
Calculate the standard deviation by subtracting the mean of a distribution from each individual value in the distribution, squaring each resulting difference, summing these squared differences, then dividing this sum by one less than the number of values (n - 1), and finally taking the square root of this quotient.

$$S = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$$

S = standard deviation
Σ = sum of
X = individual score
M = mean of all scores
n = sample size (number of scores)

Example: Given the set of numbers {20, 23, 25, 26}, calculate the mean and the standard deviation.

A. Mean = (20+23+25+26)/4 = 23.5

B. Standard deviation
1. Calculate (X-M)
   a. The mean of these numbers was found to be equal to 23.5.
   b. The deviations from the mean are respectively:
      · 20 - 23.5 = -3.5
      · 23 - 23.5 = -0.5
      · 25 - 23.5 = 1.5
      · 26 - 23.5 = 2.5
2. Square each of these deviations to determine (X-M)²
      · (-3.5)² = 12.25
      · (-0.5)² = 0.25
      · (1.5)² = 2.25
      · (2.5)² = 6.25
3. Add the values from step 2 together to get Σ(X-M)²
      · 12.25 + 0.25 + 2.25 + 6.25 = 21
4. Calculate (n-1) by subtracting 1 from your sample size
      · Since there were 4 original numbers, our n = 4
      · Therefore (n-1) = 3
5. Divide the answer from step 3 by the answer from step 4 to find Σ(X-M)² / (n-1)
      · 21 / 3 = 7
6. Calculate the square root of your answer from step 5 to determine the standard deviation. The square root of 7 is approximately 2.65.
7. Answer: the standard deviation of the set of numbers {20, 23, 25, 26} is 2.65. If the data are normally distributed, this means that about 68% of the values lie within 2.65 of the mean (between 23.5 - 2.65 and 23.5 + 2.65).

Using EXCEL to calculate the mean and the standard deviation

        A
 1   Number of Pennies
 2   134
 3   130
 4   136
 5   132
 6   131
 7   137
 8   131
 9   135
10   130
11   129
12   132.5
13   2.798809

1. Type the values you are trying to find the mean for in a column. You can label the column, but you don't have to.
2. Determine which box you want the mean to appear in. In the example, I want the mean to appear in box A12. In that box, type: =AVERAGE(A2:A11) where A2:A11 are the box labels for the data you want to average. Basically you are telling Excel to average boxes A2 through A11.
3. Determine which box you want the standard deviation to appear in. In the example, I want the standard deviation to appear in box A13. In that box, type: =STDEV(A2:A11) where A2:A11 are the box labels for the data for which you want to find the standard deviation.

Calculating mean and standard deviation on the TI-83:

1. First we have to enter the data. Hit the STAT button and you will see the options EDIT, CALC and TESTS atop the screen. Use the left and right arrows (if necessary) to move the cursor to EDIT, then select 1:Edit...
2. Now you will see a table with the headings L1 and L2. Enter the values under L1 (if you want to clear pre-existing data first, move the cursor to the top of the column, hit CLEAR and then ENTER).
3. Once all the data is entered, go back to the STAT menu, but this time move the cursor to CALC instead of EDIT.
4. Once you're in the CALC menu, select 1-Var Stats, then hit ENTER.
5. The calculator will display the x-mean, the sums Σx and Σx², and then the standard deviation (Sx). Note that Sx is what we called s in class; the calculator refers to it as Sx. This is followed by something called sigma x (which is what you would get as the standard deviation if you had used n instead of n-1), and finally the sample size (n = 4 observations for the set {20, 23, 25, 26}).

T-Test

A t-test is used to determine whether the means of two samples (often an experimental and a control group) are significantly different, or whether the difference between them is plausibly due to random variation not related to the hypothesis being tested. The formula for the t-test is a ratio. The top part of the ratio is the difference between the two means or averages. The bottom part is a measure of the variability of the data.
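Written out as an equation, the ratio might look like the following. This is a sketch that assumes the form of the standard error used in the step-by-step example of this section; the symbols (sample means \bar{x}_1 and \bar{x}_2, sample standard deviations s_1 and s_2, sample sizes n_1 and n_2) are notation introduced here for illustration, not taken from the original handout.

```latex
t = \frac{\left| \bar{x}_1 - \bar{x}_2 \right|}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

The numerator is the absolute difference between the means, and the denominator is the standard error of the difference.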
Let's use an example to help you learn the t-test:

Sample 1    Sample 2
  7.85       12.50
  8.51       12.94
 13.66        6.26
 11.03        6.10
  6.59       13.19
  8.04       10.74
 14.16        6.06
  8.13       12.53
  6.79       15.45
 11.06       15.64
  5.83       15.19
 10.73       14.93
  6.68        7.94
  5.02        8.28
 10.37       12.65

Step 1: Find the means for each sample.
Sample 1 mean = 8.96
Sample 2 mean = 11.36

Step 2: Find the absolute value of the difference between the means. This is the top part of the t-test formula.
Mean 1 - mean 2 = x1 - x2 = 8.96 - 11.36 = -2.40
Absolute value = 2.40

Step 3: The bottom part is called the standard error of the difference. To compute it, first find the standard deviation for each sample.
Sample 1 SD = 2.76
Sample 2 SD = 3.55

Step 4: Square the standard deviation for each group to find the "variance" for each group. (The variances shown are computed from the unrounded standard deviations, so they differ slightly from squaring the rounded values.)
Sample 1 variance = (2.76)² = 7.63
Sample 2 variance = (3.55)² = 12.57

Step 5: Divide each squared standard deviation by the sample size of that group.
Sample 1: 7.63 / 15 = 0.51
Sample 2: 12.57 / 15 = 0.84

Step 6: Add these two values.
0.51 + 0.84 = 1.35

Step 7: Take the square root of the number to find the "standard error of the difference."
√1.35 = 1.16

Step 8: Divide the difference in the means (step 2) by the standard error of the difference (step 7).
t = 2.40 / 1.16 = 2.07

Step 9: You need to determine the degrees of freedom (df) for the test. In the t-test, the degrees of freedom is the sum of the sample sizes of both groups minus 2.
df = (15 + 15) - 2 = 28

Step 10: Once you compute the t-value (answer from step 8) and the degrees of freedom (answer from step 9), you have to look the t-value up in a table of significance to test whether the ratio is large enough to say that the difference between the groups is not likely to have been a chance finding. To test the significance, you need to set a risk level (called the alpha level). In most research, the "rule of thumb" is to set the alpha level at .05.
This means that five times out of a hundred you would find a statistically significant difference between the means even if there was none (i.e., by "chance"). Given the alpha level, the df, and the t-value, you can look the t-value up in a standard table of significance to determine whether the t-value is large enough to be significant.

 df     .10     .05     .025     .01     .005    .0005
  1   3.078   6.314   12.706  31.821  63.657  636.619
  2   1.886   2.920    4.303   6.965   9.925   31.598
  3   1.638   2.353    3.182   4.541   5.841   12.941
  4   1.533   2.132    2.776   3.747   4.604    8.610
  5   1.476   2.015    2.571   3.365   4.032    6.859
  6   1.440   1.943    2.447   3.143   3.707    5.959
  7   1.415   1.895    2.365   2.998   3.499    5.405
  8   1.397   1.860    2.306   2.896   3.355    5.041
  9   1.383   1.833    2.262   2.821   3.250    4.781
 10   1.372   1.812    2.228   2.764   3.169    4.587
 11   1.363   1.796    2.201   2.718   3.106    4.437
 12   1.356   1.782    2.179   2.681   3.055    4.318
 13   1.350   1.771    2.160   2.650   3.012    4.221
 14   1.345   1.761    2.145   2.624   2.977    4.140
 15   1.341   1.753    2.131   2.602   2.947    4.073
 16   1.337   1.746    2.120   2.583   2.921    4.015
 17   1.333   1.740    2.110   2.567   2.898    3.965
 18   1.330   1.734    2.101   2.552   2.878    3.922
 19   1.328   1.729    2.093   2.539   2.861    3.883
 20   1.325   1.725    2.086   2.528   2.845    3.850
 21   1.323   1.721    2.080   2.518   2.831    3.819
 22   1.321   1.717    2.074   2.508   2.819    3.792
 23   1.319   1.714    2.069   2.500   2.807    3.767
 24   1.318   1.711    2.064   2.492   2.797    3.745
 25   1.316   1.708    2.060   2.485   2.787    3.725
 26   1.315   1.706    2.056   2.479   2.779    3.707
 27   1.314   1.703    2.052   2.473   2.771    3.690
 28   1.313   1.701    2.048   2.467   2.763    3.674
 29   1.311   1.699    2.045   2.462   2.756    3.659
 30   1.310   1.697    2.042   2.457   2.750    3.646
 40   1.303   1.684    2.021   2.423   2.704    3.551
 60   1.296   1.671    2.000   2.390   2.660    3.460
120   1.289   1.658    1.980   2.358   2.617    3.373
  ∞   1.282   1.645    1.960   2.326   2.576    3.291

If your calculated t value is greater than the number in the table, you can conclude that the means of the two groups are significantly different. The column headings are one-tailed alpha levels; for a two-tailed test at .05, read the .025 column instead (2.048 for df = 28). In our example, the number in the table for our data (df = 28, .05 column) is 1.701.
So, since our calculated value (2.07) is greater than the number in the table, we conclude that the difference between the two groups IS SIGNIFICANT.

To check your answers

Sometimes it is nice to check your answers to make sure you are doing the calculations right. Use this website to check your results.

Performing a t-test with Excel

Excel calculates a t-test in a slightly different way. Rather than giving you the t value and comparing it to a table, Excel reports the probability of seeing a difference between the means at least this large if chance alone were at work. This is called a "P value." Follow these steps to calculate a P value using a t-test with Excel:

Step 1: Create two columns, side by side, for the data of interest. Each sample's data should be in a separate column, like in the example above.

Step 2: Click on another blank cell where you wish the P value to appear.

Step 3: Then click "fx" on the Excel toolbar and choose "statistical" from the "function" list, then "TTest" from the list.

Step 4: Set the t-test parameters:
· For "Array1" highlight the data from one sample; for "Array2", highlight the data in the second column.
· Enter "2" in the box for "Tails."
· Lastly, you will have to select the "Type" of t-test. For our purposes type "2."
· After answering these questions click "OK" and the P value will appear. The P value will fall between zero and one.

Step 5: What does my P value mean? Using Excel with the same data from the sample given above, Excel gives the number 0.05. This means that, if there were really no difference between the two groups, a difference at least this large would turn up by random chance about 5% of the time. Because such a result would be unusual under chance alone, we treat it as evidence that the variable being investigated had an effect. Normally we say that a P value of .05 or less is significant.
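As a cross-check on the hand calculation, Steps 1 through 9 of the worked t-test example can be sketched in Python using only the standard library. The function name t_statistic and the variable names are illustrative choices, not part of the original handout; the data are the two 15-value samples from the example. No P value is computed here, since that would need a t-distribution, which the standard library does not provide.

```python
import math
import statistics

# The two samples from the worked t-test example.
sample_1 = [7.85, 8.51, 13.66, 11.03, 6.59, 8.04, 14.16, 8.13,
            6.79, 11.06, 5.83, 10.73, 6.68, 5.02, 10.37]
sample_2 = [12.50, 12.94, 6.26, 6.10, 13.19, 10.74, 6.06, 12.53,
            15.45, 15.64, 15.19, 14.93, 7.94, 8.28, 12.65]


def t_statistic(a, b):
    """Return (t, df) following Steps 1-9 of the worked example."""
    # Steps 1-2: absolute difference between the sample means.
    mean_diff = abs(statistics.mean(a) - statistics.mean(b))
    # Steps 3-6: each sample variance (n - 1 divisor) divided by its
    # sample size, then added together.
    se_squared = (statistics.variance(a) / len(a)
                  + statistics.variance(b) / len(b))
    # Step 7: standard error of the difference.
    std_error = math.sqrt(se_squared)
    # Step 8: the t value is the ratio of the two.
    t = mean_diff / std_error
    # Step 9: degrees of freedom.
    df = len(a) + len(b) - 2
    return t, df


t, df = t_statistic(sample_1, sample_2)
critical = 1.701  # table value the handout reads off for df = 28

print(f"t = {t:.2f}, df = {df}, exceeds table value: {t > critical}")
```

Because the code works from unrounded means and variances, its intermediate values can differ in the last digit from the rounded figures shown in the steps, while the final t value still rounds to 2.07.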