Statistic by hcj


  Statistic          Definition and Use                                 Method of Calculation

  Mean
                     ·    The average of all data entries.              ·    The sum of all the results
                     ·    Measure of central tendency for normally          divided by the number of
                         distributed data.                                  results.
                     ·    DO NOT calculate a mean from values
                         that are already averages.
                     ·    DO NOT calculate a mean of ratios or
                         percentages for groups of several
                         different sizes; go back to the raw data
                         and recalculate.
                     ·    DO NOT calculate a mean when the
                         measurement scale is not linear (e.g. pH
                         units are not measured on a linear scale).

  Median
                     ·    The middle value of a range of results.       ·    Arrange the data in
                     ·    A good measure of central tendency for            increasing rank order.
                         skewed distributions.                          ·    Identify the middle value.
                                                                        ·    If there is an even number
                                                                            of data points, the median
                                                                            is calculated by adding the
                                                                            middle two values and
                                                                            dividing by two.

  Mode
                     ·    The value that appears the greatest           ·    Identify the category with
                         number of times.                                   the highest number of data
                     ·    Suitable for bimodal distributions and            entries using a tally chart
                         qualitative data.                                  or bar graph.

  Range
                     ·    The difference between the smallest and       ·    Identify the smallest and
                         largest data values.                               largest values and find the
                     ·    Provides a crude indication of data spread.       difference between them.
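
The four statistics in the table can be sketched in Python using only the standard library. This is a minimal illustration, not part of the original handout: the penny counts are the ones used in the Excel example later in this handout, the function names describe and pooled_percentage are made up for this sketch, and the two groups in the percentage example are hypothetical.

```python
from statistics import mean, median, multimode

# Penny counts taken from the Excel example later in this handout.
pennies = [134, 130, 136, 132, 131, 137, 131, 135, 130, 129]

def describe(values):
    """Return the four summary statistics from the table (illustrative helper)."""
    return {
        "mean": mean(values),                    # sum of results / number of results
        "median": median(values),                # middle value, or mean of the middle two
        "modes": multimode(values),              # every value tied for most frequent
        "range": max(values) - min(values),      # largest minus smallest
    }

def pooled_percentage(groups):
    """Recompute a percentage from raw counts; groups holds (successes, total) pairs."""
    successes = sum(s for s, _ in groups)
    total = sum(t for _, t in groups)
    return 100 * successes / total

summary = describe(pennies)
print(summary)

# Hypothetical groups of different sizes: 9 out of 10 and 10 out of 100.
groups = [(9, 10), (10, 100)]
naive = mean(100 * s / t for s, t in groups)     # averages the two percentages directly
print(naive, pooled_percentage(groups))
```

The last line shows why the table warns against averaging percentages from groups of different sizes: the naive average treats the group of 10 and the group of 100 as equally important, while the pooled figure goes back to the raw counts.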


Standard Deviation
Measuring the spread of the data
Averages do not tell us everything about a sample. Samples can be very uniform with the data all
bunched around the mean or they can be spread out a long way from the mean. The statistic that
measures this spread is called the standard deviation.

      ·    The standard deviation is a measure of the variation of the results, or the degree to which each
           data point in the set of data points varies (or deviates) from the mean.

      ·    The wider the spread of scores, the larger the standard deviation.

      ·    For data that has a normal distribution, about 68% of the data lies within one standard
           deviation of the mean.
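
The 68% figure can be checked with a short Python sketch. It assumes Python 3.8 or later (for statistics.NormalDist); the function name fraction_within is made up for this illustration.

```python
from statistics import NormalDist

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of its mean."""
    z = NormalDist()                 # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)      # area between -k and +k

print(round(fraction_within(1), 4))  # one standard deviation either side of the mean
print(round(fraction_within(2), 4))  # two standard deviations either side of the mean
```

The result applies only to normally distributed data; a small or skewed sample will not necessarily follow it.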

Calculate the standard deviation by subtracting the mean of the data from each individual value,
squaring each resulting difference, summing these squared differences, then dividing this sum by one
less than the number of values (n - 1), and finally taking the square root of this quotient.




      S = \sqrt{ \frac{\sum (X - M)^2}{n - 1} }

      S = standard deviation
      Σ = sum of
      X = individual score
      M = mean of all scores
      n = sample size (number of scores)



Example: Given the set of numbers {20, 23, 25, 26}, calculate the mean and the standard deviation.
   A. Mean = (20+23+25+26)/4 = 23.5

   B.    Standard deviation
        1. Calculate (X-M)
            a. The mean of these numbers was found to be equal to 23.5.
            b. The deviations from the mean are respectively:
               ·   20 - 23.5 = -3.5
               ·   23 - 23.5 = -0.5
               ·   25 - 23.5 = 1.5
               ·   26 - 23.5 = 2.5

        2.   Square each of these deviations to determine (X-M)^2
               ·   (-3.5)^2 = 12.25
               ·   (-0.5)^2 = 0.25
               ·   (1.5)^2 = 2.25
               ·   (2.5)^2 = 6.25

        3.   Add the values from step 2 together to get ∑(X-M)^2
               ·   12.25 + 0.25 + 2.25 + 6.25 = 21

        4.   Calculate (n-1) by subtracting 1 from your sample size
                ·   Since there were 4 original numbers, our n=4
                ·   Therefore (n-1) = 3

        5.   Divide the answer from step 3 by the answer from step 4 to find ∑(X-M)^2 / (n-1)
                ·   21 / 3 = 7
        6.   Calculate the square root of your answer from step 5 to determine the standard deviation
                ·   The square root of 7 is approximately 2.65

        7.   Answer: the standard deviation of the set of numbers {20, 23, 25, 26} is approximately 2.65.
             If the data were normally distributed, about 68% of the values would be expected to lie within
             2.65 of the mean (23.5 +/- 2.65, that is, between 20.85 and 26.15).
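
The seven steps above can be sketched in Python as a check on the hand calculation. This is only an illustration; the variable names are chosen here, and the final comparison uses the standard library's statistics.stdev, which also divides by n - 1.

```python
import math
from statistics import stdev

scores = [20, 23, 25, 26]

n = len(scores)
m = sum(scores) / n                       # mean M
deviations = [x - m for x in scores]      # step 1: X - M for each score
squared = [d ** 2 for d in deviations]    # step 2: (X - M)^2
sum_sq = sum(squared)                     # step 3: sum of the squared deviations
variance = sum_sq / (n - 1)               # steps 4-5: divide by n - 1
s = math.sqrt(variance)                   # step 6: square root gives S

print(m, deviations, sum_sq, variance, round(s, 2))
print(math.isclose(s, stdev(scores)))     # the library function agrees with the hand method
```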
Using EXCEL to calculate the mean and the standard deviation
    1. Type the values you are trying to find the mean for in a column. You can label the column, but
       you don’t have to.
    2. Determine which box you want the mean to appear in. In the example, I want the mean to appear
       in box A12. In that box, type: =AVERAGE(A2:A11) where A2:A11 are the box labels for the data
       you want to average. Basically you are telling Excel to average boxes A2 through A11.
    3. Determine which box you want the standard deviation to appear in. In the example, I want the
       standard deviation to appear in box A13. In that box, type: =STDEV(A2:A11) where A2:A11 are
       the box labels for the data for which you want to find the standard deviation.

    Example spreadsheet (column A):

             A
       1     Number of Pennies
       2     134
       3     130
       4     136
       5     132
       6     131
       7     137
       8     131
       9     135
       10    130
       11    129
       12    132.5        (mean)
       13    2.798809     (standard deviation)

Calculating mean and standard deviation on the TI-83:
    1.    First we have to enter the data. Hit the STAT button and you will see
         the options EDIT, CALC and TESTS atop the screen. Use the left and right arrows (if
         necessary) to move the cursor to EDIT, then select 1:Edit...
    2.    Now you will see a table with the headings L1 and L2. Enter the values under L1 (if you want to
         clear pre-existing data first, move the cursor to the top of the column, hit CLEAR and then
         ENTER.)
    3.    Once all the data is entered, go back to the STAT menu, but this time move the cursor to
         CALC instead of EDIT.
    4.    Once you're in the CALC menu, select 1-Var Stats, then hit ENTER.
    5.    The calculator will display the mean (x̄), the sums Σx and Σx², and then the standard deviation
         (Sx). Note that Sx is what we called S; the calculator refers to it as Sx. This is followed by
         σx (sigma x, which is what you would get as the standard deviation if you had divided by n
         instead of n-1), and finally the sample size n (n = 4 for the example set {20, 23, 25, 26}).
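
The difference between Sx and σx can be seen in a short Python sketch. It is an illustration only: the function name one_var_stats is made up here to mirror the calculator's output, and the two data sets are the example set {20, 23, 25, 26} and the penny counts from the Excel example.

```python
from statistics import mean, stdev, pstdev

def one_var_stats(values):
    """Return (mean, Sx, sigma_x, n): stdev divides by n - 1, pstdev divides by n."""
    return mean(values), stdev(values), pstdev(values), len(values)

scores = [20, 23, 25, 26]
pennies = [134, 130, 136, 132, 131, 137, 131, 135, 130, 129]

for name, data in (("scores", scores), ("pennies", pennies)):
    x_bar, sx, sigma_x, n = one_var_stats(data)
    print(f"{name}: mean={x_bar} Sx={sx:.6f} sigma_x={sigma_x:.6f} n={n}")
```

Sx is always a little larger than σx, and the gap shrinks as the sample size grows. Excel's STDEV corresponds to Sx.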
T-Test
A t-test is used to determine if the means of two samples (often an experimental and a control group)
are truly, or at least significantly, different or if the difference between them is plausibly due to
random variation not related to the hypothesis being tested.




The formula for the t-test is a ratio. The top part of the ratio is the difference between the two
means or averages. The bottom part is a measure of the variability of the data.
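
Written out, the ratio described above can be sketched as follows. This assumes two independent samples and the standard error used in the worked example below; the symbols (x̄ for a sample mean, s for a sample standard deviation, n for a sample size, with subscripts 1 and 2 for the two samples) are notation chosen here rather than taken from the original handout.

```latex
t = \frac{\left| \bar{x}_1 - \bar{x}_2 \right|}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

The numerator is the difference between the means, and the denominator is the standard error of the difference.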

Let’s use an example to help you learn the t-test:

Sample 1    Sample 2
7.85        12.50
8.51        12.94
13.66       6.26
11.03       6.10
6.59        13.19
8.04        10.74
14.16       6.06
8.13        12.53
6.79        15.45
11.06       15.64
5.83        15.19
10.73       14.93
6.68        7.94
5.02        8.28
10.37       12.65

Step 1: Find the means for each sample

        Sample 1 mean = 8.96
        Sample 2 mean = 11.36

Step 2: Find the absolute value of the difference between the means. This is the top part of the
t-test formula.

        Mean 1 - mean 2 = x1 - x2 = 8.96 - 11.36 = -2.40
        Absolute value = 2.40

Step 3: The bottom part is called the standard error of the difference. To compute it, first find
the standard deviation for each sample.

       Sample 1 SD = 2.76
       Sample 2 SD = 3.55

Step 4: Square the standard deviation for each group to find the “variance” for each group.

       Sample 1 variance = (2.76)^2 = 7.63
       Sample 2 variance = (3.55)^2 = 12.57
       (both variances are computed from the unrounded standard deviations)
Step 5: Divide each squared standard deviation by the sample size of that group.

        Sample 1: 7.63 / 15 = 0.51
        Sample 2: 12.57 / 15 = 0.84

Step 6: Add these two values

        0.51 + 0.84 = 1.35

Step 7: Take the square root of the number to find the “standard error of the difference”

        √1.35 = 1.16

Step 8: Divide the difference in the means (step 2) by the standard error of the difference (step 7)

        t = 2.40 / 1.16 = 2.07



Step 9: You need to determine the degrees of freedom (df) for the test. In the t-test, the degrees
of freedom is the sum of the sample sizes of both groups minus 2.


        DF = (15 +15) – 2 = 28
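
Steps 1 through 9 can be sketched in Python as a check on the hand calculation. This is a minimal illustration using only the standard library; the function name t_statistic is made up here, and the data are the two samples listed in the example.

```python
import math
from statistics import mean, variance

sample1 = [7.85, 8.51, 13.66, 11.03, 6.59, 8.04, 14.16, 8.13,
           6.79, 11.06, 5.83, 10.73, 6.68, 5.02, 10.37]
sample2 = [12.50, 12.94, 6.26, 6.10, 13.19, 10.74, 6.06, 12.53,
           15.45, 15.64, 15.19, 14.93, 7.94, 8.28, 12.65]

def t_statistic(a, b):
    """Return (t, df) following steps 1-9 of the worked example."""
    diff = abs(mean(a) - mean(b))                     # steps 1-2: difference of the means
    var_a, var_b = variance(a), variance(b)           # steps 3-4: variances (n - 1 divisor)
    se = math.sqrt(var_a / len(a) + var_b / len(b))   # steps 5-7: standard error of the difference
    return diff / se, len(a) + len(b) - 2             # steps 8-9: t value and degrees of freedom

t, df = t_statistic(sample1, sample2)
print(round(mean(sample1), 2), round(mean(sample2), 2))
print(round(t, 2), df)
```

Because the code keeps full precision at every step, its intermediate values can differ slightly in the last decimal place from the rounded figures shown in the steps.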



Step 10: Once you compute the t-value (answer from step 8) and the degrees of freedom (answer
from step 9) you have to look it up in a table of significance to test whether the ratio is large enough to
say that the difference between the groups is not likely to have been a chance finding. To test the
significance, you need to set a risk level (called the alpha level). In most research, the "rule of thumb"
is to set the alpha level at .05. This means that five times out of a hundred you would find a
statistically significant difference between the means even if there was none (i.e., by "chance").



Given the alpha level, the df, and the t-value, you can look the t-value up in a standard table of
significance to determine whether the t-value is large enough to be significant.




      (column headings are one-tailed probabilities)
      df           .10            .05            .025            .01            .005           .0005
      1           3.078          6.314         12.706          31.821         63.657        636.619
      2           1.886          2.920          4.303           6.965          9.925         31.598
      3           1.638          2.353          3.182           4.541          5.841         12.941
      4           1.533          2.132          2.776           3.747          4.604          8.610
      5           1.476          2.015          2.571           3.365          4.032          6.859
      6           1.440          1.943          2.447           3.143          3.707          5.959
      7           1.415          1.895          2.365           2.998          3.499          5.405
      8           1.397          1.860          2.306           2.896          3.355          5.041
      9           1.383          1.833          2.262           2.821          3.250          4.781
      10          1.372          1.812          2.228           2.764          3.169          4.587
      11          1.363          1.796          2.201           2.718          3.106          4.437
      12          1.356          1.782          2.179           2.681          3.055          4.318
      13          1.350          1.771          2.160           2.650          3.012          4.221
      14          1.345          1.761          2.145           2.624          2.977          4.140
      15          1.341          1.753          2.131           2.602          2.947          4.073
      16          1.337          1.746          2.120           2.583          2.921          4.015
      17          1.333          1.740          2.110           2.567          2.898          3.965
      18          1.330          1.734          2.101           2.552          2.878          3.922
      19          1.328          1.729          2.093           2.539          2.861          3.883
      20          1.325          1.725          2.086           2.528          2.845          3.850
      21          1.323          1.721          2.080           2.518          2.831          3.819
      22          1.321          1.717          2.074           2.508          2.819          3.792
      23          1.319          1.714          2.069           2.500          2.807          3.767
      24          1.318          1.711          2.064           2.492          2.797          3.745
      25          1.316          1.708          2.060           2.485          2.787          3.725
      26          1.315          1.706          2.056           2.479          2.779          3.707
      27          1.314          1.703          2.052           2.473          2.771          3.690
      28          1.313          1.701          2.048           2.467          2.763          3.674
      29          1.311          1.699          2.045           2.462          2.756          3.659
      30          1.310          1.697          2.042           2.457          2.750          3.646
      40          1.303          1.684          2.021           2.423          2.704          3.551
      60          1.296          1.671          2.000           2.390          2.660          3.460
     120          1.289          1.658          1.980           2.358          2.617          3.373
       ∞          1.282          1.645          1.960           2.326          2.576          3.291


 If your calculated t value is greater than the number in the table, you can
 conclude that the means for the two groups are significantly different.



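In our example, df = 28. The column headings of the table are one-tailed probabilities, so the .05
column gives 1.701 for a one-tailed test, and the .025 column gives 2.048 for a two-tailed test at an
alpha level of .05. Our calculated value (2.07) is greater than both numbers, so we conclude that the
means of the two groups ARE SIGNIFICANTLY DIFFERENT.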
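
The table lookup can be sketched in Python. This is only an illustration: the dictionary holds a few critical values copied from the df = 28 row of the table, keyed by column heading, and the names DF28_ROW and is_significant are made up here. It assumes the column headings are one-tailed probabilities, so a two-tailed test uses the column for alpha divided by 2.

```python
# Critical values copied from the df = 28 row of the table,
# keyed by the one-tailed probability in the column heading.
DF28_ROW = {0.10: 1.313, 0.05: 1.701, 0.025: 2.048, 0.01: 2.467, 0.005: 2.763}

def is_significant(t, alpha=0.05, tails=2, row=DF28_ROW):
    """Compare |t| with the tabulated critical value for the chosen alpha and tails."""
    column = alpha / tails           # a two-tailed test splits alpha between both tails
    return abs(t) > row[column]      # raises KeyError if that column is not in the row

print(is_significant(2.07, alpha=0.05, tails=1))   # compares 2.07 with 1.701
print(is_significant(2.07, alpha=0.05, tails=2))   # compares 2.07 with 2.048
print(is_significant(2.07, alpha=0.01, tails=2))   # compares 2.07 with 2.763
```

The example result is significant at the .05 level whether one or two tails are used, but not at the stricter .01 level.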


To check your answers
Sometimes it is nice to check your answers to make sure you are doing the calculations right. Use this
website to check your results.
Performing a t-test with Excel

Excel calculates a t-test in a slightly different way. Rather than giving you the t value to compare
with a table, Excel tells you the probability of getting a difference between the means at least this
large by chance alone if the two groups were really the same. This is called a “P value.”

Follow these steps to calculate a P value using a t-test with Excel:

Step 1: Create two columns, side by side, for the data of interest. Each sample’s data should be in
separate columns like in the example above.

Step 2: Click on another blank cell where you wish the P value to appear.

Step 3: Then click “fx” on the Excel toolbar and choose “statistical” from the “function” list, then
“TTest” from the list (named T.TEST in newer versions of Excel).

Step 4: Set the t-test parameters:
   ·   For “Array1” highlight the data from one sample; for “Array2”, highlight the data in the second
       column.
   ·   Enter “2” in the box for “Tails.”
   ·   Lastly, you will have to select the “Type” of t-test. For our purposes type “2” (two samples
       with equal variance).
   ·   After answering these questions click “OK” and the P value will appear. The P value will fall
       between zero and one.

Step 5: What does my P value mean? Using Excel with the same data from the sample given above,
Excel gives the number 0.05. This means that if the two groups really had the same mean, a difference
at least this large would turn up by random chance alone only about 5% of the time. Normally we
say that a P value of .05 or less is significant.
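
For readers without Excel, a two-tailed P value can be approximated in Python with the standard library alone. This is a rough sketch under stated assumptions, not Excel's own method: it integrates the density of Student's t distribution numerically with Simpson's rule, and the function names t_pdf and two_tailed_p are made up here.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value: 1 minus the area between -|t| and +|t| (Simpson's rule)."""
    t = abs(t)
    h = t / steps                              # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)        # endpoints of the interval 0..|t|
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3                  # area between 0 and |t|
    return 1 - 2 * half_area                   # the density is symmetric about 0

p = two_tailed_p(2.07, 28)
print(round(p, 3))
```

With the t value (2.07) and degrees of freedom (28) from the worked example, the result comes out just under .05, in line with the value Excel reports.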

								