# Statistic by hcj

```
Statistic          Definition and Use                                 Method of Calculation

Mean
·    The average of all data entries.              ·    The sum of all the results
·    Measure of central tendency for normally           divided by the number of
     distributed data.                                  results.
·    DO NOT calculate a mean from values
·    DO NOT calculate a mean of ratios or
     percentages for groups of several
     different sizes; go back to the raw data
     and recalculate.
·    DO NOT calculate a mean when the
     measurement scale is not linear (e.g. pH
     units are not measured on a linear scale).

Median
·    The middle value of a range of results.       ·    Arrange the data in
·    A good measure of central tendency for             increasing rank order.
     skewed distributions.                         ·    Identify the middle value.
                                                   ·    If there is an even number
                                                        of data points, the median
                                                        is found by adding the
                                                        middle two values and
                                                        dividing by two.

Mode
·    The value that appears the greatest           ·    Identify the category with
     number of times.                                   the highest number of data
·    Suitable for bimodal distributions and             entries using a tally chart
     qualitative data.                                  or bar graph.

Range
·    The difference between the smallest and       ·    Identify the smallest and
     largest data values.                               largest values and find the
·    Provides a crude indication of data spread.        difference between them.

Standard Deviation
Measuring the spread of the data
Averages do not tell us everything about a sample. Samples can be very uniform with the data all
bunched around the mean or they can be spread out a long way from the mean. The statistic that
measures this spread is called the standard deviation.

·   The standard deviation is a measure of the variation of the results, or the degree to which each
    data point in the set of data points varies (or deviates) from the mean.

·   The wider the spread of scores, the larger the standard deviation.

·   For data that has a normal distribution, about 68% of the data lies within one standard
    deviation of the mean.

Calculate the standard deviation by subtracting the mean of a distribution from each individual
value in the distribution, squaring each resulting difference, summing these squared differences,
then dividing this sum by one less than the number of values (n - 1), and finally taking the square
root of this quotient:

S = √( Σ(X - M)² / (n - 1) )

S = standard deviation
Σ = sum of
X = individual score
M = mean of all scores
n = sample size (number of scores)

Example: Given the set of numbers {20, 23, 25, 26}, calculate the mean and the standard deviation.
A. Mean = (20+23+25+26)/4 = 23.5

B.    Standard deviation
1. Calculate (X-M)
a. The mean of these numbers was found to be equal to 23.5.
b. The deviations from the mean (written as positive differences, since they are squared in the
next step) are respectively:
·   23.5 - 20 = 3.5
·   23.5 - 23 = 0.5
·   25 - 23.5 = 1.5
·   26 - 23.5 = 2.5

2.   Square each of these deviations to determine (X-M)²
·   (3.5)² = 12.25
·   (0.5)² = 0.25
·   (1.5)² = 2.25
·   (2.5)² = 6.25

3.   Add the values from step 2 together to get ∑(X-M)²
·   12.25 + 0.25 + 2.25 + 6.25 = 21

4.   Calculate (n-1) by subtracting 1 from your sample size
·   Since there were 4 original numbers, our n=4
·   Therefore (n-1) = 3

5.   Divide the answer from step 3 by the answer from step 4 to find ∑(X-M)² / (n-1)
·   21 / 3 = 7

6.    Calculate the square root of your answer from step 5 to determine the standard deviation!

The square root of 7 is approximately 2.65

7.    Answer: the standard deviation of the set of numbers {20, 23, 25, 26} is 2.65. For normally
distributed data, this would mean that about 68% of the values lie within 2.65 of the mean
(between 23.5 - 2.65 = 20.85 and 23.5 + 2.65 = 26.15).

Using EXCEL to calculate the mean and the standard deviation

1.  Type the values you are trying to find the mean for in a column. You can label the column,
    but you don’t have to.
2.  Determine which box you want the mean to appear in. In the example, I want the mean to
    appear in box A12. In that box, type: =AVERAGE(A2:A11) where A2:A11 are the box labels for
    the data you want to average. Basically you are telling Excel to average boxes A2 through A11.
3.  Determine which box you want the standard deviation to appear in. In the example, I want the
    standard deviation to appear in box A13. In that box, type: =STDEV(A2:A11) where A2:A11 are
    the box labels for the data for which you want to find the standard deviation.

Example spreadsheet:

      A
1     Number of Pennies
2     134
3     130
4     136
5     132
6     131
7     137
8     131
9     135
10    130
11    129
12    132.5
13    2.798809

Calculating mean and standard deviation on the TI-83:
1.    First we have to enter the data. Hit the STAT button and you will see
the options EDIT, CALC and TESTS atop the screen. Use the left and right arrows (if
necessary) to move the cursor to EDIT, then select 1:Edit...
2.    Now you will see a table with the headings L1 and L2. Enter the values under L1 (if you want to
clear pre-existing data first, move the cursor to the top of the column, hit CLEAR and then
ENTER.)
3.    Once all the data is entered, go back to the STAT menu, but this time move the cursor to
CALC.
4.    Once you're in the CALC menu, select 1-Var Stats, then hit ENTER.
5.    The calculator will display the mean (x̄), the sums Σx and Σx², and then the standard
deviation (Sx). Note that Sx is what we called s in class; the calculator refers to it as Sx. This
is followed by σx (sigma x, which is what you would get as the standard deviation if you had used
n instead of n-1), and finally the sample size (there are n = 4 observations).
T-Test
A t-test is used to determine if the means of two samples (often an experimental and a control group)
are truly, or at least significantly, different or if the difference between them is plausibly due to
random variation not related to the hypothesis being tested.

The formula for the t-test is a ratio. The top part of the ratio is the difference between the two
means or averages. The bottom part is a measure of the variability of the data.

Let’s use an example to help you learn the t-test:

Sample 1    Sample 2
7.85        12.50
8.51        12.94
13.66       6.26
11.03       6.10
6.59        13.19
8.04        10.74
14.16       6.06
8.13        12.53
6.79        15.45
11.06       15.64
5.83        15.19
10.73       14.93
6.68        7.94
5.02        8.28
10.37       12.65

Step 1: Find the means for each sample

Sample 1 mean = 8.96
Sample 2 mean = 11.36

Step 2: Find the absolute value of the difference between the means.
This is the top part of the t-test formula.

Mean 1 – mean 2 = x1 – x2 = 8.96 – 11.36 = -2.40
Absolute value = 2.40

Step 3: The bottom part is called the standard error of the difference. To compute it, first find
the standard deviation for each sample.

Sample 1 SD = 2.76
Sample 2 SD = 3.55

Step 4: Square the standard deviation for each group to find the “variance” for each group.
(The variances below were computed from the unrounded standard deviations, so they differ slightly
from the squares of the rounded values above.)

Sample 1 variance = (2.76)² = 7.63
Sample 2 variance = (3.55)² = 12.57

Step 5: Divide each squared standard deviation by the sample size of that group.

Sample 1: 7.63 / 15 = 0.51
Sample 2: 12.57 / 15 = 0.84

Step 6: Add these two values

0.51 + 0.84 = 1.35

Step 7: Take the square root of the number to find the “standard error of the difference”

√1.35 = 1.16

Step 8: divide the difference in the means (step 2) by the standard error of the difference (step 7)

T = 2.40 / 1.16 =   2.07

Step 9: You need to determine the degrees of freedom (df) for the test. In the t-test, the degrees
of freedom is the sum of the sample sizes of both groups minus 2.

DF = (15 +15) – 2 = 28

Step 10: Once you compute the t-value (answer from step 8) and the degrees of freedom (answer
from step 9) you have to look it up in a table of significance to test whether the ratio is large enough to
say that the difference between the groups is not likely to have been a chance finding. To test the
significance, you need to set a risk level (called the alpha level). In most research, the "rule of thumb"
is to set the alpha level at .05. This means that five times out of a hundred you would find a
statistically significant difference between the means even if there was none (i.e., by "chance").

Given the alpha level, the df, and the t-value, you can look the t-value up in a standard table of
significance to determine whether the t-value is large enough to be significant.

df           .10              .05           .025            .01            .005           .0005
1          3.078            6.314         12.706          31.821         63.657         636.619
2           1.886            2.920          4.303          6.965          9.925          31.598
3           1.638            2.353          3.182          4.541           5.841          12.941
4           1.533          2.132          2.776           3.747          4.604          8.610
5           1.476          2.015          2.571           3.365          4.032          6.859
6           1.440          1.943          2.447           3.143          3.707          5.959
7           1.415          1.895          2.365           2.998          3.499          5.405
8           1.397          1.860          2.306           2.896          3.355          5.041
9           1.383          1.833          2.262           2.821          3.250          4.781
10          1.372          1.812          2.228           2.764          3.169          4.587
11          1.363          1.796          2.201           2.718          3.106          4.437
12          1.356          1.782          2.179           2.681          3.055          4.318
13          1.350          1.771          2.160           2.650          3.012          4.221
14          1.345          1.761          2.145           2.624          2.977          4.140
15          1.341          1.753          2.131           2.602          2.947          4.073
16          1.337          1.746          2.120           2.583          2.921          4.015
17          1.333          1.740          2.110           2.567          2.898          3.965
18          1.330          1.734          2.101           2.552          2.878          3.922
19          1.328          1.729          2.093           2.539          2.861          3.883
20          1.325          1.725          2.086           2.528          2.845          3.850
21          1.323          1.721          2.080           2.518          2.831          3.819
22          1.321          1.717          2.074           2.508          2.819          3.792
23          1.319          1.714          2.069           2.500          2.807          3.767
24          1.318          1.711          2.064           2.492          2.797          3.745
25          1.316          1.708          2.060           2.485          2.787          3.725
26          1.315          1.706          2.056           2.479          2.779          3.707
27          1.314          1.703          2.052           2.473          2.771          3.690
28          1.313          1.701          2.048           2.467          2.763          3.674
29          1.311          1.699          2.045           2.462          2.756          3.659
30          1.310          1.697          2.042           2.457          2.750          3.646
40          1.303          1.684          2.021           2.423          2.704          3.551
60          1.296          1.671          2.000           2.390          2.660          3.460
120          1.289          1.658          1.980           2.358          2.617          3.373
∞          1.282          1.645          1.960           2.326          2.576          3.291

If your calculated t value is greater than the number in the table, you can
conclude that the means of the two groups are significantly different.

In our example, the number in the table for df = 28 in the .05 column is 1.701. Since our calculated
value (2.07) is greater than the number in the table, we conclude that the two groups
ARE SIGNIFICANTLY DIFFERENT. Note that the column headings are one-tailed alpha levels; for a
two-tailed test at the .05 level, use the .025 column (2.048 for df = 28), which 2.07 also exceeds.

Sometimes it is nice to check your answers to make sure you are doing the calculations right. Use this
Performing a t-test with Excel

Excel reports the result of a t-test in a slightly different way. Rather than giving you the t value
to compare with a table, Excel gives you the probability of seeing a difference between the means
at least as large as the one observed if chance alone were at work. This is called a “P value.”

Follow these steps to calculate a P value using a t-test with Excel:

Step 1: Create two columns, side by side, for the data of interest. Each sample’s data should be in
separate columns like in the example above.

Step 2: Click on another blank cell where you wish the P value to appear.

Step 3: Then click “fx” on the Excel toolbar and choose “statistical” from the “function” list, then
“TTest” from the list.

Step 4: Set the t-test parameters:
·   For “Array1” highlight the data from one sample; for “Array2”, highlight the data in the second
    column.
·   Enter “2” in the box for “Tails.”
·   Lastly, you will have to select the “Type” of t-test. For our purposes type “2.”
·   After answering these questions click “OK” and the P value will appear. The P value will fall
    between zero and one.

Step 5: What does my P value mean? Using Excel with the same data from the sample given above,
Excel gives the number 0.05. This means that, if there were truly no difference between the two
groups, a difference at least this large would turn up by random chance alone about 5% of the
time. Normally we say that a P value of .05 or less is significant.

```
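
The descriptive statistics and the standard deviation steps above can be checked with a short script. This is a minimal sketch in Python using only the standard library; the helper name `sample_sd` and the variable names are illustrative assumptions, not part of the handout. It uses the n - 1 divisor, as in the worked example and Excel's STDEV.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation: sqrt( sum((X - M)^2) / (n - 1) )."""
    n = len(values)
    mean = sum(values) / n
    # Steps 1-3 of the worked example: deviations, squares, and their sum.
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    # Steps 4-6: divide by (n - 1), then take the square root.
    return math.sqrt(sum_of_squares / (n - 1))


# The four-number example from the handout.
scores = [20, 23, 25, 26]
scores_mean = statistics.mean(scores)
scores_sd = sample_sd(scores)

# The penny counts from the Excel example (cells A2:A11).
pennies = [134, 130, 136, 132, 131, 137, 131, 135, 130, 129]
pennies_mean = statistics.mean(pennies)
pennies_sd = sample_sd(pennies)
pennies_median = statistics.median(pennies)   # mean of the two middle values
pennies_modes = statistics.multimode(pennies)  # every value tied for most frequent
pennies_range = max(pennies) - min(pennies)

print(f"scores:  mean={scores_mean}, sd={scores_sd:.2f}")
print(f"pennies: mean={pennies_mean}, sd={pennies_sd:.6f}")
print(f"pennies: median={pennies_median}, modes={pennies_modes}, range={pennies_range}")
```

The standard library's `statistics.stdev` applies the same n - 1 formula, so it can serve as a cross-check on the hand-written helper.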
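
Steps 1 through 9 of the t-test walkthrough can also be sketched in Python with the standard library. The function name `t_statistic` is a hypothetical helper introduced here for illustration. The critical value 2.048 is read from the table at df = 28 in the .025 column, on the assumption that a two-tailed test at alpha = .05 is wanted (matching the "Tails = 2" setting in the Excel steps).

```python
import math
import statistics

sample1 = [7.85, 8.51, 13.66, 11.03, 6.59, 8.04, 14.16, 8.13,
           6.79, 11.06, 5.83, 10.73, 6.68, 5.02, 10.37]
sample2 = [12.50, 12.94, 6.26, 6.10, 13.19, 10.74, 6.06, 12.53,
           15.45, 15.64, 15.19, 14.93, 7.94, 8.28, 12.65]


def t_statistic(a, b):
    """t = |mean(a) - mean(b)| / sqrt(var(a)/n_a + var(b)/n_b)."""
    # Steps 1-2: absolute difference between the two sample means.
    mean_difference = abs(statistics.mean(a) - statistics.mean(b))
    # Steps 3-4: sample variances (squared sample standard deviations).
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    # Steps 5-7: standard error of the difference.
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    # Step 8: ratio of the difference to its standard error.
    return mean_difference / standard_error


t = t_statistic(sample1, sample2)
# Step 9: degrees of freedom.
df = len(sample1) + len(sample2) - 2

# Assumed critical value: table entry for df = 28, .025 column.
critical_value = 2.048
significant = t > critical_value

print(f"t = {t:.2f}, df = {df}, significant at two-tailed .05: {significant}")
```

Because the script carries full precision through every step, its t value can differ in the third decimal place from a hand calculation that rounds at each stage.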