					Educational Statistics

Measures of Central Tendency
 and Variability (Dispersion)
Over the Counter Drug Sales
                (in Millions)
  Tylenol      855          Mylanta       135
   Advil       360           Tums         135
   Vicks       350         Excedrin       130
One Touch      220         Benadryl       130
 Robitussin    205           Halls        130
   Bayer       170         Metamucil      125
Alka-Seltzer   160          Sudafed       115
  Centrum      150
   What is the AVERAGE number of drug sales?
       Measures of Central Tendency

Measures of central tendency tell us
  something about the “typicalness” of a set
  of data.
• Tell us what the typical score is in a
  distribution of scores.
• Three measures of central tendency:
  – Mode
  – Median
  – Mean
     Measures of central tendency:
              the mode
The mode is the score that occurs most
  frequently in a distribution.
Sometimes more than one score occurs at
  a frequency distinctly higher than the other
  scores, in which case there is more
  than one mode:
  – Bi-modal distributions.
  – Multi-modal distributions.
The mode is the only appropriate measure of
  central tendency for nominal data.
Over the Counter Drug Sales
                   (in Millions)

  Tylenol       855            Mylanta        135
   Advil        360             Tums          135
   Vicks        350            Excedrin       130
One Touch       220            Benadryl       130
 Robitussin     205              Halls        130
   Bayer        170           Metamucil       125
Alka-Seltzer    160            Sudafed        115
  Centrum       150
       What is the MODE of this distribution?
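
One way to check the answer is with Python's standard library. This is a minimal sketch, assuming the sales figures are read straight from the table above; the variable names are illustrative only.

```python
from statistics import multimode

# Over-the-counter drug sales (in millions), copied from the table above.
sales = [855, 360, 350, 220, 205, 170, 160, 150,
         135, 135, 130, 130, 130, 125, 115]

# multimode returns every value tied for the highest frequency,
# so it also copes with bi-modal and multi-modal distributions.
modes = multimode(sales)
print("Mode(s):", modes)
```

Because `multimode` returns a list, a distribution with two or more equally frequent scores would show all of them rather than just one.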
     Measures of central tendency:
           the median (Md)
The median is the middle score in an ordered
  distribution of scores.
It is the score that divides a distribution in
  half.
It is also the score at the 50th percentile
  rank.
The median can be found by computing the
  median location: (N + 1)/2.
The median is the most appropriate measure
  of central tendency for ordinal data.
      Measures of central tendency:
            the median (Md)
In the distribution given below, find the median.
What is the percentile rank (PR) of a score of 13?
What is the score corresponding to a percentile rank of 29?

  Score   Freq   Cum. F   Cum. %
     6      2       2       3.8
     8      5       7      13.5
     9      0       7      13.5
    10      8      15      28.8
    11     11      26      50.0
    12      9      35      67.3
    13      6      41      78.8
    14      4      45      86.5
    15      5      50      96.2
    16      2      52     100.0
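
The cumulative columns of the frequency table can be rebuilt from the Score and Freq columns alone. The sketch below assumes one common convention, reading the percentile rank of a score directly from its cumulative percentage; other textbooks use slightly different definitions, and the variable names are illustrative.

```python
from itertools import accumulate

# Score and frequency columns from the table above.
scores = [6, 8, 9, 10, 11, 12, 13, 14, 15, 16]
freqs = [2, 5, 0, 8, 11, 9, 6, 4, 5, 2]

n = sum(freqs)                                  # total number of scores
cum_freqs = list(accumulate(freqs))             # cumulative frequency
cum_pcts = [100 * cf / n for cf in cum_freqs]   # cumulative percentage

# Percentile rank of a score of 13: its cumulative percentage.
pr_of_13 = cum_pcts[scores.index(13)]

# Score for a percentile rank of 29: the score whose cumulative
# percentage lies closest to 29.
score_at_29 = min(zip(scores, cum_pcts), key=lambda sp: abs(sp[1] - 29))[0]

print(f"N = {n}")
print(f"PR of 13 = {pr_of_13:.1f}")
print(f"Score at PR 29 = {score_at_29}")
```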
Over the Counter Drug Sales
                  (in Millions)
  Tylenol       855              Mylanta       135
   Advil        360               Tums         135
   Vicks        350             Excedrin       130
One Touch       220             Benadryl       130
 Robitussin     205               Halls        130
   Bayer        170             Metamucil      125
Alka-Seltzer    160              Sudafed        115
  Centrum       150
       What is the Median of this distribution?
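
The median-location rule, (N + 1)/2, can be applied to the sales figures as follows. This is a minimal sketch under the assumption that a fractional location (even N) is resolved by averaging the two neighbouring scores; the names are illustrative.

```python
# Over-the-counter drug sales (in millions), copied from the table above.
sales = [855, 360, 350, 220, 205, 170, 160, 150,
         135, 135, 130, 130, 130, 125, 115]

ordered = sorted(sales)       # the median requires an ordered distribution
n = len(ordered)
location = (n + 1) / 2        # median location, counted from 1

if location.is_integer():
    # Odd N: the median is the score at that position.
    median = ordered[int(location) - 1]
else:
    # Even N: average the two scores on either side of the location.
    lower = ordered[int(location) - 1]
    upper = ordered[int(location)]
    median = (lower + upper) / 2

print(f"Median location = {location}, median = {median}")
```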
    Measures of central tendency:
        the mean (µ, M, or $\bar{X}$)

• The most widely used measure of
  central tendency:

  $$\bar{X} = \frac{\sum X}{N}$$

• In words, the mean ($\bar{X}$) is the sum (Σ)
  of the scores (the X’s) divided by the
  number of scores (N).
Over the Counter Drug Sales
                 (in Millions)
  Tylenol       855              Mylanta         135
   Advil        360                Tums          135
   Vicks        350              Excedrin        130
One Touch       220              Benadryl        130
 Robitussin     205                Halls         130
   Bayer        170             Metamucil        125
Alka-Seltzer    160               Sudafed        115
  Centrum       150
        What is the Mean of this distribution?
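
The mean follows directly from its definition: the sum of the scores divided by the number of scores. A minimal sketch, assuming the figures are taken from the table above:

```python
# Over-the-counter drug sales (in millions), copied from the table above.
sales = [855, 360, 350, 220, 205, 170, 160, 150,
         135, 135, 130, 130, 130, 125, 115]

total = sum(sales)    # the sum of the scores
n = len(sales)        # the number of scores, N
mean = total / n      # the mean: sum divided by N

print(f"Sum = {total}, N = {n}, mean = {mean:.2f}")
```

Note how far the mean sits above most of the individual values: the single large Tylenol figure pulls it upward.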
 Measures of Central Tendency
   with special Distributions
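• The mode and bimodal distributions.
  – For distributions with more than one mode, the
    mean and median can be misleading, because
    neither may fall near a typical score.
• The median and skewed distributions.
  – When a distribution is skewed, the mean is pulled
    toward the extreme scores and may be misleading;
    the median is usually the better summary.
  – The direction of skew can be judged from the
    relative positions of the mean, median, and mode:
    with positive skew the mean typically lies above
    the median, and with negative skew below it.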
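
As an illustration, the three measures can be compared for the drug sales data used earlier. The sketch below assumes the rough rule of thumb that a mean above the median suggests positive skew and a mean below it suggests negative skew; it is a heuristic, not a formal test.

```python
from statistics import mean, median, mode

# Over-the-counter drug sales (in millions), from the earlier table.
sales = [855, 360, 350, 220, 205, 170, 160, 150,
         135, 135, 130, 130, 130, 125, 115]

m = mean(sales)
md = median(sales)
mo = mode(sales)

# Rule of thumb only: compare the mean with the median.
if m > md:
    direction = "positive (right) skew"
elif m < md:
    direction = "negative (left) skew"
else:
    direction = "roughly symmetric"

print(f"mean = {m:.1f}, median = {md}, mode = {mo}: {direction}")
```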
    Measures of Variability
• How would you describe the variability in the
  distribution of SAT-V scores given below?
• In other words, how “spread-out” are the
  scores?
• Think about it.
• Write these values down.

     410   500
     450   515
     465   535
     485   545
     500   585
   Measures of Variability
• Three common measures of variability
  are
  – The Range.
  – The Variance.
  – The Standard deviation.
• Other measures of variability are
  – The interquartile range.
  – The quartile deviation or semi-interquartile
   range.
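
For the SAT-V scores, the interquartile range and the semi-interquartile range could be sketched as below. Quartile conventions differ between textbooks; this example assumes the default "exclusive" method of Python's `statistics.quantiles`, so other methods may give slightly different values.

```python
from statistics import quantiles

# SAT-V scores from the earlier slide.
sat_v = [410, 450, 465, 485, 500, 500, 515, 535, 545, 585]

# Cut points that divide the ordered data into four equal groups.
q1, q2, q3 = quantiles(sat_v, n=4)

iqr = q3 - q1          # interquartile range
semi_iqr = iqr / 2     # quartile deviation (semi-interquartile range)

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, semi-IQR = {semi_iqr}")
```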
     Measures of Variability
        The Range: (R)
• R = The difference between the
  largest value in the distribution and
  the smallest value in the distribution.
• That is, $R = X_{\text{largest}} - X_{\text{smallest}}$.
• Compute the range for the SAT-V
  distribution given.
• R = 585 – 410 = 175.
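
The same calculation as a short sketch, assuming the SAT-V scores from the earlier slide:

```python
# SAT-V scores from the earlier slide.
sat_v = [410, 450, 465, 485, 500, 500, 515, 535, 545, 585]

largest = max(sat_v)
smallest = min(sat_v)
score_range = largest - smallest   # R = largest value minus smallest value

print(f"R = {largest} - {smallest} = {score_range}")
```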
   Measures of Variability
    The Variance (Var):
• The variance is computationally more
  complex than the range.
• Defined as the average squared
  deviation from the mean of the
  distribution.
• In symbols:

  $$\text{Var} = \frac{\sum (X - \bar{X})^2}{N}$$
    Computing the Variance

• First, compute the sum: ΣX = 4990.
• Then, divide by N: $\bar{X}$ = 4990 / 10 = 499.

     X     X
    410   500
    450   515
    465   535
    485   545
    500   585
    Computing the Variance
• Next, subtract the mean from each score (call these
  deviations from the mean, or d):

     X      d      X      d
    410    -89    500      1
    450    -49    515     16
    465    -34    535     36
    485    -14    545     46
    500      1    585     86
    Computing the Variance
• Next, square the deviations from the mean:

     d      d²       d      d²
    -89    7921      1       1
    -49    2401     16     256
    -34    1156     36    1296
    -14     196     46    2116
      1       1     86    7396
    Computing the Variance
Now, sum the squared deviations: Σd² = 22740.
And divide by N: Var = 22740 / 10 = 2274.

     d      d²       d      d²
    -89    7921      1       1
    -49    2401     16     256
    -34    1156     36    1296
    -14     196     46    2116
      1       1     86    7396
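
The steps of the variance computation can be sketched in a few lines. This assumes the divide-by-N definition used on these slides (the average squared deviation from the mean); the variable names are illustrative.

```python
# SAT-V scores from the earlier slide.
sat_v = [410, 450, 465, 485, 500, 500, 515, 535, 545, 585]

# Step 1: compute the sum, then divide by N to get the mean.
n = len(sat_v)
mean = sum(sat_v) / n

# Step 2: subtract the mean from each score (deviations, d).
deviations = [x - mean for x in sat_v]

# Step 3: square the deviations.
squared = [d ** 2 for d in deviations]

# Step 4: sum the squared deviations and divide by N.
sum_sq = sum(squared)
variance = sum_sq / n

print(f"mean = {mean}")
print(f"sum of squared deviations = {sum_sq}")
print(f"variance = {variance}")
```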
  Measures of Variability
  The Standard Deviation:
Generally, we would prefer a measure of
  variability that tells us something about
  how far, on average, scores deviate from
  the mean.
This is what the standard deviation tells us.
Since the variance is the average squared
  deviation from the mean, it is expressed in
  squared units. The standard deviation,
  computed as the square root of the variance,
  returns to the original units and gives us
  roughly the typical deviation from the mean.
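
For the SAT-V scores, the standard deviation could be computed as sketched below. The sketch assumes the divide-by-N (population) variance described on these slides and cross-checks the result against `statistics.pstdev`, which uses the same definition.

```python
import math
from statistics import pstdev

# SAT-V scores from the earlier slide.
sat_v = [410, 450, 465, 485, 500, 500, 515, 535, 545, 585]

n = len(sat_v)
mean = sum(sat_v) / n

# Variance: average squared deviation from the mean (divide by N).
variance = sum((x - mean) ** 2 for x in sat_v) / n

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(f"variance = {variance}, SD = {sd:.2f}")
print(f"pstdev cross-check = {pstdev(sat_v):.2f}")
```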
     Measures of Variability
 The Coefficient of Variation (CV)
Distributions with larger means tend to
 have larger variances (and SDs) than
 distributions with smaller means.
The CV provides a convenient way to
 compare the variability of two or more
 distributions whose means differ.
  $$CV = \frac{SD}{\bar{X}}$$
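
As an illustration, the CV can be used to compare the two data sets from these slides, which have very different means. This is a sketch under the assumption that SD is the divide-by-N standard deviation; the helper name `coefficient_of_variation` is hypothetical.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return SD divided by the mean (SD computed with N in the denominator)."""
    return pstdev(values) / mean(values)

# SAT-V scores and over-the-counter drug sales from the earlier slides.
sat_v = [410, 450, 465, 485, 500, 500, 515, 535, 545, 585]
sales = [855, 360, 350, 220, 205, 170, 160, 150,
         135, 135, 130, 130, 130, 125, 115]

cv_sat = coefficient_of_variation(sat_v)
cv_sales = coefficient_of_variation(sales)

print(f"CV (SAT-V) = {cv_sat:.3f}")
print(f"CV (drug sales) = {cv_sales:.3f}")
```

Because the CV is a ratio, it has no units, which is what allows SAT-V points and sales in millions to be compared on the same footing.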
Using Statistics as Estimators
We are rarely interested in sample
 statistics…we are interested in
 population parameters.
Statistics are used to estimate (or
 make inferences about) parameters.
The best statistics are sufficient,
 unbiased, efficient, and robust (or
 resistant).
End of Presentation