lec.24 empirical.ppt

					EMPIRICAL RELATION BETWEEN
         THE MEAN,
   MEDIAN AND THE MODE



           Tariq Mahmood Bajwa
    Median in Case of a Frequency Distribution of a Continuous
                            Variable:
In case of a frequency distribution, the median is given by the
formula :
                        \[ \tilde{X} = l + \frac{h}{f}\left(\frac{n}{2} - c\right) \]
 where
 l=    lower class boundary of
       the median class (i.e. that class for which the
       cumulative frequency is just in excess of n/2).
 h=    class interval size of the median class
 f=    frequency of the median class
 n=    Σf (the total number of observations)
 c=    cumulative frequency of the class preceding the
       median class
    Example:
         Going back to the example of
    the EPA mileage ratings, we have
 Mileage      No. of       Class       Cumulative
  Rating      Cars      Boundaries     Frequency
30.0 – 32.9     2      29.95 – 32.95       2
33.0 – 35.9     4      32.95 – 35.95       6      ← c
36.0 – 38.9    14      35.95 – 38.95      20      ← median class
39.0 – 41.9     8      38.95 – 41.95      28
42.0 – 44.9     2      41.95 – 44.95      30

For the median class: l = 35.95, h = class interval = 3, f = 14, c = 6
              n/2 = 30/2 = 15
     In this example, n = 30 and n/2 = 15.
     Thus the third class is the median class. The
median lies somewhere between 35.95 and 38.95.
Applying the above formula, we obtain

          \[ \begin{aligned}
          \tilde{X} &= 35.95 + \frac{3}{14}\,(15 - 6) \\
                    &= 35.95 + 1.93 \\
                    &= 37.88 \\
                    &\approx 37.9
          \end{aligned} \]
          Interpretation

 This result implies that half of
 the cars have a mileage of at most
 37.88 miles per gallon, whereas
 the other half of the cars have a
 mileage greater than 37.88 miles
 per gallon.
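
The median formula and the EPA calculation above can be sketched in Python as follows. This is only an illustrative sketch: the function name `grouped_median` and its parameters are hypothetical, and it assumes that all classes have the same width h.

```python
def grouped_median(lower_boundaries, class_width, frequencies):
    """Median of a grouped frequency distribution: l + (h / f) * (n/2 - c).

    Hypothetical helper; assumes every class has the same width.
    """
    n = sum(frequencies)
    half = n / 2
    cumulative = 0  # cumulative frequency of the classes already passed (c)
    for l, f in zip(lower_boundaries, frequencies):
        # Median class: first class whose cumulative frequency exceeds n/2.
        if cumulative + f > half:
            return l + (class_width / f) * (half - cumulative)
        cumulative += f
    raise ValueError("frequencies must contain at least one positive value")


# EPA mileage ratings: lower class boundaries, class width 3, frequencies.
median = grouped_median([29.95, 32.95, 35.95, 38.95, 41.95], 3, [2, 4, 14, 8, 2])
print(round(median, 2))  # 37.88
```
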
                      Example
The following table contains the ages of 50 managers of child-
care centers in five cities of a developed country.


          Ages of a sample of managers
           of Urban child-care centers
           42       26        32        34       57
           30       58        37        50       30
           53       40        30        47       49
           50       40        32        31       40
           52       28        23        35       25
           30       36        32        26       50
           55       30        58        64       52
           49       33        43        46       32
           61       31        30        40       60
           74       37        29        43       54
Having converted this data into a frequency
 distribution, find the median age.
                        Solution
Following the various steps involved in the construction
  of a frequency distribution, we obtained:


               Frequency Distribution of
               Child-Care Managers' Ages
           Class Interval           Frequency
              20 – 29                   6
              30 – 39                  18
              40 – 49                  11
              50 – 59                  11
              60 – 69                   3
              70 – 79                   1
               Total                   50
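
The tallying step behind this table can be sketched in Python as follows. This is a minimal sketch, assuming the class limits 20 – 29, 30 – 39, ..., 70 – 79 shown above; the variable names are illustrative only.

```python
# Ages of the 50 child-care managers, as listed in the example.
ages = [
    42, 26, 32, 34, 57,
    30, 58, 37, 50, 30,
    53, 40, 30, 47, 49,
    50, 40, 32, 31, 40,
    52, 28, 23, 35, 25,
    30, 36, 32, 26, 50,
    55, 30, 58, 64, 52,
    49, 33, 43, 46, 32,
    61, 31, 30, 40, 60,
    74, 37, 29, 43, 54,
]

# Lower class limits 20, 30, ..., 70 (class width 10).
lower_limits = list(range(20, 80, 10))
frequencies = [0] * len(lower_limits)
for age in ages:
    frequencies[(age - 20) // 10] += 1  # index of the class containing this age

for lower, f in zip(lower_limits, frequencies):
    print(f"{lower} - {lower + 9}: {f}")
print("Total:", sum(frequencies))
```
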
Now, the median is given by,

                  \[ \tilde{X} = l + \frac{h}{f}\left(\frac{n}{2} - c\right) \]

 where
 l=      lower class boundary of the median class
 h=      class interval size of the median class
 f=      frequency of the median class
 n=      Σf (the total number of observations)
 c=      cumulative frequency of the class preceding the
         median class
First of all, we construct the column of class boundary
  as well as the column of cumulative frequencies.


           Class limits   Class Boundaries   Frequency f   Cumulative Frequency c.f
             20 – 29        19.5 – 29.5          6                  6
             30 – 39        29.5 – 39.5         18                 24
             40 – 49        39.5 – 49.5         11                 35
             50 – 59        49.5 – 59.5         11                 46
             60 – 69        59.5 – 69.5          3                 49
             70 – 79        69.5 – 79.5          1                 50
              Total                             50
Now, first of all we have to determine the median class
 (i.e. that class for which the cumulative frequency is
 just in excess of n/2).



In this example,

            n = 50

            implying that

            n/2 = 50/2 = 25
           Class limits   Class Boundaries   Frequency f   Cumulative Frequency c.f
             20 – 29        19.5 – 29.5          6                  6
             30 – 39        29.5 – 39.5         18                 24
             40 – 49        39.5 – 49.5         11                 35      ← median class
             50 – 59        49.5 – 59.5         11                 46
             60 – 69        59.5 – 69.5          3                 49
             70 – 79        69.5 – 79.5          1                 50
              Total                             50
Hence,
l = 39.5
h = 10
f = 11
and
c = 24
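
The step of locating the median class and reading off l, h, f and c can be sketched as follows. This is an illustrative sketch only; the variable names are not from the lecture, and "just in excess of n/2" is interpreted here as the first cumulative frequency strictly greater than n/2.

```python
from itertools import accumulate

# Class boundaries and frequencies from the table above.
boundaries = [(19.5, 29.5), (29.5, 39.5), (39.5, 49.5),
              (49.5, 59.5), (59.5, 69.5), (69.5, 79.5)]
frequencies = [6, 18, 11, 11, 3, 1]

cumulative = list(accumulate(frequencies))  # [6, 24, 35, 46, 49, 50]
n = cumulative[-1]
half = n / 2

# Median class: first class whose cumulative frequency exceeds n/2.
k = next(i for i, cf in enumerate(cumulative) if cf > half)

l, upper = boundaries[k]
h = upper - l                            # class interval size
f = frequencies[k]                       # frequency of the median class
c = cumulative[k - 1] if k > 0 else 0    # c.f. of the preceding class
print(l, h, f, c)
```
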
Substituting these values in the formula, we obtain:


         \[ \begin{aligned}
         \tilde{X} &= 39.5 + \frac{10}{11}\,(25 - 24) \\
                   &= 39.5 + 0.9 \\
                   &= 40.4
         \end{aligned} \]
              Interpretation
Thus, we conclude that the median age is 40.4
  years.
In other words, 50% of the managers are
  younger than this age, and 50% are older.
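
As a rough check that is not part of the original lecture, the grouped-data median can be compared with the median of the 50 raw ages. The sketch below uses Python's standard `statistics.median`; the small difference arises because the grouped formula assumes that observations are spread evenly within the median class.

```python
import statistics

# Ages of the 50 child-care managers, as listed in the example.
ages = [
    42, 26, 32, 34, 57,
    30, 58, 37, 50, 30,
    53, 40, 30, 47, 49,
    50, 40, 32, 31, 40,
    52, 28, 23, 35, 25,
    30, 36, 32, 26, 50,
    55, 30, 58, 64, 52,
    49, 33, 43, 46, 32,
    61, 31, 30, 40, 60,
    74, 37, 29, 43, 54,
]

raw_median = statistics.median(ages)            # median of the ungrouped data
grouped_median = 39.5 + (10 / 11) * (25 - 24)   # l + (h / f) * (n/2 - c)

print(raw_median, round(grouped_median, 1))
```
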
       Example
    WAGES OF WORKERS
         IN A FACTORY
Monthly Income       No. of
   (in Rupees)      Workers
Less than 2000/-      100
 2000/- to 2999/-     300
 3000/- to 3999/-     500
 4000/- to 4999/-     250
5000/- and above      50
      Total          1200
In this example, both the first class and the last class are open-
ended classes, because we do not have exact figures with which to
begin the first class or to end the last class. The advantage of
computing the median in the case of an open-ended frequency
distribution is that, except in the unlikely event of the median
falling within an open-ended class, there is no need to estimate
the missing lower or upper boundary. Here n/2 = 1200/2 = 600 falls
in the closed class 3000/- to 3999/-, so the median formula can be
applied directly.
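
The point can be illustrated with a short Python sketch for the wage data. Note the assumption: the class boundaries are taken as 1999.5, 2999.5, 3999.5 and 4999.5, i.e. incomes are treated as recorded in whole rupees; the lecture does not state the boundaries. The unknown outer boundaries are represented by `None`.

```python
# (lower boundary, upper boundary, frequency); None marks an open end.
# ASSUMPTION: boundaries at x.5 rupees, as incomes are given in whole rupees.
classes = [
    (None, 1999.5, 100),     # "Less than 2000/-"
    (1999.5, 2999.5, 300),
    (2999.5, 3999.5, 500),
    (3999.5, 4999.5, 250),
    (4999.5, None, 50),      # "5000/- and above"
]

n = sum(f for _, _, f in classes)
half = n / 2
c = 0  # cumulative frequency of the classes before the current one
for lower, upper, f in classes:
    if c + f > half:  # this is the median class
        if lower is None or upper is None:
            raise ValueError("median falls in an open-ended class")
        median = lower + (upper - lower) / f * (half - c)
        break
    c += f

print(median)  # 3399.5
```

The open-ended classes contribute only their frequencies, so their missing boundaries are never used.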
     EMPIRICAL RELATION BETWEEN THE MEAN,
             MEDIAN AND THE MODE


• This is a concept which is not based on a rigid
  mathematical formula; rather, it is based on
  observation. In fact, the word ‘empirical’ means
  ‘based on observation’.
• This concept relates to the relative positions of
  the mean, median and the mode in the case of a
  hump-shaped distribution.
• In a single-peaked frequency distribution, the
  values of the mean, median and mode coincide if the
  frequency distribution is perfectly symmetrical.
    THE SYMMETRIC CURVE
[Figure: symmetric, single-peaked frequency curve (f plotted against X), with the peak at Mean = Median = Mode]
But in the case of a skewed distribution, the mean,
median and mode do not coincide. They are pulled
apart from each other, and the empirical relation
describes the way in which this happens. Experience
tells us that in a unimodal curve of moderate
skewness, the median usually lies between the mean
and the mode.
       The second point is that, in the case of many
real-life data-sets, it has been observed that the
distance between the mode and the median is
approximately double the distance between the
median and the mean, as shown below:
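[Figure: moderately positively skewed frequency curve (f plotted against X), with the Mode at the peak, the Median to its right, and the Mean furthest to the right]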
        This diagrammatic picture is equivalent to the
 following algebraic expression:
\[ \text{Median} - \text{Mode} \approx 2\,(\text{Mean} - \text{Median}) \qquad (1) \]
        The above-mentioned point can also be expressed in
the following way:
\[ \text{Mean} - \text{Mode} \approx 3\,(\text{Mean} - \text{Median}) \qquad (2) \]

      Equation (1) as well as equation (2) yields the
approximate relation given below:
               EMPIRICAL RELATION
               BETWEEN THE MEAN,
              MEDIAN AND THE MODE

\[ \text{Mode} \approx 3\,\text{Median} - 2\,\text{Mean} \]
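
The algebra connecting the statement about distances to this relation is short. It is written out here as a worked step, not taken from the slides: the mode-to-median distance being about twice the median-to-mean distance gives the first line, and adding (Mean − Median) to both sides gives the Mean − Mode form.

```latex
\begin{aligned}
\text{Median} - \text{Mode} &\approx 2\,(\text{Mean} - \text{Median}) \\
\text{Mode} &\approx \text{Median} - 2\,\text{Mean} + 2\,\text{Median} \\
\text{Mode} &\approx 3\,\text{Median} - 2\,\text{Mean} \\[6pt]
\text{Mean} - \text{Mode} &= (\text{Mean} - \text{Median}) + (\text{Median} - \text{Mode}) \\
&\approx (\text{Mean} - \text{Median}) + 2\,(\text{Mean} - \text{Median}) \\
&= 3\,(\text{Mean} - \text{Median})
\end{aligned}
```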
     An exactly similar situation holds in case of a
moderately negatively skewed distribution.
       An important point to note is that this empirical
relation does not hold in the case of a J-shaped or an
extremely skewed distribution.
Let us try to verify this relation for the
 data of EPA Mileage Ratings that we
 have been considering for the past
 few lectures.
Frequency Distribution for EPA Mileage
               Ratings

       Class Limit   Class Boundaries   Frequency


      30.0 – 32.9    29.95 – 32.95         2
      33.0 – 35.9    32.95 – 35.95         4
      36.0 – 38.9    35.95 – 38.95         14
      39.0 – 41.9    38.95 – 41.95         8
      42.0 – 44.9    41.95 – 44.95         2
                             Total         30
[Figure: Histogram of the EPA mileage ratings. X-axis: Miles per gallon (class boundaries 29.95 to 44.95); Y-axis: Number of Cars (0 to 16)]
[Figure: Frequency polygon and frequency curve of the EPA mileage ratings. X-axis: Miles per gallon (class midpoints 28.45 to 46.45); Y-axis: Number of Cars (0 to 16)]
• As mentioned above, the empirical relation
  between mean, median and mode holds for
  moderately skewed distributions and not for
  extremely skewed ones.


• In this example, the distribution is only very
  slightly skewed, so we can expect the empirical
  relation between the mean, median and the mode
  to hold reasonably well.
Arithmetic Mean:

          \bar{X} = 37.85
Median:

          \tilde{X} = 37.88
Mode:

          \hat{X} = 37.825
        Interesting Observation
The close proximity of the three measures of
 central tendency provides a strong indication
 of the fact that this particular distribution is
 indeed very slightly skewed.
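
The three values quoted above can be reproduced from the grouped EPA data with the sketch below. The median uses the formula given in this lecture. The mean (from class midpoints) and the mode, computed as l + (fm − f1) / ((fm − f1) + (fm − f2)) × h, use the usual grouped-data formulas, which are assumed here to be the ones used in the earlier lectures; all names are illustrative.

```python
# EPA mileage ratings: class boundaries and frequencies.
boundaries = [29.95, 32.95, 35.95, 38.95, 41.95, 44.95]
freqs = [2, 4, 14, 8, 2]
h = 3  # class interval size
n = sum(freqs)

# Mean from class midpoints (ASSUMED standard grouped-data formula).
midpoints = [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

# Median: l + (h / f) * (n/2 - c), as in the lecture.
c = 0
for l, f in zip(boundaries, freqs):
    if c + f > n / 2:
        median = l + (h / f) * (n / 2 - c)
        break
    c += f

# Mode (ASSUMED standard grouped-data formula):
# l + (fm - f1) / ((fm - f1) + (fm - f2)) * h
k = freqs.index(max(freqs))                    # modal class
fm = freqs[k]
f1 = freqs[k - 1] if k > 0 else 0              # frequency of preceding class
f2 = freqs[k + 1] if k < len(freqs) - 1 else 0 # frequency of following class
mode = boundaries[k] + (fm - f1) / ((fm - f1) + (fm - f2)) * h

print(round(mean, 2), round(median, 2), round(mode, 3))
```
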
   EMPIRICAL RELATION
   BETWEEN THE MEAN,
  MEDIAN AND THE MODE


\[ \text{Mode} \approx 3\,\text{Median} - 2\,\text{Mean} \]
\[ \begin{aligned}
3\,\text{Median} - 2\,\text{Mean} &= 3(37.88) - 2(37.85) \\
&= 113.64 - 75.70 \\
&= 37.94
\end{aligned} \]
Now, the mode = 37.825, which means that the left-hand
  side is indeed very close to 37.94, i.e. the
  right-hand side of the empirical relation. The two
  sides differ by only 0.115 miles per gallon.
Hence, the empirical relation

        \[ \text{Mode} \approx 3\,\text{Median} - 2\,\text{Mean} \]

  is verified.
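
The same check can be sketched in a few lines of Python, using the three values quoted above. The variable names are illustrative, and what counts as "close" is a matter of judgment, not a fixed threshold.

```python
# Values reported for the EPA mileage ratings.
mean, median, mode = 37.85, 37.88, 37.825

estimated_mode = 3 * median - 2 * mean   # right-hand side of the relation
gap = abs(estimated_mode - mode)         # distance from the actual mode

print(round(estimated_mode, 2))  # 37.94
print(f"gap = {gap:.3f}")
```
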

				