# lec.24 empirical.ppt

```					EMPIRICAL RELATION BETWEEN
THE MEAN,
MEDIAN AND THE MODE

          Tariq Mahmood Bajwa
Median in Case of a Frequency Distribution of a Continuous
Variable:
In case of a frequency distribution, the median is given by the
formula :
\tilde{X} = l + \frac{h}{f}\left(\frac{n}{2} - c\right)
where
l=    lower class boundary of
the median class (i.e. that class for which the
cumulative frequency is just in excess of n/2).
h=    class interval size of the median class
f=    frequency of the median class
n=    \sum f (the total number of observations)
c=    cumulative frequency of the class preceding the
median class
Example:
Going back to the example of
the EPA mileage ratings, we have
Mileage Rating   No. of Cars   Class Boundaries   Cumulative Frequency
30.0 – 32.9           2        29.95 – 32.95               2
33.0 – 35.9           4        32.95 – 35.95               6    <- c
36.0 – 38.9          14        35.95 – 38.95              20    <- median class
39.0 – 41.9           8        38.95 – 41.95              28
42.0 – 44.9           2        41.95 – 44.95              30

For the median class: l = 35.95, h = class interval = 3, f = 14.
For the class preceding the median class: c = 6.
In this example, n = 30 and n/2 = 15.
Thus the third class is the median class. The
median lies somewhere between 35.95 and 38.95.
Applying the above formula, we obtain

\tilde{X} = 35.95 + \frac{3}{14}(15 - 6)
          = 35.95 + 1.93
          = 37.88
          \approx 37.9
Interpretation

This result implies that half of the cars have mileage of up to
37.88 miles per gallon, whereas the other half of the cars have
mileage greater than 37.88 miles per gallon.
Example
The following table contains the ages of 50 managers of child-
care centers in five cities of a developed country.

Ages of a sample of managers
of Urban child-care centers
42       26        32        34       57
30       58        37        50       30
53       40        30        47       49
50       40        32        31       40
52       28        23        35       25
30       36        32        26       50
55       30        58        64       52
49       33        43        46       32
61       31        30        40       60
74       37        29        43       54
Having converted this data into a frequency
distribution, find the median age.
Solution
Following the various steps involved in the construction
of a frequency distribution, we obtained:

Frequency Distribution of
Child-Care Managers Age
Class Interval           Frequency
20 – 29                   6
30 – 39                  18
40 – 49                  11
50 – 59                  11
60 – 69                   3
70 – 79                   1
Total                   50
Now, the median is given by,

\tilde{X} = l + \frac{h}{f}\left(\frac{n}{2} - c\right)

where
l=      lower class boundary of the median class
h=      class interval size of the median class
f=      frequency of the median class
n=      \sum f (the total number of observations)
c=      cumulative frequency of the class preceding the
median class
First of all, we construct the column of class boundary
as well as the column of cumulative frequencies.

Class limits   Class Boundaries   Frequency f   Cumulative Frequency c.f
20 – 29        19.5 – 29.5             6                  6
30 – 39        29.5 – 39.5            18                 24
40 – 49        39.5 – 49.5            11                 35
50 – 59        49.5 – 59.5            11                 46
60 – 69        59.5 – 69.5             3                 49
70 – 79        69.5 – 79.5             1                 50
Total                                 50
Now, first of all we have to determine the median class
(i.e. that class for which the cumulative frequency is
just in excess of n/2).

In this example,

n = 50

implying that

n/2 = 50/2 = 25
The cumulative frequency first exceeds 25 in the class 40 – 49
(c.f = 35), so this is the median class:

Class limits   Class Boundaries   Frequency f   Cumulative Frequency c.f
20 – 29        19.5 – 29.5             6                  6
30 – 39        29.5 – 39.5            18                 24
40 – 49        39.5 – 49.5            11                 35    <- median class
50 – 59        49.5 – 59.5            11                 46
60 – 69        59.5 – 69.5             3                 49
70 – 79        69.5 – 79.5             1                 50
Total                                 50
Hence,
l = 39.5
h = 10
f = 11
and
c = 24
Substituting these values in the formula, we obtain:

\tilde{X} = 39.5 + \frac{10}{11}(25 - 24)
          = 39.5 + 0.9
          = 40.4
Interpretation
Thus, we conclude that the median age is 40.4
years.
In other words, 50% of the managers are
younger than this age, and 50% are older.
Example
WAGES OF WORKERS
IN A FACTORY
Monthly Income       No. of
(in Rupees)      Workers
Less than 2000/-      100
2000/- to 2999/-     300
3000/- to 3999/-     500
4000/- to 4999/-     250
5000/- and above      50
Total          1200
In this example, both the first class and the last class are open-
ended, because we do not have exact figures with which to begin the
first class or to end the last class. The advantage of computing the
median in the case of an open-ended frequency distribution is that,
except in the unlikely event of the median falling within an
open-ended class, there is no need to estimate the missing upper or
lower boundary. Here n/2 = 1200/2 = 600 and the cumulative
frequencies are 100, 400, 900, ..., so the median falls in the
closed class 3000/- to 3999/-.
EMPIRICAL RELATION BETWEEN THE MEAN,
MEDIAN AND THE MODE

• This is a concept which is not based on a rigid
mathematical formula; rather, it is based on
observation. In fact, the word ‘empirical’ implies
‘based on observation’.
•     This concept relates to the relative positions of
the mean, median and the mode in case of a hump-
shaped distribution.
•     In a single-peaked frequency distribution, the
values of the mean, median and mode coincide if the
frequency distribution is absolutely symmetrical.
[Figure: THE SYMMETRIC CURVE. Frequency f plotted against X for a
symmetric single-peaked distribution, where Mean = Median = Mode.]
But in the case of a skewed distribution, the mean,
median and mode do not all lie on the same point.
They are pulled apart from each other, and the
empirical relation explains the way in which this
happens. Experience tells us that in a unimodal
curve of moderate skewness, the median is usually
sandwiched between the mean and the mode.
The second point is that, in the case of many
real-life data-sets, it has been observed that the
distance between the mode and the median is
approximately double the distance between the
median and the mean, as shown below:
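[Figure: skewed single-peaked curve, frequency f plotted against X,
with the positions of the mode, median and mean marked.]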
This diagrammatic picture is equivalent to the
following algebraic expression:
Median - Mode \approx 2 (Mean - Median)    ------ (1)
The above-mentioned point can also be expressed in
the following way:
Mean - Mode \approx 3 (Mean - Median)    ------ (2)

Equation (1) as well as equation (2) yields the
approximate relation given below:
Mode \approx 3 Median - 2 Mean
An exactly similar situation holds in case of a
moderately negatively skewed distribution.
An important point to note is that this empirical
relation does not hold in case of a J-shaped or an
extremely skewed distribution.
Let us try to verify this relation for the
data of EPA Mileage Ratings that we
have been considering for the past
few lectures.
Frequency Distribution for EPA Mileage
Ratings

Class Limit   Class Boundaries   Frequency

30.0 – 32.9    29.95 – 32.95         2
33.0 – 35.9    32.95 – 35.95         4
36.0 – 38.9    35.95 – 38.95         14
39.0 – 41.9    38.95 – 41.95         8
42.0 – 44.9    41.95 – 44.95         2
Total         30
[Figure: Histogram of the EPA mileage ratings. X-axis: miles per
gallon (class boundaries 29.95 to 44.95); Y-axis: number of cars
(0 to 16).]
[Figure: Frequency polygon and frequency curve of the EPA mileage
ratings. X-axis: miles per gallon (28.45 to 46.45); Y-axis: number
of cars (0 to 16).]
• As mentioned above, the empirical relation
between mean, median and mode holds for
moderately skewed distributions and not for
extremely skewed ones.

• Hence, in this example, since the distribution
is only very slightly skewed, we can expect the
empirical relation between the mean, median and
the mode to hold reasonably well.
Arithmetic Mean:  \bar{X} = 37.85
Median:           \tilde{X} = 37.88
Mode:             \hat{X} = 37.825
Interesting Observation
The close proximity of the three measures of
central tendency provides a strong indication
of the fact that this particular distribution is
indeed very slightly skewed.
Mode \approx 3 Median - 2 Mean

3 Median - 2 Mean = 3(37.88) - 2(37.85)
                  = 113.64 - 75.70
                  = 37.94
Now, the mode = 37.825 which means that the left-
hand side is indeed very close to 37.94 i.e. the
right-hand side of the empirical relation.
Hence, the empirical relation

Mode \approx 3 Median - 2 Mean

is verified.

```
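
The two worked median examples and the check of the empirical relation can be reproduced with a short script. What follows is a minimal sketch, not part of the original lecture: the function names `grouped_median` and `empirical_mode` are hypothetical, and the class boundaries and frequencies are copied from the tables above.

```python
def grouped_median(boundaries, freqs):
    """Median of a grouped frequency distribution: l + (h / f) * (n / 2 - c).

    boundaries: list of (lower, upper) class boundaries in ascending order
    freqs:      class frequencies in the same order
    """
    n = sum(freqs)
    half = n / 2
    c = 0  # cumulative frequency of the classes preceding the current one
    for (lower, upper), f in zip(boundaries, freqs):
        # The median class is the first class whose cumulative
        # frequency is just in excess of n/2.
        if c + f > half:
            h = upper - lower  # class interval size
            return lower + (h / f) * (half - c)
        c += f
    raise ValueError("no median class found (are all frequencies zero?)")


def empirical_mode(mean, median):
    """Approximate mode from the empirical relation Mode ~ 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


# EPA mileage ratings (n = 30)
epa_boundaries = [(29.95, 32.95), (32.95, 35.95), (35.95, 38.95),
                  (38.95, 41.95), (41.95, 44.95)]
epa_freqs = [2, 4, 14, 8, 2]
epa_median = grouped_median(epa_boundaries, epa_freqs)

# Ages of child-care managers (n = 50)
age_boundaries = [(19.5, 29.5), (29.5, 39.5), (39.5, 49.5),
                  (49.5, 59.5), (59.5, 69.5), (69.5, 79.5)]
age_freqs = [6, 18, 11, 11, 3, 1]
age_median = grouped_median(age_boundaries, age_freqs)

# Empirical relation, using the rounded values quoted in the lecture
mode_estimate = empirical_mode(mean=37.85, median=37.88)

print(round(epa_median, 2))     # 37.88
print(round(age_median, 1))     # 40.4
print(round(mode_estimate, 2))  # 37.94
```

The estimate 37.94 differs from the quoted mode of 37.825 by roughly 0.12 miles per gallon, which is the sense in which the relation holds "reasonably well" for this slightly skewed distribution.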