# Summary Descriptive Measures by rogerholland

```
Summary Descriptive Measures

[Figure: histogram "Projects Completed Early". x-axis: Percent; y-axis: Percentage (0 to 35).]

Location is an indicator of where the data is located.

[Figure: "Projects Completed Early" for Plant A and Plant B. x-axis: Percent (10 to 50); y-axis: % (0 to 40).]

Scale is a measure of how “spread out” data is.

Criteria for Measures of Location and Scale

Must be well defined for:   Raw Data
                            Grouped Data
                            Theoretical Curves

For Business Purposes:      Must be arithmetic

Measures of Location

Mode
Simply the most frequent value in a data set.

Problems:

Raw Data:   Many data sets have no repeated values, so the mode does not
exist.
Grouped Data:    The mode is taken as the midpoint of the bin with the greatest
frequency.

But consider the data discussed in the last lecture.

[Figure: "Histogram of Labor Costs". x-axis: Labor Cost (ticks 20 to 80 in steps of 10); y-axis: Frequency (0 to 30).]

[Figure: "Histogram of Labor Costs". x-axis: Labor Costs (ticks 25 to 75 in steps of 10); y-axis: Frequency (0 to 35).]

Theoretical Data: Mode may not exist; consider the theoretical distribution of
random numbers which should look like:

[Figure: "Uniform Density Function". x-axis: x = random number; y-axis: f(x) (0 to 1.2).]

Measures of Location

Median
The median is that data value which has approximately the same percentage
of observations below it as above it (for large data sets this proportion will approach
50%).

The word “median” comes from the Latin word “medius”, meaning
“middle”.

Raw Data:

Finding the median from raw data is a two-step process. First you
must put the data in order, then you need to find the middle value.

Example:      Data = 3, -1, 6, 10, 11

Ordered Data = -1, 3, 6, 10, 11

Median = 6

If sample size is odd then median will be the value occupying
position (n+1)/2 in the ordered data.

Example:      Data = 3, -1, 6, 10, 11, 7

Ordered Data= -1, 3, 6, 7, 10, 11

Median = any value between 6 and 7. Usually the two middle
points are averaged to get 6.5.

If sample size is even then median is the arithmetic average of
the values occupying positions (n/2) and (n/2) +1 in the ordered
data.

Notice: The median is not computed, it is found. For example, replace the value of 11 in
the above example by 12,000. The median remains 6.5.

The median cannot be manipulated algebraically.

Finding the Median of Raw Data Using EXCEL

Open the file “thickdat.xls” in the MBA Mod 1 folder.

Find an empty cell and type in =median(

Then highlight the range of the data.

Finally, type in the right parenthesis.

The result is 355, which is the average of the 30th and 31st values, both of which
happen to be 355.

Finding the Median from Grouped Data

Suppose you did not have the raw data for steel thickness, but only had the data
grouped as shown below:

Interval                Midpoint    Freq    Cumulative
Lower      Upper        m(i)        f(i)    F

341.5      344.5        343         1       1
344.5      347.5        346         3       4
347.5      350.5        349         8       12
350.5      353.5        352         8       20
353.5      356.5        355         20      40
356.5      359.5        358         13      53
359.5      362.5        361         5       58
362.5      365.5        364         2       60

Using the column labeled “F”, it is clear that the 30th and 31st observations lie in the
interval [353.5 to 356.5].

Altogether there are 20 observations in the interval [353.5 to 356.5].

Since there are 20 observations below 353.5, we need 10 more to get to the 30th
value.

ASSUMPTION:           The data points in the interval are equally spaced throughout the
interval.

To get the 30th value, we need to go 10/20ths (or .5) into the interval. Since the bin is
3 units wide, we need to go a distance of (10/20)*3 = 1.5 into the interval. Therefore
we estimate the 30th value as 353.5 + 1.5 = 355

To get the 31st value, we need to go 11/20ths (or .55) into the interval. Since the bin
is 3 units wide, we need to go a distance of (11/20)*3 = 1.65 into the interval.
Therefore we estimate the 31st value as 353.5 + 1.65 = 355.15.

The median is estimated as median = (355 + 355.15)/2 = 355.075.

Finding the Median From Theoretical Probability Distributions

If f(x) is the probability density function of x, the median is that value med
satisfying the integral equation:

$$\int_{-\infty}^{\mathrm{med}} f(x)\,dx = 0.5$$

Problems with the Median

Suppose you had two groups of people. In Group 1 you had 50 people with a
median hourly wage of \$15.00 per hour. In Group 2 you had 100 people with a
median hourly wage of \$17.00 per hour. Given this information, can you determine
the median hourly wage of all 150 people? No: the group medians and group sizes
alone do not determine the median of the combined group.

Consider the following data:

          Time 1    Time 2    Change

          5         4         -1
          10        12         2
          15        18         3
          20        19        -1
          25        23        -2

Median    15        18        -1

Change in median is 18 - 15 = 3.

Median change is -1.

```
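
The worked examples in the notes can be checked numerically. What follows is a minimal Python sketch, not part of the original lecture: the helper name `grouped_median` and the `(lower, upper, freq)` bin layout are assumptions made for this illustration. It reproduces the raw-data medians, the grouped-data interpolation (under the notes' assumption that observations are equally spaced within a bin), and the example showing that the change in the median differs from the median change.

```python
from statistics import median

# Raw-data examples from the notes (odd and even sample sizes).
odd_sample = [3, -1, 6, 10, 11]
even_sample = [3, -1, 6, 10, 11, 7]
print(median(odd_sample))    # 6
print(median(even_sample))   # 6.5

# Replacing 11 by 12,000 leaves the median unchanged.
print(median([3, -1, 6, 10, 12000, 7]))  # 6.5


def grouped_median(bins):
    """Estimate the median from a list of (lower, upper, freq) bins.

    Hypothetical helper following the notes' method: find the middle
    position(s) and interpolate linearly inside the bin, assuming the
    observations are equally spaced through the interval.
    """
    n = sum(freq for _, _, freq in bins)
    if n == 0:
        raise ValueError("no observations")

    def value_at(position):
        below = 0  # observations in bins before the current one
        for lower, upper, freq in bins:
            if below + freq >= position:
                fraction = (position - below) / freq
                return lower + fraction * (upper - lower)
            below += freq
        raise ValueError("position exceeds number of observations")

    if n % 2 == 1:
        return value_at((n + 1) // 2)
    return (value_at(n // 2) + value_at(n // 2 + 1)) / 2


# Steel thickness bins from the grouped-data table in the notes.
thickness_bins = [
    (341.5, 344.5, 1),
    (344.5, 347.5, 3),
    (347.5, 350.5, 8),
    (350.5, 353.5, 8),
    (353.5, 356.5, 20),
    (356.5, 359.5, 13),
    (359.5, 362.5, 5),
    (362.5, 365.5, 2),
]
print(grouped_median(thickness_bins))  # approximately 355.075

# Change in the median versus median of the changes.
time1 = [5, 10, 15, 20, 25]
time2 = [4, 12, 18, 19, 23]
changes = [after - before for before, after in zip(time1, time2)]
print(median(time2) - median(time1))  # 3
print(median(changes))                # -1
```

The last two printed values differ, which illustrates the notes' point that the median cannot be manipulated algebraically: the median of the differences is not the difference of the medians.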