# Cumulative frequency

```
MATH 2441                    Probability and Statistics for Biological Sciences

Histograms and Other Distribution Graphs
One of the most common ways of visualizing the information in a frequency table is by constructing a
histogram, a bar graph of the frequencies.

Example SalmonCa0:
In the preceding step, we developed a frequency table for some data relating to calcium content of salmon
fillets sanitized with chlorine dioxide. For reference, the resulting table of values is repeated here:

         Class                     Relative                      Relative
         Limits       Class        Class            Cumulative    Cumulative
Class    (ppm)        Frequency    Frequency        Frequency     Frequency

  1      20 - 29          1        1/40 = 0.025         1          1/40 = 0.025
  2      30 - 39          0        0/40 = 0.000         1          1/40 = 0.025
  3      40 - 49          2        2/40 = 0.050         3          3/40 = 0.075
  4      50 - 59          8        8/40 = 0.200        11         11/40 = 0.275
  5      60 - 69          9        9/40 = 0.225        20         20/40 = 0.500
  6      70 - 79          7        7/40 = 0.175        27         27/40 = 0.675
  7      80 - 89          2        2/40 = 0.050        29         29/40 = 0.725
  8      90 - 99          4        4/40 = 0.100        33         33/40 = 0.825
  9      100 - 109        5        5/40 = 0.125        38         38/40 = 0.950
 10      110 - 119        0        0/40 = 0.000        38         38/40 = 0.950
 11      120 - 129        2        2/40 = 0.050        40         40/40 = 1.000

To create a class frequency histogram, we just create a (usually) vertical bar chart with the following
characteristics:
  - the bars are labelled along the horizontal axis with the class limits
  - the heights of the bars are equal to the class frequencies
  - there are no gaps between the bars
  - if it doesn't cause too much clutter, the individual columns of the histogram are sometimes
    labelled with the class frequencies.

From the class frequency table above, the following histogram results:

David W. Sabo (1999)            Histograms and Other Distribution Graphs                          Page 1 of 4
[Figure: class frequency histogram, "Distribution of ppm Calcium in Salmon Fillets".
 Horizontal axis: ppm Calcium, in classes of width 10 from 0-9 to 140-149.
 Vertical axis: number of specimens, 0 to 10.
 Each bar is labelled with its class frequency.]

A relative class frequency histogram results when the heights of the columns are proportional to relative
class frequencies rather than the class frequencies themselves:

[Figure: relative class frequency histogram, "Distribution of ppm Calcium in Salmon Fillets".
 Horizontal axis: ppm Calcium, in classes of width 10 from 0-9 to 140-149.
 Vertical axis: fraction of specimens, 0 to 0.25.
 Each bar is labelled with its relative class frequency.]

In effect, the only real difference between a class frequency histogram and a relative class frequency
histogram is the vertical scale. In fact, people often combine the two types of histograms by constructing the
frequency scale along the left edge of the chart area, and the relative frequency scale along the right edge of
the chart area.

Both of these types of histograms give a visual image of the distribution of data values -- where they are
clustered or how widely spread out they are. Note, as well, that if we consider the width of each column in
the chart to be 1, then the areas covered by regions of the chart correspond either to frequencies or to
relative frequencies. This visualization of relative class frequencies as areas covered by sections of a
histogram will be used to organize computations when we study probability distributions.

Histograms are not as useful for visualization of cumulative frequencies. Graphs of cumulative frequencies
are often rendered in one of two forms. To create a cumulative frequency polygon (or ogive), the
cumulative frequencies are plotted as points against the class midpoint, and then these points are joined by
straight line segments. To calculate the location of a class midpoint, simply take the average of the lower
limits of consecutive classes. So, for this example, we have:

class #1 lower limit = 20

class #2 lower limit = 30

so

midpoint for class # 1 = (20 + 30)/2 = 25.

Repeating this process for each row of the table above gives:

         Class                                  Relative
         Limits       Class       Cumulative    Cumulative
Class    (ppm)        Midpoint    Frequency     Frequency

  1      20 - 29         25           1          1/40 = 0.025
  2      30 - 39         35           1          1/40 = 0.025
  3      40 - 49         45           3          3/40 = 0.075
  4      50 - 59         55          11         11/40 = 0.275
  5      60 - 69         65          20         20/40 = 0.500
  6      70 - 79         75          27         27/40 = 0.675
  7      80 - 89         85          29         29/40 = 0.725
  8      90 - 99         95          33         33/40 = 0.825
  9      100 - 109      105          38         38/40 = 0.950
 10      110 - 119      115          38         38/40 = 0.950
 11      120 - 129      125          40         40/40 = 1.000

The resulting cumulative frequency polygon is:

[Figure: "Cumulative Frequency Polygon".
 Horizontal axis: ppm Calcium, 0 to 140.
 Vertical axis: cumulative frequency, 0 to 45.]

The cumulative frequencies are also often usefully plotted as horizontal lines at the appropriate vertical
height for the interval corresponding to each class. For the present data, the resulting plot looks like:

[Figure: "Cumulative Frequency Plot" (step graph).
 Horizontal axis: ppm calcium, 0 to 140.
 Vertical axis: cumulative frequency, 0 to 45.]

Comparing this with the table just above, you see that, for example, the class 60 - 69 has a cumulative
frequency of 20, and in the chart, the interval between the class beginning at 60 and the class beginning at
70 corresponds to a horizontal line at a vertical coordinate of 20. This is a somewhat strange-looking step
graph, but it represents the same information as the more standard cumulative frequency polygon. You will
see this kind of cumulative frequency plot again right at the end of the course when we look briefly at the
Kolmogorov-Smirnov test.




```
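
The table columns used in the handout (relative, cumulative, and relative cumulative frequencies, plus class midpoints) can all be derived from the class frequencies alone. The following is a minimal Python sketch of that arithmetic, not part of the original handout. The variable names are illustrative. It also assumes that every class has the same width, so that the midpoint of the last class (which has no following class to average with) can be taken as its lower limit plus half the class width.

```python
from itertools import accumulate

# Lower class limits (ppm) and class frequencies, copied from the handout's table.
lower_limits = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
frequencies = [1, 0, 2, 8, 9, 7, 2, 4, 5, 0, 2]

n = sum(frequencies)  # total number of specimens

# Assumption: all classes share one width, taken from the first two lower limits.
class_width = lower_limits[1] - lower_limits[0]

# Relative class frequency = class frequency / total count.
relative = [f / n for f in frequencies]

# Cumulative frequency = running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Relative cumulative frequency = cumulative frequency / total count.
relative_cumulative = [c / n for c in cumulative]

# Midpoint = average of consecutive lower limits, which for equal-width
# classes is the same as lower limit + half the class width.
midpoints = [lo + class_width / 2 for lo in lower_limits]

print("midpoint  freq  rel.freq  cum.freq  rel.cum.freq")
for m, f, r, c, rc in zip(midpoints, frequencies, relative,
                          cumulative, relative_cumulative):
    print(f"{m:8.0f}  {f:4d}  {r:8.3f}  {c:8d}  {rc:12.3f}")
```

Plotting `cumulative` against `midpoints` and joining the points with straight segments gives the cumulative frequency polygon. Drawing each value of `cumulative` as a horizontal line across its class interval gives the step plot.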