# Graphical representation of Data Palgrave

```
GRAPHICAL REPRESENTATION OF DATA
This lecture describes in detail two methods of presenting frequency data: histograms and
cumulative frequency diagrams. You also need to read Chapter 2 of 'Business Statistics – A
One Semester Course', which describes other presentation methods: bar charts, pie charts,
boxplots, frequency polygons, dotplots, and stem-and-leaf diagrams. Make sure that all the
diagrams in this handout are completed before next week's lecture, as they will be used
again for estimating summary statistics.

A frequency distribution groups the data into classes, generally in the form of a frequency
distribution table, and gives a clearer picture than the individual values. It is most
usually presented as a histogram and/or a frequency polygon.

A Histogram is a pictorial method of representing data. It appears similar to a Bar Chart
but has two fundamental differences:
- The data must be measurable on a standard scale, e.g. lengths rather than colours.
- The Area of a block, rather than its height, is drawn proportional to the Frequency, so
  if one column is twice the width of another it needs to be only half the height to
  represent the same frequency.

Method
- Group the data into a reasonable number of classes (of the order of 10). Too few classes
  may hide some information about the data; too many may disguise its overall shape.
- Make all intervals the same width at this stage.
- Construct the frequency distribution table. Tally charts may be used if preferred, but
  any method of finding the frequency for each interval is quite acceptable.
- Intervals are often left the same width, but if the data is scarce at the extremes then
  classes may be joined. If frequencies are very high in the middle of the data, classes
  may be split.
- If the intervals are not all the same width, calculate the frequency densities, i.e. the
  frequency per constant interval. (Often the most commonly occurring interval width is
  used throughout.) If some intervals are wider than others, take care that the areas of
  the blocks, not their heights, are proportional to the frequencies.
- Construct the histogram, labelling each axis carefully. For a given frequency
  distribution, you may have to decide on sensible limits if the first or last Class
  Interval is specified as 'under ....' or 'over ....', i.e. is open ended.
- Hand drawn histograms usually show the frequency or frequency density vertically.
  Computer output may be horizontal as this format is more convenient for line printers.

(Your histogram will be used again next week for estimating the modal mileage.)

A frequency polygon is constructed by joining the midpoints at the top of each column of
the histogram. The final section of the polygon often joins the midpoint at the top of each
extreme rectangle to a point on the x-axis half a class interval beyond the rectangle. This
makes the area enclosed by the polygon the same as that of the histogram.

Example: You are working for the Transport manager of a large chain of supermarkets
which hires cars for the use of its staff. Your boss is interested in the weekly distances
covered by these cars. Mileages recorded for a sample of hired vehicles from 'Fleet 1'
during a given week yielded the following data:

138       164   150      132    144     125     149     157   161     150
146       158   140      109    136     148     152     144   145     145
168       126   138      186    163     109     154     165   135     156
146       183   105      108    135     153     140     135   142     128
Minimum = 105             Maximum = 186              Range = 186 - 105 = 81

Nine intervals of width 10 miles seem reasonable, but the extreme intervals may be wider.
Next calculate the frequencies within each interval. (Not frequency density yet.)

Frequency distribution table
Class interval     Tally              Frequency    Frequency density (per 10 miles)
100 and < 110      ||||                    4         2.0  (100 and < 120 combined)
110 and < 120                              0
120 and < 130      |||                     3         3.0
130 and < 140      ||||| ||                7         7.0
140 and < 150      ||||| ||||| |          11        11.0
150 and < 160      ||||| |||               8         8.0
160 and < 170      |||||                   5         5.0
170 and < 180                              0         1.0  (170 and < 190 combined)
180 and < 190      ||                      2
Total                                     40

The data is scarce at both extremes, so join the two classes at each extreme together. This
doubles the interval width, so the frequency needs to be halved to produce the frequency per
10 miles, i.e. the frequency density: 4/2 = 2.0 for '100 and < 120' and 2/2 = 1.0 for
'170 and < 190'. Calculate all the frequency densities. Next draw the histogram and add the
frequency polygon.

[Figure: Histogram and frequency polygon. Vertical axis: Freq. per 10 mile interval (0 to 12);
horizontal axis: Mileages (100 to 200).]

Stem and leaf plots
A stem and leaf plot displays the data in the same 'shape' as the histogram, though it tends
to be shown horizontally. The main difference is that the numbers themselves are included in
the diagram, so none of the original information is 'lost'. In the example below the 'stem'
is the first two digits (hundreds and tens) and the 'leaf' is the last one (the units), so
the lowest value, 10 | 5, is 105 miles. Sometimes the frequency on each stem is included.

Example (Same data as for the previous histogram)

Frequency    Stem | Leaf
    4         10  | 5899
    0         11  |
    3         12  | 568
    7         13  | 2555688
   11         14  | 00244556689
    8         15  | 00234678
    5         16  | 13458
    0         17  |
    2         18  | 36

Cumulative frequency diagrams (Cumulative frequency polygons, Ogives)
A Cumulative Frequency Diagram is a graphical method of representing the accumulated
frequencies up to and including a particular value - a running total. These cumulative
frequencies are often calculated as percentages of the total frequency. This method, as we
shall see next week, is used for estimating median and quartile values and hence the
interquartile or semi-interquartile ranges of the data. It can also be used to estimate the
percentage of the data above or below a certain value.

Method
- Construct a frequency table as before.
- Use it to construct a cumulative frequency table, noting that the upper end of each
  interval is the relevant plotting value.
- Calculate a column of cumulative percentages.
- Plot the cumulative percentage against the upper end of the interval and join up the
  points with straight lines. N.B. Frequencies are always plotted on the vertical axis.

Using the data in the section on Histograms, we work through the above method for
drawing the cumulative frequency diagram. Next week you will use it to estimate the
median mileage, the interquartile range and the semi-interquartile range of the data.

Cumulative frequency table

Interval less than    Frequency    Cumulative frequency    % cumulative frequency
       100                0                  0                       0.0
       120                4                  4                      10.0
       130                3                  7                      17.5
       140                7                 14                      35.0
       150               11                 25                      62.5
       160                8                 33                      82.5
       170                5                 38                      95.0
       190                2                 40                     100.0

[Figure: Cumulative frequency diagram. Vertical axis: % C.F. (0 to 100); horizontal axis:
Mileages (100 to 200).]

We can estimate from this diagram that the percentage of vehicles travelling, say, less than
125 miles is about 14%: 125 lies halfway between 120 miles (10.0%) and 130 miles (17.5%).

Next week we shall use the same histogram and cumulative frequency diagram to:
- Estimate the mode of the frequency data.
- Estimate the Median value by dotting in from 50% on the Cumulative Percentage axis
  as far as your ogive and then down to the values on the horizontal axis. The value
  indicated on the horizontal axis is the estimated Median.
- Estimate the Interquartile Range by dotting in from 25% and down to the horizontal
  axis to give the Lower Quartile value. Repeat from 75% for the Upper Quartile value.
  The Interquartile Range is the range between these two Quartile values.
  The Semi-interquartile Range is half of the Interquartile Range.
- The same technique can be employed to estimate a particular value from the percentage
  of the data which lies above or below it.

Completed diagrams from lecture handout (Not for the student handouts)

[Figure: Histogram and frequency polygon. Vertical axis: Freq. per 10 mile interval (0 to 12);
horizontal axis: Mileages (100 to 200).]


[Figure: Cumulative frequency diagram. Vertical axis: % C.F. (0 to 100); horizontal axis:
Mileages (100 to 200).]

```
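
The tables in the handout can be checked with a short script. The following is only a sketch in Python, which the handout itself does not use; the function names, the `boundaries` list and the `STANDARD_WIDTH` constant are illustrative assumptions, not part of the course material. It counts the frequencies for the joined classes, converts them to frequency per 10 miles, builds the cumulative percentages, and reads a value off the ogive by straight-line interpolation.

```python
from bisect import bisect_right

# Weekly mileages for the Fleet 1 sample, as listed in the handout.
mileages = [
    138, 164, 150, 132, 144, 125, 149, 157, 161, 150,
    146, 158, 140, 109, 136, 148, 152, 144, 145, 145,
    168, 126, 138, 186, 163, 109, 154, 165, 135, 156,
    146, 183, 105, 108, 135, 153, 140, 135, 142, 128,
]

# Class boundaries after joining the two classes at each extreme
# (assumed names; each class is "lower and < upper").
boundaries = [100, 120, 130, 140, 150, 160, 170, 190]
STANDARD_WIDTH = 10  # densities are expressed per 10 miles


def frequency_table(data, edges):
    """Count the values falling in each class [lower, upper)."""
    counts = [0] * (len(edges) - 1)
    for value in data:
        # Index of the class whose lower boundary is <= value.
        index = bisect_right(edges, value) - 1
        if not 0 <= index < len(counts):
            raise ValueError(f"{value} lies outside the class limits")
        counts[index] += 1
    return counts


def frequency_densities(counts, edges, standard_width=STANDARD_WIDTH):
    """Frequency per standard interval, so block area matches frequency."""
    return [
        count * standard_width / (upper - lower)
        for count, lower, upper in zip(counts, edges, edges[1:])
    ]


def cumulative_percentages(counts):
    """Running total as a percentage, one value per class boundary."""
    total = sum(counts)
    running = 0
    result = [0.0]  # nothing lies below the first boundary
    for count in counts:
        running += count
        result.append(100 * running / total)
    return result


def percent_below(value, edges, cum_pct):
    """Read the ogive at `value` by interpolating between plotted points."""
    for lower, upper, p_low, p_high in zip(edges, edges[1:], cum_pct, cum_pct[1:]):
        if lower <= value <= upper:
            return p_low + (p_high - p_low) * (value - lower) / (upper - lower)
    raise ValueError(f"{value} lies outside the class limits")


counts = frequency_table(mileages, boundaries)
densities = frequency_densities(counts, boundaries)
cum_pct = cumulative_percentages(counts)

for upper, pct in zip(boundaries, cum_pct):
    print(f"less than {upper}: {pct:5.1f}%")
print("Estimated % below 125 miles:", percent_below(125, boundaries, cum_pct))
```

Under these assumptions the interpolated figure for 125 miles is 13.75%, in line with the estimate of about 14% read from the diagram.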