Graphs: Pie Charts, Bar Graphs, and XY (Line) Graphs
2 types of data:
Categorical Data:
A set of data is said to be categorical if the values can be sorted according to non-overlapping categories
based on some qualitative trait.
Examples: Hair color (with categories black, brown, red, etc.); Gender (with categories male, female);
Opinion about President (with categories strongly like, somewhat like, somewhat dislike, strongly dislike)
Numerical (Quantitative) Data:
Numerical data (or quantitative data) is data measured or identified on a numerical/quantitative scale.
Examples: Cholesterol Level, Height, ACT score, Number of students enrolled in MTH 101, Time it takes to
run a mile.
EXAMPLES: Determine if the following data are categorical or numerical/quantitative. (answers in parentheses)
1. Number of pets we have (quantitative)
2. Number of hours a week spent watching TV (quantitative)
3. Favorite sport (categorical)
4. Most common zip code (categorical)
5. Kind of music most preferred (categorical)
6. How many hours a week spent talking on the phone (quantitative)
7. Kinds of snacks we like (categorical)
8. How much our bags weigh (quantitative)
9. How much candy we eat each week (quantitative)
10. Most common area codes of our cell phones (categorical)
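One way to check a classification like the ones above is to ask whether arithmetic on the values means anything. The short Python sketch below is only an illustration: the sample values are made up and are not from any class data set. It averages a quantitative variable but only finds the most common value of a categorical one, even though zip codes are written with digits.

```python
from statistics import mean, mode

# Hypothetical sample values, invented for illustration only.
hours_of_tv = [5, 12, 8, 20, 3]                             # quantitative
zip_codes = ["60123", "60120", "60123", "60177", "60123"]   # categorical

# Averaging a quantitative variable gives a meaningful summary.
average_hours = mean(hours_of_tv)

# Zip codes are labels that happen to be written with digits, so the
# sensible summary is the most common category, not an average.
most_common_zip = mode(zip_codes)

print(average_hours)
print(most_common_zip)
```

Storing the zip codes as text is a deliberate choice here: it keeps anyone from accidentally averaging them.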
3 types of graphs: (some portions below from http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html#pie)
I. Pie Charts
~A pie chart is a way of summarizing a set of categorical data.
~It is a circle which is divided into segments. Each segment represents a particular category.
~ Pie Charts can only be used where both the categories and the quantities each add up to a whole.
~The area of each segment is proportional to the number of cases in that category.
~CATEGORIES MUST BE DISJOINT!
~ The most common errors with pie charts are to use a pie chart on a set of categories that do not make a
whole and to use a pie chart when the categories overlap.
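The proportionality rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming made-up counts (the survey and the function name pie_segments are hypothetical): each category's share of the total gives both its percentage and its angle out of 360 degrees.

```python
def pie_segments(counts):
    """Return {category: (percent, angle_in_degrees)} for a pie chart."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    return {
        name: (100 * n / total, 360 * n / total)
        for name, n in counts.items()
    }

# Hypothetical survey of 40 students, each counted in exactly one category.
favorite_sport = {"Soccer": 16, "Basketball": 12, "Baseball": 8, "Other": 4}

segments = pie_segments(favorite_sport)
for name, (percent, angle) in segments.items():
    print(f"{name}: {percent:.1f}% of the pie, {angle:.1f} degrees")
```

Because every student falls in exactly one category, the percentages add up to 100 and the angles add up to 360. If the categories overlapped, the counts would add up to more than the number of students and the pie would no longer represent a whole.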
Examples:
Consider the data set on the US Population by race in 1990 and 2000, which is available at:
(http://faculty.elgin.edu/nscherger/math101/Week4_USPopulationByRace.xls).
To create a pie chart for race in 2000:
~Highlight the data in the appropriate columns (so, cells A9-A14 and then holding down control, also
highlight cells C9-C14)
~Go to Insert→Pie→2DPie (selecting the first choice, although you are encouraged to experiment with
the other options). You should now have a basic pie chart.
There are many features you will want to experiment with. Here are just a few you should know:
~To delete the legend (the labels, which here list the races), simply tap on it and hit delete. [However,
with pie charts, you may often decide to leave it. With future types of graphs, we will often want
to remove it.]
~To resize the pie chart, tap on the “invisible” white square around the pie chart and then use your
cursor to make that box (the plot area) larger or smaller.
~To add a title, make sure you have tapped somewhere on your pie chart and “Chart Tools” options
will appear at the top. Go to Layout→Chart Title→Above Chart and then a box will appear above your
pie chart, where you can enter a title.
~To add labels and percents directly on your pie chart, go to Layout→Data Labels→More Data Label
Options, and then make the appropriate selections (here, check Category Name, Percentage, and Show
Leader Lines). [The Category Name is especially important if you are printing in black-and-white, in
which case the legend does little good.]
~To “clean up” the graph, you can often just click and drag to make it more readable. You will want to
experiment to get efficient at making nice charts. For example, to increase the font size of the labels,
tap on a label and then right click to adjust the font.
~To copy and paste your graph into a Word document, simply right click on the area and select copy
and then in your Word document, right click and select paste. Once in Word, you can right click and
select “Format Picture” to make further adjustments. [Recommendation: Generally, when you adjust
the sizes of images, select “lock aspect ratio” to keep the picture looking the same.] Here is the graph:
For more practice, use the following data set on Energy Consumption as another example, which is
available at: (http://faculty.elgin.edu/nscherger/math101/Week4_EnergyConsumption.xls).
II. Bar Charts
~A bar chart is a way of summarizing a set of categorical data or quantitative data.
~For categorical data:
-There must be an associated quantitative variable (most commonly, the number of cases in that
category).
-It displays the data using a number of rectangles, of the same width, each of which represents a
particular category (for example, an age group or a religious affiliation). The length (and hence area)
of each rectangle is proportional to the number of cases in the category it represents.
~Bar charts can be displayed horizontally or vertically and they are usually drawn with a gap between the
bars (whereas the bars of a histogram (which is a special type of bar chart) are drawn immediately next to
each other).
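To see the "length is proportional to the number of cases" rule in action, here is a small Python sketch that draws a horizontal bar chart out of text characters. The counts and the function name text_bar_chart are hypothetical and chosen only for illustration.

```python
def text_bar_chart(counts, width=40):
    """Build one line per category; bar length is proportional to the count."""
    largest = max(counts.values())
    label_width = max(len(name) for name in counts)
    lines = []
    for name, n in counts.items():
        bar = "#" * round(width * n / largest)
        lines.append(f"{name.ljust(label_width)} | {bar} {n}")
    return lines

# Hypothetical counts, invented for illustration only.
snack_counts = {"Chips": 10, "Fruit": 5, "Candy": 20}
for line in text_bar_chart(snack_counts):
    print(line)
```

The category with twice as many cases gets a bar twice as long, which is exactly the comparison a reader makes by eye.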
Examples:
Consider the data set on home heating sources in 1950 and 1997, which is available at:
(http://faculty.elgin.edu/nscherger/math101/Week4_HomeHeating.xls).
To create a bar chart for home heating in 1950:
~Highlight the data in the appropriate columns (so, cells A7-A15 and cells B7-B15).
~Go to Insert→Column→2DColumn (selecting the first choice, although you are encouraged to
experiment with the other options). You should now have a basic bar chart.
There are many features you will want to experiment with. Here are just a few you should know:
~To delete the legend (which here is meaningless and says “series 1”), tap on it and hit delete.
~To resize the chart, tap on the chart and then use your cursor in one of the corners to adjust the size.
~To add a title, make sure you have tapped somewhere on your chart so that the “Chart Tools”
options appear at the top. Go to Layout→Chart Title→Above Chart and then a box will appear
above your chart, where you can enter a title.
~To add axis labels, go to Layout→Axis Titles, and then make the appropriate selections.
~ You will want to experiment to get efficient at making nice charts. For example, to adjust the y-axis
scale, you can right click on the y-axis labels and select “Format Axis” and make the desired
adjustments. (For example, this graph below adjusted the y-axis to only go to 35% instead of 40%.)
For more practice, make a similar graph for 1997. Here is a result below:
Notice that it is difficult to compare the two years on two different bar charts. It would be better here (for
the reader, who should always be kept in mind) to have a double-bar chart.
To create a multiple-bar chart for home heating in 1950 and 1997:
~Highlight the data in the appropriate columns (so, cells A7-A15, B7-B15, and C7-C15).
~Go to Insert→Column→2DColumn. You should now have a basic double-bar chart.
~With a multiple-bar chart, you must leave the legend! Unfortunately, the labels on the initial
legend are very non-intuitive (Series1 and Series2). To edit the legend, click on it and initially the
entire legend is selected. Then click again on just Series1, so that only Series1 is selected. Now, right
click and choose “Select Data.” In the box that appears (shown below), with Series1 highlighted, select
Edit and then type in an appropriate label (here, 1950). Then do the same for Series2 (here, 1997).
~Add a title and axis labels as before and you should get a graph like the one below.
There are advantages and disadvantages to multiple-bar charts. Their main advantage is their succinctness
and the ability they afford to make comparisons within categories and across categories; they are best
used in printed works so that a viewer can study them carefully. Their disadvantage is that they often
present far too much information to viewers of presentations; it is hard to make a single, clear point with
them, and presenters tend not to leave them up long enough to absorb the information fully.
Finally, a note of caution when making bar graphs:
Click on sheet 2 of the home heating file (which gives US Energy Consumption for some years from 1990 to
1999) and make a bar chart as before (Insert→Column→2DColumn). You end up with the meaningless
graph below.
Very frequently, students make bar charts with quantitative data on the x-axis, but they are not careful
about the fact that a bar chart treats the x-axis categorically.
YOU MUST ALWAYS LOOK AT YOUR GRAPHS AND DECIDE IF WHAT YOU HAVE CREATED MAKES SENSE!!
[One way to fix the problem and create a meaningful single bar graph is to put quotes around all the dates
(i.e., “1990”) and then create a bar graph, as shown below.]
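The difference between a categorical x-axis and a numerical one can be sketched in Python. The years below are hypothetical (they are not copied from the spreadsheet) and were chosen to be unevenly spaced so the effect is easy to see.

```python
# Hypothetical, unevenly spaced years (the spreadsheet's actual years are
# not reproduced here).
years = [1990, 1995, 1997, 1998, 1999]

# A bar chart treats the x-values as labels: one equally spaced slot each.
categorical_positions = list(range(len(years)))

# An XY (scatter) graph places each point according to its numeric value.
numeric_positions = [year - years[0] for year in years]

print(categorical_positions)  # [0, 1, 2, 3, 4]
print(numeric_positions)      # [0, 5, 7, 8, 9]
```

On the bar chart, the five-year gap from 1990 to 1995 looks the same as the one-year gap from 1998 to 1999. That is fine when the x-values really are just labels, but it can distort the story when they are measurements.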
III. XY Graphs
~An XY graph is a way of summarizing data, often used in exploratory data analysis to illustrate the major
features of the distribution of the data in a convenient form.
~XY graphs can be dot plots or line graphs [Caution: What Excel calls a line graph is NOT what we want].
~XY graphs should “tell a story” as you look at the graph from left to right. (For example, some bar graphs
make no sense as an XY graph.) The vocabulary we use to “tell the story” consists of the following math
terms:
-increasing (graph is going “uphill” as you look from left to right).
-decreasing (graph is going “downhill” as you look from left to right).
-relative minimums and maximums (“valleys” and “peaks” of the graph).
-absolute minimum and maximum (the smallest value and largest value on the whole graph).
-increasing and decreasing at a faster/slower rate (judged by the steepness of the increase/decrease).
-periodic (graphs that have a pattern that repeats)
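The vocabulary above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the data points are invented, the function name describe is hypothetical, and a point counts as a relative maximum (or minimum) only if it is strictly above (or below) both of its neighbors.

```python
def describe(xs, ys):
    """Label each interval and find relative and absolute extremes."""
    trends = []
    for i in range(1, len(ys)):
        if ys[i] > ys[i - 1]:
            trend = "increasing"
        elif ys[i] < ys[i - 1]:
            trend = "decreasing"
        else:
            trend = "constant"
        trends.append((xs[i - 1], xs[i], trend))

    # Interior points higher (or lower) than both neighbors.
    relative_max = [xs[i] for i in range(1, len(ys) - 1)
                    if ys[i - 1] < ys[i] > ys[i + 1]]
    relative_min = [xs[i] for i in range(1, len(ys) - 1)
                    if ys[i - 1] > ys[i] < ys[i + 1]]

    # x-values where the largest and smallest y-values occur.
    absolute_max = xs[ys.index(max(ys))]
    absolute_min = xs[ys.index(min(ys))]
    return trends, relative_max, relative_min, absolute_max, absolute_min

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 5, 3, 4, 9, 7]
trends, rel_max, rel_min, abs_max, abs_min = describe(x, y)

for start, end, trend in trends:
    print(f"from x={start} to x={end}: {trend}")
print("relative maximums at x =", rel_max)
print("relative minimums at x =", rel_min)
print("absolute maximum at x =", abs_max, "and absolute minimum at x =", abs_min)
```

Here the graph has "peaks" at x = 2 and x = 5 and a "valley" at x = 3. The absolute minimum is at the left endpoint, which shows that an absolute extreme does not have to be a peak or valley.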
Examples:
Consider the data set on Chicago’s population, which is available at:
(http://faculty.elgin.edu/nscherger/math101/Week4_ChicagoPopulation1830-2000.xls).
To create an XY graph for Chicago’s population:
~Highlight the data in the appropriate columns (so, cells A6-A23 and cells B6-B23).
~Go to Insert→Scatter (selecting the first choice). Note: Even if you wanted to make a line graph as
opposed to just a dot plot (like we are doing here), choose one of the plots where the points are
actually on the graph. (It is beneficial both for seeing the data and being able to place the cursor on
the point and actually see what that data point is.) You should now have a basic XY dot plot.
~As before, delete the legend and add a title and axis labels.
~Again, there are many features you will want to experiment with. Often you can adjust features by
highlighting them and right clicking and selecting the appropriate command. (For example, in the
graph below, the x axis was highlighted and then “format axis” was selected and the range was
changed to go from 1820 to 2010, using increments of 20.)
~Now suppose we decided we wanted a line graph instead. Click on the graph and then (under the
“Chart Tools” and “Design” tab, which should be the default when you first tap on the graph) select
“Change Chart Type” which is at the top left. Under XY Scatter, select the second type, as indicated
below.
The graph should now have changed to the line graph below:
To describe this graph, “tell the story,” using the proper terminology from above (increasing,
decreasing, etc…). For example, consider the following:
“Chicago’s population remained relatively constant from 1830 to 1850 and then began increasing
slowly from 1860 to 1880. From 1880 to 1930, it continued to increase, but did so at a faster rate,
increasing from 503,185 in 1880 to 3,376,438 in 1930. The population dipped a bit in 1940, and then
the population was at its absolute maximum of 3,620,962 in 1950. The population began decreasing
after that and reached a relative minimum of 2,783,726 in 1990, after which 2000 showed a small
amount of growth to a population of 2,896,016.”
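Phrases like "at a faster rate" can be backed up with a quick calculation. The Python sketch below uses only the population figures quoted in the description above; the function name average_rate is a hypothetical helper, and the result is an average over the whole interval, not the change in any single year.

```python
def average_rate(year_a, pop_a, year_b, pop_b):
    """Average change in population per year between two census years."""
    return (pop_b - pop_a) / (year_b - year_a)

# Figures quoted in the description above.
growth_1880_1930 = average_rate(1880, 503_185, 1930, 3_376_438)
decline_1950_1990 = average_rate(1950, 3_620_962, 1990, 2_783_726)

print(f"1880 to 1930: {growth_1880_1930:,.0f} people per year")
print(f"1950 to 1990: {decline_1950_1990:,.0f} people per year")
```

A positive result means the graph is increasing on average over that interval and a negative result means it is decreasing. The larger the size of the number, the steeper the graph.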
Finally, use the data on sunspots to create an XY graph. Sunspots are regions on the solar surface that
appear dark because they are cooler than the surrounding photosphere
(from http://csep10.phys.utk.edu/astr162/lect/sun/sunspots.html).
The data set is available at: (http://faculty.elgin.edu/nscherger/math101/Week4_SunspotNumbers.xls). You may only
want to use the last 40 years in making the graph. Again, include appropriate titles and labels.
Observe the cyclical nature of this graph, that is, how its behavior appears to “repeat” roughly every 10
to 11 years. We would describe sunspots as a periodic phenomenon, and the length of one repetition is
what we call the period of this graph. A plethora of real-world data is periodic in nature, including sound
waves and electromagnetic radiation.
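One simple way to estimate a period from data is to measure the spacing between consecutive peaks. The Python sketch below does this on synthetic data, NOT on the real sunspot numbers: the wave is built to repeat every 10 years so the answer is known in advance, and the function names are hypothetical.

```python
import math

def peak_positions(xs, ys):
    """Return the x-values of relative maximums (points above both neighbors)."""
    return [xs[i] for i in range(1, len(ys) - 1)
            if ys[i - 1] < ys[i] > ys[i + 1]]

def estimate_period(xs, ys):
    """Average spacing between consecutive peaks."""
    peaks = peak_positions(xs, ys)
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return sum(gaps) / len(gaps)

# Synthetic data, NOT real sunspot numbers: a wave built to repeat every
# 10 years, sampled once per year for 40 years.
years = list(range(1960, 2000))
values = [50 + 40 * math.cos(2 * math.pi * (year - 1960) / 10) for year in years]

print(peak_positions(years, values))
print(estimate_period(years, values))
```

Real data is noisier than this wave, so small bumps can show up as extra "peaks." With the sunspot spreadsheet, it is usually easier to read the major peaks off the graph by eye and then average the gaps between them.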
Misleading and Bad Graphs
The following links lead to misleading and bad graphs. What makes them bad and/or misleading?
1. These first two graphs emphasize the effect of the scale of the y-axis (along with a title) on the overall
appearance of graphs.
http://faculty.elgin.edu/nscherger/math101/salariesupgraph.gif :
http://faculty.elgin.edu/nscherger/math101/salariesstablegraph.gif :
2. These next two graphs also emphasize the effect of the scale of the y-axis and were done by CNN.
http://faculty.elgin.edu/nscherger/math101/CNN Graph One.bmp :
http://faculty.elgin.edu/nscherger/math101/CNN Graph Two.bmp :
3. This next graph takes the impact of the scale of the y-axis to the extreme!
http://faculty.elgin.edu/nscherger/math101/zantac75.jpg:
4. This final graph illustrates the issue of manipulating the scale on the y-axis. The ad not only appeared
in the Chicago Tribune but also appeared on billboards along the Kennedy Expressway. A driver on the
Kennedy is in no position to carefully examine the y-axis, which I believe was the intention of the
makers of the ad. Notice also how the Hollywood Casino column breaks the frame of the graph.
http://faculty.elgin.edu/nscherger/math101/casinograph.gif:
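The effect of the y-axis scale can be put into numbers. The Python sketch below is a minimal illustration with made-up values (they are not taken from any of the linked graphs), and the function name apparent_ratio is hypothetical. It compares how tall two bars are drawn when the y-axis starts at 0 and when it starts just below the smaller value.

```python
def apparent_ratio(small, large, axis_start=0):
    """Ratio of drawn bar heights when the y-axis begins at axis_start."""
    if axis_start >= small:
        raise ValueError("axis_start must be below the smaller value")
    return (large - axis_start) / (small - axis_start)

# Hypothetical values, invented for illustration only.
before, after = 100, 110

print(apparent_ratio(before, after))                 # 1.1
print(apparent_ratio(before, after, axis_start=95))  # 3.0
```

The actual increase is 10%, but with the y-axis starting at 95 the second bar is drawn three times as tall as the first. Always check where the y-axis starts before judging a graph by the heights of its bars.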