Assignment preparation – regression lines and equations We have seen how to plot out variables which we believe to be associated and to measure the strength of that association. We also need to be able to use one variable to predict another. In the assignment, we will be using data from a weather station to predict the data from a crime scene. The purpose of this tutorial is to see how to use Excel to both plot data and to give us the equation which will allow us to make that kind of prediction. We start with an example, where we examine data taken from an experiment in which the circumferences and radii of several circular objects were measured. The data is displayed in the screen shot below. Of course, the equation associated with this data is C = 2r, meaning that the circumference of a circle is equal to two times pi times the circle's radius. In this experiment, the circumferences and radii are measured, and we use Excel to investigate the relationship between them Enter the data onto the worksheet as shown in the above screen shot. Select the data, then click on ‘Insert’, choose ‘Scatter’ and then the graph with no line on it. A graph should appear on your page that looks like this (except that you will have different data values). You now want to improve the labels. Click on ‘Layout’, then ‘Axis titles’ and then either horizontal or vertical axis titles: Do the same thing for ‘Chart Title’ Now you may want to add a line to represent the association between the two variables. This is not a line that joins the dots, but rather one that is the ‘best fit’ between the points. To do this, right-click on any of the data points and select ‘Add Trendline’: On the Format Trendline box, select Linear; Display Equation on Chart; Display R-squared value on chart When the OK button is pressed the best fit line is drawn and the equation of the line and R- squared value will be displayed on the graph. It will look something like this. You can move the equation by clicking and dragging it to the desired location. The R-squared value is actually the square of the correlation coefficient. The correlation coefficient, R, gives us a measure of the reliability of the linear relationship between the x and y values. A value of R = 1 indicates an exact linear relationship between x and y. Values of R close to 1 indicate excellent linear reliability. If the correlation coefficient is relatively far away from 1, the predictions based on the linear relationship, y = mx + b, will be less reliable. The equation displayed on the graph is y = 6.179x + 0.2327. To change ‘x’ and ‘y’ to the names of your own variables, click on the equation and edit it. The end result should be a graph that looks like this: C = 6.179r means that the slope of the graph is 6.179. That is fairly close to 2 x π. Make sure you understand why. For our purposes, what we need to understand is that, once we have a regression equation (which Excel does for us), we can predict the variable on the y axis from the variable on the x axis. In this particular case, we take the variable on the x axis, multiply it by 6.179, add 0.2327, and we have the variable on the y axis. This will allow you to estimate the temperature in one location based on the temperature in another location.
Pages to are hidden for
"Assignment preparation � regression lines and equations"Please download to view full document