# 14.5.5 Minitab: Correlation and Regression

```
14.5.5 Correlation and Regression with Minitab

In this tutorial you will learn how to use Minitab to investigate the association
between two continuous variables and how to describe it graphically.

5.5.1 In this practical session you will analyse some bivariate data.
To enter the data:

File / Open Worksheet / Merlin1 (saved in the ANOVA worksheet, 14.5.1)

In order to see what this file contains: Open the Information window; Window / Info.
This window gives details of the data stored in the file merlin4.mtw which you are
going to analyse.

We shall give each of the cases a salary, in £’000, and then see if it is associated with
their age.

In the first empty column, name the variable Salary (in £’000) and type the following
values into that single column. (Work down the printed columns one after the other.)

38.1  38.9  23.2  22.9  19.8  19.7  15.6  31.7  17.3  37.8
18.7  42.8  19.6  47.5  31.3   8.5  28.5  14.1  33.5  32.9
42.3  60.1  15.5  15.8  59.3  15.9  37.3  20.3  13.7   9.8
25.9  60.7  20.7  35.9  33.8  39.3  32.9  19.8   6.4  15.2
53.6  75.2  40.2  25.3  24.5  14.5  23.9  28.2  35.3   8.7
37.6  10.9  28.5  63.2  32.0  10.2   8.6  25.3  12.5  38.4

The variables of interest in this practical session are the continuous variables
SALARY and AGE. We shall investigate the relationship between the Salaries
earned by the employees of Merlin and their ages.

5.5.2 Save the revised worksheet as Merlin5.mtw.

5.5.3 Produce a scatterplot to see whether there appears to be a relationship.

Graph / Scatterplot / Simple. Select SALARY for Y and AGE for X.

[Figure: Scatterplot of Salary vs Age. Y-axis: Salary (0 to 80); X-axis: Age (20 to 70).]

   Examine the plot. Does it suggest a linear relationship? . . . . . Yes, but weak

5.5.4 Calculate the correlation coefficient.

Stat / Basic Statistics / Correlation. Select SALARY and AGE as the variables.

Correlations: Age, Salary

Pearson correlation of Age and Salary = 0.398
P-Value = 0.002

   What is the value of the correlation coefficient? . . . . . 0.398

   What is the p-value for the test that the true correlation is zero? . . . . . 0.002

   Is this significant at the 5% level? . . . . . Yes

If the p-value is less than 0.05, the correlation coefficient is significantly different
from zero at the 5% level of significance.

5.5.5 Find the regression equation:

Stat / Regression / Regression. Select SALARY for Response and AGE for Predictors.

Regression Analysis: Salary versus Age

The regression equation is
Salary = 7.73 + 0.554 Age

   Write down the regression equation. . . . . . Salary = 7.73 + 0.554 Age

The last output used the default settings. Minitab can also calculate and store the
fitted values and the standardised residuals for each observation.

Edit / Edit Last Dialog. The last dialogue box reopens.
Under Storage select Residuals, Standardised residuals and Fits.
Click OK and check that three new columns have been added to your worksheet.

5.5.6 Save this altered version of your file at this stage under a new name, Merlin6:

File / Save Worksheet as Merlin6

5.5.7 Investigate the Fitted values:

Graph / Scatterplot. Select FITS1 as the Y-variable and AGE as the X-variable.

This produces a straight line as all the fitted values lie on the regression line.

To draw this line on a scatterplot of the data, use a scatterplot with a regression line:

Graph / Scatterplot / With Regression. Select SALARY against AGE as before.

[Figure: Scatterplot of FITS1 vs Age. Y-axis: FITS1 (20 to 45); X-axis: Age (20 to 70).]

[Figure: Scatterplot of Salary vs Age. Y-axis: Salary (0 to 80); X-axis: Age (20 to 70).]

5.5.8 To print this graph, which will not appear in your Session file:

File / Print Graph

5.5.9 To predict a Y-value for a given X-value, for example for a 40-year-old:

Stat / Regression / Regression
Select SALARY and AGE as before and then select Options.
Type 40 in the box labelled Prediction intervals for new observations.

New
Obs    Fit  SE Fit      95% CI          95% PI
  1  29.91    1.87  (26.16, 33.66)  (1.20, 58.62)

Values of Predictors for New Observations
Obs   Age
  1  40.0

The output gives the predicted value for y when x is 40 with a confidence interval and
prediction interval for this predicted y-value.

   What is the predicted salary of a 40-year-old employee? . . . . . £29 900

5.5.10 To produce the regression line with its confidence interval and prediction interval:

Stat / Regression / Fitted Line Plot / Options: Display Confidence and Prediction
bands. Select SALARY and AGE as before. Print this window as before.

[Figure: Fitted Line Plot, Salary = 7.728 + 0.5545 Age. Y-axis: Salary (0 to 80);
X-axis: Age (20 to 70). Legend: Regression, 95% CI, 95% PI. S = 14.2202, R-Sq = 15.9%.]

5.5.11 Print your session and/or graphs if required.

```
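The figures Minitab reports in steps 5.5.4, 5.5.9 and 5.5.10 can be checked by hand. The sketch below is not part of the practical; it uses Python with only the summary values printed in the handout. Two inputs are assumptions rather than values from the output: the sample size `n = 60` (one Age and Salary pair for each of the 60 salaries typed in) and the tabulated t value `t_crit = 2.0017` (97.5th percentile with 58 degrees of freedom). The function names are illustrative only.

```python
import math

# Summary values copied from the Minitab output shown in the handout.
r = 0.398                 # Pearson correlation of Age and Salary
b0, b1 = 7.728, 0.5545    # intercept and slope from the fitted line plot
s = 14.2202               # S, the residual standard deviation
se_fit = 1.87             # SE Fit reported for Age = 40

# Assumptions (not printed by Minitab in the handout):
n = 60                    # one (Age, Salary) pair per salary value entered
t_crit = 2.0017           # tabulated t value, 97.5th percentile, n - 2 = 58 df


def predict_salary(age):
    """Fitted salary (in £'000) from the regression equation."""
    return b0 + b1 * age


def correlation_t(r, n):
    """t statistic for testing that the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def intervals(fit, se_fit, s, t_crit):
    """Confidence interval for the mean response and prediction interval
    for a single new observation, both centred on the fitted value."""
    ci_half = t_crit * se_fit
    pi_half = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
    return (fit - ci_half, fit + ci_half), (fit - pi_half, fit + pi_half)


fit = predict_salary(40)
t_stat = correlation_t(r, n)
ci, pi = intervals(fit, se_fit, s, t_crit)
r_sq_percent = 100 * r * r   # close to the reported R-Sq; r itself is rounded

print(f"Fitted salary at age 40: {fit:.2f}")
print(f"t statistic for r:       {t_stat:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
print(f"r squared as a percent:  {r_sq_percent:.1f}")
```

Under these assumptions the fitted value and both intervals agree with the Minitab output to within rounding of the inputs. The prediction interval is much wider than the confidence interval because it includes the scatter of individual salaries about the line (S) as well as the uncertainty in the line itself (SE Fit).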