# IT233 Applied Statistics, TIHE 2005, Lecture 07: Simple Linear Regression Analysis


```
IT233: Applied Statistics                       TIHE 2005                   Lecture 07

Simple Linear Regression Analysis
In this section we do an in-depth analysis of the linear association between two
variables X (called an independent variable or regressor) and Y (called a
dependent variable or response).

Simple Linear Regression makes three basic assumptions:

1.  Given a value x₀ of X, the corresponding value of Y is a random
    variable whose mean μ_{Y|x₀} (the mean of Y given the value x₀)
    is a linear function of X,

    i.e.  μ_{Y|x₀} = α + β·x₀   or   E(Y | X = x₀) = α + β·x₀

2.  The variation of Y around this mean, for the given value x₀, is Normal.

3.  The variance of Y is the same for all given values of X,

    i.e.  σ²_{Y|x₀} = σ²   for any x₀

Example: In simple linear regression, suppose the variance of Y when X = 4 is
16. What is the variance of Y when X = 5? By assumption 3 the variance does
not depend on X, so it is again 16.

Simple Linear Regression Model:

Using the above assumptions, we can write the model as:

Y = α + β·x + ε

where ε is a random variable (the error term) that follows a normal
distribution with E(ε) = 0 and Var(ε) = σ²,

i.e.  ε ~ N(0, σ²)

Illustration of Linear Regression:

Let X = the height of the father
    Y = the height of the son

For fathers whose height is x₀, the heights of the sons will vary randomly.
Linear regression states that the height of the son is a linear function of
the height of the father, i.e.

Y = α + β·x

Scatter Diagrams:

A scatter diagram will suggest whether a simple linear regression model would
be a good fit.

Figure (A) suggests that a simple linear regression model seems OK, though
the fit is not very good (wide scatter around the fitted line).

Figure (B) suggests that a simple linear regression model fits well
(points are close to the fitted line).

In (C) a straight line could be fitted, but the relationship is not linear.

In (D) there is no relationship between Y and X.
Fitting a Linear Regression Equation:

Keep in mind that there are two lines:

Y = α + β·x     (True line)

Ŷ = α̂ + β̂·x     (Estimated line)

Notation:

a (or α̂)  = estimate of α
b (or β̂)  = estimate of β
yᵢ        = the observed value of Y corresponding to xᵢ
ŷᵢ        = the fitted value of Y corresponding to xᵢ
eᵢ        = yᵢ − ŷᵢ = the residual

Residual: The Error in Fit:

The residual, denoted by eᵢ = yᵢ − ŷᵢ, is the difference between the observed
and fitted values of Y. It estimates εᵢ.

[Figure: the estimated line Ŷ = α̂ + β̂·x, with the residual eᵢ shown as the
vertical distance between the observed point (xᵢ, yᵢ) and the fitted value ŷᵢ.]
The Method of Least Squares:

We shall find a (or α̂) and b (or β̂), the estimates of α and β, so that the
sum of the squares of the residuals is a minimum. The residual sum of squares
is often called the sum of squares of the errors about the regression line
and is denoted by SSE. This minimization procedure for estimating the
parameters is called the method of least squares. Hence, we shall find a and
b so as to minimize

SSE = Σ eᵢ² = Σ (yᵢ − ŷᵢ)² = Σ (yᵢ − a − b·xᵢ)²     (all sums over i = 1, …, n)

Differentiating SSE with respect to a and b and setting the partial
derivatives equal to zero, we obtain the equations (called the normal
equations):

n·a + b·Σxᵢ = Σyᵢ

a·Σxᵢ + b·Σxᵢ² = Σxᵢyᵢ

which may be solved simultaneously for a and b.

```
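The model above, Y = α + β·x + ε with Normal errors of constant variance, can be illustrated with a small simulation. This is a sketch only; the parameter values α = 2, β = 0.5, σ = 4 are assumptions chosen for demonstration, not values from the lecture.

```python
import random
import statistics

# A sketch of the model Y = alpha + beta*x + eps with eps ~ N(0, sigma^2).
# The parameter values are assumptions chosen for illustration only.
random.seed(42)
alpha, beta, sigma = 2.0, 0.5, 4.0   # hypothetical "true" parameters

def draw_y(x0, n):
    """Draw n observations of Y at the fixed value x0."""
    return [alpha + beta * x0 + random.gauss(0.0, sigma) for _ in range(n)]

ys_at_4 = draw_y(4.0, 100_000)
ys_at_5 = draw_y(5.0, 100_000)

# Assumption 1: the mean of Y given x0 is alpha + beta*x0 (= 4.0 at x0 = 4).
print(round(statistics.mean(ys_at_4), 2))      # close to 4.0
# Assumption 3: the variance of Y is sigma^2 = 16 at every x0.
print(round(statistics.variance(ys_at_4), 1))  # close to 16
print(round(statistics.variance(ys_at_5), 1))  # close to 16
```

Note that the two sample variances agree even though the means differ, which is exactly what the example question about X = 4 versus X = 5 relies on.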
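The two normal equations at the end can be solved by elimination, giving the familiar closed-form estimates. A minimal sketch, using a small made-up data set (the x and y values are assumptions for illustration only):

```python
# A sketch of solving the normal equations for a and b.
# The data set below is made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 3.6, 4.4, 5.1]

n = len(xs)
sum_x  = sum(xs)
sum_y  = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Eliminating a from
#   n*a     + b*sum_x  = sum_y
#   a*sum_x + b*sum_x2 = sum_xy
# gives the usual closed-form solution:
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

print(round(a, 2), round(b, 2))   # 1.37 0.75
```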
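Two standard consequences of the normal equations can also be checked numerically: the residuals eᵢ = yᵢ − ŷᵢ sum to zero (the first normal equation), and SSE is smallest at the least-squares (a, b). A sketch, again on made-up data; the values a = 1.37, b = 0.75 are the least-squares estimates for this particular data set, not a general result:

```python
# A sketch of the residuals e_i = y_i - yhat_i and a spot check that the
# least-squares line minimizes SSE. Data and estimates are illustrative only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 3.6, 4.4, 5.1]
a, b = 1.37, 0.75   # least-squares estimates for this particular data set

def sse(a_, b_):
    """Sum of squared errors about the line y = a_ + b_*x."""
    return sum((y - a_ - b_ * x) ** 2 for x, y in zip(xs, ys))

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# The first normal equation forces the residuals to sum to zero.
print(abs(sum(residuals)) < 1e-9)       # True
# Any nearby line has a larger sum of squared errors.
print(sse(a, b) <= sse(a + 0.01, b))    # True
print(sse(a, b) <= sse(a, b - 0.01))    # True
```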