Section 1.1 by VII9jovw

									Correlation and Regression

        Example # 1
For the following exercise, complete these steps.
a. Draw the scatter plot for the variables.
b. Compute the value of the correlation coefficient.
c. State the hypotheses.
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
e. Give a brief explanation of the type of relationship.
A researcher wishes to determine if a person’s age is
related to the number of hours he or she exercises per
week. The data for the sample are shown below.
a. Draw the scatter plot for the variables.
Age x      18   26   32   38   52    59
Hours y    10    5    2    3   1.5    1
[Scatter plot: Hours (vertical axis) versus Age (horizontal axis).]
b. Compute the value of the correlation coefficient.
Σx = 225       Σy = 22.5
Σx² = 9653     Σy² = 141.25
Σxy = 625      n = 6

\[ r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}} \]

\[ r = \frac{6(625) - (225)(22.5)}{\sqrt{[6(9653) - (225)^2][6(141.25) - (22.5)^2]}} \]

r = –0.832
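The computation above can be reproduced with a short script. This is a minimal sketch, assuming plain Python with no statistics library; the function name `pearson_r` and the variable names are illustrative and do not come from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient via the computational (sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the age and exercise example.
ages = [18, 26, 32, 38, 52, 59]
hours = [10, 5, 2, 3, 1.5, 1]

r = pearson_r(ages, hours)
print(round(r, 3))
```

Rounded to three decimal places, the result agrees with the value of r found by hand.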
 c. State the hypotheses.
H0: ρ = 0 and H1: ρ ≠ 0
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
H0: ρ = 0 and H1: ρ ≠ 0
n = 6      d.f. = n – 2 = 4      r = –0.832
C.V. = ±0.811
Since |r| = 0.832 is greater than the critical value 0.811, the null hypothesis is rejected.
Decision: Reject H0.
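The decision rule in this step can be sketched in code. The dictionary below holds only the two critical values that appear in this section (0.811 for d.f. = 4 and 0.754 for d.f. = 5, both at α = 0.05); it is a stand-in for Table I, not a copy of it, and the names `CRITICAL_R_05` and `is_significant` are hypothetical.

```python
# Critical values of r at alpha = 0.05, taken from the worked examples in this
# section. A complete Table I has many more rows; this is only an excerpt.
CRITICAL_R_05 = {4: 0.811, 5: 0.754}

def is_significant(r, n, table=CRITICAL_R_05):
    """Return True when |r| exceeds the critical value for d.f. = n - 2."""
    df = n - 2
    return abs(r) > table[df]

# Age and exercise example: n = 6, r = -0.832.
reject_h0 = is_significant(-0.832, 6)
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The absolute value is used because the test is two-tailed: a strongly negative r is as significant as a strongly positive one.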
e. Give a brief explanation of the type of relationship.

There is a significant linear relationship between a person’s age
and the number of hours he or she exercises per week.
Because r is negative, hours of exercise tend to decrease as age increases.

        Example # 2
For the following exercise, complete these steps.
a. Draw the scatter plot for the variables.
b. Compute the value of the correlation coefficient.
c. State the hypotheses.
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
e. Give a brief explanation of the type of relationship.
The director of an alumni association for a small college
wants to determine whether there is any type of
relationship between the amount of an alumnus’s
contribution (in dollars) and the years
the alumnus has been out of school.
The data are shown here.
Years x            1     5     3    10    7    6
Contribution y   500   100   300    50   75   80
a. Draw the scatter plot for the variables.
[Scatter plot: Contribution (vertical axis) versus Years (horizontal axis).]
b. Compute the value of the correlation coefficient.
Σx = 32        Σy = 1105
Σx² = 220      Σy² = 364,525
Σxy = 3405     n = 6

\[ r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}} \]

\[ r = \frac{6(3405) - (32)(1105)}{\sqrt{[6(220) - (32)^2][6(364{,}525) - (1105)^2]}} \]

r = –0.883
 c. State the hypotheses.

H0: ρ = 0 and H1: ρ ≠ 0
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
H0: ρ = 0 and H1: ρ ≠ 0
n = 6      d.f. = n – 2 = 4      r = –0.883
C.V. = ±0.811
Since |r| = 0.883 is greater than the critical value 0.811, the null hypothesis is rejected.
Decision: Reject H0.
 e. Give a brief explanation of the type of relationship.

There is a significant linear relationship between the number of years
an alumnus has been out of school and the amount of his or her contribution.
Because r is negative, contributions tend to decrease as the years out of school increase.

        Example # 3
For the following exercise, complete these steps.
a. Draw the scatter plot for the variables.
b. Compute the value of the correlation coefficient.
c. State the hypotheses.
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
e. Give a brief explanation of the type of relationship.
A criminology student wishes to see if there is a
relationship between the number of larceny crimes and
the number of vandalism crimes on college campuses
in Southwestern Pennsylvania. The data are shown. Is
there a relationship between the two
types of crimes?
Number of larceny crimes, x      24    6   16   64   10   25   35
Number of vandalism crimes, y    21    3    6   15   21   61   20
a. Draw the scatter plot for the variables.
[Scatter plot: vandalism crimes (vertical axis) versus larceny crimes (horizontal axis).]
b. Compute the value of the correlation coefficient.
Σx = 180       Σy = 147
Σx² = 6914     Σy² = 5273
Σxy = 4013     n = 7

\[ r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}} \]

\[ r = \frac{7(4013) - (180)(147)}{\sqrt{[7(6914) - (180)^2][7(5273) - (147)^2]}} \]

r = 0.104
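When only the column totals are at hand, r can be checked directly from the sums. The sketch below assumes the totals listed for the larceny and vandalism data and uses n = 7, the number of data pairs; the helper name `r_from_sums` is illustrative.

```python
from math import sqrt

def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Correlation coefficient computed from precomputed sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Totals for the larceny (x) and vandalism (y) data; seven pairs, so n = 7.
r = r_from_sums(n=7, sum_x=180, sum_y=147, sum_x2=6914, sum_y2=5273, sum_xy=4013)
print(round(r, 3))
```

The value of n matters here: it multiplies three of the sums, so using the wrong count changes both the numerator and the denominator.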
 c. State the hypotheses.

H0: ρ = 0 and H1: ρ ≠ 0
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
H0: ρ = 0 and H1: ρ ≠ 0
n = 7      d.f. = n – 2 = 5      r = 0.104
C.V. = ±0.754
Since |r| = 0.104 is less than the critical value 0.754, the null hypothesis is not rejected.
Decision: Do not reject H0.
e. Give a brief explanation of the type of relationship.

There is not a significant linear relationship between the number of
larceny crimes and the number of vandalism crimes.

        Example # 4
For the following exercise, complete these steps.
a. Draw the scatter plot for the variables.
b. Compute the value of the correlation coefficient.
c. State the hypotheses.
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
e. Give a brief explanation of the type of relationship.
The average daily temperature (in degrees Fahrenheit)
and the corresponding average monthly precipitation
(in inches) for the month of June are shown
here for seven randomly selected cities in
the United States. Determine if there is a
relationship between the two variables.
Average daily temperature, x        86    81    83    89    80    74    64
Average monthly precipitation, y   3.4   1.8   3.5   3.6   3.7   1.5   0.2
a. Draw the scatter plot for the variables.
[Scatter plot: Precipitation (vertical axis) versus Temperature (horizontal axis).]
b. Compute the value of the correlation coefficient.
Σx = 557        Σy = 17.7
Σx² = 44,739    Σy² = 55.99
Σxy = 1468.9    n = 7

\[ r = \frac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}} \]

\[ r = \frac{7(1468.9) - (557)(17.7)}{\sqrt{[7(44{,}739) - (557)^2][7(55.99) - (17.7)^2]}} \]

r = 0.883
c. State the hypotheses.

H0: ρ = 0 and H1: ρ ≠ 0
d. Test the significance of the correlation coefficient at α = 0.05, using Table I.
H0: ρ = 0 and H1: ρ ≠ 0
n = 7      d.f. = n – 2 = 5      r = 0.883
C.V. = ±0.754
Since |r| = 0.883 is greater than the critical value 0.754, the null hypothesis is rejected.
Decision: Reject H0.
 e. Give a brief explanation of the type of relationship.

There is a significant linear relationship between temperature
and precipitation.
Because r is positive, precipitation tends to increase as temperature increases.

        Example # 5
Find the equation of the regression line and find the y
value for the specified x value. Remember that no
regression should be done when r is not significant.
Ages and Exercise
Age x      18   26   32   38   52    59
Hours y    10    5    2    3   1.5    1
Find y' when x = 35 years.

\[ a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} \]

\[ a = \frac{(22.5)(9653) - (225)(625)}{6(9653) - (225)^2} \]

a = 10.499

\[ b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} \]

\[ b = \frac{6(625) - (225)(22.5)}{6(9653) - (225)^2} \]

b = –0.18

y' = a + bx
y' = 10.499 – 0.18x
y' = 10.499 – 0.18(35)
y' = 4.199 hours
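The intercept and slope formulas used above translate directly into code. This is a minimal sketch under the same definitions (a is the intercept, b is the slope); the function name `regression_line` is illustrative.

```python
def regression_line(xs, ys):
    """Return (a, b) for the line y' = a + b*x using the sums formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b

# Data from the ages and exercise example.
ages = [18, 26, 32, 38, 52, 59]
hours = [10, 5, 2, 3, 1.5, 1]

a, b = regression_line(ages, hours)
y_pred = a + b * 35  # predicted hours of exercise at age 35
print(round(a, 3), round(b, 2), round(y_pred, 1))
```

The script keeps full precision for a and b, so its prediction is about 4.2 hours. The hand calculation rounds b to –0.18 before substituting, which gives 4.199.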

        Example # 6
Find the equation of the regression line and find the y
value for the specified x value. Remember that no
regression should be done when r is not significant.
Years and Contribution
Years x               1     5     3    10    7    6
Contribution y, $   500   100   300    50   75   80
Find y' when x = 4 years.

\[ a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} \]

\[ a = \frac{(1105)(220) - (32)(3405)}{6(220) - (32)^2} = \frac{243{,}100 - 108{,}960}{1320 - 1024} = \frac{134{,}140}{296} \]

a = 453.176

\[ b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} \]

\[ b = \frac{6(3405) - (32)(1105)}{6(220) - (32)^2} = \frac{20{,}430 - 35{,}360}{296} = \frac{-14{,}930}{296} \]

b = –50.439

y' = a + bx
y' = 453.176 – 50.439x
y' = 453.176 – 50.439(4)
y' = $251.42

        Example # 7
Find the equation of the regression line and find the y
value when x = 70 ºF. Remember that no regression
should be done when r is not significant.
Temperatures (in °F) and precipitation (in inches)
Avg. daily temp. x      86    81    83    89    80    74    64
Avg. mo. precip. y     3.4   1.8   3.5   3.6   3.7   1.5   0.2

Σx = 557        Σy = 17.7
Σx² = 44,739    Σxy = 1468.9

\[ a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} \]

\[ a = \frac{(17.7)(44{,}739) - (557)(1468.9)}{7(44{,}739) - (557)^2} \]

a = –8.994

\[ b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} \]

\[ b = \frac{7(1468.9) - (557)(17.7)}{7(44{,}739) - (557)^2} \]

b = 0.1448

y' = a + bx
y' = –8.994 + 0.1448x
y' = –8.994 + 0.1448(70)
y' = 1.1 inches
  Coefficient of Determination and Standard Error of the Estimate
Find the coefficients of determination and nondetermination when r = 0.70
and explain the meaning of each.

Coefficient of determination: r² = 0.49
49% of the variation of y is due to the variation of x.

Coefficient of nondetermination: 1 – r² = 0.51
51% of the variation of y is due to chance.
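Both coefficients follow from r alone, as this short sketch shows. The function name `determination` is illustrative, and the results are rounded for display because 0.70 squared is not exactly representable in floating point.

```python
def determination(r):
    """Return (r squared, 1 - r squared) for a correlation coefficient r."""
    r_squared = r * r
    return r_squared, 1 - r_squared

r2, non_determination = determination(0.70)
print(f"explained by x: {r2:.0%}, due to chance: {non_determination:.0%}")
```

The two values always sum to 1, so a larger r² leaves a smaller share of the variation unexplained.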
Chapter 10
Correlation and Regression



Section 10-5
Exercise #15
         Example # 2


Compute the standard error of the estimate.
Σx = 225       Σy = 22.5
Σx² = 9653     Σy² = 141.25
Σxy = 625      n = 6
a = 10.499     b = –0.18

\[ s_{est} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} \]

\[ s_{est} = \sqrt{\frac{141.25 - 10.499(22.5) - (-0.18)(625)}{6 - 2}} \]

\[ s_{est} = \sqrt{4.380625} \]

s_est = 2.09
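The standard error formula can be checked numerically. This sketch assumes the rounded coefficients a = 10.499 and b = –0.18 from the worked example; the function name `standard_error` is illustrative.

```python
from math import sqrt

def standard_error(n, sum_y, sum_y2, sum_xy, a, b):
    """Standard error of the estimate from sums and the regression coefficients."""
    return sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

# Ages and exercise data, with the rounded a and b from the regression example.
s_est = standard_error(n=6, sum_y=22.5, sum_y2=141.25, sum_xy=625, a=10.499, b=-0.18)
print(round(s_est, 2))
```

The divisor is n – 2 rather than n because two quantities, a and b, were estimated from the same data.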
         Example # 3


Find the 90% prediction interval when x = 20 years.
Age x      18   26   32   38   52    59
Hours y    10    5    2    3   1.5    1

Σx = 225       Σxy = 625
Σy = 22.5      n = 6
Σx² = 9653     a = 10.499
Σy² = 141.25   b = –0.18

y' = 10.499 – 0.18x
   = 10.499 – 0.18(20) = 6.899

\[ y' - t_{\alpha/2}\, s_{est} \sqrt{1 + \frac{1}{n} + \frac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}} < y < y' + t_{\alpha/2}\, s_{est} \sqrt{1 + \frac{1}{n} + \frac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}} \]

Here X̄ = Σx/n = 225/6 = 37.5, s_est = 2.09, and t = 2.132 for a 90% interval with d.f. = 4.

\[ 6.899 - (2.132)(2.09)\sqrt{1 + \frac{1}{6} + \frac{6(20 - 37.5)^2}{6(9653) - (225)^2}} < y < 6.899 + (2.132)(2.09)\sqrt{1 + \frac{1}{6} + \frac{6(20 - 37.5)^2}{6(9653) - (225)^2}} \]

6.899 – (2.132)(2.09)(1.19) < y < 6.899 + (2.132)(2.09)(1.19)

1.60 < y < 12.20
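The interval can be recomputed in a few lines. This is a sketch under the values used in the worked example (y' = 6.899, s_est = 2.09, t = 2.132); the function name `prediction_interval` and its parameter names are illustrative.

```python
from math import sqrt

def prediction_interval(x, n, sum_x, sum_x2, y_pred, s_est, t_crit):
    """Return (low, high) bounds of the prediction interval for y at a given x."""
    x_bar = sum_x / n
    radical = sqrt(1 + 1 / n + n * (x - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2))
    margin = t_crit * s_est * radical
    return y_pred - margin, y_pred + margin

# 90% interval at x = 20 years for the ages and exercise data.
low, high = prediction_interval(
    x=20, n=6, sum_x=225, sum_x2=9653, y_pred=6.899, s_est=2.09, t_crit=2.132
)
print(round(low, 1), round(high, 1))
```

The code does not round the square root to 1.19 before multiplying, so its bounds differ from the hand calculation in the second decimal place and agree with it to one decimal place.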
         Example # 4


Find the 90% prediction interval when x = 4 years.
Years x                1     5     3    10    7    6
Contributions y, $   500   100   300    50   75   80

Σx = 32          Σxy = 3405
Σy = 1105        n = 6
Σx² = 220        a = 453.176
Σy² = 364,525    b = –50.439

y' = 453.176 – 50.439x
   = 453.176 – 50.439(4) = 251.42

\[ y' - t_{\alpha/2}\, s_{est} \sqrt{1 + \frac{1}{n} + \frac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}} < y < y' + t_{\alpha/2}\, s_{est} \sqrt{1 + \frac{1}{n} + \frac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}} \]

Here X̄ = Σx/n = 32/6 = 5.33, s_est = 94.22, and t = 2.132 for a 90% interval with d.f. = 4.

\[ 251.42 - (2.132)(94.22)\sqrt{1 + \frac{1}{6} + \frac{6(4 - 5.33)^2}{6(220) - (32)^2}} < y < 251.42 + (2.132)(94.22)\sqrt{1 + \frac{1}{6} + \frac{6(4 - 5.33)^2}{6(220) - (32)^2}} \]

251.42 – (2.132)(94.22)(1.1) < y < 251.42 + (2.132)(94.22)(1.1)

$30.46 < y < $472.38
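The value s_est = 94.22 is used in this interval without being derived. Assuming it comes from the same standard error formula applied earlier to the ages and exercise data, with the sums and the rounded a and b listed for the years and contribution data, it can be checked as follows:

```latex
\begin{aligned}
s_{est} &= \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} \\
        &= \sqrt{\frac{364{,}525 - 453.176(1105) - (-50.439)(3405)}{6 - 2}} \\
        &= \sqrt{\frac{35{,}510.315}{4}} \\
        &\approx 94.22
\end{aligned}
```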
Chapter 10
Correlation and Regression



Section 10-6
Multiple Regression
         Example # 1
A manufacturer found that a significant relationship
exists among the number of hours an assembly line
employee works per shift x1, the total
number of items produced x2, and the
number of defective items produced y.
The multiple regression equation is
y' = 9.6 + 2.2x1 – 1.08x2. Predict the
number of defective items
produced by an employee who has
worked 9 hours and produced
24 items.
y' = 9.6 + 2.2x1 – 1.08x2
y' = 9.6 + 2.2(9) – 1.08(24)
y' = 3.48 or 3 items
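Evaluating a multiple regression equation is a matter of multiplying each coefficient by its variable and adding the intercept. A minimal sketch, assuming the equation given in this example; the function name `predict` is illustrative.

```python
def predict(intercept, coefficients, values):
    """Evaluate y' = intercept + b1*x1 + b2*x2 + ... for one observation."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))

# y' = 9.6 + 2.2*x1 - 1.08*x2 with x1 = 9 hours and x2 = 24 items.
defects = predict(9.6, [2.2, -1.08], [9, 24])
print(round(defects, 2))
```

The same function works for any number of independent variables, provided the coefficients and values are listed in the same order.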
         Example # 2
An educator has found a significant relationship among
a college graduate’s IQ x1, score on the verbal section
of the SAT x2, and income for the first year
following graduation from college y.
Predict the income of a college graduate
whose IQ is 120 and verbal SAT score is
650. The regression equation is
y' = 5000 + 97x1 + 35x2.

y' = 5000 + 97(120) + 35(650)
y' = $39,390

								