PPP19 Multiple and Logistic Regression

►Linear regression determines the
 strength & direction of a relationship
 between two variables measured at
 the interval or ratio level.

►Linear regression formalizes the
 relationship by designating (X) as
 the independent variable and (Y) as
 the dependent variable.
►Regression develops an equation
 that allows us to predict the value of
 our “outcome” variable Y, on the
 basis of a specified value of our
 “predictor” variable X.
►The simple linear regression formula

              Y = a + bX
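
As a rough illustration of how a and b in Y = a + bX can be obtained from data, here is a minimal sketch assuming ordinary least squares (the slides do not name the estimation method). The function name and the data are made up for illustration.

```python
def fit_simple_regression(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Made-up data (not from the slides) that lies exactly on Y = 1 + 2X.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_simple_regression(xs, ys)
print(a, b)
```

With real data the points will not fall exactly on a line, and a and b describe the line that minimizes the squared prediction errors.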
►Statistical prediction using linear
 regression is widely used in “real
 world” research. Insurance firms
 use predictor variables to determine
 auto insurance premiums. Employers
 administer aptitude & personality
 tests to predict job performance.
 Professional schools do the same with
 undergraduate marks & aptitude tests
 (e.g., LSAT, MSAT, GRE, GMAT).
►If two variables are perfectly correlated,
 knowing the value of X allows us to
 perfectly predict the value of Y.
►As long as 2 variables are significantly
 correlated, we can use scores on X to
 predict scores on Y. For example, the
 strong correlation between social support
 (X) and mental health (Y) implies that if
 we know a person’s social support level,
 we can accurately predict their mental
 health level (Y).
              Y = a + bX
►Y is the sum of: (1) the average
 duration of employment denoted
 by the intercept a, and (2) the
 additional average amount of
 employment duration per counseling
 session, denoted by the slope b,
 multiplied by the number of
 sessions X.
►The intercept a is the value Y takes
 when X equals 0. It is the average
 duration of post-release employment
 if a young offender attends no
 counseling sessions at all.
►The slope b (the regression
 coefficient) is the amount of change
 in Y (up or down) associated with
 a one-unit increase in X. Here it is
 the average increase in employment
 duration for each counseling session
 attended.
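
To make the roles of the intercept and slope concrete, the following sketch applies Y = a + bX to the counseling example. The values a = 4.0 and b = 1.5 (in months) are hypothetical, chosen only for illustration; the slides give no actual estimates.

```python
def predict_employment_duration(sessions, a=4.0, b=1.5):
    """Predict post-release employment duration from Y = a + bX.

    a and b are hypothetical values (in months), not estimates
    taken from the slides.
    """
    return a + b * sessions


# With X = 0 the prediction equals the intercept a.
no_counseling = predict_employment_duration(0)
# Each additional session changes the prediction by the slope b.
six_sessions = predict_employment_duration(6)
seven_sessions = predict_employment_duration(7)
print(no_counseling, six_sessions, seven_sessions - six_sessions)
```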
►The difference between observed & predicted
 values of Y is the error in the regression. The
 higher the correlation, the more accurate our
 predictions of Y using X.
►r² is the improvement in our predictions of Y
 using X. If we square Pearson’s r = .85, we
 obtain r² = .72. This is the coefficient of
 determination and means we improve our
 predictions of Y by 72% using X; that is, X
 explains 72% of the variability in Y.
 The coefficient of non-determination is 1 - r²,
 and it is the percentage of variability in Y
 not explained by X.
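
The arithmetic above can be checked with a short sketch. The helper names and the small data set are illustrative assumptions; only r = .85 comes from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def determination(r):
    """Return (r squared, 1 - r squared) for a correlation r."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


# The slide's value: r = .85 gives r squared of about .72.
r2, non_r2 = determination(0.85)
print(round(r2, 2), round(non_r2, 2))

# Made-up scores, only to show r computed from raw data.
r_example = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r_example ** 2, 2))
```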
  Multiple Linear Regression:
►Is a logical and mathematical
 extension of simple linear
 regression to situations where
 we have one interval-level
 dependent variable, and two
 or more interval-level
 independent (predictor) variables.
     Y = a + b1X1 + b2X2
Y … respondent score on the dependent variable.
a … the intercept.
b1 … the regression coefficient for the first
  predictor (X1).
b2 … the regression coefficient for the
  second predictor (X2).
X1 … respondent score on the first predictor.
X2 … respondent score on the second predictor.
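
A minimal sketch of how a prediction is formed from Y = a + b1X1 + b2X2 once the coefficients are known. The function name and all numeric values are hypothetical; estimating the coefficients from data is not shown here.

```python
def predict_multiple(a, coefficients, scores):
    """Return a + b1*X1 + b2*X2 + ... for one respondent."""
    if len(coefficients) != len(scores):
        raise ValueError("need one score per regression coefficient")
    # Multiply each predictor score by its own coefficient, then add a.
    return a + sum(b * x for b, x in zip(coefficients, scores))


# Hypothetical values: a = 2.0, b1 = 0.5, b2 = 1.5, X1 = 4, X2 = 2.
y_hat = predict_multiple(2.0, [0.5, 1.5], [4, 2])
print(y_hat)
```

The same function covers more than two predictors, since each extra predictor only adds one more coefficient-times-score term.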
      Logistic Regression

►Logistic regression predicts
 the probability of a dependent
 variable Y (measured as a
 nominal/ordinal dichotomous
 outcome) using a predictor
 variable measured at the
 interval level.
►Used in medical research to
 determine which variables predict
 whether a tumor is likely to be
 cancerous or benign, which variables
 predict the probability of a heart
 attack or stroke, etc.
►In social science, used to predict the
 odds that convicts will recidivate or
 not, which variables predict whether
 couples will get divorced or not, etc.
►In linear regression, we predict
 values of Y using a combination of
 each predictor variable multiplied by
 its respective regression coefficient
 as illustrated in the formula:

          Y = a + b1X1
►In logistic regression, we instead
 predict the probability of Y occurring:

      P(Y) = 1 / (1 + e^(-z))

     where z = a + b1X1

P(Y)…probability of Y occurring.
e…base of natural logarithms
z…the linear regression equation:
the intercept a plus the slope b1
multiplied by the predictor X1.
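
The logistic formula can be sketched in a few lines, assuming the standard form P(Y) = 1 / (1 + e^(-z)) with z = a + b1X1. The coefficient values below are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def logistic_probability(x1, a, b1):
    """Return P(Y) = 1 / (1 + e^(-z)), where z = a + b1 * x1."""
    z = a + b1 * x1
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: a = -3.0, b1 = 0.5.
p_mid = logistic_probability(6, a=-3.0, b1=0.5)    # z = 0, so P(Y) = 0.5
p_high = logistic_probability(10, a=-3.0, b1=0.5)  # z = 2, so P(Y) is above 0.5
print(p_mid, round(p_high, 3))
```

Unlike the linear equation, the result always lies between 0 and 1, which is why it can be read as a probability.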
