# Generalized Linear Models

## Outline

1. Generalized Linear Models

## Linear Models

- So far we have discussed linear models: ANOVA, regression, and ANCOVA.
- We can even build our models in a "mixed effects" framework by incorporating different error terms.
- We have always used a response variable (y) that is continuous and normally distributed.
- If the model errors weren't normally distributed, we simply transformed y and made it somewhat normally distributed.

Week 11 - Generalized linear models, B. Aukema, NRES 798

## Generalized Linear Models

- Generalized linear models are models that do not depend on a normally distributed response variable.
- The two most common types are logistic models for binary data and Poisson models for Poisson-distributed data.
- Generalized linear models, because they are linear models, can still do ANOVA, regression, and ANCOVA. In fact, they can even incorporate random effects as well. The only difference is that we can specify that the response variable is not normally distributed.
- Note that generalized linear models are abbreviated GLM. (SAS Corporation is very sloppy with their terminology: they have named one procedure GLM for "general linear model", which is nonsensical.)

## Generalized Linear Models: a regression example

- I'll demonstrate how a generalized linear model works with a regression equation:

  y_i = β0 + β1 x_i + ε_i

- The β0 + β1 x_i is the "mean" part of the model if the response is normally distributed. It contains the coefficients in which we are most interested.

## Generalized Linear Models: the link function

- A generalized linear model uses something called a link function between the response (not normally distributed!) and the "mean".
- Each different "family" (e.g., Poisson, binomial, etc.) uses a different link function, sometimes known as a "canonical link" (there is only one per family).
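The need for a link function can be seen numerically: the linear predictor β0 + β1 x can take any real value, but a probability must stay between 0 and 1. A minimal sketch of the binomial family's canonical link (the logit, covered on the next slides), showing how it maps probabilities onto the whole real line:

```python
import math

def logit(p):
    """Canonical link for the binomial family:
    maps a probability in (0, 1) onto the whole real line."""
    return math.log(p / (1 - p))

for p in (0.1, 0.5, 0.9):
    print(p, round(logit(p), 3))  # 0.1 → -2.197, 0.5 → 0.0, 0.9 → 2.197
```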
## Logistic models

- The link function for the binomial family is known as the "logit":

  log(p(Y) / (1 − p(Y))) = β0 + β1 x

  or whatever your "mean" part of the model is (e.g., it could be µ + α_i if a one-way ANOVA).
- In practice, you specify to the computer:
  1. the response variable
  2. the model
  3. the data distribution family
- You do not transform the original data!

## Logistic models: the (sort of) confusing part

- The mean response in a logistic model, the average of all the 0/1 data, will be the probability of the measured event.
- The link function is simply a transformation of that probability.
- To retrieve this probability, you need to backtransform the model coefficients.
- For example, in our regression, the back transformation to get the direct probability back would be

  p(Y) = exp(β0 + β1 x) / (1 + exp(β0 + β1 x))

## Logistic regression: example

- Let's say you wanted to find the probability of y occurring at x = 1124. From the summary table, you can find the equation:

| Coefficient | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | 10.753080 | 3.870467 | 2.778 | 0.005465 ** |
| x | -0.009576 | 0.003534 | -2.709 | 0.006739 ** |

  y = 10.75 − 0.009576 × 1124 = −0.0134

- To get the probability, we back transform this value:

  p(Y) = exp(−0.0134) / (1 + exp(−0.0134)) = 0.497
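The arithmetic above can be checked in a few lines of code. (Note the slide rounds the intercept to 10.75; the full-precision coefficients give a linear predictor of about −0.0103 rather than −0.0134, but the back-transformed probability still rounds to 0.497.)

```python
import math

def inv_logit(z):
    """Back-transform a value on the logit (log-odds) scale to a probability."""
    return math.exp(z) / (1 + math.exp(z))

# Coefficients from the summary table on the slide
b0 = 10.753080
b1 = -0.009576
x = 1124

z = b0 + b1 * x   # linear predictor on the log-odds scale
p = inv_logit(z)  # probability of the event at x = 1124
print(round(p, 3))  # → 0.497
```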

## Logistic models: model fitting

- Fitting a model is pretty similar to what we have already done.
- We use glm() instead of lm().
- In a mixed model framework, we use glmmPQL() instead of lme().
- The syntax is entirely the same for model specification (i.e., specify the response, the model, the data source, and now the data family as well, e.g., family="binomial" for logistic regression).
- Parameter estimation is performed via likelihood, not least squares.
- Hence, you will not find an R² value but instead an AIC value.

## Logistic models: the concept of deviance

New things:
- The "residual" error is called "deviance" in logistic regression.
- With deviance, we no longer use our trusty t and F tests.
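For a 0/1 response, the deviance is −2 times the model's Bernoulli log-likelihood (the saturated model's log-likelihood is 0 for binary data, so no extra term is needed). A minimal sketch, using made-up responses and fitted probabilities purely for illustration:

```python
import math

def logistic_deviance(y, p):
    """Residual deviance for binary data: -2 * Bernoulli log-likelihood.
    y: observed 0/1 responses; p: fitted probabilities."""
    ll = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
             for yi, pi in zip(y, p))
    return -2 * ll

y = [1, 0, 1, 1, 0]           # observed 0/1 data (made up)
p = [0.8, 0.3, 0.6, 0.9, 0.2]  # fitted probabilities (made up)
print(round(logistic_deviance(y, p), 3))  # → 2.838
```

A perfect fit (p = 1 whenever y = 1, p → 0 whenever y = 0) would give a deviance of 0; worse fits give larger values, which is why deviance plays the role the residual sum of squares played in ordinary regression.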
## Logistic models: χ² tests

New things:
- F tests have been replaced with χ² tests.
- So, for example, when using anova(), you will see χ² tests, especially in model selection.

## Logistic models: Wald tests

New things:
- Instead of a t test to check whether or not your model parameters are different from zero, you now use a Z distribution.
- For logistic models, each test is called a Wald test.

## Logistic regression: model assumptions

1. Data are independent
2. Model is correct
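Each Wald statistic is simply the estimate divided by its standard error, compared against a standard normal (Z) distribution. A sketch that reproduces the z value and p-value for the intercept row of the earlier summary table:

```python
import math

def wald_test(estimate, std_error):
    """Wald test: z = estimate / SE, with a two-sided p-value
    from the standard normal (Z) distribution."""
    z = estimate / std_error
    # Standard normal CDF via the error function; p = 2 * P(Z > |z|)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# (Intercept) row from the example summary table
z, p = wald_test(10.753080, 3.870467)
print(round(z, 3))  # → 2.778
```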

## Logistic regression: model checking

- There is still some residual deviance, which you can read from the model summary. It can be helpful to see how much noise there is.
- We do not, however, check residual plots: we cannot transform the data anyway!