Generalized Linear Models
Week 11, NRES 798 (B. Aukema)


Outline

  1  Generalized Linear Models


Linear Models

- So far we have discussed linear models: ANOVA, regression, and ANCOVA.
- We can even build our models in a “mixed effects” framework by incorporating different error terms.
- We have always used a response variable (y) that is continuous and normally distributed.
- If the model errors weren't normally distributed, we simply transformed y so that it was approximately normally distributed.








Generalized Linear Models

- Generalized linear models are models that do not depend on a normally distributed response variable.
- The two most common types are logistic models for binary data and Poisson models for Poisson-distributed count data.
- Because generalized linear models are linear models, they can still do ANOVA, regression, and ANCOVA; in fact, they can incorporate random effects as well. The only difference is that we can specify that the response variable is not normally distributed.
- Note that generalized linear models are abbreviated GLM. (SAS is sloppy with its terminology: it has named a procedure GLM for “general linear model”, which is nonsensical.)

Generalized Linear Models

- I'll demonstrate how a generalized linear model works with a regression equation:

      y_i = β0 + β1 x_i + ε_i

- The β0 + β1 x_i part is the “mean” part of the model if the response is normally distributed. It contains the coefficients in which we are most interested.

Generalized Linear Models

- A generalized linear model uses something called a link function between the response (not normally distributed!) and the “mean” part.
- Each “family” (e.g., Poisson, binomial, etc.) uses a different link function, sometimes known as its “canonical link”; there is only one per family (see the R snippet below).
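A quick way to confirm which canonical link goes with which family is to ask R's built-in family constructors (base R, stats package); no data are needed:

      ## Canonical links reported by R's family objects
      binomial()$link   # "logit"    -- logistic models for 0/1 data
      poisson()$link    # "log"      -- Poisson models for count data
      gaussian()$link   # "identity" -- the ordinary linear model as a special case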



Generalized Linear Models
Logistic models

- The link function for the binomial family is known as the “logit”:

      log( p(Y) / (1 − p(Y)) ) = β0 + β1 x

  or whatever your “mean” part of the model is (e.g., it could be µ + α_i for a one-way ANOVA).
- In practice, you specify to the computer:
    1  The response variable
    2  The model
    3  The data distribution family
- You do not transform the original data!

Generalized Linear Models
Logistic models: the (sort of) confusing part

- The mean response in a logistic model, the average of all the 0/1 data, is the probability of the measured event.
- The link function is simply a transformation of the probability of your response.
- To retrieve this probability, you need to back-transform the model coefficients.
- For example, in our regression, the back-transformation that recovers the probability directly is

      p(Y) = exp(β0 + β1 x) / (1 + exp(β0 + β1 x))

Generalized Linear Models
Logistic regression: example

- Let's say you want to find the probability of y occurring at x = 1124. From the summary table, you can find the equation:

      Coefficients:
                    Estimate  Std. Error  z value  Pr(>|z|)
      (Intercept)  10.753080    3.870467    2.778  0.005465 **
      x            -0.009576    0.003534   -2.709  0.006739 **

- On the logit scale: 10.75 − 0.009576 × 1124 = −0.0134
- To get the probability, we back-transform this value (see the R sketch below):

      p(Y) = exp(−0.0134) / (1 + exp(−0.0134)) = 0.497
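The same calculation in R, using the coefficients from the summary table above; plogis() is base R's logistic function, i.e., exp(z) / (1 + exp(z)):

      ## Back-transform the linear predictor at x = 1124 to a probability
      b0 <- 10.753080
      b1 <- -0.009576
      eta <- b0 + b1 * 1124   # the "mean" (logit-scale) part, close to zero
      plogis(eta)             # exp(eta) / (1 + exp(eta)), about 0.497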






Generalized Linear Models
Logistic models: model fitting

- Fitting a model is pretty similar to what we have already done.
- We use glm() instead of lm().
- In a mixed-model framework, we use glmmPQL() instead of lme().
- The syntax for model specification is entirely the same: specify the response, the model, and the data source, AND now the data family as well (e.g., family = “binomial” for logistic regression); see the sketch below.

Generalized Linear Models
Logistic model fitting

- Parameter estimation is performed via likelihood, not least squares.
- Hence, you will not find an R² value but instead an AIC value.

Generalized Linear Models
Logistic models: the concept of deviance

New things:
- The “residual” error is called “deviance” in logistic regression.
- With deviance, we no longer use our trusty t and F tests.
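A minimal sketch of the fitting calls, assuming a hypothetical data frame dat with a 0/1 response y, a predictor x, and (for the mixed version) a grouping factor site:

      ## Logistic regression with glm(): same formula syntax as lm(),
      ## plus the data distribution family
      fit <- glm(y ~ x, family = "binomial", data = dat)
      summary(fit)   # coefficients, null and residual deviance, and AIC (no R^2)

      ## Mixed-model version: glmmPQL() from the MASS package (fits via lme())
      library(MASS)
      fit.mixed <- glmmPQL(y ~ x, random = ~ 1 | site,
                           family = "binomial", data = dat)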




Generalized Linear Models
Logistic models: χ² tests

New things:
- F tests have been replaced with χ² tests.
- So, for example, when using anova(), you will see χ² tests, especially in model selection (see the sketch below).

Generalized Linear Models
Logistic models: Wald tests

New things:
- Instead of a t test to check whether or not your model parameters are different from zero, you now use a Z distribution.
- For logistic models, each such test is called a Wald test.

Generalized Linear Models
Logistic regression: model assumptions

  1  Data are independent
  2  Model is correct
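Continuing the sketch above (fit and dat are the hypothetical objects from the fitting slide), the χ² and Wald tests look like this:

      ## Chi-square (drop-in-deviance) test in model selection
      fit0 <- glm(y ~ 1, family = "binomial", data = dat)   # intercept-only model
      anova(fit0, fit, test = "Chisq")

      ## Wald tests: z values and Pr(>|z|) for each coefficient
      summary(fit)$coefficients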








Generalized Linear Models
Logistic regression: model checking

- There is still some residual deviance, which you can read from the model summary. It can be helpful for seeing how much noise is left.
- We do not, however, check residual plots: we cannot transform the data anyway!
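If you want that leftover noise without scanning the full summary, base R extractors pull it from the fitted object (again using the hypothetical fit from earlier):

      deviance(fit)      # residual deviance reported in the model summary
      df.residual(fit)   # residual degrees of freedom, for context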





								