UNIVERSITY OF THESSALY
School of Medicine
Laboratory of Biomathematics

Revealing the Mode of Inheritance in
Genetic Association Studies

Genetics background
•   Genetics is the science that studies the heredity of traits

•   Genetic information is contained in DNA, which consists of nucleotides

•   A gene is a sequence of nucleotides that encodes a protein

•   Genes (through proteins) determine traits (phenotypes)
•   A gene may have different forms called alleles

•   An allele can be mutant type (mt; a change in nucleotides) or wild type (wt)

•   For each gene there are two alleles, because humans are diploid (homologous
    chromosome pairs)

•   In an individual, the genotype of a gene can be homozygous (wtwt
    or mtmt) or heterozygous (wtmt)

•   The existence of multiple alleles of a gene is called a polymorphism or variant (preserved
    mutation); the alleles usually express different phenotypes
Genetic association studies (GAS)

• The evaluation of possible associations
  between phenotypic traits (diseases) and
  genetic variants (gene polymorphisms) is
  carried out using GAS

• In the case of a genetic variant with two
  alleles (mutant type-mt and wild type-wt),
  where mt is thought to be associated with a
  disease, GAS will collect information on the
  numbers of diseased subjects and control
  subjects with each of the three genotypes
  (wt/wt, wt/mt, mt/mt)
• In an illustrative GAS with 8261 cases and
  4374 controls that investigated the
  association between ACE D/I (wt/mt) and
  CAD, the genotype distribution was

  Genotype         Cases with CAD             Controls
  mt/mt            1788                       874
  mt/wt            4145                       2165
  wt/wt            2328                       1335

The association between disease status and the
genetic variant is tested using a chi-squared (χ²) test
with (3-1)×(2-1)=2 df
When the association is significant, various genetic models
are tested by merging genotypes

These models include:

•   additive model: homozygous for mt vs.
                    homozygous for wt

•   recessive model: homozygous for mt vs. wt-carriers

•   dominant model: mt-carriers vs. non-mt-carriers

•   co-dominant model: heterozygous vs. all homozygotes
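The merging of genotypes behind these four models can be sketched in Python. This is a minimal illustration using the ACE D/I counts from the example table; the names `counts`, `merge`, `models` and `tables` are hypothetical and do not come from the slides.

```python
# Genotype counts from the ACE D/I vs. CAD example: (cases, controls)
counts = {
    "mtmt": (1788, 874),
    "mtwt": (4145, 2165),
    "wtwt": (2328, 1335),
}

def merge(genotypes):
    """Sum cases and controls over a group of genotypes."""
    cases = sum(counts[g][0] for g in genotypes)
    controls = sum(counts[g][1] for g in genotypes)
    return cases, controls

# Each model compares a first genotype group with a comparison group.
models = {
    "additive":    (["mtmt"], ["wtwt"]),
    "recessive":   (["mtmt"], ["mtwt", "wtwt"]),
    "dominant":    (["mtmt", "mtwt"], ["wtwt"]),
    "co-dominant": (["mtwt"], ["mtmt", "wtwt"]),
}

# One 2x2 table per model: ((cases, controls), (cases, controls))
tables = {name: (merge(first), merge(comparison))
          for name, (first, comparison) in models.items()}

for name, (first, comparison) in tables.items():
    print(f"{name:12s} first group={first} comparison group={comparison}")
```

Each model reduces the 3×2 genotype table to a 2×2 table, which is why the four resulting ORs are computed from overlapping counts.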
    The significance of the genetic model is
    assessed using the respective odds ratio
    (OR) and its 95% confidence interval (CI).

    The OR for the additive model is

    $$\mathrm{OR} = \frac{\text{``probability'' of a subject being diseased when mtmt}}{\text{``probability'' of a subject being diseased when wtwt}}$$

    For OR>1: an mtmt subject has a greater chance
    of being diseased than a wtwt subject

    If the 95% CI does not include 1, then, the OR
    is significant (P<0.05) (i.e. the variant is
    associated with the disease).
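For reference, the OR and its 95% CI for any merged 2×2 table can be written in a general form. The symbols a, b, c and d are introduced here for illustration only (they are not used in the slides): a and b are the numbers of cases and controls in the first genotype group, and c and d are the numbers of cases and controls in the comparison group. The interval is the usual large-sample approximation on the log scale.

$$\mathrm{OR} = \frac{a/b}{c/d} = \frac{ad}{bc}, \qquad 95\%\,\mathrm{CI} = \exp\!\left( \ln(\mathrm{OR}) \pm 1.96\sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}} \right)$$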
ACE D/I (wt=D/mt=I) vs. CAD

  Genotype      Cases with CAD       Controls
  mt/mt         1788                 874
  mt/wt         4145                 2165
  wt/wt         2328                 1335

• χ²=9.42 (2 df), P<0.05
(http://people.ku.edu/~preacher/chisq/chisq.htm)

There is a significant association between the ACE D/I
  gene variant and the development of CAD
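The chi-squared statistic can also be checked without the online calculator. The following is a minimal sketch in plain Python; the function name `chi_squared` is hypothetical, and the P-value shortcut used here holds only for exactly 2 df.

```python
import math

# Observed counts from the table above: rows = genotypes, columns = (cases, controls)
observed = [
    (1788, 874),   # mt/mt
    (4145, 2165),  # mt/wt
    (2328, 1335),  # wt/wt
]

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

chi2, df = chi_squared(observed)
# With exactly 2 df, the chi-squared upper-tail probability is exp(-x/2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, df = {df}, P = {p_value:.4f}")
```

For these counts the statistic rounds to the 9.42 reported above.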
But, what is the mode of inheritance,
                  or
  what is the real genetic model?
   Recessive model:

Genotype         Cases with CAD        Controls
  mt/mt            1788                  874
  mt/wt+wt/wt      4145+2328=6473        2165+1335=3500

$$\mathrm{OR} = \frac{\text{``probability'' of a subject being with CAD when mtmt}}{\text{``probability'' of a subject being with CAD when mtwt + wtwt}} = \frac{1788/874}{6473/3500} = 1.11$$

$$95\%\,\mathrm{CI} = \left( e^{\ln(\mathrm{OR}) - 1.96\sqrt{\frac{1}{1788}+\frac{1}{6473}+\frac{1}{874}+\frac{1}{3500}}},\; e^{\ln(\mathrm{OR}) + 1.96\sqrt{\frac{1}{1788}+\frac{1}{6473}+\frac{1}{874}+\frac{1}{3500}}} \right) = (1.01, 1.21)$$

• Since “1” is not included in the 95% CI, we conclude
  that the OR is significant (P<0.05).

• Since OR>1, we conclude that subjects homozygous for the mt
  allele have 11% greater risk for CAD than wt-carriers
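The OR and CI arithmetic for a merged 2×2 table can be sketched in Python. This is an illustration only: the function name `odds_ratio_ci` and its parameter names are hypothetical, and the interval is the large-sample approximation on the log scale used in the recessive example.

```python
import math

def odds_ratio_ci(cases_a, controls_a, cases_b, controls_b, z=1.96):
    """OR of group A vs. group B with a log-based confidence interval."""
    odds_ratio = (cases_a / controls_a) / (cases_b / controls_b)
    se_log_or = math.sqrt(1 / cases_a + 1 / controls_a
                          + 1 / cases_b + 1 / controls_b)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Recessive model: mt/mt vs. mt/wt + wt/wt
or_rec, lo, hi = odds_ratio_ci(1788, 874, 4145 + 2328, 2165 + 1335)
print(f"OR = {or_rec:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The same function applies to any other model once its genotypes have been merged into two groups.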
  Dominant model:

Genotype         Cases with CAD        Controls
  mt/mt+mt/wt      1788+4145=5933        874+2165=3039
  wt/wt            2328                  1335

$$\mathrm{OR} = \frac{\text{``probability'' of a subject being with CAD when mtmt + mtwt}}{\text{``probability'' of a subject being with CAD when wtwt}} = \frac{5933/3039}{2328/1335} = 1.12$$

$$95\%\,\mathrm{CI} = \left( e^{\ln(\mathrm{OR}) - 1.96\sqrt{\frac{1}{5933}+\frac{1}{2328}+\frac{1}{3039}+\frac{1}{1335}}},\; e^{\ln(\mathrm{OR}) + 1.96\sqrt{\frac{1}{5933}+\frac{1}{2328}+\frac{1}{3039}+\frac{1}{1335}}} \right) = (1.03, 1.21)$$

• Since “1” is not included in the 95% CI, we conclude
  that the OR is significant (P<0.05).

• Since OR>1, we conclude that carriers of the mt allele
  have 12% greater risk for CAD than subjects homozygous for
  the wt allele
   Additive model:

Genotype         Cases with CAD        Controls
  mt/mt            1788                  874
  wt/wt            2328                  1335

$$\mathrm{OR} = \frac{\text{``probability'' of a subject being with CAD when mtmt}}{\text{``probability'' of a subject being with CAD when wtwt}} = \frac{1788/874}{2328/1335} = 1.17$$

$$95\%\,\mathrm{CI} = \left( e^{\ln(\mathrm{OR}) - 1.96\sqrt{\frac{1}{1788}+\frac{1}{2328}+\frac{1}{874}+\frac{1}{1335}}},\; e^{\ln(\mathrm{OR}) + 1.96\sqrt{\frac{1}{1788}+\frac{1}{2328}+\frac{1}{874}+\frac{1}{1335}}} \right) = (1.06, 1.30)$$

• Since “1” is not included in the 95% CI, we conclude
  that the OR is significant (P<0.05).

• Since OR>1, we conclude that subjects homozygous for the mt
  allele have 17% greater risk for CAD than
  subjects homozygous for the wt allele
     Co-dominant model:

Genotype         Cases with CAD        Controls
  mt/wt            4145                  2165
  mt/mt+wt/wt      1788+2328=4116        874+1335=2209

$$\mathrm{OR} = \frac{\text{``probability'' of a subject being with CAD when mtwt}}{\text{``probability'' of a subject being with CAD when wtwt + mtmt}} = \frac{4145/2165}{4116/2209} = 1.03$$

$$95\%\,\mathrm{CI} = \left( e^{\ln(\mathrm{OR}) - 1.96\sqrt{\frac{1}{4145}+\frac{1}{4116}+\frac{1}{2165}+\frac{1}{2209}}},\; e^{\ln(\mathrm{OR}) + 1.96\sqrt{\frac{1}{4145}+\frac{1}{4116}+\frac{1}{2165}+\frac{1}{2209}}} \right) = (0.96, 1.11)$$

• Since “1” is included in the 95% CI, we conclude
  that the OR is not significant (P≥0.05).
• Recessive model: OR=1.11 (1.01, 1.21), significant

  Subjects homozygous for the mt allele have greater risk than
  wt-carriers

• Dominant model: OR=1.12 (1.03, 1.21), significant

  Carriers of the mt allele have greater risk than
  non-carriers

• Additive model: OR=1.17 (1.06, 1.30), significant

  Subjects homozygous for the mt allele have greater risk than
  subjects homozygous for the wt allele

• Co-dominant model: OR=1.03 (0.96, 1.11), non-significant
Is the genetic model,
      recessive,
      dominant or
       additive?



       Mess!
The source of the problem

• The ORs of the genetic models
  (recessive, dominant, additive, co-dominant)
  are not independent

• The test of association between the genotype
  distribution and the outcome (disease/controls) is based
  on 2 df (the df of the chi-squared test)
How can we avoid the hash of possible
genetic models and, at the same time,
make the interpretation of the results
straightforward?
Zintzaras (2010, Stat Appl Genet)
introduced the concept of a generalized
odds ratio (ORG) as a metric for
describing the association between
disease status (disease vs. healthy or
disease progression) and genotype
(biallelic or multiallelic)
 The ORG is a single statistic that
 utilizes the complete genotype
 distribution and provides an estimate
 of the overall risk effect

General definition:

 The ORG is the probability of a subject
 being more diseased relative to the
 probability of being less diseased,
 given that the more diseased subject
 has a higher mutational load
Definition for bi-allelic variant and binary
phenotype:

  ORG is the probability of a subject being
  diseased relative to the probability of being free of
  disease, given that the diseased subject has a
  higher mutational load than the non-diseased

$$\mathrm{OR}_G = \frac{\text{Probability of being diseased, diseased has high mutational load}}{\text{Probability of being non-diseased, non-diseased has low mutational load}}$$

  When ORG>1, an increased genetic
  exposure (mutational load) implies disease
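One way to turn this verbal definition into counts is the pairwise generalized odds ratio for ordered categories: the number of diseased/non-diseased pairs in which the diseased subject carries the higher mutational load, divided by the number of pairs in which the diseased subject carries the lower load. The sketch below assumes this pairwise form, which the slides do not spell out, and the function name `generalized_odds_ratio` is hypothetical. The ORGGASMA software may estimate the ORG differently, so this sketch is not guaranteed to reproduce its output.

```python
def generalized_odds_ratio(cases, controls):
    """Pairwise generalized odds ratio for genotypes ordered by mutational load.

    cases[i] and controls[i] are the counts for the i-th genotype, ordered
    from the lowest to the highest mutational load.
    """
    k = len(cases)
    # Case-control pairs in which the diseased subject has the higher load.
    higher = sum(cases[i] * controls[j] for i in range(k) for j in range(i))
    # Case-control pairs in which the diseased subject has the lower load.
    lower = sum(cases[i] * controls[j] for i in range(k) for j in range(i + 1, k))
    return higher / lower

# ACE D/I vs. CAD counts ordered wt/wt, mt/wt, mt/mt (lowest to highest load)
org = generalized_odds_ratio(cases=[2328, 4145, 1788],
                             controls=[1335, 2165, 874])
print(f"ORG (pairwise sketch) = {org:.3f}")
```

Pairs in which both subjects carry the same genotype are ignored, since neither subject has the higher load.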
• “ORGGASMA”: a software for
  implementing the generalized odds
  ratio methodology for the analysis and
  meta-analysis of GAS

• The software “ORGGASMA” (together
  with instructions on how to operate it) is
  freely available and can be
  downloaded from the web site
  http://biomath.med.uth.gr
ACE D/I (wt=D/mt=I) vs. CAD

  Genotype      Cases with CAD    Controls
  mt/mt         1788           874
  mt/wt         4145           2165
  wt/wt         2328           1335

• Assumption: Subjects who are homozygous for
  I allele have the highest mutational load, those
  homozygous for D allele have the lowest, and
  heterozygous have an intermediate level.

• ORG=1.13 with 95% CI: (1.08, 1.19)

 For any two subjects, diseased and
 healthy, the probability of being
 diseased is 13% higher (relative to the
 probability of being non-diseased)
 given that the diseased subject has
 higher mutational load than the healthy
 one.
Disease progression

   ADH2 *2/*1     Controls    Alcoholics    Alcoholics with liver disease
    *1/*1          188         874           321
    *2/*1          145         265           456
    *2/*2          238         135           231

   ORG =1.37 (1.10-1.72): Risk of disease progression is related to mutational load

   A subject has 37% higher risk of being more diseased (relative to the risk of being
   less diseased) given that the subject has a higher mutational load.

Multiallelic variant

   APOE                Controls      CAD
   e2/e2               23            44
   e2/e3               45            65
   e2/e4               28            35
   e3/e3               32            21
   e3/e4               87            45
   e4/e4               34            44

   ORG=1.13 (1.07-1.19): Mutational load of APOE plays a role in disease susceptibility

   Diseased subjects with higher mutational load than healthy ones have 13% higher
   risk for disease susceptibility.
     The ORG is a good solution, but,
           it is not enough!

• Zintzaras and Santos (2010, Stat Med)
  provided the whole solution!

Problem: The ORs of the genetic models
(recessive, dominant, additive, co-dominant)
are not independent

Solution: Inferences should be based solely on
the additive and co-dominant models
Instead of talking about
  recessive,
  dominant,
  additive,
  co-dominant models
we could talk about
Dominance and Co-Dominance
or, even better, about the
Degree of dominance
Co-dominance
 In the extreme case where there is co-
 dominance (i.e., perfect additivity), the
 heterozygote wtmt “lies” exactly in the
 middle of the two homozygotes, with mtmt
 having the maximum susceptibility of being
 diseased and wtwt having the least




• Co-dominant model is non-significant (P>0.05)
• Additive model is highly significant (P<0.01)
  Dominance
   The heterozygote wtmt lies towards
   mtmt or wtwt




Co-dominant model is significant (P<0.05)

Additive model can be significant (P<0.05) or non-
significant (P>0.05)
          Degree of dominance
The degree of dominance (h) can be derived
from the ratio of the logarithm of the OR of the
co-dominant model to the absolute value of the
logarithm of the OR of the additive model:

$$h = \frac{\ln(\mathrm{OR}_{co})}{\left|\ln(\mathrm{OR}_{a})\right|}$$

The sign of ln(ORco) determines the direction of
dominance, and the value of ln(ORco) relative to the
absolute value of ln(ORa) gives the magnitude of the dominance
deviation (i.e. the deviation from the middle)
• -1<h<0: wtmt is expected to have a risk of
  being diseased somewhere between the
  middle of the two homozygotes and
  wtwt

• 0<h<1: wtmt is expected to have a risk of
  being diseased somewhere between the
  middle of the two homozygotes and
  mtmt

• h>1: wtmt has a higher risk of being
  diseased than mtmt

• h<-1: wtmt has a lower risk of being
  diseased than wtwt
Once significance in dominance is
detected (i.e. co-dominant model has
P<0.05) and h is obtained, the degree of
dominance is inferred as follows:
To summarize, inferences regarding any
degree of dominance are obtained in the
following order:
• If the co-dominant model is non-significant and the
  additive model is significant (i.e. co-dominance), the
  risk of disease for the heterozygote is in the middle
  of the two homozygotes.
• If the co-dominant model is significant (i.e.
  dominance), we then test for the direction of
  dominance.
   – If 0<|h|<1, wtmt has a risk of disease closer to
     mtmt or wtwt according to the sign (+ or -,
     respectively) of h
   – If |h|>1 and this is significant, there is over- or
     under-dominance.
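The order of inference summarized above can be sketched as a small Python function. The function name `infer_dominance`, its parameters and the returned wording are hypothetical. The significance test for |h|>1 is not implemented here, and the branches for "neither model significant" and for h exactly 0 are assumptions added so that the function always returns something.

```python
def infer_dominance(co_dominant_significant, additive_significant, h=None):
    """Return a verbal inference following the order summarized above."""
    if not co_dominant_significant:
        if additive_significant:
            return "co-dominance: wtmt risk lies in the middle of the two homozygotes"
        # Assumption: this case is not discussed in the slides.
        return "no evidence of association under either model"
    # Co-dominant model significant: dominance exists, so inspect h.
    if h is None:
        raise ValueError("h is required when the co-dominant model is significant")
    if h > 1:
        return "over-dominance: wtmt has a higher risk than mtmt"
    if h < -1:
        return "under-dominance: wtmt has a lower risk than wtwt"
    if h > 0:
        return "dominance: wtmt risk lies closer to mtmt"
    if h < 0:
        return "dominance: wtmt risk lies closer to wtwt"
    # Assumption: h == 0 is a boundary case not covered in the slides.
    return "h = 0: no dominance deviation"

print(infer_dominance(False, True))
print(infer_dominance(True, True, h=0.23))
```

The boundary values h = 1 and h = -1 fall into the "closer to mtmt" and "closer to wtwt" branches here, which is a design choice of this sketch.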
ACE D/I (wt=D/mt=I) vs. CAD

 Genotype     Cases with CAD     Controls
 mt/mt        1788               874
 mt/wt        4145               2165
 wt/wt        2328               1335

• The co-dominant model is not significant
  (P≥0.05) and the additive model is
  significant (P<0.05); therefore, the risk of disease for
  the heterozygote is in the middle of the two
  homozygotes.
  A GAS investigating the association of the alleles
  ADH2*1 and ADH2*2 with alcoholism produced the
  following genotype distribution:

  Genotype    Controls    Cases
  *2/*2       448         238
  *2/*1       93          85
  *1/*1       4           17



• Both the co-dominant and additive models are significant.
  Since the co-dominant model is significant, we proceed
  to inquire about the degree of dominance, which here is

$$h = \frac{\ln(\mathrm{OR}_{co})}{\left|\ln(\mathrm{OR}_{a})\right|} = \frac{0.48}{2.08} = 0.23$$

• This indicates that the risk-associated allele *1 is dominant,
  i.e. that dominance exists.
In other words, the homozygote *1/*1 (mt/mt)
has a greater risk of being alcoholic than the
homozygote *2/*2 (wt/wt), and the heterozygote
*2/*1 has a risk of alcoholism that lies between
the midpoint of the two homozygotes and the
homozygote *1/*1.
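The degree of dominance for the ADH2 example can be recomputed from the genotype counts. This is a minimal sketch: the variable names and the helper `odds` are hypothetical, with mt = *1 and wt = *2 as in the text.

```python
import math

# ADH2 genotype counts from the example above: (controls, cases)
wtwt = (448, 238)  # *2/*2
wtmt = (93, 85)    # *2/*1
mtmt = (4, 17)     # *1/*1

def odds(controls, cases):
    """Odds of being a case in a genotype group."""
    return cases / controls

# Co-dominant model: heterozygotes vs. all homozygotes
or_co = odds(*wtmt) / odds(wtwt[0] + mtmt[0], wtwt[1] + mtmt[1])
# Additive model: mt/mt vs. wt/wt
or_a = odds(*mtmt) / odds(*wtwt)

# Degree of dominance: ln(OR_co) relative to |ln(OR_a)|
h = math.log(or_co) / abs(math.log(or_a))
print(f"ln(ORco) = {math.log(or_co):.2f}, "
      f"ln(ORa) = {math.log(or_a):.2f}, h = {h:.2f}")
```

The printed values round to the 0.48, 2.08 and 0.23 quoted in the example.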
Ancient Theater of Larissa

				