Ohio State Talk, October 2004


     Semiparametric Methods for Colonic Crypt Signaling

            Raymond J. Carroll
         Department of Statistics
    Faculty of Nutrition and Toxicology

          Texas A&M University

• Problem: Modeling correlations among
  functions, applied to a colon carcinogenesis experiment
• Biological background
• Semiparametric framework
• Nonparametric Methods
• Asymptotic summary
• Analysis
• Summary
Biologist Who Did The Work

                 Meeyoung Hong

                 Postdoc in the lab of
                 Joanne Lupton and
                 Nancy Turner

                 Data collection
                 represents a year of
              Basic Background

• Apoptosis: Programmed cell death
• Cell Proliferation: Effectively the opposite
• p27: Differences in this marker are thought to
  stimulate and be predictive of apoptosis and cell
  proliferation
• Our experiment: understand some of the
  structure of p27 in the colon when animals are
  exposed to a carcinogen
                Data Collection

• Structure of Colon
• Note the finger-like
• These are colonic crypts
• We measure
  expression of cells
  within colonic crypts
                  Data Collection

• p27 expression:
  Measured by staining
• Brighter intensity = higher expression
• Done on a cell by cell
  basis within selected
  colonic crypts
• Very time intensive
               Data Collection

• Animals sacrificed at 4 times: 0 = control,
  12hr, 24hr and 48hr after exposure
• Rats: 12 at each time period
• Crypts: 20 are selected
• Cells: all cells collected, about 30 per crypt
• p27: measured on each cell, with logarithmic
           Nominal Cell Position

• X = nominal cell position
• Differentiated
  cells: at top, X =
• Proliferating
  cells: in middle,
• Stem cells: at
  bottom, X=0
                 Standard Model

• Hierarchical structure: cells within crypts
  within rats within times

    $$Y_{rc}(x) = \eta(x) + \gamma_r(x) + \theta_{rc}(x) + \varepsilon_{rc}(x)$$

    $\eta(x) + \gamma_r(x)$ = rat-level function

    $\theta_{rc}(x)$ = crypt-level functions, typically assumed independent
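
To make the hierarchy concrete, here is a minimal simulation sketch of the standard model for a single rat. Everything numeric in it is an assumption made for illustration: the linear forms of the mean, rat-level and crypt-level functions, the variances, and the scaling of nominal position to [0, 1]. Only the counts (20 crypts, about 30 cells per crypt) come from the slides.

```python
import random

def simulate_rat(n_crypts=20, n_cells=30, seed=0):
    """Simulate Y_rc(x) = eta(x) + gamma_r(x) + theta_rc(x) + eps_rc(x) for one rat.

    All functional forms and variances below are hypothetical.
    """
    rng = random.Random(seed)
    xs = [i / (n_cells - 1) for i in range(n_cells)]   # assumed scaling of X to [0, 1]
    g0, g1 = rng.gauss(0, 0.3), rng.gauss(0, 0.3)       # rat-level deviation gamma_r(x)
    crypts = []
    for _ in range(n_crypts):
        # Crypt-level function theta_rc(x), drawn independently for each crypt.
        t0, t1 = rng.gauss(0, 0.2), rng.gauss(0, 0.2)
        crypts.append([
            (1.0 + 0.5 * x)        # eta(x): overall mean function
            + (g0 + g1 * x)        # gamma_r(x)
            + (t0 + t1 * x)        # theta_rc(x)
            + rng.gauss(0, 0.1)    # eps_rc(x): white-noise residual
            for x in xs
        ])
    return xs, crypts

xs, crypts = simulate_rat()
print(len(crypts), len(crypts[0]))   # prints "20 30"
```

The crypt-level draws here are independent, which is exactly the assumption the rest of the talk relaxes.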
               Standard Model

• Hierarchical structure: locations of cells
  within crypts within rats within times

• In our experiment, the residuals from fits at the
  crypt level are essentially white noise

• However, we also measured the location of the
  colonic crypts
    Crypt Locations at 24 Hours, Nominal

[Figure: crypt locations at 24 hours. Scale: 1000's of microns]
               Standard Model

• Hypothesis: it is biologically plausible that the
  nearer the crypts to one another, the greater the
  relationship of overall p27 expression.

• Expectation: The effect of the carcinogen
  might well alter the relationship over time

• Technically: Functional data where the
  functions are themselves correlated
                 Basic Model

• Two-Levels: Rat and crypt level functions

• Rat-Level: Modeled either nonparametrically or semiparametrically

• Semiparametrically: low-order regression

• Nonparametrically: some version of a kernel
                          Basic Model

 • Crypt-Level: A regression spline, with few
   knots, in a parametric mixed-model representation

 • Linear spline (in practice, we used a quadratic spline):

   $$\theta_{rc}(x) = \beta_{rc,I} + x\,\beta_{rc,L} + C(x)\,\beta_{rc,S}$$

   $C(x)$ = spline basis functions

 • The covariance matrix of the quadratic part used
   the Pourahmadi-Daniels construction:

   $$\operatorname{cov}(\beta_{rc}) = \Sigma_S = \begin{pmatrix} \Sigma_p & 0 \\ 0 & \sigma_p^2 I \end{pmatrix}$$
                               
                                David Ruppert   Matt Wand


Please buy our book! Or steal it 
                    Basic Model

• Crypt-Level: We modeled the functions across
  crypts semiparametrically as regression splines
• The covariance matrix of the parameters of the
  spline modeled as separable
• Matern family used to model correlations
      Matern correlation: $\rho(d, \phi, m)$

      $$\rho(d, \phi, m) = \frac{1}{2^{m-1}\,\Gamma(m)} \left(\frac{2 d\, m^{1/2}}{\phi}\right)^{m} K_m\!\left(\frac{2 d\, m^{1/2}}{\phi}\right)$$

      $K_m$ = modified Bessel function
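
The Matern correlation can be evaluated numerically with nothing beyond the standard library. The sketch below is one way to do it under stated assumptions: `phi` is a placeholder name for the Matern parameter (its symbol did not survive in the slide text), and the Bessel function is computed from the integral representation K_v(z) = ∫_0^∞ exp(-z cosh t) cosh(v t) dt with a trapezoid rule, as a stand-in for a library routine such as `scipy.special.kv`.

```python
import math

def bessel_k(order, z, upper=20.0, steps=4000):
    """Modified Bessel function K_order(z), z > 0, by the trapezoid rule on
    the integral of exp(-z * cosh(t)) * cosh(order * t) over [0, upper].
    Intended for moderate z; very small z would need a larger upper limit."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        term = math.exp(-z * math.cosh(t)) * math.cosh(order * t)
        total += term if 0 < i < steps else 0.5 * term   # half weight at the ends
    return total * h

def matern_corr(d, phi, m):
    """Matern correlation at distance d with parameter phi and order m."""
    if d == 0:
        return 1.0
    z = 2.0 * d * math.sqrt(m) / phi
    return z ** m * bessel_k(m, z) / (2.0 ** (m - 1.0) * math.gamma(m))

# With order 0.5 the formula reduces to the exponential correlation
# exp(-sqrt(2) * d / phi).
check = abs(matern_corr(1.0, 2.0, 0.5) - math.exp(-math.sqrt(2.0) / 2.0))
```

In this parameterization, smaller orders put more of the decay close to distance zero, so the choice of order affects how quickly nearby crypts decorrelate.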
                        Basic Model

• Crypt-Level: regression spline, few knots
• Separable covariance structure
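   $$\theta_{rc}(x) = \beta_{rc,I} + \beta_{rc,L}\,x + C(x)\,\beta_{rc,S}$$

   $$\operatorname{cov}(\beta_{ri}, \beta_{rj}) = \begin{pmatrix} 1 & \rho\{d(i,j), \phi, m\} \\ \rho\{d(i,j), \phi, m\} & 1 \end{pmatrix} \otimes \Sigma_S$$

   $d(i,j)$ = distance between crypts $(i, j)$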
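
The separable structure can be seen by building the covariance of two crypts' stacked coefficient vectors as a Kronecker product. The sketch below is illustrative only: the correlation value 0.4 and the 2 x 2 coefficient covariance are made-up numbers, and `kron` is a small helper written here rather than a library call.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]

rho = 0.4                                  # hypothetical correlation between crypts i and j
between_crypts = [[1.0, rho], [rho, 1.0]]  # depends only on the distance d(i, j)
within_crypt = [[2.0, 0.5], [0.5, 1.0]]    # hypothetical covariance of the spline coefficients

cov = kron(between_crypts, within_crypt)   # 4 x 4 covariance of the stacked coefficients
```

Each off-diagonal block is rho times the within-crypt covariance, so distance changes only the strength of the dependence between crypts, not its shape.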
      Theory for Parametric Version

• The semiparametric approach we used
  fits formally into a new general theory

• Recall that we fit the marginal function
  at the rat level “nonparametrically”.

• At the crypt level, we used a
  parametric mixed-model representation
  of low-order regression splines

                                             Xihong Lin
             General Formulation

• Yij = Response
• Xij, Zij = cell and crypt locations
• Likelihood (or criterion function)

       $$\mathcal{L}\{Y_i, Z_i, \beta, \theta(X_{i1}), \ldots, \theta(X_{im})\}$$
• The key is that the function θ() is evaluated
  multiple times for each rat
• This distinguishes it from standard
  semiparametric models
• The goal is to estimate θ() and β efficiently
     General Formulation: Overview

• Likelihood (or criterion function)

       $$\mathcal{L}\{Y_i, Z_i, \beta, \theta(X_{i1}), \ldots, \theta(X_{im})\}$$
• For iid versions of these problems, we have
  developed constructive kernel-based methods
  of estimation with
   • Asymptotic expansions and inference available
• If the criterion function is a likelihood function,
  then the methods are semiparametric efficient.
   • Methods avoid solving integral equations
    General Formulation: Overview

• We also show
  • The equivalence of profiling and backfitting
  • Pseudo-likelihood calculations
• The results were submitted 7 months ago, and
  we confidently expect 1st reviews within the
  next year or two, depending on when the
  referees open their mail
     General Formulation: Overview

• In the application, modifications necessary both
  for theoretical and practical purposes
• Techniques described in the talk of Tanya
  Apanasovich can be used here as well to cut
  down on the computational burden due to the
  correlated functions.
             Nonparametric Fits

• Equal Spacing: Assume cell locations are
  equally spaced

• Define $V(x_1, x_2, \Delta)$ = covariance between crypt-level
  functions that are $\Delta$ apart, one at location
  $x_1$ and the other at location $x_2$

• Assume separable covariance structure

            $$V(x_1, x_2, \Delta) = G(x_1, x_2)\,\rho(\Delta)$$
             Nonparametric Fits

• Discrete Version: Pretend $\Delta$, $x_1$ and $x_2$ take on
  a small discrete set of values (we actually use a
  kernel-version of this idea)
• Form the sample covariance matrix per rat at $\Delta$,
  $x_1$ and $x_2$, then average across rats.
• Call this estimate
                   $$\hat{V}(x_1, x_2, \Delta)$$
             Nonparametric Fits

• Separability: Now use the separability to get a
  rough estimate of the correlation surface.

          $$V(x_1, x_2, \Delta) = G(x_1, x_2)\,\rho(\Delta)$$

          $$\tilde{\rho}(\Delta) = \frac{\sum_{x_1, x_2} \hat{V}(x_1, x_2, \Delta)}{\sum_{x_1, x_2} \hat{V}(x_1, x_2, 0)}$$
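
The ratio on this slide is easy to sketch in the discrete setting. The code below assumes the averaged sample covariances are stored in a hypothetical dictionary `v_hat` that maps each separation Δ to a matrix over the (x1, x2) grid. The numbers are synthetic, built to be exactly separable so that the ratio recovers the correlation used to build them.

```python
def rough_correlation(v_hat):
    """Rough estimate rho(delta) = sum V_hat(., ., delta) / sum V_hat(., ., 0).

    v_hat maps each separation delta to a matrix (list of rows) over (x1, x2).
    """
    denom = sum(sum(row) for row in v_hat[0])
    return {delta: sum(sum(row) for row in mat) / denom
            for delta, mat in v_hat.items()}

# Synthetic, exactly separable input: V(x1, x2, delta) = G(x1, x2) * rho(delta).
g = [[1.0, 0.5], [0.5, 2.0]]
true_rho = {0: 1.0, 1: 0.6, 2: 0.3}
v_hat = {d: [[g_ij * r for g_ij in row] for row in g] for d, r in true_rho.items()}

rho_tilde = rough_correlation(v_hat)
```

With real data the sample covariances are noisy, so the output is only a rough estimate and need not be a valid correlation function.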
             Nonparametric Fits

• The rough estimate $\tilde{\rho}(\Delta)$ is not a proper
  correlation function
• We fixed it up using a trick due to Peter Hall
  (1994, Annals), thus forming $\hat{\rho}(\Delta)$, a real
  correlation function
• Basic idea is to do a Fourier transform, force it
  to be non-negative, then invert
• Slower rates of convergence than the parametric
  fit, more variability, etc.
• Asymptotics worked out (non-trivial)
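
The transform, truncate, invert idea can be sketched for correlations on equally spaced lags. This is a simplified discrete-lag illustration of the idea as described on the slide, not the estimator used in the analysis. The function name, the midpoint frequency grid, and the final rescaling to 1 at lag 0 are all assumptions.

```python
import math

def make_valid_correlation(rho, n_freq=512):
    """rho[k] is a rough correlation at lag k, with rho[0] equal to 1."""
    lags = range(len(rho))
    # Midpoint grid of frequencies on (0, pi).
    omegas = [(j + 0.5) * math.pi / n_freq for j in range(n_freq)]
    # Cosine (Fourier) transform of the even sequence rho.
    spec = [rho[0] + 2.0 * sum(rho[k] * math.cos(w * k) for k in lags if k > 0)
            for w in omegas]
    spec_pos = [max(s, 0.0) for s in spec]      # force the transform to be non-negative
    # Invert: (1 / pi) * integral over (0, pi) of spec(w) * cos(w * k) dw.
    fixed = [sum(s * math.cos(w * k) for s, w in zip(spec_pos, omegas)) / n_freq
             for k in lags]
    return [r / fixed[0] for r in fixed]        # rescale so that lag 0 equals 1

bad = [1.0, 0.9, -0.9]                          # not a valid correlation sequence
fixed = make_valid_correlation(bad)
```

Because the output is a non-negative combination of cosines, every quadratic form in it is non-negative, which is the defining property of a correlation function. A sequence that is already valid passes through essentially unchanged.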
      Semiparametric Method Details

• In the example, a cubic polynomial actually
  suffices for the rat-level functions
• Maximize in the covariance structure of the
  crypt-level splines, including Matern-order
• The covariance structure includes
  • Matern order m (gridded)
  • Matern parameter $\phi$
  • Spline smoothing parameter
  • Quadratic part’s covariance matrix (Pourahmadi-Daniels construction)
  • AR(1) for residuals
               Results: Part I

• Matern Order: Matern order of 0.5 is the classic
  autoregressive model
• Our Finding: For all times, an order of about
  0.15 was the maximizer
• Simulations: Can distinguish from
• Different Matern orders lead to different
  interpretations about the extent of the
      Marginal over time Spline Fits

[Figure: marginal spline fits; axes: time, location]

      24-Hour Fits: The Matern order matters

      Nonparametric and Semiparametric Fits at 24 Hours
              Comparison of Fits

• The semiparametric and nonparametric fits are
  roughly similar
• At 200 microns, quite far apart, the correlations
  in the functions are approximately 0.40 for both
• Surprising degree of correlation

                  Summary

• We have studied the problem of crypt-signaling
  in colon carcinogenesis experiments
• Technically, this is a problem of hierarchical
  functional data where the functions are not
  independent in the standard manner
• We developed (efficient, constructive)
  semiparametric and nonparametric methods,
  with asymptotic theory
• The correlations we see in the functions are
  surprisingly large.
     Statistical Collaborators

Yehua Li           Naisyin Wang

for this
