# Ohio State Talk, October 2004

Semiparametric Methods for Colonic
Crypt Signaling

Raymond J. Carroll
Department of Statistics
Faculty of Nutrition and Toxicology

Texas A&M University
http://stat.tamu.edu/~carroll
Outline

• Problem: Modeling correlations among
functions, with application to a colon
carcinogenesis experiment
• Biological background
• Semiparametric framework
• Nonparametric Methods
• Asymptotic summary
• Analysis
• Summary
Biologist Who Did The Work

Meeyoung Hong

Postdoc in the lab of
Joanne Lupton and
Nancy Turner

Data collection
represents a year of
work
Basic Background

• Apoptosis: Programmed cell death
• Cell Proliferation: Effectively the opposite
• p27: Differences in this marker are thought to
stimulate and be predictive of apoptosis and cell
proliferation
• Our experiment: understand some of the
structure of p27 in the colon when animals are
exposed to a carcinogen
Data Collection

• Structure of Colon
• Note the finger-like
projections
• These are colonic
crypts
• We measure
expression of cells
within colonic crypts
Data Collection

• p27 expression:
Measured by staining
techniques
• Brighter intensity = higher
expression
• Done on a cell by cell
basis within selected
colonic crypts
• Very time intensive
Data Collection

• Animals sacrificed at 4 times: 0 = control,
12hr, 24hr and 48hr after exposure
• Rats: 12 at each time period
• Crypts: 20 are selected
• Cells: all cells collected, about 30 per crypt
• p27: measured on each cell, with logarithmic
transformation
Nominal Cell Position

• X = nominal cell
position
• Differentiated
cells: at top, X =
1.0
• Proliferating
cells: in middle,
X=0.5
• Stem cells: at
bottom, X=0
Standard Model

• Hierarchical structure: cells within crypts
within rats within times

$$Y_{rc}(x) = \eta(x) + \gamma_r(x) + \theta_{rc}(x) + \epsilon_{rc}(x)$$

$\eta(x) + \gamma_r(x)$ = rat-level function

$\theta_{rc}(x)$ = crypt-level functions,
typically assumed independent
Standard Model

• Hierarchical structure: cell locations of cells
within crypts within rats within times

• In our experiment, the residuals from fits at the
crypt level are essentially white noise

• However, we also measured the location of the
colonic crypts
Crypt Locations at 24 Hours, Nominal zero

[Figure: crypt locations at 24 hours. Scale: 1000's of microns]

• Our interest: relationships at distances
between 25 and 200 microns
Standard Model

• Hypothesis: it is biologically plausible that the
nearer the crypts to one another, the greater the
relationship of overall p27 expression.

• Expectation: The effect of the carcinogen
might well alter the relationship over time

• Technically: Functional data where the
functions are themselves correlated
Basic Model

• Two Levels: Rat-level and crypt-level functions

• Rat-Level: Modeled either nonparametrically or
semiparametrically

• Semiparametrically: low-order regression
spline

• Nonparametrically: some version of a kernel
fit
Basic Model

• Crypt-Level: A regression spline, with few
knots, in a parametric mixed-model
formulation
Linear spline:

$$\theta_{rc}(x) = \beta_{rc,I} + x\,\beta_{rc,L} + C(x)\,\beta_{rc,S}$$

$C(x)$ = spline basis functions

Covariance matrix of the coefficients (Daniels construction):

$$\mathrm{cov}(\beta_{rc}) = \Sigma_S = \begin{pmatrix} \Sigma_p & 0 \\ 0 & \sigma_p^2 I \end{pmatrix}$$

In practice, we used the approach of the book with David Ruppert and Matt Wand:

http://stat.tamu.edu/~carroll/semiregbook/
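
To make the linear-spline representation concrete, here is a minimal sketch. It assumes a truncated-line basis C(x) = (x - knot)_+, which is one common choice and is not confirmed by the slides. The knots and coefficient values are hypothetical, chosen only for illustration.

```python
import numpy as np


def linear_spline_design(x, knots):
    """Design matrix [1, x, C(x)] with assumed basis C(x) = (x - knot)_+."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    C = np.maximum(x[:, None] - knots[None, :], 0.0)
    return np.column_stack([np.ones_like(x), x, C])


# About 30 cells per crypt, nominal positions from 0 (bottom) to 1 (top)
x = np.linspace(0.0, 1.0, 30)
knots = np.array([0.25, 0.5, 0.75])           # hypothetical knots
X = linear_spline_design(x, knots)

# Hypothetical coefficients: intercept, linear term, then one per knot
coef = np.array([1.0, 0.5, -0.3, 0.2, 0.1])
theta = X @ coef                              # crypt-level function at each cell
print(X.shape, theta[0], theta[-1])
```

In the mixed-model formulation the coefficients are random effects for each crypt, so each crypt gets its own curve from the same small design matrix.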

Basic Model

• Crypt-Level: We modeled the functions across
crypts semiparametrically as regression
splines
• The covariance matrix of the parameters of the
spline modeled as separable
• Matern family used to model correlations
Matern correlation: $\rho(d, \alpha, m)$

$$\rho(d, \alpha, m) = \frac{1}{2^{m-1}\,\Gamma(m)} \left(\frac{2 d\, m^{1/2}}{\alpha}\right)^{m} K_m\!\left(\frac{2 d\, m^{1/2}}{\alpha}\right)$$

$K_m$ = Modified Bessel function
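
A short sketch of evaluating the Matern correlation numerically. It assumes the parametrization rho(d) = z^m K_m(z) / (2^(m-1) Gamma(m)) with z = 2 d sqrt(m) / alpha. The name `alpha` for the range parameter and the distances used are assumptions for illustration.

```python
import numpy as np
from scipy.special import gamma, kv


def matern_corr(d, alpha, m):
    """Matern correlation at distance d, range alpha, order m (assumed form)."""
    d = np.atleast_1d(np.asarray(d, dtype=float))
    z = 2.0 * d * np.sqrt(m) / alpha
    rho = np.ones_like(z)                     # rho(0) = 1 by continuity
    pos = z > 0
    rho[pos] = z[pos] ** m * kv(m, z[pos]) / (2.0 ** (m - 1.0) * gamma(m))
    return rho


dists = np.array([0.0, 25.0, 100.0, 200.0])   # microns, illustrative
alpha = 100.0                                 # hypothetical range parameter

rho_half = matern_corr(dists, alpha, m=0.5)
rho_exp = np.exp(-np.sqrt(2.0) * dists / alpha)
print(rho_half)
print(rho_exp)   # order 0.5 reduces to the exponential model
```

The last two printed lines agree, which is the sense in which order 0.5 is the autoregressive case.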
Basic Model

• Crypt-Level: regression spline, few knots
• Separable covariance structure
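$$\theta_{rc}(x) = \beta_{rc,I} + \beta_{rc,L}\,x + C(x)\,\beta_{rc,S}$$

$$\mathrm{cov}(\beta_{ri}, \beta_{rj}) = \begin{pmatrix} 1 & \rho\{d(i,j), \alpha, m\} \\ \rho\{d(i,j), \alpha, m\} & 1 \end{pmatrix} \otimes \Sigma_S$$

$d(i,j)$ = distance between crypts $i$ and $j$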
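
The separable structure can be sketched as a Kronecker product: a crypt-by-crypt Matern correlation matrix times the covariance matrix of the spline coefficients. The crypt coordinates, the range parameter `alpha`, and the entries of `Sigma_S` below are all hypothetical.

```python
import numpy as np
from scipy.special import gamma, kv


def matern_corr(d, alpha, m):
    """Matern correlation, assumed form z^m K_m(z) / (2^(m-1) Gamma(m))."""
    d = np.atleast_1d(np.asarray(d, dtype=float))
    z = 2.0 * d * np.sqrt(m) / alpha
    rho = np.ones_like(z)
    pos = z > 0
    rho[pos] = z[pos] ** m * kv(m, z[pos]) / (2.0 ** (m - 1.0) * gamma(m))
    return rho


# Hypothetical crypt coordinates (microns) within one rat
coords = np.array([[0.0, 0.0], [50.0, 0.0], [50.0, 120.0]])
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
R = matern_corr(dist, alpha=100.0, m=0.5)     # between-crypt correlations

# Hypothetical covariance of one crypt's spline coefficients
Sigma_S = np.array([[1.0, 0.2, 0.0],
                    [0.2, 0.5, 0.0],
                    [0.0, 0.0, 0.1]])

# Joint covariance of all coefficients, ordered crypt by crypt
cov_all = np.kron(R, Sigma_S)
print(cov_all.shape)
```

Each off-diagonal block of `cov_all` is `Sigma_S` scaled by the correlation between the two crypts, so distance enters only through the Matern term.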
Theory for Parametric Version

• The semiparametric approach we used
fits formally into a new general theory

• Recall that we fit the marginal function
at the rat level “nonparametrically”.

[Photo: Xihong Lin]

• At the crypt level, we used a
parametric mixed-model representation
of low-order regression splines
General Formulation

• Yij = Response
• Xij, Zij = cell and crypt locations
• Likelihood (or criterion function)

$$\mathcal{L}\{Y_i, Z_i, \beta, \theta(X_{i1}), \ldots, \theta(X_{im})\}$$
• The key is that the function θ() is evaluated
multiple times for each rat
• This distinguishes it from standard
semiparametric models
• The goal is to estimate θ() and β efficiently
General Formulation: Overview

• Likelihood (or criterion function)

$$\mathcal{L}\{Y_i, Z_i, \beta, \theta(X_{i1}), \ldots, \theta(X_{im})\}$$
• For iid versions of these problems, we have
developed constructive kernel-based methods
of estimation, with asymptotic expansions and
inference available
• If the criterion function is a likelihood function,
then the methods are semiparametric efficient.
• Methods avoid solving integral equations
General Formulation: Overview

• We also show
• The equivalence of profiling and backfitting
• Pseudo-likelihood calculations
• The results were submitted 7 months ago, and
we confidently expect 1st reviews within the
next year or two, depending on when the
referees open their mail
General Formulation: Overview

• In the application, modifications are necessary
for both theoretical and practical purposes
• Techniques described in the talk of Tanya
Apanasovich can be used here as well to cut
down on the computational burden due to the
correlated functions.
Nonparametric Fits

• Equal Spacing: Assume cell locations are
equally spaced

• Define V(x1, x2, Δ) = covariance between
crypt-level functions that are Δ apart, one at
location x1 and the other at location x2

• Assume separable covariance structure

V(x1 ,x2 ,Δ) =G(x1 ,x2 ) ρ(Δ)
Nonparametric Fits

• Discrete Version: Pretend Δ, x1 and x2 take on
a small discrete set of values (we actually use a
kernel version of this idea)
• Form the sample covariance matrix per rat at Δ,
x1 and x2, then average across rats.
• Call this estimate $\hat{V}(x_1, x_2, \Delta)$
Nonparametric Fits

• Separability: Now use the separability to get a
rough estimate of the correlation surface.

$$V(x_1, x_2, \Delta) = G(x_1, x_2)\,\rho(\Delta)$$

$$\tilde{\rho}(\Delta) = \frac{\sum_{x_1, x_2} \hat{V}(x_1, x_2, \Delta)}{\sum_{x_1, x_2} \hat{V}(x_1, x_2, 0)}$$
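
A minimal simulation sketch of the ratio estimate. The assumptions are not from the talk: crypts sit on an equally spaced line so that Δ is an integer lag, the crypt-level functions have mean zero, and the true correlation is 0.6 to the power of the lag. All sizes and the matrix `G` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rats, n_crypts, n_x = 2000, 6, 4            # hypothetical sizes
true_rho = 0.6 ** np.arange(n_crypts)         # assumed correlation at each lag
G = 0.5 * np.eye(n_x) + 0.5                   # assumed G(x1, x2)

# Separable covariance: (lag correlation matrix) Kronecker G
lags = np.abs(np.subtract.outer(np.arange(n_crypts), np.arange(n_crypts)))
cov = np.kron(true_rho[lags], G)
theta = rng.multivariate_normal(np.zeros(n_crypts * n_x), cov, size=n_rats)
theta = theta.reshape(n_rats, n_crypts, n_x)  # theta[rat, crypt, x]


def v_hat(theta, lag):
    """Average cross-product of crypt functions `lag` apart, over rats and pairs."""
    a = theta[:, : theta.shape[1] - lag, :]
    b = theta[:, lag:, :]
    # Mean-zero functions are assumed, so cross-products estimate covariances
    return np.einsum("rcx,rcy->xy", a, b) / (a.shape[0] * a.shape[1])


v0 = v_hat(theta, 0)
rho_rough = np.array([v_hat(theta, k).sum() / v0.sum() for k in range(n_crypts)])
print(np.round(rho_rough, 3))
print(np.round(true_rho, 3))
```

Summing over (x1, x2) before taking the ratio cancels G under separability, leaving a rough estimate of the correlation at each lag.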
Nonparametric Fits

• The rough estimate $\tilde{\rho}(\Delta)$ is not a proper
correlation function
• We fixed it up using a trick due to Peter Hall
(1994, Annals), thus forming $\hat{\rho}(\Delta)$, a real
correlation function
• Basic idea is to do a Fourier transform, force it
to be non-negative, then invert
• Slower rates of convergence than the parametric
fit, more variability, etc.
• Asymptotics worked out (non-trivial)
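
The transform, truncate and invert idea can be sketched with a discrete Fourier transform on equally spaced lags. This is a discrete analogue written for illustration, not the procedure from the cited paper, and the rough estimate below is a made-up example.

```python
import numpy as np


def make_valid_correlation(rho_rough):
    """Clip the spectrum of an even circular extension at zero, then invert."""
    rho_rough = np.asarray(rho_rough, dtype=float)
    k = rho_rough.size
    # Even extension: lags 0..k-1 followed by lags k-2..1, so the FFT is real
    extended = np.concatenate([rho_rough, rho_rough[-2:0:-1]])
    spectrum = np.fft.fft(extended).real
    spectrum = np.clip(spectrum, 0.0, None)   # force non-negativity
    fixed = np.fft.ifft(spectrum).real[:k]
    return fixed / fixed[0]                   # rescale so the value at lag 0 is 1


def toeplitz_from(r):
    """Correlation matrix with entry (i, j) equal to r[|i - j|]."""
    r = np.asarray(r, dtype=float)
    idx = np.abs(np.subtract.outer(np.arange(r.size), np.arange(r.size)))
    return r[idx]


rho_rough = np.array([1.0, 0.9, 0.2, 0.1, 0.0])   # hypothetical rough estimate
rho_fixed = make_valid_correlation(rho_rough)

print(np.linalg.eigvalsh(toeplitz_from(rho_rough)).min())  # negative: not valid
print(np.linalg.eigvalsh(toeplitz_from(rho_fixed)).min())  # non-negative up to rounding
```

The repaired sequence gives a positive semi-definite correlation matrix because that matrix is a principal block of a circulant matrix whose eigenvalues are the clipped spectrum.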
Semiparametric Method Details

• In the example, a cubic polynomial actually
suffices for the rat-level functions
• Maximize over the covariance structure of the
crypt-level splines, including the Matern order
• The covariance structure includes
• Matern order m (gridded)
• Matern parameter α
• Spline smoothing parameter
• Spline coefficient covariance matrix (Daniels construction)
• AR(1) for residuals
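
A heavily simplified sketch of maximizing over a gridded Matern order. It assumes one mean-zero, unit-variance Gaussian value per crypt and a plain Gaussian likelihood, and it omits the spline coefficients, the smoothing parameter and the AR(1) residuals listed above. The crypt locations, the grid, the bounds and the name `alpha` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gamma, kv


def matern_corr(d, alpha, m):
    """Matern correlation, assumed form z^m K_m(z) / (2^(m-1) Gamma(m))."""
    d = np.atleast_1d(np.asarray(d, dtype=float))
    z = 2.0 * d * np.sqrt(m) / alpha
    rho = np.ones_like(z)
    pos = z > 0
    rho[pos] = z[pos] ** m * kv(m, z[pos]) / (2.0 ** (m - 1.0) * gamma(m))
    return rho


def neg_loglik(alpha, m, dist, y):
    """Gaussian negative log-likelihood up to a constant; rows of y are rats."""
    R = matern_corr(dist, alpha, m) + 1e-8 * np.eye(dist.shape[0])  # small jitter
    _, logdet = np.linalg.slogdet(R)
    quad = np.sum(y * np.linalg.solve(R, y.T).T)
    return 0.5 * (y.shape[0] * logdet + quad)


rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 500.0, size=(20, 2))   # hypothetical crypt locations
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
y = rng.multivariate_normal(np.zeros(20), matern_corr(dist, 150.0, 0.5), size=12)

m_grid = [0.15, 0.25, 0.5, 1.0]                  # gridded Matern order
fits = []
for m in m_grid:
    res = minimize_scalar(neg_loglik, bounds=(10.0, 1000.0),
                          args=(m, dist, y), method="bounded")
    fits.append((res.fun, m, res.x))
best_nll, best_m, best_alpha = min(fits)
print(best_m, best_alpha)
```

Gridding the order keeps each inner optimization one-dimensional, at the cost of only comparing the orders on the grid.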
Results: Part I

• Matern Order: Matern order of 0.5 is the classic
autoregressive model
• Our Finding: For all times, an order of about
0.15 was the maximizer
• Simulations: Can distinguish from
autocorrelation
• Different Matern orders lead to different
interpretations about the extent of the
correlations
Marginal over time Spline Fits

Note:
• Time effects
• Location effects

24-Hour Fits: The Matern order matters

Nonparametric and Semiparametric Fits at 24 Hours
Comparison of Fits

• The semiparametric and nonparametric fits are
roughly similar
• At 200 microns, quite far apart, the correlations
in the functions are approximately 0.40 for both
• Surprising degree of correlation
Summary

• We have studied the problem of crypt-signaling
in colon carcinogenesis experiments
• Technically, this is a problem of hierarchical
functional data where the functions are not
independent in the standard manner
• We developed (efficient, constructive)
semiparametric and nonparametric methods,
with asymptotic theory
• The correlations we see in the functions are
surprisingly large.
Statistical Collaborators

Yehua Li           Naisyin Wang
Summary

Inspiration for this work: Water goanna,
Kimberley Region, Australia
