# HOMEWORK ON BANDWIDTH CHOICE (due Tue Feb 20)

```
HOMEWORK ON BANDWIDTH CHOICE (due Tue Feb 20)

Suppose that we have data $Y_i$, $x_i$, i = 1, ..., n with $Y_i = m(x_i) + \epsilon_i$, with the
$\epsilon_i$’s independent with mean zero and variance $\sigma^2$. We’ll estimate m using a local
quadratic fit with kernel K. We can show that, for $x_i$’s equally spaced on [0, 1],
under regularity conditions,

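$$\mathrm{Bias}(\hat m'(x)) \sim h^2 \, \frac{m'''(x)}{6} \, \frac{\int u^4 K(u)\,du}{\int u^2 K(u)\,du} \equiv h^2 \, \frac{m'''(x)}{6} \, K_b$$

$$\mathrm{var}(\hat m'(x)) \sim \frac{\sigma^2}{n h^3} \, \frac{\int u^2 K^2(u)\,du}{\left[\int u^2 K(u)\,du\right]^2} \equiv \frac{\sigma^2}{n h^3} \, K_v$$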

1a. What is the asymptotic integrated mean squared error? What is $h_{opt}$, the
value of h that minimizes the asymptotic integrated mean squared error?
(You don’t need to do any fancy theory here - assume that you can just
go ahead and use the above formulae.)
1b. Suppose that $K(u) = (1/\sqrt{2\pi}) \exp(-u^2/2)$, that is, K is the standard
normal kernel. Find the values of $K_b$ and $K_v$ in the formulae for the
asymptotic bias and variance. (For your calculations, you can reference a
probability book, Wikipedia, or another source for moments of a normal distribution.
But do state your reference, and exactly what you are getting from the
reference.)
1c. Suppose that $m(x) = a \exp(bx) + cx + d$. What is the value of $h_{opt}$ from
part 1a? (It may depend on a, b, c, d.)

Data analysis: For 2a)-2b) use the dataset used in class - onebms.txt - that
has body mass as a function of week (weeks -1 to 60 with some missing). For
ease, rescale to (0,1) and assume that these $x_i$’s are evenly spaced. You’ll estimate
the growth rate (the derivative of the regression function) using locpoly,
with local quadratic and normal kernel (assume locpoly uses the standard normal).

2a. Find a “first generation” rule of thumb value of h as follows. Estimate a,
b, c and d in the model $m(x) = a \exp(bx) + cx + d$ using non-linear least
squares (see my R code - I had trouble with R’s nls, so wrote this). Use
the estimated m, $\hat m(x) = \hat a \exp(\hat b x) + \hat c x + \hat d$, and Rice’s estimate of $\sigma^2$ to
plug into the $h_{opt}$ from part 1c above. What is your estimate of $h_{opt}$?
2b. Use locpoly to estimate $m'$, using the h from 2a.

For 3a)-3c), use a different dataset: the R data set beav2, body temperatures
of a beaver, taken every 10 minutes.
library(MASS)
help(beav2) # gives information
x <- (1:100)/101; y <- beav2$temp                 ## gives the temperature
plot(x,y)
Smooth y in three ways:
3a. Use KernSmooth’s dpill choice of h in locpoly, for a local linear estimator.
3b. Use smooth.spline with default choice of smoothing (gcv) - you don’t
need to code gcv.
3c. Use the cross-validation R code I wrote for lecture, to choose a bandwidth
for a local linear estimator.
For 3a)-3c): hand in a plot showing the data and your three estimates (or three
separate plots, if you like). For each estimate, how many effective parameters
did you use? (For some of these, you may need to use the hatmatrix R code.)


```
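For 3a and 3b, the library calls might fit together roughly as in the sketch below. It assumes the MASS and KernSmooth packages are installed; the object names (`h_dpill`, `fit_ll`, `fit_ss`) are illustrative and not part of the assignment. Part 3c depends on the lecture's cross-validation code, which is not reproduced here, so it is left out.

```r
# Sketch for 3a and 3b; object names are illustrative assumptions.
library(MASS)        # provides the beav2 data set
library(KernSmooth)  # provides dpill() and locpoly()

x <- (1:100) / 101
y <- beav2$temp

# 3a: plug-in bandwidth from dpill(), then a local linear fit (degree = 1).
h_dpill <- dpill(x, y)
fit_ll  <- locpoly(x, y, degree = 1, kernel = "normal", bandwidth = h_dpill)

# 3b: smoothing spline; cv = FALSE (the default) selects the smoothing
# parameter by generalized cross-validation.
fit_ss <- smooth.spline(x, y, cv = FALSE)

# Data with both estimates overlaid.
plot(x, y, xlab = "rescaled time", ylab = "temperature")
lines(fit_ll$x, fit_ll$y, lty = 1)
lines(fit_ss$x, fit_ss$y, lty = 2)
legend("topleft", lty = 1:2,
       legend = c("local linear (dpill)", "smooth.spline (GCV)"))

# Equivalent degrees of freedom reported by smooth.spline.
fit_ss$df
```

The `df` component of the `smooth.spline` fit is its equivalent degrees of freedom, which is one natural reading of "effective parameters" for 3b. `locpoly` returns only the fitted curve on a grid, so the local linear estimates would need the hat matrix code mentioned in the assignment.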