
Year 2008

Solution to exercise 1
a) Assume that we have a sample of size $n$ ($x_1, x_2, \dots, x_n$) independently drawn from a population with the gamma probability density

$$f(x \mid \alpha, \beta) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}, \qquad 0 < x < \infty.$$

Assuming that $\alpha$ is constant, find the maximum likelihood estimator for $\beta$. What are the observed and expected information?

Solution:
Maximum likelihood estimator:
Note: $\alpha$ is also called the shape parameter and $\beta$ the scale parameter.
The likelihood function for $n$ observations is:

$$L(x_1,\dots,x_n \mid \alpha,\beta) = \prod_{i=1}^{n} f(x_i \mid \alpha,\beta) = \frac{1}{\Gamma(\alpha)^n\,\beta^{n\alpha}} \left(\prod_{i=1}^{n} x_i\right)^{\alpha-1} e^{-\sum_i x_i/\beta}$$
The log-likelihood function has the form:

$$l(x_1,\dots,x_n \mid \alpha,\beta) = -n\alpha \ln\beta - n\ln\Gamma(\alpha) + (\alpha-1)\sum_i \ln x_i - \frac{1}{\beta}\sum_i x_i$$

Using the fact that $\alpha$ is a constant, taking the derivative of this function with respect to $\beta$ and solving the maximum likelihood equation gives:

$$\frac{dl}{d\beta} = -\frac{n\alpha}{\beta} + \frac{1}{\beta^2}\sum_i x_i = 0 \;\Longrightarrow\; \hat{\beta} = \frac{\sum_i x_i}{n\alpha} = \frac{\bar{x}}{\alpha}$$

Since there is only one solution to this equation, it is the maximum likelihood estimator. So the maximum likelihood estimator is the mean value of the observations divided by $\alpha$.
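The estimator above can be checked on simulated data; a minimal sketch, assuming illustrative values for $\alpha$, $\beta$, and the sample size (none of these come from the exercise):

```python
import random

# Illustrative parameters (assumed, not from the exercise)
alpha, beta_true, n = 2.0, 3.0, 100_000

random.seed(0)
# random.gammavariate(shape, scale) samples the gamma density used above
sample = [random.gammavariate(alpha, beta_true) for _ in range(n)]

# MLE for the scale with known shape: beta_hat = x_bar / alpha
beta_hat = sum(sample) / n / alpha
print(beta_hat)  # close to beta_true = 3.0
```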
Observed information:
The second derivative of the log-likelihood function is:

$$\frac{d^2 l}{d\beta^2} = \frac{n\alpha}{\beta^2} - \frac{2}{\beta^3}\sum_i x_i = \frac{n\alpha}{\beta^2} - \frac{2n\alpha\hat{\beta}}{\beta^3}$$
Then, using the definition of the observed information, we get:

$$I_o(\beta) = -\frac{d^2 l}{d\beta^2} = -\frac{n\alpha}{\beta^2} + \frac{2n\alpha\hat{\beta}}{\beta^3}$$

That is the observed information. The value of this function at the maximum is:

$$I_o(\hat{\beta}) = -\frac{n\alpha}{\hat{\beta}^2} + \frac{2n\alpha}{\hat{\beta}^2} = \frac{n\alpha}{\hat{\beta}^2}$$


The expected information is the expectation of the observed information, where the expectation is taken over the observations:

$$I(\beta) = E(I_o(\beta)) = E\!\left(-\frac{n\alpha}{\beta^2} + \frac{2}{\beta^3}\sum_i x_i\right) = -\frac{n\alpha}{\beta^2} + \frac{2}{\beta^3}\sum_i E(x_i)$$

To find the expected information (Fisher information) we need the expected value of a random variable with the gamma distribution. We can derive it by direct integration or via the moment generating function; here we use direct integration:

$$E(x) = \int_0^\infty x f(x \mid \alpha,\beta)\,dx = \frac{1}{\Gamma(\alpha)\beta^\alpha}\int_0^\infty x^{\alpha} e^{-x/\beta}\,dx = \frac{\Gamma(\alpha+1)\,\beta^{\alpha+1}}{\Gamma(\alpha)\,\beta^\alpha} = \alpha\beta$$

Here we used the fact that the function

$$\frac{x^{\alpha} e^{-x/\beta}}{\Gamma(\alpha+1)\,\beta^{\alpha+1}}$$

is the density of a gamma distribution (with shape $\alpha+1$), and therefore its integral equals one.
Now, using the expected value, we can write for the expected information:

$$I(\beta) = E(I_o(\beta)) = -\frac{n\alpha}{\beta^2} + \frac{2}{\beta^3}\, n\alpha\beta = \frac{n\alpha}{\beta^2}$$

Note that this has the same form as the observed information evaluated at the maximum, $I_o(\hat{\beta}) = n\alpha/\hat{\beta}^2$.

b)
The negative binomial distribution (here the geometric case) has the form:

$$f(k \mid p) = p^k (1-p), \qquad k = 0, 1, 2, \dots$$

This is the probability mass function for the number of successful trials before a failure occurs; the probability of success is $p$ and that of failure is $1-p$. Suppose we have a sample of $n$ independent points drawn from a population with this distribution ($n$ times we observed $k_i$ successes, and the $(k_i+1)$-st trial was a failure). Find the maximum likelihood estimator for $p$. What are the observed and expected information?

Solution:
Maximum likelihood estimator:
Let us write the likelihood function:
$$L(k_1,\dots,k_n \mid p) = \prod_{i=1}^{n} p^{k_i}(1-p) = (1-p)^n\, p^{\sum_i k_i}$$
The log-likelihood function has the form:

$$l(k_1,\dots,k_n \mid p) = n\ln(1-p) + \ln p \sum_i k_i$$
Taking the derivative with respect to $p$ and solving the maximum likelihood equation we get:

$$\frac{dl}{dp} = -\frac{n}{1-p} + \frac{\sum_i k_i}{p} = 0 \;\Longrightarrow\; -p\Bigl(n + \sum_i k_i\Bigr) + \sum_i k_i = 0 \;\Longrightarrow\; \hat{p} = \frac{\sum_i k_i}{n + \sum_i k_i} = \frac{\bar{k}}{1+\bar{k}}$$

Since there is only one solution, it is the maximum likelihood estimator. Here we used the notation for the average:

$$\bar{k} = \frac{\sum_i k_i}{n}$$
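As with part (a), the estimator can be sanity-checked on simulated data; a sketch with an assumed success probability and sample size:

```python
import random

p_true, n = 0.6, 50_000      # illustrative values, not from the exercise
random.seed(2)

def draw_k(p):
    # count successful trials (probability p) before the first failure
    k = 0
    while random.random() < p:
        k += 1
    return k

ks = [draw_k(p_true) for _ in range(n)]
k_bar = sum(ks) / n
p_hat = k_bar / (1 + k_bar)  # MLE derived above
print(p_hat)                 # close to p_true = 0.6
```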

Observed information:
Using the definition of the information we can write:

$$I_o(p) = -\frac{d^2 l}{dp^2} = n\left(\frac{1}{(1-p)^2} + \frac{\bar{k}}{p^2}\right) = n\,\frac{(1-p)^2\,\bar{k} + p^2}{(1-p)^2\, p^2}$$

That is the observed information.


Using the expression for the expected information:

$$I(p) = E(I_o(p)) = n\,\frac{(1-p)^2\, E(k) + p^2}{(1-p)^2\, p^2}$$
So to derive the expected information we need:

$$E(\bar{k}) = \frac{\sum_i E(k_i)}{n} = E(k)$$

Here we used the fact that all observations have the same expectation. The expectation itself can be derived using:

$$E(k) = \sum_{k=0}^{\infty} k\, p^k (1-p) = (1-p)\sum_{k=0}^{\infty} k\, p^k$$

Now we use the relation for the geometric series:

$$\sum_{k=0}^{\infty} p^k = \frac{1}{1-p}$$
Taking the derivative of both sides with respect to $p$:

$$\sum_{k=0}^{\infty} k\, p^{k-1} = \frac{1}{(1-p)^2}$$
Multiplying both sides by $p$, and then multiplying by the factor $1-p$, gives:

$$E(k) = (1-p)\sum_{k=0}^{\infty} k\, p^k = (1-p)\,\frac{p}{(1-p)^2} = \frac{p}{1-p}$$
Finally we can write:

$$I(p) = n\,\frac{(1-p)^2\,\frac{p}{1-p} + p^2}{(1-p)^2\, p^2} = n\,\frac{(1-p)\,p + p^2}{(1-p)^2\, p^2} = n\,\frac{p}{(1-p)^2\, p^2}$$

$$I(p) = \frac{n}{(1-p)^2\, p}$$
The information becomes larger as the value of $p$ approaches 0 or 1.
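This boundary behaviour can be read off by evaluating $I(p) = n/((1-p)^2 p)$ at a few points (the sample size $n$ and the values of $p$ below are illustrative):

```python
n = 100  # illustrative sample size

def fisher_info(p, n=n):
    # expected information for this model: n / ((1-p)^2 * p)
    return n / ((1 - p) ** 2 * p)

for p in (0.01, 0.5, 0.99):
    print(p, fisher_info(p))
# the information at p = 0.01 and p = 0.99 dwarfs the value at p = 0.5
```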


