Year 2008

                                       Solution to exercise 1
a) Assume that we have a sample of size n (x_1, x_2, ..., x_n) independently drawn from a
population with the gamma probability density

    f(x | \alpha, \beta) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}, \qquad 0 \le x < \infty

Assuming that α is constant, find the maximum likelihood estimator for β. What are the
observed and expected information?

     Solution:
     Maximum likelihood estimator:
Note: α is also called the shape parameter and β the scale parameter.
The likelihood function for n observations is:

    L(x_1, \ldots, x_n | \alpha, \beta) = \prod_{i=1}^{n} f(x_i | \alpha, \beta) = \frac{1}{\Gamma(\alpha)^{n}\,\beta^{n\alpha}} \prod_{i=1}^{n} x_i^{\alpha-1} e^{-x_i/\beta}
The log-likelihood function will have the form:

    l(x_1, \ldots, x_n | \alpha, \beta) = -n\alpha \ln(\beta) - n \ln(\Gamma(\alpha)) + (\alpha-1)\sum_i \ln x_i - \frac{1}{\beta}\sum_i x_i
Using the fact that α is a constant, taking the derivative of this function with respect to β and
solving the maximum likelihood equation gives:

    \frac{dl}{d\beta} = -\frac{n\alpha}{\beta} + \frac{1}{\beta^2}\sum_i x_i = 0 \;\Longrightarrow\; \hat{\beta} = \frac{\sum_i x_i}{n\alpha} = \frac{\bar{x}}{\alpha}

Since there is only one solution to this equation, it is the maximum likelihood estimator. So the
maximum likelihood estimator is the mean value of the observations divided by α.
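
As a quick numerical sanity check (not part of the original solution), here is a minimal Python
sketch that simulates gamma data with arbitrarily chosen α, β and n, and compares the
closed-form estimator x̄/α with a direct numerical maximization of the log-likelihood:

```python
# Illustrative sketch: simulate gamma data with known alpha and beta, then verify
# that beta_hat = mean(x) / alpha agrees with a numerical maximizer of the
# log-likelihood. All parameter values and the seed are arbitrary choices.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

rng = np.random.default_rng(0)
alpha, beta, n = 2.5, 1.7, 10_000            # illustrative values
x = rng.gamma(shape=alpha, scale=beta, size=n)

beta_hat = x.mean() / alpha                  # closed-form estimator derived above

def neg_loglik(b):
    # negative gamma log-likelihood in beta, with alpha held constant
    return -(-n * alpha * np.log(b) - n * gammaln(alpha)
             + (alpha - 1) * np.log(x).sum() - x.sum() / b)

beta_num = minimize_scalar(neg_loglik, bounds=(1e-6, 100.0), method="bounded").x
print(beta_hat, beta_num)                    # both should be close to the true beta
```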
   Observed information:
     The second derivative of the log likelihood function is:
      d 2 l n 2                 n 2n ˆ
              2  3  x i  2  3 
      d  2
                                    
Then using the definition of the observed information we get:

    I_o(\beta) = -\frac{d^2 l}{d\beta^2} = -\frac{n\alpha}{\beta^2} + \frac{2n\alpha\hat{\beta}}{\beta^3}
That is the observed information. Its value at the maximum is:

    I_o(\hat{\beta}) = -\frac{n\alpha}{\hat{\beta}^2} + \frac{2n\alpha\hat{\beta}}{\hat{\beta}^3} = \frac{n\alpha}{\hat{\beta}^2}
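
This value can also be checked numerically: continuing the illustrative simulation idea from
the earlier sketch (assumed parameter values), a central finite-difference estimate of
-d²l/dβ² at β̂ should reproduce nα/β̂²:

```python
# Illustrative check: compare the analytic observed information n*alpha/beta_hat**2
# with a central finite-difference estimate of -d^2 l / d beta^2 at beta_hat.
# Parameter values and the seed are arbitrary choices.
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(0)
alpha, beta, n = 2.5, 1.7, 10_000
x = rng.gamma(shape=alpha, scale=beta, size=n)
beta_hat = x.mean() / alpha                  # MLE derived above

def loglik(b):
    # gamma log-likelihood in beta, alpha held fixed
    return (-n * alpha * np.log(b) - n * gammaln(alpha)
            + (alpha - 1) * np.log(x).sum() - x.sum() / b)

h = 1e-3
d2l = (loglik(beta_hat + h) - 2 * loglik(beta_hat) + loglik(beta_hat - h)) / h**2
print(-d2l, n * alpha / beta_hat**2)         # the two numbers should agree closely
```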

The expected information is the expectation value of the observed information, where the
expectation is taken over the observations. The observed information depends on the
observations only through the sum of the x_i, so:

    I(\beta) = E(I_o(\beta)) = E\left(-\frac{n\alpha}{\beta^2} + \frac{2}{\beta^3}\sum_i x_i\right) = -\frac{n\alpha}{\beta^2} + \frac{2}{\beta^3}\sum_i E(x_i)



To find the expected information (Fisher information) we need the expected value of a random
variable with the gamma distribution. We can derive it using direct integration or the moment
generating function. Let us use direct integration:

    E(x) = \int_0^{\infty} x f(x; \alpha, \beta)\, dx = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}} \int_0^{\infty} x^{\alpha} e^{-x/\beta}\, dx = \frac{\Gamma(\alpha+1)\,\beta^{\alpha+1}}{\Gamma(\alpha)\,\beta^{\alpha}} = \alpha\beta
Here we used the fact that the function

    \frac{1}{\Gamma(\alpha+1)\,\beta^{\alpha+1}}\, x^{\alpha} e^{-x/\beta}

is the density of a gamma distribution (with shape α+1 and scale β), and therefore its integral is equal to one.
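
A quick numerical sketch of this normalization fact, with arbitrarily chosen α and β:

```python
# Illustrative check: x^alpha * exp(-x/beta) / (Gamma(alpha+1) * beta**(alpha+1))
# is a gamma density with shape alpha+1 and scale beta, so it integrates to one.
# Parameter values are arbitrary choices.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

alpha, beta = 2.5, 1.7
density = lambda x: x**alpha * np.exp(-x / beta) / (gamma(alpha + 1) * beta**(alpha + 1))
print(quad(density, 0, np.inf)[0])           # should be very close to 1.0
```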
Now, using this expected value, the expected information becomes:

    I(\beta) = E(I_o(\beta)) = -\frac{n\alpha}{\beta^2} + \frac{2}{\beta^3}\sum_i E(x_i) = -\frac{n\alpha}{\beta^2} + \frac{2}{\beta^3}\, n\alpha\beta = \frac{n\alpha}{\beta^2}
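
As an illustrative Monte Carlo check (assumed parameter values), averaging the observed
information over many simulated samples should reproduce nα/β²:

```python
# Illustrative Monte Carlo check: the observed information
# I_o(beta) = -n*alpha/beta**2 + 2*sum(x_i)/beta**3 depends on the data only
# through sum(x_i); averaged over many samples it should equal n*alpha/beta**2.
# Parameter values and the seed are arbitrary choices.
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, n, reps = 2.5, 1.7, 50, 20_000

samples = rng.gamma(shape=alpha, scale=beta, size=(reps, n))
obs_info = -n * alpha / beta**2 + 2 * samples.sum(axis=1) / beta**3
print(obs_info.mean(), n * alpha / beta**2)  # the two numbers should agree closely
```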


b)
The negative binomial distribution has the form:

    f(k | p) = p^{k}(1-p), \qquad k = 0, 1, 2, \ldots

This is the probability mass function for the number of successful trials before a failure
occurs; the probability of success is p and that of failure is 1-p. Suppose we have a sample of n
independent points drawn from this negative binomial distribution (i.e., in the i-th draw we
observed k_i successes and the (k_i+1)-st trial was a failure). Find the maximum likelihood
estimator for p. What are the observed and expected information?

          Solution:
          Maximum likelihood estimator:
Let us write the likelihood function:

    L(k_1, \ldots, k_n | p) = \prod_{i=1}^{n} p^{k_i}(1-p) = (1-p)^{n}\, p^{\sum_i k_i}

The log-likelihood function will have the form:

    l(k_1, \ldots, k_n | p) = n \ln(1-p) + \ln(p) \sum_i k_i
Taking the derivative with respect to p and solving the maximum likelihood equation we get:

    \frac{dl}{dp} = -\frac{n}{1-p} + \frac{\sum_i k_i}{p} = 0 \;\Longrightarrow\; -p\left(n + \sum_i k_i\right) + \sum_i k_i = 0 \;\Longrightarrow\; \hat{p} = \frac{\sum_i k_i}{n + \sum_i k_i} = \frac{\bar{k}}{1+\bar{k}}

Since there is only one solution, it is the maximum likelihood estimator. Here we used the
notation for the average:

    \bar{k} = \frac{1}{n}\sum_i k_i
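
A minimal numerical sketch (illustrative values only, not part of the original solution) that
simulates data from f(k|p) = p^k(1-p) and compares the closed-form estimator k̄/(1+k̄) with a
direct numerical maximization of the log-likelihood:

```python
# Illustrative check of p_hat = k_bar / (1 + k_bar). numpy's geometric(q) returns
# the number of trials until the first event of probability q, so with q = 1-p the
# number of successes before the first failure is that value minus one.
# p, n and the seed are arbitrary choices.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
p, n = 0.3, 10_000
k = rng.geometric(1 - p, size=n) - 1

k_bar = k.mean()
p_hat = k_bar / (1 + k_bar)                  # closed-form estimator derived above

# direct numerical maximization of l(q) = n*ln(1-q) + ln(q)*sum(k_i)
neg_loglik = lambda q: -(n * np.log(1 - q) + np.log(q) * k.sum())
p_num = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded").x
print(p_hat, p_num)                          # both should be close to the true p = 0.3
```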

          Observed information:
          Using the equation for information we can write:

    I_o(p) = -\frac{d^2 l}{dp^2} = \frac{n}{(1-p)^2} + \frac{\sum_i k_i}{p^2} = n\,\frac{(1-p)^2\,\bar{k} + p^2}{(1-p)^2\, p^2}

This is the observed information.
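
As a sanity check, the sketch below (illustrative values, simulated data) compares this
expression with a finite-difference estimate of -d²l/dp² at an arbitrary evaluation point:

```python
# Illustrative check: compare I_o(q) = n*((1-q)^2*k_bar + q^2) / ((1-q)^2*q^2)
# with a central finite-difference estimate of -d^2 l / dp^2. The true p, n,
# the evaluation point q0 and the seed are arbitrary choices.
import numpy as np

rng = np.random.default_rng(2)
p_true, n = 0.3, 10_000
k = rng.geometric(1 - p_true, size=n) - 1    # successes before the first failure
k_bar = k.mean()

loglik = lambda q: n * np.log(1 - q) + np.log(q) * k.sum()

q0, h = 0.35, 1e-3
d2l = (loglik(q0 + h) - 2 * loglik(q0) + loglik(q0 - h)) / h**2
analytic = n * ((1 - q0)**2 * k_bar + q0**2) / ((1 - q0)**2 * q0**2)
print(-d2l, analytic)                        # the two numbers should agree closely
```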



Using the expression for the expected information:

    I(p) = E(I_o(p)) = n\,\frac{(1-p)^2\, E(\bar{k}) + p^2}{(1-p)^2\, p^2}
So to derive the expected information we need:

    E(\bar{k}) = \frac{1}{n}\sum_i E(k_i) = E(k)

Here we used the fact that all observations have the same expectation. The expectation E(k) can
be derived as follows:

    E(k) = \sum_{k=0}^{\infty} k\, p^{k}(1-p) = (1-p) \sum_{k=0}^{\infty} k\, p^{k}


Now we use the relation for the sum of a geometric series:

    \sum_{k=0}^{\infty} p^{k} = \frac{1}{1-p}
Taking the derivative of both sides with respect to p:

    \sum_{k=0}^{\infty} k\, p^{k-1} = \frac{1}{(1-p)^2}
Multiplying both sides by p and by (1-p) gives:

    E(k) = (1-p) \sum_{k=0}^{\infty} k\, p^{k} = \frac{p}{1-p}
Finally we can write:

    I(p) = n\,\frac{(1-p)^2\,\frac{p}{1-p} + p^2}{(1-p)^2\, p^2} = n\,\frac{(1-p)p + p^2}{(1-p)^2\, p^2} = \frac{n}{(1-p)^2\, p}
The information becomes larger as p approaches 0 or 1.
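
An illustrative Monte Carlo check (assumed values of p and n): averaging the observed
information over many simulated samples should reproduce n/((1-p)²p):

```python
# Illustrative Monte Carlo check: I_o(p) = n/(1-p)^2 + sum(k_i)/p^2 depends on the
# data only through sum(k_i); averaged over many samples it should equal
# n / ((1-p)^2 * p). Parameter values and the seed are arbitrary choices.
import numpy as np

rng = np.random.default_rng(3)
p, n, reps = 0.3, 50, 20_000
ks = rng.geometric(1 - p, size=(reps, n)) - 1   # successes before first failure

obs_info = n / (1 - p)**2 + ks.sum(axis=1) / p**2
print(obs_info.mean(), n / ((1 - p)**2 * p))    # the two numbers should agree closely
```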




				