					  "The Maximum Likelihood
    Problem and Fitting the
Sagittarius Dwarf Tidal Stream"
         Matthew Newby




                         Astronomy Seminar
                         RPI Oct. 22, 2009
                                             1
Overview:
    •Introduction
    •The Sagittarius Stream
        • SDSS
        • Locating
    • Maximum Likelihood
    • Methods
        •Differential Evolution
        • Monte-Carlo Markov-Chain
        •Gradient Descent
        •Genetic Search
        •Particle Swarm
    • Revisit the Sagittarius Stream
    • BOINC
        •Overview
        •Current and Future Work
                                       2
Introduction
•Modern Astronomy – No longer staring through a telescope
•Automated Surveys produce large data sets
•Errors in measurements – statistical methods needed
•Fast and accurate computer routines are needed in order to analyze this information!


                                                       Image : NASA.gov


                                computer$ go faster_




    Image : Wikimedia Commons


                                                                          3
The Sloan Digital Sky Survey (SDSS):




                                                         Image: sdss.org



  • 230+ million objects
  • 8,400 square degrees in the sky
  • Large percentage of north galactic cap
  • Very little data in galactic plane (too much dust)
  • Several hundred thousand stars
                                                                           4
             The Sagittarius Dwarf Tidal Stream

• The Sagittarius Dwarf Galaxy is merging with the Milky Way
• The dwarf is being tidally disrupted by the Milky Way, creating long “tails.”




                                            Mapping the Tidal Stream will:
                                           • Provide information on matter
                                           distribution in Milky Way
                                           • Provide constraints on Galactic
                                           Halo
                                     Image (above): [Ibata et al. 1997, AJ]
                                     Image (left): David Martinez-Delgado (MPIA) & Gabriel Perez (IAC)
                                                                                                  5
The Milky Way:

[Diagram labels: Halo, Bulge, Thin Disk, Thick Disk, Data Wedge, Sun, Sagittarius Dwarf Galaxy, Tidal Stream. Scale: ~30 kiloparsecs (100,000 light-years)]

                                                                           6
  Data Stripe:



                                                F-turnoff stars on the H-R diagram




Stripe 82 (southern galactic cap)




                                                                                         7
                         Image: Newberg & Yanny 2006, JoP Conference series (modified by N. Cole)
                              Sag. Stream: Model
                                       • Assume stream is a cylinder
                                       • Radial drop-off given by a Gaussian Distribution




                                              • 2 background parameters
                                                  r0, q
                                              • 6 parameters per stream
                                                  ε, μ, r, θ, φ, σ


Cole, N.
           Background distribution:
                                      At least 8 parameters in the search –
                                      8-dimensional solutions space!
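
The "cylinder with a Gaussian radial drop-off" idea can be sketched as below. This is an illustration only, not the model from the cited thesis: the slide lists ε, μ, r, θ, φ, σ without defining them, so the sketch assumes that θ and φ orient the cylinder axis and that σ is the Gaussian width. The Cartesian `center` argument is a hypothetical stand-in for the stream position, and the finite length of the cylinder and the stream weight are ignored.

```python
import math

def stream_density(point, center, theta, phi, sigma):
    """Unnormalized density at `point` for an infinite cylinder whose
    density falls off as a Gaussian with distance from the axis.

    Assumptions (not from the slides): `center` is any point on the axis,
    (theta, phi) are spherical angles giving the axis direction, and
    sigma is the Gaussian width.
    """
    # Unit vector along the cylinder axis.
    axis = (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
    # Vector from the axis reference point to the star.
    rel = tuple(p - c for p, c in zip(point, center))
    # Length of the component of `rel` along the axis.
    along = sum(r * a for r, a in zip(rel, axis))
    # Squared perpendicular distance from the axis (clamped against rounding).
    perp_sq = max(0.0, sum(r * r for r in rel) - along * along)
    return math.exp(-perp_sq / (2.0 * sigma ** 2))
```

A star on the axis gets the maximum value, and the value drops with perpendicular distance at a rate set by σ.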

                                                                                   8
                         Maximum Likelihood:
• Bayesian Method
     • Must assume a “prior” – a model explaining the data
     • Find the parameters that are the “most likely” in a data set, given the prior



• Law of large numbers
     •Can assume that large data sets have normally distributed data points




• Find probability that each data point lies in the given distribution
• Then you can get the likelihood as the product of the data point probabilities:
                          $L(Q|D) = \prod_i \mathrm{DataPointProb}_i$
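
A minimal sketch of a likelihood evaluator, assuming the likelihood is the product of the per-point probabilities. In practice the sum of logarithms is usually computed instead, because a product of many small numbers underflows. The Gaussian model and the function names are illustrative assumptions, not the stream model used in the talk.

```python
import math

def gaussian_pdf(x, params):
    """Probability density of x under a normal distribution (mu, sigma)."""
    mu, sigma = params
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(data, pdf, params):
    """Sum of log probabilities: the log of the product over all data points."""
    return sum(math.log(pdf(x, params)) for x in data)

# Hypothetical data clustered near 5: parameters near the data score higher.
data = [4.8, 5.1, 5.0, 4.9, 5.2]
good = log_likelihood(data, gaussian_pdf, (5.0, 0.2))
bad = log_likelihood(data, gaussian_pdf, (3.0, 0.2))
```

Maximizing the log-likelihood picks the same parameters as maximizing the likelihood itself, since the logarithm is monotonic.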




                                                                                       9
               Computational Algorithms
         Overview:
• Set up problem
• Parameter space: all allowed values of parameters
• Likelihood evaluator for given parameters
• Evaluation method – moves in parameter space in an efficient way
• End conditions: when change in best is below a limit, or a predefined
number of iterations is reached.


        Problems:
•Likelihood calculation is usually time-consuming
• Need to avoid local maxima and find the global maximum

       What is the best method?

                                                                   10
Computational Methods:

                        “No Free Lunch”          (David H. Wolpert, William G. Macready)


Poor Students:

          Rosencrantz                   Guildenstern                      Ophelia

         •Only eats meat               •Low Carb Diet                    •Vegetarian



Local Eateries, same menus, random prices:

      Burger Palace             Gourmet Salads                 No Carbs at All

           Prices differ by restaurant! Not everyone can eat cheaply!
        One restaurant cannot be the best solution for every person (problem)!

•One solution method (or algorithm) will not be ideal for all problems!
       •Need to choose the best solution for the job at hand!
                                                                                           11
           Conjugate Gradient Descent (CGD)
• Calculates the gradient of the surface for each parameter
• Moves towards best likelihood using a line search
• Conjugate gradient uses the gradient of the previous step to converge faster
•Requires many likelihood calculations per move
• Unfortunately, may end at local maxima
• Need to run from several different directions in order to find global best

[Figure: Likelihood vs. Position, with labels: best solution, gradient, location, local maximum]

The gradient, G:

     L = likelihood function
     Q = parameter (i or j)
     h_i = step size for the ith parameter
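
One common way to estimate a gradient from likelihood evaluations alone is a central finite difference, with one step size per parameter. The sketch below assumes that form; the exact formula shown on the slide is not recoverable from this transcript, and the function name is hypothetical.

```python
def numerical_gradient(likelihood, params, steps):
    """Central-difference estimate of dL/dQ_i for each parameter Q_i.

    `steps[i]` plays the role of h_i, the step size for the ith parameter.
    Each component costs two likelihood evaluations.
    """
    grad = []
    for i, h in enumerate(steps):
        up = list(params)
        up[i] += h
        down = list(params)
        down[i] -= h
        grad.append((likelihood(up) - likelihood(down)) / (2.0 * h))
    return grad
```

With 8 parameters this costs 16 likelihood evaluations per gradient, which is consistent with the point that CGD requires many likelihood calculations per move.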


                                                                                    12
     Gradient Descent: 1-dimensional case
                                          Line Search
• Evaluates two points in direction of gradient: one a distance 1d away, the other 2d
• d is usually related to the gradient (slope)
• If the middle point is not at a better likelihood than the end points, d is doubled and
the process repeated
• If the middle point is higher, then the middle point becomes the starting point for
another CGD
• Line Search causes the algorithm to reach the best likelihood efficiently
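
The doubling line search described above can be sketched in one dimension as follows. The function name, the `max_doublings` cap, and the fallback of returning the starting point are assumptions added so the loop always terminates.

```python
def line_search(likelihood, start, direction, d, max_doublings=50):
    """1-D doubling line search along `direction` (+1 or -1, the gradient sign).

    Evaluates a middle point at distance d and an end point at distance 2d.
    If the middle point beats both the start and the end, it is returned as
    the next starting point; otherwise d is doubled and the test repeated.
    """
    l_start = likelihood(start)
    for _ in range(max_doublings):
        middle = start + d * direction
        end = start + 2.0 * d * direction
        l_mid = likelihood(middle)
        if l_mid > l_start and l_mid > likelihood(end):
            return middle
        d *= 2.0
    # No bracketing triple found within the cap: stay where we are.
    return start
```

For a likelihood peaked at x = 2, starting at 0 with d = 1, the first attempt fails because the end point is better than the middle, and the doubled step then brackets the peak.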




Line Search example (left): The first search does not find a better likelihood for the middle point (yellow), so the distance is doubled. This time, the new middle point (red) has the best likelihood. The next iteration of CGD will start at this point.

[Figure labels: starting point, first middle point, first end point, next middle point, next end point]

                                                                                              13
            Monte-Carlo Markov-Chain (MCMC)
• A “random walk” method
• Samples parameter space well
• Automatically produces error distribution
• Easy to code

•Sensitive to running time and step size
• Never truly converges

•Metropolis-Hastings:
   • Take a step in each direction (parameter)
   • Step size/direction is random, drawn from
   a normal distribution
   • If the new location has a better likelihood,
   move to it
   • If the new location has a worse likelihood,
   then there is a chance of moving to it
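
A minimal sketch of the random walk described above. The slide only says a worse location has "a chance" of being accepted; the sketch assumes the standard Metropolis rule, where that chance is the likelihood ratio L_new / L_old. It works with log-likelihoods, and the function name and seed argument are illustrative.

```python
import math
import random

def metropolis(log_likelihood, start, step_sizes, n_steps, seed=0):
    """Random-walk Metropolis sampler; returns the list of visited positions."""
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_likelihood(current)
    chain = [list(current)]
    for _ in range(n_steps):
        # Step in every parameter, drawn from a normal distribution.
        proposal = [q + rng.gauss(0.0, s) for q, s in zip(current, step_sizes)]
        proposal_ll = log_likelihood(proposal)
        # Better: always move. Worse: move with probability L_new / L_old.
        if proposal_ll >= current_ll or rng.random() < math.exp(proposal_ll - current_ll):
            current, current_ll = proposal, proposal_ll
        chain.append(list(current))
    return chain
```

A histogram of the chain, after discarding the early steps, approximates the error distribution of each parameter, which is why the method "automatically produces" one.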


 The trajectory of a 1000-step MCMC straight-line fit (top) and the distribution in b (bottom).
                                                        14
                           Genetic Search
• Inspired by natural selection
• Start with multiple “individuals” (positions) in parameter space
• Evaluate likelihood for each individual
• Remove individuals with the worst likelihoods
• Replace the removed individuals with “children” of the remaining individuals
(“parents”)


• Parents can be chosen randomly or from the best likelihoods
• Create children through crossover and mutation:
     • Crossover: A child inherits the parameters of multiple parents, either by
     averaging the parents’ parameters or by inheriting select parameters from
     each parent
     •Mutation: Replace a parameter with a new, randomly generated one


• Repeat until end conditions are met
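
The steps above can be sketched as a small genetic search. The choices here are assumptions made for illustration: parents are picked at random from the survivors, crossover averages two parents, mutation redraws a parameter uniformly within its bounds, and the end condition is a fixed number of generations.

```python
import random

def genetic_search(likelihood, bounds, pop_size=30, n_keep=10,
                   mutation_rate=0.1, n_generations=100, seed=0):
    """Return the best individual found; `bounds` is a list of (low, high)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(n_generations):
        # Rank by likelihood and remove the worst individuals.
        population.sort(key=likelihood, reverse=True)
        parents = population[:n_keep]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: average the two parents' parameters.
            child = [0.5 * (x + y) for x, y in zip(a, b)]
            # Mutation: occasionally replace a parameter with a random one.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = parents + children
    return max(population, key=likelihood)
```

Keeping the parents in the population means the best likelihood found never gets worse from one generation to the next.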

                                                                                   15
                       Differential Evolution

• An individual moves according to the weighted difference between the locations of two “parent” individuals
• If the new position has a worse likelihood, then the individual does not move
• Parents may be random or chosen from the population best
• Also, multiple pairs of parents may be used (averaging over the differences)

[Figure labels: Difference Vector, Change in position, No Change (X); center is global best]
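
A minimal sketch of the move rule described on this slide, assuming random parents and a single pair per move. It is simplified: textbook differential evolution also includes a crossover step, which is omitted here, and positions are not clipped to the bounds. The parameter names are hypothetical.

```python
import random

def differential_evolution(likelihood, bounds, pop_size=20, weight=0.8,
                           n_generations=200, seed=0):
    """Return (best position, best likelihood) after a fixed number of generations."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scores = [likelihood(ind) for ind in population]
    for _ in range(n_generations):
        for i in range(pop_size):
            # Pick two distinct random parents other than the individual itself.
            a, b = rng.sample([j for j in range(pop_size) if j != i], 2)
            # Move by the weighted difference between the parents' locations.
            trial = [x + weight * (pa - pb)
                     for x, pa, pb in zip(population[i], population[a], population[b])]
            trial_score = likelihood(trial)
            # A worse likelihood means the individual does not move.
            if trial_score > scores[i]:
                population[i], scores[i] = trial, trial_score
    best = max(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]
```

Because the step is built from differences within the population, the step size shrinks on its own as the individuals gather around a maximum.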



                                                                                  16
                     Particle-Swarm Optimization
  • Physically Intuitive – based on animal behavior
  • Particles have velocities
  • “Forces” towards personal best, global best

[Figure: Parameter Space, with labels: particle, velocity, to personal best, to global best, Personal best, Global best]


 Position (x) change at step t:




w, c1,c2 are weighting parameters, p is personal best, g is global best, rand() is a random number
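
The update equation did not survive in this transcript. The sketch below assumes the standard particle-swarm form, which uses the same symbols: v(t+1) = w v(t) + c1 rand() (p - x(t)) + c2 rand() (g - x(t)), followed by x(t+1) = x(t) + v(t+1). The default values of w, c1 and c2 are common textbook choices, not values from the talk.

```python
import random

def particle_swarm(likelihood, bounds, n_particles=20, w=0.7, c1=1.5, c2=1.5,
                   n_steps=200, seed=0):
    """Return (global best position, its likelihood) after n_steps updates."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p = [list(xi) for xi in x]                      # personal bests
    p_score = [likelihood(xi) for xi in x]
    g_index = max(range(n_particles), key=p_score.__getitem__)
    g, g_score = list(p[g_index]), p_score[g_index]  # global best
    for _ in range(n_steps):
        for i in range(n_particles):
            for k in range(dim):
                # Inertia plus random pulls toward personal and global bests.
                v[i][k] = (w * v[i][k]
                           + c1 * rng.random() * (p[i][k] - x[i][k])
                           + c2 * rng.random() * (g[k] - x[i][k]))
                x[i][k] += v[i][k]
            score = likelihood(x[i])
            if score > p_score[i]:
                p[i], p_score[i] = list(x[i]), score
                if score > g_score:
                    g, g_score = list(x[i]), score
    return g, g_score
```

The weight w controls how much of the previous velocity is kept, while c1 and c2 set the strength of the pulls toward the personal and global bests.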
                                                                                                 17
                                  BOINC
       Berkeley Open Infrastructure for Network Computing
           Milkyway@home stats:                              Total              Active
           Users                                             37,251             16,010
           Hosts                                             79,023             25,101
           Teams                                             1,410              922
           Countries                                         163                124
           Total Credit                                      9,302,434,280
           Recent average credit (RAC)                       52,731,529
           Average floating point operations per second      527,315.3 GigaFLOPS (527.315 TeraFLOPS)

• Users volunteer spare processor / graphics card time to the project
• Massively parallel
• Graphics processor technology has created a large increase in processing power
• Milkyway@home is now the #2 ranked BOINC project
• You can help, too: http://milkyway.cs.rpi.edu/milkyway/

                                                                                  18
           Separation: Stripe 82
Sgr Stream Stars   Sgr Stream Stars   Non-Sgr Stream Stars




                                                             19
                        Conclusions:



• Modern astronomy produces large data sets

• The Maximum Likelihood method is ideal for analyzing this data

• Powerful computer algorithms exist to perform MLE

• Mapping the Sagittarius Stream is possible by using these methods




                                                                   20
                                       Credits


The Sloan Digital Sky Survey
BOINC.com
Milkyway@home
Prof. Heidi Newberg, Rensselaer Polytechnic Institute
Nathan Cole, “Maximum Likelihood Fitting of Tidal Streams with Applications to the
          Sagittarius Dwarf Tidal Tails” (PhD Thesis, Rensselaer Polytechnic Institute, 2008)
Travis Desell, “Aysnchronous [sic] Global Optimization for Massively Distributed Computing”
          (PhD candidacy document, 2009)
Shakespeare, et al. “Hamlet”




                                                                                         21
3 stream search:




                   22

				