Data Analysis_ by hcj

									Data Analysis
Do it yourself!
    What to do with your data?
• Report it to professionals (e.g., AAVSO)
  – Excellent! A real service to science; don’t
    neglect this
• Publish observations (e.g., JAAVSO)
• Analyze it – yourself!
                    But …
• I’m not a mathematician
  – Let the computer do the math
• I’m not a programmer
  – Get programs from the net (often free)
• I don’t know how to use or interpret them
  – Neither do the pros!
  – Practice, practice, practice …
          Time Series Analysis
• A time series is a set of data pairs
            $(t_a, x_a),\quad a = 1, 2, 3, \ldots, N$
• t is the time, x is the data value
• Usually, times are assumed error-free
• Data = Signal + Error
                 $x(t) = f(t) + \varepsilon$
• x can be anything, e.g. brightness of a variable star,
  time of eclipse, eggs/day from a laying hen
      Basic properties of data x
• Actual
  – Mean = $\mu$ = expected value
  – Standard deviation = $\sigma$ = expected rms
    difference from mean
• Estimated
  – Average = estimated $\mu$
  – Sample standard deviation = estimated $\sigma$
  Average and sample standard
           deviation
• Average
       $\bar{x} = \frac{1}{N} \sum_{a=1}^{N} x_a$
• Sample standard deviation
       $s = \sqrt{\frac{1}{N-1} \sum_{a=1}^{N} (x_a - \bar{x})^2}$
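The two estimates above take only a few lines of code. A minimal sketch in Python, using made-up magnitude estimates (the numbers are for illustration only, not real observations):

```python
import math

def average(x):
    """Average: the estimate of the mean."""
    return sum(x) / len(x)

def sample_std(x):
    """Sample standard deviation, with the N - 1 denominator."""
    xbar = average(x)
    return math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (len(x) - 1))

# Hypothetical magnitude estimates, for illustration only
mags = [8.1, 8.4, 7.9, 8.2, 8.4]
xbar = average(mags)
s = sample_std(mags)
print(f"average = {xbar:.3f}, sample standard deviation = {s:.3f}")
```

Python's built-in `statistics.mean` and `statistics.stdev` should give the same values.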
      Method #1: world’s best
• Eye + Brain: Look at the data!
• Plot x as a function of t: Explore!
• Scientific name:
                Visual Inspection
• World’s best – but not infallible
• Programs:
  – TS                 http://www.aavso.org
  – MAGPLOT            http://www.aavso.org
   Method #2: Fourier Analysis
• Period analysis and curve-fitting
• Powerful, well-understood, popular
• Programs
  – TS               http://www.aavso.org
  – PerAnSo          http://www.peranso.com
  Method #3: Wavelet Analysis
• Time-frequency analysis
• Old versions bad, new version good
• Programs:
  – WWZ              http://www.aavso.org
  – WinWWZ           http://www.aavso.org
Visual Inspection
Let’s take a look
Fourier Analysis
Fourier analysis for period search
• Match the data to sine/cosine waves
    $f(t) = c_0 + c_1 \cos(2\pi\nu t) + c_2 \sin(2\pi\nu t)$
•   $\nu$ = frequency
•   Period = $P = 1/\nu$
•   Amplitude = A = size of fluctuation
•   Obvious choice is period; mathematically
    sound choice is frequency
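Matching the data to the sine/cosine model at one chosen frequency is an ordinary least-squares problem for the three coefficients. The sketch below is one way to do it, not the method of any particular program. The helper names `solve` and `fit_sinusoid`, the uneven observation times, and the noise-free test signal are all assumptions made for illustration. The amplitude is taken as $A = \sqrt{c_1^2 + c_2^2}$.

```python
import math

def solve(A, b):
    """Solve the small linear system A p = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    p = [0.0] * n
    for i in reversed(range(n)):
        p[i] = (M[i][n] - sum(M[i][c] * p[c] for c in range(i + 1, n))) / M[i][i]
    return p

def fit_sinusoid(t, x, nu):
    """Least-squares fit of c0 + c1 cos(2 pi nu t) + c2 sin(2 pi nu t)."""
    basis = [[1.0, math.cos(2 * math.pi * nu * ti), math.sin(2 * math.pi * nu * ti)]
             for ti in t]
    # Normal equations: (B^T B) c = B^T x
    A = [[sum(row[j] * row[k] for row in basis) for k in range(3)] for j in range(3)]
    b = [sum(row[j] * xi for row, xi in zip(basis, x)) for j in range(3)]
    return solve(A, b)

# Hypothetical, unevenly spaced observation times and a noise-free test signal
t = [0.0, 1.3, 2.1, 4.7, 5.2, 8.8, 9.5, 12.4, 15.0, 17.6]
nu = 0.1  # cycles per time unit
x = [8.0 + 0.3 * math.cos(2 * math.pi * nu * ti) + 0.4 * math.sin(2 * math.pi * nu * ti)
     for ti in t]

c0, c1, c2 = fit_sinusoid(t, x, nu)
amplitude = math.hypot(c1, c2)  # A = sqrt(c1^2 + c2^2)
period = 1.0 / nu               # P = 1 / nu
```

Because the constant, cosine, and sine terms are fitted together, the uneven time spacing does no harm in this noise-free example.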
     Null Hypothesis (important!)
•   Null hypothesis: no time variation at all
•   So $f(t) = \mu$ = constant
•   So, $x_a = \mu + \varepsilon_a$
•   Quite important! Often neglected. Even
    the pros often forget this.
                  Is it real?
• Fit produces a test statistic under the null
  hypothesis
• Is usually “$\chi^2$/degree of freedom” (d)
• Linear: $\chi^2 > 4$ is significant (not just by
  accident) at 95% confidence
• 95% confidence means 5% false-alarm
  probability
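The false-alarm probability can be checked directly. This is a minimal sketch, assuming the test statistic follows a chi-square distribution under the null hypothesis and that "linear" means one fitted parameter (one degree of freedom); both are assumptions here. The function names are made up for illustration.

```python
import math

def false_alarm_1dof(chi2):
    """P(chi-square with 1 degree of freedom exceeds chi2) under the null hypothesis."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def false_alarm_2dof(chi2):
    """Same for 2 degrees of freedom, e.g. a cosine plus a sine term."""
    return math.exp(-chi2 / 2.0)

p_linear = false_alarm_1dof(4.0)   # about 0.0455, i.e. a little better than 95% confidence
p_sinusoid = false_alarm_2dof(6.0) # close to 0.05
print(f"1 dof, chi2 = 4: false-alarm probability = {p_linear:.4f}")
print(f"2 dof, chi2 = 6: false-alarm probability = {p_sinusoid:.4f}")
```

With two fitted terms the total $\chi^2$ needs to reach about 6, or about 3 per degree of freedom, for the same 5% false-alarm probability.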
       Meaning of significance
• Significance does not mean the signal is
  linear, sinusoidal, periodic, etc.
• It only means the null hypothesis is
  incorrect, i.e., the signal is not constant
• Important!!!
              Pre-whitening
• If you find a significant fit, then subtract the
  estimated signal, leaving residuals
• Analyze the residuals for more structure
• This process is called pre-whitening
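Pre-whitening can be sketched in code as "fit, subtract, keep the residuals". The example below assumes the frequency of the strong signal is already known. The helper names, the uneven times, and the two-signal test data are hypothetical.

```python
import math

def solve(A, b):
    """Solve the small linear system A p = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    p = [0.0] * n
    for i in reversed(range(n)):
        p[i] = (M[i][n] - sum(M[i][c] * p[c] for c in range(i + 1, n))) / M[i][i]
    return p

def sinusoid_basis(ti, nu):
    return [1.0, math.cos(2 * math.pi * nu * ti), math.sin(2 * math.pi * nu * ti)]

def prewhiten(t, x, nu):
    """Fit constant + sinusoid at frequency nu and return the residuals."""
    basis = [sinusoid_basis(ti, nu) for ti in t]
    A = [[sum(row[j] * row[k] for row in basis) for k in range(3)] for j in range(3)]
    b = [sum(row[j] * xi for row, xi in zip(basis, x)) for j in range(3)]
    c = solve(A, b)
    return [xi - sum(cj * rj for cj, rj in zip(c, row)) for xi, row in zip(x, basis)]

# Hypothetical uneven times; a strong signal at 0.1 plus a weak one at 0.27
t = [i + 0.4 * math.sin(1.7 * i) for i in range(40)]
x = [8.0 + 0.5 * math.cos(2 * math.pi * 0.1 * ti) + 0.1 * math.sin(2 * math.pi * 0.27 * ti)
     for ti in t]

residuals = prewhiten(t, x, 0.1)  # the strong signal is removed
```

The residuals can then be searched again for the weaker signal, exactly as the original data were.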
     How to choose frequency?
• Test all reasonable values, get a “strength of
  fit” for each. Common is “chi-square per
  degree of freedom” (but there are many)
• Plot fit strength vs. frequency – this is the
  Fourier transform (aka periodogram, aka power
  spectrum)
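A frequency scan of this kind can be sketched as a loop over trial frequencies. The "strength of fit" used here is the fraction of variance explained by the fitted sinusoid, which is just one of the many possible choices and is an assumption of this example. The frequency grid, helper names, and noise-free test data are also hypothetical.

```python
import math

def solve(A, b):
    """Solve the small linear system A p = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    p = [0.0] * n
    for i in reversed(range(n)):
        p[i] = (M[i][n] - sum(M[i][c] * p[c] for c in range(i + 1, n))) / M[i][i]
    return p

def fit_strength(t, x, nu):
    """Fraction of the variance about the average explained by a sinusoid at nu."""
    basis = [[1.0, math.cos(2 * math.pi * nu * ti), math.sin(2 * math.pi * nu * ti)]
             for ti in t]
    A = [[sum(row[j] * row[k] for row in basis) for k in range(3)] for j in range(3)]
    b = [sum(row[j] * xi for row, xi in zip(basis, x)) for j in range(3)]
    c = solve(A, b)
    xbar = sum(x) / len(x)
    tss = sum((xi - xbar) ** 2 for xi in x)  # scatter under the null hypothesis
    rss = sum((xi - sum(cj * rj for cj, rj in zip(c, row))) ** 2
              for xi, row in zip(x, basis))  # scatter left after the fit
    return 1.0 - rss / tss

# Hypothetical uneven times and a noise-free signal at frequency 0.1
t = [i + 0.4 * math.sin(1.7 * i) for i in range(40)]
x = [8.0 + 0.5 * math.cos(2 * math.pi * 0.1 * ti) for ti in t]

freqs = [0.005 * k for k in range(2, 101)]  # trial frequencies 0.010 to 0.500
strengths = [fit_strength(t, x, nu) for nu in freqs]
best = freqs[strengths.index(max(strengths))]
print(f"strongest fit at frequency {best:.3f}")
```

Plotting `strengths` against `freqs` gives the periodogram; the peak marks the best-fitting frequency.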
     Fourier decomposition
    Any periodic function of period P
(frequency $\nu = 1/P$) can be expressed as a
                  Fourier series:
    $F(t) = a + b_1 \sin(2\pi\nu t) + c_1 \cos(2\pi\nu t)$
     $+\, b_2 \sin(4\pi\nu t) + c_2 \cos(4\pi\nu t)$
     $+\, b_3 \sin(6\pi\nu t) + c_3 \cos(6\pi\nu t) + \ldots$
     $+\, b_n \sin(2\pi n\nu t) + c_n \cos(2\pi n\nu t) + \ldots$
     Fundamental + harmonics
For a pure sinusoid, expect response at
  frequency $\nu$
For a general periodic signal at a given
  frequency, expect a fundamental component
  at $\nu$, as well as harmonics at frequencies
  $2\nu$, $3\nu$, $4\nu$, etc.
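To see the fundamental and harmonics appear, here is a small sketch that measures the amplitude at each harmonic of a made-up, non-sinusoidal periodic signal. It assumes evenly spaced samples covering exactly one period, which is the easy case and not the typical astronomical one. The signal and the function name are hypothetical.

```python
import math

def harmonic_amplitudes(x, n_harmonics):
    """Amplitudes of harmonics 1..n_harmonics for samples x that evenly
    cover exactly one period."""
    N = len(x)
    amps = []
    for n in range(1, n_harmonics + 1):
        b = 2.0 / N * sum(xk * math.sin(2 * math.pi * n * k / N) for k, xk in enumerate(x))
        c = 2.0 / N * sum(xk * math.cos(2 * math.pi * n * k / N) for k, xk in enumerate(x))
        amps.append(math.hypot(b, c))
    return amps

# Hypothetical signal: fundamental plus 2nd and 3rd harmonics, no 4th
N = 64
phase = [k / N for k in range(N)]  # nu * t over one period
x = [5.0
     + 1.00 * math.sin(2 * math.pi * p)
     + 0.50 * math.cos(4 * math.pi * p)
     + 0.25 * math.sin(6 * math.pi * p)
     for p in phase]

amps = harmonic_amplitudes(x, 4)  # close to [1.0, 0.5, 0.25, 0.0]
```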
         Lots of Fourier methods
• FFT: fast Fourier transform
   –   Not just fast: it’s wicked fast
   –   Requires even time spacing
   –   Requires N=integer power of 2
   –   Beware!
• DFT: discrete Fourier transform
   – Applies to any time sampling, but gives incorrect results for
     highly uneven sampling (as in astronomy!)
   – Beware!
       Problems from uneven
           time sampling
• Aliasing: false peaks, often from a periodic
  data density
• Aliases at $\nu = \nu_{signal} \pm n\,\nu_{data}$
• Common in astronomy: data density has a
  period P = 1 yr = 365.2422 d, so
                $\nu_{data} = 1/P = 0.002738$ cycles/day
• Solution: pre-whitening
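The alias frequencies are simple arithmetic, sketched below for a one-year data-density period. The signal frequency (a 100-day period) and the function name are hypothetical, chosen only to show the calculation.

```python
def alias_frequencies(nu_signal, nu_data, n_max=2):
    """Frequencies nu_signal + n * nu_data for n = -n_max..n_max, skipping n = 0."""
    return [nu_signal + n * nu_data for n in range(-n_max, n_max + 1) if n != 0]

nu_data = 1.0 / 365.2422  # yearly data density, in cycles/day
nu_signal = 0.01          # hypothetical signal with a 100-day period

aliases = alias_frequencies(nu_signal, nu_data)
alias_periods = [1.0 / abs(nu) for nu in aliases]
for nu, p in zip(aliases, alias_periods):
    print(f"alias at {nu:.6f} cycles/day (period {p:.1f} d)")
```

Peaks in a periodogram spaced by 0.002738 cycles/day on either side of a strong peak are therefore prime suspects for aliases.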
Aliasing
Aliasing: UZ Hya
       Problems from uneven
           time sampling
• Mis-calculation of frequency (slightly) and
  amplitude (greatly); sabotages prewhitening
Solution: better Fourier methods
        (for astronomy)
• Lomb-Scargle modified periodogram
  – Improvement over FFT, DFT
• CLEAN spectrum
  – Bigger improvement
• DCDFT: date-compensated discrete Fourier
  transform (this is the one you want)
• CLEANEST spectrum: DCDFT-like for
  multiple frequencies
                 DCDFT
• Much better estimates of period, amplitude
           Let’s take a look
• Peranso (uses DCDFT and CLEANEST)
• Available from CBA Belgium
  – http://www.peranso.com
Fourier transform (CLEANEST)
           of TU Cas
Wavelet Analysis
                 Wavelets
• Fit sine/cosine-like functions of brief
  duration
• Shift them through time
• Gives a time-frequency analysis
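The idea of fitting a brief sine/cosine and shifting it through time can be sketched as a weighted least-squares fit with a Gaussian window centered at time `tau`. This is only an illustration of the idea, not the WWZ program or any published transform. The window shape, the decay constant `decay`, the helper names, and the test signal (whose amplitude jumps halfway through) are all assumptions.

```python
import math

def solve(A, b):
    """Solve the small linear system A p = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    p = [0.0] * n
    for i in reversed(range(n)):
        p[i] = (M[i][n] - sum(M[i][c] * p[c] for c in range(i + 1, n))) / M[i][i]
    return p

def local_amplitude(t, x, nu, tau, decay=0.0125):
    """Amplitude of a sinusoid at frequency nu, fitted with Gaussian weights
    centered on time tau (assumed window: exp(-decay * omega^2 * (t - tau)^2))."""
    omega = 2 * math.pi * nu
    w = [math.exp(-decay * omega ** 2 * (ti - tau) ** 2) for ti in t]
    basis = [[1.0, math.cos(omega * ti), math.sin(omega * ti)] for ti in t]
    A = [[sum(wi * row[j] * row[k] for wi, row in zip(w, basis)) for k in range(3)]
         for j in range(3)]
    b = [sum(wi * row[j] * xi for wi, row, xi in zip(w, basis, x)) for j in range(3)]
    c0, c1, c2 = solve(A, b)
    return math.hypot(c1, c2)

# Hypothetical uneven times; amplitude 0.2 before t = 100, 0.6 afterwards
t = [i + 0.4 * math.sin(1.7 * i) for i in range(200)]
x = [8.0 + (0.2 if ti < 100 else 0.6) * math.cos(2 * math.pi * 0.1 * ti) for ti in t]

early = local_amplitude(t, x, 0.1, tau=50.0)   # close to 0.2
late = local_amplitude(t, x, 0.1, tau=150.0)   # close to 0.6
```

Repeating the fit over a grid of `tau` and `nu` values gives a map of signal strength in both time and frequency.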
                 Problems
• Same old same old: uneven time spacing,
  especially variable data density, invalidates
  the results
• But: even worse than Fourier
• Essentially useless for most astronomical
  data
           Wavelet methods
• DWT: discrete wavelet transform
  – Just not right for unevenly sampled data
    (astronomy!)
• Solution: WWZ =
     weighted wavelet Z-transform
Let’s take a look
Data Analysis
    • Do it yourself
• Use your eyes and brain
  • Healthy skepticism

• tamino_9@hotmail.com
        • Enjoy!

								