Data Analysis
Do it yourself!

What to do with your data?
• Report it to professionals (e.g., AAVSO)
  – Excellent! A real service to science; don't neglect this
• Publish observations (e.g., JAAVSO)
• Analyze it yourself!

But …
• I'm not a mathematician
  – Let the computer do the math
• I'm not a programmer
  – Get programs from the net (often free)
• I don't know how to use or interpret them
  – Neither do the pros!
  – Practice, practice, practice …

Time Series Analysis
• A time series is a set of data pairs $(t_a, x_a)$, $a = 1, 2, 3, \ldots, N$
• $t$ is the time, $x$ is the data value
• Usually, times are assumed error-free
• Data = Signal + Error: $x(t) = f(t) + \varepsilon$
• $x$ can be anything, e.g. brightness of a variable star, time of eclipse, eggs/day from a laying hen

Basic properties of data x
Actual:
• Mean = $\mu$ = expected value
• Standard deviation = $\sigma$ = expected rms difference from mean
Estimated:
• Average = $\bar{x}$ = estimated $\mu$
• Sample standard deviation = $s$ = estimated $\sigma$

Average and sample standard deviation
• Average: $\bar{x} = \frac{1}{N} \sum_{a=1}^{N} x_a$
• Sample standard deviation: $s = \sqrt{\frac{1}{N-1} \sum_{a=1}^{N} (x_a - \bar{x})^2}$

Method #1: world's best
• Eye + Brain: Look at the data!
• Plot x as a function of t: Explore!
• Scientific name: Visual Inspection
• World's best, but not infallible
• Programs:
  – TS http://www.aavso.org
  – MAGPLOT http://www.aavso.org

Method #2: Fourier Analysis
• Period analysis and curve-fitting
• Powerful, well-understood, popular
• Programs:
  – TS http://www.aavso.org
  – PerAnSo http://www.peranso.com

Method #3: Wavelet Analysis
• Time-frequency analysis
• Old versions bad, new version good
• Programs:
  – WWZ http://www.aavso.org
  – WinWWZ http://www.aavso.org

Visual Inspection
Let's take a look

Fourier Analysis

Fourier analysis for period search
• Match the data to sine/cosine waves: $f(t) = c_0 + c_1 \cos(2\pi\nu t) + c_2 \sin(2\pi\nu t)$
• $\nu$ = frequency
• Period = $P = 1/\nu$
• Amplitude = $A$ = size of fluctuation
• Obvious choice is period; mathematically sound choice is frequency

Null Hypothesis (important!)
• Null hypothesis: no time variation at all
• So $f(t)$ = constant
• So $x_a = \text{constant} + \varepsilon_a$ (the data are a constant plus error)
• Quite important! Often neglected. Even the pros often forget this.

Is it real?
• Fit produces a test statistic under the null hypothesis
• Is usually "$\chi^2$/degree of freedom" ($d$)
• Linear: $\chi^2 \geq 4$ is significant (not just by accident) at 95% confidence
• 95% confidence means 5% false-alarm probability

Meaning of significance
• Significance does not mean the signal is linear, sinusoidal, periodic, etc.
• It only means the null hypothesis is incorrect, i.e., the signal is not constant
• Important!!!

Pre-whitening
• If you find a significant fit, then subtract the estimated signal, leaving residuals
• Analyze the residuals for more structure
• This process is called pre-whitening

How to choose frequency?
• Test all reasonable values, get a "strength of fit" for each. Common is "chi-square per degree of freedom" (but there are many)
• Plot frequency vs. fit: this is the Fourier transform (aka periodogram, aka power spectrum)

Fourier decomposition
Any periodic function of period $P$ (frequency $\nu = 1/P$) can be expressed as a Fourier series:
$F(t) = a + b_1 \sin(2\pi\nu t) + c_1 \cos(2\pi\nu t) + b_2 \sin(4\pi\nu t) + c_2 \cos(4\pi\nu t) + b_3 \sin(6\pi\nu t) + c_3 \cos(6\pi\nu t) + \ldots + b_n \sin(2\pi n\nu t) + c_n \cos(2\pi n\nu t) + \ldots$

Fundamental + harmonics
For a pure sinusoid, expect a response at frequency $\nu$.
For a general periodic signal at a given frequency, expect a fundamental component at $\nu$, as well as harmonics at frequencies $2\nu$, $3\nu$, $4\nu$, etc.

Lots of Fourier methods
• FFT: fast Fourier transform
  – Not just fast: it's wicked fast
  – Requires even time spacing
  – Requires N = integer power of 2
  – Beware!
• DFT: discrete Fourier transform
  – Applies to any time sampling, but gives incorrect results for highly uneven sampling (as in astronomy!)
  – Beware!
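
The period search described in the slides above (fit $c_0 + c_1 \cos + c_2 \sin$ at each trial frequency, get a "strength of fit", plot frequency vs. fit) can be sketched in a few lines of Python. This is only an illustration, not the method used by any of the programs named here. The function name, the synthetic data, and the particular strength-of-fit measure (the drop in squared residuals relative to the constant null model, scaled by the sample variance and divided by the two extra parameters) are all assumptions chosen for the example; there are many other reasonable choices.

```python
import numpy as np

def sinusoid_fit_strength(t, x, freqs):
    """Least-squares fit of c0 + c1*cos + c2*sin at each trial frequency.

    Returns, per frequency, the reduction in squared residuals relative to
    the null (constant) model, divided by 2 * sample variance. This is one
    illustrative strength-of-fit measure, not a standard named statistic.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    ss_null = np.sum((x - x.mean()) ** 2)   # residuals under the null hypothesis
    s2 = ss_null / (len(x) - 1)             # sample variance s^2
    strength = np.empty(len(freqs))
    for i, nu in enumerate(freqs):
        phase = 2.0 * np.pi * nu * t
        # Columns: constant, cosine, sine (coefficients c0, c1, c2)
        design = np.column_stack([np.ones_like(t), np.cos(phase), np.sin(phase)])
        coeffs, *_ = np.linalg.lstsq(design, x, rcond=None)
        ss_fit = np.sum((x - design @ coeffs) ** 2)
        strength[i] = (ss_null - ss_fit) / (2.0 * s2)
    return strength

# Synthetic, unevenly sampled data (made up for this example)
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 200.0, 150))   # observation times in days
true_nu = 0.05                              # cycles/day, i.e. P = 20 d
x = 10.0 + 0.5 * np.sin(2.0 * np.pi * true_nu * t) + rng.normal(0.0, 0.2, t.size)

freqs = np.linspace(0.005, 0.2, 2000)       # trial frequencies
strength = sinusoid_fit_strength(t, x, freqs)
best_nu = freqs[np.argmax(strength)]
print(f"best frequency = {best_nu:.4f} cycles/day, period = {1.0 / best_nu:.2f} d")
```

Plotting `strength` against `freqs` gives the frequency-vs-fit picture. Subtracting the best-fit sinusoid from `x` and running the same function on the residuals is the pre-whitening step.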

Problems from uneven time sampling
• Aliasing: false peaks, often from a periodic data density
• Aliases at $\nu_{\text{signal}} \pm n\,\nu_{\text{data}}$
• Common in astronomy: the data density has a period P = 1 yr = 365.2422 d, so $\nu_{\text{data}} = 0.002738$ cycles/day
• Solution: pre-whitening

Aliasing

Aliasing: UZ Hya

Problems from uneven time sampling
• Mis-calculation of frequency (slightly) and amplitude (greatly); sabotages pre-whitening

Solution: better Fourier methods (for astronomy)
• Lomb-Scargle modified periodogram
  – Improvement over FFT, DFT
• CLEAN spectrum
  – Bigger improvement
• DCDFT: date-compensated discrete Fourier transform (this is the one you want)
• CLEANEST spectrum: DCDFT-like for multiple frequencies

DCDFT
• Much better estimates of period, amplitude

Let's take a look
• Peranso (uses DCDFT and CLEANEST)
• Available from CBA Belgium
  – http://www.peranso.com

Fourier transform (CLEANEST) of TU Cas

Wavelet Analysis

Wavelets
• Fit sine/cosine-like functions of brief duration
• Shift them through time
• Gives a time-frequency analysis

Problems
• Same old same old: uneven time spacing, especially variable data density, invalidates the results
• But: even worse than Fourier
• Essentially useless for most astronomical data

Wavelet methods
• DWT: discrete wavelet transform
  – Just not right for unevenly sampled data (astronomy!)
• Solution: WWZ = weighted wavelet Z-transform

Let's take a look

Data Analysis
• Do it yourself
• Use your eyes and brain
• Healthy skepticism
• firstname.lastname@example.org
• Enjoy!
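
Returning to the aliasing slide: the rule "aliases at signal frequency ± n × data frequency" is easy to turn into numbers. The sketch below is illustrative only; the function name and the 300-day signal period are hypothetical values picked for the example. It assumes frequencies in cycles/day and folds any negative result to its positive mirror, since for real-valued data a peak at a negative frequency shows up at the same positive frequency.

```python
def alias_frequencies(nu_signal, nu_data, n_max=3):
    """Return sorted alias frequencies |nu_signal ± n * nu_data| for n = 1..n_max."""
    aliases = []
    for n in range(1, n_max + 1):
        for sign in (-1, 1):
            # Negative frequencies are folded onto their positive mirror
            aliases.append(abs(nu_signal + sign * n * nu_data))
    return sorted(aliases)

nu_data = 1.0 / 365.2422    # yearly gaps in the data, cycles/day
nu_signal = 1.0 / 300.0     # hypothetical signal with a 300-day period

aliases = alias_frequencies(nu_signal, nu_data, n_max=2)
for nu in aliases:
    print(f"alias at {nu:.6f} cycles/day (period {1.0 / nu:.1f} d)")
```

Peaks in a periodogram that sit at these offsets from a strong peak are alias candidates; pre-whitening the strong peak and seeing whether they vanish is the check the slides suggest.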