FILTERING
In cognitive experiments Based on Luck 2005

What's a filter?
• Gain: the ratio of output to input
• Amplifier: system designed to boost signal magnitude
– Gain > 1 → output of amp. larger than input

• Filter: system designed to reduce signal magnitude at particular frequencies
– Gain < 1 → output of filter less than input at given frequencies

Why do filtering?
• The Nyquist Theorem
– An analog signal can be converted to digital as long as the rate of digitization is at least twice as high as the highest frequency in the signal being digitized
– If the signal contains any frequencies higher than this limit (i.e., above half the rate of digitization), they will show up as artifactual low frequencies (aliasing).

• This second part is why you have to do at least some (online) low-pass filtering

Why do filtering?
• EEG consists of a signal plus noise, and some of the noise is sufficiently different in frequency content from the signal that it can be suppressed simply by attenuating certain frequencies, thus making the signal more visible
– Unfortunately, this benefit needs to be weighed against the cost of distortion in your signal, because ANY FILTER DISTORTS AT LEAST SOME PART OF THE SIGNAL.

More on distortion
• What kind of filter distortion are we talking about?
– Filters can change the relative amplitude of ERP components, the timing of ERP components, and can add peaks that weren't even there before.

• So not just innocuous smoothing!
• But still useful in limited capacity.

How to describe filter properties
• Properties of a filter are defined by its TRANSFER FUNCTION.
– Transferring signal to filter output

• Transfer function has two components
– Frequency response function
• How filter changes amplitude of each frequency

– Phase response function
• How filter changes phase of each frequency

Frequency Response Function
• Filters in ERP research are often described with one parameter only, the half-amplitude cutoff
– The frequency at which the amplitude is cut to half (50%) of its original value
– Notice that in most cases this does not mean that no frequencies above this value pass; they just pass with amplitudes between 50% and 0% of their original value

How to describe filter properties
• High-pass filters are often described in terms of time constants instead of half-amplitude cutoffs.
– If the input to a high-pass filter is constant in the time domain, the output will ultimately return toward zero
– The time constant is how long it takes for the output to get down to 1/e of its starting value
[Figure: input and output of a high-pass filter]
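
A minimal sketch of the time-constant idea, assuming a first-order (single RC stage) high-pass filter; real amplifier filters may differ. The function name, the 1 s time constant, and the 1 ms simulation step are illustrative choices, not values from these slides. For this kind of filter the half-power cutoff is 1 / (2π × time constant).

```python
import math

def highpass_step_response(tau_s, dt_s, duration_s):
    """Step response of a first-order (RC-style) high-pass filter.

    Discrete update: y[n] = a * (y[n-1] + x[n] - x[n-1]), a = tau / (tau + dt).
    """
    a = tau_s / (tau_s + dt_s)
    n_samples = int(round(duration_s / dt_s))
    output = []
    prev_x, prev_y = 0.0, 0.0
    for _ in range(n_samples):
        x = 1.0  # constant input switched on at t = 0
        y = a * (prev_y + x - prev_x)
        output.append(y)
        prev_x, prev_y = x, y
    return output

tau = 1.0    # time constant in seconds (illustrative value)
dt = 0.001   # simulation step in seconds (illustrative value)
response = highpass_step_response(tau, dt, duration_s=3.0)
value_at_tau = response[int(round(tau / dt))]
print(f"output after one time constant: {value_at_tau:.3f}")
print(f"1/e: {1 / math.e:.3f}")

# Assumption: first-order filter, so half-power cutoff = 1 / (2 * pi * tau).
half_power_cutoff_hz = 1 / (2 * math.pi * tau)
print(f"half-power cutoff: {half_power_cutoff_hz:.3f} Hz")
```

The output decays toward zero even though the input stays constant, which is why a long time constant corresponds to a very low high-pass cutoff.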

What kinds of filtering typically get done in ERP?
– Initial, online, low-pass filter built into the hardware to ensure the sampling rate is twice as high as the highest signal frequency recorded (so you don't get aliasing)

– High-pass filter to get rid of slow, non-neural potentials like skin potentials (< 0.01 Hz)
– Notch filters to eliminate electrical noise (60 Hz)

– Low- or high-pass filter to 'clean up' data

What kinds of filtering typically get done in ERP?
• Online filters can also be described as analog, as they are applied to a continuous voltage signal
• Offline filters can also be described as digital, as they are applied to data that have already been converted from the continuous voltage to a digital representation on the computer

What kinds of filtering typically get done in ERP?
• Best policy is to do as little online/analog filtering as possible: just the little you need to prevent aliasing and minimize the impact of very slow voltage shifts
• Digital filtering has important advantages
– Not permanent: can always return to unfiltered data
– Easier to construct phase-invariant filters
– Easier to adapt than a hardware filter

• Famous tenet of filterers
– "Precision in the time domain is inversely related to precision in the frequency domain"

• Kind of analogous to the Uncertainty Principle
– The more certainly you specify what frequencies are in your data, the less certain you can be about what happened in the time domain
[Figure: low-pass filter and high-pass filter]

Luck's Recommendations
• Use a sampling rate of between 200 and 500 Hz, and make your online low-pass filter cutoff between 1/3 and 1/4 of that
• Use an online high-pass filter (AC recording) of 0.01 Hz
• If averaged waveforms are fuzzy, can use a low-pass filter with a gentle slope and a half-amplitude cutoff of between 20 and 40 Hz

• You really want the phase portion of the filter's transfer function to be zero for all frequencies, so that the phase of the ERP waveform won't change (you don't want to move latencies)
• This is usually possible with offline digital filters
• For this to happen with your online filter, your amplifier needs to incorporate special filters like Bessel filters, which delay all frequencies by approximately the same amount of time so that the original timing is easily recoverable.

Why distortions? Look at 10 Hz notch filter on 10 Hz data.

Why distortions? Look at a notch filter.
• Time-domain waveforms in the Fourier analysis are represented as the sum of a set of infinite-duration sine waves
– PROBLEM: transient ERP waveforms are finite in duration. They consist of finite-duration voltage deflections, not infinite-duration sine waves

• Thus, a notch filter of 10 Hz is equivalent to computing the amplitude and phase in the 10 Hz band of the signal and then subtracting a sine wave with this amplitude and phase from the waveform.
– This is going to create an inverse 10 Hz oscillation in the portions of the ERP waveform where you didn't have this frequency, in fact possibly increasing 10 Hz noise

Offline Filtering

Filtering as a time-domain procedure: a common-sense view
• What's an intuitive way to get rid of the jagged lines in our waveform?

Filtering as a time-domain procedure
• For each point, we could average it with the nearby points for a smoother waveform
– Point x = ( [x+1]+[x+2]+[x+3]+x+[x-1]+[x-2]+[x-3]) / 7

• More generally
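– $\mathrm{filterERP}(t) = \sum_{j=-n}^{n} w \cdot \mathrm{ERP}(t+j)$, with one constant weight $w = 1/(2n+1)$ (here $n = 3$, so $w = 1/7$)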

Filtering as a time-domain procedure
• If you just averaged together the +/- 3 surrounding points, all 6 contribute equally to x's new value (regardless of their distance from x) and this reduces temporal precision
• Instead you can weight them by their distance from x (~ ERP(t))
– $\mathrm{filterERP}(t) = \sum_{j=-n}^{n} \mathrm{Weight}(j) \cdot \mathrm{ERP}(t+j)$

• Last equation shows how the many values surrounding one point are incorporated into the filter equation to give the transformed value for the one point
• New question: how does the original value of a given point influence all of the values in the waveform through the filter?
• If we think of filtering as smearing individual point values over time, we're asking, what is the extent and amount of the smear?
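
The weighted-average idea above can be sketched in a few lines. This is only an illustration: the function name, the toy waveform, and the triangular weights are assumptions, and points beyond the edges of the waveform are simply treated as zero.

```python
def weighted_smooth(erp, weights):
    """filterERP(t) = sum over j of Weight(j) * ERP(t + j), for j = -n..n.

    Points beyond the edges of the waveform are treated as zero.
    """
    n = len(weights) // 2
    smoothed = []
    for t in range(len(erp)):
        total = 0.0
        for j in range(-n, n + 1):
            if 0 <= t + j < len(erp):
                total += weights[j + n] * erp[t + j]
        smoothed.append(total)
    return smoothed

erp = [0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0]  # toy waveform: a single sharp spike

boxcar = [1 / 7] * 7                                # equal weights (plain 7-point average)
triangle = [w / 16 for w in (1, 2, 3, 4, 3, 2, 1)]  # nearer points count more

boxcar_result = weighted_smooth(erp, boxcar)
triangle_result = weighted_smooth(erp, triangle)
print([round(v, 2) for v in boxcar_result])
print([round(v, 2) for v in triangle_result])
```

Both sets of weights sum to 1, so the area under the spike is preserved; the triangular weights keep more of the amplitude at the original time point.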

Impulse Response Function (IRF)
• The extent and amount of smear for a given point in a given filter is represented as a function over time, where t=0 is when the given point occurs. Positive points represent smear forwards, negative points represent smear back (acausal smear)

Impulse Response Function (IRF)
• If you imagine plotting the IRF at every single individual point, the filter output it corresponds to is just going to be the sum of all those IRFs, each scaled to its point's original value.

• Therefore, the filter equation in the time domain can be re-represented as:
– $\mathrm{filterERP}(t) = \sum_{j} \mathrm{IRF}(j) \cdot \mathrm{ERP}(t-j)$

• Note that we have been treating the ERP waveform as a function over time. When you combine two functions in this way, you say that you're CONVOLVING them
– filterERP = IRF * ERP

• For each point in the ERP wave, ERP(t), substitute the sum, over every scaled copy of the IRF placed along the time course of the ERP function, of that scaled IRF's value at time t
– $\mathrm{filterERP}(t) = \sum_{j} \mathrm{IRF}(j) \cdot \mathrm{ERP}(t-j)$
– filterERP(100) = IRF(1) x ERP(99) + IRF(2) x ERP(98) + IRF(3) x ERP(97) + … + IRF(99) x ERP(1)
– The first term on the right gives you the effect of point t=99 on point t=100, the second gives you the effect of point t=98 on point t=100, etc.
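
The two views of filtering described here (a weighted sum at each output point versus a sum of scaled IRF copies) give the same numbers. A small sketch, with a made-up waveform and a hypothetical causal IRF indexed from j = 0 for simplicity:

```python
def convolve(signal, irf):
    """Full discrete convolution: out[t] = sum over j of irf[j] * signal[t - j]."""
    out = [0.0] * (len(signal) + len(irf) - 1)
    for t in range(len(out)):
        for j in range(len(irf)):
            if 0 <= t - j < len(signal):
                out[t] += irf[j] * signal[t - j]
    return out

def sum_of_scaled_irfs(signal, irf):
    """Place one copy of the IRF at each input point, scaled by that point's
    value, and add all the copies together."""
    out = [0.0] * (len(signal) + len(irf) - 1)
    for t0, value in enumerate(signal):
        for j, h in enumerate(irf):
            out[t0 + j] += value * h
    return out

erp = [0.0, 1.0, 3.0, 2.0, -1.0, 0.0]  # toy waveform (illustrative)
irf = [0.5, 0.3, 0.2]                  # hypothetical IRF, smears forward only

by_formula = convolve(erp, irf)
by_smearing = sum_of_scaled_irfs(erp, irf)
print([round(v, 2) for v in by_formula])
print([round(v, 2) for v in by_smearing])
```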

Properties of Convolution
• Convolution is
– Commutative: A * B = B * A
– Associative: A * (B * C) = (A * B) * C
– Distributive: A * (B + C) = (A * B) + (A * C)

• This helps answer a common question that you may find yourself asking: What happens if you filter twice?
– (ERP * IRF1) * IRF2 = ERP * (IRF1 * IRF2)
– E.g., the convolution of two gaussians → a wider gaussian, so filtering twice with the same gaussian is like filtering once with a wider gaussian
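
A quick numerical check of the filter-twice argument, as a sketch only. The kernel width, the toy waveform, and the helper names are assumptions. The spread of the combined kernel is measured as a standard deviation, which grows by a factor of √2 when a kernel is convolved with itself.

```python
import math

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def gaussian_kernel(sd_samples, half_width):
    """Sampled gaussian, normalized so the weights sum to 1."""
    weights = [math.exp(-0.5 * (j / sd_samples) ** 2)
               for j in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def kernel_sd(kernel):
    """Spread of a normalized kernel, in samples."""
    center = sum(i * w for i, w in enumerate(kernel))
    return math.sqrt(sum((i - center) ** 2 * w for i, w in enumerate(kernel)))

erp = [0.0] * 20 + [1.0, 4.0, 2.0, -3.0] + [0.0] * 20  # toy waveform
g = gaussian_kernel(sd_samples=3.0, half_width=15)

filtered_twice = convolve(convolve(erp, g), g)  # (ERP * IRF1) * IRF2
combined_irf = convolve(g, g)                   # IRF1 * IRF2
filtered_once = convolve(erp, combined_irf)     # ERP * (IRF1 * IRF2)

max_diff = max(abs(a - b) for a, b in zip(filtered_twice, filtered_once))
print(f"largest difference between the two routes: {max_diff:.2e}")
print(f"sd of one kernel: {kernel_sd(g):.3f} samples")
print(f"sd of combined kernel: {kernel_sd(combined_irf):.3f} samples")
```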

Properties of Convolution
• Also, since convolution in the time domain is equal to multiplication in the frequency domain, the frequency response function of double filtering equals the product of the two filters' frequency response functions
– So if you high-pass and then low-pass, it is like multiplying the two functions, which usually gives you a band pass of the middle frequencies
– But watch out, b/c if the two frequency response functions have a gradual attenuation, you could get much MORE attenuation of the middle frequencies than you wanted
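
To make the product rule concrete, here is a sketch that cascades a crude low-pass and a crude high-pass and compares the gain of the combined kernel with the product of the individual gains. The kernels (a 5-point running average, and a unit impulse minus a 21-point running average), the 250 Hz sampling rate, and the probe frequencies are all illustrative assumptions.

```python
import cmath
import math

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def gain(kernel, freq_hz, fs_hz):
    """Magnitude of the kernel's frequency response at one frequency."""
    response = sum(h * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
                   for n, h in enumerate(kernel))
    return abs(response)

fs = 250.0  # sampling rate in Hz (illustrative)

lowpass = [1 / 5] * 5        # 5-point running average
highpass = [-1 / 21] * 21    # unit impulse minus a 21-point running average
highpass[10] += 1.0
bandpass = convolve(lowpass, highpass)  # filtering with one, then the other

for f in (2.0, 10.0, 30.0):
    g_lp = gain(lowpass, f, fs)
    g_hp = gain(highpass, f, fs)
    g_both = gain(bandpass, f, fs)
    print(f"{f:5.1f} Hz  low-pass {g_lp:.3f}  high-pass {g_hp:.3f}  "
          f"product {g_lp * g_hp:.3f}  cascade {g_both:.3f}")
```

Where both individual gains are below 1, the cascade attenuates more than either filter alone, which is the caution raised above.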

Time ↔ Frequency
• What is the relationship between the IRF and the HP/LP filters we were talking about before?
– "Multiplication in the frequency domain is equivalent to convolution in the time domain"

[Figure: frequency domain (multiplication; ex: low-pass filter) and time domain (convolution; ex: 60 Hz notch filter), linked by the FFT and inverse FFT]




Time ↔ Frequency

Time ↔ Frequency
• Usually faster to filter using convolutions than Fourier transforms, so most digital filters are implemented this way
• Thus you create your desired transfer function for the filter and then transform it into the time domain to create the IRF
• Ex: you want to bandpass just 60 Hz



Time ↔ Frequency
• In fact, not a perfect transfer
– IRF for 60 Hz should be just a 60 Hz wave
– Problem is that the sine wave is finite, so you get power at other frequencies too.

• That's why a tapered function is better

Back to distortion
• Let's pretend our ERP waveform is just this portion of a perfect 10 Hz sine wave

Back to distortion
• In fact, even though we defined the waveform as containing only a 10 Hz sine wave, the spectrum of this waveform contains power at other frequencies. Why?

• This is because the waveform is finite

Back to distortion
• Remember that when time range is limited, frequency range is broadened, and vice versa.
• That means that if you were to apply a filter to remove the 10 Hz component, you wouldn't remove the entire waveform, even though we defined it as a purely 10 Hz sine wave
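
The spectral spread of a finite waveform can be checked numerically. The sketch below compares a 10 Hz sine that fills a whole 1 s epoch with a single 10 Hz cycle embedded in an otherwise flat epoch. The 200 Hz sampling rate, the epoch length, the position of the cycle, and the function name are assumptions made for illustration.

```python
import cmath
import math

def amplitude_at(signal, freq_hz, fs_hz):
    """Amplitude of one frequency component, from a single DFT term."""
    term = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
               for n, x in enumerate(signal))
    return 2 * abs(term) / len(signal)

fs = 200          # sampling rate in Hz (illustrative)
n_samples = 200   # a 1 s epoch

# 10 Hz sine lasting the whole epoch (stands in for an "infinite" sine wave)
full_sine = [math.sin(2 * math.pi * 10 * n / fs) for n in range(n_samples)]

# A single 10 Hz cycle from 0.4 s to 0.5 s, flat everywhere else
one_cycle = [math.sin(2 * math.pi * 10 * (n - 80) / fs) if 80 <= n < 100 else 0.0
             for n in range(n_samples)]

for f in (5, 10, 15):
    print(f"{f:2d} Hz  full sine {amplitude_at(full_sine, f, fs):.3f}  "
          f"one cycle {amplitude_at(one_cycle, f, fs):.3f}")
```

The full-epoch sine has energy only at 10 Hz, whereas the single cycle has comparable amplitude at neighboring frequencies, so notching out 10 Hz leaves much of the waveform behind.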

Back to distortion
• The reason you see artifactual peaks is that the original spectrum had a lot of power in frequencies near 10 Hz, such that they sum with the 10 Hz oscillations in the IRF

Back to distortion
• For the same reason, you don't see as much artifactual oscillation with 20 Hz or 60 Hz notch filters, b/c there isn't a lot of power at these or nearby frequencies in the original spectrum

Low-pass distortions
• Shape of window (frequency response function) has a big impact on amount of distortion
• General rule of thumb: unless you have a really good trick, the more ideal the filter in the frequency domain, the less ideal it is in the time domain
• Since we usually care most about the time domain for interpreting data, and the filtering is just an optional way to improve the image, usually better to 'idealize' the time domain

Low-pass distortions
• Shape of window has a big impact on amount of distortion (back to our 10 Hz sine wave)

Low-pass distortions
• Running average filter (here, 12.5 Hz)
– Much less temporal spread than windowed ideal
– Problem: substantial attenuation in the 10 Hz band where most of the original power is located, so amplitude reduction
– Problem: side lobes in the frequency response function (due to sharp onset and offset of the IRF) let substantial high-frequency noise pass
– Can be especially good for dealing with precisely defined line-frequency noise

Low-pass distortions
• Gaussian filter (12.5 Hz)
– Still a little smear in onset/offset, but almost complete attenuation of frequencies > 30 Hz
– Problem: still some amplitude reduction in frequencies of interest; maybe should set cutoff higher
– Good compromise between time and frequency domains
– Can be harder to implement than running average
– Easy to describe in the time domain: just the full width at half maximum (FWHM) of the gaussian
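
For a gaussian IRF, the time-domain width and the half-amplitude cutoff are tied together: the frequency response of a gaussian is itself a gaussian, which gives cutoff = 2 ln 2 / (π × FWHM). The sketch below applies that relation to the 12.5 Hz example and checks it against a sampled kernel. The 1000 Hz sampling rate, the truncation at 5 standard deviations, and the function names are assumptions.

```python
import cmath
import math

def fwhm_for_half_amplitude(cutoff_hz):
    """FWHM (in seconds) of the gaussian IRF whose gain is 0.5 at cutoff_hz."""
    return 2 * math.log(2) / (math.pi * cutoff_hz)

def gaussian_irf(fwhm_s, fs_hz):
    """Sampled gaussian IRF, truncated at 5 standard deviations, summing to 1."""
    sd_samples = fwhm_s * fs_hz / (2 * math.sqrt(2 * math.log(2)))
    half_width = int(math.ceil(5 * sd_samples))
    weights = [math.exp(-0.5 * (j / sd_samples) ** 2)
               for j in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gain(kernel, freq_hz, fs_hz):
    """Magnitude of the kernel's frequency response at one frequency."""
    response = sum(h * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
                   for n, h in enumerate(kernel))
    return abs(response)

fs = 1000.0                            # sampling rate in Hz (illustrative)
fwhm = fwhm_for_half_amplitude(12.5)   # seconds
irf = gaussian_irf(fwhm, fs)

print(f"FWHM for a 12.5 Hz half-amplitude cutoff: {fwhm * 1000:.1f} ms")
for f in (10.0, 12.5, 30.0):
    print(f"gain at {f:4.1f} Hz: {gain(irf, f, fs):.3f}")
```

The gain at 10 Hz is already well below 1, which matches the point above about amplitude reduction in the frequencies of interest.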

Low-pass distortions
• Causal filter (12.5 Hz)
– An event at a particular time can't affect the output at earlier times (the IRF is zero before time zero)
– One positive: if the IRF has a fairly rapid, monotonic fall-off, these filters may produce relatively little distortion in onset latency, which may be key in some analyses.

A note about offline low-pass filtered data
• Reviewers often want you to do numbers on (offline) unfiltered data. Does this really make a difference?
– Probably doesn't have a huge effect if you use the type of filter just recommended
– However, it helps to keep things straight in your head, because taking the mean amplitude in a particular latency window from filtered waveforms is equivalent to taking the mean amplitude from a larger window in the original data
• Thus, if you use unfiltered data for this kind of 'average-over-time' measurement, high-frequency noise will average out anyway, and you'll be more clear in your head about what latencies your measure is coming from

– Note that for non-averaged measures like peak amplitude, this is less of an option b/c the high-frequency noise will corrupt these measures significantly

High-pass filtering in time domain
• A little more complicated
• We implement this as doing a low-pass and subtracting it from the original waveform, yielding the left-over high frequencies.
– ERPh = ERP - ERPl = ERP - (IRFl * ERP)

High-pass filtering in time domain
• If we substitute IRFu * ERP for ERP (where IRFu is the unity impulse response, 1 at 0 and 0 everywhere else) and do some more algebra, we get the following:
– IRFh = IRFu - IRFl
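
A small check of this identity, offered as a sketch: the 5-point running average used as the low-pass IRF, the toy waveform (a slow drift plus one sharp blip), and the helper name are assumptions. Filtering with IRFh directly gives the same result as subtracting the low-passed waveform from the original.

```python
def convolve_centered(signal, irf):
    """out[t] = sum over j of irf(j) * signal[t - j], with j = 0 at the middle
    of irf. Points beyond the edges of the signal are treated as zero."""
    half = len(irf) // 2
    out = []
    for t in range(len(signal)):
        total = 0.0
        for k, h in enumerate(irf):
            j = k - half
            if 0 <= t - j < len(signal):
                total += h * signal[t - j]
        out.append(total)
    return out

irf_low = [1 / 5] * 5                  # low-pass: 5-point running average
irf_unity = [0.0, 0.0, 1.0, 0.0, 0.0]  # 1 at time zero, 0 everywhere else
irf_high = [u - l for u, l in zip(irf_unity, irf_low)]  # IRFh = IRFu - IRFl

# Toy waveform: slow linear drift plus one sharp blip at t = 10
erp = [0.1 * t + (2.0 if t == 10 else 0.0) for t in range(21)]

direct = convolve_centered(erp, irf_high)
low_passed = convolve_centered(erp, irf_low)
by_subtraction = [x - low for x, low in zip(erp, low_passed)]

print([round(v, 2) for v in direct])
```

Away from the edges the drift is removed, and the blip is flanked by negative values: the opposite-polarity spreading that the high-pass IRF introduces.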

High-pass distortions
• Application of same 2.5 Hz high-pass filter to three different waveforms

High-pass distortions
• Effect of high-pass filter can be very sensitive to shape of waveform, b/c of the negative shape of the IRF
• If waveform consists of roughly equal positive and negative subcomponents, this will lead to opposite-polarity copies of the IRF which will cancel each other out.

High-pass distortions
• Effect of high-pass filter can be very sensitive to shape of waveform, b/c of the negative shape of the IRF
• If one polarity dominates the waveform, the overshoot will be greater

High-pass distortions
• Can imagine this as reversed polarity of distortions low-pass produced
– In low-pass, you get spreading of peaks in the same polarity as the peaks, giving earlier onset/later offset
– High-pass filter inverts the gaussian of the low-pass response, so produces opposite-polarity spreading
– This can be worse b/c the spreading looks like additional peaks, which are artifactual
– This is what makes high-pass filtering more 'dangerous' than low-pass.

High-pass distortions
• Interesting difference between causal and non-causal high pass filters
– Noncausal high-pass filters take low-frequency info away from the time zone of the waveform and 'push' it forwards and backwards in time
– Causal high-pass filters only push it forward

• Most (all?) of the filters discussed here have been linear
• If your filter is linear, remember that you can combine it with other linear operations in any order (e.g. signal averaging or baselining) with no change in the result. (Artifact rejection is nonlinear though.)
• Keep this in mind when you're designing your analysis protocol: by switching things around you can save computing time and human time
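
The order-independence claim can be illustrated with simulated trials. This is a sketch under simple assumptions: random noise stands in for single-trial EEG, the filter is a 5-point running average, and the helper names are made up for the example.

```python
import random

def smooth(signal, kernel):
    """Centered weighted average; points beyond the edges count as zero."""
    half = len(kernel) // 2
    out = []
    for t in range(len(signal)):
        total = 0.0
        for k, w in enumerate(kernel):
            idx = t + k - half
            if 0 <= idx < len(signal):
                total += w * signal[idx]
        out.append(total)
    return out

def average(trials):
    """Point-by-point mean across trials (signal averaging)."""
    return [sum(values) / len(trials) for values in zip(*trials)]

rng = random.Random(0)  # fixed seed so the example is repeatable
trials = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(20)]
kernel = [1 / 5] * 5

filter_then_average = average([smooth(trial, kernel) for trial in trials])
average_then_filter = smooth(average(trials), kernel)  # 1 filter pass, not 20

max_diff = max(abs(a - b)
               for a, b in zip(filter_then_average, average_then_filter))
print(f"largest difference between the two orders: {max_diff:.2e}")
```

Averaging first means the filter runs once instead of once per trial, which is the kind of saving in computing time mentioned above.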

Bottom line of Off-line Filtering
• Try not to have to filter very much (e.g., get good data)
• If you have to filter offline, try not to do any high-pass filtering (b/c of the danger of artifactual peaks)
• If you have to filter offline, try filtering on a simulated data set first; then you can see exactly what distortions are possible.

Reporting filters
• Especially important if you report stats on data that have had any offline filtering
• Often filters are reported like "A low-pass filter of 30 Hz was applied to the averaged data"
– 2 problems
• No description of shape
• No explicit description of what '30 Hz' applies to

Reporting filter shape
• We saw that amount and type of distortion heavily depend on the shape of the frequency response function, e.g. a sharp cutoff leads to more oscillatory artifacts than a gradual one
• Therefore important to report the shape of the filter you used
– Can be a classic type with well-known features: Hanning, Butterworth, etc.
– If gaussian, can specify FWHM

Specifying measure: Amplitude vs Power
• Although directly related, [voltage] amplitude is not the same as power
– Voltage = current x resistance
– Power = voltage x current
– Power = voltage² / resistance (resistance usually taken to be 1)

Specifying measure: Amplitude vs Power
• Filters can be reported in terms of either, but note that HALF-POWER CUTOFF DOESN'T EQUAL HALF-AMPLITUDE CUTOFF
– 0.707² (V) ≈ 0.5 (P)
– So the frequency at which power is reduced by 50% will be the frequency at which voltage is reduced by only 29%

• Bottom line: important to know (and report) which one you're using
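
The arithmetic behind the two cutoff definitions, as a short sketch. It assumes power is proportional to voltage squared (resistance taken as 1); the decibel lines use the standard 20·log10 (amplitude) and 10·log10 (power) conventions, which are not mentioned in these slides.

```python
import math

# At the half-power cutoff, power gain is 0.5, so amplitude gain is sqrt(0.5).
amplitude_gain_at_half_power = math.sqrt(0.5)
amplitude_reduction_pct = (1 - amplitude_gain_at_half_power) * 100

# At the half-amplitude cutoff, amplitude gain is 0.5, so power gain is 0.5 squared.
power_gain_at_half_amplitude = 0.5 ** 2

# The same two points expressed in decibels.
half_power_db = 10 * math.log10(0.5)
half_amplitude_db = 20 * math.log10(0.5)

print(f"half-power cutoff: amplitude gain {amplitude_gain_at_half_power:.3f} "
      f"({amplitude_reduction_pct:.0f}% reduction), {half_power_db:.1f} dB")
print(f"half-amplitude cutoff: power gain {power_gain_at_half_amplitude:.2f}, "
      f"{half_amplitude_db:.1f} dB")
```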

Online Filtering

Back to the Nyquist Theorem
• The Nyquist Theorem
– An analog signal can be converted to digital as long as the rate of digitization is at least twice as high as the highest frequency in the signal being digitized
– If the signal contains any frequencies higher than this limit (i.e., above half the rate of digitization), they will show up as artifactual low frequencies (aliasing).

Nyquist Theorem
• Ex: 100 Hz sampling rate x ½ = 50 Hz limit
Signal    Cycles seen in 1 sec    Result
10 Hz     10 cycles               correct
20 Hz     20 cycles               correct
40 Hz     40 cycles               correct (but choppy)
60 Hz     40 cycles               incorrect (aliased to 40 Hz)
97 Hz     3 cycles                incorrect (aliased to 3 Hz)
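
The aliased frequencies in this example follow from folding the signal frequency around multiples of the sampling rate. A minimal sketch (the function name is made up for illustration), including a check that a 97 Hz cosine sampled at 100 Hz yields exactly the same samples as a 3 Hz cosine:

```python
import math

def alias_frequency(signal_hz, sampling_hz):
    """Apparent frequency of a sinusoid after sampling at sampling_hz."""
    folded = signal_hz % sampling_hz
    return min(folded, sampling_hz - folded)

fs = 100  # sampling rate in Hz, as in the example above
for f in (10, 20, 40, 60, 97):
    print(f"{f:3d} Hz signal appears as {alias_frequency(f, fs)} Hz")

# The sampled values of a 97 Hz cosine and a 3 Hz cosine are indistinguishable.
samples_97 = [math.cos(2 * math.pi * 97 * n / fs) for n in range(fs)]
samples_3 = [math.cos(2 * math.pi * 3 * n / fs) for n in range(fs)]
max_diff = max(abs(a - b) for a, b in zip(samples_97, samples_3))
print(f"largest sample difference, 97 Hz vs 3 Hz: {max_diff:.2e}")
```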

Nyquist rule
• Requires sampling at twice the fastest frequency present in original waveform to prevent aliasing
– Not just the fastest that you care about
– This is why an online filter is necessary, to make sure that certain frequencies don't even make it to the digitization stage

Nyquist rule
• Does NOT ensure that data will be smooth, only that the frequencies will be accurate
• The rule assumes that the signal is perfectly sinusoidal, which is not necessarily true in physiological systems
• For these reasons, you often want to limit the frequencies to even less than half the sampling rate: 1/3, 1/5, 1/10, ...
– (some refs: Attinger, Anne & McDonald 1966; in Coles, Donchin & Porges 1986 book)

Nyquist rule
• If you don't have very much high-frequency noise, you can sample less and ignore the small amount of aliased noise
• Also if all your noise is at a given frequency (e.g. 60 Hz), you can play some tricks with the sampling rate to limit the aliasing
• But these techniques probably demand some amount of sophistication

Other filtering references
• Glaser & Ruchkin, 1976, Principles of neurobiological signal analysis
• Cook & Miller, 1992, Psychophysiology
• Ruchkin, 1988, in Picton's Handbook of electroencephalography and clinical neurophysiology