Today I’m doing a first reconnaissance of Butterworth filters, used extensively in Mann et al 2008.
The comments here are notes and thoughts and I hope that some issues can be clarified by readers. As of 12.30 pm today, I think that that the main problem that I’ve had pertains to a difference between R and Matlab implementations, but it illustrates an interesting effect.
We’ve had discussions of smoothing from time to time. Just before the release of Mann et al 2008, UC posted an interesting comment on Mannian smoothing here based on an earlier smoothing paper. Within 24 hours, Mann changed his code for lowpassadaptive.m, noting (but not describing) the change here, “minor errors fixed 9/2/08″, failing to cite or acknowledge Climate Audit as a source, although Penn State policies appear to require students to properly acknowledge sources under similar circumstances. Consult the interesting comments by Roman M, UC and others.
This thread was written just before the release of Mann et al 2008. UC observed shortly after its release that lowpass.min is used everywhere in Mann et al 2008, something that I’m now trying to come to grips with. My take on the earlier thread was that the issues mostly seemed to involve padding of future values and their potential impacts. This inter-relates to today’s topic.
I ran into serious problems trying to replicate Mann’s smoothing operations in R. Here’s the Dongge series where I encountered problems with Butterworth filters applied to uncentered data (though it’s taken a while to diagnose the problem.) What I observed was the huge displacement at the end of the series (which seems to be “zero-seeking” behavior at the end point.)
If you are so inclined, you can download Mann’s version of this series as follows:
x=ts(x[temp],start=min(year[temp])) ; N=length(x); N #2003
The figure below shows the results of simple Gaussian filter and a 10-point Butterworth filter with frequency 0.05, padding according to lowpassmin.m (as shown in the SI) and selection of padding alternatives according to Mann’s procedures. In this example, padding was not an issue, as all the padding alternatives led to negligibly different results. The problem lay entirely with the implementation of the Butterworth filter, perhaps with the filtfilt algorithm. (The results are actually worse than indicated on this graphic as the overshoot on the right goes FAR off the chart.)
Here’s the code for my implementation of the Butterworth filter (using the signal package in r), which has an implementation of filtfilt (which I presume to do the same thing as the Matlab filtfilt, but this may be an issue.)
mannextend=rbind(dummy1,mannextend,dummy2);dim(mannextend) #2023 9
#extend beginning and end with 3 padding schemes
working=apply(mannextend,2, function(x) filtfilt(bf,x) )
[Note -for the subsequent graphic below, I've centered before filtering and then added back the mean,]
I’ve corresponded with UC and Jean S about this over the last couple of days. It seems that using Butterworth filters (and filtfilt) in R on off-center series seems to introduce a problem that is unrelated to end-point padding. UC observes:
This might cause the difference with Matlab. filtfilt.m sets the initial state + does minimum roughness, and centering doesn’t matter. If zeros are padded centering probably matters .
I’m pretty sure that this is the difference.
Sorting this out has not been entirely easy. In an earlier post, UC noted that the R-manual says:
“In this version, we zero-pad the end of the signal to give the reverse filter time to ramp up to the level at the end of the signal.”
filtfilt reduces filter startup transients by carefully choosing initial conditions, and by prepending onto the input sequence a short, reflected piece of the input sequence. For best results, make sure the sequence you are filtering has length at least three times the filter order and tapers to zero on both edges.
[2 pm Eastern: Dealing here only with the zero-tapering recommendation, UC specifically reviewed this question and emailed me that his view is that the zero-tapering part of the recommendation doesn't matter given the Matlab implementation of filtfilt and he doesn't understand why it's in the manual. Arggh.]
There are some odd comments about filtfilt in the manuals. The manual for the R-package “signal” says of the implementation of filtfilt:
This corrects for phase distortion introduced by a one-pass filter, though it does square the magnitude response in the process. That’s the theory at least. In practice the phase correction is not perfect, and magnitude response is distorted, particularly in the stop band.
In response to an inquiry to Matlab, Matlab (worryingly) stated to a CA reader:
The filtfilt function in Signal Processing Toolbox does not implement the initial condition proposed in Gustafsson paper. However, in his paper, Gustafsson pointed out that we used to have a bug in the initial condition we implemented in filtfilt. Therefore, we put Gustafsson’s paper as a reference to the function.
Padding was raised as a problem in the previous thread. UC observed to me today that the padding in lowpassmin as used in Mann et al 2008 is 50% less than the padding used in lowpass.m as reported in Mann’s 2008 smoothing paper ! THe padding code for lowpass.m in the reconstruction paper is:
while the corresponding padding in the smoothing paper lowpass.m is three times larger:
% pad far enough to remove any influence of boundaries of padded series on the interior
This would be 30 years of padding if the code in Mann et al (GRL 2008) were used, rather than 10 years of padding for the code in Mann et al (PNAS 2008). One would really hope for a more stable software platform.
There are some disquieting comments about both Butterworth filters and filtfilt in technical literature and manuals, though I’m not sure that these comments are relevant to the present situation. One caveat here says:
This filter is usually used in frequency domain work. It offers great slope and phase match, but adds lots of overshoot/preshoot errors to a transient or single shot event. A Butterworth filter is good for frequency domain analysis, but bad for time domain data acquisition.
Despite all these caveats, by centering the data, the problem seems to disappear as shown below. Here I first centered the series, then butterworthed it (there was negligible difference between 10 and 30 years of padding) then subtracted the mean to regain native centering. Whatever the theoretical concerns over padding and filtfilt, this smooth looks OK for practical purposes.
Does any of this matter? Dunno. I’m not really sure why this smoothing is done before standardization. I guess the moral will be to try the CPS recon without using Mannian smoothing and see if it makes any difference. (Once this bottleneck is removed, we’ll be pretty close to being able to emulate the CPS recon, though the EIV procedure still is quite distant.)
Now when filtfilt.m receives this padded result, it adds further padding at each end as well. In the case used by Mann, Butterworth (10, .025), it pads each end with 30 more values which are both reflected and flipped about the ends of the already padded sequence.