OK, folks. We finally extracted enough information from Martin Juckes to be able to replicate SI Figure 1. I’ll show here how one gets from point A to point B, which will help us understand exactly why Juckes did this the way he did. One more time, here is Juckes’ Figure 1 with its legend.
Fig. 1. Proxy principal components: the first principal component of the North American ITRDB network of Mann et al., 1998. (1) Using the normalisation as in Mann et al. 1998, (2) as (1), but using full variance for normalisation rather than detrended variance, (3) normalised and centred on the whole series, (4) centred only (5) as archived by MBH1998. 21-year running means.
Here is where I started from: a plot of the 70-series AD1400 network with the Mannian, covariance (cen) and correlation (std) PCs scaled by their standard deviation over 1400-1980 and centered over 1856-1980 (instead of 1902-1980). When Mann re-scaled series for his regression step (not that such re-scaling is necessary), he did so by dividing by the standard deviation over the calibration period.
Figure 2. 70 series AD1400 NOAMER network. Scaled by standard deviation 1400-1980; centered on 1856-1980. All 3 PCs are here flipped from the versions archived by Juckes (series 16, 22, 24).
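The scaling and centering used for this starting plot can be sketched roughly as follows. This is a minimal illustration, not the script behind the figure: the function name `scale_and_center`, the toy series and the choice of the population standard deviation (rather than the sample one) are all assumptions made here for the example.

```python
from statistics import mean, pstdev

def scale_and_center(years, values, scale_span, center_span):
    """Divide by the standard deviation over scale_span,
    then subtract the mean over center_span (both spans inclusive)."""
    def in_span(year, span):
        return span[0] <= year <= span[1]

    # Population standard deviation over the scaling period (an assumption)
    sd = pstdev([v for y, v in zip(years, values) if in_span(y, scale_span)])
    scaled = [v / sd for v in values]

    # Re-center so the mean over the centering period is zero
    offset = mean([v for y, v in zip(years, scaled) if in_span(y, center_span)])
    return [v - offset for v in scaled]

years = list(range(1400, 1981))
# Toy series (not real proxy data): flat until 1900, then a ramp
toy = [0.0 if y < 1900 else (y - 1900) * 0.05 for y in years]

out = scale_and_center(years, toy, (1400, 1980), (1856, 1980))
```

Because centering only shifts the series, the result still has unit standard deviation over 1400-1980, while its mean over 1856-1980 is zero.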
I note in passing that these 3 series centered on 1400-1980 are used in Juckes’ Comment on EE Figure 2, in the form shown below.
Figure 3: As Juckes Comment figure 1, but using the grey curve is generated using the “princomp” function instead of the “svd” function, so that the data is automatically centred. Also shown is the PC generated when the data is also standardised (grey dashed curve). As in figure 1, the PCs have been smoothed using a 21 year block average instead of Gaussian smoothing. Here the curves have also been normalised to unit variance. All these curves have, by construction, zero mean (prior to smoothing). [The curves have been end-padded with 10 years of the closing value].
In response to persistent questioning here, Juckes said that he used rms normalization in his SI Figure 1 – something nowhere mentioned in the article. The standard deviation is the square root of the mean squared distance from the mean; the rms (root mean square) is calculated the same way except that it uses the squared distance from 0. In a quick Google search, I saw this technique used in electrical circuits, where alternating current has a natural 0. I have never seen this method used in multiproxy climate studies. You’d have thought that this is something that Juckes should have mentioned in an article that dwells on normalization and standardization issues, and, even more, something he should have justified. Where else did he use this method? Did he sometimes use standard deviation and sometimes rms?
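To make the difference between the two measures concrete, here is a minimal sketch using made-up numbers. The function names `sd` and `rms` are chosen here for illustration, and the population form (dividing by n) is an assumption. The two are linked by the identity rms² = sd² + mean², so they coincide only when the series has zero mean.

```python
import math

def sd(x):
    """Population standard deviation: root mean squared distance from the mean."""
    m = sum(x) / len(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def rms(x):
    """Root mean square: root mean squared distance from zero."""
    return math.sqrt(sum(v ** 2 for v in x) / len(x))

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy values, mean = 3
m = sum(x) / len(x)

# sd(x)**2 is 2 and rms(x)**2 is 11, consistent with 11 = 2 + 3**2
identity_gap = rms(x) ** 2 - (sd(x) ** 2 + m ** 2)
```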
Anyway, applying rms to the same network yielded the following figure. It didn’t change the scale of the covariance (cen) and correlation (std) PCs very much, but it sharply reduced the scale of the Mannian PC1. Is this a good thing or a bad thing? Who knows? Wegman said that the Mannian PC1 was an incorrect methodology – a point that I agree with. So I’m not sure why anyone cares about different procedures for re-scaling it. There is endless talk in paleoclimate about how to re-scale things – shouldn’t this innovation by Juckes have been mentioned and justified? In spaghetti-graph terms, the main effect of dividing by the rms is that the Mannian PC1 appears less of an outlier than it does when scaled by its standard deviation.
Figure 4. 70 series AD1400 NOAMER network. Scaled by rms 1400-1980; centered on 1856-1980. All 3 PCs are here flipped from the versions archived by Juckes (series 16, 22, 24).
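A toy illustration of why rms scaling can shrink one series much more than another: a series whose mean over the scaling period is far from zero has an rms well above its standard deviation, so dividing by the rms reduces its amplitude by the factor sd/rms. The numbers below are invented and are not the actual PCs. Whether this fully accounts for the reduction in the Mannian PC1 is an assumption, not something established here.

```python
import math

def sd(x):
    """Population standard deviation (distance measured from the mean)."""
    m = sum(x) / len(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def rms(x):
    """Root mean square (distance measured from zero)."""
    return math.sqrt(sum(v ** 2 for v in x) / len(x))

centered = [-2.0, -1.0, 0.0, 1.0, 2.0]   # toy series with zero mean
offset = [v + 3.0 for v in centered]     # same shape, mean shifted to 3

# Amplitude after rms scaling relative to amplitude after sd scaling
ratio_centered = sd(centered) / rms(centered)   # 1: the two scalings agree
ratio_offset = sd(offset) / rms(offset)         # about 0.43: series is shrunk
```

The zero-mean series is unaffected by the choice of normalization, while the off-center series ends up at less than half its sd-scaled amplitude.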
Now this figure still doesn’t replicate SI Figure 1 very well. However, it is an exact replication of Juckes’ SI Figure 2 as shown below. All 3 PCs are here flipped from the versions archived by Juckes (series 16, 22, 24). I’m going to discuss orientation of PC1s on another occasion in connection with Juckes’ comment on EE Figure 1. The salient point here is that we’ve exactly replicated Juckes’ SI Figure 2 up to and including PC1 orientation.
Figure 5. Juckes SI Fig. 2 Caption: As Fig. S1, except allowing padding of up to 10 years data, so that the proxy network is 70 instead of 56 trees (see section 2 of this document for lists of proxies).
So the next step should be pretty easy: apply the same algorithm to the 56 site network, the sites which do not require extension in the 1970s (the PCs from which are PCs 1-15 in the 30-series archived network). We’ll ask later why Juckes has done a separate analysis on a 56-site network.
Figure 6. 56 series AD1400 NOAMER network. Scaled by rms 1400-1980; centered on 1856-1980. All 3 PCs here have been assigned the same sign as the corresponding series in SI Figure 2.
OK, the orientations don’t work. So let’s see what happens when the signs of two series are changed. We now nearly have Juckes SI Figure 1. There’s still something in the centering.
Figure 7. 56 series AD1400 NOAMER network. Scaled by rms 1400-1980; centered on 1856-1980. All 3 PCs here have been oriented to match the corresponding series in SI Figure 2.
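The sign of a principal component is arbitrary, which is why orientation has to be settled separately from scaling and centering. One hypothetical rule for matching a PC to a reference series is sketched below: flip the sign when the two are negatively correlated. The function `orient_like` and the toy values are assumptions for illustration. Nothing here is claimed to be the rule Juckes used.

```python
def orient_like(series, reference):
    """Return series, multiplied by -1 if it co-varies negatively with reference."""
    ms = sum(series) / len(series)
    mr = sum(reference) / len(reference)
    # Sign of the covariance decides the orientation
    cov = sum((s - ms) * (r - mr) for s, r in zip(series, reference))
    return [-s for s in series] if cov < 0 else list(series)

reference = [0.0, 1.0, 2.0, 3.0]       # toy "target" orientation
flipped = [0.5, -1.0, -2.2, -2.9]      # toy PC with the opposite sign

oriented = orient_like(flipped, reference)
unchanged = orient_like(reference, reference)
```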
There’s still a loose end in the centering. Again what is the purpose of these guessing games? Juckes’ refusal to provide a coherent description of his scaling and centering procedures is intolerable.
More substantively, there are some important differences between the 56-site network and the 70-site network. The covariance PC from the 56-site network has more of an uptick than the covariance PC from the 70-site network. I looked at the sites and the reason is simple – 12 of the 14 sites that make up the difference are non-bristlecones. So the dominance of bristlecones is greater in the 56-site network than in the 70-site network, and this increases bristlecone weight in the PC1 however calculated. (In MM05 EE, we observed that all the PC1s in the AD1000 network had a HS shape because there were so many bristlecones – without the need for any mathematical artifice. In this context, we observed that the reliance on bristlecones was the elephant in the room within all these arcane discussions of PC methods. The Team never likes to discuss bristlecones and Juckes pointedly does not do any sensitivity analyses on presence/absence of bristlecones – a point that I’ll return to.)
Now I’m wondering when, in the rest of Juckes’ study, he used standard deviations and when he used rms. Does one have to guess at every step? I wonder if he’ll tell us.