Comments on: MBH99 and Proxy Calibration

By: Calibration in the Mann et al 2007 Network Revisited « Climate Audit

Calibration in the Mann et al 2007 Network Revisited « Climate Audit — Thu, 02 Jun 2011 03:18:00 +0000

[…] a post a few months ago, I discussed MBH99 proxies (and similar points will doubtless apply to the other overlapping series) from the […]

By: UC

UC — Mon, 05 May 2008 15:47:52 +0000

Here’s another interesting R series, from MBH98 AD1600 step:

q=57 and p=4 (TPCs 1,2,11,15), so with large n we should have R distributed approximately as chi-square(q-p = 53). This is clearly not the case, and thus the assumptions behind that distribution are not valid.

By: UC

UC — Sat, 03 May 2008 06:54:02 +0000

Hu,

There are several R’s above this value, but out of 1000 trials, there should on average be 50 false rejects at the 95% level. You probably have more than 50, but then the tests are not independent, since they all use the same calibration coefficients, so it’s not too clear how big an overall reject this is.

167 values exceed 3.84; the largest value is 18. To see how close to chi-square(1) this should be, I think I need to walk to the library for Fujikoshi & Nishi (1984) and Davis & Hayakawa (1987). R is large enough to make the confidence interval empty, and I think that is a good indication that the data don’t fit the model. Here’s what I think:

The model fits well in the calibration period, but not elsewhere (cherry picking, spurious correlation). Thus the residual covariance matrix will underestimate the true error covariance matrix, reconstructions based on a subset of the responses will be overconfident in their accuracy, and we have a conflict.

It doesn’t look like it would rule out an 11c MWP.

If I add Western USA (Hughes), for example, R will be very high in 11c as well.

By: Hu McCulloch

Hu McCulloch — Fri, 02 May 2008 13:04:45 +0000

RE UC #50,

Very nice!

If this is the same as Sundberg’s 1999 “R”, it should be chi-square(q-p = 1) as calibration n becomes large. The 95% critical value is 3.84, but since you’re subtracting 3, this would leave 0.84.

There are several R’s above this value, but out of 1000 trials, there should on average be 50 false rejects at the 95% level. You probably have more than 50, but then the tests are not independent, since they all use the same calibration coefficients, so it’s not too clear how big an overall reject this is.
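
As a rough check of the numbers in this comment, here is a minimal Python sketch. It assumes the large-n chi-square(1) approximation and uses the fact that a chi-square(1) variable is the square of a standard normal; the function names are made up for illustration.

```python
from statistics import NormalDist


def chi2_1_critical(level=0.95):
    """Critical value of chi-square with 1 df at the given confidence level.

    A chi-square(1) variable is a squared standard normal, so its critical
    value is the square of the two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * z


def expected_false_rejects(trials, level=0.95):
    """Average number of rejections when the null holds in every trial."""
    return trials * (1 - level)


crit = chi2_1_critical()              # about 3.84
shifted = crit - 3                    # threshold after subtracting 3
expected = expected_false_rejects(1000)
print(round(crit, 2), round(shifted, 2), round(expected))
```

The expected count is an average only; since the tests share the same calibration coefficients they are not independent, so the spread around that average is not the usual binomial one.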

Isn’t there an F interpretation (at least approximately) to the Brown/Sundberg R stat? The calibration period is often far from infinite.

What kind of confidence intervals does Sundberg’s other term “Q” give? It doesn’t look like it would rule out an 11c MWP.

By: UC

UC — Fri, 02 May 2008 06:12:45 +0000

Here’s IMO a nice example of how Brown’s R works:

( http://signals.auditblogs.com/files/2008/05/hegerl_48.png )

I took two proxies from Juckes’ HCA set (mitrie_proxies_v01.csv, Northern Norway and Eastern Asia) and first calibrated them separately to obtain CIs (yellow and red). Then I calibrated them together to obtain R (which is always zero in the univariate case). When the univariate CIs are in disagreement, R gets high values. Brown87:

For such a large R one might question the validity of the observation. Various strategies are then possible, including investigation of the individual error components and seeking further data.

Interestingly, Hegerl2006 is in the IPCC Fig. 6.10.c, where uncertainties (2*calibration residual std!!) are just overlapped, and conflict is not considered at all. But hey, .. 🙂
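
To make the behaviour of R concrete, here is a minimal sketch for two proxies and one target (q=2, p=1). It assumes the large-n form of the statistic, R = r' S^-1 r, where r is the residual of the new proxy values from the fitted calibration line at the GLS estimate of the target, and it ignores finite-sample corrections. The function name and all numbers are hypothetical and are not taken from the Juckes/Hegerl data.

```python
def brown_r(y0, a, b, S):
    """GLS target estimate and inconsistency statistic R for q=2, p=1.

    y0: new proxy values (length 2)
    a, b: calibration intercepts and slopes (length 2 each)
    S: 2x2 residual covariance matrix from the calibration period
    """
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    # Inverse of the 2x2 residual covariance matrix.
    s_inv = [[S[1][1] / det, -S[0][1] / det],
             [-S[1][0] / det, S[0][0] / det]]
    d = [y0[0] - a[0], y0[1] - a[1]]

    def quad(u, v):
        # Bilinear form u' S^-1 v.
        return sum(u[i] * s_inv[i][j] * v[j]
                   for i in range(2) for j in range(2))

    x_hat = quad(b, d) / quad(b, b)           # GLS estimate of the target
    resid = [d[i] - b[i] * x_hat for i in range(2)]
    return x_hat, quad(resid, resid)          # R is zero when proxies agree


identity = [[1.0, 0.0], [0.0, 1.0]]
agree = brown_r([1.0, 2.0], [0.0, 0.0], [1.0, 2.0], identity)
conflict = brown_r([1.0, -2.0], [0.0, 0.0], [1.0, 2.0], identity)
```

In the first call both proxies imply a target of 1 and R is 0. In the second call the proxies separately imply 1 and -1, and R rises to 3.2, which is the kind of conflict that simply overlapping the univariate CIs would hide.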

By: UC

UC — Wed, 30 Apr 2008 09:40:10 +0000

RE main post,

The MBH99 “adjustment” of the PC1 has the effect of “improving” its fit to temperature, and thereby increasing its weight in an MBH-style reconstruction.

Do you mean the CO2 adjustment ( http://www.climateaudit.org/?p=2344#comment-159797 )? As Mann re-scales proxies with detrended std, this adjustment has an insignificant effect on his verification (and calibration) stats. But if the stats are computed correctly, this adjustment does have an effect (in addition to making ‘astronomical cooling’ visible): it makes CIs shorter because recon results will be closer to the calibration mean.

By: Hu McCulloch

Hu McCulloch — Tue, 29 Apr 2008 16:08:13 +0000

Re #26 last paragraph and Steve #29,
I still think there’s an inconsistency between your t of 1.71 for PC1 and the finite 95% CI reported in the post, whether Brown’s formula is used or the Draper and Smith version. The latter overlooks the fact that Y’ will contain an observation error in addition to the coefficient uncertainty, but it is the uncertainty of beta that makes the classical univariate CI blow up for confidence levels greater than the “significance level” (1 minus the p-value) of the slope coefficient.
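
The blow-up can be sketched as follows. The classical univariate CI is the set of x satisfying (y0 - a - b x)^2 <= t^2 s^2 (1 + 1/n + (x - xbar)^2/Sxx), and the coefficient of x^2 in that inequality is b^2 - t^2 s^2/Sxx, which is positive only when the t-statistic of the slope exceeds the critical value t. The code below is a minimal illustration that uses the normal quantile as a large-n stand-in for the t critical value (an assumption; the finite-sample value is slightly larger, which only strengthens the conclusion). The function names are hypothetical.

```python
from statistics import NormalDist


def classical_ci_is_finite(t_slope, level=0.95):
    """True when the classical calibration CI is a bounded interval.

    The interval is bounded only if |t_slope| exceeds the critical value
    (normal approximation used here in place of the t distribution).
    """
    t_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return abs(t_slope) > t_crit


def max_finite_level(t_slope):
    """Largest confidence level with a bounded CI: 1 minus the two-sided p-value."""
    return 2 * NormalDist().cdf(abs(t_slope)) - 1


finite_at_95 = classical_ci_is_finite(1.71)          # t of 1.71 from the post
finite_at_90 = classical_ci_is_finite(1.71, 0.90)
ceiling = max_finite_level(1.71)
```

Under this approximation a slope t of 1.71 supports a bounded interval at the 90% level but not at the 95% level, with the changeover a little above 91%.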

I see now why your units in the millennial graph for PC2 were still in SD units — although the calibration in the preceding graph would permit translating it into dC, this translation is not necessary to see that there will be no HS, since both graphs have exactly the same shape, differing only in location and scale. I had assumed this was supposed to be the reconstruction, but in fact it is just PC2, which is adequate to make the point.

By: UC

UC — Tue, 29 Apr 2008 06:32:36 +0000

Some random notes (while trying to catch up with the main post)

For the MBH99 AD1000 step, the problematic proxies seem to be (CIs over 10 deg C or complex roots, in my implementation of Brown82 2.8):

fran010.txt
itrdb-namer-pc3.dat
npatagonia.dat
quelc1-accum.dat
quelc2-accum.dat
urals-new.dat
westgreen-o18.dat
morc014

Really inconsistent calibration results can be obtained by using Brown’s formulas with Juckes’ HCA proxy set (originally from Hegerl et al). Univariate results look OK, but they are in conflict with each other (except in the calibration period!), and thus Brown’s R gets really high values and the CIs go floor-to-ceiling (see http://signals.auditblogs.com/2007/07/09/multivariate-calibration-ii/ ).

Hu, #26,

This effect is very important when one is trying to reconstruct say decadal or tridecadal averages of temperature from annual proxies. The point estimate of the average is just the average of the point estimates. However, the CI of the average is not the average of the CI’s, but in fact is much narrower!

With some exceptions, see http://www.climateaudit.org/?p=2955#comment-231229
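
A minimal sketch of the narrowing effect in the quoted paragraph, assuming m annual estimates with a common error standard deviation sigma and a common pairwise error correlation rho (both assumptions made for illustration, as is the function name). The variance of the mean is then sigma^2/m * (1 + (m-1)*rho).

```python
import math


def halfwidth_of_mean(sigma, m, rho=0.0, z=1.96):
    """Approximate 95% half-width for the mean of m equicorrelated estimates."""
    var_mean = sigma ** 2 / m * (1 + (m - 1) * rho)
    return z * math.sqrt(var_mean)


annual = halfwidth_of_mean(0.2, 1)                   # single-year half-width
decadal_indep = halfwidth_of_mean(0.2, 10)           # independent errors
decadal_corr = halfwidth_of_mean(0.2, 10, rho=1.0)   # fully shared errors
```

With independent errors the decadal half-width shrinks by a factor of sqrt(10); with fully shared errors it does not shrink at all, so how much narrower the CI of the average is depends on how much of the error is common to all years.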

Steve #29,

The y-axis for the proxies is, as you observe, SD Units, as the proxies were standardized to standard deviation units in the calibration period, which is consistent with Brown equation 2.3.

Brown’s 2.3 means that the targets are standardized, not the responses. Not sure if it is just to make derivations easier; in Brown87 there’s no scaling of targets.

By: UC

UC — Sun, 27 Apr 2008 16:53:13 +0000

Steve,

There’s a vast amount of hand-waving as I don’t think that any of the parties really understand what the methodologies are actually doing, let alone what the other party is doing. There’s nothing in their exchanges that rises much above a blog exchange.

Agreed. The exchange has some humorous features. And journals like Science are involved in this. Fun to read, but the alarmist in me gets a bit worried. Where is science going..

Mann’s replies such as

Osborn et al. (2006) have shown that the anomalous initial warmth and much of the subsequent long-term cooling trend in the Erik simulation is an artifact of inappropriate model initialization

are quite hilarious.

By: Steve McIntyre

Steve McIntyre — Sat, 26 Apr 2008 13:08:16 +0000

#44. There’s a vast amount of hand-waving as I don’t think that any of the parties really understand what the methodologies are actually doing, let alone what the other party is doing. There’s nothing in their exchanges that rises much above a blog exchange.

However, there’s another VERY important nuance that gets lost in the X on Y versus Y on X discussion – and this goes to the Partial Least Squares point that I’ve been trying to articulate clearly.

The X on Y versus Y on X distinction gets substantially extinguished at the reconstruction stage. Mann’s Y on X ends up being a (Partial Least Squares) X on Y after the matrix algebra cancellation in the 1-D case. IF you have substantially uncorrelated proxies as in MBH, then the only difference between the coefficients in X on Y and (PLS) X on Y is the rotation by the matrix and for a substantially uncorrelated network, that matrix is “near” the identity matrix.
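
The near-identity argument can be illustrated with a small sketch. It assumes that the matrix in question is the proxy cross-product matrix (my reading; the symbol did not survive in the comment), and it compares one-component PLS weights, which are proportional to the cross-products of each proxy with the target, against least-squares coefficients, which apply the inverse cross-product matrix to the same vector. The toy series and function names are hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def pls_weights(proxies, temp):
    """One-component PLS weights: cross-product of each proxy with the target."""
    return [dot(p, temp) for p in proxies]


def ols_two_proxies(proxies, temp):
    """Least-squares coefficients of temp on two centered proxies (2x2 normal equations)."""
    p1, p2 = proxies
    a, b, d = dot(p1, p1), dot(p1, p2), dot(p2, p2)
    g1, g2 = dot(p1, temp), dot(p2, temp)
    det = a * d - b * b
    return [(d * g1 - b * g2) / det, (a * g2 - b * g1) / det]


# Two centered, exactly uncorrelated proxies with equal variance.
proxies = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
temp = [2.0, 0.0, 1.0, -3.0]

w = pls_weights(proxies, temp)
beta = ols_two_proxies(proxies, temp)
ratios = [w[i] / beta[i] for i in range(2)]
```

Here the two ratios are equal, so the two coefficient vectors differ only by a common scale factor. With correlated proxies the off-diagonal term b is non-zero and the vectors are no longer proportional.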

So some of von Storch’s comments end up being applicable, though his reasoning isn’t at all precise and his terminology is very unhelpful. And of course, the Mann/Ammann responses are little better than a squid emitting ink to make things incomprehensible. Here is a recent abstract that contains a helpful description applicable to Mann/Ammann ink releases:

Six ink release types were observed: pseudomorphs, pseudomorph series, ink ropes, clouds/smokescreens, diffuse puffs and mantle fills. Each species released ink throughout all or most of its depth range; inking was not limited to shallow, sunlit waters. Individuals of each species produced one ink release type more commonly than other types, however, multiple ink types could be released by individuals of all species. Common behaviors preceded and/or followed each release type; pseudomorphs and pseudomorph series were generally associated with escape behaviors, while ink ropes, clouds, and puffs normally involved the animal remaining adjacent to or amid the ink. Deep-sea [species] may use ink for defensive purposes similar to those of shallow-dwelling species when they release pseudomorphs, pseudomorph series, or large clouds, and may use ink puffs in intra-specific communication. The function of ink ropes and mantle fills is unknown.