Comments on: MBH Proxies in Bo Li et al

By: More on Li, Nychka and Ammann « Climate Audit

More on Li, Nychka and Ammann « Climate Audit — Thu, 22 Mar 2012 22:08:16 +0000

[…] 8/29/07 CA thread MBH Proxies in Bo Li et al has already discussed certain data issues in depth, but the 11/18/07 general discussion Li et al […]

By: bender

bender — Mon, 19 Nov 2007 18:01:08 +0000

Actually, it makes sense to have both threads, this one for "MBH proxies" used in Li et al, and the other one for Li et al 2007 methodology independent of the proxy issues.

By: bender

bender — Mon, 19 Nov 2007 17:51:58 +0000

Was not aware of this thread. May I bump it and suggest that it is relevant to confidence testing in Loehle (2007)?

By: UC

UC — Wed, 05 Sep 2007 08:50:11 +0000

Silly me, used sample variance of calibration residuals directly, and forgot degrees of freedom. Inflation is 50, not 300 (without AR(2) GLS fit).

#67: "they admit that MBH99 is a terrible overfit"

They should state it a bit more explicitly, it would save time 😉

By: Jean S

Jean S — Wed, 05 Sep 2007 07:59:06 +0000

#66: Yes, that applies directly to MBH99; that's why I said they admit that MBH99 is a terrible overfit. But notice that the inflation factor of 1.3 is not directly for ICE (with the AD1000 step); it's for their model where they allow AR(2) for the noise (GLS fitting).

#64: If you have time, run their code to obtain the true residuals. Then plot the residuals against the fitted temperature values. I would be surprised if there is no pattern. In any case, IMO trying to work with the MBH99 AD1000 proxies is a waste of time: garbage in, garbage out.

By: UC

UC — Wed, 05 Sep 2007 06:07:59 +0000

Li:

"The difference between the prediction and the actual observations is an unbiased estimate of the statistical prediction error. If there is no overfitting, the variance of the observed prediction error is expected to be equal to the prediction variability derived from our linear model."

If I got it correctly, for the AD1820 step, the variance of the observed prediction error is 300 times the predicted variance. Compare that to the one they obtained with the AD1000 step:

"This approach suggests an inflation factor of 1.30."

Quite interesting: Wahl and Ammann would probably deem the #65 reconstruction unreliable, and they would have to remove proxies to obtain an acceptable RE.
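The comparison Li describes (observed prediction error variance against the prediction variance implied by the linear model) can be sketched on synthetic data. This is only an illustration, not the Li et al. code: the function name `inflation_factors`, the proxy model (a weak temperature signal plus white noise) and all parameter values are assumptions. It regresses temperature directly on the proxies by OLS and reports the ratio twice, once with the residual variance divided by the residual degrees of freedom and once naively divided by the calibration sample size.

```python
import numpy as np


def inflation_factors(n_cal=100, n_ver=2000, n_proxies=14, seed=0):
    """Ratio of observed verification error to model-based prediction variance.

    Hypothetical setup: synthetic temperature and proxies, with temperature
    regressed directly on the proxies (ICE-style) by OLS.
    """
    rng = np.random.default_rng(seed)
    n = n_cal + n_ver
    temp = rng.normal(size=n)
    # Assumed proxy model: weak common temperature signal plus white noise.
    proxies = 0.3 * temp[:, None] + rng.normal(size=(n, n_proxies))
    X = np.column_stack([np.ones(n), proxies])

    X_cal, y_cal = X[:n_cal], temp[:n_cal]
    X_ver, y_ver = X[n_cal:], temp[n_cal:]

    # OLS fit on the calibration period only.
    beta, *_ = np.linalg.lstsq(X_cal, y_cal, rcond=None)
    resid = y_cal - X_cal @ beta
    rss = resid @ resid
    dof = n_cal - X.shape[1]  # observations minus fitted coefficients

    # Model-based prediction variance: sigma^2 * (1 + x' (X'X)^-1 x),
    # averaged over the verification points.
    xtx_inv = np.linalg.inv(X_cal.T @ X_cal)
    leverage = np.einsum("ij,jk,ik->i", X_ver, xtx_inv, X_ver)
    scale = np.mean(1.0 + leverage)

    # Observed mean squared prediction error in the verification period.
    observed = np.mean((y_ver - X_ver @ beta) ** 2)
    return {
        "dof_corrected": observed / (rss / dof * scale),
        "naive": observed / (rss / n_cal * scale),
        "n_cal": n_cal,
        "dof": dof,
    }


result = inflation_factors()
print(result)
```

The naive ratio exceeds the degrees-of-freedom-corrected one by exactly `n_cal / dof`, so the gap grows quickly as the number of proxies approaches the length of the calibration period. Raising `n_proxies` towards `n_cal` in this sketch shows that effect.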

By: UC

UC — Tue, 04 Sep 2007 07:30:33 +0000

Li et al. use only 14 proxies:

Motivated by the recent discussion of uncertainty in the MBH99 reconstruction (North et al., 2006), we illustrate our statistical procedures for the purpose of this article by restricting our network of proxy records to the 14 series originally used in MBH99 for the period back to the year 1000 (see table 1 in MBH99).

As ICE (b in post #64) was the correct method to replicate Figure 1 (and the R code indicates that ICE it is), I tried the same method with the AD1820 network. There are 112 proxies, so overfitting (if present) should be more visible with this network (compared to AD1000). Here's the result, with 1850-1859 as a verification period: [figure; larger version linked in the original].

By: UC

UC — Fri, 31 Aug 2007 19:05:09 +0000

Jean

Well, it is hard to tell if it should be called classical or inverse

If I understood it right, we have two approaches in statistical calibration

a) CCE (classical calibration estimator (Williams 69)), indirect regression (Sundberg 99), inverse regression (Juckes 06), and

b) ICE (inverse calibration estimator (Krutchkoff 67)), direct regression (Sundberg 99)

Residuals from Li et al. Fig 1. are obtained using b) for sure. That’s the one that assumes that very special prior for the temperature. And as those residuals are correlated with temperature, that AR(2) noise model is partly based on temperature autocorrelation! But I’m walking on thin ice here, as I haven’t had time to comprehend the full paper 😉
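The difference between a) and b) can be sketched for a single proxy. This is an illustration on synthetic data, not the Li et al. code; the function name `cce_ice` and the simulated series are assumptions. CCE fits the proxy on temperature and inverts the fitted line, while ICE fits temperature on the proxy and predicts directly. The sketch also computes the in-sample ICE residuals, to check how they relate to the actual and to the fitted temperature.

```python
import numpy as np


def cce_ice(temp_cal, proxy_cal, proxy_new):
    """Univariate classical (CCE) and inverse (ICE) calibration estimates."""
    # CCE: fit proxy = b0 + b1 * temp, then solve for temp.
    b1, b0 = np.polyfit(temp_cal, proxy_cal, 1)
    cce = (proxy_new - b0) / b1
    # ICE: fit temp = g0 + g1 * proxy and predict directly.
    g1, g0 = np.polyfit(proxy_cal, temp_cal, 1)
    ice = g0 + g1 * proxy_new
    return cce, ice


# Assumed synthetic calibration data: one noisy proxy of temperature.
rng = np.random.default_rng(1)
temp = rng.normal(size=80)
proxy = 0.5 * temp + rng.normal(size=80)
proxy_new = np.array([-2.0, 0.0, 2.0])

cce, ice = cce_ice(temp, proxy, proxy_new)
r2 = np.corrcoef(temp, proxy)[0, 1] ** 2

# In-sample ICE residuals, compared with actual and fitted temperature.
_, fitted = cce_ice(temp, proxy, proxy)
resid = temp - fitted
corr_resid_temp = np.corrcoef(resid, temp)[0, 1]
corr_resid_fit = np.corrcoef(resid, fitted)[0, 1]

print("CCE:", cce)
print("ICE:", ice)
print("corr(resid, temp):", corr_resid_temp)
print("corr(resid, fitted):", corr_resid_fit)
```

In this univariate OLS setting the ICE estimate is the CCE estimate shrunk towards the calibration mean by the factor r^2. The in-sample ICE residuals are uncorrelated with the fitted values by construction, but their correlation with the actual temperature is sqrt(1 - r^2), so a weak proxy leaves residuals that largely track the temperature itself.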

By: Mike B

Mike B — Fri, 31 Aug 2007 18:57:26 +0000

I found a pdf of what appears to be a presentation on Bo Li’s website. Is there also a paper somewhere? Or is the pdf it (at least for now)?

Although I need to study these results to be more specific, my first reaction to what is up on the website is that Bo Li has made a critical error that many young statisticians make: they don’t ask enough questions before diving into the mathematics. As a result, we’re left with many of the same fundamental flaws M&M exposed in the original hockey stick, but covered up with more complicated mathematics.

By: Jean S

Jean S — Fri, 31 Aug 2007 14:31:07 +0000

re #59 (UC):

1) Well, it is hard to tell if it should be called classical or inverse. That is, they still haven't figured out that the problem under study is known in statistics as (multivariate) calibration. Their model (eq (1)) is misspecified: I wonder if, as statisticians, Li and Nychka have ever wondered why in the regression model the variables on the left (usually denoted by Y) are called dependent/response variables and the ones on the right (X) are independent/explanatory variables. For useful references, see, e.g., here.

2) Don't be too harsh, UC :) That would actually amount to testing the validity of the model. I see some positive signs here: they implicitly admit that MBH99 is a terrible overfit (p.4 "For example, MBH98 and MBH99 applied essentially an OLS to fit the linear model", and on the next page they notice that even the GLS fit is an overfit).