Comments on: McShane and Wyner Discussion

By: ACT

ACT — Sun, 12 Dec 2010 23:40:08 +0000

All the comments and McShane and Wyner’s rejoinder are now up online at http://www.imstat.org/aoas/next_issue.html

I haven’t read them properly yet but it’s good to see Tingley get a mauling in the rejoinder.

It’s a shame the editor felt the need to write:

“Thus, while research on climate
change should continue, now is the time for individuals and governments to
act to limit the consequences of greenhouse gas emissions on the Earth’s
climate over the next century and well beyond.”

[RomanM: Sounds like a good excuse for a new thread, Steve. I see there is a submission from Ross and yourself in the mix.] 🙂

By: ACT

ACT — Wed, 17 Nov 2010 13:22:12 +0000

I’m obviously coming rather late to the party here, but Tingley’s comments about the LASSO being inappropriate have been bugging me for a while.

Tingley claims the LASSO is inappropriate for multi-proxy studies because LASSO is a method for sparse regression (i.e. most appropriate when only a small subset of the possible covariates are expected to have a true effect). Tingley notes that the LASSO is equivalent to putting a double-exponential prior on the regression coefficients and that this does not make sense from a scientific perspective. He is correct that, for a fixed value of the bounding parameter, the LASSO is equivalent to such a Bayesian analysis. However, M&W use k-fold cross-validation to choose the bounding parameter. If, as is claimed, the temperature proxies are all highly informative, the cross-validation would find that very little bounding is needed (i.e. tend towards a standard unconstrained regression with lambda =1). Hence the choice of bounding parameter is highly data driven, meaning the actual procedure is nothing like using a double-exponential prior.
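The cross-validation argument above can be illustrated with a small sketch. This is not M&W's code and it does not use glmnet: it is a hypothetical pure-Python coordinate-descent lasso on made-up data in which every covariate is informative, and all names (`lasso_fit`, `cv_mse`, `lambda_max`, the grid of fractions) are assumptions chosen for illustration. It uses the penalty form of the lasso, in which lambda = 0 means no shrinkage at all, which may differ from the bounding parametrization referred to in the comment.

```python
import random

def soft_threshold(z, t):
    """Shrink z towards zero by t; the core of the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_fit(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n))*||y - X b||^2 + lam*||b||_1 by coordinate descent.
    No intercept is fitted; the toy data below have mean roughly zero."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date
    col_ms = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # covariance of column j with the partial residual (column j removed)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ms[j]
            step = new_bj - beta[j]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * step
                beta[j] = new_bj
    return beta

def lambda_max(X, y):
    """Smallest penalty at which every coefficient is exactly zero."""
    n, p = len(X), len(X[0])
    return max(abs(sum(X[i][j] * y[i] for i in range(n)) / n) for j in range(p))

def cv_mse(X, y, lam, k=5):
    """k-fold cross-validated mean squared prediction error (interleaved folds)."""
    n = len(X)
    sq_err = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in held_out]
        y_tr = [y[i] for i in range(n) if i not in held_out]
        beta = lasso_fit(X_tr, y_tr, lam)
        for i in held_out:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            sq_err += (y[i] - pred) ** 2
    return sq_err / n

# Synthetic data (an assumption): all five covariates carry signal.
random.seed(0)
n_obs, n_cov = 60, 5
X = [[random.gauss(0, 1) for _ in range(n_cov)] for _ in range(n_obs)]
y = [sum(row) + random.gauss(0, 0.5) for row in X]

lam_top = lambda_max(X, y)
grid = [lam_top * f for f in (1.0, 0.5, 0.1, 0.05, 0.01, 0.0)]
cv_errors = [cv_mse(X, y, lam) for lam in grid]
best_lam = grid[cv_errors.index(min(cv_errors))]
print("lambda / lambda_max chosen by CV:", best_lam / lam_top)
```

The point of the sketch is only that the penalty is picked by held-out prediction error rather than fixed in advance, so how much shrinkage is applied depends on how informative the covariates turn out to be.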

Tingley, in the detailed version of his discussion, tries to show how poor M&W’s use of the LASSO is by running the LASSO himself, but he states:

“I do not perform the cross-validation procedure used in MW2010 to determine
the LASSO penalization parameter (lambda on page 13 of MW2010). Instead, I use the default setting of the glmnet package, which sets lambda to be 0.05 times the smallest value of lambda for which all coefficients are zero. The LASSO penalization is thus very small.”

Hence, instead of doing what M&W have proposed, he is making sure the regression will always be really sparse. Unsurprisingly, he then finds that his version of the LASSO performs badly.

It is a classic straw man argument. However, I suspect Tingley doesn’t understand the importance of the cross-validation step, because what he has done is so easy to rebut and so obviously wrong to anyone with knowledge of the LASSO.

By: jak

jak — Sat, 02 Oct 2010 18:48:20 +0000

In reply to dahuang. dahuang, I am not sure of your point. The second paper you reference certainly mentions M&W but does not contain any criticism of the paper. I am also not certain why you would use the adjective "flamboyant" to describe the authors. This paper appears to have a refreshing objective of trying to bring modern statistical methods to climate science. If anything, it is a (mild) criticism of "mainstream" reconstructions for not making use of statistical knowledge available.

By: dahuang

dahuang — Sat, 02 Oct 2010 11:02:55 +0000

Tingley is very active in fighting with M&W nowadays. Pay attention to the latter paper, which is coauthored by a flamboyant team of professional statisticians! M&W have real enemies now! http://www.people.fas.harvard.edu/~tingley/

Tingley, Martin P. Spurious predictions with random time series: The Lasso in the context of paleoclimatic reconstructions. A Discussion of "A Statistical Analysis of Multiple Temperature Proxies: Are Reconstructions of Surface Temperatures over the Last 1000 Years Reliable?" by Blakeley B. McShane and Abraham J. Wyner. Submitted to the Annals of Applied Statistics. A more detailed version is also available.

Tingley, Martin P., Peter F. Craigmile, Murali Haran, Bo Li, Elizabeth Mannshardt-Shamseldin and Bala Rajaratnam. Piecing together the past: Statistical insights into paleoclimatic reconstructions. Manuscript submitted, and currently available as Technical Report No. 2010-09, Department of Statistics, Stanford University.

By: See - owe to Rich

See - owe to Rich — Tue, 28 Sep 2010 19:55:09 +0000

I’d like to comment on the RE statistic because, like Dave Dardinger, I recently learned about it from the “Hockey Stick Illusion”, in which SM is credited with most of the analysis, and because I’d like to understand it better. In a shortened notation, from the present article we get:

RE = 1-MSEx/MSEy = (MSEy-MSEx)/MSEy

Here MSEy is the mean squared error from the simple mean, or intercept, model y, and MSEx is from the more complex model x.

Now, if model y were actually a submodel of model x, with both fitted by least squares on the same data, then RE would be like an F-statistic and necessarily non-negative (so zero would not be a sensible threshold for significance).

But MSEx and MSEy are apparently measured over a “holdout”, or “verification”, period, using parameters estimated from a distinct calibration period, and hence RE > 0 is not mathematically enforced. Even so, given that model x presumably has more parameters and fewer degrees of freedom, one should still expect RE to be bounded away from zero to be significant. One might expect the threshold to depend on the degrees of freedom, as in an F-statistic.
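A toy calculation may make the distinction concrete. The sketch below is an illustration only, with made-up numbers and hypothetical helper names (`fit_line`, `reduction_of_error`). It assumes the baseline "model y" predicts the calibration-period mean and "model x" is a one-covariate least-squares line. Scored on the data it was fitted to, RE reduces to the usual R-squared and cannot be negative. Scored on a separate verification period, nothing prevents it from going below zero.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def reduction_of_error(x_cal, y_cal, x_ver, y_ver):
    """RE = 1 - MSEx/MSEy, with both models fitted on the calibration data
    and both errors measured on the verification data."""
    a, b = fit_line(x_cal, y_cal)
    cal_mean = sum(y_cal) / len(y_cal)  # the intercept-only model y
    mse_x = mse(y_ver, [a + b * x for x in x_ver])
    mse_y = mse(y_ver, [cal_mean] * len(y_ver))
    return 1 - mse_x / mse_y

# Made-up calibration data.
x_cal = [0.0, 1.0, 2.0, 3.0]
y_cal = [0.0, 2.0, 1.0, 3.0]

# Verification on the calibration data itself: RE equals R-squared here.
re_in_sample = reduction_of_error(x_cal, y_cal, x_cal, y_cal)

# Verification on a separate period where the fitted trend does not continue.
re_out_of_sample = reduction_of_error(x_cal, y_cal, [4.0, 5.0], [1.0, 2.0])

print(re_in_sample, re_out_of_sample)
```

This says nothing about where a significance threshold should sit; it only shows that out-of-sample RE is not bounded below by zero.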

I’d be grateful for any clarification that experts can add to that (vague) assertion.

Rich.

By: Pat Frank

Pat Frank — Tue, 28 Sep 2010 05:56:01 +0000

In reply to Paul Dennis. Paul, risking Steve's wrath, I hereby withdraw my criticism for any proxy temperature reconstruction based in physical theory. :-) That, and very best wishes for your efforts, and those of your colleagues.

By: Geoff Sherrington

Geoff Sherrington — Tue, 28 Sep 2010 03:06:24 +0000

One of the key comments for me in M&W is –
“On the other hand, limiting the
validation exercise to these two blocks is problematic because both blocks
have very dramatic and obvious features: the temperatures in the initial
block are fairly constant and are the coldest in the instrumental record
whereas the temperatures in the final block are rapidly increasing and are
the warmest in the instrumental record. Thus, validation conducted on
these two blocks will prima facie favor procedures which project the local
level and gradient of the temperature near the boundary of the in-sample
period. However, while such procedures perform well on the front and
back blocks, they are not as competitive on interior blocks. Furthermore,
they cannot be used for plausible historical reconstructions!”

This throws the spotlight back on the instrumental temperature record and whether the above statement is correct. There continue to be many reasons to suggest that the temperature record is inaccurate. Unless the relation between a proxy response and a local instrumental temperature is accurately characterised, we can get the instrumental era tail shaking the millennium dog. (Accuracy should not be confused with correlation or precision.) There are dangers in accepting the temperature record at face value.

BTW, it would be an interesting analysis if both the initial and final blocks were horizontal lines of similar averages. Many such discrete locations exist as weather stations from 1900 onwards. We might end up with a historical record from books, showing MWP agriculture in cold places, alongside a statistical reconstruction showing an invariant temperature for over 1000 years. I mention this only in support of the M&W statement that problems arise because the typical proxy response to temperature is weak, and the approach breaks down in this limiting case.

By: ul

ul — Mon, 27 Sep 2010 18:13:18 +0000

OT:

Hi,

I just noticed a little error in your Blogroll, it’s “Klimazwiebel”, not “Klimazweibel”.

(feel free to remove this comment)

greetings from Germany, UL

By: Steve McIntyre

Steve McIntyre — Mon, 27 Sep 2010 17:52:49 +0000

Folks, please make comments that are specific to McShane and Wyner rather than editorializing against the concept of proxies. The issues at hand are technical ones about reconstructions in the presence of large and imponderable “noise” and the behavior of pseudoproxies.

By: Paul Dennis

Paul Dennis — Mon, 27 Sep 2010 08:39:11 +0000

In reply to mondo. Mondo, I'm sorry; my comment related explicitly to the oxygen isotope, O-18, as a proxy for temperature and not tree rings. I think your comments about tree rings and plant growth are very valid.