Data "Snooping"

Add this phrase from economics to your vocabularies to describe the "other" studies where proxies with known HS shapes like bristlecones and Yamal are used time after time. Here’s a website with some links. They cite Sullivan, Timmermann and White (1999) and White (2000) for the following definition:

"Data-snooping occurs when a given set of data is used more than once for purposes of inference or model selection."

Hello??

The topic is actively being researched in econometrics and the methods need to be applied to multiproxy studies.
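
To make the definition concrete, here is a minimal sketch in Python of what reusing data for selection does. Everything in it is an illustrative assumption rather than anything from the papers cited: the candidates are pure white noise, and the function names `snoop_once` and `snoop_experiment`, the 200 candidates and the 50-observation halves are all made up for the example. The series that best matches the target on the first half of the data is picked, and the same series is then scored on the second half.

```python
import random
import statistics


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def snoop_once(rng, n_candidates=200, n_obs=50):
    """Pick the noise series that best fits the target in-sample, then
    score the same series on fresh data. Returns (|r| in, |r| out)."""
    total = 2 * n_obs
    target = [rng.gauss(0.0, 1.0) for _ in range(total)]
    candidates = [[rng.gauss(0.0, 1.0) for _ in range(total)]
                  for _ in range(n_candidates)]
    # Selection step: the first half of the data chooses the winner.
    best = max(candidates,
               key=lambda c: abs(correlation(c[:n_obs], target[:n_obs])))
    r_in = abs(correlation(best[:n_obs], target[:n_obs]))
    # Honest check: the second half was never used for selection.
    r_out = abs(correlation(best[n_obs:], target[n_obs:]))
    return r_in, r_out


def snoop_experiment(n_trials=20, seed=1):
    """Average the in-sample and out-of-sample |r| over several trials."""
    rng = random.Random(seed)
    results = [snoop_once(rng) for _ in range(n_trials)]
    mean_in = statistics.fmean([r[0] for r in results])
    mean_out = statistics.fmean([r[1] for r in results])
    return mean_in, mean_out


if __name__ == "__main__":
    mean_in, mean_out = snoop_experiment()
    print(f"mean |r| of selected series, in-sample:     {mean_in:.2f}")
    print(f"mean |r| of selected series, out-of-sample: {mean_out:.2f}")
```

The selected series looks far better on the data used to select it than on data it has not seen, even though no candidate has any real relationship to the target.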

Wahl and Ammann Again #2

Here’s a pretty little graph that I think that you’re going to see more of. One is using the Wahl-Ammann variation of MBH methodology applied to MBH data; the others are from low-order red noise.
Pink – no PC reconstruction from WA without strip-bark and Gaspé; black – from low-order red noise.

I worked this up to illustrate a point in the no-PC part of Wahl and Ammann, but it bears a little commentary separately. You may recall jae trying to wrap his mind around overfitting and bender getting frustrated with him. If it makes either of them happier, MBH – WA variation – is a wonderful example of overfitting that may illustrate the point for jae, who might then undertake to explain the problems to Ammann and Wahl.

The pink graphic is the no PC reconstruction from WA without strip-bark and Gaspé, resulting in a network of 70+ series (down from the 95 in their Scenario 2 due to the strip-bark sites). If you did a multiple linear regression of NH temperature against 70+ series with little mutual relationship in a calibration period of length 79, I think that you’d agree that it was overfitting. So what would a reconstruction look like from such a process? I haven’t illustrated that here (I’ll do that now that I think of it), but it would look a lot like the above graphic.

Here I’ve used PLS (partial least squares) rather than OLS; see my linear algebra posts for the proof that MBH regression can be reduced to partial least squares. OLS multiplies the partial least squares coefficients by (X^T X)^{-1}. If the network is close to orthogonal, then the PLS coefficients will not be changed all that much. In the simulations, to do it quickly, I’ve used a simple network with AR1=0.2 and then re-scaled the variance to match that of the series being illustrated. As you can see, there’s negligible visual difference between the MBH result and red noise.
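
For reference, the relation between the two sets of coefficients can be written out as below. This is only a sketch under an assumption: it reads "partial least squares coefficients" as the one-component form, proportional to X^T y, with X the centered proxy matrix over the calibration period and y the calibration temperature series. The constant c is a symbol introduced here for illustration.

```latex
\hat{\beta}_{\mathrm{PLS}} \propto X^{T} y ,
\qquad
\hat{\beta}_{\mathrm{OLS}} = (X^{T} X)^{-1} X^{T} y .

% Near-orthogonal network: X^T X is close to a multiple of the identity.
X^{T} X \approx c\, I
\quad\Longrightarrow\quad
\hat{\beta}_{\mathrm{OLS}} \approx \tfrac{1}{c}\, X^{T} y .
```

In that near-orthogonal case the two coefficient vectors differ by roughly a scalar, and a scalar is absorbed once the reconstruction is re-scaled to match the variance of the target.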

All the reconstructions have high r2 (greater than 0.5) in the calibration period, and ~0 verification r2. This would be enough for non-climate scientists to conclude that there was overfitting.
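
The calibration/verification contrast is easy to reproduce in a toy setting. The sketch below is not the WA or MBH code and uses no real proxy data. It assumes 70 independent AR1 series with coefficient 0.2, a target that is pure white noise unrelated to any of them, a 79-step calibration period and a 48-step verification period (the 48 is my assumption). The weights are the one-component PLS form X^T y, and the names `one_trial` and `overfit_experiment` are hypothetical. Re-scaling the variance is skipped because it does not change r2.

```python
import random
import statistics


def ar1_series(n, rho, rng):
    """n values of an AR(1) process x[t] = rho * x[t-1] + e[t]."""
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def r_squared(a, b):
    """Squared Pearson correlation of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab * sab / (saa * sbb)


def one_trial(rng, n_proxies=70, n_cal=79, n_ver=48, rho=0.2):
    """Return (calibration r2, verification r2) for one noise network."""
    n = n_ver + n_cal
    cal, ver = slice(n_ver, n), slice(0, n_ver)
    # Target is independent of every proxy: any fit is spurious.
    target = [rng.gauss(0.0, 1.0) for _ in range(n)]
    proxies = [ar1_series(n, rho, rng) for _ in range(n_proxies)]
    y_mean = statistics.fmean(target[cal])
    y_cal = [y - y_mean for y in target[cal]]
    means, weights = [], []
    for p in proxies:
        m = statistics.fmean(p[cal])
        means.append(m)
        # One-component PLS weight: covariance with the target (X^T y).
        weights.append(sum((v - m) * y for v, y in zip(p[cal], y_cal)))
    recon = [sum(w * (p[t] - m) for w, p, m in zip(weights, proxies, means))
             for t in range(n)]
    return (r_squared(recon[cal], target[cal]),
            r_squared(recon[ver], target[ver]))


def overfit_experiment(n_trials=20, seed=0):
    """Average calibration and verification r2 over several networks."""
    rng = random.Random(seed)
    results = [one_trial(rng) for _ in range(n_trials)]
    mean_cal = statistics.fmean([r[0] for r in results])
    mean_ver = statistics.fmean([r[1] for r in results])
    return mean_cal, mean_ver


if __name__ == "__main__":
    mean_cal, mean_ver = overfit_experiment()
    print(f"mean calibration r2:  {mean_cal:.2f}")
    print(f"mean verification r2: {mean_ver:.2f}")
```

With about as many predictors as calibration years, a substantial calibration r2 comes for free from noise alone, while the verification r2 stays near zero.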

Another distinctive feature of overfitting is the characteristic downward notch at the start of the calibration period – this is worth paying close attention to in the WA diagrams where it’s all too visible.

In the red noise and non-bristlecone cases, the reconstruction reverts to close to zero fairly quickly. If you re-insert bristlecones or HS-shaped series, their impact is to change the shaft location to more and more negative, while preserving the general geometry. I’ve been talking about the interrelation of spurious regression and overfitting for some time without illustrating it as clearly as I’d like. Fortunately, the Wahl and Ammann variation has introduced overfitting on such a colossal scale that it’s easy to show the effect.

I doubt that anyone in our lifetimes will ever again see elementary overfitting on the scale of Wahl and Ammann.

Francois and Dano on Agricultural Yields

Here are posts from Francois and Dano on Tilman et al.

Wahl and Ammann Again #1

I asked KNMI what were the studies that had “refuted” our work. It seems to be Wahl and Ammann. I’ve never understood the traction of Wahl and Ammann with climate scientists. I doubt that any of them have worked through the details, but Wahl and Ammann issued a press release that all our claims were “unfounded” and that seems to be enough to settle things in climate world. Of course, they proved no such thing, but press releases seem to be what people pay attention to – this is true in mining promotions as well.

Holland and Sweden

I will be travelling to Europe in the week of Sept 9-15 to give presentations in Holland and Sweden. There are two presentations in Holland on Sept 14: a private presentation in the morning at KNMI and a public presentation at 7.30 in the evening at the Free University in Amsterdam, presented by Natuurwetenschap & Techniek (who published one of the first articles on M&M in Feb 2005). On Sept 11, I will be making a presentation at the KTH (Royal Institute of Technology) International Climate Seminar in Stockholm.

The KNMI Annual Report, just published, has a section on the hockeystick with a bold heading: “the points criticised have been mostly refuted in various studies” and in the running text:

As far as science is concerned: since the start of 2005, the points criticised by McIntyre and McKitrick have been mostly refuted in various studies.

Willis E on Hansen and Model Reliability

Another interesting post from Willis:

James Hansen of NASA has a strong defense of model reliability here. In this paper, he argues that the model predictions which have been made were in fact skillful (although he doesn’t use that word). In support of this, he shows the following figure:

(Original caption) Fig. 1: Climate model calculations reported in Hansen et al. (1988).

CMIP Control Runs

Willis Eschenbach sent in the following information about the Coupled Model Intercomparison Project (CMIP). I’ve not checked the analysis myself, but it is an interesting topic and well worth a separate thread. There are some other pretty good posts like this. If people want to suggest some back posts for individual threads, it just takes a couple of minutes for me to transpose them and I’m happy to do so.

Willis writes:

Dana provided a very relevant link to the Coupled Model Intercomparison Project (CMIP) project. Here’s Figure 1 from the project, showing the “control run” for each computer. (Before they try to hindcast the past or forecast the future, they do a “control run”. This is described by the CMIP as a run with constant forcing, rather than a run where the CO2 is changing over time.) Here are the results from that control run:

SOURCE http://www-pcmdi.llnl.gov/projects/cmip/overview_ms/control_tseries.pdf

Tephras in Ecuador

Donald T. Rodbell et al 2002 (with overlaps to Mark et al and Abbott et al), entitled "A Late Glacial–Holocene Tephrochronology for Glacial Lakes in Southern Ecuador", here correlates glacial lakes in southern Ecuador according to widespread tephra (volcanic deposits).

A couple of interesting points – some BIG differences between radiocarbon dates from material adjacent to well-dated Tephra F, dated at 2500 BP in well-dated Pallcacocha and over 4400 BP in another location – the difference attributed to recycling of old carbon from upvalley peat.

Also a curious observation about wind patterns in the last millennium.

Glacier Bay, Alaska

George posted up the following link as supposedly supporting Lonnie Thompson’s views on Quelccaya:

Early results on Reid Inlet, where Reid Glacier has now backed up out of the ocean, show that the glacier had retreated beyond where it is now more than 10,000 years ago, advanced to the sea by 8,000 years ago, again retreated beyond where it is now about 7,000 years ago, and the ice once again advanced to Reid Inlet beginning about 5,000 years ago.

I’m not sure how the quote supports his position. I haven’t seen any academic publications on the location in question, but collated a little information from abstracts of the authors to conferences over the past few years, showing a curious result.

Bender's Plot of Hurricane Count

Here’s bender’s plot of the number of hurricanes, showing the difference between plotting on an annual and boxed basis. There are statistical issues in fitting trend lines to spiky data like this, which bender is well aware of and pointed out in the predecessor thread. If Curry is unaware of these issues, what does that say? If she is aware of these issues and ignored them, what does that say?