Noise in Jones et al [1998]

People often have a hard time grasping how difficult it is to statistically distinguish between the vaunted multiproxy studies and red noise. Here are a few interesting images from the Jones et al [1998] proxy roster, which I’ve been working on.

Figure 1. Scatterplot of Jones proxies (grey), all scaled to the 1901-1950 interval as in Jones et al [1998]. Cyan: annual average of proxies. Red: smoothed graph. Blue: level required for the mean to be "significantly" different from 0 using a critical t-statistic of 1.96 (this is too low given autocorrelation, but is the OLS value). Bottom panel: standard deviations by year.

The most remarkable impression is surely the tremendous lack of consistency of proxies in any given year. You see this used to argue that the MWP was regional or occurred in different places at different times, but surely it is just as remarkable to see the lack of consistency in 20th century proxies, especially since they have already undergone a cherrypicking process even to be in the table. There are very few years in which the average of the proxies can be said to be inconsistent with a mean of 0. The blue lines show a 95% confidence interval. There are 990 years here, so there are actually fewer cyan values outside the confidence intervals than one would expect from random numbers with a mean of 0, and this is without even bringing autocorrelation into the equation.
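As a rough illustration of the benchmark (not the calculation behind Figure 1), here is a minimal sketch that counts how often the annual mean of pure white noise falls outside a 1.96 band over 990 years. The number of proxies (`n_proxies=10`), the unit variance and the seed are assumptions made for the example, and no autocorrelation is included.

```python
import numpy as np

def count_exceedances(n_years=990, n_proxies=10, seed=0):
    """Count years where the mean of white-noise 'proxies' exceeds the 95% band."""
    rng = np.random.default_rng(seed)
    # Hypothetical proxies: independent standard normal noise, no signal at all.
    noise = rng.standard_normal((n_years, n_proxies))
    annual_mean = noise.mean(axis=1)
    # Standard error of a mean of n_proxies unit-variance values, times 1.96.
    threshold = 1.96 / np.sqrt(n_proxies)
    return int(np.sum(np.abs(annual_mean) > threshold))

expected = 0.05 * 990  # about 5% of years should exceed the band by chance
print(f"expected about {expected:.1f} years, simulated {count_exceedances()}")
```

Under these assumptions roughly 50 of the 990 years would be expected outside the band by chance alone, which is the yardstick for the cyan values.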

The lower panel shows standard deviations by year. The standard deviation for all the proxies is 1.15 (a little higher than one since 1901-1950 standardization is used.) If an annual "signal" were being captured, one would assume that the standard deviation by year would be much reduced from the overall variability, as proxies presumably swung to one side or the other of the baseline. The average of the annual standard deviations is 1.09, so that there is negligible reduction of variance. I suspect that a little red noise in the benchmark would do the same.
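The reported figures give a ratio of 1.09/1.15, or about 0.95, between the average annual standard deviation and the overall standard deviation. A minimal sketch of what that ratio looks like with and without a shared annual signal follows; the proxy count, the signal size and the white-noise model are all assumptions for illustration, not properties of the Jones data.

```python
import numpy as np

def sd_ratio(signal_sd, n_years=990, n_proxies=10, seed=1):
    """Ratio of mean within-year sd to overall sd for simulated proxies."""
    rng = np.random.default_rng(seed)
    # Hypothetical common signal shared by every proxy in a given year.
    signal = rng.normal(0.0, signal_sd, size=(n_years, 1))
    # Independent unit-variance noise for each proxy and year.
    noise = rng.standard_normal((n_years, n_proxies))
    proxies = signal + noise
    overall_sd = proxies.std(ddof=1)                     # all values pooled
    mean_annual_sd = proxies.std(axis=1, ddof=1).mean()  # sd across proxies, averaged over years
    return mean_annual_sd / overall_sd

print(f"noise only:        {sd_ratio(0.0):.2f}")
print(f"signal sd = noise: {sd_ratio(1.0):.2f}")
```

With no shared signal the ratio stays close to 1 (slightly below, because the sample standard deviation of a handful of values is biased low); a shared signal comparable in size to the noise pulls it well below that.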

Since this graphic could be a bit confusing if there were high interannual correlations, next is a version in which the grey shows the range of proxy values in any given year. I’ve also shown a blow-up for the 20th century. The series levels off after 1935. Interestingly, the maximum value of the red curve is in 1396 – compare this to Crowley’s imprecations against the inconceivability of a reconstruction with a warm early 15th century (not that we advocate this.)

Figure 2. As above. Vertical line is in 1935.

As a test of signal, I’ve applied the method which I presented in my analysis of Esper: a studentized glauch… This shows that there is virtually no "signal" in this set of proxies (compare to chart in my Esper post). Again, it’s hard to see that the non-signal in the MWP is any worse than the non-signal in the 20th century (this chart includes the Polar Urals series in the 11th century, which is almost surely incorrectly dated).

Is this all due to regional variability or is it due to lousy proxies?

This entry was written by Stephen McIntyre, posted on Sep 25, 2005 at 2:26 PM, filed under Jones et al 1998, Multiproxy Studies.

36 Comments

John A

Posted Sep 25, 2005 at 2:49 PM | Permalink

You know what this reminds me of? The PEAR experiments.

In those supposed experiments, they are testing for global human consciousness to be registered in “anomalies” in the statistical behavior of random generating machines (called “eggs”) which are located around the world.

They claimed a statistically significant anomaly after 9/11/2001 caused by human consciousness of what had happened on that day, but the analysis was abstruse and “after the fact”. Why, for example, didn’t they run the analysis constantly and have the “eggs” tell them when something had happened?

Back to the posting, can anyone seriously tell me that the accumulation of all of the proxies used by Mann and Jones as plotted above can be fairly distinguished from Gaussian noise?

The only way to get a signal is to know which is noise and which is signal, which implies some sort of selection mechanism. But there are lots of selection mechanisms, so they run lots of different ones until they select the ones that give them “statistical significance” using their preferred metric.

Then they publish.
TCO

Posted Sep 25, 2005 at 7:07 PM | Permalink

Have you done this same sort of analysis on other reconstructions (e.g. MBH)? Do they all have this issue or is Jones special?
Steve McIntyre

Posted Sep 25, 2005 at 8:43 PM | Permalink

I haven’t done this precise format, but I suspect that they will all be the same. I’ve been organizing information on precisely which proxies cause differences between 11th and 20th century results in each study, to prove the hypothesis that it’s bristlecone/foxtails, Polar Urals and Dunde in the first round, with a few new ones in the later studies: Sol Dav tree rings, upwelling subpolar diatoms at Oman. MBH98 15th century proxies nearly all have correlations to gridcell temperature between -0.1 and 0.1.
TCO

Posted Sep 25, 2005 at 8:54 PM | Permalink

That would seem to indicate that there is no proxy hockey stick. That proxies just show a flat regime. IOW, given that we know temps have gone up this century, they don’t show anything at all. You’re left with drawing a straight line and then appending the instrumental stuff: basically Daly’s criticism of the graph being drawn for effect.
Brooks Hurd

Posted Sep 25, 2005 at 11:21 PM | Permalink

Steve,

Have you been able to perform the same analysis on Yang’s data?
Louis Hissink

Posted Sep 26, 2005 at 2:14 AM | Permalink

Steve,

Can you plot that initial graph with the Y axis as a logarithmic scale? It would be interesting to see what that graphs at. (I use Grapher on the windows machines, and changing the scaling of the axes is easily done. But I am typing this on the Tigerised Mac-mini and have no idea of a Mac version of Grapher to achieve the same result….).
Peter Hearnden

Posted Sep 26, 2005 at 2:51 AM | Permalink

That would seem to indicate that there is no proxy hockey stick. That proxies just show a flat regime. IOW, given that we know temps have gone up this century, they don’t show anything at all. You’re left with drawing a straight line and then appending the instrumental stuff: basically Daly’s criticism of the graph being drawn for effect.

But surely that’s the point? You can criticise the evidence, or the methods, but what it doesn’t do is show much global or hemispheric climate fluctuation until recently. It’s pretty flat up until the instrumental era. So, what does a flat graph with a rapidly rising latter stage look like? Oh, difficult that one… So what’s the problem everyone? Steve has confirmed the main points of the HS in this post. It stands because that’s what the evidence shows: not much climate variability until recently.

The above is why I’d like to see the sceptics’ HS. Because the evidence ATM points to a HSish recent millennium or two and, if it’s different, the critics need to get out there and start finding the evidence rather than spend all this time rubbishing that which, if you were doing science, ought to be built on. Fat chance.
Louis Hissink

Posted Sep 26, 2005 at 3:11 AM | Permalink

Peter Hearnden,

Commenting as a geologist, I am faced with the problem that the earth, including its mass and thermally active interior, is not factored as an energy input into the various climate models.

Your point that the record is flat until the instrumental era is valid. But the fact that the pre-instrumental data were interpreted by post-instrumental methodologies makes the comparison invalid. The pre-instrumental data were never measured to the precision of post-instrumental data (precision having nothing to do with accuracy). Statistically, a case of mixing pears and oranges.

Proxy data for the past have to be interpreted within their scope, and until an accurate transformation of that data to the present data is presented, that data cannot be serially concatenated as Mann et al did in their various papers.
Paul Gosling

Posted Sep 26, 2005 at 4:13 AM | Permalink

Given the problem of trying to calculate a global mean temperature, even with the instrumental record, never mind the proxies, is there merit in abandoning this approach and instead looking at regional temperatures, for instance Africa, N. America etc.? This should reduce the variability in the data. Are there enough proxies at regional scales to do this and, if so, would it be possible to interpret, for instance, 3 continents showing an upward trend, 3 a downward and one no trend? Would N. America count for more than Oceania etc?
fFreddy

Posted Sep 26, 2005 at 5:49 AM | Permalink

Re #7, Peter Hearnden

But surely that’s the point? You can criticise the evidence, or the methods, but what it doesn’t do is show much global or hemispheric climate fluctuation until recently. It’s pretty flat up until the instrumental era.

No, Peter, TCO’s point is that the proxies are just as flat throughout the 20th century as they are before the 20th Century. Given that the instrumental record shows the proxies are wrong in the 20th century, there is no reason to suppose they are any less wrong prior to the 20th Century. Hence, they do not disprove the existence of the Little Ice Age, the Mediaeval Warm Period, and so on.
Peter Hearnden

Posted Sep 26, 2005 at 5:56 AM | Permalink

Re #9. Surely, Paul, what you’re doing is saying you can’t calculate global temperatures but you can (contradictorily?) perhaps calculate local temperatures and from them global ones (at least that’s what your last sentence suggests to me)? But surely that’s what paleoclimatologists do? Or is your comment more knowing than it seems?
Peter Hearnden

Posted Sep 26, 2005 at 6:06 AM | Permalink

Re #10. Ff, OK, perhaps that’s what TCO is saying. But, if that’s your/his opinion, you can only say ‘my view is that I can’t put the warming of the instrumental record era in any historic context since the proxy records aren’t, imo, up to it’, not dismiss it because you think the past was warmer/colder, because you clearly don’t think we know that? Right or wrong?

I say ‘look, it’s pikestaff plain it’s warming, get used to it and let’s get on with addressing the problem’.
fFreddy

Posted Sep 26, 2005 at 6:38 AM | Permalink

Re #12

…not dismiss it because you think the past was warmer/colder because you clearly don’t think we know that? Right or wrong?

Wrong. These failed proxies are not the only evidence for past temperatures. Recall that before MBH came along, there was – and still is – lots of other evidence for LIA/MWP/etc. If it makes you happier, the existence of the MWP was the consensus view.

I say ‘look, it’s pikestaff plain it’s warming, get used to it and let’s get on with addressing the problem’.

Depends on how you want to address it, which depends on why it is warming, and when it will stop doing so. If it is caused by the same natural cycle that produced the MWP/RWP/etc., then it makes sense to plan for a similar peak temperature plus a safety margin, and deal with the effects.
It only makes sense to try to deal with the causes if we are sure we know what they are. For all that the anti-capitalist propagandists are trying to blame current warming on the use of hydrocarbons, I am profoundly unconvinced (as you might have noticed …).
Paul Gosling

Posted Sep 26, 2005 at 6:50 AM | Permalink

Re #11

I was suggesting that you cannot calculate a global temperature from proxies, because coverage is not good enough and it is not likely that you will get a warm/cold signal from every region in any particular year. But you might be able to calculate regional temperatures if there are enough proxies on a regional scale. So you might have 20 proxies for N. America and 2 for South America. Trying to get an ‘Americas’ temperature would make no sense because a) the coverage is so different and b) they are under the control of different climate systems. Would you expect synchronous cold and warm years? If not, why combine them? My statistics knowledge is pretty thin, but I am pretty sure combining data from different ‘populations’ and treating them as the same population is bad practice.
Peter Hearnden

Posted Sep 26, 2005 at 6:54 AM | Permalink

Oh, yes, strong anecdotal evidence of LIA’s and MWP’s in places and at different times. So, it’s back to mid eighties Lamb* I suppose (nothing to do with it showing what people here want though…).

Re para two, I know you’re unconvinced. Still, at least you didn’t say you’re unconvinceable :). It’s surely clear though that (OK we can argue magnitude) there’s something else going on atm?

* why no scrutiny of his work here I wonder? Right message I guess so no need?
TCO

Posted Sep 26, 2005 at 7:53 AM | Permalink

Pete: yeah that was my point. I’m quite happy with having one measurement that shows a flat line in the early days (if that’s the case). It’s just that the only way to know that proxies work is to compare them to something that is verified (in essence to 20th century, when instruments show warming). If it’s always a straight line, how do you know it even records temps?

P.s. I’m of the GW is (likely) happening but very slow and not impacting me, not something that worries me school. I’m much more interested in fixing air pollution in cities or bottom dragging nets which damage the ocean floor.

P.s.s. And regardless of there being a global LIA/MWP, I still think that currently the CO2 is raising things (what…a degree?)
David Stockwell

Posted Sep 26, 2005 at 12:14 PM | Permalink

Steve, TCO, the lack of signal due to temperature in averages of proxies is exactly what you would expect of an average of an upside-down U-shaped response. Whether a tree responds positively, negatively or not at all depends on its position w.r.t. the optimal temperature for that species. The selection of proxies for positive recent trends, or correlation with recent temperature, would give a recent upturn; however, there is no reason to expect that the positive correlation with temperature persists throughout their history, as their position changes from one side of the ‘hump’ to the other. So there is no reason to expect a MWP/LIA from averages of proxies, or reliable signals from any particular trees even.
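A minimal sketch of the cancellation argument, under loudly stated assumptions: a noise-free sinusoidal temperature, a quadratic (upside-down U) growth response, and a hypothetical set of trees whose optima are spread symmetrically around the mean temperature. None of these numbers come from real chronologies.

```python
import numpy as np

def correlations_with_temperature(optima=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Correlation with temperature for each tree and for the site average."""
    # Ten noise-free temperature cycles, evenly sampled (hypothetical driver).
    t = np.arange(1000)
    temperature = np.sin(2 * np.pi * t / 100)
    # Inverted-U response: growth peaks when temperature equals the tree's optimum.
    growth = np.array([-(temperature - opt) ** 2 for opt in optima])
    per_tree = [np.corrcoef(g, temperature)[0, 1] for g in growth]
    average = np.corrcoef(growth.mean(axis=0), temperature)[0, 1]
    return per_tree, average

per_tree, average = correlations_with_temperature()
for opt, r in zip((-2.0, -1.0, 0.0, 1.0, 2.0), per_tree):
    print(f"optimum {opt:+.0f}: r = {r:+.2f}")
print(f"average of all trees: r = {average:+.2f}")
```

Trees with an optimum above the temperature range correlate strongly and positively, those below it strongly and negatively, and in this symmetric setup the average of all of them is essentially uncorrelated with temperature.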
TCO

Posted Sep 26, 2005 at 12:20 PM | Permalink

It would not surprise me intuitively if the last 2000 years was flat. But it also wouldn’t surprise me if it wasn’t. Obviously some quite big changes have occurred over longer time scales. That’s no reason to say that last 2000 years had to have humps or valleys. But also no reason why it couldn’t have.

In the end, what matters is the quality of the proxies: how much we trust them as indicators. I don’t know all the details here, but obviously it is an issue in active debate. It wouldn’t surprise me, given the complex nature of the algorithm and the many proxies, if some cherry picking had been done by Mann. But it also wouldn’t surprise me if Steve was “cherry picking the criticisms” (for instance of substudies’ issues with confounding effects). Time will tell.
TCO

Posted Sep 26, 2005 at 12:22 PM | Permalink

Dave, interesting intuition.
Steve McIntyre

Posted Sep 26, 2005 at 1:37 PM | Permalink

Dave, I agree. I’ve mentioned this phenomenon a few times and the havoc that it wreaks on the multiproxy project. Look at the results for TTHH here.
David Stockwell

Posted Sep 26, 2005 at 7:25 PM | Permalink

It seems so obvious when you see it. An interesting example is to imagine a temperature driver as a sine wave with no noise. The growth of trees below the optimum temperature (positive correlation) will be in phase, trees above the optimum (negative correlation) will appear phase shifted 180 degrees. The most interesting are trees at the optimum. Growth of these should show a period doubling.
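The sine-wave thought experiment can be checked numerically. This is a minimal sketch assuming a quadratic response centred exactly on the optimum and a hypothetical 100-year cycle; in this setup the effect shows up as a doubled frequency, i.e. the growth series repeats twice per temperature cycle.

```python
import numpy as np

def dominant_period(series):
    """Period (in samples) of the strongest non-constant Fourier component."""
    centered = series - series.mean()
    spectrum = np.abs(np.fft.rfft(centered))
    k = int(np.argmax(spectrum[1:])) + 1  # skip the zero-frequency bin
    return len(series) / k

years = np.arange(1000)
temperature = np.sin(2 * np.pi * years / 100)  # hypothetical 100-year cycle, no noise
growth_at_optimum = -(temperature ** 2)        # inverted-U response centred on the optimum

print(dominant_period(temperature))        # 100.0
print(dominant_period(growth_at_optimum))  # 50.0
```

The result follows from the identity sin²(x) = (1 − cos 2x)/2: a tree sitting at the optimum sees growth fall on both the warm and the cold half of each cycle.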
Steve McIntyre

Posted Sep 26, 2005 at 8:58 PM | Permalink

There’s another important nonlinearity (I think). You need warmer weather for a tree to germinate and get started than to keep going. So year 1’s will concentrate on the top of the sine wave.

Trees tend to die in cold cycles, thus last years will concentrate on the bottom of the sine wave.

If you do an age-fitting curve, the slope will be biased, because it will allocate some variance attributable to declining (on average) temperature during the life of a tree to a juvenile aging effect. This will cause the temperature in germination periods to be under-estimated.
TCO

Posted Sep 26, 2005 at 9:00 PM | Permalink

Good point. That’s why your enemies are right to use low frequency filter. Ha! 😉
David Stockwell

Posted Sep 26, 2005 at 9:30 PM | Permalink

The presence of period doubling should be easily testable. It is possible that a lot of the variation that is seen between proxies is due to location relative to optimum temperature, and not to regional effects, as suggested.
Re: 23. Period doubling will result in no R2 correlation with temperature at any frequency, so I can’t see how a low frequency filter would avoid it; it would just change the observable frequencies.
It seems to me that, as a positive correlation in recent times could switch to negative in the past for some undetermined proxies, period-doubled proxies could also be a possible explanation for the hockeystick pattern, i.e. while they are lined up in the present, some go out of phase in the past, creating the stable shaft. This would seem to be another viable explanation for the hockeystick shape, whatever the previous temperature was.
Paul Gosling

Posted Sep 27, 2005 at 2:36 AM | Permalink

While it is true that trees will have a growth response to temperature that looks a bit like a sine wave, if you are considering the tree line ecotone you will be nowhere near the declining growth response to temperature. Remember the tree line is not a line, but an ecotone, which in the case of latitudinal tree ‘lines’ can be very broad. The trees are not only stunted but sparse. The first response to increased temperature is a growth increase on those trees that are there. This occurs well before recruitment starts again. If this leads to competition between trees as they get larger then there will be a growth decline. This does not indicate a change in the response to temperature.
Steve McIntyre

Posted Sep 27, 2005 at 6:28 AM | Permalink

Re #25: my point was different. It was that if you had a quasi-sine wave change in temperature AND you were doing age-adjustments in your standardization of tree rings for site chronology calculation, a portion of the decline actually attributable to cooling would be allocated to aging, biasing downwards the estimates of prior warm periods.
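A minimal sketch of that allocation effect, with every number hypothetical: a single tree germinates at a warm peak and dies at a cold trough, ring width is a linear age trend plus a linear temperature response, and the "age adjustment" is a straight-line fit against age. Real standardization curves are more elaborate; this only shows the direction of the bias.

```python
import numpy as np

def fitted_age_slope(true_age_slope=-0.01, temp_sensitivity=1.0, life_span=50):
    """Slope of a straight-line age fit to a tree that lives through a cooling phase."""
    age = np.arange(life_span)
    # Temperature falls from a warm peak at germination to a cold trough at death.
    temperature = np.cos(np.pi * age / (life_span - 1))
    # Hypothetical ring widths: baseline + true age trend + temperature response.
    ring_width = 2.0 + true_age_slope * age + temp_sensitivity * temperature
    slope, _intercept = np.polyfit(age, ring_width, 1)
    return slope

print(f"true age slope:   {-0.01:+.4f}")
print(f"fitted age slope: {fitted_age_slope():+.4f}")
```

Because temperature declines over the life of the tree, the fitted age trend comes out steeper than the true one, so part of the cooling is removed as "aging" and the warm germination period is underestimated once the fitted curve is subtracted.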
David Stockwell

Posted Sep 27, 2005 at 11:16 AM | Permalink

Re #25: inclusion of trees at or near the optimum should cause cancellation in the average; they don’t actually have to be negative. Can you be sure all proxies are sited in the treeline ecotone, and remain there for their entire lifetime? After all, the main criterion for selection of proxies seems to be positive correlation with temperature, and these could be sited anywhere to one side of the response curve. Then, a warm period of sufficient warmth, e.g. the MWP, could move positive growth responses into the optimal or even negative range, depressing the overall signal, and flattening the apparent historic temperatures.
TCO

Posted Sep 27, 2005 at 12:55 PM | Permalink

RE 27: The way to do it (not sure if it is done, but just to show that it is possible) is to look at what the reconstruction gives you in temp changes, then see if those would be of the order to move you to the other side of the ecotone. For instance, if the reconstruction shows a 1 degree change and the difference between ecotones is 5 degrees (or maybe 2.5, gotta think about that), you are in good shape. (Yeah, you still might have some added noise if year to year variations are of the order of 5+ degrees, but even then you should be able to pull out signal, just maybe need to quantify the added uncertainty.)
David Stockwell

Posted Sep 28, 2005 at 12:42 PM | Permalink

Re: #28 Yes, that would be interesting. Another well known non-linearity from plant physiology is hysteresis in the growth response. E.g. plants can delay response to temperature as it increases, but continue responding as it falls, to avoid kind of being ‘faked out’ I guess. With all these non-linearities it would be useful to do some simulation to see what the possible linear range is. While MBH98 is to be applauded for attempting to state assumptions, in view of all the well known non-linearities in plant response, I wonder how one gets away with stating that non-linearity is a “relatively unlikely event”.

“[MBH98] The indicators in our multiproxy trainee network are linearly related to one or more of the instrumental training patterns. In the relatively unlikely event that a proxy indicator represents a truly local climate phenomenon which is uncorrelated with larger scale climate variations, or represents a highly nonlinear response to climate variations, this assumption will not be satisfied.”
Steve McIntyre

Posted Sep 28, 2005 at 10:04 PM | Permalink

David, this issue came up in our Nature submission, with an amusing response from Mann. We had written (and this particular claim carried over to our GRL submission although the rest of the article was recast):

Under MBH98 methodology, 16 overweighted sites (out of 70) account for virtually all the North American PC1 variance. All were high-altitude sites, mostly with cambial dieback (“strip-bark”) formation, showing very high 20th century growth rates. An unreported MBH98 calculation [5], studying exclusion of these 16 and four other high-altitude sites, yielded a PC1 nearly identical to ours (Figure 1c, correlation = 0.95). 15 of these sites were collected by the same researcher (Donald Graybill). Graybill and Idso [6] stated that their nonlinear growth rates could not be attributed to temperature and hypothesized direct CO2 fertilization. Hughes and Funkhouser [7] called their growth rates a “mystery”. Mann et al. [8] stated that their growth rates “are more dramatic than can be explained by instrumental temperature trends.” Since MBH98 methodology requires (p. 780) that proxies follow a linear temperature response, the Graybill-Idso sites should have been disqualified. Instead they were heavily overweighted.

Mann replied:

MM04 demonstrate their failure to understand our methods by claiming that we required that “proxies follow a linear temperature response”. In fact we specified (MBH98) that indicators should be “linearly related to one or more of the instrumental training patterns”, not local temperatures.

The idea that the Stahle/SWM network PC7 could have an intimate relationship with a temperature PC11 is bizarre even for the Hockey Team, but that’s what they say.
David Stockwell

Posted Sep 29, 2005 at 11:14 AM | Permalink

Steve, the work you are doing in exposing the flaws in MBH98 methodology is invaluable in preventing the endless repetition of similar flaws in the future, and perhaps a necessary step towards understanding the potential of tree rings to be proxies at all, with any methodology, with impacts on a wider class of studies. I think the assumption of “proxies following a linear temperature response” is wrong. The actual assumption for any workable proxy method based on instrument calibration at the recent end of the record is a kind of homogeneity condition, that “proxies follow an identical linear temperature response to the calibration stage throughout the whole period and range of the proxy, and there is no other response to another confounding factor, such as precipitation, throughout the whole range”. Otherwise the proxy is varying in an unknown way to unknown factors. Commonsense would suggest this is far too strong a condition for growth of plants, but may be approximately met by quasi-physical proxies such as treelines, at least until trees reach the top of a mountain.

As an aside to the above, the Graybill-Idso sites may in fact have a linear temperature response; it is just that the CO2 or other response is stronger. So perhaps the reason the sites should be rejected is not that they fail to meet MBH98’s inadequately stated assumptions of linearity, but that there is a confounding factor.

Nevertheless, the conditions for success of tree ring proxies are much stronger than stated. It is not clear what the restated assumptions regarding linearity to one or more instrumental training patterns would entail, or if it is even a weaker condition than proxies following a linear temperature response. What I am saying is that irrespective of other flaws and problems of this particular study, the use of treegrowth as a proxy seems to be a very poorly specified model, that could only succeed under very limited conditions and would explain the general mess that the data show.
ClimateAudit

Posted Sep 29, 2005 at 2:02 PM | Permalink

I’ve seen a recent study under review in which they take their results up to 1985. They have ring widths in many sites up to 2000. The ring widths don’t go up after 1985; they decline. They report that the post-1985 verification fails. However, they get excellent verification in the late 19th century. So they exclude post-1985. Based on verification in a mid-19th/20th century split, they “confidently” make assertions about the relative level of the MWP to the late 20th century. I don’t think that they even realize what they’re doing.
David Stockwell

Posted Sep 29, 2005 at 4:45 PM | Permalink

Sounds like the horns of a dilemma: admit the downturn is due to negative temperature correlation and the MWP goes up, or admit the downturn is due to falling temperatures and global warming goes down. Both unsatisfactory outcomes for the Hockey Team, or perhaps I am reading too much into it.
Steve McIntyre

Posted Sep 29, 2005 at 9:47 PM | Permalink

Re #33. You’ve got it exactly.

I’m going to post up on upside-down quadratics. We observed this in passing in our E&E article. The point seems so obvious and so unarguable that it’s remarkable that policy continues to be driven by tree rings.

When you add in the precipitation issues, it becomes like Monty Python. Half the proxies in MBH98 which are supposedly temperature proxies are used in Cook et al [2004] as precipitation proxies.
TCO

Posted Sep 29, 2005 at 10:03 PM | Permalink

You know, this is not the first field in the world where people have worried about confounding factors (and a non-linear effect can be thought of as another factor, just adding in terms with higher orders in the polynomials and coefficients of their own). I think that is a pretty basic concept. Even one that a physicist would understand… let’s not keep acting like we discovered America with comments from the frigging peanut gallery. People like Stockwell acting as if it’s some concept Mann can not conceive of.

In any environmental or sociological study there are always worries about hidden factors, etc. (and limits to how many you can check.) The issue wrt Mann is to address their seriousness (within the grownups world of knowing this common concern). Not run around prattling 3 unknowns/2 equations like a simpleton.

I assume it’s ok to hit Dave, while letting Sid slide for 10 posts. Most sites I hang out at, I am the pet troll and allowed to be disruptive (double standard).

I’m thinking about another sabbatical or cutting back to weekly comments…
David Stockwell

Posted Oct 4, 2005 at 8:03 AM | Permalink

Re: #34. A post on upside-down quadratics would be great. The abundance of evidence that tree-ring growth does not have a ‘linear relationship with temperature’ as assumed by MBH98 should undermine any confidence in the conclusions of all methods using them, including the simple averaging advocated by Huybers.