Jones et al [1998]: Gridcell Correlations

Since I showed the effect of smoothing on the relationship of Dunde to temperature, I thought that it would be useful to post up a table showing the Jones et al [1998] proxy correlations to temperature versus my calculations using HadCRU2.

This table shows: col 1- my calculated correlation for 1881-1980 using HadCRU2 (I haven’t tested why they used 1881-1980); col 2- correlation as reported in Jones et al [1998]; col 3- original reported correlation, according to Table 4 of Jones et al [1998]; col 4- t-statistic from obtaining the correlation by a regression model (which matches the usual calculation). I’ve not attempted to deal with spurious t-statistics issues here. In a number of cases, Jones et al [1998] already reported lower correlations than the original publication, e.g. Jacoby treeline; Lenca and Rio Alerce; Galapagos corals. In some cases, my calculations using HadCRU2 are much lower again, with the differences sometimes being substantial. For example, the correlation for the Jacoby NH treeline study is reduced here to 0.07, down from the original 0.72 (Jones: 0.37). The Rio Alerce and Lenca series show similar reductions. A few series show higher correlations, e.g. Svalbard melt.
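The remark that the regression t-statistic "matches the usual calculation" can be checked directly. The sketch below uses made-up numbers (not the Jones et al proxies or HadCRU2) to compare the slope t-statistic from an OLS fit with the textbook formula t = r*sqrt(n-2)/sqrt(1-r^2). The function names are illustrative only, and this is not the script used to produce the table.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t-statistic implied by a correlation r computed from n paired values."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def ols_slope_t(x, y):
    """t-statistic of the slope when y is regressed on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err

# Hypothetical toy data, chosen only to exercise the functions.
proxy = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9, 1.2, 1.0]
temp = [0.0, 0.3, 0.5, 0.6, 0.4, 1.0, 0.9, 1.1]

r = pearson_r(proxy, temp)
print(f"t from r: {t_from_r(r, len(proxy)):.4f}")
print(f"t from OLS slope: {ols_slope_t(proxy, temp):.4f}")
```

The two printed values agree because the identity is exact for a one-predictor regression. With r = 0.73 and n = 100 annual values, `t_from_r` gives roughly 10.6, in line with the Fenno row; small differences are expected because the tabulated correlations are rounded.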

Jones says (p. 461) :

"The most surprising correlation in Table 4 is for ENG. The value is relatively low because the gridbox series (50-55N, 0-5W) incorporates up to 10 station records and some SST data while ENG is based on only 3 inland stations."

It is surprising that the reconstruction of gridcell temperature from Tornetrask tree rings (this includes the "adjustment" discussed before) is supposedly more accurate than the Central England temperature version is to the HadCRU2 gridcell. This seems unlikely and suggests that some portion of this high correlation may be spurious. A t-statistic seems to me to be a more sensible approach, but there are no surprises in the t-statistics here. Relative to the usual 2.96 benchmark, 5 of 7 SH series are insignificant and 3 of 10 NH series.

	Calculated	Jones 98	Original Report (Jones 98)	tOLS
Fenno	0.73	0.79	0.74	10.59
Urals	0.61	0.83	0.82	6.01
Jasper	0.56	0.48	0.42	6.58
Svalbard	0.30	0.08	NA	2.76
C England	0.62	0.84	NA	7.80
C Europe	0.94	0.90	NA	27.19
Kameda.melt	0.10	0.17	NA	0.98
Jacoby.treeline	0.07	0.34	0.73	0.52
Briffa.WUSA	0.38	0.60	NA	4.12
Crete, Greenland.O18	0.41	0.30	NA	4.13
Tasmania.92	0.35	0.42	0.57	3.64
Lenca	0.14	0.36	0.61	1.31
Alerce	0.05	0.35	0.61	0.43
Law.Dome	0.08	0.26	NA	0.38
Great Barrier Reef (5)	0.19	0.18	0.31	1.86
Galapagos	0.13	0.39	0.66	1.31
New.Caledonia	0.41	0.41	NA	4.38

Jones et al [1998] Table 4 shows "decadal" correlations which are supposedly measuring "low frequency" effects. I’ve spent quite a bit of time recently pondering statistical issues of handling "low frequency" and they are by no means easy, if you’re trying to do it in an advanced way. In this table, col 1- my calculation of the correlation: I made decadal averages for both proxies and temperature (not by smoothing); col 2- reported in Jones et al [1998]; col 3- OLS t-statistic.
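As a rough illustration of "decadal averages (not by smoothing)", the sketch below averages each series into non-overlapping 10-year blocks and correlates the block means. The block boundaries (1881-1890, 1891-1900, ...), the handling of missing years, and the synthetic input series are all assumptions made for the example; the post does not specify them.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def decadal_means(years, values, start=1881, end=1980):
    """Mean of each non-overlapping 10-year block, keyed by the block's first year.

    Missing values (None) are skipped, so a partly filled block is averaged
    over whatever years are present (an assumption of this sketch).
    """
    totals = {}
    for year, value in zip(years, values):
        if value is None or not (start <= year <= end):
            continue
        first = start + 10 * ((year - start) // 10)
        running_sum, count = totals.get(first, (0.0, 0))
        totals[first] = (running_sum + value, count + 1)
    return {first: s / c for first, (s, c) in totals.items()}

def decadal_correlation(years, proxy, temp):
    """Correlation of block means over the decades both series cover."""
    p = decadal_means(years, proxy)
    t = decadal_means(years, temp)
    shared = sorted(set(p) & set(t))
    r = pearson_r([p[d] for d in shared], [t[d] for d in shared])
    return r, len(shared)

# Synthetic stand-ins for a proxy and a gridcell temperature series.
years = list(range(1881, 1981))
temp = [0.005 * (y - 1881) + 0.2 * math.sin(y) for y in years]
proxy = [t + 0.3 * math.cos(3 * y) for t, y in zip(temp, years)]

r_dec, n_dec = decadal_correlation(years, proxy, temp)
print(f"decadal r = {r_dec:.2f} from {n_dec} block means")
```

Returning the number of shared decades alongside the correlation makes it obvious when a statistic rests on very few values.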

The Law Dome statistic here is meaningless as it is based on 3 decades (i.e. the correlation results from 3 values), which are hardly enough to ground a reportable correlation statistic. There are a couple of significant decreases: the Polar Urals decadal correlation declines from 0.92 to 0.23 (whereas Tornetrask is unchanged). I’m not sure why. It might be due to changes in the HadCRU2 version. The decadal correlation in the Svalbard melt series improves. I’ve noticed major changes in Greenland temperature series in HadCRU editions and maybe there was a change in Svalbard here. Jones et al [1998] mentions as a caveat that inaccurate temperature series may contribute to low correlations.

The t-statistics here show the impact of the reduced number of values used in the correlation calculations. Despite the seemingly high decadal correlations, only 4 series have significant decadal OLS t-statistics (let alone t-statistics allowing for spurious significance issues). The four are: the Central England series – hardly a "proxy" for temperature; the Central Europe historical series; Svalbard melt % and the Tornetrask reconstruction. The Svalbard melt series is hugely non-normal. I’ve been meaning to check to see the effect of normalizing this series. The Tornetrask series was "adjusted". This may have an effect. Again, it is disquieting that this reconstruction is "more accurate" than the CEng series.
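The effect of the reduced number of values can be made concrete with a few lines of arithmetic. The sketch below assumes the standard relation between a correlation and its OLS t-statistic, ten decadal values for a series covering all of 1881-1980, and three for Law Dome as stated above; the 2.96 benchmark is simply the figure quoted earlier in the post. The helper names are made up for the example.

```python
import math

def t_from_r(r, n):
    """t-statistic implied by a correlation r computed from n paired values."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def r_needed(t_benchmark, n):
    """Smallest |r| whose t-statistic reaches t_benchmark with n values.

    Obtained by solving t = r*sqrt(n-2)/sqrt(1-r^2) for r.
    """
    return t_benchmark / math.sqrt(t_benchmark ** 2 + n - 2)

examples = [
    ("Fenno, decadal", 0.82, 10),    # n assumed: ten full decades
    ("Law Dome, decadal", 0.57, 3),  # n from the text: three decades
    ("Fenno, annual", 0.73, 100),
]
for label, r, n in examples:
    print(f"{label}: r = {r:.2f}, n = {n}, t = {t_from_r(r, n):.2f}")

for n in (10, 100):
    print(f"n = {n}: |r| must exceed {r_needed(2.96, n):.2f} to reach t = 2.96")
```

Under these assumptions a decadal correlation has to exceed roughly 0.72 to clear the benchmark, against roughly 0.29 for 100 annual values. Series with fewer than ten available decades face an even higher bar, which is why a correlation that "looks" high can still fail.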

	Calculated	Jones 1998	tOLS
Fenno	0.82	0.80	4.05
Urals	0.23	0.92	0.63
Jasper	0.30	0.45	0.89
Svalbard	0.79	0.38	3.67
C England	0.84	0.80	4.31
C Europe	0.90	0.83	5.68
Kameda.melt	-0.40	-0.28	-1.24
Jacoby.treeline	0.76	0.87	2.36
Briffa.WUSA	0.65	0.79	2.40
Crete.O18	0.27	0.49	0.80
Tasmania.92	0.65	0.58	2.40
Lenca	0.31	0.55	0.93
Alerce	0.12	0.16	0.34
Law.Dome	0.57	0.98	0.70
GBR.5	0.55	0.52	1.87
Galapagos	0.14	0.16	0.41
New.Caledonia	0.60	0.48	2.11

This entry was written by Stephen McIntyre, posted on Sep 19, 2005 at 11:50 AM, filed under Jones et al 1998, Multiproxy Studies and tagged jones 1998.

36 Comments

TCO

Posted Sep 19, 2005 at 8:16 PM | Permalink

1. does smoothing always (usually) help the r factor?

2. When is it appropriate, not appropriate (if I’m looking at it from a management

perspective).

3. I don’t understand the difference between col2 and 3. what is jones 98 and jones

original 98? OK. I think I get it: original study report. But WTF. How come 3 people

get different results so often just from basic crunching??

4. “col 4- t-statistic from obtaining the correlation by a regression model (which matches

the usual calculation). I’ve not attempted to deal with spurious t-statistics issues here.”
a. who’s t-stat (yours or Jones and if Jones, did you check them?)
b. “from obtaining the correlation by a regression model (which matches the usual

calculation)” HUH? What should I take away?
c. “…spurious…” You think the number crunching of the t-stat is wrong? Or that their

is something else more subtle that is wrong? Is the t stat is not the appropriate metric

for evaluation?

5. Are we doing correlation to gridcell or to “climate field”?

6. “Jones says (p. 461) :

“The most surprising correlation in Table 4 is for ENG. The value is relatively low because

the gridbox series (50-55N, 0-5W) incorporates up to 10 station records and some SST data

while ENG is based on only 3 inland stations.”

It is surprising that the reconstruction of gridcell temperature from Tornetrask tree rings

(this includes the “adjustment” discussed before) is supposedly more accurate than the

Central England temperature version is to the HadCRU2 gridcell. This seems unlikely and

suggests that some portion of this high correlation may be spurious. A t-statistic seems to

me to be a more sensible approach, but there are no surprises in the t-statistics here.

Relative to the usual 2.96 benchmark, 5 of 7 SH series are insignificant and 3 of 10 NH

series.”

a. Is ENG= central england?
b. Tornetrask? It’s not even in the list.
c. Did you completely change topics in the middle of a paragraph?
d. I have a hard time following you…

7. YEAH…I don’t know what to think about the “low frequency” stuff. Is it right thing to

do to get measurements of interest? A copout to make trends look better? People acting

“cool” with long words?

8. “col 3- OLS t-statistic”: yours or Jones?

9. “The Law Dome statistic here is meaningless as it based on 3 decades ( i.e.the

correlation results from 3 values) which are hardly enough to ground a reportable

correlation statistic. There are a couple of significant decreases: the Polar Urals decadal

correlation declines from 0.92 to 0.23 (whereas Tornetrask is unchanged). I’m not sure why.

It might be due to changes in the HadCRU2 version. The decadal correlation in the Svalbard

melt series improves. I’ve noticed major changes in Greenland temperature series in HadCRu

editions and maybe htere was a change in Svalbard here. Jones et al [1998] mentions as a

caveat that inaccurate temperature series may contribute to low correlations. ”

a. seems like another paragraph with mixed issues.
b. declines mean from “smoothed Jones” to “your smoothed”. Ok. Checked on that. my first

assumption would be smoothed versus unsmoothed from the text.
c. “changes in HadCRU2 version”: I guess this would explain the differences between all

the versions?
d. “Jones et al [1998] mentions as a caveat that inaccurate temperature series may

contribute to low correlations.” He is hypothesizing fault in CRU?

10. This website (http://www.cru.uea.ac.uk/cru/data/temperature/#faq) talks about CRU

editions of data changing. But I’m surprised that there would be this much

impact…especially since FAQ says changes are mostly in recent years data. Is also a note

about the poor quality of data in 1850s.

11. “let alone t-statistics allowing for spurious significance issues” watchu talkin’ bout

Willis?

12. “the Central England series – hardly a “proxy” for temperature”: what is it? a set of

instrumental records? what is it doing in here?

13. “The Svalbard melt series is hugely non-normal.” Implication? (for civilians?)

14. “The Tornetrask series was “adjusted”. This may have an effect.” (What’s going on?

What’s your concern?)

15. Again, it is disquieting that this reconstruction is “more accurate” than the CEng

series: Which one? Tornetrask? where is that on the chart?
TCO

Posted Sep 19, 2005 at 8:17 PM | Permalink

So much for notepad…
TCO

Posted Sep 19, 2005 at 8:17 PM | Permalink

Try turning word wrap off next time, I guess.2
TCO

Posted Sep 19, 2005 at 8:18 PM | Permalink

My cat typed that last “2”
Paul Penrose

Posted Sep 19, 2005 at 9:26 PM | Permalink

TCO,
I’ve been concerned with using tree-ring widths as temperature proxies since I first came across the subject a few years ago. The signal-to-noise ratio is so poor that you have to jump through a lot of statistical hoops to get anything out of them. The problem is, how do you know that what you are seeing when you are done is indeed real, what with the data so heavily massaged? My takeaway from Steve’s work is: doubt. We still don’t know if we have recovered the real signal there, but it won’t be done by using private datasets and secret data processing procedures. Before we bet the future on this stuff it has got to be much more solid, reliable, and repeatable.
TCO

Posted Sep 19, 2005 at 9:49 PM | Permalink

Oh sure, agreed. I’m just trying to follow the thought thread in the post…
Steve McIntyre

Posted Sep 19, 2005 at 10:29 PM | Permalink

I’ll pick up a few now:
1. the t-statistic is related to the r-statistic but allows for the number of measurements. Smoothing reduces the number of “effective” readings, so the significance of any reading is much reduced. For example, if you have only two readings, you will get 100% correlation half the time, but it doesn’t mean anything.
2. for calculating statistical significance – I’m not sure. I’m trying to understand “low frequency” issues.
3. It’s hard to say why the answers are so different. I often have trouble replicating Hockey Team results as you know. Whether it’s different series versions, different temperature versions, I don’t know. There are some weird differences between temperature editions, which I’ve not explored, but would be fertile ground for someone to do. The Hockey Team does not annotate weird changes.
4. my calculation of the t-statistic. The t-statistic from OLS tables assumes independence, which is not the case in autocorrelated series. So this is a lower limit t-statistic.
5. gridcell. That’s what Jones et al do; it’s different than Mann who correlate to “climate fields”, which are weighted averages of gridcell temperatures.
6. Tornetrask=Fenno(scandia); ENG=C England. It’s probably better as 2 paragraphs.
7. The assumption that you can have a “low frequency” relationship between a proxy and temperature without having a “high frequency” one raises lots of problems about how you’d go about establishing one. I haven’t seen any statistical discussions by the Hockey Team on this, and it seems to me to raise a lot of problems.
8. my calculation. Jones doesn’t use t-statistics. Anyone looking at the t-statistics would know that they are insignificant but the correlations “look” significant. Usual Hockey Team stuff.
9. He’d say this is for the earlier HadCRU. I’m a little worried about someone working both sides of these statistical relationships – there’s a temptation to tailor results a little, especially when there’s no backup on the HadCRU temperature results.
10. I’ve noticed some very odd changes. For example, 4 gridcells used in MBH, which supposedly had over 50% available observations in HadCRU v1, had 0 observations in HadCRU v2. What’s going on?
11. see above.
12. I can see why you want to use long temperature series in reconstructing past temperatures, but don’t call them “proxies” and use performance statistics based on partial use of actual temperature data to sell results for steps with only proxy data. Let’s see the proxy-only step. It’s a Hockey Team thing.
13. At best it’s a nonlinear transformation of temperature and will cause bias when averaged.
14. See my post in April or May on Tornetrask adjustments. Their reconstruction went down in the 20th century; so they tilted it up.
15. Tornetrask=Fenno.
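
Points 1 and 4 in the reply above can be illustrated with a small simulation. The sketch generates pairs of series that are unrelated by construction and counts how often the naive OLS t-statistic clears a benchmark, first for independent noise and then for autocorrelated AR(1) series. The AR(1) model, the coefficient 0.9, the series length, the trial count and the seeds are all assumptions picked for the example; the 2.96 benchmark is the figure quoted in the post. It also shows that two readings always give a correlation of plus or minus one.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ar1_series(n, phi, rng):
    """AR(1) series x[t] = phi * x[t-1] + noise, started from its stationary distribution."""
    x = [rng.gauss(0.0, 1.0 / math.sqrt(1.0 - phi * phi))]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

def false_positive_rate(phi, n=100, trials=500, t_crit=2.96, seed=0):
    """Share of unrelated series pairs whose naive t-statistic exceeds t_crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        r = pearson_r(ar1_series(n, phi, rng), ar1_series(n, phi, rng))
        t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
        if abs(t) > t_crit:
            hits += 1
    return hits / trials

white = false_positive_rate(phi=0.0)  # independent readings
red = false_positive_rate(phi=0.9)    # strongly autocorrelated readings
print(f"independent noise: {white:.3f}  AR(1), phi = 0.9: {red:.3f}")

# With only two readings the correlation is always +1 or -1.
rng = random.Random(1)
r_two = pearson_r([rng.random(), rng.random()], [rng.random(), rng.random()])
print(f"two-point correlation: {r_two:+.1f}")
```

In runs like this the autocorrelated pairs exceed the benchmark far more often than the independent ones, even though no real relationship exists in either case.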
TCO

Posted Sep 19, 2005 at 11:22 PM | Permalink

Thanks, man(n). 😉
TCO

Posted Sep 19, 2005 at 11:26 PM | Permalink

does Jones compile CRU? Is there an alternate compilation? I guess you could rerun against the alternate. maybe even different versions of the alternate. That would show if large changes are to be expected from revisions done by different compilers.

Sorry, I keep suggesting more experiments for you…
John A

Posted Sep 20, 2005 at 12:37 AM | Permalink

Doesn’t everyone know by now that direct measurement of temperature using thermometers is much less accurate than a few well chosen stands of trees, a laptop, and a PhD in saving the world?
Louis Hissink

Posted Sep 20, 2005 at 4:25 AM | Permalink

Steve,

Is Jones still keeping his data exempt from scrutiny?

I would have thought that temperature data collected by a taxpayer funded organisation should be available to all.

As an interesting aside I am off bush next week personally doing an electromagnetic survey, single-user, backpack machine. Its efficiency is temperature dependent and as long as ambient air temp is low (say up to 30 deg Celsius), instrumental drift is linear.

Drift is of course easily recorded, repeated measurements during the day of the same station in the survey.

If the drift is linear, we can correct the data, and essentially a no-brainer.

However if the drift is not linear, then we reject the data totally and repeat the survey next day, which at the end of the day is again examined for drift linearity.

Electromagnetic surveys of the earth’s surface used by the mining industry are restricted by “atmospheric effects” (“spherics”), or electrical noise in the earth’s atmosphere. If the “spherics” are high, no surveying is done. It is possible, but the data is junk.

This atmospheric noise is essentially electrical disturbances in the earth’s atmosphere, or electric field. This means the existence of electric currents passing through a resistive medium, (the air), and for those of us who understand how light-bulbs work, an indication of an input of energy. Heat is produced.

Temperature is an indication of energy state of a substance.

And I leave it here for contemplation.
Steve McIntyre

Posted Sep 20, 2005 at 7:10 AM | Permalink

Jones gridcell temperature data is still private. Warwick Hughes has tried hard to get it without success. Update: What I was meaning to say (and is clear from prior comments on this site) was that the station data underlying the gridcell data is kept under lock and key; the gridcell results are publicly available.
TCO

Posted Sep 20, 2005 at 7:33 AM | Permalink

Is there an alternate?
JerryB

Posted Sep 20, 2005 at 7:56 AM | Permalink

“Is there an alternate?”

Yes.
JerryB

Posted Sep 20, 2005 at 6:26 PM | Permalink

Mike,

I have the impression that Steve is spending less time online today than usual, so let me mention that Steve has discussed Durbin-Watson in other threads, for example in http://www.climateaudit.org/?p=317
Dave Dardinger

Posted Sep 20, 2005 at 6:42 PM | Permalink

Speaking of not spending time at one’s blog, what has happened to RealClimate? I hadn’t been there in quite some time and figure there’d be tons of new threads. But there was exactly 1 new thread in over a month. Have the principals lost interest?
TCO

Posted Sep 20, 2005 at 7:37 PM | Permalink

Merger talks…
JerryB

Posted Sep 20, 2005 at 7:47 PM | Permalink

What happened to Mike Hollinshead’s comments? They disappeared from this thread, as well as from at least one other thread.

John A: call your office!
JerryB

Posted Sep 20, 2005 at 7:59 PM | Permalink

Mike,

FYI, this site uses a software package called “Spam Karma” to block spam, but sometimes it misinterprets good stuff as spam.

I would expect that Steve, or John A, will reinstate your comments fairly quickly.

Don’t take the (presumably temporary) deletions personally.
Dave Dardinger

Posted Sep 20, 2005 at 8:17 PM | Permalink

Ah, pink potted meat strikes again!
David Stockwell

Posted Sep 20, 2005 at 8:26 PM | Permalink

Why low significance? Consider a theoretical model of trees of a single species uniformly and randomly distributed in an area with a range of altitude. Assume the species has an optimal growth temperature outside of which growth rate declines. Now increase the temperature. The increase in growth of trees on the cold side of the optimum will be compensated by the decrease in growth of trees on the warm side. So, if you randomly sample the temperature history of the area, by coring random trees, the average of any measure of deviation should be zero. Multiply this by multiple species and you have a real world situation. Since there should be no reason, on a theoretical analysis, to expect a temperature signal from averages of tree rings, it follows that the only reason a temperature signal has been detected results from biased sampling. Note this should not necessarily apply to other proxies such as glacier length, or to studies constrained to local areas.
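
A toy version of this model can be written down in a few lines. The sketch assumes a parabolic growth response, sites spread evenly and symmetrically around the optimum, and an arbitrary optimum, width and warming step; none of these numbers come from the comment. Under those assumptions the first-order responses cancel in the all-site average, leaving only a small second-order decline, while a sample restricted to the cold side shows a clear positive response.

```python
def growth(temp, optimum=10.0, width=5.0):
    """Hypothetical parabolic growth response that peaks at `optimum`."""
    return 1.0 - ((temp - optimum) / width) ** 2

def mean_growth_change(site_temps, warming):
    """Average change in growth across sites when every site warms by `warming`."""
    before = sum(growth(t) for t in site_temps) / len(site_temps)
    after = sum(growth(t + warming) for t in site_temps) / len(site_temps)
    return after - before

# Sites from 6.0 to 14.0 deg C, evenly spaced around the 10.0 deg C optimum.
sites = [10.0 + (i - 50) * 0.08 for i in range(101)]
cold_sites = [t for t in sites if t < 10.0]

all_change = mean_growth_change(sites, warming=0.5)
cold_change = mean_growth_change(cold_sites, warming=0.5)
print(f"all sites: {all_change:+.4f}")
print(f"cold-side sites only: {cold_change:+.4f}")
```

The contrast between the two numbers is the point of the argument: in this toy setting a signal appears only once the sample is restricted to one side of the optimum.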
TCO

Posted Sep 20, 2005 at 9:16 PM | Permalink

Assumes perfect and instantaneous response to the forcing function. That forests have legs. Not reasonable.
David Stockwell

Posted Sep 20, 2005 at 9:24 PM | Permalink

No, lags are irrelevant. No Ents required.
TCO

Posted Sep 20, 2005 at 9:29 PM | Permalink

Wrong. try doing a forcing function of high frequency on a system whose resonance is at low frequency.

that sounded pretty mathematical (gotta hold me ground now and hope that I’m not “Sidding”)
David Stockwell

Posted Sep 20, 2005 at 9:35 PM | Permalink

OK, making me think. I don’t see how the frequency matters as it is relative and there is no scale. We also assume that the age of the trees is considerably longer than the duration of the forcing, if that’s what you mean.
Louis Hissink

Posted Sep 21, 2005 at 6:45 AM | Permalink

Jerryb,

there is an alternate? If so, where and how?

Steve,

thanks – Warwick is persistent but this rejection of data requests means that Jones is afraid of the data becoming public. No need to write here what I think about Jones’ action on that account.
TCO

Posted Sep 21, 2005 at 7:22 AM | Permalink

Just reread your post. I guess in your model world, with your model methods that would be the result. However, there is no reason for a real researcher to be so constrained (e.g. to “random coring” or (implicitly) to not including locational issues (e.g. elevation) as a variable within a multiple regression). And there is no reason why your trees have to be situated in such a manner or have such a response.
JerryB

Posted Sep 21, 2005 at 7:34 AM | Permalink

Louis,

First, let me clarify the question: “is there an alternate?”.

When Steve wrote: “Jones gridcell temperature data is still private.”, his wording was open to multiple interpretations. When he added: “Warwick Hughes has tried hard to get it without success.”, he narrowed the possible interpretations to (land based) surface station temperature data. (Other kinds of “Jones gridcell temperature data” are publicly available.)

One collection of land based surface station temperature data that is publicly available is called GHCN (Global Historical Climatology Network). A brief overview of GHCN is available at http://www.ncdc.noaa.gov/oa/climate/research/ghcn/ghcnoverview.html

Caveat: any such collection includes problematic data.
David Stockwell

Posted Sep 21, 2005 at 7:40 AM | Permalink

That is my point. The theoretical model predicts that in the real world, extracting a temperature signal from trees in environments of variable optimality REQUIRES selection, or call it ‘cherry picking’. There is no a priori reason to expect dendroclimatology to produce a reliable climate signal from a random sample of trees. This is not the case however with other proxies such as glacier length or isotope ratios where one expects, a priori, the response to be linear with temperature, not parabolic as is the case with trees.
Steve McIntyre

Posted Sep 21, 2005 at 7:45 AM | Permalink

It’s the station data that’s not available – the gridcell information is available.
TCO

Posted Sep 21, 2005 at 7:48 AM | Permalink

Did you follow what I meant with the multiple regression comment? The implication is that with extra equations, you can solve for extra unknowns. It’s an algebra concept and is reasonable (not cherrypicking). Basically, in “Dave’s model world”, I would just record elevation and note that the areas on the “too hot zone” were getting worse and on the “too cold zone” were getting better during any warming period.

Steve can back me up here. I’m not doing anything snaky here. The algebra issue is that in your model world, we have a quadratic response to temp, so that I need to solve a regression for both t and tsq. Having the extra elevation variable allows me to do that.

Slight pedant point: if RW, MXD are both collected (and differ in their reaction to temp) that may also allow me to deconvolute your conundrum. But elevation recording and input into the model is the obvious method.
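
A simplified version of the elevation idea can be sketched as follows. It is not the multiple regression on t and tsq described above, only a one-variable stand-in: in a noise-free toy world with a parabolic response, the change in growth under warming is linear in elevation, so the slope of growth change against elevation recovers the warming. The lapse rate, base temperature, optimum, response width and site layout are all assumed values, and the recovery step relies on knowing the response width and lapse rate in advance.

```python
OPTIMUM = 10.0     # deg C, assumed optimal growth temperature
WIDTH = 5.0        # deg C, assumed width of the parabolic response
LAPSE = 0.0065     # deg C per metre, assumed lapse rate
BASE_TEMP = 16.0   # deg C at zero elevation, assumed

def growth(temp):
    """Hypothetical parabolic growth response that peaks at OPTIMUM."""
    return 1.0 - ((temp - OPTIMUM) / WIDTH) ** 2

def site_temp(elevation, regional_shift=0.0):
    """Local temperature from elevation plus any regional warming."""
    return BASE_TEMP + regional_shift - LAPSE * elevation

def ols_slope(x, y):
    """Slope of an ordinary least squares fit of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

elevations = [i * 20.0 for i in range(101)]  # 0 to 2000 m
true_warming = 0.5

before = [growth(site_temp(e)) for e in elevations]
after = [growth(site_temp(e, true_warming)) for e in elevations]
change = [a - b for a, b in zip(after, before)]

# For a parabola, d(change)/d(elevation) = 2 * warming * LAPSE / WIDTH**2.
slope = ols_slope(elevations, change)
estimated_warming = slope * WIDTH ** 2 / (2.0 * LAPSE)

print(f"lowest (too hot) site change: {change[0]:+.3f}")
print(f"highest (too cold) site change: {change[-1]:+.3f}")
print(f"estimated warming: {estimated_warming:.3f} deg C")
```

Real data would add noise, other growth factors and uncertainty in the response curve, so this shows only that the extra variable makes the problem solvable in principle.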
TCO

Posted Sep 21, 2005 at 7:52 AM | Permalink

Why is the station data not available (short the names of people). Is it still a privacy concern (if exact locations are given maybe)?

What is the source for the gridcell temps in that other record (the stations)? Who did the construction? Still seems like a worthwhile test to look at how that series tends to move with version and interact with proxies, versus the Jones CRUs (if you want to snoop for possible finagling by Jones).
David Stockwell

Posted Sep 21, 2005 at 8:00 AM | Permalink

#31 TCO, yes you could try that, and attempt to use all trees, but the following problems come to mind. First you have to know the optimal temperature and response curve for each species, and the position of each tree relative to that. Also, temperature is only one factor defining optimal habitat. The more parameters you need in your model, the more potential errors you incorporate, and more data required. Not saying you couldn’t with a great deal of selection or additional information calibrate a more complex model, just that one wouldn’t expect a temperature signal from a simple model using a parabolic proxy.
TCO

Posted Sep 21, 2005 at 9:00 AM | Permalink

Agreed.
JerryB

Posted Sep 21, 2005 at 9:01 AM | Permalink

Steve, or John A,

In case you do not get to review yesterday’s comments, let me mention that some comments by Mike Hollinshead were deleted from this thread yesterday between comments 14 and 18.

It seems that he exceeded Spam Karma’s new visitor first day threshold.
David Stockwell

Posted Sep 21, 2005 at 9:15 AM | Permalink

#34 And relevant to today’s hot thread on treelines, while averages of tree cores have expectation of zero temperature signal, the methodology is probably not ‘foxable’, a treeline proxy would not suffer from the same cancellation problem, as we would expect treeline response to be roughly linear with temperature.