The PC1 in Mann and Jones [2003], Jones and Mann [2004]

You’d think that there would be little left to figure out about Mann’s PC methods. I’ve been re-examining the PC1 in Mann and Jones [2003] and Jones and Mann [2004] for reasons that I’ll explain further in my next post. The data is at WDCP here but I wasn’t able to replicate this result and had basically given up temporarily. Since I don’t like loose ends, it’s irritated me. Out of the blue, I figured out what he did. The solution was splicing of Mannian proportions.

Let’s review the clues. We know that MBH98 and MBH99 use a screwy PC methodology. Beyond that, Mann and Jones [2003] say only the following:

western North American tree-ring temperature reconstruction, (warm season temperature [Mann et al., 1999]; we employ an extension of the first principal component of the western North American tree-ring data based on 6 ultra-long lived, temperature-sensitive Western North American tree ring records available back to AD 200 - the resulting series is virtually indistinguishable from the corresponding Principal Component (PC) series used by Mann et al., [1999] based on 27 available chronologies during the AD 1000–1980 overlap interval).

I was able to replicate the selection of 6 sites pretty easily. I’ve collated details such as start date and type for the ITRDB North American tree ring database, which I use all the time (this is inherent in the WDCP functions but unfortunately not available as a data frame). Using these start dates and requiring a start prior to 201 for series with the ITRDB ids from MBH98, I located exactly 6 sites. Five of the 6 were (surprise, surprise) bristlecones; the 6th was a New Mexico site of Grissino-Mayer, used in precipitation reconstructions. One of the sites is our old favorite, Sheep Mountain.

id	location	type	lat	long	cell
ca534	SHEEP MOUNTAIN	PILO	37.22	-118.13	733
ca535	METHUSELAH WALK	PILO	37.26	-118.1	733
nm572	EL MALPAIS	PSME	34.58	-108.06	807
nv515	INDIAN GARDEN	PILO	39.05	-115.26	733
nv516	HILL 10842	PILO	38.56	-114.14	734
ut509	MAMMOTH CREEK	PILO	37.39	-112.4	734
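The selection rule described above (ids used in MBH98, start date prior to 201) can be sketched in a few lines. This is only an illustration in Python, not the collation actually used: the `Site` record, the function name and the demo ids and start years are all hypothetical placeholders, not real ITRDB values.

```python
from typing import NamedTuple


class Site(NamedTuple):
    site_id: str
    start_year: int  # first year of the chronology (AD; negative means BC)


def select_long_sites(sites, mbh98_ids, cutoff=201):
    """Return sorted ids of sites that are in the MBH98 list and start before `cutoff`."""
    wanted = set(mbh98_ids)
    return sorted(
        s.site_id for s in sites
        if s.site_id in wanted and s.start_year < cutoff
    )


# Hypothetical records for illustration only; these are NOT real ITRDB start dates.
demo_sites = [
    Site("aa001", -50),
    Site("aa002", 150),
    Site("aa003", 201),  # starts in 201, so it fails "prior to 201"
    Site("bb004", 100),  # early enough, but not in the MBH98 id list
]
demo_mbh98_ids = ["aa001", "aa002", "aa003"]

print(select_long_sites(demo_sites, demo_mbh98_ids))
```

With the real collation, the same two conditions would be applied to the full ITRDB listing rather than to four made-up rows.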

When I did a mannomatic PC calculation on these 6 sites, the eigenvector weight of Sheep Mountain (coefficient^2) was over 80% – so it wears the pants in the PC1. There is some additional information on the PC1 in the caption to Figure 4 of Jones and Mann [2004], which says:

Local and regional proxy temperature reconstructions by continent. Source references for all series are given in Table 1. Each series has been normalized over the period 1751–1950 and then smoothed with a 50-year Gaussian filter. For the decadally resolved data the normalization period is the 20 decades from 1750 to 1949, smoothed using a 5-decade filter.
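The normalization and smoothing described in that caption can be sketched roughly as follows. The caption does not say how a "50-year" Gaussian filter maps onto a kernel width or how the series ends are handled, so the choice of `sigma` and the edge treatment (re-weighting over whatever points are available) are assumptions here, and the input series is synthetic.

```python
import math


def normalize(years, values, ref_start=1751, ref_end=1950):
    """Standardize a series using the mean and sd of the reference period only."""
    ref = [v for y, v in zip(years, values) if ref_start <= y <= ref_end]
    mean = sum(ref) / len(ref)
    sd = math.sqrt(sum((v - mean) ** 2 for v in ref) / (len(ref) - 1))
    return [(v - mean) / sd for v in values]


def gaussian_smooth(values, sigma):
    """Gaussian-weighted moving average, truncated at 3 sigma.

    Near the ends the weights are renormalized over the points that exist
    (an assumption; the caption does not describe the end treatment).
    """
    half = int(math.ceil(3 * sigma))
    offsets = range(-half, half + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(values)
    out = []
    for i in range(n):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < n:
                num += w * values[j]
                den += w
        out.append(num / den)
    return out


# Synthetic annual series, for illustration only.
years = list(range(1700, 1981))
values = [math.sin(y / 10.0) + 0.01 * (y - 1700) for y in years]

z = normalize(years, values)
sigma = 50 / 6.0  # assumption: treat "50-year" as roughly a +/-3 sigma span
smoothed = gaussian_smooth(z, sigma)
```

A different reading of "50-year" (for example, as the half-power period of the filter) would give a different `sigma`, so plots made this way should not be expected to match the published figure exactly.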

Table 1 merely cites MBH98 as authority (not Mann and Jones, 2003), but claims an annual correlation of 0.20 to gridcell temperature and 0.56 to decadally smoothed temperature. It states that the corresponding instrumental data is either for the overlying 5 degree by 5 degree grid box (for single-site proxies) or averages of several boxes (for regional or multiproxy series). There is a comment in passing about CO2 fertilization in Jones and Mann, 2004:

During the most recent decades, there is evidence that the response of tree ring indicators to climate has changed, particularly at higher latitudes and more so for density than ring width measurements [Briffa et al., 1998a]. One suggested source for this behavior is “CO2 fertilization,” the potential enhancement of tree growth at higher ambient CO2 concentrations. Though it is extremely difficult to establish this existence of this effect [Wigley et al., 1988], there is evidence that it may increase annual ring widths in high-elevation drought-stressed trees [Graybill and Idso, 1993]. Recent work making use of climate reconstructions from such trees has typically sought to remove such influences prior to use in climate reconstruction [Mann et al., 1999; Mann and Jones, 2003].

Now as noted above, Mann and Jones 2003 does not refer to any "adjustment", but MBH99 has a ridiculous adjustment which imputes CO2 fertilization in the 19th century and negative fertilization in the 20th century. There are "fixed" PC1s and the calculation is inelegant but can be decoded from information at the UVA site. The earliest directory containing "fixed" data is AD1000.

I noticed that the early portion of the plot for the emulated PC1 (using the mannomatic method) looked a lot like the archived version. In fact, it proved to have a correlation of >0.9999 for selected early intervals. With some experimentation, I found that this correlation held for the whole period from 200 to 1699 – so the archived version was definitely a re-scaled version of the mannomatic PC1 up to 1700 (but the relationship broke down afterwards). The "fixing" of the AD1000 PC1 takes place after 1700 – so on a hunch I tested the correlation between the AD1000 "fixed" PC1 and the archived AD200 PC1 – bingo, there was a correlation of >0.9999. So the latter portion was a re-scaled version of the AD1000 "fixed" PC1, which was spliced with the AD200 mannomatic PC1.
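The test described here boils down to computing correlations over separate intervals: a re-scaled copy of a series correlates at essentially 1.0 with the original, so a splice shows up as one candidate matching before the join and a different candidate matching after it. Here is a minimal sketch of that check in Python. The three series are synthetic stand-ins (they are not the archived or emulated PC1s), and the scale factors and helper names are arbitrary assumptions made for the demonstration.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corr_in_window(years, x, y, start, end):
    """Correlation of x and y restricted to start <= year <= end."""
    pairs = [(a, b) for yr, a, b in zip(years, x, y) if start <= yr <= end]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Synthetic stand-ins, defined over the whole span for simplicity.
years = list(range(200, 1981))
emulated = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in years]
fixed = [math.cos(0.17 * t) + 0.3 * math.sin(0.9 * t) for t in years]

# Build a spliced series: re-scaled `emulated` up to 1699, re-scaled `fixed` from 1700.
archived = [
    2.0 * e + 1.0 if t < 1700 else 0.5 * f + 3.0
    for t, e, f in zip(years, emulated, fixed)
]

for name, cand in (("emulated", emulated), ("fixed", fixed)):
    early = corr_in_window(years, archived, cand, 200, 1699)
    late = corr_in_window(years, archived, cand, 1700, 1980)
    print(f"{name:9s} early={early:.4f} late={late:.4f}")
```

Because correlation is unaffected by a linear re-scaling, the check works without knowing the scale factors; those can be recovered afterwards by regressing the spliced series on each matching segment.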

Thus there were two successive "adjustments": first the AD1000 PC1 was coerced to have a low-frequency shape like the Jacoby NH composite (this is alluded to in MBH99); then this adjusted PC1 for the AD1000 period was spliced with the AD200 PC1. Remember the hyper-ventilating at http://www.davidappell.com about splicing PC series (which Rutherford and perhaps others had done in the file pcproxy.txt – the file originally provided to us)? Here’s another example.

This entry was written by Stephen McIntyre, posted on Feb 8, 2006 at 9:05 PM, filed under General.

17 Comments

David Stockwell

Posted Feb 9, 2006 at 9:01 AM | Permalink

At what point does an ‘adjustment’ become a hoax? Splices are often involved in hoaxes. Think Piltdown Man, with a modern cranium and primitive jawbone. Think secrecy about revealing details of origins. I suppose in law one would have to show that the adjustments materially reduced the long time scale variance, and secondly that there was intent to deceive. It seems like you have both these ingredients through your meticulous reconstruction, and withholding of such information as R2 values.
jae

Posted Feb 9, 2006 at 10:28 AM | Permalink

The more I learn about these “reconstructions,” the more I think that all this meta-analysis simply obliterates any true signals, probably because of the differences in various proxies, relative to their “reaction times” to warming or cooling. It looks to me like individual proxies, EXCEPT TREE RINGS, are far more useful for reconstructing past temperatures. Am I way off base, here?
John Hekman

Posted Feb 9, 2006 at 11:31 AM | Permalink

Steve, I will be in Mammoth two weeks from now, if you need any pictures of the Mammoth Creek bristlecones.
–John
John Hekman

Posted Feb 9, 2006 at 12:57 PM | Permalink

Sorry, it looks like the Mammoth used here is Mammoth Utah. I’ll be in Mammoth Calif.
Nicholas

Posted Feb 10, 2006 at 10:07 AM | Permalink

jae : IANAS (I Am Not A Scientist) but my own, basic analysis of the proxy data in MBH98, which mostly involved plotting it and examining the plots, is that there is clearly a well-defined signal in all the non-tree proxies. Almost all the tree proxies seem to have no trend, and just hover around some median point, or have no trend and just hover around until 1900 whereupon they spike upwards. The exceptions were those proxies based on tree ring wood density, not on ring width. Those seemed to have signals which correlated well to the other non-tree proxies.

If I averaged together all the non-tree proxies which proxied temperature (not precipitation) along with the two tree ring density series, I got a graph which showed a warming between 1400 (start of series) and about 1550, then a steady decline to around 1850, then a climb back up to a point a little below the earlier peak. It’s hard to calibrate it properly, but that seemed about what I expected. However, since I was only using about 12 of the hundreds of proxies (since they’re almost all tree ring measurements), it could have been geographically biased. It certainly included samples of data from various points around the world, but I can’t guarantee they were spread out evenly. I’d have to plot them on a map. Even then, the coverage of 12 samples would be pretty poor. Still, since a lot of the hundreds of tree ring series are geographically close, I’m not convinced they help all that much in that sense.

IMO, throwing all this data in together makes no sense, especially because the precipitation proxies seemed to tell a different story from the temperature proxies I looked at, and I really don’t see the case for using the tree ring widths at all. The only significant signal in them that I detected is the one which, as M&M have pointed out, the people who studied those particular series decided was not due to temperature.

Anyway, the reason I examined the data in the first place is that I wanted to see what the raw data looked like, not just the result of some statistical munging into what seemed like random noise up to 1900. I didn’t know whether the data looked random, or meaningful, or what. Now I do, and after reading what M&M have reported, encountered few surprises. The best result I got, I think, was to see how clearly the tree ring density series differ from the tree ring width series.
jae

Posted Feb 10, 2006 at 10:36 AM | Permalink

Nicholas: Thanks. That all makes a great deal of sense to me, except I have doubts about wood density showing anything, since it is closely associated with growth rate in softwoods (http://www.city-net.com/albertfp/density.htm). Are you going to look at more proxies?
Nicholas

Posted Feb 10, 2006 at 11:30 AM | Permalink

jae : As time permits.

I was a bit off in my last comment: there are more than 2 wood density proxies. All of them are by Briffa et al. There’s Fennoscandia (one set of data), Northern Urals (one set of data), and North-West America (many sets of data). The way Mann et al. document the series they used is cumbersome and makes it difficult for me to study it. For example, rather than keeping all the North-West America density series separate, they mix them in along with the tree ring width series (at the end of each set in a given American state). Since I was only looking at the two that were easy to separate from the rest, I forgot about the third set.

Here are some graphs, using my non-standard integration method of processing the proxy data to look for trends. They represent, top to bottom:

* The Fennoscandia and Northern Urals wood density series. You can see they seem to have a somewhat similar trend, although not of quite the same magnitude. That may be because of my very basic calibration though.
* The average of three of the Arizona (part of the North-West USA series) tree ring density series. These also seem to have some trends going on, although I haven’t checked all of them to see if they overlap well. I’m not sure what the difference between Az553/554/555 and Az553x/554x/555x is. They seem similar, but not the same. It’s not documented in the mbh98datasummary.txt file which I have.
* Same for California. Again, there seem to be meaningful trends which don’t just look like random noise. However it doesn’t seem to correspond well to the Arizona one. They’re neighbouring states (right?) and both technically deserts I believe, however I think there is somewhat more precipitation in California which might cause some significant differences.
* The averages of the Arkansas (ring width), Arizona (ring width), Vagonov (Soviet tree lines) and South-West USA/Mexico (ring width) series. These four are to demonstrate what seem to me to be basically trendless data sets. I’m just using my eyes, but I don’t see much of significance there. Of course, it’s always possible that the climate didn’t change very much over 600 years at these sites, but that doesn’t seem to square with the other pieces of data. If you average it together, you do get a dip around 1580 and a pretty big dip around 1825-1900, which is not entirely inconsistent with other proxies, but it’s not very convincing to me. The magnitude of variance is a lot less than what the ice cores and boreholes show.
* Just to show it’s not the average which is robbing the data of any signal, individual plots of five of the Arkansas data sets. They seem pretty random to me.

It’s hard for me to say what the tree ring widths or densities represent exactly, but I see more long-term variance in the density data. Perhaps it responds more slowly to small changes in climate? Just guessing.

Please note, this is a very basic analysis, and is only my attempt to examine the data and get a feel for its character.
Nicholas

Posted Feb 10, 2006 at 11:37 AM | Permalink

Interesting, I just noticed, the California tree ring density series don’t seem to have the same post-1900 spike that the California tree ring width series have. I wonder if it’s due to different trees being studied, or that whatever caused the rings to get wider didn’t affect the density? I guess denser rings may well be thinner in general.

Correction: When I said “North-West USA” I meant “Western North America”, which is not quite the same thing. I guess “Western USA” is a better description, since I don’t see any Canadian sites.
jae

Posted Feb 10, 2006 at 12:43 PM | Permalink

Nicholas:

You graphed Az553x/554x/555x; did you also graph Az553/554/555? It would be interesting to see what Mann did here.
Nicholas

Posted Feb 10, 2006 at 10:02 PM | Permalink

jae : Yes, I did, here’s a comparison for both the Az and Ca series. (First Az…, then Az…x, then Ca…, then Ca…x).

They’re actually dynamically generated graphs, you can change what they draw yourself if you want to fiddle with it.

I can’t work out what’s going on. Perhaps “x” indicates that the original authors published a new set of data based on additional trees? Some of them just seem to have been renormalized between the non-x/x series, whereas others look quite different. Ca556x seems a bit odd, I’ve graphed it down the bottom, it starts out really far into the negative zone – perhaps because it’s based on a tree which started growing around the start of the series, maybe in a bad location, so didn’t do well at first.
Nicholas

Posted Feb 10, 2006 at 10:23 PM | Permalink

Of greater relevance to this post (The PC1 in Mann and Jones [2003]), here’s an interesting graph. The top one is of the six series mentioned here. Note how Ca534 spikes. Below it is the regular average of all six. Pronounced “hockey-stick” shape. Then, below, I repeat the same exercise without Ca534. Hockey stick gone! And for my next trick…

Here’s a graph of the first ten in the CA series, with their average below, to demonstrate that the “hockey-stick” shape is not common to many of these series. It’s in a few out of hundreds, from what I have seen so far, but it’s of such great magnitude that it tends to dominate the shape of the average of a large number of the data series. I can’t imagine how much effect weighting those particular proxies by 80% or more must have on the final result.
Steve McIntyre

Posted Feb 10, 2006 at 10:43 PM | Permalink

Nicholas, the java script is nice for showing the proxies. Re #10 – the x suffix shows a density (mxd) series; the others are ring width. ca534 is the bristlecone Sheep Mountain that you’ve heard lots about – it gives over 80% of the eigenvector weighting in the MJ04 PC1. Nice stuff. I’ve posted up data files for the O&B proxies – to the extent that I’ve figured them out. Those would be of current interest as well.
Nicholas

Posted Feb 11, 2006 at 12:41 AM | Permalink

Ah, thanks, interesting. As I said, the mbh98datasummary.txt file which I’m using to work out what each data set represents is confusing to me. First it describes each set according to the location, what it measures, what it represents and who took the measurements. Then in a separate section it lists the data sets present along with the start year, end year, lat/long and source study. The oddity is, the first section ONLY mentions Briffa K. and Schweingruber F.H. in relation to “Dendro density”, not “Dendro ring widths”, but the sets like CA556 are credited to Briffa and Schweingruber, which according to the lack of “x” are ring widths, not density. I find that a bit misleading. Now that you’ve pointed it out, I found the section where it mentions what the “x” means. Thanks for explaining it.

Also interesting is that when I look at the corresponding ring width and density data sets, some of them seem to have very good correlation, such as Az555/Az555x and Az554/Az554x but others like Az553/Az553x have what look like no or negative correlation. Perhaps what is happening is, if the soil is good and precipitation is plenty, higher (lower?) temperatures are allowing the trees to make wider AND denser rings, whereas in other areas the soil is bad or precipitation is poor, and higher (lower?) temperatures are allowing for wider rings, but no more density can be achieved – and in some cases lower. That’s just a guess, but it seems like there must be some process determining whether there is good correlation between the two attributes, or not.

None of the Ca width/density comparisons look to me like a negative correlation but overall the correlation doesn’t seem to be as strong as those two Az series, which look almost identical between the width and density graphs in terms of shape.
jae

Posted Feb 11, 2006 at 8:51 AM | Permalink

Nicholas:

Very interesting plots.

Like all living things, genetics also plays a very important part in tree growth rates, density, etc. Therefore, you can probably find a tree to prove anything you want to.
Steve Hemphill

Posted Feb 12, 2006 at 12:32 AM | Permalink

Steve – your site lists some random series applied to the MBH98 methodology. I graphed them, and was wondering if the math is still valid.

Thanks…

Steve Mc: No one’s thrown any stones at this calculation. Now these are not randomly chosen from the PC1s but were selected to illustrate the high end. But I suspect that, if you selected 14 at random from the population and flipped them to ensure orientation, and then applied O&B methods, you’d get results similar to O&B. It would be a useful exercise. I’ve kept all the simulated PC1s but they are 50 MB in size, split into ten 5 MB packages.
Steve Hemphill

Posted Feb 12, 2006 at 10:45 AM | Permalink

If you would send them to me I’d like to look at them.

Thanks,
Steve
Terry

Posted Feb 12, 2006 at 6:37 PM | Permalink

Does anybody know if there is any pre-processing of the proxy series before they are fed into MBH etc.? To put it another way, are the proxy series simply plots of tree ring widths (or whatever), or are the raw measurements processed in some way to take into account things such as variation in growth rates over the life of the tree?

Thanks.