Many of the good folks who write the papers and keep the databases seem not to use their naked eyeballs. By that I mean, they seriously think that you can invent some new procedure, and then apply it across the board to transform a group of a thousand datasets without looking at each and every dataset to see what effect it might have. Or alternatively, they simply grab a great whacking heap of proxies and toss them into their latest statistical meat grinder. They take a hard look at what comes out of the grinder … but they seem not to have applied the naked eyeball to the proxies going into the grinder.
So, I thought I might remedy that shortcoming for the proxies used in Mann 2008. As a first cut at understanding the proxies, I use a three-pronged approach — I examine a violinplot, a “q-q” plot, and a conventional plot (amplitude vs. time) of each proxy. To start with, let’s see what some standard distributions look like. The charts below are in horizontal groups of three (violinplot, qq plot, and amp/time) with the name over the middle (qq) chart of each group of three.
A “violinplot” (left of each group of three, yellow) can be thought of as two smoothed histograms back-to-back. In addition, the small boxplot in the center of the violinplot shows the interquartile range as a small black rectangle. The median of the dataset is shown by the red dot. Below each violinplot are the best estimates of the AR and MA coefficients as fit by the R function “arima(x,c(1,0,1))”. Note that in some cases either the MA coefficient, or both the AR and MA coefficients, are not significant and are not shown.
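For readers who want to see what an ARMA(1,1) fit is actually estimating, here is a rough sketch in Python (the analysis in the post was done with R's arima(); the function names below are my own, and this uses a crude moment estimator rather than R's maximum-likelihood fit). It leans on the identity that for an ARMA(1,1) process the lag-2 autocorrelation equals the AR coefficient times the lag-1 autocorrelation:

```python
import random

def simulate_arma11(n, phi, theta, seed=42):
    """Simulate an ARMA(1,1) series: x[t] = phi*x[t-1] + e[t] + theta*e[t-1]."""
    rng = random.Random(seed)
    x, x_prev, e_prev = [], 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        x_t = phi * x_prev + e + theta * e_prev
        x.append(x_t)
        x_prev, e_prev = x_t, e
    return x

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var

# For an ARMA(1,1) process, rho(2) = phi * rho(1), so the ratio of the
# lag-2 to lag-1 sample autocorrelations is a moment estimate of phi.
series = simulate_arma11(20000, phi=0.7, theta=0.3)
phi_hat = autocorr(series, 2) / autocorr(series, 1)  # should land near 0.7
```

This is only a back-of-the-envelope check, but it makes clear what kind of persistence the AR and MA numbers under each violinplot are describing.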
The “QQ Plot” (center of three, black circles) compares the actual distribution with the theoretical distribution. If the distribution is normal, the circles will run from corner to corner of the box. The green “QQLine” is drawn through the first and third quartile points. The red dotted line shows the Normal QQLine, and goes corner to corner. It is drawn underneath the QQLine, which sometimes hides it if the distribution is near-normal.
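For the curious, the green QQLine is simple to construct by hand. Here is a minimal Python sketch of the same construction R's qqline() uses (the function name and demo values are mine):

```python
from statistics import NormalDist, quantiles

def qq_line(data):
    """Slope and intercept of the QQ line drawn through the first and third
    quartile points, against a standard normal reference (the construction
    R's qqline() uses). For near-normal data, slope ~ sd, intercept ~ mean."""
    q1, _, q3 = quantiles(data, n=4)       # sample quartiles
    z1 = NormalDist().inv_cdf(0.25)        # theoretical normal quartiles
    z3 = NormalDist().inv_cdf(0.75)
    slope = (q3 - q1) / (z3 - z1)
    return slope, q1 - slope * z1

# Demo: an evenly spaced quantile grid from a normal(mean=2, sd=3) distribution
grid = [NormalDist(mu=2, sigma=3).inv_cdf((i + 0.5) / 1000) for i in range(1000)]
slope, intercept = qq_line(grid)           # slope near 3, intercept near 2
```

Because the line is pinned at the quartiles, it ignores the tails entirely, which is exactly why tail misbehavior shows up so clearly as circles peeling away from the line.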
The standard plot (right of three, blue) shows the actual data versus time. The result of the heteroscedasticity test (Breusch-Pagan test) is shown above the standard plot, with a rejection value of p less than .05.
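For anyone who wants to run the same kind of check on their own series, here is a bare-bones Python version of a Breusch-Pagan-style test (the post's numbers come from R; this sketch and its names are mine). The idea: regress the series on time, then regress the squared residuals on time; under homoscedasticity the statistic n times R-squared is approximately chi-square with one degree of freedom:

```python
import random
from math import erfc, sqrt

def linreg(x, y):
    """Ordinary least squares y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot

def bp_pvalue(time, values):
    """Breusch-Pagan-style test: regress the series on time, regress the
    squared residuals on time, and refer LM = n * R^2 to chi-square(1)."""
    a, b, _ = linreg(time, values)
    resid2 = [(v - (a + b * t)) ** 2 for t, v in zip(time, values)]
    _, _, r2 = linreg(time, resid2)
    lm = len(values) * r2
    return erfc(sqrt(lm / 2.0))  # chi-square(1) upper tail probability

# Demo: noise whose standard deviation grows sixfold over the record
rng = random.Random(0)
t = list(range(500))
noisy = [rng.gauss(0.0, 1.0) * (0.5 + 2.5 * i / 500) for i in t]
p_het = bp_pvalue(t, noisy)  # tiny p-value: strongly heteroscedastic
```

A series whose variance balloons or collapses over time gets flagged immediately, which is the point of showing the number above each blue plot.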
Below are some possible distributions that we might find. A number of them look related to “power law” type distributions (exponential, pareto, lognorm, gamma, fdist) although there are subtle differences. These all look like the letter “J”, and seem to differ in the upper tails, the tightness of the bend, and the placement of the QQLine.
The sinusoidal, Sin + Noise, and Uniform distributions all look like the letter “S”. Comparison with the QQLine shows that there is too much data in the tails. On the other hand, the “normal squared” distribution is a signed square of a normal distribution, and has too little data in the tails.
Finally, note that the ARMA distributions are not at first blush visually distinguishable from normal distributions in the violinplots or QQ plots. However, as you can see from the blue time-series plots, their serial correlation gives them the slow, wandering character of real proxy records; they are more lifelike.
With that in hand, let’s take a look at some of the Mann proxies. What I did was simply go through the proxies and look for anomalies, proxies that seemed strange in some way. I didn’t look at numbers, that comes later.
I’m not sure what to say about the first 18 proxies shown below. A number of these (5, 15, 73, 249, 293) have a very high value in the first years. This gives them what appears to be a power law distribution … but looking at the standard plots in blue, it is obvious that they are not power law distributions. Instead, they are a normal-like distribution with something strange going on in the very first part of the record. Others, like 197 and 204, are known in the trade by the technical term “kinda goofy distributions”. And some, like 22, seem to be fine except for one year that’s way out of kilter. (As an aside, #22 is an excellent example of a proxy where the naked eyeball immediately identifies a problem which the statistics have trouble picking out.)
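As a further aside on #22: a crude robust z-score, deviations from the median scaled by the median absolute deviation, will catch that kind of single wild year mechanically. A Python sketch (the function name and the threshold of 5 are my own choices, not anything from the Mann 2008 methods):

```python
from statistics import median

def wild_years(values, years, threshold=5.0):
    """Flag single-year outliers using a robust z-score: deviation from the
    median, scaled by 1.4826 * MAD (which estimates the sd under normality)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad or 1.0    # guard against a zero MAD
    return [y for y, v in zip(years, values)
            if abs(v - med) / scale > threshold]

# Demo: a tame alternating series with one wildly out-of-kilter year
years = list(range(1900, 2000))
values = [0.5 if i % 2 == 0 else -0.5 for i in range(100)]
values[42] = 50.0
flagged = wild_years(values, years)  # flags only 1942
```

Using the median and MAD rather than the mean and standard deviation matters here, because a single huge value inflates the ordinary standard deviation enough to partly hide itself.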
In the next set of 18 proxies below, we start wandering further afield. Does it seem reasonable that 422, 430, and 468 are temperature proxies? Did they really use the interpolated values in 422? What’s happening at the start of 418, 604, 656, 715, and 885? What’s going on with the recent data in 432? And 316 is just plain bizarro, asymmetrically bimodal with the median (red dot) way up near the top.
Below is a final set of 18 proxies, and the fun continues. To start with, at 897 it looks like we have a real live Poisson distribution, but it is apparently bounded at the top … what’s that doing here? 899 looks like it might be some kind of binomial. 1017 looks like a true Pareto or power-law distribution. 1044 appears to be three different records laid end-to-end. 1055, one of Thomson’s mystery proxies, starts small and increases steadily for a thousand years … temperature proxy? You be the judge.
Moving on, 1062 is one of the proxies that is known to be compromised in recent years. We have more of the early-year high value problem in 1111 and 1148. On the other hand, 1176 is both high and low at the start. And I don’t have a clue what’s going on in the last three (1204, 1206, 1208), except that 1208 looks like an upside-down power law of some kind.
Now, I must confess, I don’t really know how to sum up Mann’s list of candidate proxies. I would call it an “undigested dog’s breakfast”, were it not for the implicit insult to the canine species. For some of these proxies, like 1148, we might be justified in chopping off the early years and using the rest. For others, rejection (or in some cases driving a stake through their hearts) seems more appropriate.
So let me pose a more general question. Mann is using his own special Pick Two comparison method to select, from among these 1209 starting contestants, the proxies that have some nominal relationship with recent temperatures. But it appears to me that the process badly needs some kind of ex ante “pre-selection” criteria. These criteria would weed out those proxies which, although they might pass the “Pick Two” test, and assuredly pass the “grab any proxy” test, don’t pass the naked eyeball test. But the naked eyeball test is not mathematical enough for the purpose. My general question is:
What kind of ex-ante test might we use to exclude certain categories of these proxies?
As a first cut at this question, it does seem that if a proxy is sufficiently heteroscedastic, it should be excluded. One of the main ideas underlying all proxy reconstructions is the “Uniformitarian Principle”. This principle states that the physical and biological processes connecting environmental conditions to the proxy records operated the same in the past as they do in the present. In other words, we assume that the same physical and chemical rules that connect temperature and proxy in the period of the instrumental record have applied throughout the history of the proxy. And as such, the record should reflect that unvarying influence by having relatively constant variance over the period of record.
In addition, regardless of the Uniformitarian Principle, we need to use proxies that are at least somewhat stationary with respect to variance, or our statistics will mislead us. If the proxy variance is different in the present and the past, we will under- or over-estimate past temperature variance.
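The simplest mechanical version of this idea is just to compare the variance of the first half of a record with that of the second half. It is cruder than a formal heteroscedasticity test, but it shows what "stationary with respect to variance" means in practice. A Python sketch (names and demo series are my own):

```python
from statistics import variance

def variance_ratio(values):
    """Ratio of the larger to the smaller of the two half-record variances.
    A ratio far above 1 hints the series is not variance-stationary."""
    half = len(values) // 2
    v1, v2 = variance(values[:half]), variance(values[half:])
    return max(v1, v2) / min(v1, v2)

steady = [(-1) ** i for i in range(100)]                  # constant amplitude
growing = [(-1) ** i * (1 + i / 25) for i in range(100)]  # amplitude grows
r_steady = variance_ratio(steady)    # close to 1
r_growing = variance_ratio(growing)  # well above 1
```

An F-ratio on the two half-variances would turn this into a proper test with a p-value; the point here is only that a record whose swings grow or shrink over time will over- or under-state past variability when calibrated against the modern period.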
Looking at the records above, I would suggest that any proxy record which is heteroscedastic at p less than 0.01 should be excluded from consideration. This would not rule out all of the … mmm … “more interesting” records above, so it might be a “necessary but not sufficient” test for candidate proxies. However, it does distinguish between 1017, which I had called a “true Pareto” before checking the scedasticity, and 1062, a known bad proxy.
Finally, I did not choose the proxies above by any statistical methods. I just picked these because they struck my eye. And there are plenty more. I had to whittle my choices down a ways to get to the 54 proxies shown above.
And as I found out after making these selections, of the 1209 proxies, no less than 571 of them are heteroscedastic at p less than .01.
1. I picked these proxies out, not by looking at their homoscedasticity, not by any kind of statistical analysis, but simply by looking at each and every one of the 1209 Mann proxies. The human eye is a marvelous tool.
2. This process can be described by one word … boring. It is not as glamorous as flying to England to help free vandals, or as interesting as testifying before the US Congress about how you are being muzzled.
3. Nevertheless, it is the most crucial part of the process. Why? Because it is the part of the process that only a human can do. Only by examining them, proxy by proxy, can we come to any reasonable conclusions about their possible fitness as temperature proxies.
Best to everyone,