Spam Karma reset

We’ve had a few new posters fail to negotiate the Spam Karma filter, possibly because they use the same e-mail domains as spammers do. Also, if the comment numbers appear to desynchronize from the quoted comment numbers, it’s most likely because Steve or I have restored a comment or two that the Spam Karma filter should not have caught.

To clear out problems I’ve reset the Spam Karma filter, which means that everybody will be treated as a new commenter until the filter gets used to you again. If you get blocked by the filter, don’t re-send the comment (that just makes your Spam Karma worse); just send a quick e-mail to Steve. Usually, even if you do nothing, the comment will be retrieved within minutes to hours.

If you’re new to commenting on Climate Audit, I would suggest that you save your comments in case Spam Karma blocks you and send the comment to Steve at his e-mail address.

The price of popularity in weblogging is attention from spammers of all kinds, and eternal vigilance over the filter logs lest genuine comments go astray. In the case of Climate Audit, we currently receive around 7–12 spam comments and trackbacks per hour, so you’ll appreciate that this automation is absolutely necessary.

MBH98 Figure 7 Unveiled

Sometimes we hear that science is "self-correcting". To be "self-correcting", however, individual people have to step up to the plate now and then and actually do the work. To the extent that the term "science" includes the work of Mann et al., analysis of MBH98 Figure 7, which was actually the culmination of MBH98, is part of the "self-correcting" process. For all the sniping against blogs, we’ve seen a nice use of blogs over the past few days in exposing MBH98 Figure 7.

Chefen put the issue on the table, pointing out that the partial correlation coefficients simply didn’t tie together. I brought this to the attention of readers here. Jean S was able to decode Mann-speak and derive a graph that matched Mann’s (which I’ve now also been able to replicate). Our replicating the Mann figure doesn’t mean that Mann was "right" and Chefen was "wrong" to point out the discrepancy. The difference identified by Chefen pointed to something potentially unstable in the results. Mann said that his results were "robust" (one of his favorite words) to different choices of window length (he used a 200-year window in the illustration), specifically mentioning that the conclusions were robust to a 100-year window. So Jean S ran the results for a 100-year window. I’ve replicated these results. Jean S’s script is here; mine is here.

It is inconceivable to me that any person could describe what you are about to see as "robust". For example, were a Nature referee presented with the 100 year graphs and asked to endorse the claim of "robustness", I do not believe that the claim would be accepted. But decide for yourselves.

Here’s how Mann described the construction of Figure 7:

We estimate the response of the climate to the three forcings based on an evolving multivariate regression method (Fig. 7). This time-dependent correlation approach generalizes on previous studies of (fixed) correlations between long-term Northern Hemisphere temperature records and possible forcing agents [Lean et al 1995; Crowley and Kim, 1996]. Normalized regression (that is, correlation) coefficients r are simultaneously estimated between each of the three forcing series and the NH series from 1610 to 1995 in a 200-year moving window. The first calculated value centred at 1710 is based on data from 1610 to 1809, and the last value, centred at 1895, is based on data from 1796 to 1995, that is, the most recent 200 years. A window width of 200 yr was chosen to ensure that any given window contains enough samples to provide good signal-to-noise ratios in correlation estimates. Nonetheless, all of the important conclusions drawn below are robust to choosing other reasonable (for example, 100-year) window widths.
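Mann’s recipe can be sketched numerically. Below is a minimal emulation of the evolving-regression idea (my own stand-in, not Mann’s code): within each 200-year window, the NH series and the three forcings are normalized, and the three coefficients are estimated simultaneously by least squares. The function name and the synthetic inputs are assumptions for illustration.

```python
import numpy as np

def moving_window_coeffs(nh, forcings, width=200):
    """Evolving multivariate regression, MBH98 Figure 7 style (sketch).

    nh       : 1-D array, NH temperature index
    forcings : 2-D array of shape (n_years, 3), e.g. CO2, solar, volcanic
    Returns one row of normalized coefficients per window position.
    """
    n = len(nh)
    out = []
    for start in range(n - width + 1):
        sl = slice(start, start + width)
        # normalize each series within the window (zero mean, unit sd)
        y = (nh[sl] - nh[sl].mean()) / nh[sl].std()
        X = forcings[sl]
        X = (X - X.mean(axis=0)) / X.std(axis=0)
        # simultaneous least-squares fit of all three forcings
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        out.append(beta)
    return np.array(out)  # shape (n - width + 1, 3)
```

Plotting each column against the window’s centre year, with the archived NH and forcing series substituted for synthetic inputs, gives curves of the same form as the figures discussed below.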

First, here is Jean S’s replication of MBH98 Figure 7.

Figure 1 : Jean S: solid – emulation; dash-dot lines from MBH98. Blue: CO2, green: solar, red: volcanic. There are no moving averages (smoothing) used here. Jean S: "Those are simply partial correlation coefficients in moving windows plotted such that the year corresponds to the center of the window." SM – I think that Jean S and I are using "partial correlation coefficient" in different ways. Since our replication graphs are the same, underneath the algebra and terminology we’re doing the same thing. I haven’t tried to reconcile the terminology yet.

Now here is my version of the same thing, expressed a little differently. Here I’ve shown in black – MBH98; blue – the regression coefficients from normalized multiple regression (Jean S’s partial coefficients); red – "partial correlation coefficients" à la Chefen. Like Jean S, I’ve obviously replicated the archived reconstruction. An interesting point here – I’ve simply used CO2 values without logging; I think the CO2 curve is so smooth that logging makes no difference. The blue version overprints the black version – so if you can’t tell the difference, it’s because the replication is exact.

200-year window
Figure 2. SM Emulation. Black – MBH98; blue – emulation using regression coefficients from normalized multiple regression – renormalized in each window; red – partial correlation coefficients.

100 Year Window

Remember Mann’s claim quoted above: "all of the important conclusions drawn below are robust to choosing other reasonable (for example, 100-year) window widths." Let’s look at the results using a 100-year window. First, here is Jean S’s version:


Figure 3. Jean S – 100 year window, as Figure 1 above.

Now here is my version, arriving at virtually identical results as Jean S, again expressed a little differently.


Figure 4. Black – MBH98 archived; red – partial correlation coefficients; blue – OLS regression coefficients from regression of scaled series. 100-year window for the latter two.

I hardly need editorialize about some of the key points. Observe that the solar coefficient for some reason goes to negative relationships in the 19th century and then increases dramatically in the 20th century with values exceeding that of the CO2 coefficient. Now let’s see the "conclusions" which are supposedly "robust" to the use of 100 year windows.

The first conclusion is the significance of the correlation coefficients. As an editorial aside, didn’t Mann tell us that calculating correlation coefficients would be a "silly and incorrect thing" to do? And didn’t Wahl and Ammann agree with that? Oh well. Mann:

We test the significance of the correlation coefficients (r) relative to a null hypothesis of random correlation arising from natural climate variability, taking into account the reduced degrees of freedom in the correlations owing to substantial trends and low frequency variability in the NH series. The reduced degrees of freedom are modelled in terms of first-order markovian ‘red noise’ correlation structure of the data series, described by the lag-one autocorrelation coefficient r during a 200-year window…For (positive) correlations with both CO2 and solar irradiance, the confidence levels are both approximately 0.24 (90%), 0.31 (95%), 0.41 (99%), while for the ‘whiter’, relatively trendless, DVI index, the confidence levels for (negative) correlations are somewhat lower (-0.16, -0.20, -0.27 respectively). A one-sided significance test is used in each case because the physical nature of the forcing dictates a unique expected sign to the correlations (positive for CO2 and solar irradiance variations, negative for the DVI fluctuations).
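Mann doesn’t spell out his degrees-of-freedom formula, but a common AR1-based correction (the Bretherton et al. style effective sample size) together with a large-sample one-sided test gives thresholds in the right neighborhood: with an effective N of roughly 28, the 90% and 95% critical values come out near Mann’s 0.24 and 0.31. A hedged sketch, not Mann’s actual procedure:

```python
import numpy as np

def lag1(x):
    """Sample lag-one autocorrelation coefficient."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def n_eff(x, y):
    """Effective sample size under AR1 ('red noise') persistence:
    N_eff = N * (1 - r1*r2) / (1 + r1*r2)."""
    r1, r2 = lag1(x), lag1(y)
    return len(x) * (1 - r1 * r2) / (1 + r1 * r2)

def one_sided_r_crit(neff, z):
    """One-sided critical correlation for normal quantile z
    (1.282 -> 90%, 1.645 -> 95%, 2.326 -> 99%), large-sample approx."""
    return z / np.sqrt(neff)

# e.g. one_sided_r_crit(28, 1.282) is about 0.24 and
# one_sided_r_crit(28, 1.645) is about 0.31, close to the quoted thresholds
```

The 99% value under this approximation comes out a bit above Mann’s 0.41, so his exact recipe evidently differs in detail; the sketch only shows the general shape of such a test.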

The 200-year window has kept the solar coefficient in the positive range, while the 100-year window has the solar coefficient going from positive to negative. The volcanic coefficient is negative in the 200-year window, but its sign changes in the 100-year window. In fact, even the sign of the CO2 coefficient changes in both windows. I haven’t checked the correlation coefficients for significance yet, but my intuition is that the negative correlation coefficient for solar in the 19th century will prove "significant", which makes you wonder how realistic Mann’s significance testing is. Anyway, on to the next conclusion:

The correlation statistics indicate highly significant detection of solar irradiance forcing in the NH series during the ‘Maunder Minimum’ of solar activity from the mid-seventeenth to early eighteenth century which corresponds to an especially cold period. In turn, the steady increase in solar irradiance from the early nineteenth century through to the mid-twentieth century coincides with the general warming over the period, showing peak correlation during the mid-nineteenth century. The regression against solar irradiance indicates a sensitivity to changes in the ‘solar constant’ of ~0.1 K W-1 m-2, which is consistent with recent model based studies [42].

If you go back and look at the 200-year window, you see the rationalization for the bolded remark. There is a sort of local maximum to the solar regression coefficient in the mid-19th century. Now look at the 100-year window. The situation is exactly the opposite. You reach a maximum negative regression coefficient in the 19th century. What a crock. Going on:

Greenhouse forcing, on the other hand, shows no sign of significance until a large positive correlation sharply emerges as the moving window slides into the twentieth century. The partial correlation with CO2 indeed dominates over that of solar irradiance for the most recent 200-year interval, as increases in temperature and CO2 simultaneously accelerate through to the end of 1995, while solar irradiance levels off after the mid-twentieth century.

Using the supposedly "robust" 100-year window, again, this is simply not true. You see increases in the regression coefficients of both solar and CO2 in the 20th century, with solar values actually out-stripping CO2 values. In these calculations, Mann has, by the way, "grafted" the 1980–1995 instrumental record onto the proxy record up to 1980 and used the grafted values in calculating the coefficients. Again, the reported conclusion is simply not true for the 100-year window. Next:

Explosive volcanism exhibits the expected marginally significant negative correlation with temperature during much of 1610–1995 period, most pronounced in the 200-year window centred near 1830 which includes the most explosive volcanic events.

Once again, going to a 100-year window, the conclusions don’t hold up. You get positive "correlations" at the beginning of the graph and again in the early 20th century. From this "analysis", Mann then concludes:

It is reasonable to infer that greenhouse-gas forcing is now the dominant external forcing of the climate system.

This is the analysis. I kid you not. Let’s suppose that a 3rd year student handed in this analysis – what grade would it get just as statistics? You probably wouldn’t fail a phys ed student in a state university who was taking a required statistics course, especially if he played on the basketball team. How about a Ph.D. student at an Ivy League university?

The other remarkable thing about this analysis is that this is so easy to do. In this case, Mann archived the data. Anyone – including me – could have done this analysis at any point in the past 6 years. It could have been done by our phys ed student mentioned above. One presumes that no one ever has or else the topic would have emerged before now. So score one for the blogs.

MBH98 Figure 7

Since Chefen has brought this figure into play, there’s much to consider about it. [Also see followup here.] For interested parties, the data is at WDCP here. I haven’t got to checking Chefen’s results yet, but wanted to table some interesting results in passing. As Jean S noticed, the Corrigendum stated that an “old” version of the reconstruction was used in these calculations, but that it didn’t “matter”. I’ve plotted the two versions together and, for the purposes of his correlations, the difference between the two versions probably doesn’t “matter”, but it’s worth wondering why the two versions differ. Secondly, Figure 7 contains a splice of instrumental and proxy records, which presumably carries forward into the correlations. I’ll review what Mann has said in the past about such splices.

The Ritson Coefficient

In my previous comment on Ritson’s AR1 calculation, I think that I correctly diagnosed that the calculation was goofy, but I didn’t correctly diagnose what was going on (thanks to Demetris Koutsoyiannis, who emailed me). I’ve re-visited it and I’m pretty sure that I’ve now diagnosed the problem with what Ritson was doing. My starting premise was 100% correct – if you simply do an AR1 fit to the US tree ring series, you get the high AR1 coefficients that I reported before, which differ radically from the Ritson coefficient. So you have to ask yourself: if there’s a perfectly good algorithm for calculating AR1 fits, why does Ritson propose a new algorithm for calculating an AR1 coefficient? Why wouldn’t he just use a standard algorithm? Needless to say, I am naturally pretty suspicious of Hockey Team non-standard algorithms.

Anyway, I checked Ritson’s method against synthetic AR1 series of varying AR1 coefficients up to and including random walk and, while the answers were different than those of a standard algorithm, they were all in the right general range.

Then I experimented with ARMA(ar=.9,ma=-0.6) noise, a type of noise pretty familiar from climate time series (leaving aside the larger question of long-term persistence and multiple scaling). ARMA (1,1) series are something that should be on the radar screen even of the Hockey Team.

Here the performance of the various methods varied fantastically. This is based on very quick simulations. If you correctly specified the model as ARMA(1,1), estimates of the AR1 coefficient using the standard arima function in R were 0.80–0.93, all pretty reasonable. If you estimated the AR1 coefficient using a mis-specified ARMA(1,0) model, you got AR1 coefficients in the 0.34–0.53 range, which, interestingly enough, is also the range of observed AR1 fits to North American tree ring data.

Now for the Ritson coefficient: it was in the 0.0 to 0.2 range, again almost exactly the Ritson coefficients for the North American tree ring network. So the Ritson method fails catastrophically in the face of ARMA(1,1) noise. A conventional AR1 calculation is a little more stable against misspecification of ARMA(1,1); the Ritson method goes haywire.
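These simulations are easy to redo. The sketch below is my own stand-in, not the original R script: it generates ARMA(1,1) noise with ar=0.9, ma=-0.6 and fits a mis-specified AR(1) by Yule-Walker, which reduces to the sample lag-one autocorrelation.

```python
import numpy as np

def arma11(n, ar=0.9, ma=-0.6, seed=0):
    """Simulate ARMA(1,1): x[t] = ar*x[t-1] + e[t] + ma*e[t-1]."""
    rng = np.random.default_rng(seed)
    e = rng.normal(size=n + 1)
    x = np.zeros(n)
    x[0] = e[1] + ma * e[0]
    for t in range(1, n):
        x[t] = ar * x[t - 1] + e[t + 1] + ma * e[t]
    return x

def ar1_yw(x):
    """Mis-specified AR(1) fit: the Yule-Walker estimate is just
    the sample lag-one autocorrelation."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

x = arma11(20000)
misspecified = ar1_yw(x)  # lands near 0.49, the process's theoretical
                          # lag-1 autocorrelation, not near the true ar = 0.9
```

The theoretical lag-one autocorrelation of ARMA(1,1) is (1 + ar·ma)(ar + ma)/(1 + 2·ar·ma + ma²), about 0.49 for these parameters, i.e. inside the 0.34–0.53 range of the mis-specified fits reported above.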

Of course at bizarroclimate, they won’t care about such details.

It’s never easy with the Hockey Team.

http://www.climateaudit.org/?p=682

Something New on MBH98

Just when you think that the MBH98 Little Shop of Horrors has been fully explored, something new turns up. I’ve never spent any time on the last part of MBH98, where Mann does a "detection and attribution" study linking his temperature reconstruction to solar, volcanic and CO2 forcing. All these detection and attribution studies (e.g. Hegerl) richly deserve close analysis. Anyway, a blogger in New Zealand has written some highly provocative analyses, including a close analysis of first differences, something one of our readers was wondering about. For readers who will only read articles listed in the Jesuit Index or Nature/Science, these analyses may not be for you. For readers who are interested in what’s going on with these studies, go there immediately. It’s at the felicitously named Sir Humphreys; go to Chefen. I’ve made a few snarks about signal processing approaches to things, but Chefen shows the strength of a signal processing approach in knowledgeable hands.

Here are a few teasers, but please don’t merely accept my short characterization of the results, check it out.

Chefen examined the Fourier spectra and Allan variances of the first differences in the Mann temperature reconstruction, obtaining

"a nearly perfectly straight line sloping downwards … with a slope of nearly -1. To anyone who works with signal processing that means… almost a pure noise source…. It looks whiteish with the low frequency removed, or as if some noise has been high-pass filtered."

He states:

If you do want to create your own Mann-method temperature noise signal, then just use a N(0,0.0115) distribution and add it up. It’s a fairly good approximation.

Now for the next interesting find. Chefen did a simple check on the correlation of the temperature reconstruction to solar and to CO2, reporting:

Note at this stage that Mann outright claimed that the correlation for the CO2 data in the 20th century was much much higher than that for the solar data. Now straight away we are suspicious of that, because just looking at the graph above it is hard to say which is a better match. Indeed, if we do an actual correlation then the coefficient for the solar data is 0.804 while that for CO2 data is 0.815. That is, marginally higher for CO2 but the difference is utterly insignificant. We really have no cause for preferring one over the other and CO2 certainly does NOT dominate.

Now for the topper. Chefen gets exactly the same coefficients from random data processed according to the sum of filtered white noise described above:

Let’s take that idea a bit further; now I’ll generate a purely random data set in the following way…

1. Create a normally distributed data set with the same mean and standard deviation as the difference temperature data. This is pretty much like white noise and distributed as N(0.045, 0.201).
2. High-pass filter this with a cutoff frequency of 0.1/year to make the spectrum match that of the difference temperature data, which lacks power below 0.1/year.
3. Sum this data to create a noise data set that has the same characteristics as the real temperature data.
4. Calculate the correlation of this noise with the temperature data, just as we did for the solar and CO2 data.
5. Repeat this 100,000 times to get a good idea of what is going on.

The correlations with purely RANDOM data sets are tightly bunched in the range 0.80 to 0.83! The correlations for the CO2 and solar data lie completely WITHIN this range. What does this say?
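Chefen’s five steps can be sketched directly. Everything below is my reading of his recipe, with a synthetic stand-in for the temperature series (the archived reconstruction isn’t loaded here); in particular, I assume the stated mean of 0.045 acts as a drift that is added back after the high-pass filter, since the filter would otherwise remove it.

```python
import numpy as np

def chefen_noise(n, mean=0.045, sd=0.201, cutoff=0.1, rng=None):
    """One synthetic series per Chefen's recipe, as I read it:
    white noise (step 1), high-pass filtered (step 2), summed (step 3)."""
    rng = np.random.default_rng() if rng is None else rng
    e = rng.normal(0.0, sd, n)
    # step 2: zero all Fourier components below the 0.1/year cutoff
    f = np.fft.rfftfreq(n, d=1.0)
    E = np.fft.rfft(e)
    E[f < cutoff] = 0.0
    e_hp = np.fft.irfft(E, n)
    # step 3: integrate the filtered noise plus the drift (my assumption)
    return np.cumsum(e_hp + mean)

# steps 4-5: correlate many realizations against a fixed target series
rng = np.random.default_rng(7)
target = chefen_noise(385, rng=rng)   # stand-in for the reconstruction
corrs = [np.corrcoef(chefen_noise(385, rng=rng), target)[0, 1]
         for _ in range(200)]
```

Against this synthetic target the correlations bunch even more tightly than Chefen’s 0.80–0.83, presumably because the drift term dominates every realization; the qualitative point, that independent random series all correlate about equally well with the target, survives.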

It’s quite remarkable. Will his observations become more true if they are published in Nature or more formally in a journal? I don’t think so. They are either right or wrong as they stand. The journal is simply a form of endorsement and an effective means of disseminating the information to non-specialists. You have some qualified specialists, who may or may not have their own agendas, telling you that the article is worth reading. In this instance, I view myself as a qualified specialist and I’m telling you that this is worth reading. I’m not telling you that it’s necessarily true, as I haven’t replicated the simulations. But Nature referees don’t do that for you either.

Enron’s Climate Change Policy

I’ve posted up a little treat from the past – Enron’s climate change policy as downloaded in October 2002. I knew that Enron was in favor of Kyoto before I knew of Michael Mann. In 1998, Enron received the Environmental Protection Agency’s (EPA) Climate Protection Award for its “exemplary efforts and achievements in protecting the global climate.” Their pamphlet includes the following interesting definition of climate change:

Climate change, also known as “global warming,” is a phenomenon that occurs when “greenhouse gases” are released into the atmosphere.

I don’t want to engage personally in a thread about carbon trading, other than to say that, as someone with some limited international trading experience, I can think of some important reasons why a person concerned about climate change could rationally oppose the Kyoto trading system as a relevant means of achieving that goal.

Enron Verdict

A big story today is the guilty verdict on Enron executives Kenneth Lay and Jeffrey Skilling. There are many interesting issues involved in this, but the one that I wish to draw to attention of readers here is that Lay and Skilling were not found guilty of stealing money or looting the treasury, but of dishonesty and withholding the truth.

For example, see this summary of the prosecutor’s closing argument:

But all the talk from lawyers ended Wednesday morning at 10:35, when prosecutor Sean Berkowitz summed up the government’s case against the defendants with a simple maneuver.

Berkowitz pulled out a large poster, which he displayed on an easel. On the one side, in big, capital letters, was the word, "TRUTH," and on the other, "LIES." After all the intricately detailed testimony that came before the jury about "dirty hedges," "goodwill write-downs," "monetizations" and "dark fiber sales," the black-and-white chart boiled it down to those two words.

"These men lied," he declared. "They withheld the truth.
They put themselves ahead of the investors. I’m asking you to send them a message, that it’s not all right. You can’t buy justice. You have to earn it."

The people who bought Enron’s stock, Berkowitz said, "weren’t entitled to much, but they were entitled to honesty." After he finished, deliberations began.

It looks like the sentences are going to be stiff.

"But even if they are convicted of just one count each, their sentences are sure to be stiff. Judge Lake, for one, has already shown that he takes corporate fraud seriously.

He sentenced Jamie Olis, a former vice president of finance at Dynegy Inc., to 24 years in prison for his role in a $300 million accounting scam at the Houston energy firm.

Last year, the Fifth US Circuit Court of Appeals in New Orleans threw out the sentence, but upheld the fraud conviction. Lake has not yet resentenced Mr. Olis.

"These men [Lay and Skilling] are going to be in jail for decades if convicted," says Mr. Zamansky.

There is some very interesting blog commentary on the Enron verdict here.

In this case, the verdicts are going to be appealed. Lay and Skilling did not necessarily do the actual fraudulent calculations themselves. Andrew Fastow has already been convicted as the main architect of the fraudulent calculations. The question for Lay and Skilling is whether they remained "deliberately ignorant".

The relationship of the frauds at the heart of the present charges to the Enron collapse is interesting and I don’t think that it is well understood. The frauds in the present charges are a series of limited partnerships that were concocted to disguise writeoffs. But the losses in these limited partnerships were not what caused the collapse of Enron. The losses in these limited partnerships were a very small fraction of the total writeoffs involved in Enron. Had all these limited partnerships made good, the Enron collapse would have been delayed only a little while. The real problem with Enron is that they made a lot of crappy investments with minimal due diligence. But Lay and Skilling were not charged with making lousy investments. They were charged in connection with things that probably constituted less than 5% of the total collapse in monetary terms, if that.

But while the direct impact of the frauds on the balance sheet was (I think) not the direct cause of the collapse, the leverage was fantastic, as the frauds made Enron look profitable, which was essential for it to keep raising money. If they had reported even a slight loss at any time, the wheels would have fallen off the money raising and people would have asked questions. So they avoided taking writeoffs and developed ever more fantastic methods of parking non-performing assets, just to get knife-edge profits. Even slight profits were enough to satisfy the "consensus" of investors that this was one terrific company. In fact, by "consensus", in 2000, Enron was voted the best-managed firm in the U.S. and convicted felon Andrew Fastow was the "consensus" financial executive of the year.

In Eichenwald’s terrific book on Enron, the first person credited with noticing the problems was a short seller, who really came out of left field. He simply noticed what was, in effect, a statistical anomaly – the profits were minuscule relative to the capital employed and they always came out fractionally positive. When you had large fluxes in and out, it didn’t make sense that the knife edge always came out just positive. He wondered what accounting decisions had been made. I think like a short seller. Whenever I see knife-edge balances, like the knife-edge balance by which the net index from modern proxies comes out a hair warmer than the index from medieval proxies in many multiproxy studies, I wonder what accounting decisions were made. You can dress it up in statistical language, but civilians can think of the issues as being accounting decisions. Sometimes you need to look at more than one thing. Andrew Fastow did.

I often talk here about the need for full, true and plain disclosure. I don’t say this out of any belief that businessmen are more honest than academics. I don’t think that at all. All I’m saying is that breaches of the obligation of full, true and plain disclosure are serious and people are being sent to jail for breaching these obligations. Maybe not enough. Withholding the truth, as noted above, is a form of criminal dishonesty just as much as overt lying and was clearly involved in the charges against these two Enron executives. Codes of academic conduct have fairly similar obligations and the omission of adverse results can amount to misconduct, in much the same way that withholding truth from investors can amount to fraud.

AR1 on First Differences

The question for today is how realclimate goes from tree ring series with autocorrelation functions like the one in the figure below to a claim that these proxies have an AR1 coefficient of 0.15. We know that they are pranksters, but this looks like a good prank, and it is.

Autocorrelation Function for Sheep Mountain. AR1 coefficient is 0.7. Red line shows rapid exponential decay of even an AR1=0.7 autocorrelation.
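The "red line" in the caption is just the theoretical AR1 decay r^k. A generic sketch for computing both curves (the Sheep Mountain chronology itself isn’t loaded here):

```python
import numpy as np

def acf(x, nlags):
    """Sample autocorrelation function at lags 0..nlags."""
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(nlags + 1)])

def ar1_decay(rho, nlags):
    """Theoretical AR1 autocorrelation rho**k: the 'red line'."""
    return rho ** np.arange(nlags + 1)
```

For a true AR1 series the sample ACF tracks the red line; the point of the figure is that the tree ring ACF sits well above it, i.e. the persistence is not AR1.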

One of our readers (and theirs) wrote to them and wondered whether they had taken first differences prior to doing their calculations. (This turns out to be the case, or at least part of the story, as shown below.) Mann categorically denied that any first differencing took place.

The article you link to uses a differencing technique “to remove the large variance highly correlated slow component from consideration prior to determining the AR1 autocorrelation component.”

This doesn’t sound like a straight-forward estimate of an autocorrelation coefficient. What do you get if you use the standard method of estimating autocorrelation coefficients? Also, in the Von Storch analysis, which type of model are they using: a standard AR1 model, or a model that corresponds to this differencing technique that removes the highly correlated slow component?

Thanks in advance for the clarification.

[Response: You misunderstand what he’s done. He hasn’t performed any differencing of the data at all. He has simply calculated the lag-1 autocorrelation coefficient of the actual proxy data and has provided an argument for why this should be representative of the noise autocorrelation. In fact, it is easy to verify that this is the case using synthetic examples from climate model simulations with added AR(1) noise. -mike]

The reader wrote back and again Mann denied any first differencing.

If you could indulge me some more, I need some more education.

My reading of the Ritson paper you link to suggests that the differencing technique he uses removes ANY highly autocorrelated slow component before calculating the AR1 coefficient.

My question then is, what happens if there is NO temperature signal in the data, but there IS an extraneous, highly autocorrelated signal that is not temperature related (say a CO2 fertilization effect or a precipitation signal). Does the procedure remove the confounding, signal? Presumably the procedure cannot tell the difference between the “true” signal and an extraneous signal with the same statistical properties. So is it possible the procedure is removing exactly the type of red noise that Von Storch is trying to simulate?

[Response: You misunderstand what he’s done. He hasn’t performed any differencing of the data at all. He has simply calculated the lag-1 autocorrelation coefficient of the actual proxy data and has provided an argument for why this should be representative of the noise autocorrelation. In fact, it is easy to verify that this is the case using synthetic examples from climate model simulations with added AR(1) noise. -mike]

Now let’s look at Ritson’s description of what he did, as shown in the following excerpts. His autocorrelation coefficients are calculated from the series Y, which are clearly the first differences of the original proxy series X. So Mann is pulling our legs when he says that there is no first differencing (sort of like him saying, I did not have r2 with that statistic, Miss Lewinsky).



If one calculates results according to his last formula on the 70-series AD1400 MBH98 tree ring network, conveniently archived at GRL in connection with our paper last year, sure enough one gets an average of 0.14 for the period 1400–1980. I got a slightly higher value for the period 1400–1880, said by Ritson to have been used in his calculations. I’m not sure what the differences are, but it looks like I’ve replicated his calculations.

The formula shown by Ritson would hold for an AR1 process. Simple prudence would dictate also calculating the actual AR1 coefficient: the software is convenient, and it takes one line of code to fit and another line to graph. In fact, the actual AR1 coefficients of the differenced series, calculated freshly, are strongly negative.

When one is looking at ARMA models, it’s always a good idea to look at more than AR1 models and, for climate series, I think that it should be mandatory to look at ARMA(1,1) models. Again the differences are striking. Instead of the AR1 coefficients being negative, they are more widely dispersed and there is a very, very strong MA1 coefficient.

I looked specifically at confidence intervals for Sheep Mountain (illustrated above) and, for the first differences used by Ritson, the AR1 coefficient was 0.08 (se 0.06) and MA1 coefficient -0.75 (se 0.04). This was a series with an AR1 coefficient near 1 modeled on the proxy data itself.

Now think about what would happen under the Ritson method if you started with a random walk. First differencing transforms the series to white noise, which would have a Ritson coefficient of 0. This is a total fiasco. It makes Rasmus look competent.
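Ritson’s exact formula appears only in the image excerpts above, so the sketch below demonstrates the generic failure mode rather than his archived method: any lag-one estimate computed on first differences sees almost none of the persistence in the original series. A standard calculation gives the differences of an AR1(ρ) process a lag-one autocorrelation of -(1-ρ)/2, near zero for persistent series and exactly zero for a random walk.

```python
import numpy as np

def lag1(x):
    """Sample lag-one autocorrelation coefficient."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

rng = np.random.default_rng(2)
n, rho = 50000, 0.9
e = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = rho * x[t - 1] + e[t]

direct = lag1(x)           # ~0.9: the straightforward AR1 estimate
diffed = lag1(np.diff(x))  # ~ -(1 - rho)/2 = -0.05: persistence nearly erased

w = np.cumsum(rng.normal(size=n))   # random walk
rw_diffed = lag1(np.diff(w))        # ~0: differences are white noise
```

So a coefficient computed from differenced data sits near zero whether the underlying series has AR1 persistence of 0.9 or is a full random walk, which matches where the Ritson coefficients land (0.0–0.2).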

A Letter to Ritson

Ritson at realclimate did not thank me for helpful discussions on autocorrelation, despite lengthy correspondence on my part with him. I thought that the histogram that I posted up earlier today looked familiar. So I looked back at my correspondence with Ritson (who posted up on autocorrelation at realclimate) and, sure enough, I’d sent the identical histogram to him in November 2004. So Ritson had seen correctly calculated autocorrelation coefficients a long time ago. The letter was interesting to re-read in the present context.

I tend to be most interested in empirical points and from reading your paper, I see one very obvious bit of empirical information which we could helpfully report (and I think that the absence of this may have been frustrating you) – the AR1 coefficients of the North American tree ring network. In the AD1400 network used by Mann, there are 70 sites with AR1 coefficients ranging from 0 to 0.79. Below is a histogram. Obviously the AR1 persistence in these tree ring site index series is greater than in temperature series and I think that this turns out to be very important both in principal component analysis (especially as done by Mann) and in the downstream regressions. Curiously, the AR1 coefficients seem to me to be more strongly correlated to the author than to any other variable. Series done by Stahle have little autocorrelation (and Durbin Watson statistics around 2), while series done by Jacoby and Graybill have very high autocorrelation and Durbin Watson statistics sometimes under 1. This is inadequately discussed in the literature.

Secondly, the AR1 coefficients underestimate the actual persistence at many sites. For example, the Sheep Mountain site has an AR1 coefficient of 0.76 under an ARMA(1,0) model, but its actual persistence is much greater. The ACF is shown below, with the red line showing the iterated AR1 coefficient. I've gotten interested in fractional processes to deal with this sort of situation, following Mandelbrot (who actually calculated Hurst parameters for some tree ring series). I suspect that what I've described as a fractional process (using Whitcher's algorithm following Hosking (1981)) would be recognisable to you as your 1/f process. Interestingly, Hurst, of the Hurst parameter in 1/f processes, was a hydrologist who studied fluctuations of the Nile, a climatic series.

Thirdly, here is a graphic showing the relationship between the weighting of a site chronology under Mann's PC method and its AR1 coefficient. I think that you will agree that it is a very strange scatter plot. The 14 series on the right account for over 99% of the variance in the PC1. As you see, there is a strong association between the AR1 coefficient and the EOF1 weighting – not an effect that one would normally seek, and one that surely points to some problem with the method. The over-printed sites are sites which Mann excluded in an unreported sensitivity study – which gave him very different results than he reported. As I've mentioned to you before, I am very struck by this omission, which would not be permitted in a securities offering. When you look at the sites marked with the overprint, they are all bristlecone pine sites, mostly collected by Donald Graybill and reported in Graybill and Idso (1993). There are many curious features about bristlecone pines, and their growth index has never been shown to be a climate proxy. The extreme right-hand site is the one shown in the ACF plot above.
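The Durbin–Watson statistics mentioned in the letter map directly onto AR1 coefficients via the approximation DW ≈ 2(1 − ρ₁): a DW near 2 means negligible serial correlation (the Stahle-type series), while a DW well under 1 implies strong persistence (the Jacoby/Graybill-type series). A quick simulation, my own sketch rather than anything from the correspondence, shows the relationship:

```python
import random

def durbin_watson(e):
    """DW = sum (e_t - e_{t-1})^2 / sum e_t^2, approximately 2*(1 - rho1)."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v * v for v in e)
    return num / den

random.seed(2)
n = 5000
# White noise: no serial correlation, DW near 2.
white = [random.gauss(0, 1) for _ in range(n)]
# AR(1) with phi = 0.7: strong persistence, DW near 2*(1 - 0.7) = 0.6.
ar = [0.0]
for _ in range(n - 1):
    ar.append(0.7 * ar[-1] + random.gauss(0, 1))

dw_white = durbin_watson(white)  # near 2
dw_ar = durbin_watson(ar)        # well under 1
```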

I find it incredible that Ritson can seriously propose AR1=0.15 as a model for a series with an autocorrelation function looking like Sheep Mountain in the second figure. Especially when the issue had been specifically brought to his attention.

As a passing comment, the above letter also illustrates how constructive I was in correspondence. I probably have been a little unguarded in this respect as, for example, I sent Bürger a considerable amount of detailed information, but did not receive any acknowledgement in Bürger et al 2006.

Red Noise at realclimate

realclimate today has a post How Red are My Proxies? which is so weird it's worthy of Rasmus. (Note: see subsequent comment here.) They discuss the autocorrelation properties of North American tree ring proxies, something about which I know a lot. They say:

Using data from the North American network of seventy sets of tree rings extending from 1400 to 1980 you obtain an actual one-year AR1 mean autocorrelation factor with a value close to 0.15 (the exact number depends on the proxy series and time period chosen but is always less than about 0.3).

They are nuts. Here's a histogram of the AR1 coefficients of the 70-series MBH98 tree ring network, which we archived in a readable table in connection with our GRL paper. I've included a short R script here to calculate AR1 coefficients. The mean autocorrelation was not 0.15, but 0.4; out of 70 AR1 coefficients, only three were less than 0.15. The range of values was from 0.03 to 0.79. Tellingly, the highest AR1 coefficients all belonged to bristlecones.
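The calculation itself is straightforward. The real computation is in the archived R script against the archived table; the following Python sketch mirrors it on simulated stand-in chronologies (the 70 series and their phi values here are hypothetical, chosen only to span the observed 0.03–0.79 range):

```python
import random

def ar1_coef(x):
    """OLS estimate of phi in x[t] = c + phi*x[t-1] + e[t]."""
    y, z = x[1:], x[:-1]
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    num = sum((z[i] - mz) * (y[i] - my) for i in range(n))
    den = sum((v - mz) ** 2 for v in z)
    return num / den

def simulate_ar1(phi, n=580, rng=random):
    """Stand-in chronology: AR(1) series of roughly 1400-1980 length."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x

random.seed(1)
# 70 hypothetical sites with true phi spread over 0.03 to 0.79.
phis = [0.03 + 0.76 * i / 69 for i in range(70)]
estimates = [ar1_coef(simulate_ar1(p)) for p in phis]
mean_est = sum(estimates) / len(estimates)  # near 0.4, nowhere near 0.15
```

With true coefficients spread over the observed range, the network mean lands near 0.4, which is what the archived data show and more than double realclimate's claimed 0.15.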

But it's even worse than that. If you model the series as ARMA(1,1), the AR1 coefficients increase dramatically, with large negative (and nearly always statistically significant) MA1 coefficients. Many of the AR1 coefficients then become close to 1, i.e. random-walk levels, especially for the bristlecones. The statistical properties of this type of series, high AR1 and negative MA1, are trickier than people think. I've posted up notes on them by Ai Deng, for example.
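Why does this matter for the lag-1 numbers? For an ARMA(1,1) process with an AR root near 1 and a large negative MA1 term, the lag-1 autocorrelation is far below the AR coefficient, so a plain AR(1) fit badly understates the persistence. A sketch using illustrative parameters (phi = 0.95, theta = -0.75; these are assumptions for demonstration, not fitted values for any particular site):

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

random.seed(3)
phi, theta = 0.95, -0.75  # AR root near 1, large negative MA1
n = 20000
x, x_prev, e_prev = [], 0.0, 0.0
for _ in range(n):
    e = random.gauss(0, 1)
    x_new = phi * x_prev + e + theta * e_prev  # ARMA(1,1) recursion
    x.append(x_new)
    x_prev, e_prev = x_new, e

# Theoretical lag-1 ACF: (1 + phi*theta)*(phi + theta) /
# (1 + 2*phi*theta + theta**2), about 0.42 here, versus an AR root of 0.95.
r1 = lag1_autocorr(x)
```

A naive AR(1) fit to such a series reports a modest coefficient in the 0.4 range even though the underlying AR root is 0.95, which is exactly the trap in treating these proxies as mildly red.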

I have no idea how realclimate got their results. Their whole post looks completely goofy to me.

The other salient point – and we included this histogram in our Reply to Von Storch – is that the tree ring series in this network have virtually no correlation to gridcell temperature; many have correlations to precipitation and, of course, the bristlecones have a correlation to CO2 levels.