Comments on: Replicating Santer Tables 1 and 3

By: Georges

Georges — Tue, 05 Jan 2010 00:01:34 +0000

On your post Oct 22 2008 Replicating Santer Tables 1 and 3

Trying to duplicate your numbers but I cannot get the data.
http://data.climateaudit.org/data/models/santer_2008_table1
http://data.climateaudit.org/scripts/spaghetti/msu.glb.txt
Not on your site anymore. Could you send the second file (I can always type the first) to the email address above.

Thanks

Georges

[RomanM: The site has been rearranged. Try using

http://www.climateaudit.info/data/ as a starting point for data

and

http://www.climateaudit.info/scripts/ for scripts.

For example, look at http://www.climateaudit.info/data/models/ for the Santer file.]

By: Kenneth Fritsch

Kenneth Fritsch — Wed, 29 Oct 2008 14:47:18 +0000

In reply to Hu McCulloch. Re: Hu McCulloch (#66), Hu, I looked at using annual data on another thread (linked to my 2 posts below). Annual data did indeed decrease the all-important adjusted trend standard deviation. It appears to me that Santer et al. selected the optimum conditions for obtaining a large trend standard deviation by choosing the monthly over the annual data and the 1979-1999 time period over the 1979-2007 (or at least the 1979-2006) time period. I am also looking at the monthly data for the globe and zonal regions of the globe to determine the adjusted trend standard deviation. Thus far it appears that the tropics zone is rather unique in having an AR1 correlation large enough to have a large influence on the resulting trend standard deviation adjustment. Re: Kenneth Fritsch (#30), Re: Kenneth Fritsch (#32),

By: Hu McCulloch

Hu McCulloch — Wed, 29 Oct 2008 04:49:48 +0000

Re Ken Fritsch #65,

So what you are saying is that the uncertainty of any warming trend (and alternatively with the possibility of a cooling trend) in the tropics over the past 30 years should be even greater than determined in Santer et al. (2008).

That’s about right. Santer, Nychka et al (2008) should have paid attention to the excellent (if perhaps not definitive) Nychka, Santer et al (2000)!

By the way, what would the use of annual data, in place of the monthly used in Santer et al., do to eliminate the need for the AR correction and to the calculation of a standard deviation or SEM and/or the use of the differences between the surface and troposphere temperature trends?

Annual data would greatly reduce the nominal sample size, but at the same time would probably greatly reduce the serial correlation, and therefore the need to adjust the effective sample size to be far smaller than the nominal sample size. So you probably wouldn’t lose much by just going to annual data.
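A rough way to check the point about annual data is sketched below in Python. It assumes, purely for illustration, that the monthly residuals follow an exact AR(1) process with a hypothetical coefficient rho, and computes the lag-1 autocorrelation of non-overlapping 12-month averages directly from the AR(1) autocorrelation function. The function names and the example values (rho = 0.9, 29 years) are assumptions, not taken from Santer et al.

```python
def block_mean_lag1(rho, m):
    """Lag-1 autocorrelation of non-overlapping m-period means of an AR(1) series.

    Uses the AR(1) autocorrelation rho**|k| and sums it over all pairs of
    observations within one block (variance) and across two adjacent blocks
    (covariance). The common 1/m**2 factor cancels in the ratio.
    """
    var_sum = sum(rho ** abs(i - j) for i in range(m) for j in range(m))
    cov_sum = sum(rho ** (j - i) for i in range(m) for j in range(m, 2 * m))
    return cov_sum / var_sum


def classical_ne(n, rhohat):
    """Classical effective sample size n*(1-rhohat)/(1+rhohat)."""
    return n * (1 - rhohat) / (1 + rhohat)


if __name__ == "__main__":
    rho_monthly = 0.9   # hypothetical monthly AR(1) coefficient
    years = 29          # hypothetical record length
    rho_annual = block_mean_lag1(rho_monthly, 12)
    print("annual lag-1 autocorrelation:", round(rho_annual, 3))
    print("monthly ne:", round(classical_ne(12 * years, rho_monthly), 1))
    print("annual ne: ", round(classical_ne(years, rho_annual), 1))
```

Under these assumptions the annual series is much less serially correlated than the monthly one. Note that annual means of an AR(1) process are not themselves exactly AR(1), so the annual effective sample size printed here is only indicative.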

By: Kenneth Fritsch

Kenneth Fritsch — Mon, 27 Oct 2008 13:59:33 +0000

In reply to Hu McCulloch. Re: Hu McCulloch (#64),

I don't understand all the data issues in the Santer … Nychka et al 2008 IJC paper, but for what it's worth, they say they correct for serial correlation with the "classical" ne = n*(1-rhohat)/(1+rhohat) formula (their equation (6)), rather than with the improved Nychka, … Santer et al 2000 "adjusted" formula. By the admission of no less than 5 of their 17 coauthors, they are therefore substantially undercorrecting for serial correlation. This means that their CI's are too small, and that they are overrejecting the hypothesis of equality of trend slopes (as well as of zero trend slopes!). But what the heck, it's only …

So what you are saying is that the uncertainty of any warming trend (and alternatively with the possibility of a cooling trend) in the tropics over the past 30 years should be even greater than determined in Santer et al. (2008). That appears to me to make the proposition that Santer et al. imply (i.e., that with warming (AGW) in the tropics, the ratio of troposphere to surface temperature trends should be greater than 1) even more unworkable, since the warming part is gravely and statistically in question. By the way, what would the use of annual data, in place of the monthly used in Santer et al., do to eliminate the need for the AR correction and to the calculation of a standard deviation or SEM and/or the use of the differences between the surface and troposphere temperature trends?

By: Hu McCulloch

Hu McCulloch — Mon, 27 Oct 2008 01:39:01 +0000

Steve (#9) has identified Nychka, Buchberger, Wigley, Santer, Taylor and Jones, (2000) as NCAR technical paper “Confidence intervals for trend estimates with autocorrelated observations”, and Jean S (#10) has provided a link at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.7829&rep1&type=pdf

This is in fact a very interesting paper, that relates to my own recent research interest on how to compute confidence intervals for OLS parameter estimates when errors are serially correlated. See my “Median Unbiased Estimates of Higher Order Autoregressive Processes…” (2008), at http://www.econ.ohio-state.edu/jhm/papers/MUARM1S.pdf, which extends Don Andrews’ “Exactly Median-Unbiased Estimates of First Order Autoregressive/Unit Root Processes,” Econometrica 1993.

Nychka, … Santer et al (2000) show that both MLE confidence intervals (using MLE estimates of the AR(1) and regression coefficients and then the LR to form CIs as implemented by Steve) and “classical” 95% confidence intervals (adjusting OLS CI’s according to “adjusted sample size” ne = n(1-rhohat)/(1+rhohat), where rhohat is the estimated AR(1) coefficient using the OLS residuals) for a time trend slope coefficient have coverage that falls far far short of 95%, especially when the true rho approaches 1. They also show that either is a lot better than “naive” CI’s that take no account of serial correlation, and that “classical” uniformly outperforms MLE.
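For readers who want to see the "classical" adjustment in concrete form, here is a minimal Python sketch. It fits an OLS trend, estimates rhohat as the lag-1 autocorrelation of the OLS residuals, and rescales the naive trend standard error by replacing n - 2 with ne - 2 in the residual variance. That last step is one common way of applying the effective sample size and is an assumption here, as are all function names; it is not claimed to be the exact procedure of any of the papers discussed.

```python
import math


def ols_trend(y):
    """Return (slope, naive standard error, residuals) for y regressed on t = 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y[t] - y_mean) for t in range(n)) / sxx
    resid = [y[t] - y_mean - slope * (t - t_mean) for t in range(n)]
    s2 = sum(e * e for e in resid) / (n - 2)
    return slope, math.sqrt(s2 / sxx), resid


def lag1_autocorr(resid):
    """Lag-1 autocorrelation of residuals (OLS residuals already have mean zero)."""
    num = sum(resid[i] * resid[i - 1] for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def classical_ne(n, rhohat):
    """Classical effective sample size ne = n*(1-rhohat)/(1+rhohat)."""
    return n * (1 - rhohat) / (1 + rhohat)


def adjusted_trend_se(y):
    """Return (slope, naive SE, adjusted SE, rhohat, ne) using the classical ne."""
    slope, se_naive, resid = ols_trend(y)
    n = len(y)
    rhohat = lag1_autocorr(resid)
    ne = classical_ne(n, rhohat)
    if ne <= 2:
        raise ValueError("effective sample size too small for an adjusted SE")
    # Assumption: swap n - 2 for ne - 2 degrees of freedom in the residual variance.
    se_adj = se_naive * math.sqrt((n - 2) / (ne - 2))
    return slope, se_naive, se_adj, rhohat, ne
```

With positively autocorrelated residuals, ne is smaller than n and the adjusted standard error is larger than the naive one.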

They propose an “adjusted” effective sample size adjustment, ne = n*(1-rhohat-.68/sqrt(n))/(1+rhohat+.68/sqrt(n)), that they show has better coverage than even the “classical” CI. This adjustment may not be perfect, but it is at least a far cry better than the Newey-West “HAC” standard errors that are almost universally used to undercorrect for serial correlation in the econometric literature.
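The two effective sample size formulas quoted in this comment can be compared side by side with a few lines of Python. This is only a sketch of the formulas as written above; the function names and the example values of n and rhohat are illustrative assumptions.

```python
import math


def classical_ne(n, rhohat):
    """Classical effective sample size: n*(1-rhohat)/(1+rhohat)."""
    return n * (1 - rhohat) / (1 + rhohat)


def adjusted_ne(n, rhohat):
    """Adjusted effective sample size as quoted above:
    n*(1-rhohat-.68/sqrt(n))/(1+rhohat+.68/sqrt(n)).

    Note: the result goes negative once rhohat + .68/sqrt(n) exceeds 1,
    so it cannot be used as-is for rhohat very close to 1.
    """
    shift = 0.68 / math.sqrt(n)
    return n * (1 - rhohat - shift) / (1 + rhohat + shift)


if __name__ == "__main__":
    n = 252  # e.g. 21 years of monthly data (illustrative)
    for rhohat in (0.3, 0.6, 0.9):
        print(rhohat, round(classical_ne(n, rhohat), 1), round(adjusted_ne(n, rhohat), 1))
```

For any rhohat between 0 and 1 the adjusted formula gives a smaller effective sample size than the classical one, and therefore wider confidence intervals.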

Although the 2000 paper should not be the last word on the subject, it still contains a lot of useful information that I think still deserves to be published somewhere.

In fact, the Nychka et al graphs show that the coverage of even their “adjusted” 95% CIs falls well short of 95%, indicating that they have not yet found the right functional form. There probably is no simple true functional form, so that the correct “effective sample size” ne(n, rhohat) should probably just be given as a table lookup of values that happen to give the right coverage, at say 95%. This can be found numerically by simulation. Although it poses some interesting numerical problems of finding, for each n, a continuous function ne(n, rhohat) of the continuous variable rhohat, I don’t think these problems are insurmountable.
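A coverage simulation of the kind described here might look like the following Python sketch. It generates a linear trend plus stationary AR(1) noise, builds a nominal 95% interval for the slope from a user-supplied ne(n, rhohat) function, and counts how often the true slope is covered. Several choices are simplifying assumptions made for this sketch and are not from the papers: a normal critical value is used in place of Student t (which by itself lowers coverage slightly when ne is small), an interval with ne <= 2 is treated as infinitely wide, and all names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def classical_ne(n, rhohat):
    """Classical effective sample size n*(1-rhohat)/(1+rhohat)."""
    return n * (1 - rhohat) / (1 + rhohat)


def simulate_trend_plus_ar1(n, rho, slope, rng):
    """Linear trend plus stationary AR(1) noise with unit innovation variance (|rho| < 1)."""
    e = rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)  # stationary start
    y = []
    for t in range(n):
        y.append(slope * t + e)
        e = rho * e + rng.gauss(0.0, 1.0)
    return y


def ci_covers(y, true_slope, ne_func, z):
    """True if the adjusted interval slope +/- z*SE contains true_slope."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y[t] - y_mean) for t in range(n)) / sxx
    resid = [y[t] - y_mean - slope * (t - t_mean) for t in range(n)]
    sse = sum(r * r for r in resid)
    rhohat = sum(resid[i] * resid[i - 1] for i in range(1, n)) / sse
    ne = ne_func(n, rhohat)
    if ne <= 2:
        return True  # assumption: treat as an infinitely wide interval
    se = math.sqrt(sse / (ne - 2) / sxx)
    return abs(slope - true_slope) <= z * se


def coverage(n, rho, ne_func, reps=1000, seed=0):
    """Fraction of simulated series whose nominal 95% interval covers the true slope."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    hits = 0
    for _ in range(reps):
        y = simulate_trend_plus_ar1(n, rho, 0.0, rng)
        hits += ci_covers(y, 0.0, ne_func, z)
    return hits / reps


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(rho, coverage(100, rho, classical_ne, reps=500))
```

Repeating this over a grid of n and rho, and searching for the ne value that brings coverage to 95% in each cell, would be one way to build the lookup table suggested above.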

I don’t understand all the data issues in the Santer … Nychka et al 2008 IJC paper, but for what it’s worth, they say they correct for serial correlation with the “classical” ne = n*(1-rhohat)/(1+rhohat) formula (their equation (6)), rather than with the improved Nychka, … Santer et al 2000 “adjusted” formula. By the admission of no less than 5 of their 17 coauthors, they are therefore substantially undercorrecting for serial correlation. This means that their CI’s are too small, and that they are overrejecting the hypothesis of equality of trend slopes (as well as of zero trend slopes!).

But what the heck, it’s only …

Andrews’ 1993 approach is to median-unbias the AR(1) coefficient by simulation to obtain rhohatMU, and then to plug this into the classical matrix covariance formula, which is equivalent to using ne = n(1-rhohatMU)/(1+rhohatMU) in large samples. Since the adjustment is nonlinear in rho, mean-unbiasing rhohat will not median unbias the adjustment. Nevertheless, median-unbiasing rhohat will yield median-unbiased monotonic functions of rhohat.

This isn’t quite as precise as correcting the coverage directly, as Nychka, … Santer et al do. However, precisely correcting say 95% CIs will in general not quite correct say the 99% or 90% CIs. A perfect adjustment would simply replace the traditional Student t tables with alternative critical values that are multiples of the OLS SE’s that depend on both sample size and critical value. But correcting the 95% CI, as Nychka, … Santer et al attempt to do, is at least a big step in the right direction.

In my paper, I extend Andrews 1993 to correct for higher order AR processes in the residuals of a general OLS regression. My approach is admittedly probably too cumbersome to ever catch on, but I think it deserves some attention. I’ve just sent it to the JEDC.

The “classical” (1-rhohat)/(1+rhohat) adjustment is, as I believe I recall from discussion on CA last year, due ultimately to Quenouille’s 1952 book. Quaint, but still a lot better than nothing, or Newey-West HAC!

The Lee and Lund Biometrika 2004 paper cited by K Hamed (#11) refers to Nychka et al (2000) extensively, but unfortunately fails to list it in the references, apparently because it was unpublished. I disapprove of this practice — important papers deserve to be cited, even if they are just working papers! (That is to say, many of my own indexed economics citations are to unpublished papers…)

The Matalas and Sankarasubramanian 2003 Water Resources Research paper cited by Lucia on her webpage contains further good information. However, I find the Nychka et al 2000 paper to be more useful.

By: SD or SE: What the heck are ‘beaker’ and the other talking about? | The Blackboard

Fri, 24 Oct 2008 18:50:23 +0000

[…] Steve McIntyre resolved the first question: Santer used “SE”. This is reported in Replicating Santer Tables 1 and 3. The tables can only be replicated using SE, not SD. This doesn’t help readers who still […]

By: Dan Hughes

Dan Hughes — Fri, 24 Oct 2008 18:17:34 +0000

In reply to Steve McIntyre. Re: Steve McIntyre (#61), ok Steve, I put it over here. Snip the one above at will. Thanks again for all your hard work. And others, too.

By: Steve McIntyre

Steve McIntyre — Fri, 24 Oct 2008 17:58:56 +0000

#59. Dan, let’s not get into broader issues of climate models on these threads – let’s stay to the narrow issues of what Santer is doing, as that’s hard enough for now.

#60. ENSO has a definite ARMA(1,1) signature, as do many series. Interestingly, the MSU data does not have a significant MA1 term, but its AR1 term is virtually identical to the AR1 term in the ENSO index modeled as ARMA(1,1). What does this mean? Dunno. I think that AR1 is fine for this analysis, but that’s a bit one-off as IMO it’s assumed far too quickly in most cases.

By: Mike B

Mike B — Fri, 24 Oct 2008 17:49:45 +0000

In reply to Steve McIntyre. Re: Steve McIntyre (#56), Steve - a couple of thoughts come to mind. First of all, are your simulations based on trend + AR(1) noise? Or are you simulating an AR(1) process with trend + white noise? Second, I realize the interest of all parties in focusing on trends; it is after all the most "politically" relevant statistic that can be easily related to a mass audience. But wouldn't it be better to focus on the best model for your simulation rather than one that is expedient, at least to see what the results are? You've mentioned before an ARMA(1,1) process, and you could easily adapt your simulation approach to build confidence regions around the parameter estimates for that type of model as well. Not that I'm trying to assign you work or anything... just a thought. :)
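The distinction between the two simulation setups mentioned in this comment can be made concrete with a short Python sketch. The function names and parameters are hypothetical and are not taken from Steve's scripts. In the first model the AR(1) term applies only to the noise around a deterministic trend. In the second the AR(1) term applies to the series itself, so the same slope parameter produces a long-run trend of slope/(1 - rho).

```python
import random


def trend_plus_ar1_noise(n, intercept, slope, rho, sigma, rng):
    """y_t = intercept + slope*t + u_t, where u_t = rho*u_{t-1} + eps_t."""
    u = 0.0
    y = []
    for t in range(n):
        u = rho * u + sigma * rng.gauss(0.0, 1.0)
        y.append(intercept + slope * t + u)
    return y


def ar1_with_trend_and_white_noise(n, intercept, slope, rho, sigma, rng):
    """y_t = rho*y_{t-1} + intercept + slope*t + eps_t (series started at zero)."""
    y_prev = 0.0
    y = []
    for t in range(n):
        y_prev = rho * y_prev + intercept + slope * t + sigma * rng.gauss(0.0, 1.0)
        y.append(y_prev)
    return y


if __name__ == "__main__":
    rng = random.Random(0)
    a = trend_plus_ar1_noise(240, 0.0, 0.01, 0.8, 0.1, rng)
    b = ar1_with_trend_and_white_noise(240, 0.0, 0.01, 0.8, 0.1, rng)
    # Average change per step over the second half of each series.
    print((a[-1] - a[120]) / 119, (b[-1] - b[120]) / 119)
```

The two models have different long-run slopes for the same parameters, so it matters which one a simulation is based on when the results are read as trend uncertainty.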

By: Dan Hughes

Dan Hughes — Fri, 24 Oct 2008 17:28:34 +0000

Let’s see if I understand the recent discussions here and elsewhere.

First some scientists devise a system of continuous equations that are some kind of approximate description of the transient response of the Climate of Earth. For the most part these equations are buried away in an opaque Black Box.

Then these same scientists devise some numerical methods to solve the equation systems they devised above. For the most part these methods are wrapped around the Black Box above and then together put into another Black Box.

Next the scientists produce computer codes that ‘solve’ the numerical methods devised in the previous step. And in a similar fashion, the previous Black Boxes get stuffed into yet another Black Box. It’s getting really hard to know exactly what’s in these boxes. We do know, however, that the ‘solution’ methods have yet to provide numbers that are independent of the discrete approximations to the continuous equations.

Then we gather up a bunch of these Black Boxes having essentially unknown contents and pair them with a bunch of application methods and users. And by the way, the application methods and users are just as opaque as the Black Boxes that they put to furiously calculating.

The above system comes up with a bunch of calculated numbers by various unknown combinations of the Black Boxes, application methods, and users. It is not reported how or why the particular bunch of numbers is chosen to be representative of the transient response of the Climate of Earth. Additionally we aren’t completely sure if the numbers represent predictions, or projections, or what-ifs, or estimates of the future response of Earth. But hey, we got a bunch of numbers so let’s do something with them.

We observe that both the bunch of calculated numbers and the ‘data’ have wiggles imposed onto a general trend. Based on no demonstrated information at all, we assume that the wiggles in the calculated numbers arise solely from the models of physical phenomena and processes contained in the Black Boxes. And we assume that the calculated wiggles are realistic representations of the actual Climate of Earth.

Plus, while we are looking at temporal and spatial scales that are Climate, we call the calculated wiggles weather noise or turbulence, forgetting that what we’re looking at is already one humongous average; it’s the scale of the planet for an entire year, for gosh sakes. And this even as we know that the Black Boxes do not attempt to resolve weather. And even more, we know that the Black Boxes have yet to accurately/correctly resolve any physical phenomena and processes. And finally, even more, we know that the more important physical phenomena and processes relative to Climate are not described from first principles.

It has been well known for close to a hundred years that any of the first four steps listed above is capable of producing wiggles in calculated numbers that are solely due to problems in those steps. These wiggles have nothing whatsoever to do with any physical phenomena and processes. No investigations are conducted to determine the actual source(s) of the wiggles in the calculated numbers.

Right off the bat we can plainly see that the Black Box calculated numbers don’t look anything like the data. Let me repeat that. We can plainly see that the Black Box calculated numbers don’t look anything like the data. The Black Boxes cannot yet reproduce the transient response of even planet-wide average quantities.

Now we are having discussions about how the bunch of calculated numbers “compares” with the “measured” response of Earth. It turns out that we aren’t sure how to make the comparisons. And some methods allow for the possibility that the more that the bunch of calculated numbers deviates from ‘data’ the more consistent the bunch of numbers becomes.

Plus we also know that the measured data have in fact been processed through a system similar to that described above. The reported data are in fact results of processing the actual measurements by other opaque Black Box models, solution methods, application methods, and users.

In the process of our discussions we encounter phrases like apples-to-apples.

At this point I wouldn’t stick my hand into the bunch of calculated numbers thinking I would grab an apple.

This is not my Engineering. And I would like to think that it’s not anyone’s Science.

I contend that the very critical and absolutely necessary first step in any comparison exercise is to determine precisely what’s in each and every Black Box before we can even begin to discuss the apples-to-apples issue.