Comments on: Using Santer’s Method

By: nono

nono — Sat, 14 Aug 2010 19:53:44 +0000

In reply to nono.

okay, then the larger d1* comes almost exclusively from a smaller variance of observed trend, s(b0)^2 in the formula.

s(b0)^2 is inversely proportional to the number of observations. Therefore, intuitively, adding +10 years of data (that is, +50% of data) should reduce s(b0)^2 to 2/3 of its previous (Santer08) value.

The denominator of d1* being the square root of the inter-model variance + s(b0)^2, such a reduction of s(b0)^2 should yield a denominator reduced to sqrt(2/3) ≈ 0.82 times its Santer08 value. And that’s a lower bound, obtained by neglecting the inter-model variance.

So this gives an upper bound for d1* of about 1.22 times its Santer08 value. You find an increase of more than x4 (1.69 vs 0.37).

What have I done wrong? Is s(b0) reduced by more than that? Is the inter-model variance reduced as well?
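The bound argued in this comment can be checked with a few lines of Python. This is only a sketch of the commenter's reasoning, not the Santer08 calculation itself: the function name `denominator_ratio` and the `intermodel_share` parameter are hypothetical, and the inter-model variance is assumed to stay unchanged when data are added.

```python
import math

def denominator_ratio(obs_var_ratio, intermodel_share=0.0):
    """New/old ratio of sqrt(inter-model variance + s(b0)^2).

    obs_var_ratio: new s(b0)^2 divided by old s(b0)^2.
    intermodel_share: hypothetical fraction of the old total variance that is
    inter-model variance, assumed unchanged by the extra data.
    """
    # Old total variance is normalised to 1, so the ratio is just sqrt(new total).
    new_total = intermodel_share + (1.0 - intermodel_share) * obs_var_ratio
    return math.sqrt(new_total)

# s(b0)^2 shrinks to 2/3 and the inter-model variance is neglected.
floor = denominator_ratio(2.0 / 3.0)
print(round(floor, 3))        # 0.816  (smallest possible denominator ratio)
print(round(1.0 / floor, 3))  # 1.225  (largest possible growth in d1*)
print(round(1.69 / 0.37, 2))  # 4.57   (growth actually reported)
```

Any non-zero `intermodel_share` pushes the ratio closer to 1, so under the 1/n assumption the bound cannot explain the reported increase. The assumption that s(b0)^2 scales as 1/n is the part to question.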

By: PaulM

PaulM — Sat, 14 Aug 2010 17:07:03 +0000

Roman, you are muddying the waters a bit. Steve's point is that the null is rejected, with all the data!

By: Lewis

Lewis — Sat, 14 Aug 2010 16:56:44 +0000

Also, there’s a known rule of common sense – keep pharmacists away from statistics! How many press releases of ‘statistical significance’ are reported as red letter news! And yet, with climate science, we give them a pass!

By: lucia

lucia — Sat, 14 Aug 2010 16:55:36 +0000

In reply to RomanM. Re: RomanM (Aug 13 19:34), Yep. We never see information on power of the test when we get "fail to reject". The power (or type II error) was discussed in my sophomore year statistics class, so it's odd not to see it. There also never seems to be any suggestion that if we are testing a hypothesis H0 and specify our assumptions about the process, we should, when possible, pick a method with greater power. (So for example, if a time series really IS AR(1), and we have a choice between using monthly data vs. annual average data, we should generally prefer whichever gives us more power. Admittedly, if the higher-power method requires a super-computer to implement and the lower-power one can be done on a spreadsheet, one might go for the lower-power method for that reason. But all things being equal, the higher-power method is preferred.)

By: Lewis

Lewis — Sat, 14 Aug 2010 16:50:52 +0000

Santerizing the models is not now nor ever will be a legitimate statistical procedure for the simple reason that statisticians do not consider the mere failure to reject the null hypothesis as evidence to support that hypothesis.

Like you said!

By: pete

pete — Sat, 14 Aug 2010 02:39:11 +0000

In reply to Steve McIntyre. There's a bug in make.table4 --- it's trying to read from your d:\ drive instead of downloading from the website.

By: RomanM

RomanM — Sat, 14 Aug 2010 00:34:35 +0000

The problem of trying to show that the null hypothesis could be true also exists in other scientific areas. In particular, in bioavailability tests, a pharmaceutical company manufacturing a generic version of a drug tries to demonstrate that their product will be absorbed by the body in a manner equivalent to the original preparation. The testing required from the drug manufacturer must show that the null hypothesis of no difference in the mean absorption is not rejected.

However, in order to guarantee that the result is not due to high variability in the sample or to insufficient information due to an inadequate sample size, they must also show that if the difference between the two formulations were greater than a specified amount, the procedure would reject the null hypothesis at a predetermined significance level (in technical terms, the power of the test would be sufficient to distinguish differences of the given magnitude). This portion is completely lacking in the procedure used by Santer, rendering the test useless.

Dr. Pielke demonstrated in his presentation how the latter procedure works: More garbage in … Santerized models out.
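The power requirement described in this comment can be illustrated with a small calculation. The sketch below assumes a two-sided z-test on a normally distributed estimate with known standard error, which is a simplification and not the bioequivalence procedure regulators actually prescribe. The function name `z_test_power` and its parameters are hypothetical.

```python
from statistics import NormalDist

def z_test_power(delta, se, alpha=0.05):
    """Probability that a two-sided z-test rejects H0 (no difference)
    when the true difference is delta and the estimate has standard error se."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = delta / se
    # Rejection happens in either tail of the shifted distribution.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

# A noisy estimate (large se) rarely detects a true difference of 1.0,
# so "fail to reject" says little; a precise estimate almost always detects it.
for se in (2.0, 1.0, 0.5, 0.25):
    print(se, round(z_test_power(1.0, se), 3))
```

Under these assumptions, a "fail to reject" result is informative only when the power against the difference of interest is high.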

By: Kenneth Fritsch

Kenneth Fritsch — Fri, 13 Aug 2010 20:02:18 +0000

I assume that Lapse_T2LT and Lapse_T2 refer to the difference series between the troposphere and surface temperature anomalies.

By: nono

nono — Fri, 13 Aug 2010 19:00:16 +0000

Steve, I’m looking at T2LT, RSS.

The quantity in the numerator of d1*, (ensemble – obs trend), does not seem to be very different from that of Santer08.

I then assume that the difference in d1* comes from the denominator. So which term(s) of the denominator changes a lot between Santer08 and your estimate? Is it the (inter-model) variance of mean trends, or the variance of the observed trend?

Steve: the change results simply from more observations, which yields more degrees of freedom and thus narrower CIs in the trend estimation allowing for AR1 autocorrelation.
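Steve's inline reply can be sketched in code. The function below fits an OLS trend and widens its standard error with an effective sample size n_eff = n(1 - r1)/(1 + r1), which is how the Santer08 AR1 adjustment is usually described; treat that form as an assumption here. It is not taken from the script referenced in this thread, and the names `ar1_adjusted_trend_se` and `simulate_ar1` are hypothetical. Note that the sum of squared time deviations equals n(n^2 - 1)/12, so the trend variance falls much faster than 1/n as the record lengthens.

```python
import math
import random

def ar1_adjusted_trend_se(y):
    """OLS trend of y against its time index, with the standard error
    adjusted for lag-1 autocorrelation via an effective sample size."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))  # equals n(n^2 - 1)/12
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    resid = [v - y_mean - slope * (t - t_mean) for t, v in enumerate(y)]
    ss_resid = sum(e * e for e in resid)
    if ss_resid == 0.0:
        raise ValueError("series is an exact line; no residual variance")
    # Lag-1 autocorrelation of the regression residuals.
    r1 = sum(a * b for a, b in zip(resid, resid[1:])) / ss_resid
    n_eff = n * (1.0 - r1) / (1.0 + r1)
    if n_eff <= 2.0:
        raise ValueError("effective sample size too small")
    se = math.sqrt(ss_resid / (n_eff - 2.0) / sxx)
    return slope, se, n_eff

def simulate_ar1(n, phi, trend, rng):
    """Synthetic monthly series: linear trend plus AR(1) noise (illustrative values)."""
    e, out = 0.0, []
    for t in range(n):
        e = phi * e + rng.gauss(0.0, 0.1)
        out.append(trend * t + e)
    return out

rng = random.Random(0)
series = simulate_ar1(372, 0.6, 0.0015, rng)
# Compare a 21-year record (252 months) with a 31-year record (372 months).
for months in (252, 372):
    slope, se, n_eff = ar1_adjusted_trend_se(series[:months])
    print(months, round(slope, 5), round(se, 5), round(n_eff, 1))
```

The synthetic series is made up for illustration and does not reproduce any of the d1* values quoted in this thread.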

By: Steve McIntyre

Steve McIntyre — Fri, 13 Aug 2010 17:44:32 +0000

Script for this – which collects quite a bit of other information – is at
http://www.climateaudit.info/scripts/models/santer/script_comment_short.txt