Using Santer’s Method

Using Santer’s own methodology with up-to-date observations, here are results comparing observations to the ensemble mean of Chad’s collation of 57 A1B models to 2009. In each case, the d1* calculated Santer-style has moved into very extreme percentiles.

The results from Ross’ more advanced methodology are not in any sense “inconsistent” with the application of Santer’s own methods to up-to-date data.

| Tropo | Sat | Obs Trend | Ensemble | Santer d1* (1999) | d1* (2009) | Percentile |
|---|---|---|---|---|---|---|
| Lapse_T2LT | rss | -0.033 | -0.079 | -0.67 | -2.819 | 0.003 |
| Lapse_T2LT | uah | 0.048 | -0.079 | -3.5 | -7.395 | 0 |
| Lapse_T2 | rss | 0.005 | -0.069 | NA | -4.212 | 0 |
| Lapse_T2 | uah | 0.084 | -0.069 | NA | -8.518 | 0 |
| T2LT | rss | 0.159 | 0.272 | 0.37 | 1.69 | 0.948 |
| T2LT | uah | 0.075 | 0.272 | 1.11 | 2.862 | 0.996 |
| T2 | rss | 0.121 | 0.262 | 0.44 | 2.196 | 0.981 |
| T2 | uah | 0.04 | 0.262 | 1.19 | 3.449 | 0.999 |
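For readers who want to see the mechanics, here is a minimal Python sketch of a Santer-style d1* calculation. It is not the script used for the results above. The function names `ols_trend_ar1` and `d1_star` are hypothetical, and two details are assumptions on my part: that the trend standard error is adjusted with an effective sample size n_eff = n(1 - r1)/(1 + r1), where r1 is the lag-1 autocorrelation of the regression residuals, and that the inter-model variance enters the denominator divided by the number of models.

```python
import math

def ols_trend_ar1(y):
    """Return (slope, adjusted_se, r1, n_eff) for an evenly spaced series.

    Assumption: the standard error is inflated for AR(1) autocorrelation
    by replacing n with n_eff = n * (1 - r1) / (1 + r1).
    """
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    # OLS residuals (they sum to zero by construction)
    resid = [(v - y_mean) - slope * (t - t_mean) for t, v in enumerate(y)]
    sse = sum(e * e for e in resid)
    # lag-1 autocorrelation of the residuals
    r1 = sum(a * b for a, b in zip(resid, resid[1:])) / sse
    n_eff = n * (1 - r1) / (1 + r1)
    se = math.sqrt(sse / (n_eff - 2) / sxx)
    return slope, se, r1, n_eff

def d1_star(obs_trend, obs_se, model_trends):
    """(ensemble mean - observed trend) over the combined standard error.

    Assumption: the inter-model variance is divided by the number of models.
    """
    n_m = len(model_trends)
    ens_mean = sum(model_trends) / n_m
    var_m = sum((b - ens_mean) ** 2 for b in model_trends) / (n_m - 1)
    return (ens_mean - obs_trend) / math.sqrt(var_m / n_m + obs_se ** 2)

# Illustrative call with made-up numbers (not taken from the table):
example = d1_star(obs_trend=0.1, obs_se=0.03, model_trends=[0.2, 0.3, 0.4])
```

With this sign convention a positive d1* means the ensemble mean trend exceeds the observed trend, which matches the signs in the table.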


11 Comments

  1. Posted Aug 13, 2010 at 12:39 PM | Permalink

    I’m guessing the final sentence should say “up-to-date data”?

    Steve: fixed

  2. Steve McIntyre
    Posted Aug 13, 2010 at 12:44 PM | Permalink

    Script for this – which collects quite a bit of other information – is at
    http://www.climateaudit.info/scripts/models/santer/script_comment_short.txt

    • pete
      Posted Aug 13, 2010 at 9:39 PM | Permalink

      There’s a bug in make.table4 — it’s trying to read from your d:\ drive instead of downloading from the website.

  3. nono
    Posted Aug 13, 2010 at 2:00 PM | Permalink

    Steve, I’m looking at T2LT, rss.

    The quantity in the numerator of d1*, (ensemble – obs trend), does not seem to be very different from that of Santer08.

    I then assume that the difference in d1* comes from the denominator. So which term(s) of the denominator changes a lot between Santer08 and your estimate? Is it the (inter-model) variance of mean trends, or the variance of the observed trend?

    Steve: the change results simply from more observations, which yields more degrees of freedom and thus narrower CIs in the trend estimation allowing for AR1 autocorrelation.

    • nono
      Posted Aug 14, 2010 at 2:53 PM | Permalink

      okay, then the larger d1* comes almost exclusively from a smaller variance of observed trend, s(b0)^2 in the formula.

      s(b0)^2 is inversely proportional to the number of observations. Therefore, intuitively, adding +10 years of data (that is, +50% of data) should reduce s(b0)^2 to 2/3 of its previous (Santer08) value.

      The denominator of d1* being the square root of the inter-model variance + s(b0)^2, such a reduction of s(b0)^2 should yield a denominator reduced to sqrt(2/3)=0.8 times its Santer08 value. And that’s a lower bound, obtained by neglecting the inter-model variance.

      So this gives an upper bound for d1* = 1.25 times its Santer08 value. You find a x4 increase (1.69 vs 0.37).

      What have I done wrong? Is s(b0) reduced by more than that? Is the inter-model variance reduced as well?
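One way to check the scaling assumed in this comment is to look at how the variance of an OLS slope depends on series length. The sketch below holds the residual variance and lag-1 autocorrelation fixed (an assumption, not something established in the thread) and uses hypothetical monthly counts of 240 and 360 for 20 and 30 years. Under those assumptions the slope variance is proportional to 1/Σ(t - t̄)², and since Σ(t - t̄)² = n(n² - 1)/12, it falls roughly as n⁻³ rather than n⁻¹.

```python
def slope_var_factor(n):
    """1 / sum((t - mean(t))^2) for t = 0..n-1, i.e. 12 / (n * (n^2 - 1))."""
    t_mean = (n - 1) / 2.0
    return 1.0 / sum((t - t_mean) ** 2 for t in range(n))

# Hypothetical monthly sample sizes: 20 years vs 30 years of data
n_old, n_new = 240, 360

# Ratio of slope variances, holding residual variance and r1 fixed
ratio_var = slope_var_factor(n_new) / slope_var_factor(n_old)
# Corresponding ratio of slope standard errors
ratio_se = ratio_var ** 0.5
```

Here `ratio_var` comes out close to (2/3)³, about 0.30, so s(b0) itself shrinks to roughly 0.54 of its earlier value rather than the 0.8 suggested above. That alone does not account for a fourfold change in d1*; any change in the numerator, the residual variance, or the autocorrelation would have to be checked against the actual data.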

  4. Kenneth Fritsch
    Posted Aug 13, 2010 at 3:02 PM | Permalink

    I assume that Lapse_T2LT and Lapse_T2 refer to the difference series between the troposphere and surface temperature anomalies.

  5. RomanM
    Posted Aug 13, 2010 at 7:34 PM | Permalink

    Santerizing the models is not now nor ever will be a legitimate statistical procedure for the simple reason that statisticians do not consider the mere failure to reject the null hypothesis as evidence to support that hypothesis.

    The problem of trying to show that the null hypothesis could be true also exists in other scientific areas. In particular, in bioavailability tests, a pharmaceutical company manufacturing a generic version of a drug tries to demonstrate that their product will be absorbed by the body in a manner equivalent to the original preparation. The testing required from the drug manufacturer must show that the null hypothesis of no difference in the mean absorption is not rejected.

    However, in order to guarantee that the result is not due to high variability in the sample or to insufficient information due to an inadequate sample size, they must also show that if the difference between the two formulations was greater than a specified amount, the procedure would reject the null hypothesis at a predetermined significance level (in technical terms, the power of the test would be sufficient to distinguish differences of the given magnitude). This portion is completely lacking in the procedure used by Santer rendering the test useless.

    Dr. Pielke demonstrated in his presentation how the latter procedure works: More garbage in … Santerized models out.
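The power requirement described in this comment can be illustrated with a small Python sketch. It uses a two-sided z-test as a stand-in (an assumption for illustration; it is neither the bioequivalence procedure nor Santer's test), and the function name `z_test_power` is hypothetical.

```python
from statistics import NormalDist

def z_test_power(delta, se, alpha=0.05):
    """Probability that a two-sided z-test rejects H0: difference = 0
    when the true difference is `delta` and the estimate has
    standard error `se`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / se
    # Rejection happens in either tail of the shifted distribution
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Power for a true difference of 1 standard error vs 2.8 standard errors
low_power = z_test_power(delta=1.0, se=1.0)
high_power = z_test_power(delta=2.8, se=1.0)
```

When the true difference is small relative to the standard error, the test rarely rejects, so a "fail to reject" result carries little information. That is the gap a power calculation is meant to expose.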

    • Posted Aug 14, 2010 at 11:55 AM | Permalink

      Re: RomanM (Aug 13 19:34),
      Yep. We never see information on the power of the test when we get “fail to reject”. The power (or type II error) was discussed in my sophomore year statistics class, so it’s odd not to see it. There also never seems to be any suggestion that if we are testing a hypothesis H0 and specify our assumptions about the process, we should, when possible, pick a method with greater power. (So for example, if a time series really IS AR(1), and we have a choice between using monthly data vs. annual average data, we should generally prefer the method that gives us more power. Admittedly, if the higher-power method requires a super-computer to implement and the lower-power one can be done on a spreadsheet, one might go for the lower-power method for that reason. But all things being equal, the higher-power method is preferred.)

  6. Lewis
    Posted Aug 14, 2010 at 11:50 AM | Permalink

    Santerizing the models is not now nor ever will be a legitimate statistical procedure for the simple reason that statisticians do not consider the mere failure to reject the null hypothesis as evidence to support that hypothesis.

    Like you said!

  7. Lewis
    Posted Aug 14, 2010 at 11:56 AM | Permalink

    Also, there’s a known rule of common sense – keep pharmacists away from statistics! How many press releases of ‘statistical significance’ are reported as red letter news! And yet, with climate science, we give them a pass!

  8. PaulM
    Posted Aug 14, 2010 at 12:07 PM | Permalink

    Roman, you are muddying the waters a bit. Steve’s point is that the null is rejected, with all the data!
