Rahmstorf’s Second Trick

The Rahmstorf et al reconstruction commences in AD900 even though the Mann et al 2009 reconstruction goes back to AD500. Once again, this raises the obvious question: why didn’t Rahmstorf show values before AD900? Are these results adverse to his claims? Once the question is posed, you can guess the answer.

First, here is a slightly annotated version of Rahmstorf 2015 pseudo-AMOC index, supposedly calculated as the difference between the Mann et al 2008 NH reconstruction and the Mann et al 2009 gyre reconstruction. I did my own calculation of this series and overplotted it onto Rahmstorf 2015 Figure 3b, but it plotted ~0.15 deg C too high. For the overplot below, I therefore subtracted 0.15 deg C from my calculation and it matched more or less exactly up to ~1850, the start of the instrumental period. I’m not sure why the differences arise after 1850. I’ll discuss one theory at the end of the post. As in yesterday’s post, post-1995 values using M09 gridded data go up (strong blue below), whereas Rahmstorf shows a decline.

Figure 1. Annotated version of Rahmstorf et al 2015 Figure 3b showing pseudo-AMOC index. Overplotted is calculated difference between the Mann et al 2008 NH reconstruction and the Mann et al 2009 gyre reconstruction. 1850 and 1995 are marked by vertical lines.
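The overplotted series can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sign convention (gyre minus NH) and the function name `pseudo_amoc` are my assumptions for the sketch, and the input values are made-up placeholders, not the archived reconstructions. The 0.15 deg C offset is the adjustment described above.

```python
def pseudo_amoc(gyre, nh, offset=0.0):
    """Difference series (gyre minus NH, an assumed sign convention),
    shifted down by an optional constant offset in deg C."""
    if len(gyre) != len(nh):
        raise ValueError("series must cover the same years")
    return [g - n - offset for g, n in zip(gyre, nh)]


# Placeholder anomalies in deg C (hypothetical values, for illustration only).
gyre_example = [0.5, 0.2, -0.1]
nh_example = [0.1, 0.3, 0.0]

# Apply the 0.15 deg C shift used for the overplot.
index_example = pseudo_amoc(gyre_example, nh_example, offset=0.15)
print(index_example)
```

Any real replication would of course read the archived NH and gyre series and align them by year before differencing.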

In the next figure, I’ve plotted both the pseudo-AMOC index using Mann et al 2009 gridded data (thick blue), shown for the 500-2006 period included in the archive, and the corresponding pseudo-AMOC index using the Mann et al 2008 NH reconstruction as in the previous figure (shown as thin blue). Either way, there is a dramatic step change in the pseudo-AMOC reconstruction at AD900, with values prior to AD900 being comparable to the supposedly alarming late 20th century values. Perhaps Rahmstorf’s decision not to show values prior to AD900 has an innocent and unrelated explanation, but unfortunately Rahmstorf and coauthors did not provide it.

Figure 2. Annotated variation of Rahmstorf et al 2015 Figure 3b showing pseudo-AMOC index. Overplotted is calculated difference between the Mann et al 2009 NH reconstruction and the Mann et al 2009 gyre reconstruction. Prior to AD1600 (dotted red line), only two climate “fields” are reconstructed. In my plot of the pseudo-AMOC index using Mann et al 2009 gridded data, I removed the 0.15 deg C bodge discussed in the previous figure. In addition, the standard deviation increases dramatically after AD1600: Mann et al 2009 stated that only two climate “fields” (principal components) are used in the gridded reconstruction prior to AD1600 and this presumably has something to do with it.

AD900 in Rahmstorf et al 2015
Rahmstorf et al didn’t show this dramatic change in behavior at AD900. This raises several questions: (1) what accounts for the large step change? (2) why wasn’t it shown or discussed by Rahmstorf et al? (3) Rahmstorf et al were obviously aware of the pre-AD900 behaviour and one presumes that they would have some fine print rationalizing their failure to show the data: what was it?

I’ll first look at the Rahmstorf text that touches on the AD900 start and how they purported to rationalize not showing pre-AD900 results.

First, Rahmstorf et al stated that Mann et al 2008 provided a “skilful” NH EIV reconstruction “back to AD900 and beyond”. This is a little coy, since Mann et al 2008 claimed a skilful EIV NH reconstruction back to AD300 – see its Figure 3 – and obviously doesn’t explain the “late” AD900 start.

For the Northern Hemisphere mean, Mann et al. [12 – 2008] produced reconstructions using two different methods, composite-plus-scale (CPS) and errors in variables (EIV). Here we use the land-and-ocean reconstruction with the EIV method using all the available proxies, which is the reconstruction for which the best validation results were achieved (see Supplementary Methods of Mann et al. [12-2008]). Based on standard validation scores (Reduction of Error and Coefficient of Efficiency), this series provides a skilful reconstruction back to AD 900 and beyond (95% significance compared to a red-noise null).

Next, Rahmstorf said that the gyre reconstruction was “skilful” back to AD900, without saying anything about earlier reconstructions:

The subpolar gyre falls within the region where the individual grid-box reconstructions are assessed to be skilful compared to a red-noise null [13 – Mann et al 2009]. In addition, we performed validation testing of the subpolar-gyre mean series, which indicates a skilful reconstruction back to AD 900 (95% significance compared to a red-noise null; see Supplementary Information for details).

The Supplementary Information then stated that they carried out tests on steps from AD900 on, the networks used in the “composite reconstruction”. Again, it doesn’t say anything about steps prior to that.

To validate the proxy reconstructions of temperature we use standard techniques developed during the past two decades in the paleoclimate community. Validation of the subpolar gyre temperature reconstruction was performed on each proxy network used in the composite reconstruction (900AD, 1400AD, 1500AD, 1600AD, 1700AD, 1800AD). (See Mann et al. 2009 for details on the selection of proxy networks used in the composite reconstruction.)

These are the only references to the AD900 issue that I located in the article. Obviously none of them reports the AD900 step or provides a definitive explanation for not showing pre-AD900 values.

RegEM Step Changes
The AD900 step change arises from a fundamental instability in RegEM methodology that has not been reported by any publicly funded academic in peer reviewed literature, but has been discussed from time to time at CA.

Jean S and UC were the first to notice the pathology, providing the example shown below in March 2009 here, observing:

Some 0.6 C change due to one added proxy. Weight of the curtis-proxy increases quite a lot, and there are many sign changes.

One sees the above example in Mann et al 2008 Figure S6 (shown below): its EIV NH reconstruction using screened proxies (shown in magenta) has a similar step change at AD600. In the diagram below, the magenta reconstruction begins in AD400. This means that the reconstruction prior to the step change passes Mannian verification as well as the reconstruction after the step change: so one cannot assume that only one of the two reconstruction variations is capable of passing Mannian verification.

Analysis of Mannian RegEM methodology at CA pinpointed the dramatic step changes to changes in the sign (and weights) of proxies that sometimes arise from the addition of a single nondescript proxy. The effect is bizarre and undermines one’s willingness to credit RegEM, which may be one of the reasons why the results were not shown by Rahmstorf.

According to weird Mannian rules, sometimes the addition of a single nondescript series can change the number of retained regularization parameters. In a change from one to two or two to three regularization parameters, the weights assigned to individual proxies in the RegEM calculation can change dramatically and, even more importantly, change sign. This is nowhere discussed in peer reviewed literature by publicly funded academics, but is the case nonetheless.
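The mechanism can be illustrated with a toy example. The sketch below is not RegEM-TTLS and not Mann's code: it swaps in plain truncated principal-component regression on two made-up proxies, purely to show how retaining one more component can flip the sign of a proxy's weight. The function name `pc_regression_weights` and all data values are hypothetical.

```python
import math


def pc_regression_weights(p1, p2, y, n_keep):
    """Weights on two centered proxies when y is regressed on the first
    n_keep principal components of (p1, p2).
    Assumes the proxies are correlated (cross-product b != 0)."""
    a = sum(u * u for u in p1)
    c = sum(v * v for v in p2)
    b = sum(u * v for u, v in zip(p1, p2))
    root = math.sqrt(((a - c) / 2) ** 2 + b * b)
    eigvals = [(a + c) / 2 + root, (a + c) / 2 - root]  # largest first
    weights = [0.0, 0.0]
    for lam in eigvals[:n_keep]:
        v = (b, lam - a)  # eigenvector of [[a, b], [b, c]] for eigenvalue lam
        norm = math.hypot(v[0], v[1])
        v = (v[0] / norm, v[1] / norm)
        score = [v[0] * u + v[1] * w for u, w in zip(p1, p2)]
        gain = sum(s * t for s, t in zip(score, y)) / sum(s * s for s in score)
        weights[0] += gain * v[0]
        weights[1] += gain * v[1]
    return weights


# Made-up data: a shared signal s plus a small differential signal d.
s = [3.0, 3.0, -3.0, -3.0]
d = [1.0, -1.0, 1.0, -1.0]
proxy1 = [si + di for si, di in zip(s, d)]
proxy2 = [si - di for si, di in zip(s, d)]
target = [si + 2.0 * di for si, di in zip(s, d)]

one_pc = pc_regression_weights(proxy1, proxy2, target, n_keep=1)
two_pc = pc_regression_weights(proxy1, proxy2, target, n_keep=2)
print(one_pc, two_pc)
```

With one component retained, both proxies get a weight of about 0.5; with two, the weights are about 1.5 and -0.5, so the second proxy changes sign even though no data changed.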

I haven’t parsed the particular AD900 step change in Rahmstorf et al 2015, but the pathology is instantly recognizable for people with mathematical understanding of the method, a group that does not appear to include any of the coauthors of Rahmstorf et al. It is possible that reconstructions using AD800 and earlier networks do not pass Mannian validation, but this is not a given: in the Mann et al 2008 example shown above, both the reconstruction before and after the step change pass Mannian verification criteria or they would not have been shown. So the pre-AD900 gyre reconstruction cannot simply be presumed to fail Mannian verification criteria.

ARMA (1,1) Modeling

Rahmstorf et al describe their statistical test setup as follows:

The annually resolved AMOC reconstruction from 900 to 1850 formed the basis for an ARMA(1,1)model which closely resembles the statistical properties of the data.

Right away, one can see a couple of obvious defects of this procedure. The number of temperature principal components (“climate fields”) used to represent the gridded data changes dramatically over time. Mann et al 2009 stated that only two principal components are used prior to AD1600.

Before 1600 C. E., the low-frequency component of the surface temperature reconstructions is described as a linear combination of just two leading patterns of temporal variation, so that regional features in the temperature field are represented by a spatiotemporally filtered approximation.

Since the subpolar gyre is a relatively fine detail of global climate, there is no possibility of it being distinguished in the two-PC reconstruction prior to AD1600, making comparisons before and after AD1600 rather pointless. The AD1600 breakpoint can be clearly seen in the standard deviation of the gyre reconstruction and the pseudo-AMOC reconstruction, as the post-AD1600 series have much larger standard deviations. In addition, the underlying Mannian proxy data has been so heavily smoothed that it’s hard to say what an annual ARMA(1,1) model really means.
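For readers unfamiliar with the term, an ARMA(1,1) process of the kind named in the quoted passage can be simulated as below. This is a generic sketch, not the fitted model from the paper: the parameter values `phi` and `theta` and the function names are hypothetical, chosen only to show what such a series looks like and how its lag-1 autocorrelation compares with the textbook value.

```python
import random


def simulate_arma11(n, phi, theta, sigma=1.0, seed=0, burn_in=500):
    """Simulate x[t] = phi*x[t-1] + e[t] + theta*e[t-1] with Gaussian e."""
    rng = random.Random(seed)
    x_prev, e_prev = 0.0, 0.0
    out = []
    for t in range(n + burn_in):
        e = rng.gauss(0.0, sigma)
        x = phi * x_prev + e + theta * e_prev
        if t >= burn_in:  # discard start-up transient
            out.append(x)
        x_prev, e_prev = x, e
    return out


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def theoretical_lag1(phi, theta):
    """Textbook lag-1 autocorrelation of a stationary ARMA(1,1) process."""
    return (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2)


# Hypothetical parameters, not values fitted to any reconstruction.
series = simulate_arma11(100000, phi=0.6, theta=0.3)
print(lag1_autocorr(series), theoretical_lag1(0.6, 0.3))
```

A single stationary model of this sort assumes constant variance over the whole fitting period, which is exactly what the AD1600 breakpoint calls into question.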

Given that there’s no way that a reconstruction using contaminated Finnish sediments, stripbark bristlecone chronologies, truncated MXD series and nondescript tree ring chronologies can rise above phrenology and have actual significance, this implies that Mann and Rahmstorf have erroneously calculated the benchmarks for their calculation of statistical significance. While Rahmstorf claimed that their validation methods were “standard”, Ross and I sharply criticized the related MBH98 approach to verification – a criticism that has not been rebutted in the “literature”. The only commentary thus far was the Texas sharpshooting of Wahl and Ammann, which fell far short of being a rebuttal.
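The benchmarking issue can be made concrete with a rough sketch. The RE and CE formulas below are the usual textbook definitions; the Monte Carlo benchmark is a deliberately simplified stand-in (one AR(1) red-noise pseudo-proxy, ordinary least squares calibration), not the procedure of Mann et al, Rahmstorf et al or MM2005. The function names, the AR(1) coefficient and the target series are all hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def re_ce(obs_ver, pred_ver, cal_mean):
    """Reduction of Error (vs calibration mean) and Coefficient of
    Efficiency (vs verification mean) over the verification period."""
    sse = sum((o - p) ** 2 for o, p in zip(obs_ver, pred_ver))
    ss_cal = sum((o - cal_mean) ** 2 for o in obs_ver)
    ver_mean = mean(obs_ver)
    ss_ver = sum((o - ver_mean) ** 2 for o in obs_ver)
    return 1.0 - sse / ss_cal, 1.0 - sse / ss_ver


def ar1_series(n, rho, rng):
    """Red-noise pseudo-proxy: AR(1) with unit-variance innovations."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def fit_line(x, y):
    """Ordinary least squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def re_benchmark(target, n_cal, rho, n_sims=1000, q=0.95, seed=0):
    """Quantile q of RE scores obtained from red-noise pseudo-proxies
    calibrated on the first n_cal values and verified on the rest."""
    rng = random.Random(seed)
    cal, ver = target[:n_cal], target[n_cal:]
    cal_mean = mean(cal)
    scores = []
    for _ in range(n_sims):
        proxy = ar1_series(len(target), rho, rng)
        slope, intercept = fit_line(proxy[:n_cal], cal)
        pred = [intercept + slope * p for p in proxy[n_cal:]]
        scores.append(re_ce(ver, pred, cal_mean)[0])
    scores.sort()
    return scores[int(q * (n_sims - 1))]


# Hypothetical target: a trend plus a small deterministic wiggle.
target_example = [0.01 * i + 0.1 * ((i * 7) % 5 - 2) for i in range(150)]
benchmark_95 = re_benchmark(target_example, n_cal=75, rho=0.5, n_sims=200)
print(benchmark_95)
```

The point of the sketch is that the threshold for "significance" is whatever the simulation produces: if the simulation omits steps of the real procedure (screening, multiple proxies, re-scaling), the benchmark it yields can be too low.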

This entry was written by Stephen McIntyre, posted on Mar 29, 2015 at 5:07 PM, filed under Uncategorized and tagged rahmstorf, rahmstorf-2015, RegEM, weights.

50 Comments

Laws of Nature

Posted Mar 29, 2015 at 6:29 PM | Permalink

Dear Steve,

I can only applaud you for your effort to shoot that nonsense down, one day there will be “climate scientists” speaking out loud about this!
(I am referring to that large group of government employed people turning a blind eye on this so far.. do they have no ethics at all? This is really getting painful to watch!)
Until then unfortunately you have to make your point again and again!
I think there is a small typo in “es a definitive explanation for not showing pre-AD1900 values.” the “1” should go away!

Other than that, yet again a clear and convincing article!
Thanks,
LoN
Tom C

Posted Mar 29, 2015 at 6:59 PM | Permalink

So, is this a correct summary of the strategy these guys have been pursuing for 17 years now:

1) there are few proxies, of the many hundreds that are possibly relevant to reconstructions, that happen, for one reason or another, to have changed a lot in the 20th century

2) these can be used to reconstruct just about any climate phenomenon of interest given the invocation of “teleconnections”

3) by cherry picking PC number and time period under consideration one can get these proxies to dominate the analysis and,

4) presto! It’s “worse than we thought”
- David A
  
  Posted Mar 30, 2015 at 6:24 AM | Permalink
  
  Tom, from this
  “According to weird Mannian rules, sometimes the addition of a single nondescript series can change the number of retained regularization parameters. In a change from one to two or two to three regularization parameters, the weights assigned to individual proxies in the RegEM calculation can change dramatically and, even more importantly, change sign. This is nowhere discussed in peer reviewed literature by publicly funded academics, but is the case nonetheless.”
  
  It appears that you could…
  
  Add proxies where needed.
  
  Weight proxies as needed.
  
  Change how you use proxies, as needed, throughout the reconstruction.
  
  Change the sign of proxies as needed.
  
  Where is it laid out exactly what was done, or is this another multiple year effort to find out exactly what was done?
  - Jeff Norman
    
    Posted Mar 30, 2015 at 7:20 AM | Permalink
    
    … Truncate where necessary.
    - Clark
      
      Posted Apr 1, 2015 at 11:19 AM | Permalink
      
      Don’t forget the temperature record splicing step!
kim

Posted Mar 29, 2015 at 7:04 PM | Permalink

This whole madness of the crowd has been a ‘choose the answer, pose the right question’. Nature very much resents this cheating, insofar as she’s capable of resenting.
==================
- michael hart
  
  Posted Mar 31, 2015 at 5:46 PM | Permalink
  
  Once you question their answer, you can also guess their pose.
  - kim
    
    Posted Mar 31, 2015 at 8:17 PM | Permalink
    
    Ring around a rosie,
    A pocketful of posie;
    Answers! Answers!
    Fall questions down.
    =============
observa

Posted Mar 29, 2015 at 7:37 PM | Permalink

Like Black Swan events I think I’ve stumbled across the ‘Zwentibold Moment’ in the field of climatology, but please please give Mr McIntyre some of the credit with all those Nobel nominations.
Michael Jankowski

Posted Mar 29, 2015 at 8:12 PM | Permalink

These folks just love to “hide” whatever they know calls their analysis into question.
amac78

Posted Mar 29, 2015 at 8:58 PM | Permalink

> Given that there’s no way that a reconstruction using contaminated Finnish sediments, stripbark bristlecone chronologies, truncated MXD series and nondescript tree ring chronologies can rise above phrenology and have actual significance…

Mann08, the zombie paper that just won’t die. You’d think its authors would be trying to ease it into obscurity. Not so, obviously.
- kim
  
  Posted Mar 29, 2015 at 9:03 PM | Permalink
  
  On they trudge, certain, blessed, wounded.
  =========
Willis Eschenbach

Posted Mar 29, 2015 at 9:18 PM | Permalink

Well done, sir. Or perhaps I should say, done to a turn …

w.
Geoff Sherrington

Posted Mar 29, 2015 at 9:18 PM | Permalink

Given that most proxy calibration is done against a historic, instrument record, I hope that this is not OT.
If the historic temperature is wrong, so are essentially all outcomes.
For my home country, Australia, we have done this historic comparison of official data from the late 19th to early 20th centuries.
As Chris Gilham concludes, we can see evidence for Australia warming by 0.3 deg C over the last century or so, but we cannot see the official claim of 0.9 or 1.0 deg C or more.
Chris: “I found that the 1954 Year Book expanded on the 1953 Year Book list of weather stations considered by the bureau back then as representative of Australia’s climate.
Converted from fahrenheit and collated, the 84 stations show average Australian maxima increased from 24.2C in 1911-40 to 24.7C in 2000-2014, with average minima up from 12.3C to 12.4C. That equates to a mean increase of 0.3C from 18.3C to 18.6C. The previous comparison of 44 weather stations listed in the 1953 Year Book, as submitted to the review panel, calculated a mean increase of 0.4C.
For those interested in raw raw historic temps, attached is an Excel with a sheet tabulating all 84 stations, another with capital cities deleted, another tabulating 66 stations that have BoM RAW temps available for 1911-40 Year Book comparison, and another sheet comparing 29 of the weather stations that have since become part of ACORN (1911-40 max adjusted -0.1C and min -0.5C).
- David A
  
  Posted Mar 30, 2015 at 6:30 AM | Permalink
  
  Geoff, your questioning of the surface record is certainly valid. As you say…”If the historic temperature is wrong, so are essentially all outcomes.”
  
  In many ways it may well be that the historic surface record is as manipulated as the proxies.
  - kim
    
    Posted Mar 30, 2015 at 7:21 AM | Permalink
    
    GCMs, proxy reconstructions and temperature series, all hopelessly manipulated. There’s gotta be a way out of this mess.
    ==============
- Ron Clutz
  
  Posted Mar 31, 2015 at 3:47 PM | Permalink
  
  Geoff, I would like data in excel spreadsheets. Please contact me at my blog:
  https://rclutz.wordpress.com/2015/03/17/adjustments-multiply-warming-at-us-crn1-stations/
Gerald Machnee

Posted Mar 29, 2015 at 9:23 PM | Permalink

Looks like the cherries are in bloom.
I am sure that this will be completely resolved at Unrealclimate.
- Jeff Alberts
  
  Posted Mar 30, 2015 at 8:07 PM | Permalink
  
  D’Arrigo pie…
Geoff Sherrington

Posted Mar 29, 2015 at 9:30 PM | Permalink

My web hosting is playing up and the cat sat on the post button before I finished above. The said spread sheet can be emailed to you. I am at sherro1 at optusnet dot com dot au
Have people in other countries done similar comparisons?
I suspect that this work throws much doubt on the validity of a good deal of past work with global warming, before one even gets to the stage of dissection of statistical methods.
Hans Erren

Posted Mar 30, 2015 at 12:38 AM | Permalink

Hide the incline
- sue
  
  Posted Mar 30, 2015 at 6:03 AM | Permalink
  
  If I’m reading this correctly, The volcanoes done it… http://www.nature.com/ncomms/2015/150330/ncomms7545/full/ncomms7545.html
HAS

Posted Mar 30, 2015 at 1:28 AM | Permalink

BTW O/T but congrats on The Weblog Awards http://2015.bloggi.es/

Lifetime Achievement: Climate Audit http://climateaudit.org
- Richard Drake
  
  Posted Mar 30, 2015 at 6:14 AM | Permalink
  
  +1
  - kim
    
    Posted Mar 30, 2015 at 7:18 AM | Permalink
    
    Also note other interesting other rans.
    =======================
    - ThinkingScientist
      
      Posted Mar 30, 2015 at 8:38 AM | Permalink
      
      Indeed! I see other winners include JoNova and Breitbart…
      Much well deserved congrats to CA though.
    - mrsean2k
      
      Posted Mar 30, 2015 at 12:12 PM | Permalink
      
      “It’s a sad commentary on the level of intellect when a blog about a Celebrity Dachshund is the overall winner. There are so many noble pursuits that we can spend our precious ti…, er…, aaaahh, look at his little *face* though !”
- dfhunter
  
  Posted Mar 31, 2015 at 6:18 PM | Permalink
  
  O/T but like HAS & others just wanted to add my congrats to Steve on the well deserved award in recognition of the graft/effort (on his own time/expense) he has put in at CA over the years (so “Lifetime Achievement” fits the bill)
  & congrats also to all the regulars/commentators past & present for helping making this blog interesting & fun to follow (as a lurker mostly 🙂
Salamano

Posted Mar 30, 2015 at 6:03 AM | Permalink

…”It is possible that reconstructions using AD800 and earlier networks do not pass Mannian validation, but this is not a given: in the Mann et al 2008 example shown above, both the reconstruction before and after the step change pass Mannian verification criteria or they would not have been shown.”…

I would assume that this is what their reply would be. ‘It doesn’t pass for the “Gyre” reconstruction, prior to AD900.’ Perhaps they would also say that it’s simply a given if they don’t state otherwise, and that that itself is an appropriate thing to say for a researcher.

…”Given that there’s no way that a reconstruction using contaminated Finnish sediments, stripbark bristlecone chronologies, truncated MXD series and nondescript tree ring chronologies can rise above phrenology and have actual significance.”…

Isn’t this indeed what’s accepted, according to what’s currently available in the climate science literature? (outside of comments that say that various parts of individual reconstructions “should not be used” or “might not be best”, etc.) How can statements like this be made above blog-level to achieve the desired relevance? These researchers have declared a mechanism and declared whatever correlation significance, and published it. Those three things come together to make the point, and something opposing it would need to be published (and stand as such) for the counter-point to be made, yes?

If the literature accepts a mechanism for their correlation to be valid, then wouldn’t, from their stand-point, this whole thing rise above phrenology?
- David A
  
  Posted Mar 30, 2015 at 6:38 AM | Permalink
  
  I would like to hear what the causative mechanism is?
  
  The main aspects of the gyre are salinity, temperature, and speed. It would be good to hear how mixing and matching such unrelated proxies can have a stable and consistent interpretation throughout the study period.
  
  Of course if one is allowed “… weird Mannian rules, sometimes the addition of a single nondescript series can change the number of retained regularization parameters. In a change from one to two or two to three regularization parameters, the weights assigned to individual proxies in the RegEM calculation can change dramatically and, even more importantly, change sign. This is nowhere discussed in peer reviewed literature by publicly funded academics, but is the case nonetheless.” then I think this method can correlate to all past observation; perhaps graduating to a unified theory of everything?
Jit

Posted Mar 30, 2015 at 6:58 AM | Permalink

The step change at AD900, if shown, would call the entire reconstruction into doubt. So it’s no surprise that it should be excluded. The language is very cute: “skilful after AD900” does not axiomatically equal “no good before AD900”. But my worry is that it isn’t scientific. Ignoring the caveats about the reconstruction’s validity, scientists are obligated to draw attention to any adverse data or results, not hide it behind cute language and over claim for the remainder. In other words the inconvenient bits of the curve should be shown, and their importance – or unimportance – to the story argued in the text.
hswiseman

Posted Mar 30, 2015 at 8:30 AM | Permalink

First Comment captures it-This is painful to watch. Tons of $$ being expended on bad science with no legitimate oversight, lapped up by a completely ignorant media and disgorged to the public as unassailable truth. Much worse than we thought, indeed.
Frank

Posted Mar 30, 2015 at 10:35 AM | Permalink

Steve: These reconstructions are allegedly skillful vs red noise pseudo-proxies. How many “degrees of freedom” (number of PC’s being retained? or other choices) are available to someone doing a reconstruction with RegEM?

Suppose, I took two sets of red noise pseudo-proxies and fit one set to the historical period making use of all of the flexibility available to me as a “reconstructor”. Using those same choices, would the first set of pseudo-proxies appear skillful compared with the second set? What if there were modest differences in the redness of the noise?

Suppose I first fit red noise pseudo-proxies to the historical period using all of the choices available to the “reconstructor”. If I used the same RegEM choices with the real proxies, would it be possible to show that the pseudo-proxies were more skillful than the real proxies?
Perhaps or Maybe?

Posted Mar 30, 2015 at 10:46 AM | Permalink

If I use adjusted data from selected sources to hide an incline, do I get to keep my membership in the Climate Scientist Club?
David Socrates

Posted Mar 30, 2015 at 11:47 AM | Permalink

Geoff,

Very interesting to read the 0.3C to 0.4C per century Australian temperature trend. Isn’t it bizarre that, whenever independent people look at a long term trend in a particular region, it is typically in the range 0.3C to 0.6C per century (and therefore un-alarming)?

For example, each of the following trends are all for data from 1850 onwards:

1. UK Met Office HadCRUT4 worldwide temperature series: 0.46C per century
2. UK Central England temperature series: 0.46C per century
3. New Zealand NIWA 7-station temperature series: 0.47C per century
4. Northern Ireland Armagh observatory temperature series: 0.6C per century

It is reasonable to assume that these are from parts of the world where record keeping was carried out with scientific rigour. Yet ‘official’ alarmist figures are typically two or three times as high.

David Cosserat
Craig Loehle

Posted Mar 30, 2015 at 11:51 AM | Permalink

There are 2 key rules for quantitative analysis:
1) Don’t use complex tools that you don’t understand.
2) Math & stats are only valid if their assumptions are met. If a matrix is not invertible (is degenerate) then you can’t use certain techniques. If it is nearly degenerate, results will be unstable. Again, stop. If you fail to sample your target population (as in Lew’s study) you can’t draw inferences to that population. Stop. If data are horribly skewed, either transform or stop. Stop stop stop when you can’t verify that the method is valid. Getting through the jungle may entail just slashing away with a machete, but this is not how math and stats are done. Unfortunately, software enables people to use tools they don’t have the faintest understanding of.
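The "nearly degenerate" case is easy to demonstrate. The following is a minimal sketch with made-up numbers (the function name `solve_2x2` is hypothetical): a 2x2 system whose rows are almost identical, where changing one right-hand-side value in the fourth decimal place moves the solution from roughly (1, 1) to roughly (0, 2).

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Nearly degenerate matrix: the two rows differ only in the 4th decimal.
base = solve_2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0001)
perturbed = solve_2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0002)
print(base, perturbed)
```

The same sensitivity shows up in any regression on nearly collinear predictors.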
mrsean2k

Posted Mar 30, 2015 at 12:07 PM | Permalink

Forgive my ignorance, but how does an artefact like that step-change “make it” into a method?

I assume in less heated fields this sort of thing is weeded out by testing the method in a lot of scenarios with real and synthetic data before declaring it fit for purpose / adding appropriate caveats to use?

At this point is any purpose served by determining what causes that behaviour, or is it enough to observe that it happens?
opluso

Posted Mar 30, 2015 at 1:52 PM | Permalink

This may be OT, but Mann’s use of RegEM was noted (and gently criticized) at least as far back as 2001 (see Schneider, http://clidyn.ethz.ch/papers/imputation.pdf). Has Mann adjusted his RegEM techniques since Mann, et al. 1998? And if so, have the results been improved, or only made worse?

I get the impression that everything Mann does is simply adding a fresh layer of icing to his original 1998 cake. It’s getting a bit stale.

Steve: Mann has made numerous changes to his RegEM method. The problem shown here arises in the TTLS variation where the number of regularization parameters varies. As I’ve said endlessly, the problem is not the failure of the multivariate method to be complicated enough, but the inconsistency of the proxies. In the present case, the main issue is phrenology: stripbark bristlecones are not “proxies” for Atlantic ocean currents.
- Craig Loehle
  
  Posted Mar 30, 2015 at 2:17 PM | Permalink
  
  Maybe it is fruitcake–those never spoil…
- opluso
  
  Posted Apr 1, 2015 at 10:23 PM | Permalink
  
  A recent peer-reviewed analysis of Mann’s RegEM-TTLS technique (among others) can be found in Wang, et al. (2014) http://www.clim-past.net/10/1/2014/cp-10-1-2014.html
  
  Steve: thanks for the link. Unfortunately, this article is very non-insightful into the properties of the technique. Old CA posts are more analytic.
  - opluso
    
    Posted Apr 2, 2015 at 7:49 AM | Permalink
    
    Given that the authors supported an alternative in GraphEM, while gently criticising Mann’s RegEM for proxy reconstruction, perhaps the publication’s editors would be open to a new M&M critique of Mann’s proxy manipulations.
Bitter&Twisted

Posted Mar 30, 2015 at 4:40 PM | Permalink

Wot no Nick Stokes?
Where will I get my laughs?
Adrian_O

Posted Mar 31, 2015 at 2:31 AM | Permalink

Steve, CONGRATULATIONS for the all categories Lifetime Achievement Award for your blog.
I know of no one more deserving than you!
- Don Keiller
  
  Posted Mar 31, 2015 at 3:41 AM | Permalink
  
  +1 on that Adrian.
  Marvellous and well-deserved, Steve.
stevefitzpatrick

Posted Mar 31, 2015 at 4:25 PM | Permalink

“It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty–a kind of leaning over backwards. For example, if you’re doing an experiment, you should report everything that you think might make it invalid–not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked–to make sure the other fellow can tell they have been eliminated.”

I guess Rahmstorf et al somehow figure this Feynman comment does not apply to creative proxy reconstructions.
Frank

Posted Mar 31, 2015 at 4:57 PM | Permalink

Steve wrote: “In the present case, the main issue is phrenology: stripbark bristlecones are not “proxies” for Atlantic ocean currents.”

Unfortunately, you also cite the claim of “skilful reconstruction [of gyre SST] back to AD 900 and beyond (95% significance compared to a red-noise null).” Do you object to the claim that the reconstruction is skillful? Or the hypothesis that gyre temperature is predictive of AMOC flow?

Phrenology makes no sense, but neither did QM. As Feynman explains in QED, it doesn’t matter whether a particular theory makes sense or whether you like it – its value should be judged by the accuracy of the predictions it makes. Herein, you have shown that Rahmstorf’s predictions don’t make sense before or after the period presented in their paper. That should be good enough, but it still leaves me wondering what tricks led to the claim of skillful reconstruction.

Steve: they use some Mannian techniques that are similar to those criticized by us in McIntyre and McKitrick 2005 (GRL). In our critique, we observed that, in another context, Peter Phillips, a very distinguished econometrician, had diagnosed false claims of “significance” as arising from incorrect benchmarking of distributions of statistics being measured. The RE statistic has no theoretical distribution and benchmarking depends on simulations. We argued in MM2005 that Mann’s simulations did not properly simulate all the features of his method and that when more accurately simulated the true benchmarks were much higher and the claimed RE statistic was no longer “significant”. This critique has never been rebutted (other than the unconvincing Texas sharpshooting of Wahl and Ammann that I do not include as a “rebuttal”) but has been ignored by Mann and others. Rahmstorf et al claim (without citation) that the defective Mannian techniques are “standard” in the field – a claim that seems exaggerated to me. Since stripbark bristlecones and contaminated sediments are not proxies for Atlantic ocean currents, it is my belief that the RE statistics claimed to be “significant” are not actually significant.
- nutso fasst
  
  Posted Apr 1, 2015 at 9:26 AM | Permalink
  
  “Mannian techniques are ‘standard’ in the field…”
  
  Do they define the boundaries of “the field?”
Jimmy Haigh

Posted Apr 1, 2015 at 2:30 AM | Permalink

In years to come, Universities may offer complete courses in “Mannian Statistics”…
Frank

Posted Apr 2, 2015 at 2:26 PM | Permalink

Steve: Thank you for the informative reply. I personally have little faith in annually resolved proxies that are unable to reproduce annual fluctuations in temperature (low correlation coefficient), but are alleged (via RE) to reproduce long-term changes. If I understand your reply above, one needs Monte Carlo simulations (with an appropriate amount of auto-correlation in the noise) to interpret the meaning of the RE statistic. The North report called for reporting more than just the RE statistic, but that recommendation continues to be ignored. Have real statisticians demonstrated under what conditions a significant RE, but low R2, might be meaningful? Does pseudo-proxy data created from temperature output from a climate model demonstrate that RE.

If one analyzed tree-ring data through periods with a “divergence” problem, one probably will find that R2 shows that TRW is still influenced by local temperature in the divergence period while RE shows that trees magically stopped responding to local temperature after 1960. (Perhaps this is too naive.)

Steve: Rahmstorf et al report a CE statistic, but say that even negative statistics are “significant” according to their benchmarking. There’s a trick in the benchmarking as obviously the reconstructions aren’t significant. If “proxies” really are proxies, then the verification r2 will be significant. To its shame, North did not make this obvious observation, but instead suggested the CE statistic, which doesn’t have a known distribution either. Not a proud moment for North. There was some controversy among the panelists of the North panel about how North presented its findings at the press conference – I’ve been meaning to write about this for a number of years.
dynam01

Posted Apr 3, 2015 at 1:50 AM | Permalink

Reblogged this on I Didn't Ask To Be a Blog.