The McShane and Wyner paper, previewed in August, has now been published by Annals of Applied Statistics as a Discussion Paper here, with an accompanying editorial by Michael Stein and discussions by 13 different groups (one of which is a short comment by Mc and Mc), together with an excellent Rejoinder.
BTW, using Firefox I wasn’t able to open the papers by clicking, but I was able to download them.
CA readers are well aware of my own view that the fundamental problem in paleoclimate is not the need for some novel multivariate method, but better proxies and reconciliation of discordant existing “proxies”. CA readers are also aware that Team reconstructions use highly stereotyped proxies over and over again in different guises – bristlecones, Yamal – sort of an ongoing version of the Dead Parrot skit in Monty Python. McShane and Wyner used the Mann et al 2008 data set, which quixotically introduced the Tiljander sediments, the modern portion of which was contaminated with bridge-building sediments.
Given the central role of these specific proxies in the target data set, I checked the various discussions to see if anyone mentioned either bristlecones or upside-down Tiljander. Given the academic-ness and non-engineering-ness of the discussion, these fundamental issues of data quality go, needless to say, unnoticed by the new entrants to the discussion.
Berliner of Hu McCulloch’s Ohio State, in a short comment, says sensible things about the poor quality of the proxies without specifically attending to nuances like bristlecones or upside-down Tiljander. My guess is that he would further roll his eyes if he were aware that this sort of thing is so deeply embedded in the field.
Other than a brief mention by Ross and me, the only discussants to mention bristlecones and Tiljander were Schmidt, Mann and Rutherford.
Schmidt et al analyse a subset of Mann et al 2008 proxies which excluded the Tiljander proxies (which they coyly described as only “potentially contaminated” – ignoring both the clear original statements and subsequent clarifications by Mia Tiljander that the modern portion was contaminated). However, this subset includes multiple Graybill bristlecone series – data sets well known to have been selected by Graybill for strip bark. The NAS 2006 panel had recommended that these data be “avoided” in reconstructions; Mann et al 2008 said that they were adhering to the NAS 2006 recommendations, but used bristlecone chronologies anyway. CA readers are well aware of the pea-under-the-thimble character of the Mann et al 2008 sensitivity analyses related to Tiljander and bristlecones – they purported to show that bristlecones didn’t matter in their highly publicized nodendro reconstruction by using upside-down contaminated Tiljander sediments, and then to show that the upside-down sediments didn’t “matter” by using bristlecones. In a grey supplementary information to a different article (Mann et al 2009), key conclusions about the nodendro reconstruction were quietly withdrawn (without attaching a notice to the original article).
Schmidt et al further disseminate the disinformation that Mann et al 2008 performed a meaningful sensitivity test on the impact of bristlecones and upside-down Tiljander – remarkably failing to cite the partial withdrawal of results in Mann et al 2009.
The further elimination of 4 potentially contaminated “Tiljander” proxies [as tested in M08; M08 also tested the impact of removing tree-ring data, including controversial long “Bristlecone pine” tree-ring records. Recent work, c.f. Salzer et al 2009, however demonstrates those data to contain a reliable long-term temperature signal], which yields a set of 55 proxies, further reduces the level of peak Medieval warmth (Figure 1a, c.f. Fig 14 in MW; See also Supplementary Figures S1-S2).
Salzer et al 2009, referred to here, notably did not cite Ababneh’s discordant results on Sheep Mountain bristlecones, where she was unable to replicate Graybill’s results. The failure to reconcile Ababneh’s results has been well known to CA readers for a long time and it is bizarre that people attempt to reconstruct past climate using data sets where conflicting results are simply ignored, rather than reconciled.
In a response reminiscent of Wegman’s imperious dismissal of Wahl and Ammann’s after-the-fact changes to MBH methodology to “get” the desired result (see Wahl and Ammann, Texas sharpshooters), McShane and Wyner dismiss Schmidt et al’s post hoc, ad hoc editing of the data set:
The process by which the complete set of 95/93 proxies is reduced to 59/57/55 is only suggestively described in an online supplement to Mann et al. (2008). As statisticians we can only be skeptical of such improvisation, especially since the instrumental calibration period contains very few independent degrees of freedom. Consequently, the application of ad hoc methods to screen and exclude data increases model uncertainty in ways that are unmeasurable and uncorrectable.
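The statistical hazard McShane and Wyner are pointing at can be illustrated with a toy simulation (my sketch, not anything from MW or SMR; all names and the 0.2 threshold are illustrative): screen pure-noise “proxies” by correlation with a short calibration series, and the survivors show apparent skill in the calibration window that vanishes everywhere else.

```python
import numpy as np

rng = np.random.default_rng(1)

n_proxies, n_years, n_cal = 1000, 150, 50
temp = rng.normal(size=n_years)                   # pseudo "temperature"
proxies = rng.normal(size=(n_proxies, n_years))   # pure noise: no real signal

cal = slice(0, n_cal)                             # short "calibration" window
# Ad hoc screening: keep proxies whose calibration-period correlation clears
# an arbitrary threshold.
r_cal = np.array([np.corrcoef(p[cal], temp[cal])[0, 1] for p in proxies])
passed = proxies[r_cal > 0.2]

# Composite of the survivors: strong apparent "skill" in-calibration,
# essentially none out-of-sample.
composite = passed.mean(axis=0)
r_in = np.corrcoef(composite[cal], temp[cal])[0, 1]
r_out = np.corrcoef(composite[n_cal:], temp[n_cal:])[0, 1]
print(r_in, r_out)   # r_in inflated; r_out near zero
```

Since every series here is noise, the in-calibration correlation of the screened composite is pure selection artifact – the sense in which screening “increases model uncertainty in ways that are unmeasurable”.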
McShane and Wyner comment that there is an error in Schmidt et al Figure 2, which they were able to diagnose precisely through examination of online code. They archly observe that the error arose through “improper centering”. Plus ça change.
Before proceeding, however, we must note a troubling problem with SMR Figure 2. Visual inspection of the plots reveals an errant feature: OLS methods appear to have non-zero average residual in-sample! Upon examining the code SMR did provide, we confirmed that this is indeed the case and discovered the models were fit incorrectly. The culprit, ironically, is an improper centering of the fitted values.
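The arithmetic behind MW’s diagnostic is worth spelling out: OLS with an intercept has in-sample residuals that average exactly zero, so a non-zero mean residual is a red flag that the fit or its centering is wrong. A minimal sketch (mine, not SMR’s actual code; the “wrong mean” used below is purely illustrative of how a mis-centering can produce the symptom):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy calibration data: a "proxy" x and a "temperature" y with non-zero mean.
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(scale=0.3, size=100) + 10.0

# Correct OLS with intercept: residuals average exactly zero in-sample.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
print(np.mean(y - fitted))        # zero up to floating point

# "Improper centering": re-centering the fitted values on the wrong mean
# (here the predictor's mean rather than the response's) leaves a clearly
# non-zero average residual -- the errant feature visible in a plot.
bad_fitted = (fitted - fitted.mean()) + x.mean()
print(np.mean(y - bad_fitted))    # far from zero
```

This is why the non-zero average residual was visible by mere inspection of the plots before any code was examined.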