Oral Argument 1: Context

I have an audio copy of the oral argument in Mann v Steyn, which I’ve posted up (see link at the end of this post). One of the things often under-estimated by those readers (especially at WUWT) who are bloodthirsty for litigation as a means of settling scores is that it’s not easy for litigation lawyers to fully assimilate a complicated history. In the oral argument of the anti-SLAPP motion, both the lawyers and the judges too often seemed to be playing blind man’s bluff with the facts, making the decision both unpredictable and probably somewhat random.

I plan to do separate posts on the oral arguments of each lawyer. John Williams, Mann’s lawyer, frequently misrepresented the facts (as he did in the written brief). Michael Carvin, National Review’s lawyer, was not only too ignorant of the facts to rebut the misrepresentations of Mann’s lawyer, but made some bizarre gaffes that made me cringe listening to it. In my opinion, Carvin’s representation was only passable when he was tub-thumping about the First Amendment in a context that did not require knowledge of the facts of this case. Andrew Grossman, CEI’s lawyer, seemed to me to be the person who understood the facts reasonably well, but he got sidetracked onto technical issues of evidence and, unwisely in my opinion, let Carvin handle the rebuttal for both parties.

In preparing notes on the oral argument, I got diverted into the need for explication on several fronts.

Most of the legal concepts involved in libel defence are unfamiliar to readers. On the other hand, the judges are unfamiliar with the facts, which, unfortunately, are sometimes either poorly represented or not represented at all in the briefs.

The leading cases (Milkovich, Moldea, Guilford, Harte-Hanks) are common ground to the lawyers and judges, but not to readers.  In this series, I’ll include some discussion of the main libel defences in play in this proceeding. Because Mann’s lawsuit claims libel not simply from the term “fraudulent”, but also from epithets ranging from “ringmaster of the tree ring circus” and “intellectually bogus” to “data manipulation” and data “torture”, the suit necessarily involves a wide range of libel law.

With all the attention paid to “Mike’s Nature trick” and “hiding the decline”, you’d think that the relevant procedures would have been carefully explained in the briefs. But they haven’t been. Three different diagrams are involved in the various controversies (the WMO 1999 cover, the IPCC 2001 spaghetti graph and the Mann et al 1998-99 hockeystick diagram).  In my opinion, CA posts are not only the most authoritative source on these procedures, but the only source which carefully describes the procedures, free of disinformation.  Carvin, on behalf of National Review, completely failed to understand the differences between the diagrams, and thus his factual statements tend to be unintelligible or uninterpretable. (Carvin did forcefully make some First Amendment arguments, but, in doing so, too often failed to observe that the various opinions were not only permitted, but reasonable.)   During the closing phases of the rebuttal argument, the judges turned their attention to important questions of disclosure, issues that were not addressed in the written briefs as clearly as they might have been.

Assertions from John Williams, Mann’s lawyer, are even less reliable. His overt misrepresentations about the findings of various inquiries have been documented in previous CA posts.  Unnoticed in the oral argument and reply briefs was that Williams had slipped an untrue and deceptive characterization of “Mike’s Nature trick” into his most recent written brief, which otherwise mostly tracked his original January 2013 brief (almost word for word in many sections). I’ll discuss this new disinformation in a separate post.

While much of the recent controversy (including some of Simberg’s references) focused on issues regarding the “trick”, Steyn had described Mann’s particular hockey stick as “fraudulent” as long ago as 2006 (h/t David Appell). In that earlier criticism, Steyn had specifically referred to Mann’s (undisclosed) use of a biased algorithm in the production of his original Hockey Stick. Inter-related were contemporary controversies about Mann’s withholding of adverse verification statistics and misrepresentation of the supposed robustness of his reconstruction to the presence/absence of tree rings, especially stripbark bristlecones.   These issues are not directly mentioned in any of the “eight” inquiries that Mann and his lawyers listed as ones that the defendants ought to have been aware of, though they were touched on by the 2006 NAS panel and the Wegman report, neither of which was included in that list.  As noted in the past, Mann lied to the NAS panel about not calculating the verification r2 statistic.

Mann’s brief prominently cited the 2007 IPCC Assessment Report in support of the claim that various criticisms of his Hockey Stick didn’t matter.  CA readers will recall that the relevant language of the 2007 IPCC Assessment was not an “independent” assessment, but resulted from surreptitious correspondence between Eugene Wahl, then a close associate of Mann’s, and IPCC Lead Author Keith Briffa (of East Anglia), and that Wahl destroyed this correspondence shortly after receiving an email from Mann containing Jones’ notorious request to destroy the emails.  This topic came up in the closing stages of the oral argument, and Carvin’s uninformed and incompetent response about the destruction of emails and their relevance to Steyn’s accusation simply beggars belief.

Because Steyn and National Review have parted ways, Carvin and National Review seem to have been unaware of the long backstory and more or less presented the dispute (from National Review’s perspective) as little more than a purely academic controversy over the validity of tree rings as a temperature proxy, leaving the judges completely mystified as to why Mann, as opposed to any one of hundreds of scientists, was at issue.  I do not see how the judges could possibly understand the articles without understanding Mann’s distinctive role in the Climategate emails, and without understanding that the widespread calls for misconduct investigations were not “commissioned by” either CEI or National Review; nor did either institution play any role in prompting the investigation at Penn State that was the topic of Simberg’s commentary. Nor did either institution play any role in the formation of any of the other inquiries, such as they were, apart from CEI’s petition for reconsideration of the EPA Endangerment Finding.

The only misconduct inquiry to take evidence from Mann himself appears to have been the one at Penn State, an institution, which, as is well known, subsequently received intensely unfavorable publicity for its failure to properly investigate misconduct by Jerry Sandusky.  Simberg’s article was written on the remarkable occasion of former FBI director Louis Freeh recommending criminal charges against Penn State president Graham Spanier for his failures in connection with the investigation of Sandusky’s misconduct.  CEI’s written brief discussed this context,  but, in retrospect, much less forcefully than it might have, while National Review ignored it.

Misconduct and misconduct investigations have been widely publicized in the recent U.S. controversies about police misconduct and police misconduct investigations. No one seriously contends that a report of a misconduct inquiry necessarily puts an end to discussion or controversy.  It is hard to overstate the controversy that would result if an external review of procedures in a police misconduct investigation resulted in a police chief being charged criminally for obstruction.  Further, if a police chief were charged in respect to one misconduct investigation, one can presume that there would be vociferous demands that other misconduct investigations be re-examined. Although these analogies seem obvious, they were not pursued in the briefs or oral argument.

In the case of the Mann misconduct investigation, major defects in the procedure were already known.  For example, there was the astonishing communication from a member of the Penn State Inquiry Committee that William Easterling, who was said to have “recused” himself due to conflict of interest, had actually interfered with the Inquiry Committee to prevent them from interviewing me; or the fact that it was Graham Spanier who re-assured the Penn State community about the supposed thoroughness of the investigation into Mann’s conduct.

While CEI’s brief took note of one aspect of academic misconduct, they overlooked Penn State policy AD-47, which was actually at issue for the Investigation Committee. In the oral argument, Carvin did not appear to understand the scope of academic misconduct investigations and, bizarrely, did not appear to understand how the term “falsification” is defined in academic codes of conduct, a confusion that led him into a particularly cringeworthy gaffe.

The definition of academic misconduct as it applies to this case needs to be reviewed and I’ll do that separately.

In my prior commentary on this case, I mostly focused on Mann’s misrepresentations in regard to the various investigations, as it seemed to me that the case could be decided most easily on Mann’s failure to demonstrate “actual malice”.  As a result, I haven’t commented on the “actionability” of the various epithets.  While Williams has attempted to assimilate all terms as accusations of “fraud”, it seems to me that there are very large differences between allegations of “ringmaster of the tree-ring circus”, “intellectually bogus”, “data manipulation”, “data torture”, “academic misconduct” and “fraudulent hockey stick”, and that these very different allegations cannot be arm-wavingly assimilated. This distinction is particularly relevant to CEI and Simberg, who did not use the word “fraud”.

Rather than trying to deal with the language on an overall basis, it seems worthwhile to look at each epithet individually.  Both Grossman and Williams commented in oral argument about the term “data manipulation”, with Williams’ reply appearing to me to be a major gaffe.  I’ll also discuss an interesting precedent regarding use of the word “bogus” that was cited in the National Review brief. (The word “bogus” was one of a number of epithets used by Harry Edwards, then the Chief Judge of the D.C. Circuit, in an academic article responding to critics of the D.C. Circuit.)  In his closing, Carvin forcefully reminded the judges of EPA’s finding in relation to the word “fraudulent” in respect to charges against Mann: EPA had determined that the term, when applied to the arguments of Mann’s opponents, meant no more than that those arguments were “scientifically flawed” – a point previously noted in CEI’s reply brief. Though very late in the proceedings, this point seemed to give the judges some pause.

While there are many interesting and complicated issues pertaining to the actionability of the language,  it seems to me (as it has for a long time) that it is relatively easy to decide the case on Mann’s failure to establish “actual malice” as understood in U.S. libel law.  In my own commentary to date on this case, I’ve focused on the flagrant misrepresentations of the findings of the various inquiries in Mann’s brief and the dependence of his actual malice argument on those misrepresentations. Mann’s lawyer offered only a single case in support (Harte-Hanks), but it can be trivially distinguished from the facts in the present case.

If a Canadian court were approaching this matter (in the style of Canadian decisions, but applying U.S. law) and could decide the case on Mann’s failure to show evidence of “actual malice” as defined under U.S. law (as I believe is required on what Mann has produced to the court), it would, in many cases, abstain from decision or commentary on the actionability issues, lest it make bad precedent on controversial facts that were poorly argued by the lawyers, and would instead dismiss Mann’s case on the narrowest issue: his failure to provide evidence supporting “actual malice” as defined in U.S. libel law. Such a decision would, in this case, leave everyone disappointed – an outcome that might well appeal to the D.C. judges as well as being just.

The link to the audio is in two parts: Part 1; Part 2.  Stay tuned for more discussion. On those topics where I’ve indicated an intent to comment in more detail, I’d prefer that commenters wait for that more detailed commentary rather than pre-empting it.


IPCC Lead Author and the Nazca Vandalism

IPCC Lead Author Sven Teske, as alertly observed by Shub Niggurath, was one of the leaders of the vandalism of the Nazca lines during the recent Lima conference.

Several years ago, I had criticized Teske in his role as IPCC Lead Author, a criticism also taken up by Mark Lynas.

As with the Nazca vandalism, Teske was acting as a Greenpeace employee and activist when he promoted the Greenpeace scenario in the IPCC special report on renewables. Teske had been Lead Author of the chapter responsible for critical assessment of the feasibility of the Greenpeace renewables scenario – an assessment that was not carried out in the chapter or report, despite the expectations of policy-makers and the public.

The Greenpeace scenario was then equally uncritically promoted in the IPCC press release, from which the following statement was widely distributed:

Close to 80 percent of the world‘s energy supply could be met by renewables by mid-century if backed by the right enabling public policies a new report shows.

WG3 Co-Chair Ottmar Edenhofer defended Teske at the time as having been nominated by the German Government:

Sven Teske was nominated as an author by the German government and selected by the WGIII as Lead author in the IPCC’s continuous effort to draw on the full range of expertise, and this includes NGOs and business as well as academia.

Reply to Laden and Hughes on Sheep Mountain

A couple of days ago, Greg Laden published a response from Malcolm Hughes to my recent Sheep Mountain article. In today’s post, I’ll show that the “response” was both unresponsive and absurd. Continue reading

“Unprecedented” Model Discrepancy

Judy Curry recently noted that Phil Jones’ 2014 temperature index (recently the subject of major adjustments in methodology) might be a couple of hundredths of a degree higher than a few years ago and alerted her readers to potential environmental NGO triumphalism. Unsurprisingly, it has also been observed in response that the hiatus continues in full force in the satellite records, with 1998 remaining the warmest satellite year by a considerable margin.

Equally noteworthy, however – and of greater interest to CA readers, among whom there has been more focus on model-observation discrepancy – is that the overheating discrepancy between models and surface temperatures in 2014 was the fourth highest in “recorded” history and that the five largest warm discrepancies have occurred in the past six years.  The cumulative discrepancy between models and observations is far beyond any previous precedent. This is true for both surface and satellite comparisons.

In the figure below, I’ve compared CMIP5 RCP4.5 models to updated surface observations (updating a graphic used here previously), adding a lower panel showing the discrepancy between observations and the CMIP5 RCP4.5 model mean.

[Figure: ci_GLB_tas_1920_twopanel]

Figure 1. Top panel: CMIP5 RCP4.5 model mean (black) and 5-95% percentile envelope (grey) compared to HadCRUT4 (red). Dotted blue – projection of the hiatus/slowdown trend (1997-2014) to 2030; dotted red – a projection in which observations catch up to the CMIP5 RCP4.5 model mean by 2030. Bottom panel – discrepancy between CMIP5 RCP4.5 model mean and HadCRUT4 observations. All values basis 1961-1990.

During the hiatus/slowdown, HadCRU changed their methodology: the methodological changes contribute more to the slight resulting trend in HadCRUT4 than the trend shared with the older methodology. Even stipulating the change in method, observed 2014 surface temperatures are somewhat up from 2013, but still only at the bottom edge of the confidence interval envelope for CMIP5 models.  Because the CMIP5 model mean goes up relentlessly, the 2014 uptick in HadCRUT4 is far too small to close the discrepancy, which remains at near-record levels.  I’ve also shown two scenarios out to 2030: the dotted blue line continues the lower trend of the hiatus, while the dotted red line shows a catch-up to the model mean by 2030.  Reasonable people can disagree over which of the two scenarios is more likely.  In either scenario, the cumulative discrepancy continues to build, reaching unprecedented levels.
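For readers who want to see how such a comparison can be put together, here is a minimal R sketch. It is illustrative only, not the script behind the figures: cmip5 and had4 are assumed data frames with columns year and anom (annual anomalies on a 1961-1990 basis).

```r
# Minimal sketch, assuming 'cmip5' (model-mean anomalies) and 'had4' (HadCRUT4
# anomalies) are data frames with columns year and anom, both basis 1961-1990.
discrep <- merge(cmip5, had4, by = "year", suffixes = c(".mod", ".obs"))
discrep$diff <- discrep$anom.mod - discrep$anom.obs

# The largest warm discrepancies on record
head(discrep[order(-discrep$diff), c("year", "diff")], 5)

# Scenario 1 (dotted blue): continue the 1997-2014 observed trend to 2030
fit <- lm(anom ~ year, data = subset(had4, year >= 1997 & year <= 2014))
proj_hiatus <- predict(fit, newdata = data.frame(year = 2015:2030))

# Scenario 2 (dotted red): linear catch-up to the model mean by 2030
proj_catchup <- approx(x = c(2014, 2030),
                       y = c(had4$anom[had4$year == 2014],
                             cmip5$anom[cmip5$year == 2030]),
                       xout = 2015:2030)$y

# Lower panel: model mean minus observations
plot(discrep$year, discrep$diff, type = "l",
     xlab = "Year", ylab = "CMIP5 mean minus HadCRUT4 (deg C)")
```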

In the second graphic, I’ve done an identical plot for satellite temperature (RSS TLT), centering over 1979-1990 since satellite records did not start until 1979. The discrepancy between model TLT and observed TLT is increasingly dramatic.

[Figure: ci_GLB_tlt_1920_twopanel]
Figure 2. As above, but for TLT satellite records.

Reasonable people can disagree on why the satellite record differs from the surface record, but the discrepancy between models and observations ought not to be sloughed off because the 2014 value of Phil Jones’ temperature index is a couple of hundredths higher than a few years ago.

The “warmest year”, to its shame, neglected Toronto, which experienced a bitter winter and cool summer last year. For now, we can perhaps take some small comfort in the fact that human civilization has apparently continued to exist, perhaps even thrive, even in the face of the “warmest year”.

UPDATE Dec 12
Some readers wondered why I showed RSS, but not UAH. In past controversies, RSS has been preferred by people who dislike the analysis here, so I used it to be accommodating. Here is the same graphic using UAH.

[Figure: ci_GLB_tlt_1920_UAH_twopanel]
Figure 3. As Figure 2, but with UAH.

Sheep Mountain Update

Several weeks ago, a new article (open access) on Sheep Mountain (Salzer et al 2014, Env Res Lett) was published, based on updated (to 2009) sampling at the site.

One of the longstanding Climate Audit challenges to the paleoclimate community, dating back to the earliest CA posts, has been to demonstrate out-of-sample validity of proxy reconstructions by updating inputs subsequent to 1980. Because Graybill’s bristlecone chronologies were so heavily weighted in the Mann reconstruction, demonstrating out-of-sample validity at Sheep Mountain and other key Graybill sites is essential to any such validation.
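As a concrete (if simplified) illustration of what such an out-of-sample check involves, here is a minimal R sketch. The data frame 'dat' (columns year, proxy, temp) is a placeholder for an updated proxy series paired with its target temperature, not any particular archived dataset.

```r
# Minimal sketch of an out-of-sample check: calibrate the proxy-temperature
# relationship before 1980, then verify it on the post-1980 update.
calib <- subset(dat, year <= 1980)
verif <- subset(dat, year >  1980)
fit   <- lm(temp ~ proxy, data = calib)
pred  <- predict(fit, newdata = verif)

cor(pred, verif$temp)   # verification-period correlation

# RE (reduction of error) statistic relative to the calibration-period mean
1 - sum((verif$temp - pred)^2) / sum((verif$temp - mean(calib$temp))^2)
```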

The new information shows a dramatic failure of the Sheep Mountain chronology as an out-of-sample temperature proxy: it diverges sharply from NH temperature after 1980, the end of the Mann et al (and many other) reconstructions.  While the issue is very severe for the Mann reconstructions, it affects numerous other reconstructions, including PAGES2K. Continue reading

Anti-SLAPP Hearing Today

Mann v CEI, National Review, Simberg, Steyn and their amici is being argued today. Amici for Steyn, CEI, Simberg and NR include: American Civil Liberties Union, the Reporters Committee for Freedom of the Press, American Society of News Editors, the Association of Alternative Newsmedia, the Association of American Publishers, Inc., Bloomberg L.P., the Center for Investigative Reporting, the First Amendment Coalition, First Look Media Inc., Fox News Network, Gannett Co. Inc., the Investigative Reporting Workshop, the National Press Club, the National Press Photographers Association, Comcast Corporation, the Newspaper Association of America, the North Jersey Media Group Inc., the Online News Association, the Radio Television Digital News Association, the Seattle Times Company, the Society of Professional Journalists, Stephens Media LLC, Time Inc., Tribune Publishing, the Tully Center for Free Speech, D.C. Communications, Inc. and the Washington Post.

Disappointingly, Scott Mandia and the costumed vigilantes of the Climate Response Team elected not to appear as Mann amici. (Nor anyone else.)

New Data and Upside-Down Moberg

I’ve been re-examining SH proxies for some time now, both in connection with PAGES2K and for their intrinsic interest.  In today’s post, I’ll report on a new (relatively) high-resolution series from the Arabian Sea offshore Pakistan (Boll et al 2014, Late Holocene primary productivity and sea surface temperature variations in the northeastern Arabian Sea: implications for winter monsoon variability, pdf).  The series has considerable ex ante interest on a couple of counts. Alkenones yield temperature proxies that have a couple of important advantages relative to nearly all other temperature “proxies”: they are calibrated in absolute temperature (not by anomalies), and they yield glacial-interglacial patterns that make “sense”. No post hoc screening or trying to figure out which way is up.  In the extratropics, their useful information is limited to the summer season, but so is that of nearly all other proxies. Though more or less ignored in IPCC AR5, the development of alkenone series has arguably been one of the most important paleoclimate developments of the past 10 years and is something that I pay attention to.

But there is a big conundrum in trying to use them for 20th century comparisons: all of the very high resolution alkenone series to date are from upwelling zones and show a precipitous decline (downward HS) in 20th century temperatures. See discussion here.  These precipitous declines have been very closely examined by specialists, who conclude, according to my reading, that this is not a “divergence” breakdown of the proxy-temperature relationship, but rather an actual decrease in local SST in the upwelling zone, attributed (plausibly) to increased upwelling.

Because upwelling zones form only a small fraction of the ocean (though an important fraction due to biological productivity), it is important to obtain corresponding high-resolution alkenone series from non-upwelling zones. Boll et al 2014 is the first such example that I’ve seen and, in my opinion, it sheds very interesting new light on the vexed issue of two-millennium temperature. Continue reading

Data Torture in Gergis2K

Reflecting on then-current scandals in psychology arising from non-replicable research, E. Wagenmakers, a prominent psychologist, blamed many of the problems on “data torture”.  Wagenmakers attributed many data torture problems to ex post selection of methods. In today’s post, I’ll show an extraordinary example of data torture in the PAGES2K Australasian reconstruction.

Wagenmakers on Data Torture

Two accessible Wagenmakers articles on data torture are "An Agenda for Purely Confirmatory Research" (pdf) and "A Year of Horrors" (pdf).

In the first article, Wagenmakers observed that psychologists did not define their statistical methods before examining the data, creating a temptation to fine-tune the analysis to obtain a “desired result”:

we discuss an uncomfortable fact that threatens the core of psychology’s academic enterprise: almost without exception, psychologists do not commit themselves to a method of data analysis before they see the actual data. It then becomes tempting to fine tune the analysis to the data in order to obtain a desired result—a procedure that invalidates the interpretation of the common statistical tests. The extent of the fine tuning varies widely across experiments and experimenters but is almost impossible for reviewers and readers to gauge.

Wagenmakers added:

Some researchers succumb to this temptation more easily than others, and from presented work it is often completely unclear to what degree the data were tortured to obtain the reported confession.

It is obvious that Wagenmakers’ concerns are relevant to paleoclimate, where ad hoc and post hoc methods abound and where some results are more attractive to researchers.

Gergis et al 2012

As is well known to CA readers, Gergis et al did ex post screening of their network by correlation against their target Australasian regional summer temperature.   Screening reduced the network from 62 series to 27.  For a long time, climate blogs have criticized ex post screening as a bias-inducing procedure – a bias that is obvious, but which has been neglected in the academic literature.  For the most part, the issue has been either ignored or denied by specialists.
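The bias is easy to demonstrate with a toy simulation. The sketch below uses pure-noise "proxies" and, for simplicity, keeps the ten best correlates rather than applying a p-value cutoff; it is not a reconstruction of any actual network.

```r
# Toy illustration of screening bias: 62 pure-noise "proxies" over 150 years,
# screened by correlation with a trending target over the last 70 years.
set.seed(2)
nyears <- 150; ncal <- 70; nproxy <- 62
cal <- (nyears - ncal + 1):nyears
target  <- seq(0, 1, length.out = ncal) + rnorm(ncal, sd = 0.3)  # trending "temperature"
proxies <- matrix(rnorm(nyears * nproxy), nyears, nproxy)        # pure noise

r <- cor(proxies[cal, ], target)            # calibration-period correlations
keep <- order(r, decreasing = TRUE)[1:10]   # "screen": retain the best correlates
composite <- rowMeans(proxies[, keep])

plot(composite, type = "l", xlab = "Year", ylab = "Composite of screened noise")
# The composite is flat noise before the screening period and typically drifts
# upward within it - an uptick manufactured purely by ex post selection.
```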

Gergis et al 2012, very unusually for the field, stated that they intended to avoid screening bias by screening on detrended data, describing their screening process as follows:

For predictor selection, both proxy climate and instrumental data were linearly detrended over the 1921-1990 period to avoid inflating the correlation coefficient due to the presence of the global warming signal present in the observed temperature record. Only records that were significantly (p<0.05) correlated with the detrended instrumental target over the 1921-1990 period were selected for analysis. This process identified 27 temperature-sensitive predictors for the SONDJF warm season.

Unfortunately for Gergis and coauthors, that’s not what they actually did: their screening was done on undetrended data. When screening was done in the described way, only 8 or so proxies survived.  Jean S discovered this a few weeks after the Gergis et al article was published on May 17, 2012.  Two hours after Jean S’ comment at CA, coauthor Neukom notified Gergis and Karoly of the problem.
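To make the distinction concrete, here is a minimal R sketch of the two screening variants for a single proxy. 'proxy' and 'target' are placeholders for annual 1921-1990 series, and the AR(1) adjustment of degrees of freedom described in the paper is omitted for simplicity.

```r
# Screening one proxy against the instrumental target over 1921-1990.
years <- 1921:1990
detrend <- function(x) residuals(lm(x ~ years))

cor.test(detrend(proxy), detrend(target))   # screening as described in the paper
cor.test(proxy, target)                     # screening as apparently implemented

# With a common warming trend in both series, the undetrended correlation is
# typically larger and clears p < 0.05 far more often than the detrended one.
```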

Gergis and coauthors, encouraged by Gavin Schmidt and Michael Mann, attempted to persuade the Journal of Climate editors that they should be allowed to change the description of their methodology to what they had actually done. However, the editors did not agree, challenging the Gergis coauthors to show the robustness of their results. The article was not retracted. The University of Melbourne press statement continues to say that it was published on May 17, 2012, but has been submitted for re-review (and has apparently been under review for over two years now).

PAGES2K

The PAGES2K Australasian network is the product of the same authors. Its methodological description is taken almost verbatim from Gergis et al 2012.  Its network is substantially identical to the Gergis 2012 network: 20 of 27 Gergis proxies carry forward to the P2K network. Several of the absent series are from Antarctica, covered separately in P2K.  The new P2K network has 28 series, now including 8 series that had previously been screened out.  The effort to maintain continuity extended to keeping proxies in the same order in the listing, even inserting new series into the precise empty spaces left by departing series.

Once again, the authors claimed to have done their analysis using detrended data:

All data were linearly detrended over the 1921-1990 period and AR(1) autocorrelation was taken into account for the calculation of the degrees of freedom [55].

This raises an obvious question:  in the previous test using detrended data, only a fraction passed.  So how did they pass the detrended test this time?

Read their description of P2K screening and watch the pea:

The proxy data were correlated against the grid cells of the target (HadCRUT3v SONDJF average). To account for proxies with different seasonal definitions than our target SONDJF season (for example calendar year averages) we calculate the correlations after lagging the proxies for -1, 0 and 1 years. Records with significant (p < 0.05) correlations with at least one grid-cell within a search radius of 500 km from the proxy site were included in the reconstruction. All data were linearly detrended over the 1921-1990 period and AR(1) autocorrelation was taken into account for the calculation of the degrees of freedom [55]. For coral record with multiple proxies (Sr/Ca and δ18O) with significant correlations, only the proxy record with the higher absolute correlation was selected to ensure independence of the proxy records.

Gergis et al 2012 had calculated one correlation for each proxy, but the above paragraph describes ~27 correlations: three lag periods (+1, 0, -1) by nine gridcells (not just the host gridcell, but also the W, NW, N, NE, E, SE, S and SW gridcells, all of which would be within 500 km according to my reading of the above text). The other important change is from testing against a regional average to testing against individual gridcells, which, in some cases, are not even in the target region.
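A minimal R sketch of such a screening loop (my reading of the description, not the authors' code) makes the multiplicity explicit. 'proxy' is an annual series and 'grid_temps' is an assumed list of detrended SONDJF gridcell series within 500 km of the proxy site.

```r
# Each proxy gets up to ~27 chances: three lags times up to nine nearby gridcells.
passes <- FALSE
n <- length(proxy)
for (lag in c(-1, 0, 1)) {
  shifted <- rep(NA_real_, n)          # proxy shifted by -1, 0 or +1 years
  idx <- seq_len(n) - lag
  ok <- idx >= 1 & idx <= n
  shifted[ok] <- proxy[idx[ok]]
  for (cell in grid_temps) {
    use <- !is.na(shifted) & !is.na(cell)
    if (sum(use) > 10 && cor.test(shifted[use], cell[use])$p.value < 0.05)
      passes <- TRUE                   # one "significant" hit out of ~27 suffices
  }
}
passes
```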

Discussion

Gergis’ test against multiple gridcells takes the peculiar Mann et al 2008 pick-two methodology to even more baroque lengths.  Thinking back to Wagenmakers’ prescription of ex ante methods, it is hard to imagine Gergis and coauthors proposing ex ante that they test each proxy against nine different gridcells for “statistical significance”. Nor does it seem plausible that much “significance” can be placed on higher correlations from a contiguous gridcell, as compared to the actual gridcell.  It seems evident that Gergis and coauthors were doing whatever they could to salvage as much of their network as possible and that this elaborate multiple screening procedure was simply a method of accomplishing that end.  Nor does it seem reasonable to data mine after the fact for “significant” correlations across three different lag periods, including one in which the proxy leads temperature.

Had the PAGES2K coauthors fully discussed the background and development of this procedure from its origin in Gergis et al 2012, it is hard to believe that a competent reviewer would not have challenged them on this peculiar screening procedure.  Even if such data torture were acquiesced in (which is dubious), it should have been mitigated by requiring adjustment of the t-statistic standard to account for the repeated tests: with 27 draws, the odds of obtaining a value that is “95% significant” obviously change dramatically.  When the draws are independent, there are well-known procedures for doing so. Using the Bonferroni correction with 27 “independent” tests, the t-statistic for each individual test would have to be qt(1 - 0.05/27, df) rather than qt(1 - 0.05, df).  For typical detrended autocorrelations, the df is ~55. This changes the benchmark t-statistic from ~1.7 to ~3.0.  The effective number of independent tests would be less than 27 because of spatial correlation, but even if the effective number of independent tests were as few as 10, the benchmark t-statistic increases to ~2.7.  All this is without accounting for their initial consideration of 62 proxies – something else that ought to be accounted for in the t-test.
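The arithmetic can be checked directly in R (df taken as ~55, as above):

```r
df <- 55
qt(1 - 0.05, df)       # ~1.67: benchmark t-statistic for a single test
qt(1 - 0.05/27, df)    # ~3.0:  Bonferroni benchmark for 27 tests
qt(1 - 0.05/10, df)    # ~2.7:  if only ~10 tests are effectively independent
```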

While all of these are real problems, the largest problem with the Neukom-Gergis network is grounded in the data: the long ice core and tree ring series don’t have a HS shape, but there is a very strong trend in coral δ18O data after the Little Ice Age, and especially in the 20th century.  Splicing the two dissimilar proxy datasets results in hockey sticks even without screening.   Such splicing of unlike data in the guise of “multiproxy” reconstruction has been endemic in paleoclimate since Jones et al 1998 and is under-discussed. It’s something that I plan to discuss.

There are other peculiarities in the Gergis dataset.  Between Gergis et al 2012, PAGES2K and Neukom et al 2014, numerous proxies are assigned to inconsistent calendar years.  If a proxy is assigned to a calendar year that is inconsistent with the calendar year of its corresponding temperature series, the calculated correlation will be lower than it really is.  Some of the low detrended correlations of Gergis et al 2012 appear to have arisen from errors in proxy year assignment. I noticed this with Oroko, which I analysed in detail: given its splicing of instrumental data, it ought to pass a detrended correlation test, and its failure to do so therefore requires close examination.
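A toy R example (synthetic data, not the Gergis/PAGES2K series themselves) shows how sharply a one-year misassignment can attenuate a genuine correlation:

```r
# Toy illustration: a one-year misalignment attenuates a genuine correlation.
set.seed(1)
n <- 70
temp  <- as.numeric(arima.sim(list(ar = 0.3), n))  # stand-in "temperature"
proxy <- temp + rnorm(n, sd = 0.5)                 # proxy that tracks temperature

cor(proxy, temp)            # correlation with years correctly aligned
cor(proxy[-1], temp[-n])    # the same series with a one-year offset
```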

PAGES2K and Nature’s Policy against Self-Plagiarism

Nature’s policies on plagiarism state:

Duplicate publication, sometimes called self-plagiarism, occurs when an author reuses substantial parts of his or her own published work without providing the appropriate references.

The description of the Australasian network of PAGES2K (coauthors Gergis, Neukom, Phipps and Lorrey) is almost entirely lifted in verbatim or near-verbatim chunks from Gergis et al, 2012 (withdrawn and under re-review), in apparent violation of Nature’s policy against self-plagiarism.

Continue reading

Gergis2K and the Oroko “Disturbance-Corrected” Blade

Only two Gergis proxies (both tree ring) go back to the medieval period: Oroko Swamp, New Zealand and Mt Read, Tasmania, both from Ed Cook.  Although claims of novelty have been made for the Gergis reconstruction, neither of these proxies is “new”: both were illustrated in AR4, and Mt Read was used as early as Mann et al 1998 and Jones et al 1998.

In today’s post, I’ll look in more detail at the Oroko tree ring chronology, which was used in three technical articles by Ed Cook (Cook et al 2002 Glob Plan Chg; Cook et al 2002 GRL; Cook et al 2006) to produce temperature reconstructions.  In the earliest of these (2002 Glob Plan Chg), Cook showed a tree ring chronology which declined quite dramatically after 1957.  Cook reported a very high correlation to instrumental summer temperature (Hokitika, South Island NZ) between 1860 and 1957, followed by a “collapse” in correlation after 1957 – a decline attributed by Cook to logging at the site.  For his reconstruction of summer temperature, Cook “accordingly” replaced the proxy estimate with instrumental temperature after 1957, an artifice clearly marked in Cook’s original articles, but not necessarily in downstream multiproxy uses.
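In code terms, the splice described above amounts to nothing more than the following; 'recon', 'instr' and 'years' are placeholders for annual series on a common time axis.

```r
# Schematic of the post-1957 splice: proxy-based reconstruction before 1957,
# instrumental temperature thereafter.
spliced <- ifelse(years <= 1957, recon, instr)
```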

Gergis et al 2012 (which corresponds to PAGES2K up to a puzzling one year offset) said that they used “disturbance-corrected” data for Oroko:

“for consistency with published results, we use the final temperature reconstructions provided by the original authors that includes disturbance-corrected data for the 213 Silver Pine record…” (E. Cook, personal communication)

By “disturbance correction”, do they mean the replacement of proxy data after 1957 by instrumental data? Or have they employed some other method of “disturbance correction”?

Assessment of this question is unduly complicated because Cook never archived Oroko measurement data or, for that matter, any of the chronology versions or reconstructions appearing in the technical articles.  Grey versions of the temperature reconstruction (but not chronology) have circulated in connection with multiproxy literature (including Mann and Jones 2003, Mann et al 2008, Gergis et al 2012 and PAGES2K 2013).  In addition, two different grey versions occur digitally in Climategate letters from 2000 and 2005, with the later version clearly labeled as containing a splice of proxy and instrumental data.  The Gergis version is clearly related to the earlier grey versions, but, at present, I am unable to determine whether the “disturbance correction” included an instrumental splice or not.

There’s another curiosity.  As noted above, Cook originally claimed a high correlation to instrumental temperature up to at least 1957 and, based on their figures, the correlation to 1999 would still have been positive, even if attenuated; yet Mann and Jones 2003 reported a negative correlation (-0.25) to instrumental temperature.  Gergis et al 2012 obtained the opposite result, once again asserting a statistically significant positive correlation to temperature.  To the extent that instrumental data had been spliced into the Gergis version, one feels that claims of statistical significance ought to be qualified.  Nonetheless, the negative correlation claimed in Mann and Jones 2003 is puzzling: how did they obtain a sign opposite to Cook’s original study?

As to the Oroko proxy itself,  it does not have anything like a HS-shape. It has considerable centennial variability. Its late 20th century values are somewhat elevated (smoothed 1 sigma basis 1200-1965), but nothing like the Gergis 4-sigma anomaly.  It has no marked LIA or MWP. It has elevated values in the 13th century, but it has low values in the 11th century, the main rival to the late 20th century, and these low 11th century values  attenuate reconstructions where 11th and 20th century values are close.  The HS-ness of the Gergis2K reconstruction does not derive from this series.

The Oroko Swamp site is on the west (windward) coast of South Island, New Zealand at 43S at low altitude (110 m).   In December 2012, during family travel to New Zealand South Island, we visited a (scenic) fjord on the west coast near Manapouri (about 45S).   These are areas of constant wind and very high precipitation. They are definitely nowhere near altitude or latitude treelines. Cook himself expressed surprise that a low-altitude chronology would be correlated to temperature, but was convinced by the relationship (see below).

In today’s post, I’ll parse the various versions far more closely than will interest most (any reasonable) readers.  I got caught up trying to figure out the data and want to document the versions while it’s still fresh in my mind. Continue reading
