The Two Jeffs on Emulating Steig

The two Jeffs (Jeff C and Jeff Id) have interesting progress reports on emulating Steig using unadorned Tapio Schneider code here. Check it out. One of the first questions that occurred to third-party readers was whether RegEM somehow increased the proportional weight of Peninsula stations relative to continental stations as compared to prior studies. Jeff C observes:

As I became more familiar, it dawned on me that RegEM had no way of knowing the physical location of the temperature measurements. RegEM does not know or use the latitude and longitude of the stations when infilling, as that information is never provided to it. There is no “distance weighting” as is typically understood as RegEM has no idea how close or how far the occupied stations (the predictor) are from each other, or from the AWS sites (the predictand).

Jeff notes that the Peninsula is less than 5% of the land mass, but has over 35% of the stations (15 of 42). Jeff shows that the reported Steig trend is cut in half merely through geographic grouping, saying:

Again, I’m not trying to say this is the correct reconstruction or that this is any more valid than that done by Steig. In fact, beyond the peninsula and coast data is so sparse that I doubt any reconstruction is accurate. This is simply to demonstrate that RegEM doesn’t realize that 40% of the occupied station data came from less than 5% of the land mass when it does its infilling. Because of this, the results can be affected by changing the spatial distribution of the predictor data (i.e. occupied stations).

The irrelevance of geography is something that we’ve observed in other Mannian methods, starting right from the rain in Maine (which falls mainly in the Seine.) In MBH98, geographic errors didn’t “matter” either. The rain in Spain/Kenya error in Mann 2008 only “mattered” because the hemisphere changed. Had the error stayed in the same hemisphere, it wouldn’t have “mattered”. Gavin Schmidt and Eric Steig took umbrage at someone bothering to notice a geographic error in the Supplementary Information. At the time, I noted that I wasn’t sure whether the error was a typo or, as in the MBH and Mann 2008 cases, was embedded in the information files themselves. In either case, I didn’t expect the error to “matter”, simply because I didn’t expect that Steig’s methods cared whether a site was correctly located – a point that is a corollary to the results of the two Jeffs. Take a look.

Andrew Sullivan on "Why I Blog"

Andrew Sullivan, a well-known writer, has been blogging since 2001 and won the 2008 Best Blog award (displayed at his site.) In November, prior to this competition, he published an excellent essay on blogging in the Atlantic Monthly, one that I read at the time and meant to discuss. Give it a read.

While his lessons and conclusions are directed at literary and political writers, there are many well-expressed observations that resonate with me – now that I’ve been blogging for 4 years (hard to imagine.)

Blogging as a Form
Sullivan observes the importance of the reader community in the ethos of a blog and how the blog host can shape this ethos:

The role of a blogger is … similar in this way to the host of a dinner party. He can provoke discussion or take a position, even passionately, but he also must create an atmosphere in which others want to participate.

We can think of examples of other blogs, where one feels that the blog host/hosts fail to do that. One of the reasons for my insistence on readers avoiding food fights and my disappointment with readers when these happen is summarized in the above metaphor. You expect people to be polite at a dinner party; I expect them to be polite here.

He observes the sort of companionship that develops at a blog – something that I find quite noticeable here:

It renders a writer and a reader not just connected but linked in a visceral, personal way. The only term that really describes this is friendship. And it is a relatively new thing to write for thousands and thousands of friends….

Sullivan observes that blogs end up exposing the blogger’s personality:

the atmosphere will inevitably be formed by the blogger’s personality.

and

You end up writing about yourself, since you are a relatively fixed point in this constant interaction with the ideas and facts of the exterior world.

A reader recently questioned what a column on a squash tournament was doing on a climate blog. Well, I like playing squash and the tournament was a big deal for me, so I wrote about it. No one’s obligated to read it. I don’t read Lucia’s knitting posts, but I like the idea that she writes them. My editorial guess is that, while personal touches are distracting and irrelevant in a journal publication, they add to a blog.

Delicate Flowers
I’m continually surprised at what delicate flowers are modern climate scientists. None of them seem to like being discussed here. We’ve seen this recently with Steig’s intemperate reactions to online discussion; while Steig’s outbursts have been particularly extreme, we’ve seen surprising anger not just from the original Team, but from Santer, Peter Brown and other dendros, etc. In this context, it’s perhaps reassuring to read Sullivan’s characterization of writers as “sensitive, vain souls”, who’ve had pretty cloistered lives, and are not used to feedback that is “instant, personal, and brutal.”

Again, it’s hard to overrate how different this is. Writers can be sensitive, vain souls, requiring gentle nurturing from editors, and oddly susceptible to the blows delivered by reviewers. They survive, for the most part, but the thinness of their skins is legendary. Moreover, before the blogosphere, reporters and columnists were largely shielded from this kind of direct hazing. Yes, letters to the editor would arrive in due course and subscriptions would be canceled. But reporters and columnists tended to operate in a relative sanctuary, answerable mainly to their editors, not readers. For a long time, columns were essentially monologues published to applause, muffled murmurs, silence, or a distant heckle. I’d gotten blowback from pieces before—but in an amorphous, time-delayed, distant way. Now the feedback was instant, personal, and brutal.

My guess is that a considerable portion of the anger that informs the animosity of a number of climate scientists to this blog is nothing more than an expression of this human instinct. Having said that, they’d also be wise to understand that belligerent and contemptuous outbursts, no matter how clever they may seem to the scientist, usually end up sounding merely rude, subtracting from the dignity of the scientist involved.

Peer Review
We often hear about the wonders of journal peer review as a gold standard for quality, with pointed and often snide comparison to the fact that one can publish a blog column at the push of a button. Sullivan:

No columnist or reporter or novelist will have his minute shifts or constant small contradictions exposed as mercilessly as a blogger’s are.

Sullivan’s description of the editorial overhead for a literary column sounds an awful lot like actual peer review (as opposed to theoretical peer review):

I’d edited a weekly print magazine, The New Republic, for five years, and written countless columns and essays for a variety of traditional outlets. And in all this, I’d often chafed, as most writers do, at the endless delays, revisions, office politics, editorial fights, and last-minute cuts for space that dead-tree publishing entails. …

Every professional writer has paid some dues waiting for an editor’s nod, or enduring a publisher’s incompetence, or being ground to literary dust by a legion of fact-checkers and copy editors. If you added up the time a writer once had to spend finding an outlet, impressing editors, sucking up to proprietors, and proofreading edits, you’d find another lifetime buried in the interstices.

Blogging seems to have a deceptive ease:

Blogging—even to an audience of a few hundred in the early days—was intoxicatingly free in comparison…

with one click of the Publish Now button, all these troubles [with editors] evaporated.

However, Sullivan found that he quickly received online reviews that were far more severe than anything from his editors:

Alas, as I soon discovered, this sudden freedom from above was immediately replaced by insurrection from below. Within minutes of my posting something, even in the earliest days, readers responded. E-mail seemed to unleash their inner beast. They were more brutal than any editor, more persnickety than any copy editor, and more emotionally unstable than any colleague.

He argues that this online immediate review is just as effective as anything from editors:

And so blogging found its own answer to the defensive counterblast from the journalistic establishment. To the charges of inaccuracy and unprofessionalism, bloggers could point to the fierce, immediate scrutiny of their readers. Unlike newspapers, which would eventually publish corrections in a box of printed spinach far from the original error, bloggers had to walk the walk of self-correction in the same space and in the same format as the original screwup. The form was more accountable, not less, because there is nothing more conducive to professionalism than being publicly humiliated for sloppiness. Of course, a blogger could ignore an error or simply refuse to acknowledge mistakes. But if he persisted, he would be razzed by competitors and assailed by commenters and abandoned by readers.

There’s considerable truth to that – my slightest misstep seems to provoke immediate and vociferous demands for correction. I certainly don’t claim infallibility (how could anyone who supports audits and due diligence?), but I do try to be careful, and the relative rarity of accusations of error may provide some reassurance to readers. Plus I try very hard to provide original sources and documentation to readers.

Sullivan makes an interesting observation on the importance of the hyperlink and access to original sources as adding a depth to blog postings that, in a sense, is unique to the form:

But the superficiality masked considerable depth—greater depth, from one perspective, than the traditional media could offer. The reason was a single technological innovation: the hyperlink. An old-school columnist can write 800 brilliant words analyzing or commenting on, say, a new think-tank report or scientific survey. But in reading it on paper, you have to take the columnist’s presentation of the material on faith, or be convinced by a brief quotation (which can always be misleading out of context). Online, a hyperlink to the original source transforms the experience.

Quite so. Even when I’m critiquing someone, I try as much as possible to make original materials (and calculations) available to readers. One can think of other blogs that don’t – that rely on paraphrasing and re-stating their opponents’ positions, rather than providing access to and analysis of original materials. In a way, the extensive provision of turnkey R code here can be construed as an extended riff on the hyperlink idea – it sure puts interested readers in touch with the original materials in a way that is inconceivable in traditional publications.

Paddling
Sullivan observes acutely that blogging as a form of publication is not the same as a print article. He cites Drudge’s aphorism that a blog is a “broadcast, not a publication.”

as Matt Drudge told me when I sought advice from the master in 2001, the key to understanding a blog is to realize that it’s a broadcast, not a publication. If it stops moving, it dies. If it stops paddling, it sinks.

I hadn’t thought about it in those terms, but I think that I’ve sort of adapted to this empirically. I work in bits and pieces, I’ll work one theme for a while, abandon it for some months and perhaps return to it later. Annoys some critics and some friends, but things keep moving along – where, I don’t know, but they keep moving.

The lack of finish obviously annoys some observers, who point to my lack of output in the formal journals over the last few years, but, in fairness to me, this lack of output in formal journals has been accompanied by copious output in this forum.

Many readers are looking for answers and I warn such readers that, if they’re looking for “answers”, they’d better go some place else. My interests are in process and in questions. It seems that “meandering” and “unresolved” posts have some distinguished examples, with Sullivan contrasting the “meandering, questioning, unresolved dialogues” of one famous philosopher with the “definitive, logical treatises” of another. While I make no claim to special authority or accomplishment for anything presented here, Sullivan observes that “meandering, questioning, unresolved dialogues” are a “skeptic’s spirit translated into writing”, which is perhaps why I’ve become comfortable with the form.

Sullivan concludes:

For centuries, writers have experimented with forms that evoke the imperfection of thought, the inconstancy of human affairs, and the chastening passage of time. But as blogging evolves as a literary form, it is generating a new and quintessentially postmodern idiom that’s enabling writers to express themselves in ways that have never been seen or understood before. Its truths are provisional, and its ethos collective and messy. Yet the interaction it enables between writer and reader is unprecedented, visceral, and sometimes brutal. And make no mistake: it heralds a golden era for journalism.

Blogs are not a substitute for academic journals. Readers are inclined to make much more grandiose claims in this regard than I do. They’re not better – they’re different. They are a sui generis form of publication that is evolving pretty rapidly. The Sullivan article is here.

Smerdon et al 2008 on RegEM

Smerdon et al 2008 is an interesting article on RegEM, continuing a series of exchanges between Smerdon and the Mann group that has been going on for a couple of years.

We haven’t spent as much time here on RegEM as we might have. I did a short note in Nov 2007 here.

In the July and August 2006 open review of Bürger and Cubasch (CPD, 2006), Mann (dba Anonymous Reviewer #2) referred to “correct” RegEM, referring to Rutherford et al 2005.

On July 10, 2006, Jean S commented on Rutherford-Mann 2005 “adaptations”, noting three important “adaptations”:
1. use of a “hybrid” method: separate application of RegEM to “low-frequency” and “high-frequency” components, as separated by Mannian versions of Butterworth filters;
2. stepwise RegEM;
3. an unreported “standardization” step.

CA readers were aware by this time that short-segment standardization could have a surprising impact on reconstructions – a point that was then very much in the news, with the confirmation of this point in the North and Wegman reports being very fresh at the time. Jean S observed of this unreported standardization:

The above code “standardizes” all proxies (and the surface temperature field) by subtracting the mean of the calibration period (1901-1971) and then divides by the std of the calibration period. I’m not sure whether this has any effect to the final results, but it is definitely also worth checking. If it does not have any effect, why would it be there?

The unreported standardization step noted by Jean S was subsequently determined to be at the heart of an important defect described in Smerdon and Kaplan 2007.
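
To see why the standardization interval matters, here is a minimal sketch in R (mine, not code from any of the papers): a toy pseudoproxy with a colder pre-calibration period is standardized two ways, once using the full-period mean and standard deviation (M05-style) and once using only the 1901-1971 calibration interval (R05-style). The systematic offset between the two versions is exactly the sort of “additional information” at issue.

# toy pseudoproxy: colder before 1901, warmer during the calibration period
set.seed(1)
yr = 1856:1980
proxy = ts(rnorm(length(yr), sd = 0.5) + ifelse(yr >= 1901, 0.4, 0), start = 1856)
calib = window(proxy, start = 1901, end = 1971)

# M05-style: mean and sd taken over the full period (uses reconstruction-interval information)
std.full = (proxy - mean(proxy)) / sd(proxy)
# R05-style: mean and sd taken over the calibration interval only
std.calib = (proxy - mean(calib)) / sd(calib)

# the two standardized versions differ by a systematic offset, not just noise
mean(std.full - std.calib)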

Mann et al 2005 had supposedly tested the RegEM methodology used in the Rutherford et al 2005 reconstruction, then presented as mutually supporting the MBH reconstruction (although Rutherford et al 2005 could be contested on grounds other than those discussed by Smerdon, since it used Mannian PCs without apology). The Smerdon and Kaplan 2007 findings are summarized as follows (in Smerdon et al 2008):

Mann et al 2005 attempted to test the R05 RegEM method using pseudoproxies derived from the National Center for Atmospheric Research (NCAR) Climate System Model (CSM) 1.4 millennial integration… Mann et al 2005 did not actually test the Rutherford et al 2005 technique, which was later shown to fail appropriate pseudoproxy tests (Smerdon and Kaplan 2007). The basis of the criticism by Smerdon and Kaplan (2007) focused on a critical difference between the standardization procedures used in the M05 and R05 studies (here we define the standardization of a time series as both the subtraction of the mean and division by the standard deviation over a specific time interval). Their principal conclusions were as follows: 1) the standardization scheme in M05 used information during the reconstruction interval, a luxury that is only possible in the pseudoclimate of a numerical model simulation and not in actual reconstructions of the earth’s climate; 2) when the appropriate pseudoproxy test of the R05 method was performed (i.e., the data matrix was standardized only during the calibration interval), [the reconstructions showed] biases and variance losses throughout the reconstruction interval; and 3) the similarity between the R05 and Mann et al. (1998) reconstructions, in light of the demonstrated problems with the R05 technique, suggests that both reconstructions may suffer from warm biases and variance losses.

In their Reply to Smerdon and Kaplan 2007 (Mann et al 2007b), they claimed that the selection of the ridge parameter using generalized cross validation (GCV), as performed in R05 and M05, was the source of the problem:

The problem lies in the use of a particular selection criterion (Generalized Cross Validation or ‘GCV’) to identify an optimal value of the ‘ridge parameter’, the parameter that controls the degree of smoothing of the covariance information in the data (and thus, the level of preserved variance in the estimated values, and consequently, the amplitude of the reconstruction).
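
(For readers unfamiliar with the jargon in the above quote: ridge regression shrinks regression coefficients toward zero by an amount set by the ridge parameter, and GCV is one recipe for choosing that parameter from the data. A minimal illustration with MASS::lm.ridge on made-up data – generic ridge regression, not the RegEM-Ridge code itself – is shown below.)

library(MASS)
set.seed(2)
X = matrix(rnorm(100 * 5), 100, 5)                  # five noisy predictors
y = drop(X %*% c(1, 0.5, 0, 0, 0) + rnorm(100))     # response depends on the first two

# fit over a grid of ridge parameters; GCV is computed for each value
fit = lm.ridge(y ~ X, lambda = seq(0, 20, by = 0.1))
select(fit)                        # reports the GCV-minimizing lambda (among other criteria)
fit$lambda[which.min(fit$GCV)]     # the GCV choice; larger lambda means heavier shrinkage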

Smerdon et al 2008 (JGR) delicately observed that this assertion was supported only by arm-waving:

The authors do not elaborate any further, however, making it unclear why such conclusions have been reached.

Smerdon et al 2008 report that the “explanation” of Mann et al 2007a, 2007b for the problem is invalid, stating:

These results collectively rule out explanations of the standardization sensitivity in RegEMRidge that hinge on the selection of the regularization parameter, and point directly to the additional information (i.e., the mean and standard deviation fields of the full model period) included in the M05 standardization as the source of the differences between M05- and R05-derived reconstructions. It should be noted further that this information, especially in terms of the mean, happens to be “additional” only because of a special property of the dataset to which RegEM is applied herein: missing climate data occur during a period with an average temperature that is significantly colder than the calibration period. This property clearly violates an assumption that missing values are missing at random, which is a standard assumption of EM (Schneider 2006). If the missing data within the climate field were truly missing at random, there presumably would not be a significant systematic difference between the M05 and R05 standardizations, and hence corresponding reconstructions. The violation of the randomness assumption, however, is currently unavoidable for all practical problems of CFRs during the past millennium and thus its role needs to be evaluated for available reconstruction techniques.

Finally, when the application of RegEM-Ridge is appropriately confined to the calibration interval the method is particularly sensitive to high noise levels in the pseudoproxy data. This sensitivity causes low correlation skill of the reconstruction and thus a strong “tendency toward the mean” of the regression results. It therefore will likely pose some challenges to any regularization scheme applied to this dataset when the SNR in the proxies is high. We thus expect RegEMTTLS, which according to M07a does not show standardization sensitivity, to have significantly higher noise tolerance and skill than RegEM-Ridge. The precise reasons and details of this skill increase is a matter for future research. It remains a puzzling question, however, as to why the R05 historical reconstruction that was derived using RegEM-Ridge and the calibration-interval standardization (thus expected to be biased warm with dampened variability) and the M07a historical reconstruction that used RegEM-TTLS (thus expected not to suffer significantly from biases) are not notably different. The absence of a demonstrated explanation for the difference between the performance of RegEM-Ridge and RegEM-TTLS, in light of the new results presented herein, therefore places a burden of proof on the reconstruction community to fully resolve the origin of these differences and explain the present contradiction between pseudoproxy tests of RegEM and RegEM-derived historical reconstructions that show little sensitivity to the method of regularization used.

While Mann is normally not reticent about citing papers under review, Smerdon et al 2008 is, for some reason, not cited in either Mann et al 2008 or Steig et al 2009.

In my opinion, there are other issues with the RegEM project, quite aside from these ones. These relate more to exactly what one is trying to do with a given multivariate methodology.

References:
Bürger, G., and U. Cubasch. 2005. Are multiproxy climate reconstructions robust? Geophysical Research Letters 32, L23711: 1-4.
Bürger, G., and U. Cubasch. 2006. On the verification of climate reconstructions. Climate of the Past Discussions 2: 357-370.
Mann, M. E., S. Rutherford, E. Wahl, and C. Ammann. 2005. Testing the Fidelity of Methods Used in Proxy-Based Reconstructions of Past Climate. Journal of Climate 18, no. 20: 4097-4107.
Mann, M. E., S. Rutherford, E. Wahl, and C. Ammann. 2007a. Robustness of proxy-based climate field reconstruction methods. J. Geophys. Res. 112. (revised Feb 2007, published June 2007) url
Mann, M. E., S. Rutherford, E. Wahl, and C. Ammann. 2007b. Reply to Smerdon and Kaplan. Journal of Climate 20: 5671-5674. (Nov 2007) url
Mann, M. E. 2006. Interactive comment on “On the verification of climate reconstructions” by G. Bürger and U. Cubasch. Climate of the Past Discussions 2: S139-S152. url
Rutherford, S., M. E. Mann, T. J. Osborn, R. S. Bradley, K. R. Briffa, M. K. Hughes, and P. D. Jones. 2005. Proxy-Based Northern Hemisphere Surface Temperature Reconstructions: Sensitivity to Method, Predictor Network, Target Season, and Target Domain. Journal of Climate 18, no. 13: 2308-2329.
Smerdon, J. E., J. F. González-Rouco, and E. Zorita. 2008. Comment on “Robustness of proxy-based climate field reconstruction methods” by Michael E. Mann et al. J. Geophys. Res. 113. url
Smerdon, J. E., and A. Kaplan. 2007. Comments on “Testing the Fidelity of Methods Used in Proxy-Based Reconstructions of Past Climate”: The Role of the Standardization Interval. Journal of Climate 20: 5666-5670. url
Smerdon, J. E., A. Kaplan, and D. Chang. 2008. On the origin of the standardization sensitivity in RegEM climate field reconstructions. Journal of Climate 21: 6710-6723. url

Badminton & Racquet Club Pro-AM

My squash club has a wonderful squash doubles pro-am that runs over the first few weeks of February. The pro at our club, Eric Baldwin, does a great job organizing it, and it is well supported by the pros around Toronto, a very active squash community. Jonathan Power is now a member, as is Gary Waite, who was the #1 doubles player for about 15 years. Both are interesting and personable. Jonathan played in our Pro-Am two years ago and Gary has played in it for about 6 years. It’s a real privilege for the amateurs who get to play. (When recently asked for an affiliation in submissions to academic journals, I contemplated saying Badminton and Racquet Club, but decided on Climate Audit instead.)

Eric tries to balance the teams so that the best amateur gets last pick of who he plays with. The losing team in each game gets a 3-point head start in the next game, so it’s a recipe for 5-game matches. In the first round, 4 of 8 matches were decided 15-14 in the fifth. Squash is one of the few sports with 2-way match points – sort of like a buzzer beater in basketball when a team is one point down. (Hockey has sudden death, but the puck is only in one end at a time.)

Last night, the proprietor of this blog got to play against Paul Price, currently part of the #1 ranked squash doubles team in the world.

One of the club members made up promotional posters for the various players, one of which featured the proprietor of Climate Audit. To spare the unwary, you are not obliged to look at this poster merely because you came to the blog, but continue reading if you’re brave.

Gavin on McKitrick and Michaels

Ross writes:

Has Gavin posted on his IJOC paper? [SM – yes. At RC today.] I will head over tomorrow to have a look. Not today–it’s very sunny here and the ice rink beckons. I am not sure how replication plays into the issue, since I posted my data and code from the get-go.

I got a copy of Gavin’s paper (from Jos de Laat) 2 weeks ago, and Gavin sent me his data promptly in response to my request. I have been having a lot of fun with it. In his paper, Gavin shows that the coefficients weaken a bit when RSS is substituted for UAH. But he doesn’t report the joint F tests which form the basis of the conclusions about contamination. They are still strongly significant. He also points out that the coeff’s are significant when GISS data are swapped in, and claims this shows the effects are spurious. I am sure I am not the only person who has actually read the paper and noticed that to the extent they are significant the coefficients on GISS data take the opposite signs to those on the observational data. Far from showing the effects are spurious, it shows that the observations negate a significant pattern in the modeled data and make it significant in the other direction. That’s called an observable effect, not a spurious result. And since it’s a comparison of ‘clean’ GCM data versus observations it has as much causal interpretation as the 3-part ‘signal detection’ methodology in the IPCC reports. I.e. in this case it’s a “signal detection” result for non-climatic contamination of the surface data. Gavin doesn’t report a chi-squared test of parameter equivalence between coeff’s estimated on modeled and observed data (akin to the outlier test and hausman test shown in MM07), but that’s OK because he posted his data, so I have done it and parameter equivalence is strongly rejected.

The issue of spatial autocorrelation is a huge red herring. Over a year ago I responded to the RC posting on this by writing a paper showing that spatial AC is not significant and even if we control for it anyway the results all hold up. The JGR would not publish my response unless Rasmus submitted his comment. I challenged him to do so in December of 2007 and he said it might take a while since he was getting busy with work. That’s the last I heard from him on it. I also deal with the topic in a more recent paper under review elsewhere, which I have reason to believe Rasmus has read. But when I write up my response to Gavin’s paper I’ll be sure to give the spatial AC issue a thorough discussion.
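
For readers curious about the mechanics of the parameter-equivalence test Ross mentions, a Wald-type chi-squared test comparing two independently estimated coefficient vectors can be sketched in R as follows. This is a generic illustration with simulated data and placeholder names, not Ross’s script or the MM07 variables:

# two regressions of the same form: one on "observed" trends, one on model-generated trends
set.seed(3)
n = 200
x1 = rnorm(n); x2 = rnorm(n)                  # stand-ins for socioeconomic covariates
obs.trend = 0.3 * x1 - 0.2 * x2 + rnorm(n)    # pattern present in the "observations"
gcm.trend = rnorm(n)                          # no such pattern in the model world

fit.obs = lm(obs.trend ~ x1 + x2)
fit.gcm = lm(gcm.trend ~ x1 + x2)

d = coef(fit.obs) - coef(fit.gcm)             # difference of the coefficient vectors
V = vcov(fit.obs) + vcov(fit.gcm)             # covariance of the difference (fits treated as independent)
wald = as.numeric(t(d) %*% solve(V) %*% d)    # Wald chi-squared statistic
pchisq(wald, df = length(d), lower.tail = FALSE)   # small p-value rejects parameter equivalence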

Editorial note (SM): I have not worked through any of these papers and accordingly have no personal view or opinion on the details. Schmidt acknowledges that Ross’ due diligence package was complete. Schmidt sneers at the problems that others encounter in trying to get to the starting point (e.g. for Steig) that would have been provided by a proper due diligence package, such as the one that Ross provided, but, inconsistently, appears to have consulted Ross’ well-prepared due diligence package in his own paper.

Lest Sweetness Be Wasted on the Desert Air

Full many a gem of purest ray serene
The dark unfathomed caves of ocean bear:
Full many a flower is born to blush unseen,
And waste its sweetness on the desert air.

Thomas Gray (English Poet, 1716-1771)

From time to time, we here at Climate Audit have the opportunity to draw attention to obscure flowers that would otherwise “waste their sweetness”. Today is one such opportunity. On Jan 31, Larry Solomon wrote a column on Antarctica here.

On Feb 2, 2009, while realclimate coauthor Gavin Schmidt was worried that a delicate flower newly in bloom at Climate Audit might “waste its sweetness on the desert air”, his coauthor Michael Mann contributed what we can only describe as a “gem of purest ray serene” at Google News here (or National Post here), which I excerpt in full below.

Comment by Michael E. Mann, Director, Earth System Science Center, Penn State
google news comment

Fossil fuel industry shill Solomon continues to lie to public – Feb 2, 2009

In his latest piece in the tabloid the “National Post,” Mr. Lawrence Solomon, a widely recognized purveyor of fossil fuel-funded disinformation, continues to use the forum provided to him by the Post to spread lies about scientists and scientific research in the area of global climate change.

Doing the bidding of the fossil fuel industry that financially supports his disinformation efforts, Mr. Lawrence repeatedly lies about my work, the work of my colleagues, the findings of the scientific community, and even the judgments of the world’s leading scientific organizations and journals.

Mr Lawrence full well knows, for example, that my own work on paleoclimate reconstructions from more than a decade ago has been reproduced by many groups, and vindicated in a report by the U.S. National Academy of Sciences, and even more recently, in the fourth assessment report of the Intergovernmental Panel on Climate Change (for a lay-person’s guide to this report, you might check out my recent book “Dire Predictions: Understanding Global Warming” published by DK/Pearson). Mr. Lawrence disingenuously implies otherwise by citing a partisan attack several years ago by Joe Barton, the leading recipient of fossil fuel industry money in the U.S. House of Representatives (and who is often referred to as “smokey Joe Barton” for his support of industry’s right to pollute our environment). Barton’s attacks were decried by newspapers editorials around the world, which likened it to a modern-day McCarthyism, using word’s like “witch hunt” and “inquisition.” The discredited Barton attacks were dismissed by organizations like the American Association of the Advancement of Science, which dismissed the attacks as an attempt to intimidate scientists whose findings may prove troublesome to industry special interests. The National Academy of Sciences responded by performing a legitimate scientific review of my findings and similar work by others in the scientific community, and the academy endorsed or key findings, noting that a host of additional studies since have confirmed them. The media reported the NAS findings as “Science Panel Backs Study on Warming Climate” (New York Times), “Backing for Hockey Stick Graph (BBC), and so on.

Mr Solomon of course knows about all of this, but still chooses to mislead and lie to his readers. His statements about our current study in ‘Nature’ (which from his article you’d think I was the sole author — in fact, I was only the 4th in a team of 6 co-authors) which studies the long-term warming of Antarctica and its causes, are unusually disingenuous and specious. As described in detail elsewhere (e.g. the website “RealClimate.org” which I co-founded), our latest study is not contradicted by weather records at all, despite Mr. Solomon’s dishonest attempt to imply otherwise by misrepresenting and cherry-picking anecdotal observations. Our study reproduces the well known cooling of the Antarctic interior which took place during the 1970s through 1990s (and is believed to be, as confirmed by our study, due to stratospheric ozone depletion which was greatest over that particular time interval). However, by combining the available temperature observations, we show that the longer-term pattern for Antarctica on the whole is one of warming, and this is consistent with the expected response to the long-term increase in greenhouse gas concentrations. Finally, Mr Lawrence goes on to attack the journal ‘Nature.’ Gee, who should we trust here? The most dishonest industry advocate in the climate change debate, or the world’s most prestigious peer-reviewed scientific journal. You be the judge.

How ironic that Mr Lawrence uses the word “shame” in his disinformation piece. For he is perhaps the most shameful and dishonest actors in the climate change disinformation machine. Some people indeed have no shame. Nonetheless, in Mr Solomon’s case, the judgment of history will be his condemnation.

Larry Solomon replied today here.

Mann and Solomon, like many before them, have quite opposite accounts of the Wegman and North panel reports. For interested readers, I compiled relevant quotes here and both reports, the testimony of both Wegman and North, and Wegman’s answers to supplementary questions are linked from this site.

Note: Apart from regular provisions at RC, Mann connoisseurs may enjoy these two other flowers by the Maestro here or as Referee # 2 here.

Steig’s “Corrections”

Roman M has already done one post on the impact of the Harry error. Ryan O has also done so [see comment here]. As has Steig.

I show below some graphics that I’ve just done on AWS recon trends.

At Steig’s website, he now states:

awsreconcorrected.txt is a correction to the above file, using corrected AWS data at two AWS stations. See the note under “Raw Data”, below. [Note that corrections have been made to AWS stations “Harry” and “Racer Rock”. See the READER web site for details. ] The resulting differences in the reconstruction are too small to be discernable in Figure S3, or in the trends for individual stations given in Table S1 in the Supplementary Information that accompanies the paper in Nature. (Corrections for all stations in the Table S1 are in the third decimal place (that is <0.01 degrees C/decade)). The mean trend for Antarctica changes by less than 0.004 degrees C/decade. The mean trend for West Antarctica changes by less than 0.02 degrees C/decade. Note also there is a typo in Table S2. The correct coordinates for station ‘Harry’ are 83.0 S 238.6 E. Note that none of these corrections have any impact on the satellite-based reconstructions; no AWS data were used in those reconstructions.

Several different trend periods are used in the article. In the main text, Figure 4 uses a period of 1979-2003, as shown below (for AVHRR data.)

Steig Figure 4e. Comparison of reconstructed and modelled mean annual temperature trends (degrees Celsius per decade); panels e–h cover 1979–2003.

Figure S3, referred to above, shows trends for AWS stations, but for periods (left, 1957-2006; right, 1969-2000) that do not match the 1979-2003 period of Figure 4.

Figure 2. Steig Figure S3 excerpt.

To “facilitate comparison” with Steig et al Figure 4, I calculated trends over the 1979-2003 period for the AWS reconstruction, illustrated in a style more or less emulating Figure S3 (I’ll post scripts up in the comments; a short sketch of the calculation is shown below.) I’ve labeled some of the stations in play.
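
Pending the scripts in the comments, the gist of the trend calculation is simple enough to sketch here, assuming the recon_aws object loaded as in the “Deconstructing the Steig AWS Reconstruction” post below:

# 1979-2003 trend (deg C/decade) for each AWS series in the reconstruction
rec.window = window(recon_aws, start = 1979, end = c(2003, 12))
trend.dec = 10 * apply(rec.window, 2, function(x) coef(lm(x ~ time(rec.window)))[2])
sort(round(trend.dec, 2), decreasing = TRUE)[1:5]   # per the discussion below, Harry's series tops the list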

Harry is the station with the largest 1979-2003 trend in the entire reconstruction, as you can readily see below. Harry, Racer Rock, Clean Air, Elaine and Butler Island have all been mentioned in recent comments, with amendments of one type or another being required for the first four stations (though only the Racer Rock and Harry changes are incorporated in the present update.)

The Harry trend (the one with the screwed up data) is a very distinct feature in West Antarctica.


Figure S3. Spatial pattern of temperature trends (°C/decade) from reconstruction using AWS data. a) Mean annual trends for 1957-2006. b) Mean annual trends for 1969-2000, to facilitate comparison with ref. (2).

I just downloaded the “corrected” versions from Steig’s website, yielding the plot below. In this period, I submit that it is clearly not the case that the “resulting differences in the reconstruction are too small to be discernable”.


Figure 4. 1979-2003 trends in Steig AWS reconstruction (corrected).

I’ll experiment a bit tomorrow to see why there are such differences in results for the 1979-2003 period and the 1969-2000 period that Steig discusses.


Deconstructing the Steig AWS Reconstruction

So how did Steig et al. calculate the AWS reconstruction? Since we don’t have either the satellite data or the exact path they followed using the RegEM Matlab files, we can’t say for sure how the results were obtained.

However, using a little mathemagic, we can actually take the sequences apart and then calculate reconstructed values for those times when the reconstructions have been replaced by the measured temperature values.

I realize that there are problems with the AWS data sets, but for our purposes that is irrelevant. This analysis can easily be repeated for the updated results if and when Prof. Steig does the recalcs.

We are given 63 reconstructed sequences, each extending monthly from 1957 to 2006. The first step is to look only at the part of those sequences which is affected neither by the measured values nor by the satellite-induced manipulation – the time period 1957-1979.

We begin by using a principal component analysis (and good old R) on the truncated sequences. For our analysis, we will need three variables previously defined by Steve Mc.: Data, Info, and recon_aws. In the scripts, I also include some optional plots which are not run automatically. Simply remove the # sign to run them.

#Run this section if you don't have the variables Data, Info and recon_aws
download.file("http://data.climateaudit.org/data/steig/Data.tab","Data.tab",mode="wb")
load("Data.tab")
download.file("http://data.climateaudit.org/data/steig/Info.tab","Info.tab",mode="wb")
load("Info.tab")
download.file("http://data.climateaudit.org/data/steig/recon_aws.tab","recon_aws.tab",mode="wb")
load("recon_aws.tab")
dimnames(recon_aws)[[2]]=Info$aws_grid$id
#start the calculations
early.rec = window(recon_aws,start=1957,end = c(1979,12))
rec.pc = princomp(early.rec)
#plot(rec.pc$sdev)
rec.pc
# Comp.1 Comp.2 Comp.3 Comp.4 …
#1.334132e+01 6.740089e+00 5.184448e+00 1.267015e-07 …

There are 63 eigenvalues in the PCA. The fourth largest one is virtually zero. This makes it very clear that the reconstructed values are a simple linear combination of only three sequences, presumably calculated by the RegEM machine. The sequences are not unique (which does not matter for our purposes).
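
As a quick supplementary check, the fraction of total variance left over after the first three components can be computed directly from the princomp output:

# proportion of variance beyond the first three components - effectively zero
sum(rec.pc$sdev[-(1:3)]^2) / sum(rec.pc$sdev^2)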

We can use the first three principal components from the R princomp procedure:

rec.score = rec.pc$scores[,1:3]
#matplot(time(early.rec),rec.score,type = "l")
#flip the PCs for positive trend
#does not materially affect the procedures
rec.score = rec.score%*%diag(sign(coef(lm(rec.score ~ time(early.rec)))[2,]))

reg.early =lm(early.rec[,1:63]~rec.score[,1]+rec.score[,2]+rec.score[,3])
#SM: this gives coefficients for all 63 reconstructions in terms of the 3 PCs
reccoef=coef(reg.early)
pred.recon = predict(reg.early)
max(abs(predict(reg.early)-early.rec)) #[1] 5.170858e-07

I have flipped the orientation of the PCs so that each component has a positive trend in time. This also doesn’t make a difference to our result (I sound like a member of the Team now) because the coefficient which multiplies the PC in the reconstruction will automatically change sign to accommodate the change. pred.recon is the reconstruction using only the three PCs. The largest difference between Steig’s values and pred.recon is about 0.0000005, so the fit is good!

Now, what about the time period from 1980 to 2006? The method we have just used will not work because, for each series, some of the reconstructed values have been replaced by measurements from the AWS. We will assume that exactly three “PCs” were used with the same combining coefficients as in the early period.

The solution is then to find intervals in the 1980 to 2006 time range where we have three (or more) sites which do not have any actual measurements during that particular interval. Then, using simple regression, we can reconstruct the PC sequences. It turns out that the problem can be reduced to just two intervals: 1980-1995 and 1996-2006:

#laterecon
#figure out first and last times for data for each aws
dat_aws = window(Data$aws[,colnames(early.rec)],end=c(2006,12))
not_na = !is.na(dat_aws)
trange = t(apply(not_na,2,function(x) range(time(dat_aws)[x])))

#construct 1980 to 1995
#identify late starters
sort(trange[,1])[61:63]
# 89828 21363 89332
#1996.000 1996.083 1996.167
names1 = c("89828", "21363", "89332")
reg1 = lm(rec.score ~ early.rec[,names1[1]]+ early.rec[,names1[2]]+ early.rec[,names1[3]])
rec.pc1 = cbind(1,window(recon_aws[,names1],start=1980,end =c(1995,12)))%*%coef(reg1)
#matplot(rec.pc1,type = "l")

#construct 1996 to 2006
#identify earliest finishers
sort(trange[,2])[1:3]
# 89284 89834 89836
#1992.167 1994.750 1995.917
names2 = c("89284","89834","89836")
reg2 = lm(rec.score ~ early.rec[,names2[1]]+ early.rec[,names2[2]]+ early.rec[,names2[3]])
rec.pc2 = cbind(1,window(recon_aws[,names2],start=1996))%*%coef(reg2)
#matplot(rec.pc2,type = "l")

Now we put the two together and check how good the fit is (ignoring the elements that have been replaced):

#check for fit 1980 to 2006
late.rec = window(recon_aws,start=1980)
estim.rec = cbind(1,rbind(rec.pc1,rec.pc2))%*%reccoef
diff.rec = estim.rec-late.rec
max(abs(diff.rec[!not_na]))#[1] 5.189498e-07
matplot(time(late.rec),diff.rec,type="l",main = "Diff Between Recons and Measured Temps",ylab = "Anom (C)",xlab = "Year")

Not bad! Again we have a very good fit. The differences visible in the plot are the differences between the reconstruction we just did and the “splicing” of the AWS data by Steig. Some of the larger ones are likely due to the already identified problems with the data sets.

Finally, we put everything together into a single piece:

#merge results
recon.pcs = ts(rbind(rec.score,rec.pc1,rec.pc2), start = 1957,frequency=12)
colnames(recon.pcs) = c("PC1","PC2","PC3")
matplot(time(recon.pcs),recon.pcs,type = "l",main = "AWS Reconstruction PCS",xlab="Year", ylab ="PC Value")
legend(1980,-40,legend = c("PC1", "PC2","PC3"),col = c(1,2,3), lty = 1)

#trends (per year) present in the PCs
reg.pctrend = lm(recon.pcs ~ time(recon.pcs))
coef(reg.pctrend)
# PC1 PC2 PC3
#(Intercept) -199.522419 -62.90221792 -222.1577310
#time(recon.pcs) 0.101605 0.03189134 0.1128271

The trend slopes (which are per year) should be taken with a grain of salt since the PCs are multiplied by different coefficients (often quite a bit less than one).
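
Since each AWS reconstruction is a fixed linear combination of the three PCs, the per-station trend implied by the PC slopes can be obtained by pushing those slopes through the regression coefficients found earlier. A rough sketch (for the PC-based reconstruction only, ignoring the spliced-in AWS observations):

# approximate per-station trend (deg C/decade) implied by the PC slopes:
# each station's PC loadings (reccoef, less the intercept) times the per-year PC trends
pc.slope = coef(reg.pctrend)[2, ]
station.trend = 10 * as.vector(t(reccoef[-1, ]) %*% pc.slope)
names(station.trend) = colnames(recon_aws)
round(sort(station.trend, decreasing = TRUE), 2)[1:5]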

Where can we go from here? I think one interesting problem might be to evaluate the effect of surface stations on the PC sequences (pre-1980 only). This might give some insight into which stations drive the results. As well, AWS positions appear to be related to the coefficients which determine the weight each PC has in that AWS reconstruction.

Finally, measures of “validation” can now be calculated using the observed AWS data to see how well the reconstruction fits. All possible without knowin’ how they dunnit!
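
As a starting point for such validation measures, here is a simple sketch of station-by-station correlations between the PC-based reconstruction and the observed AWS anomalies over 1980-2006 (RE/CE statistics would follow the same pattern):

# station-by-station correlation between reconstructed and observed values, 1980-2006
obs.late = window(dat_aws, start = 1980)
stopifnot(nrow(obs.late) == nrow(estim.rec))   # check that the two matrices line up month for month
cor.fit = sapply(1:ncol(obs.late), function(i)
  cor(as.numeric(obs.late[, i]), estim.rec[, i], use = "pairwise.complete.obs"))
names(cor.fit) = colnames(obs.late)
summary(cor.fit)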


More Changes at BAS

More changes at 6 ocean stations at BAS, listed here, together with a revised “credit” here. One of the 6 stations is Chatham Island, where I noticed a problem on June 13, 2008 when I was trying to replicate GISS methodology using Wellington NZ as an example. John Goetz pinned the erroneous source to BAS. A little later, we noticed that GISS had made reference a few days earlier (June 9, 2008) to manually adjusting these records, something that we wondered about at the time. GISS reported:

June 9, 2008:… some errors were noticed on http://www.antarctica.ac.uk/met/READER/temperature.html (set of stations not included in Met READER) that were not present before August 2007. We replaced those outliers with the originally reported values. Those two changes had about the same impact on the results than switching machines (in each case the 1880-2007 change was affected by 0.002°C). See graph and maps.

A few days ago, pondering Gavin Schmidt’s newly discovered zeal for correcting faulty station data and recalling our prior issues with Chatham Island, I wondered whether GISS had shown similar zeal in correcting Chatham Island and the other station problems referred to in their June 9 update note. I checked Chatham Island and it hadn’t been changed – apparently Gavin the Mystery Man hadn’t spent his Super Bowl evening correcting Chatham Island records. So I notified them. They thanked me and said that they would acknowledge me, which they did.

I also wondered whether Gavin’s zeal had extended to correcting GISS’ own version of Harry – it hadn’t. It was still uncorrected several days later. A CA reader mischievously suggested that I inform Hansen at GISS of the problem with their Harry data. I did so on Feb 4. I was copied the next day on a businesslike reply from Reto Ruedy to Hansen, explaining that Harry did not enter into GISTEMP calculations – a point that I agree with. Indeed, in my very first post on this topic, I observed that Steig used a considerable amount of data that did not meet GISTEMP quality control standards (including, as it turns out, Harry.)

Later that day, I was copied on an email from Ruedy to BAS, in which he re-transmitted an email to BAS of May 16, 2008 asking them to correct information from various stations, including Chatham Island. For some reason, BAS had never bothered making the changes requested by Reto Ruedy and, even more remarkably, Gavin hadn’t stayed up late ensuring that the BAS record was corrected.

Upon learning that NASA GISS had notified BAS of these errors on May 16, 2008, I sent BAS an email agreeing with Ruedy’s request that the records be corrected and requesting that they correct their notice to reflect NASA GISS’s priority in this matter:

For what it’s worth, I agree with this request. I also note that NASA GISS observed this problem prior to my doing so and request that you correct the notice on your page to reflect this. Regards, Steve McIntyre

Today they corrected the 6 records using NASA GISS’s information as a source. If they are using NASA GISS as a source, then it seems a little circular for NASA GISS to continue using BAS as a dset0 source, so Ruedy will have to put on his thinking cap on this issue. The revised notice today credits both Reto Ruedy and myself for pointing out the problems, but all of the errors were identified by NASA GISS.

This was an example where the identification of the Chatham Island error was truly independent. I noticed the problem in the course of analyzing data for Wellington NZ on June 13 and, at the time, was unaware of the June 9 update at GISS, which, in any event, did not mention Chatham Island. Suppose that the GISS update had said there was a problem with Chatham Island without saying what it was; that I had then consulted the station records and observed the faulty 1988 and 1989 values (easily spotted once one knows there’s a problem with these records); and that I had then rushed off a midnight email to BAS, saying the next day to GISS – if you hadn’t played “games”, then maybe you would have got the “credit” – with a smiley attached. Until the Gavin Affair, I couldn’t have imagined such undignified behavior.

Gavin’s Complaint

Many bemused blog readers know by now that Gavin Schmidt aka Mystery Man has taken a few hours off from his dedicated and long-standing interest in station data integrity to file a complaint to the University of Colorado about a post that Roger Jr wrote about the Gavin Affair.

Gavin has expressed a desire that Pielke not publish his side of the complaint (Pielke observes that the emails came from an unprotected .gov email address.)

Roger Jr’s first account of the Gavin Affair is on Feb 4 here, with an account of Gavin’s first demands here – note particularly this comment – and the matter continues in a new post today.

If you wish to comment on this, please ensure that you also register your comments at Roger’s blog.