The new Santer et al. paper, "Forced and unforced ocean temperature changes in Atlantic and Pacific tropical cyclogenesis regions", purports to show that sea surface temperature (SST) changes in the Pacific Cyclogenesis Region (PCR) and the Atlantic Cyclogenesis Region (ACR) are caused by anthropogenic global warming (AGW). The authors claim to do this by showing that models can’t reproduce the warming unless they include AGW forcings. In no particular order, here are some of the problems with that analysis.
1) The models are "tuned" to reproduce the historical climate. By tuned, I mean that they have a variety of parameters that can be adjusted to vary the output until it matches the historical trend. Once you have done that tuning, however, it proves nothing to show that you cannot reproduce the trend when you remove some of the forcings. If you have a model with certain forcings, and you have tuned the model to recreate a trend, of course it cannot reproduce the trend when you remove some of the forcings … but that only tells us something about the model. It shows nothing about the real world. This problem, in itself, is enough to disqualify the entire study.
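As a toy illustration of this point (not a climate model, and not the procedure used in the paper), the sketch below tunes a two-term model to a made-up series whose trend has no stated cause. The series, the `oscillation` and `ramp` inputs, and the helper names are all hypothetical. Once the coefficients are tuned, removing the ramp term necessarily removes the trend, which says something about how the toy model was built and nothing about where the trend in the target came from.

```python
import math

def fit_two_terms(a_series, b_series, target):
    """Least-squares coefficients (ca, cb) for target ~ ca*a + cb*b."""
    saa = sum(a * a for a in a_series)
    sbb = sum(b * b for b in b_series)
    sab = sum(a * b for a, b in zip(a_series, b_series))
    sat = sum(a * t for a, t in zip(a_series, target))
    sbt = sum(b * t for b, t in zip(b_series, target))
    det = saa * sbb - sab * sab
    ca = (sat * sbb - sbt * sab) / det
    cb = (sbt * saa - sat * sab) / det
    return ca, cb

def linear_trend(series):
    """Ordinary least-squares slope per time step."""
    n = len(series)
    xbar = (n - 1) / 2.0
    ybar = sum(series) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(series))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

steps = range(100)
# Hypothetical "observations": a trend plus a wiggle, cause unspecified.
target = [0.005 * t + 0.1 * math.sin(2 * math.pi * t / 10) for t in steps]
# Two hypothetical model inputs: one oscillating, one ramping.
oscillation = [math.sin(2 * math.pi * t / 10) for t in steps]
ramp = [t / 100.0 for t in steps]

# "Tune" the model to the target.
ca, cb = fit_two_terms(oscillation, ramp, target)
tuned = [ca * o + cb * r for o, r in zip(oscillation, ramp)]
# Same tuned model with the ramp term removed.
without_ramp = [ca * o for o in oscillation]

trend_tuned = linear_trend(tuned)
trend_without_ramp = linear_trend(without_ramp)
print(f"trend with ramp:    {trend_tuned:.5f} per step")
print(f"trend without ramp: {trend_without_ramp:.5f} per step")
```

The tuned model matches the target trend, and the version without the ramp shows almost none. That outcome is guaranteed by the construction, whatever actually produced the trend in the target.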
2) The second problem is that the models do a very poor job of reproducing anything but the trends. Not that they’re all that hot at reproducing the trends, but what about things like the mean (average) and the standard deviation? If they can’t reproduce those, then why should we believe their trend figures? After all, the raw data, and its associated statistics, are what the trend is built on.
Fortunately, they have reported the mean and standard deviation data. Unfortunately, they have not put 95% confidence intervals or trend lines on the data … so I have remedied that oversight. Here are their results:

(Original Caption) Fig. 4. Comparison of basic statistical properties of simulated and observed SSTs in the ACR and PCR. Results are for climatological annual means (A), temporal standard deviations of unfiltered (B) and filtered (C) anomaly data, and least-squares linear trends over 1900–1999 (D). For each statistic, ACR and PCR results are displayed in the form of scatter plots. Model results are individual 20CEN realizations and are partitioned into V and No-V models (colored circles and triangles, respectively). Observations are from ERSST and HadISST. All calculations involve monthly mean, spatially averaged anomaly data for the period January 1900 through December 1999. For anomaly definition and sources of data, refer to Fig. 1. The dashed horizontal and vertical lines in A–C are at the locations of the ERSST and HadISST values, and they facilitate visual comparison of the modeled and observed results. The black crosses centered on the observed trends in D are the 2 sigma trend confidence intervals, adjusted for temporal autocorrelation effects (see Supporting Text). The dashed lines in D denote the upper and lower limits of these confidence intervals.
I only show Figs. 4A and 4B. The left box is Fig. 4A, and the right box is Fig. 4B.
I have added the red squares around the HadISST mean and standard deviation, along with the trend lines and expected trend lines. Regarding Fig. 4A, which shows the mean temperatures of the models and observations, the majority of the models show cooler SSTs than the observations. Out of the 59 model runs shown, only three of them are warmer in both regions. Two of them are over two degrees colder in both regions, which in the tropical ocean is a huge temperature difference. Only one of the 59 model runs is within the 95% confidence interval of the mean.
Next, look at the trend lines in 4A. In the real world, when the Atlantic warms up by one degree, the Pacific only warms by about a third of a degree. Even if the mean temperatures are incorrect, we would expect the models to reproduce this behaviour. The trend line of the models does not show this relationship.
The standard deviations (Fig. 4B) are even worse. There are no model results anywhere close to the observations. The majority of the models tend to overestimate the variability in the Pacific, and underestimate the variability in the Atlantic. This is probably because the variability is inherently larger in the Atlantic (standard deviation 0.35°), and lower in the Pacific (standard deviation 0.24°). However, this difference is not captured by the models. The trend line (thick black line) shows that on average, the model Pacific variability is 90% of the Atlantic variability, when it should be only 60%. The light dotted line shows where we would expect the model results to be clustered, if they captured this difference in variability. Only a few of the models are close to this line.
3) All of this raises the question of whether we can use standard statistical procedures on this data. All of the data is strongly autocorrelated (Pacific, lag(1) autocorrelation = 0.80, Atlantic = 0.89). In their caption to Fig. 4 they say that they are adjusting for autocorrelation in the trend sigma. Unfortunately, they have not done the same regarding the standard deviations shown in Fig. 4B.
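To give a feel for how much autocorrelation this strong matters, here is a minimal sketch of the common AR(1) "effective sample size" approximation, n_eff = n(1 - r) / (1 + r). It assumes the lag-1 values quoted above apply to the 1200 monthly values from January 1900 through December 1999, and it is not necessarily the adjustment Santer et al. describe in their Supporting Text. The function names are my own.

```python
import math

def effective_sample_size(n, r1):
    """Approximate number of independent samples in an AR(1) series
    of length n with lag-1 autocorrelation r1."""
    return n * (1.0 - r1) / (1.0 + r1)

def inflation_factor(n, r1):
    """Factor by which a naive standard error is understated:
    sqrt(n / n_eff)."""
    return math.sqrt(n / effective_sample_size(n, r1))

N_MONTHS = 1200  # January 1900 through December 1999 (assumed monthly data)

for name, r1 in (("Pacific", 0.80), ("Atlantic", 0.89)):
    n_eff = effective_sample_size(N_MONTHS, r1)
    factor = inflation_factor(N_MONTHS, r1)
    print(f"{name}: n_eff = {n_eff:.1f} months, naive SE too small by x{factor:.2f}")
```

Under these assumptions the 1200 monthly values carry about as much information as roughly 133 independent values in the Pacific and about 70 in the Atlantic, so any confidence interval that ignores autocorrelation is several times too narrow.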
In addition to being autocorrelated, the Pacific data is strongly non-normal (Jarque-Bera test, p =
). Here is the histogram of the Pacific data.

As you can see, the data is quite skewed and peaked. Thus, even when we adjust for autocorrelation, it is unclear how much we can trust the standard statistical methods with this data.
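For readers who want to run this kind of check themselves, the sketch below computes the Jarque-Bera statistic by hand from the sample skewness and kurtosis, JB = n/6 · (S² + (K − 3)²/4), and uses the asymptotic chi-squared (2 degrees of freedom) p-value, exp(−JB/2). The two series are synthetic random numbers generated purely for illustration and are not the SST data. The test also assumes independent samples, which autocorrelated data violates, so treat a result on real SSTs with corresponding caution.

```python
import math
import random

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value, skewness, kurtosis)."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (biased, divide-by-n form used by the JB test).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of chi-squared with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value, skew, kurt

rng = random.Random(0)
# Synthetic stand-ins, NOT the Pacific or Atlantic SST series.
normal_like = [rng.gauss(0.0, 1.0) for _ in range(1200)]
skewed = [rng.expovariate(1.0) for _ in range(1200)]

jb_normal, p_normal, _, _ = jarque_bera(normal_like)
jb_skewed, p_skewed, skew_skewed, kurt_skewed = jarque_bera(skewed)
print(f"normal-like: JB = {jb_normal:.2f}, p = {p_normal:.3g}")
print(f"skewed:      JB = {jb_skewed:.2f}, p = {p_skewed:.3g}")
```

A skewed, peaked series produces a very large JB statistic and a p-value near zero, which is the pattern the histogram above suggests for the Pacific data.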
4) There are likely more problems with this paper … but this is just a first analysis.
My conclusion? These models are not ready for prime time. They are unable to reproduce the means, the standard deviations, or the relationship between the two ocean regions. I do not think that we can conclude anything from this study, other than that the models need lots of work.
w.